Virtual Drum Set using OpenCV

Learn to build a virtual drum set using computer vision concepts with OpenCV and Python.

Posted by Navendu Pottekkat on May 28, 2020

If you are new to computer vision, or if you have a strong urge to play the drums but don’t own a drum set, you have come to the right place!

In this tutorial, we will build a virtual drum set with OpenCV using basic computer vision concepts. If you are a beginner with OpenCV, this is a good tutorial to try.

virtual-drums in action

We will build the project from scratch and in the end you will have a cool project to show off!

All the source code is available in this GitHub repository. To get started, fork the repository and clone it to your local machine.

git clone link_to_your_forked_repo


Python 3.6+ should be installed on your device.

Navigate into your forked directory (the folder you have downloaded).

You can install the required packages for your project in your environment by running the following command in the terminal.

pip install -r requirements.txt

Running the file

In the project folder run the following command.


You will see the output of the webcam on your screen with the drums added. You can use a green stick to hit the drums (a pen or pencil would be great).

A green stick is needed because the detection window looks for the color green.

Under the hood, when a green object is detected in the window (the part of the image with the drums), the program interprets it as the drum being hit and plays the beat.

We will now take a look at the code that does this.

The code

Open the file in your preferred text editor.

You can follow along and build your own file or edit the file that you have downloaded. We will now take a look at how all this works!

We begin by importing the necessary libraries. They need to be installed in your environment; if they are not, run pip install -r requirements.txt.

# Importing the necessary libraries
import numpy as np
import time
import cv2
from pygame import mixer

Next we define a function to play the drum beat when a green object is detected in the window.

We play the sound if the amount of green detected in the region exceeds a preset threshold.

# This function plays the corresponding drum beat if a green color object is detected in the region
def play_beat(detected,sound):

    # Checks if the sum of the green mask is greater than a preset value
    play = (detected) > hat_thickness[0]*hat_thickness[1]*0.8

    # If it is detected play the corresponding drum beat
    # (the main loop passes 1 for the snare region and 2 for the high hat region)
    if play and sound==1:
        drum_snare.play()

    elif play and sound==2:
        drum_hat.play()
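
To get a feel for the threshold used above, here is a minimal sketch in plain Python. It assumes that every matching pixel contributes 255 to the mask sum, which is how OpenCV's inRange masks behave. The helper names should_play and min_green_pixels are made up for this illustration and are not part of the project.

```python
# Hypothetical helpers for illustration only; they are not part of the project code.

MASK_ON = 255  # value written into the mask for every matching pixel


def should_play(mask_sum, region_size, factor=0.8):
    """Same comparison as play_beat: the mask sum against width * height * factor."""
    width, height = region_size
    return mask_sum > width * height * factor


def min_green_pixels(region_size, factor=0.8):
    """Smallest number of matching pixels whose mask sum exceeds the threshold."""
    width, height = region_size
    threshold = width * height * factor
    return int(threshold // MASK_ON) + 1


hat_thickness = [200, 100]
needed = min_green_pixels(hat_thickness)

print(needed)
print(should_play(needed * MASK_ON, hat_thickness))
print(should_play((needed - 1) * MASK_ON, hat_thickness))
```

With the 200x100 region, 63 green pixels are already enough to trigger the beat, so the drums react to a fairly small green object. If that feels too sensitive, raising the factor is one possible adjustment.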

Next we define a function to detect green color if it is present in our window.

# This function is used to check if green color is present in the small region
def detect_in_region(frame,sound):

    # Converting to HSV
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    # Creating mask
    mask = cv2.inRange(hsv, greenLower, greenUpper)

    # Summing the mask values (each green pixel contributes 255)
    detected = np.sum(mask)

    # Call the function to play the drum beat
    play_beat(detected,sound)

    return mask
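
To see what the mask and its sum look like, here is a small sketch that imitates the masking step with NumPy alone. The function in_range_mask is a hypothetical stand-in written for this example, not OpenCV's implementation, and the 2x2 region is made-up data given directly in HSV.

```python
import numpy as np

# Same limits as in the tutorial (OpenCV 8-bit HSV scale)
greenLower = (25, 52, 72)
greenUpper = (102, 255, 255)


def in_range_mask(hsv, lower, upper):
    """NumPy stand-in for the masking step: 255 where every channel lies
    within [lower, upper] (inclusive), 0 everywhere else."""
    lower = np.array(lower)
    upper = np.array(upper)
    inside = np.all((hsv >= lower) & (hsv <= upper), axis=-1)
    return inside.astype(np.uint8) * 255


# A tiny 2x2 "region": the left column is green, the right column is not
hsv = np.array([[[60, 200, 200], [10, 200, 200]],
                [[90, 100, 100], [60, 20, 200]]], dtype=np.uint8)

mask = in_range_mask(hsv, greenLower, greenUpper)

detected = int(np.sum(mask))                # the value passed to play_beat
green_pixels = int(np.count_nonzero(mask))  # the actual number of green pixels

print(mask)
print(detected, green_pixels)
```

Note that the sum is 255 times the pixel count, so detected is not the number of green pixels itself. Counting non-zero entries gives the pixel count directly, if that is the quantity you want to compare against.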

Next we load our drum beats and store them.

We then set the upper and lower limits for the green color that should be detected. You can change these limits to match the color of your choice.

We then start the webcam and read the input from it.

# A flag variable to choose whether to show the region that is being detected
verbose = False

# Initializing the mixer (required before sounds can be loaded) and importing drum beats
mixer.init()
drum_hat = mixer.Sound('./sounds/high_hat_1.ogg')
drum_snare = mixer.Sound('./sounds/snare_1.wav')

# Set HSV range for detecting green color
greenLower = (25,52,72)
greenUpper = (102,255,255)

# Obtain input from the webcam
camera = cv2.VideoCapture(0)
ret,frame = camera.read()
H,W = frame.shape[:2]

kernel = np.ones((7,7),np.uint8)
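
If you want to pick limits for a different color, it helps to know that the HSV limits above use OpenCV's 8-bit scale, where hue runs from 0 to 179 and saturation and value run from 0 to 255. The sketch below uses Python's standard colorsys module to approximate that conversion for a single RGB color. The helper names are made up for this example, and OpenCV's own rounding may differ slightly.

```python
import colorsys

greenLower = (25, 52, 72)
greenUpper = (102, 255, 255)


def rgb_to_opencv_hsv(r, g, b):
    """Approximate OpenCV's 8-bit HSV values (H: 0-179, S and V: 0-255)
    for an 8-bit RGB color."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return (int(round(h * 180)) % 180, int(round(s * 255)), int(round(v * 255)))


def within_limits(hsv, lower, upper):
    """True if every channel lies inside its lower and upper limit."""
    return all(lo <= value <= hi for value, lo, hi in zip(hsv, lower, upper))


for name, rgb in [("green", (0, 255, 0)), ("red", (255, 0, 0)), ("blue", (0, 0, 255))]:
    hsv = rgb_to_opencv_hsv(*rgb)
    print(name, hsv, within_limits(hsv, greenLower, greenUpper))
```

Running a few sample colors through a check like this is a quick way to see whether your stick color falls inside the limits before you start the webcam.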

We will import the images of the drums to be added to the video output. In this example I have downloaded and added a high hat and a snare drum.

# Read the image of High Hat and the Snare drum
hat = cv2.resize(cv2.imread('./images/high_hat.png'),(200,100),interpolation=cv2.INTER_CUBIC)
snare = cv2.resize(cv2.imread('./images/snare_drum.png'),(200,100),interpolation=cv2.INTER_CUBIC)

We will now set the region or the window in which we should detect the green color. In this example we have two drums so we create two windows.

# Set the region area for detecting green color 
hat_center = [np.shape(frame)[1]*2//8,np.shape(frame)[0]*6//8]
snare_center = [np.shape(frame)[1]*6//8,np.shape(frame)[0]*6//8]

hat_thickness = [200,100]
hat_top = [hat_center[0]-hat_thickness[0]//2,hat_center[1]-hat_thickness[1]//2]
hat_btm = [hat_center[0]+hat_thickness[0]//2,hat_center[1]+hat_thickness[1]//2]

snare_thickness = [200,100]
snare_top = [snare_center[0]-snare_thickness[0]//2,snare_center[1]-snare_thickness[1]//2]
snare_btm = [snare_center[0]+snare_thickness[0]//2,snare_center[1]+snare_thickness[1]//2]
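
The arithmetic above can be wrapped in a small helper, which is handy if you later add more drums. This is only a sketch: drum_region and fits_in_frame are hypothetical names, and the 640x480 frame size is an assumption for the example (the real size comes from frame.shape).

```python
def drum_region(frame_width, frame_height, x_eighths, y_eighths, thickness):
    """Return (top_left, bottom_right) for a region centred at the given
    eighths of the frame, using the same arithmetic as the tutorial."""
    center = [frame_width * x_eighths // 8, frame_height * y_eighths // 8]
    top = [center[0] - thickness[0] // 2, center[1] - thickness[1] // 2]
    btm = [center[0] + thickness[0] // 2, center[1] + thickness[1] // 2]
    return top, btm


def fits_in_frame(top, btm, frame_width, frame_height):
    """Check that the region lies completely inside the frame."""
    return (top[0] >= 0 and top[1] >= 0
            and btm[0] <= frame_width and btm[1] <= frame_height)


# Assumed frame size for this example
W, H = 640, 480

hat_top, hat_btm = drum_region(W, H, 2, 6, [200, 100])
snare_top, snare_btm = drum_region(W, H, 6, 6, [200, 100])

print(hat_top, hat_btm)
print(snare_top, snare_btm)
print(fits_in_frame(hat_top, hat_btm, W, H), fits_in_frame(snare_top, snare_btm, W, H))
```

Checking that a region fits inside the frame matters because the slices used later must have exactly the same shape as the resized drum images.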


We then run an infinite loop that breaks when we press “Q” on the keyboard.

In each iteration, we check both windows for a green object and then draw the drums onto the frame.

Finally, we clean up the open windows once “Q” is pressed.

while True:

    # Select the current frame
    ret, frame = camera.read()

    # Stop if no frame could be read
    if not(ret):
        break

    # Mirror the frame so that your movements match what you see on screen
    frame = cv2.flip(frame,1)

    # Select region corresponding to the Snare drum
    snare_region = np.copy(frame[snare_top[1]:snare_btm[1],snare_top[0]:snare_btm[0]])
    snare_mask = detect_in_region(snare_region,1)

    # Select region corresponding to the High Hat
    hat_region = np.copy(frame[hat_top[1]:hat_btm[1],hat_top[0]:hat_btm[0]])
    hat_mask = detect_in_region(hat_region,2)

    # Output project title
    cv2.putText(frame,'Virtual Drums',(10,30),2,1,(20,20,20),2)

    # If flag is selected, display the region under detection
    # (each mask has the size of its region, so it is applied to the region directly)
    if verbose:
        frame[snare_top[1]:snare_btm[1],snare_top[0]:snare_btm[0]] = cv2.bitwise_and(snare_region,snare_region,mask=snare_mask)
        frame[hat_top[1]:hat_btm[1],hat_top[0]:hat_btm[0]] = cv2.bitwise_and(hat_region,hat_region,mask=hat_mask)

    # If flag is not selected, display the drums
    else:
        frame[snare_top[1]:snare_btm[1],snare_top[0]:snare_btm[0]] = cv2.addWeighted(snare, 1, frame[snare_top[1]:snare_btm[1],snare_top[0]:snare_btm[0]], 1, 0)
        frame[hat_top[1]:hat_btm[1],hat_top[0]:hat_btm[0]] = cv2.addWeighted(hat, 1, frame[hat_top[1]:hat_btm[1],hat_top[0]:hat_btm[0]], 1, 0)

    # Show the frame (the window name 'Output' is a placeholder)
    cv2.imshow('Output',frame)

    key = cv2.waitKey(1) & 0xFF
    # 'Q' to exit
    if key == ord("q"):
        break

# Clean up the open windows
camera.release()
cv2.destroyAllWindows()
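
The drums are drawn with cv2.addWeighted using a weight of 1 for both the drum image and the frame region, which amounts to adding the two images and capping the result at 255. Here is a rough NumPy sketch of that behaviour. The function saturated_add is a stand-in written for this example, and the pixel values are made up.

```python
import numpy as np


def saturated_add(image_a, image_b):
    """Add two uint8 images and cap the result at 255, which is what
    addWeighted computes when both weights are 1 and the offset is 0."""
    total = image_a.astype(np.uint16) + image_b.astype(np.uint16)
    return np.clip(total, 0, 255).astype(np.uint8)


# Stand-ins for a drum image and the matching frame region (1x2 pixels, 3 channels)
drum = np.array([[[0, 0, 0], [200, 200, 200]]], dtype=np.uint8)
region = np.array([[[90, 120, 150], [90, 120, 150]]], dtype=np.uint8)

blended = saturated_add(drum, region)
print(blended)
```

Black pixels in the drum image leave the webcam picture unchanged, while bright pixels push the result towards white. That is why a drum image on a black background blends in without a visible box around it.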

That is it! Now you can jam using your virtual drum set. Try changing the drums or adding more to the project.

You can use this knowledge to build your own computer vision applications using OpenCV.

Happy Coding!