So what is the demo about?
I take a webcam feed, put a xylophone image on top of it, and detect the user's movements to play the xylophone.
To detect the user's movement I used a technique that is probably well known to developers, designers and motion designers: the difference blend mode.
The concept is simple: if you put a picture on top of the same picture and blend them together using the difference blend mode, you obtain... a black screen!
If there's a slight difference between the two pictures, colors start to appear, showing you the "difference" between them. You get something like this:
Not exactly like that, actually: in this demo I average the red, green and blue values of each pixel and turn the pixel white when that average is above a certain threshold, which gives a bit more accuracy.
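A minimal sketch of that averaging step, assuming the pixels come as a flat RGBA array (the layout of a canvas `ImageData.data`). The function name `thresholdToBlackAndWhite` and the default cutoff of 21 are placeholders for illustration, not the values used in the demo, and sending pixels below the cutoff to black is an assumption too.

```javascript
// Hypothetical helper: turns a "difference" image into pure black and white.
// `data` is a flat RGBA array, 4 bytes per pixel, as in ImageData.data.
// `threshold` is an assumed cutoff, not the value used in the demo.
function thresholdToBlackAndWhite(data, threshold = 21) {
  const out = new Uint8ClampedArray(data.length);
  for (let i = 0; i < data.length; i += 4) {
    // Average the red, green and blue channels of this pixel.
    const average = (data[i] + data[i + 1] + data[i + 2]) / 3;
    // Above the threshold the pixel becomes white, otherwise black.
    const value = average > threshold ? 255 : 0;
    out[i] = value;
    out[i + 1] = value;
    out[i + 2] = value;
    out[i + 3] = 255; // keep the result fully opaque
  }
  return out;
}
```

Raising the threshold makes the detection less sensitive to camera noise; lowering it picks up smaller movements.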
The first step is to draw the webcam feed in a canvas, grab a picture from it at a regular time interval, and blend the current frame with the previous one. This means that if you don't move, the result will be black, and when you start moving some pixels will turn white.
To be a bit more precise, I loop over the pixels of the current frame and, for each pixel, subtract its color values from those of the same pixel in the previous frame, keeping the absolute value. This is what the difference blend mode does: "Difference subtracts the top layer from the bottom layer or the other way round, to always get a positive value."
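The subtraction can be sketched as follows, again assuming both frames are flat RGBA arrays of the same size, as `getImageData` would return for two captures of the same canvas. The name `differenceBlend` is made up for this example.

```javascript
// Hypothetical helper: per-channel absolute difference of two frames.
// `current` and `previous` are flat RGBA arrays of identical length.
function differenceBlend(current, previous) {
  if (current.length !== previous.length) {
    throw new Error("Frames must have the same dimensions");
  }
  const out = new Uint8ClampedArray(current.length);
  for (let i = 0; i < current.length; i += 4) {
    // Math.abs makes the order of subtraction irrelevant,
    // so the result is always a positive value.
    out[i] = Math.abs(current[i] - previous[i]);             // red
    out[i + 1] = Math.abs(current[i + 1] - previous[i + 1]); // green
    out[i + 2] = Math.abs(current[i + 2] - previous[i + 2]); // blue
    out[i + 3] = 255;                                        // opaque
  }
  return out;
}
```

Two identical frames produce only black pixels; any change between them shows up as a non-zero color value.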
The next step is to check certain areas (the different parts of the xylophone) for white pixels. If an area contains more than a certain number of non-black pixels, it means that something moved there since the last frame, and I play the corresponding note.
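Here is one way the area check could look, as a sketch under a few assumptions: the blended image is a flat RGBA array, each xylophone key is described by a rectangle `{ x, y, width, height }` in pixel coordinates, and `minPixels` is an arbitrary trigger level. All of these names are hypothetical.

```javascript
// Hypothetical helper: counts the non-black pixels inside a rectangle.
// `data` is a flat RGBA array and `imageWidth` the image width in pixels.
function countNonBlackPixels(data, imageWidth, region) {
  let count = 0;
  for (let y = region.y; y < region.y + region.height; y++) {
    for (let x = region.x; x < region.x + region.width; x++) {
      const i = (y * imageWidth + x) * 4; // index of this pixel's red byte
      if (data[i] > 0 || data[i + 1] > 0 || data[i + 2] > 0) {
        count++;
      }
    }
  }
  return count;
}

// Returns true when enough pixels changed in the region to count as movement.
// `minPixels` is an assumed trigger level, not a value from the demo.
function hasMovement(data, imageWidth, region, minPixels = 10) {
  return countNonBlackPixels(data, imageWidth, region) >= minPixels;
}
```

In the demo's terms, a note would be played for each key whose region reports movement on the current frame.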