Face detection with Python
In this tutorial you will learn how to apply face detection with Python. As input we will use a Google Hangouts video: there are plenty of Google Hangouts recordings around the web, and in these videos the face is usually large enough for the software to detect.
Detection of faces is done with the OpenCV (Open Source Computer Vision) library. The most common face detection method uses cascade classifiers, a technique known to work well for faces. You need to have the cascade files (included with OpenCV) in the same directory as your program.
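As a small sketch of that setup, loading the frontal-face cascade that ships with OpenCV could look like this (the file name haarcascade_frontalface_default.xml and its location next to the script are assumptions for illustration):

#!/usr/bin/python
import cv2

# Load the frontal-face Haar cascade; the XML file is assumed to sit
# in the same directory as this script.
faceCascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# empty() is True if the cascade file could not be found or parsed.
if faceCascade.empty():
    raise IOError('Could not load haarcascade_frontalface_default.xml')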
Video with Python OpenCV
To analyse the input video we extract each frame and show it for a brief period of time, which plays the video back. Start with a basic program that opens the video and displays it frame by frame.
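A minimal sketch of such a playback loop, assuming the Hangouts recording is saved locally as hangout.mp4 (a placeholder name), could look like this:

#!/usr/bin/python
import cv2

# Open the input video; hangout.mp4 is a placeholder file name.
cap = cv2.VideoCapture('hangout.mp4')

while True:
    ret, frame = cap.read()      # grab the next frame
    if not ret:                  # no more frames: end of video
        break
    cv2.imshow('Video', frame)   # show the frame briefly
    if cv2.waitKey(30) & 0xFF == ord('q'):  # ~30 ms per frame, press q to quit
        break

cap.release()
cv2.destroyAllWindows()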
Upon execution you will see the video play without sound (OpenCV does not handle audio). Inside the while loop, every video frame is available in the variable frame.
Face detection with OpenCV
We will display a rectangle on top of the face. To avoid flickering of the rectangle, we show it at its last known position whenever the face is not detected in the current frame.
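A sketch of that detection loop, again with hangout.mp4 and the frontal-face cascade file as assumed names, could look like this:

#!/usr/bin/python
import cv2

# Placeholder names: adjust to your own video and cascade location.
cap = cv2.VideoCapture('hangout.mp4')
faceCascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# Last known face position, reused when no face is found in a frame.
x, y, w, h = 0, 0, 0, 0

while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Shrink the frame to speed up detection.
    frame = cv2.resize(frame, None, fx=0.5, fy=0.5)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    faces = faceCascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    nfaces = len(faces)
    if nfaces > 0:
        # Assume a single face: keep only the first detection.
        x, y, w, h = faces[0]

    # Draw the rectangle at the current or last known position.
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow('Video', frame)
    if cv2.waitKey(30) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()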
In this program we simply assumed there is one face in the video. We reduced the size of the frame to speed up processing; this is fine in most cases because detection works well at lower resolutions. If you want to run face detection in "real time", keeping the computational cycle short is mandatory. An alternative to this implementation is to process all frames first and display the results afterwards, as sketched below.
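One way to structure that alternative, as a sketch under the same assumed file names, is to run detection over every frame first, remember the rectangles, and then replay the frames with the stored rectangles drawn on them:

#!/usr/bin/python
import cv2

cap = cv2.VideoCapture('hangout.mp4')   # placeholder file name
faceCascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

frames, rects = [], []
x, y, w, h = 0, 0, 0, 0

# Pass 1: detect faces in every frame and remember the rectangles.
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = faceCascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    if len(faces) > 0:
        x, y, w, h = faces[0]
    frames.append(frame)
    rects.append((x, y, w, h))
cap.release()

# Pass 2: display the frames with the precomputed rectangles.
for frame, (x, y, w, h) in zip(frames, rects):
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow('Video', frame)
    if cv2.waitKey(30) & 0xFF == ord('q'):
        break
cv2.destroyAllWindows()

Keeping every frame in memory only works for short clips; for longer videos you would store just the rectangles and read the video a second time for display.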
A limitation of this technique is that it does not always detect faces: faces that are very small or partially occluded may be missed, and it may produce false positives, such as a bag detected as a face. Still, it works quite well on certain types of input video.
Comments
What actually are x, y, w, and h in this program? Can you please explain them in the context of the face detection window?
x, y, w, and h describe the detected face. nfaces contains the number of faces detected. If only one face is detected, (x, y) is the top-left corner of the face on the screen, and w and h are the width and height of the rectangle. If many faces are detected, each detection carries the coordinates of that particular face.
In this example I simply assumed there is only one face in the video.
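As an illustrative sketch (the variable names follow the listing above), iterating over every detection and drawing each rectangle would look like this:

# faces is the result of faceCascade.detectMultiScale(gray);
# each entry describes one detected face.
for (x, y, w, h) in faces:
    # (x, y) is the top-left corner, w and h the width and height.
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)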