Facial Keypoint Detection: Detect the relevant features of a face in one go using a CNN and your own dataset in Python

Facial key-points are useful for a variety of tasks, such as face filters, emotion recognition, and pose recognition. So if you're working on projects like these, keep reading!

In this project, facial key-points (also called facial landmarks) are the small magenta dots shown on each of the faces in the image below. Each training and test image contains a single face with 68 key-points, each given as (x, y) coordinates. These key-points mark important areas of the face: the eyes, the corners of the mouth, the nose, etc.

Magenta dots showing key-points

Dataset used:
We’ll be using the YouTube Faces Dataset, which is built from videos of people on YouTube. This facial key-points dataset consists of 5770 colour images, each assigned to either a training set or a test set.

  • 3462 of these images are training images, for you to use as you create a model to predict key-points.
  • 2308 are test images, which will be used to test the accuracy of your model.

Now the question arises: “The input images are never the same size, so how will a neural network work on them?”

Neural networks usually expect standardized images: a fixed size, a normalized range for colour values and coordinates, and (for PyTorch) conversion from NumPy arrays to Tensors. Therefore, we need to perform some pre-processing.
For this you can:

  1. Normalize: to convert a color image to grayscale values with a range of [0,1] and normalize the keypoints to be in a range of about [-1, 1]
  2. Rescale: to rescale an image to a desired size.
  3. RandomCrop: to crop an image randomly.
  4. ToTensor: to convert numpy images to torch images.
Transform
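
The four transforms listed above could be sketched roughly as follows. This is a minimal illustration, not the project's exact code: the sample layout (a dict with "image" and "keypoints"), the 250-pixel rescale, the keypoint constants 100 and 50, and the NumPy nearest-neighbour resize are all assumptions made to keep the example self-contained.

```python
import numpy as np
import torch


class Rescale:
    """Resize the image to (out_h, out_w) and scale the keypoints to match."""

    def __init__(self, out_h, out_w):
        self.out_h, self.out_w = out_h, out_w

    def __call__(self, sample):
        image, kps = sample["image"], sample["keypoints"]
        h, w = image.shape[:2]
        # Nearest-neighbour resize in plain NumPy (cv2.resize is a common alternative).
        rows = (np.arange(self.out_h) * h / self.out_h).astype(int)
        cols = (np.arange(self.out_w) * w / self.out_w).astype(int)
        image = image[rows][:, cols]
        kps = kps * np.array([self.out_w / w, self.out_h / h])
        return {"image": image, "keypoints": kps}


class RandomCrop:
    """Cut a random size x size window and shift the keypoints with it."""

    def __init__(self, size):
        self.size = size

    def __call__(self, sample):
        image, kps = sample["image"], sample["keypoints"]
        h, w = image.shape[:2]
        top = np.random.randint(0, h - self.size + 1)
        left = np.random.randint(0, w - self.size + 1)
        image = image[top:top + self.size, left:left + self.size]
        return {"image": image, "keypoints": kps - np.array([left, top])}


class Normalize:
    """Grayscale the image into [0, 1]; bring keypoints to roughly [-1, 1]."""

    def __init__(self, kp_mean=100.0, kp_std=50.0):  # assumed constants
        self.kp_mean, self.kp_std = kp_mean, kp_std

    def __call__(self, sample):
        # Simple channel average as the grayscale conversion.
        gray = sample["image"].astype(np.float32).mean(axis=2) / 255.0
        kps = (sample["keypoints"] - self.kp_mean) / self.kp_std
        return {"image": gray, "keypoints": kps.astype(np.float32)}


class ToTensor:
    """Convert an H x W NumPy image to a 1 x H x W torch Tensor."""

    def __call__(self, sample):
        image = np.ascontiguousarray(sample["image"][np.newaxis, ...])
        return {"image": torch.from_numpy(image),
                "keypoints": torch.from_numpy(sample["keypoints"])}


def apply_transforms(sample, transforms):
    for transform in transforms:
        sample = transform(sample)
    return sample


# Synthetic sample standing in for one dataset entry.
rng = np.random.default_rng(0)
raw = {"image": rng.integers(0, 256, size=(300, 400, 3), dtype=np.uint8),
       "keypoints": rng.uniform(0, 300, size=(68, 2))}
pipeline = [Rescale(250, 250), RandomCrop(224), Normalize(), ToTensor()]
out = apply_transforms(raw, pipeline)
print(out["image"].shape, out["keypoints"].shape)
```

Rescaling slightly larger than the target and then cropping to 224 gives the network a fixed input size while adding a little positional variety to the training data.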

Now let’s define our own Convolutional Neural Network that can learn from this data!

The steps you need to follow are as follows:

  • Define a CNN with images as input and keypoints as output:
    The input image size is 224×224 px (the size produced by the transform earlier), and the output layer has 136 values, i.e. 136/2 = 68 (x, y) pairs, one for each of our desired keypoints.
CNN Architecture
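
One possible network with a 1×224×224 grayscale input and 136 outputs is sketched below. The layer counts, kernel sizes, and the width of the fully connected layer are illustrative assumptions, not necessarily the architecture used in the project.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, 5)     # 224 -> 220, pooled to 110
        self.conv2 = nn.Conv2d(32, 64, 3)    # 110 -> 108, pooled to 54
        self.conv3 = nn.Conv2d(64, 128, 3)   # 54 -> 52, pooled to 26
        self.conv4 = nn.Conv2d(128, 256, 3)  # 26 -> 24, pooled to 12
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(256 * 12 * 12, 1024)
        self.fc2 = nn.Linear(1024, 136)      # 68 keypoints * (x, y)

    def forward(self, x):
        for conv in (self.conv1, self.conv2, self.conv3, self.conv4):
            x = self.pool(F.relu(conv(x)))
        x = torch.flatten(x, 1)              # keep the batch dimension
        x = F.relu(self.fc1(x))
        return self.fc2(x)                   # raw coordinates, no activation


net = Net()
out = net(torch.zeros(2, 1, 224, 224))       # dummy batch of two images
print(out.shape)
```

The final layer has no activation because this is a regression problem: the outputs are coordinates, not class probabilities.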

You can add regularization at your discretion, but if you still need a hand, here you go:

Dropout
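
As a small illustration of how dropout behaves in PyTorch (the probability 0.5 is an arbitrary choice for the example), the layer randomly zeroes activations in training mode and becomes a no-op in evaluation mode:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.5)
x = torch.ones(1, 1000)

drop.train()
train_out = drop(x)  # entries are either zeroed or scaled by 1 / (1 - p)

drop.eval()
eval_out = drop(x)   # dropout is disabled at inference time

print(train_out.unique(), torch.equal(eval_out, x))
```

In a CNN like the one described here, such a layer is typically placed between the fully connected layers, and calling `model.eval()` before testing switches it off.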
  • Construct the transformed FaceKeypointsDataset, just as before
FaceKeypointsDataset
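
A dataset class along these lines might look like the sketch below. The constructor arguments (`records`, `load_image`) and the record layout (an image name followed by 136 flat coordinates) are hypothetical, chosen so the example runs without any files on disk; a real version would usually read a CSV of annotations and load images from a folder.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class FaceKeypointsDataset(Dataset):
    def __init__(self, records, load_image, transform=None):
        self.records = records        # list of (image_name, 136 flat coordinates)
        self.load_image = load_image  # callable: image_name -> H x W x 3 array
        self.transform = transform

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        name, flat = self.records[idx]
        sample = {"image": self.load_image(name),
                  "keypoints": np.asarray(flat, dtype=np.float32).reshape(-1, 2)}
        return self.transform(sample) if self.transform else sample


def fake_loader(name):
    # Stand-in for reading an image file from disk.
    return np.zeros((224, 224, 3), dtype=np.uint8)


def to_tensors(sample):
    # Minimal stand-in for the full transform pipeline.
    image = torch.from_numpy(sample["image"]).permute(2, 0, 1).float() / 255.0
    return {"image": image, "keypoints": torch.from_numpy(sample["keypoints"])}


records = [(f"face_{i}.jpg", list(range(136))) for i in range(6)]
dataset = FaceKeypointsDataset(records, fake_loader, transform=to_tensors)
loader = DataLoader(dataset, batch_size=4, shuffle=True)
batch = next(iter(loader))
print(len(dataset), batch["image"].shape, batch["keypoints"].shape)
```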
  • Train the CNN on the training data, tracking loss
Loss & Optimization
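
Since keypoint prediction is a regression task, a regression loss is the natural fit. The sketch below assumes Smooth L1 loss and the Adam optimizer with a learning rate of 0.001; these are common choices rather than confirmed settings from the project, and the linear `model` is only a stand-in for the CNN.

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 136)  # stand-in for the CNN
criterion = nn.SmoothL1Loss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

# Compare Smooth L1 with MSE on a tiny hand-made example.
pred = torch.tensor([[0.0, 0.5], [2.0, -1.0]])
target = torch.zeros_like(pred)
smooth_l1 = criterion(pred, target).item()
mse = nn.MSELoss()(pred, target).item()
print(smooth_l1, mse)
```

Smooth L1 grows linearly for large errors, so a few badly placed keypoints dominate the loss less than they would with MSE.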
  • See how the trained model performs on test data

To quickly observe how your model is training and decide whether you should modify its structure or hyperparameters, start with just one or two epochs. As you train, note how the model’s loss behaves over time: does it decrease quickly at first and then slow down?

Use these initial observations to make changes to your model and decide on the best architecture before you train for many epochs and create a final model.

Training
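
A training loop that tracks the average loss per epoch could be sketched as follows. The tiny linear model and the random data are placeholders so the example runs quickly; the batch layout (a dict with "image" and "keypoints") is an assumption.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader


def train(model, loader, criterion, optimizer, n_epochs):
    epoch_losses = []
    model.train()
    for epoch in range(n_epochs):
        running = 0.0
        for batch in loader:
            images = batch["image"]
            # Flatten (batch, 68, 2) targets to (batch, 136) to match the output.
            targets = batch["keypoints"].reshape(images.size(0), -1)
            optimizer.zero_grad()
            loss = criterion(model(images), targets)
            loss.backward()
            optimizer.step()
            running += loss.item()
        epoch_losses.append(running / len(loader))
        print(f"epoch {epoch + 1}: avg loss {epoch_losses[-1]:.4f}")
    return epoch_losses


torch.manual_seed(0)
# Random stand-in data: 32 small "images" with 68 keypoints each.
data = [{"image": torch.rand(1, 16, 16), "keypoints": torch.randn(68, 2)}
        for _ in range(32)]
loader = DataLoader(data, batch_size=8, shuffle=True)
model = nn.Sequential(nn.Flatten(), nn.Linear(16 * 16, 136))
optimizer = optim.Adam(model.parameters(), lr=1e-3)
losses = train(model, loader, nn.SmoothL1Loss(), optimizer, n_epochs=2)
```

Returning the per-epoch losses makes it easy to plot them and see whether the loss drops quickly and then flattens out.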

If necessary, modify the CNN structure and model hyper-parameters so that it performs well.
Once you’ve found a good model, don’t forget to save it, so that you can load and use it later!
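
Saving and reloading can be done through the model's `state_dict`. In this sketch the linear layer is a stand-in for the trained network, and the file name is hypothetical.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 136)  # stand-in for the trained CNN
path = os.path.join(tempfile.mkdtemp(), "keypoints_model.pt")  # hypothetical name

# Save only the learned parameters.
torch.save(model.state_dict(), path)

# Later: rebuild the same architecture, then load the parameters into it.
restored = nn.Linear(4, 136)
restored.load_state_dict(torch.load(path))
restored.eval()  # switch to inference mode before predicting
```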

After you’ve trained a neural network to detect facial keypoints, you can then apply this network to any image that includes faces.

  • Detect all the faces in an image using a face detector (I used a Haar Cascade detector in this project).

Output:

Face Detection using Haar Cascade
  • Pre-process those face images so that they are gray-scale, and transformed to a Tensor of the input size that your net expects. This step will be similar to the earlier pre-processing.
Grayscale Image
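
The pre-processing of a detected face could be sketched like this. The function name `preprocess_face` is hypothetical, the grayscale conversion is a simple channel average, and the resize uses PyTorch's bilinear interpolation rather than OpenCV, purely to keep the example short.

```python
import numpy as np
import torch
import torch.nn.functional as F


def preprocess_face(image, box, size=224):
    """Crop a face box and return a 1 x 1 x size x size float tensor in [0, 1]."""
    x, y, w, h = box
    roi = image[y:y + h, x:x + w]
    gray = roi.astype(np.float32).mean(axis=2) / 255.0  # H x W, values in [0, 1]
    tensor = torch.from_numpy(gray)[None, None]         # add batch and channel dims
    return F.interpolate(tensor, size=(size, size),
                         mode="bilinear", align_corners=False)


# Random placeholder image and a made-up face box.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(300, 300, 3), dtype=np.uint8)
face_tensor = preprocess_face(image, box=(50, 60, 100, 120))
print(face_tensor.shape)
```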
  • Use your trained model to detect facial keypoints on the image.
Detected Keypoints
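
Running the model on a pre-processed face and mapping the result back onto the original image might look like this. The un-normalization constants (50 and 100) assume the keypoints were normalized as (value - 100) / 50 during training, and `ZeroNet` is a dummy model used only to make the example runnable.

```python
import torch
import torch.nn as nn


def predict_keypoints(model, face_tensor, box, input_size=224,
                      kp_mean=100.0, kp_std=50.0):
    """Return a 68 x 2 tensor of keypoints in original-image coordinates."""
    x, y, w, h = box
    model.eval()
    with torch.no_grad():
        out = model(face_tensor)                  # shape (1, 136)
    kps = out.view(-1, 2) * kp_std + kp_mean      # back to input-pixel space
    scale = torch.tensor([w / input_size, h / input_size])
    offset = torch.tensor([float(x), float(y)])
    return kps * scale + offset                   # back to the full image


class ZeroNet(nn.Module):
    """Dummy stand-in that always predicts zeros."""

    def forward(self, x):
        return torch.zeros(x.size(0), 136)


keypoints = predict_keypoints(ZeroNet(), torch.zeros(1, 1, 224, 224),
                              box=(10, 20, 112, 224))
print(keypoints.shape, keypoints[0])
```

The resulting points can be drawn over the original image, for example with a scatter plot, to check the predictions visually.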

Wanna check out how to pull this code off in detail?
Check this project out on my GitHub: Facial Keypoint Detection

A Machine Learning Research scholar who loves to moonlight as a blogger.
