Using Azure Cognitive Services FACE API In UWP App

I have been playing around with Cognitive Services for some time now, and I find them mind-blowing APIs that can do a lot of Artificial Intelligence tasks without you spending much time building the AI algorithms yourself. Take the Face API, for example. Normally, the most common way to do face detection/recognition is to use the Eigenface classification algorithm. For this, you need to know the basics of AI, such as regression and classification, as well as the underlying algorithms, such as SVD, neural networks and so on. Even if you use a library such as OpenCV, you still need some knowledge of Artificial Intelligence to make sure you use the correct set of parameters from the library.

Cognitive Services, however, make this demand largely obsolete and put the power of AI truly in the hands of everyday developers. You don’t need to learn complicated mathematics to use AI tools anymore. This is a good thing: as AI becomes more and more common in the software world, it should also be accessible to developers who may not have advanced degrees in science and mathematics but who know how to code. The downside? The APIs are somewhat restricted in what they can do. But this might change in the future (or so we should hope).

Today, I am going to talk about using the Face API from Microsoft Azure Cognitive Services to build a simple UWP application that can tell some of the characteristics of a face (such as age, emotion, smile, facial hair, etc.).

To do so, first open Visual Studio –> Create New Project –> select the Blank App (Universal Windows) template. This is your UWP application. To use Azure Cognitive Services in this project, you will have to do two things.

  1. Subscribe to the Face API in the Azure Cognitive Services portal – To do so, go to the Cognitive Services subscriptions page, sign in with your account, and select Face API. Once you do, you will see the following.

    (Screenshot: the Cognitive Services subscriptions page showing the Face API keys)

    You will need one of the keys shown here later in the code, when you create the Face API client.
  2. Go to Solution Explorer in Visual Studio –> Right-click on the project –> Manage NuGet Packages –> Search for Microsoft.ProjectOxford.Face and install it. Then, in MainPage.xaml.cs, add the following lines of code at the top.

        using Microsoft.ProjectOxford.Face;
        using Microsoft.ProjectOxford.Face.Contract;
        using Microsoft.ProjectOxford.Common.Contract;

Now, you are all set to create the Universal App for Face Detection using Cognitive Services.

To do so, you need to do the following things:

  1. Access the device camera
  2. Run the camera stream in the app
  3. Capture the image
  4. Call Face API.

We will look into them one by one.

First, let us access the camera and start streaming video in the app. To do so, add a CaptureElement in MainPage.xaml. See the following code snippet.

    <CaptureElement Name="PreviewControl" Stretch="Uniform" Margin="0,0,0,0" Grid.Row="0"/>

Then, create a Windows.Media.Capture.MediaCapture object and initialize it. Set it as the source of the CaptureElement and call the StartPreviewAsync() method. The following code snippet should make this clearer.

    try
    {
        m_mediaCapture = new MediaCapture();
        await m_mediaCapture.InitializeAsync();
        m_displayRequest.RequestActive();
        DisplayInformation.AutoRotationPreferences = DisplayOrientations.Landscape;
        PreviewControl.Source = m_mediaCapture;
        await m_mediaCapture.StartPreviewAsync();
        m_isPreviewing = true;
    }
    catch (Exception ex)
    {
        // Handle the exception (e.g., the user denied access to the camera)
    }
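
The snippet above relies on a few fields declared at class level. Here is a minimal sketch of them, assuming the same names as above (adjust to your own MainPage class):

    // Fields and namespaces the preview code above relies on (sketch).
    using Windows.Media.Capture;     // MediaCapture
    using Windows.System.Display;    // DisplayRequest
    using Windows.Graphics.Display;  // DisplayInformation, DisplayOrientations

    private MediaCapture m_mediaCapture;                                      // owns the camera stream
    private readonly DisplayRequest m_displayRequest = new DisplayRequest();  // keeps the display active while previewing
    private bool m_isPreviewing;                                              // tracks whether the preview is running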

This will start streaming video from the camera in your application. Now, to capture an image and process it, create a button in MainPage.xaml and add a Click event handler to it. In the event handler, use the FaceServiceClient, which you should initialize in your app's initialization code.

    FaceServiceClient fClient = new FaceServiceClient("<your subscription key here>");

Then, use an InMemoryRandomAccessStream to capture the photo with JPEG encoding, and call DetectAsync on the FaceServiceClient to get the information about the detected faces.

    using (var captureStream = new InMemoryRandomAccessStream())
    {
        await m_mediaCapture.CapturePhotoToStreamAsync(ImageEncodingProperties.CreateJpeg(), captureStream);
        captureStream.Seek(0);
        var faces = await fClient.DetectAsync(captureStream.AsStream(), returnFaceLandmarks: true, returnFaceAttributes: new FaceAttributes().GetAll());
    }
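
To make the flow concrete, here is a sketch of how that capture-and-detect code might sit inside the button's Click handler. The handler name, the fClient field, and the ShowFaceInfo helper are placeholders of mine, not names from the SDK:

    // Sketch of the capture button's Click handler.
    // Requires: using System.IO; using Windows.UI.Xaml; using Windows.Storage.Streams;
    //           using Windows.Media.MediaProperties;
    private async void CaptureButton_Click(object sender, RoutedEventArgs e)
    {
        using (var captureStream = new InMemoryRandomAccessStream())
        {
            await m_mediaCapture.CapturePhotoToStreamAsync(ImageEncodingProperties.CreateJpeg(), captureStream);
            captureStream.Seek(0);

            var faces = await fClient.DetectAsync(captureStream.AsStream(),
                returnFaceLandmarks: true,
                returnFaceAttributes: new FaceAttributes().GetAll());

            if (faces == null || faces.Length == 0)
            {
                return; // no face found in this frame
            }

            ShowFaceInfo(faces[0]); // update the UI with the first detected face
        }
    }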

returnFaceLandmarks and returnFaceAttributes are two important parameters that you need to set in order to get the full detection information from the API. When returnFaceLandmarks is set to true, you get the pixel locations of facial features such as the pupils, nose, and mouth. The information that comes back looks like the following.

  1. "faceLandmarks": {  
  2.     "pupilLeft": {  
  3.         "x": 504.4,  
  4.         "y": 202.8  
  5.     },  
  6.     "pupilRight": {  
  7.         "x": 607.7,  
  8.         "y": 175.9  
  9.     },  
  10.     "noseTip": {  
  11.         "x": 598.5,  
  12.         "y": 250.9  
  13.     },  
  14.     "mouthLeft": {  
  15.         "x": 527.7,  
  16.         "y": 298.9  
  17.     },  
  18.     "mouthRight": {  
  19.         "x": 626.4,  
  20.         "y": 271.5  
  21.     },  
  22.     "eyebrowLeftOuter": {  
  23.         "x": 452.3,  
  24.         "y": 191  
  25.     },  
  26.     "eyebrowLeftInner": {  
  27.         "x": 531.4,  
  28.         "y": 180.2  
  29.     }  
  30. }   
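
Back in C#, the contract types in Microsoft.ProjectOxford.Face.Contract expose the same data as strongly typed properties whose names mirror these JSON fields. For example, inside a helper like the hypothetical ShowFaceInfo above, you could read the pupil positions like this:

    // Reading a few landmark coordinates from a detected face (sketch).
    // Requires: using System; using Microsoft.ProjectOxford.Face.Contract;
    private void ShowFaceInfo(Face face)
    {
        var landmarks = face.FaceLandmarks;
        var interPupilDistance = Math.Abs(landmarks.PupilRight.X - landmarks.PupilLeft.X);

        System.Diagnostics.Debug.WriteLine($"Left pupil at ({landmarks.PupilLeft.X}, {landmarks.PupilLeft.Y})");
        System.Diagnostics.Debug.WriteLine($"Approximate inter-pupil distance: {interPupilDistance} px");
    }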

For returnFaceAttributes, you pass the FaceAttributeType values that you want returned, such as the following. In my application, I have created a class that returns all of them in a list through a method called GetAll(); a sketch of that helper follows the list below.

    FaceAttributeType.Age,
    FaceAttributeType.Emotion,
    FaceAttributeType.FacialHair,
    FaceAttributeType.Gender,
    FaceAttributeType.Glasses,
    FaceAttributeType.HeadPose,
    FaceAttributeType.Smile
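
That class is not part of the Face SDK; it is just a small convenience helper. A minimal version of it (matching the new FaceAttributes().GetAll() call above, though not necessarily the exact code from my app) could look like this:

    // Minimal sketch of the GetAll() helper: returns every attribute type we care about.
    // Note: it shares its name with a contract type, so keep it in your app's own namespace.
    using System.Collections.Generic;
    using Microsoft.ProjectOxford.Face;

    public class FaceAttributes
    {
        public IEnumerable<FaceAttributeType> GetAll()
        {
            return new List<FaceAttributeType>
            {
                FaceAttributeType.Age,
                FaceAttributeType.Emotion,
                FaceAttributeType.FacialHair,
                FaceAttributeType.Gender,
                FaceAttributeType.Glasses,
                FaceAttributeType.HeadPose,
                FaceAttributeType.Smile
            };
        }
    }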

The returned faceAttributes will look something like this.

  1. "faceAttributes": {  
  2.     "age": 23.8,  
  3.     "gender""female",  
  4.     "headPose": {  
  5.         "roll": -16.9,  
  6.         "yaw": 21.3,  
  7.         "pitch": 0  
  8.     },  
  9.     "smile": 0.826,  
  10.     "facialHair": {  
  11.         "moustache": 0,  
  12.         "beard": 0,  
  13.         "sideburns": 0  
  14.     },  
  15.     "glasses""ReadingGlasses",  
  16.     "emotion": {  
  17.         "anger": 0.103,  
  18.         "contempt": 0.003,  
  19.         "disgust": 0.038,  
  20.         "fear": 0.003,  
  21.         "happiness": 0.826,  
  22.         "neutral": 0.006,  
  23.         "sadness": 0.001,  
  24.         "surprise": 0.02  
  25.     }  
  26. }   

The smile and emotion scores are floating-point values between 0 and 1, where 1 is the strongest signal and 0 the weakest. In my application, I wrote a basic threshold method that displays in the UI whether I am smiling, angry, happy, and so on, based on these values.
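
The threshold logic does not need to be fancy. Here is a minimal sketch of such a helper (the cut-off value and the labels are arbitrary choices, not the exact code from my app):

    // Sketch of a basic threshold helper: pick the strongest emotion score and
    // report it only if it crosses an (arbitrary) confidence cut-off.
    // Requires: using System.Collections.Generic; using System.Linq;
    //           using Microsoft.ProjectOxford.Common.Contract;
    private string DescribeEmotion(EmotionScores scores, float threshold = 0.5f)
    {
        var candidates = new Dictionary<string, float>
        {
            { "angry",     scores.Anger },
            { "happy",     scores.Happiness },
            { "sad",       scores.Sadness },
            { "surprised", scores.Surprise },
            { "neutral",   scores.Neutral }
        };

        var strongest = candidates.OrderByDescending(kv => kv.Value).First();
        return strongest.Value >= threshold ? strongest.Key : "hard to tell";
    }

Calling it with faces[0].FaceAttributes.Emotion (and applying similar checks to Smile, Glasses and FacialHair) is enough to drive the labels shown in the screenshot below.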

The final result looks like this.

(Screenshot: the running app showing the camera preview together with the detected face attributes)

It failed to detect my age (probably because, when I tested it, I was using a warm light bulb for lighting. Tip: lighting matters a lot for age estimation with the Face API; use cold lights if you want to look younger). But apart from that, the other information was quite accurate. I was indeed smiling, my face read as happy, I wasn’t wearing glasses, and I do have some beard.

At first look, the Face API seems really interesting. You can do a lot more with the other endpoints of the API, such as verification and identification. I will try to cover these other functions in the next posts.

Until then, keep learning.  

