Image Text Sentiment Analysis Using Vision And Text Analytics API


Introduction

With businesses increasingly promoting themselves on social media platforms such as Facebook, Twitter, and LinkedIn, analyzing how users feel and what feedback they share about a business is of paramount importance. One such scenario is where users post their feedback as images on the company's social media pages. Analyzing these posts gives a company an overall idea of user sentiment towards its services, which in turn helps it plan its marketing strategy.

Microsoft Cognitive Services

As per Microsoft,

Microsoft Cognitive Services (formerly Project Oxford) are a set of APIs, SDKs and services available to the developers to make their applications more intelligent, engaging and discoverable. Microsoft Cognitive Services expands on Microsoft's evolving portfolio of machine learning APIs and enables developers to easily add intelligent features such as emotion and video detection; facial, speech and vision recognition; and speech and language understanding - into their applications. Our vision is for more personal computing experiences and enhanced productivity aided by systems that increasingly can see, hear, speak, understand and even begin to reason.

Scope

This article explains sentiment analysis of text detected in an image, using a sample console application. It assumes that the reader is familiar with the basic concepts of C# and knows how to consume REST APIs from C# code. In the sample application, each image in a SampleImages folder is read and sent to the Computer Vision API to detect the text it contains. Once the Computer Vision API returns its JSON payload, the detected text is sent to the Text Analytics API to obtain a sentiment score.

Pre Implementation

Computer Vision API Account

To use the Computer Vision API in the sample application, an API account for the Computer Vision API must first be created. Once this is done, the API can be integrated into the sample application. The following screenshot shows the process.

Once the API account is created, select it from the dashboard to open the following window. The access keys and endpoint shown in this window are required to connect to the Computer Vision API.

Text Analytics API Account

To use the Text Analytics API in the sample application, an API account for the Text Analytics API must first be created. Once this is done, the API can be integrated into the sample application. The following screenshot shows the process.

Once the API account is created, select it from the dashboard to open the following window. The access keys and endpoint shown in this window are required to connect to the Text Analytics API.

Design - Constants

The Constants class stores the constant values that are used throughout the application. The following is the Constants class used in the application.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace ImageTextSentimentDetection
{
    public class Constants
    {
        /// <summary>
        /// Key name that is used to pass the subscription key value in the request headers
        /// </summary>
        public static string SubscriptionKeyName = "Ocp-Apim-Subscription-Key";
        /// <summary>
        /// Uri for the Computer Vision API to which the request will be routed
        /// </summary>
        public static string VisionApiUri = "https://southeastasia.api.cognitive.microsoft.com/vision/v1.0";
        /// <summary>
        /// Uri for the Text Analytics API to which the request will be routed
        /// </summary>
        public static string TextAnalyticsApiUri = "https://southeastasia.api.cognitive.microsoft.com/text/analytics/v2.0";
        /// <summary>
        /// Path to the SampleImages folder
        /// </summary>
        public static string SampleImagesFolderPath = @"..\..\SampleImages";
        /// <summary>
        /// Vision API subscription key value. This needs to be populated before starting the sample
        /// </summary>
        public static string VisionApiSubcriptionKey = "Enter key here";
        /// <summary>
        /// Text Analytics subscription key value. This needs to be populated before starting the sample
        /// </summary>
        public static string TextAnalyticsApiSubscriptionKey = "Enter key here";
    }
}
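Hard-coded subscription keys are easy to leak through version control. As a minimal sketch, and not part of the original sample, the keys could instead be read from environment variables. The ApiKeys class and the variable names VISION_API_KEY and TEXT_ANALYTICS_API_KEY are hypothetical and chosen only for illustration.

```csharp
using System;

// Hypothetical helper, not part of the original sample.
public static class ApiKeys
{
    // Returns the value of the named environment variable, or throws a
    // descriptive exception when the variable is missing or blank.
    public static string Require(string variableName)
    {
        var value = Environment.GetEnvironmentVariable(variableName);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                "Environment variable '" + variableName + "' is not set.");
        }
        return value;
    }

    // Assumed variable names; use whatever fits the deployment environment.
    public static string VisionApiKey => Require("VISION_API_KEY");
    public static string TextAnalyticsApiKey => Require("TEXT_ANALYTICS_API_KEY");
}
```

With such a helper, the helper classes would pass ApiKeys.VisionApiKey or ApiKeys.TextAnalyticsApiKey in the request header instead of the values stored in the Constants class.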

Computer Vision API Response Class

This class is used to convert the JSON payload received from the Computer Vision API into a C# class which can be used later on for extracting text. The class is generated from a sample JSON payload received from the Computer Vision API. The class structure will need to change if the structure of the response changes.

// To parse this JSON data, add NuGet 'Newtonsoft.Json' then do:
//
// using ImageTextSentimentDetection;
//
// var data = ComputerVisionSuccessFullResponseClass.FromJson(jsonString);
namespace ImageTextSentimentDetection
{
    using System;
    using System.Net;
    using System.Collections.Generic;
    using Newtonsoft.Json;
    public partial class ComputerVisionSuccessFullResponseClass
    {
        [JsonProperty("language")]
        public string Language { get; set; }
        [JsonProperty("textAngle")]
        public double TextAngle { get; set; }
        [JsonProperty("orientation")]
        public string Orientation { get; set; }
        [JsonProperty("regions")]
        public Region[] Regions { get; set; }
    }
    public partial class Region
    {
        [JsonProperty("boundingBox")]
        public string BoundingBox { get; set; }
        [JsonProperty("lines")]
        public Line[] Lines { get; set; }
    }
    public partial class Line
    {
        [JsonProperty("boundingBox")]
        public string BoundingBox { get; set; }
        [JsonProperty("words")]
        public Word[] Words { get; set; }
    }
    public partial class Word
    {
        [JsonProperty("boundingBox")]
        public string BoundingBox { get; set; }
        [JsonProperty("text")]
        public string Text { get; set; }
    }
    public partial class ComputerVisionSuccessFullResponseClass
    {
        public static ComputerVisionSuccessFullResponseClass FromJson(string json)
        {
            return JsonConvert.DeserializeObject<ComputerVisionSuccessFullResponseClass>(json, Converter.Settings);
        }
    }
    public static class Serialize
    {
        public static string ToJson(this ComputerVisionSuccessFullResponseClass self)
        {
            return JsonConvert.SerializeObject(self, Converter.Settings);
        }
    }
    public class Converter
    {
        public static readonly JsonSerializerSettings Settings = new JsonSerializerSettings
        {
            MetadataPropertyHandling = MetadataPropertyHandling.Ignore,
            DateParseHandling = DateParseHandling.None,
        };
    }
}
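To make the shape of the payload concrete, the following sketch parses a small hand-written sample that follows the regions, lines and words structure mapped by the classes above, and joins the words into one string. The sample JSON is illustrative and is not real API output. The sketch uses System.Text.Json (assuming .NET Core 3.0 or later) only to stay free of external packages; the sample application itself uses Newtonsoft.Json. The OcrPayloadSketch class is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical illustration, not part of the original sample.
public static class OcrPayloadSketch
{
    // Hand-written sample in the regions/lines/words shape; not real API output.
    public const string SampleJson = @"{
      ""language"": ""en"",
      ""textAngle"": 0.0,
      ""orientation"": ""Up"",
      ""regions"": [
        { ""boundingBox"": ""10,10,200,50"",
          ""lines"": [
            { ""boundingBox"": ""10,10,200,20"",
              ""words"": [
                { ""boundingBox"": ""10,10,90,20"", ""text"": ""Great"" },
                { ""boundingBox"": ""110,10,100,20"", ""text"": ""service"" }
              ] }
          ] }
      ]
    }";

    // Walks regions -> lines -> words and joins the word texts with single spaces.
    public static string ExtractText(string json)
    {
        var words = new List<string>();
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            foreach (JsonElement region in doc.RootElement.GetProperty("regions").EnumerateArray())
            {
                foreach (JsonElement line in region.GetProperty("lines").EnumerateArray())
                {
                    foreach (JsonElement word in line.GetProperty("words").EnumerateArray())
                    {
                        words.Add(word.GetProperty("text").GetString());
                    }
                }
            }
        }
        return string.Join(" ", words);
    }
}
```

Calling OcrPayloadSketch.ExtractText(OcrPayloadSketch.SampleJson) returns the string "Great service". Joining with string.Join avoids the leading space that repeated concatenation would produce.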

Text Analytics Response Class

This class is used to convert the JSON payload received from the Text Analytics API into a C# class which can be used later on for extracting the sentiment score. The class is generated from a sample JSON payload received from the Text Analytics API. The class structure will need to change if the structure of the response changes.

// To parse this JSON data, add NuGet 'Newtonsoft.Json' then do:
//
// using ImageTextSentimentDetection;
//
// var data = TextAnalyticsResponseClass.FromJson(jsonString);
namespace ImageTextSentimentDetection
{
    using System;
    using System.Net;
    using System.Collections.Generic;
    using Newtonsoft.Json;
    public partial class TextAnalyticsResponseClass
    {
        [JsonProperty("documents")]
        public Document[] Documents { get; set; }
        [JsonProperty("errors")]
        public Error[] Errors { get; set; }
    }
    public partial class Document
    {
        // The sentiment score is a fractional value between 0 and 1,
        // so it is mapped to a double rather than an integer type.
        [JsonProperty("score")]
        public double Score { get; set; }
        [JsonProperty("id")]
        public string Id { get; set; }
    }
    public partial class Error
    {
        [JsonProperty("id")]
        public string Id { get; set; }
        [JsonProperty("message")]
        public string Message { get; set; }
    }
    public partial class TextAnalyticsResponseClass
    {
        public static TextAnalyticsResponseClass FromJson(string json)
        {
            return JsonConvert.DeserializeObject<TextAnalyticsResponseClass>(json, Converter.Settings);
        }
    }
}

Helper Classes

Helper classes are created to cater to each of the Cognitive API calls. One class caters to the Computer Vision API call while the other class caters to the Text Analytics API call. Each of them is discussed below.

VisionHelper

This class caters to the calls made to the Computer Vision API. In this current sample, the DetectTextInImage method in the class is used to call the Computer Vision API. This method passes the image to the API in 'application/octet-stream' format. It deserializes the JSON payload into the class created earlier and extracts out the detected text. The class is as follows.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json.Serialization;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Web;
namespace ImageTextSentimentDetection
{
    public class VisionHelper
    {
        public static string DetectTextInImage(byte[] imageBytes)
        {
            string detectedText = String.Empty;
            var queryString = HttpUtility.ParseQueryString(String.Empty);
            HttpClient client = new HttpClient();
            using (var content = new ByteArrayContent(imageBytes))
            {
                client.DefaultRequestHeaders.Add(Constants.SubscriptionKeyName, Constants.VisionApiSubcriptionKey);
                // "unk" asks the service to detect the language automatically.
                queryString["language"] = "unk";
                // Do not try to detect and correct the text orientation.
                queryString["detectOrientation"] = "false";
                // The image is sent as raw bytes.
                content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
                var uriString = Constants.VisionApiUri + "/ocr?" + queryString;
                // .Result blocks the calling thread until the call completes.
                HttpResponseMessage response = client.PostAsync(uriString, content).Result;
                var responseJson = response.Content.ReadAsStringAsync().Result;
                if (response.StatusCode == System.Net.HttpStatusCode.OK)
                {
                    // Walk regions -> lines -> words and collect the text of every word.
                    var data = ComputerVisionSuccessFullResponseClass.FromJson(responseJson);
                    foreach (Region region in data.Regions)
                    {
                        foreach (Line line in region.Lines)
                        {
                            foreach (Word word in line.Words)
                            {
                                detectedText = detectedText + " " + word.Text;
                            }
                        }
                    }
                }
                else
                {
                    // The console application looks for the word "Error" in the returned string.
                    detectedText = "Error Occurred While Calling Computer Vision Api";
                }
                return detectedText;
            }
        }
    }
}
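The helper above blocks on .Result and creates a new HttpClient for every call. As a hedged alternative, and not the author's implementation, the same request could be sent asynchronously through a single shared HttpClient. The VisionRequestSketch class, its method names and its parameters are hypothetical; the header name and the language and detectOrientation query parameters are the ones used in the sample.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Hypothetical illustration, not part of the original sample.
public static class VisionRequestSketch
{
    // A single shared HttpClient is reused for all requests.
    private static readonly HttpClient Client = new HttpClient();

    // Builds the OCR request URI with the same query parameters as the sample.
    public static string BuildOcrUri(string baseUri, string language, bool detectOrientation)
    {
        return baseUri.TrimEnd('/')
            + "/ocr?language=" + Uri.EscapeDataString(language)
            + "&detectOrientation=" + (detectOrientation ? "true" : "false");
    }

    // Posts the image bytes and returns the raw JSON response body.
    // Throws HttpRequestException when the service returns a non-success status.
    public static async Task<string> PostImageAsync(string baseUri, string subscriptionKey, byte[] imageBytes)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Post, BuildOcrUri(baseUri, "unk", false)))
        {
            request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
            request.Content = new ByteArrayContent(imageBytes);
            request.Content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
            using (HttpResponseMessage response = await Client.SendAsync(request))
            {
                response.EnsureSuccessStatusCode();
                return await response.Content.ReadAsStringAsync();
            }
        }
    }
}
```

Throwing on a failed call, rather than returning a string that contains "Error", also avoids confusing an error with an image whose text happens to contain that word.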

SentimentHelper

This class caters to the Text Analytics API call from the Sample Application. In the current context, the DetectSentiment method in the class accepts the detected string as input. It then creates the JSON payload and then invokes the Text Analytics API. It deserializes the JSON payload received from the API call into the response class and extracts out the sentiment score. This method is designed to pass only one detected text at a time to the Text Analytics API, but it can be modified as required. The class is as follows.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Web;
namespace ImageTextSentimentDetection
{
    public class SentimentHelper
    {
        public static string DetectSentiment(string detectedText)
        {
            string sentimentScore = String.Empty;
            // Serialize the payload with Json.NET so that quotes and backslashes
            // in the detected text are escaped correctly.
            string jsonText = JsonConvert.SerializeObject(new
            {
                documents = new[] { new { language = "en", id = "1", text = detectedText } }
            });
            HttpClient client = new HttpClient();
            byte[] detectedTextBytes = Encoding.UTF8.GetBytes(jsonText);
            using (var content = new ByteArrayContent(detectedTextBytes))
            {
                var queryString = HttpUtility.ParseQueryString(String.Empty);
                client.DefaultRequestHeaders.Add(Constants.SubscriptionKeyName, Constants.TextAnalyticsApiSubscriptionKey);
                var uriString = Constants.TextAnalyticsApiUri + "/sentiment?" + queryString;
                content.Headers.ContentType = new MediaTypeHeaderValue("application/json");
                // .Result blocks the calling thread until the call completes.
                HttpResponseMessage response = client.PostAsync(uriString, content).Result;
                if (response.StatusCode == System.Net.HttpStatusCode.OK)
                {
                    var data = TextAnalyticsResponseClass.FromJson(response.Content.ReadAsStringAsync().Result);
                    // Only one document is sent, so the loop yields a single score.
                    foreach (var document in data.Documents)
                    {
                        sentimentScore = Convert.ToString(document.Score);
                    }
                }
                else
                {
                    // The console application looks for the word "Error" in the returned string.
                    sentimentScore = "Error";
                }
            }
            return sentimentScore;
        }
    }
}
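As noted above, the method sends one text per call but can be modified as required. The following sketch shows one possible way to build a request body that carries several texts, each with its own id, in the same documents structure. The SentimentPayloadSketch class is hypothetical, and System.Text.Json is used (assuming .NET Core 3.0 or later) only to keep the sketch free of external packages; JsonConvert.SerializeObject from Newtonsoft.Json could play the same role in the sample.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical illustration, not part of the original sample.
public static class SentimentPayloadSketch
{
    // Builds a {"documents":[{"language":..,"id":..,"text":..}, ...]} body.
    // Ids are assigned as "1", "2", ... in input order, and the serializer
    // takes care of escaping quotes and other special characters in the text.
    public static string BuildPayload(IEnumerable<string> texts, string language = "en")
    {
        var documents = texts
            .Select((text, index) => new
            {
                language = language,
                id = (index + 1).ToString(),
                text = text
            })
            .ToArray();
        return JsonSerializer.Serialize(new { documents });
    }
}
```

Each entry in the response can then be matched back to its input text through the id field.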

Console Application

The console application greets the user and advises them to ensure that the subscription keys for the Computer Vision API and the Text Analytics API have been generated and added to the Constants class. It then asks the user to confirm with a Y/N key press. On confirmation, it reads the files stored in the SampleImages folder and calls the helpers on each of the images. The following is the console application class.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
using System.Reflection;
using System.Globalization;
namespace ImageTextSentimentDetection
{
    public class ImageTextSentimentDetectionApp
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Welcome to the Image Text Detection Sample Application");
            Console.WriteLine("This Application will Get the Images From the SampleImages Folder");
            Console.WriteLine("Once the Image is read, the Sample Application Will call Computer Vision Api and Try to Detect Text in the Image");
            Console.WriteLine("On Successful Detection of Text, the Application will use Text Analytics To Detect The Sentiment of the Text");
            Console.WriteLine("To Run this Sample Application, You need to generate the Subscription Key for the Computer Vision and Text Analytics Api.");
            Console.WriteLine("Do You Wish To Continue? Press Y for yes and N for No");
            try
            {
                if (Console.ReadKey().Key == ConsoleKey.Y)
                {
                    Console.WriteLine(Environment.NewLine);
                    Console.WriteLine("Starting The Application Flow");
                    // Only .jpg files in the SampleImages folder are processed.
                    foreach (string imageFile in Directory.GetFiles(Constants.SampleImagesFolderPath, "*.jpg"))
                    {
                        var imageFileInfo = new FileInfo(imageFile);
                        byte[] imageBytes = File.ReadAllBytes(imageFileInfo.FullName);
                        Console.WriteLine("File Name of the File Read :" + imageFileInfo.Name);
                        Console.WriteLine("Initiating the Image Text Detection");
                        string detectedText = VisionHelper.DetectTextInImage(imageBytes);
                        // The helpers signal a failed call with a string containing "Error";
                        // processing stops at the first failure.
                        if (detectedText.Contains("Error"))
                            break;
                        Console.WriteLine("Text Detected in Image:" + detectedText);
                        Console.WriteLine("Starting Sentiment Score analysis for the detected text");
                        string sentimentScore = SentimentHelper.DetectSentiment(detectedText);
                        if (sentimentScore.Contains("Error"))
                            break;
                        Console.WriteLine("Sentiment Score for the Detected Text is :" + sentimentScore);
                    }
                }
                else
                {
                    Console.WriteLine("Thank You For Visiting. The Application Will Exit Now.");
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine("Sorry an Exception Was Caught!!");
                Console.WriteLine("Following are the Exception Details");
                Console.WriteLine(ex.Message);
            }
            Console.WriteLine("Press Any Key To Exit");
            Console.ReadKey();
        }
    }
}
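The sentiment score printed by the application is a value between 0 and 1, where values close to 1 indicate positive sentiment and values close to 0 indicate negative sentiment. As a small sketch, the score could be turned into a readable label. The SentimentLabelSketch class and the 0.4 and 0.6 cut-offs are assumptions made for illustration and are not defined by the API.

```csharp
using System;

// Hypothetical illustration, not part of the original sample.
public static class SentimentLabelSketch
{
    // Maps a score in [0, 1] to "negative", "neutral" or "positive".
    // The default thresholds are arbitrary and should be tuned per use case.
    public static string Describe(double score, double negativeBelow = 0.4, double positiveAbove = 0.6)
    {
        if (double.IsNaN(score) || score < 0.0 || score > 1.0)
        {
            throw new ArgumentOutOfRangeException(nameof(score), "Score is expected to be between 0 and 1.");
        }
        if (score < negativeBelow) return "negative";
        if (score > positiveAbove) return "positive";
        return "neutral";
    }
}
```

With the default thresholds, a score of 0.95 is described as positive, 0.5 as neutral and 0.1 as negative.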

Testing

The following images were used to test the sample application.


The result obtained from the sample console application is as follows:

The results of the text analysis for the same detected text, when run on the Microsoft demo page (for details, see References), are shown below.

The visual result for the sentiment analysis of the text in the first test image is shown below.

The visual result for the sentiment analysis of the text in the second test image is shown below.

Conclusion

The test results observed from the sample console application are the same as those from the Microsoft demo tool, which shows that the application consumes the Computer Vision API and the Text Analytics API successfully.