Intelligent Image Object Detection Bot Using Cognitive Computer Vision API

Introduction

Microsoft Cognitive Services is a set of cloud-based intelligence APIs for building richer, smarter applications. The APIs can be used to search metadata from photos and videos, detect emotions, analyze sentiment, and authenticate speakers via voice verification.

The Computer Vision API gives developers access to advanced algorithms that process images and return image metadata, which helps identify the objects in an image. In this article, you will learn about the Computer Vision API and how to integrate it into a Bot application.

Follow the steps below to implement object detection in a Bot application.

Computer Vision API Key Creation

The Computer Vision API returns information about the visual content found in an image. Follow the steps below to create a Vision API key.

  1. Navigate to https://azure.microsoft.com/en-us/try/cognitive-services/

  2. Click “Get API Key” or log in with your Azure account.
  3. Log in with your Microsoft account and get the API key.

  4. Copy the API key and store it securely; we will use this key in our application.
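
One possible way to keep the key out of source code is to read it from an environment variable at startup. The sketch below is only an illustration: the class name VisionSettings and the variable name VISION_API_KEY are assumptions, not part of the original article or of the Vision library.

```csharp
using System;

public static class VisionSettings
{
    // Hypothetical variable name chosen for this sketch; use whatever your deployment defines.
    public const string KeyVariable = "VISION_API_KEY";

    /// <summary>
    /// Reads the Computer Vision API key from the environment and fails early if it is missing.
    /// </summary>
    public static string GetApiKey()
    {
        string key = Environment.GetEnvironmentVariable(KeyVariable);
        if (string.IsNullOrWhiteSpace(key))
        {
            throw new InvalidOperationException(
                "Set the " + KeyVariable + " environment variable to your Computer Vision API key.");
        }

        // Trim stray whitespace that often sneaks in when a key is copied and pasted.
        return key.Trim();
    }
}
```

With something like this in place, the service class could call VisionSettings.GetApiKey() instead of embedding the key as a string literal.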

Create New Bot Application

Let's create a new bot application using Visual Studio 2017. Open Visual Studio > select File > Create New Project (Ctrl + Shift + N) > select Bot Application.

The Bot application template is created with all the components and required NuGet references installed in the solution.

In this solution, we are going to edit MessagesController and add a service class.

Install Microsoft.ProjectOxford.Vision NuGet Package

The Microsoft.ProjectOxford.Vision NuGet package provides access to the Cognitive Services Vision API, so install the “Microsoft.ProjectOxford.Vision” library in the solution.

Create Vision Service

Add a new helper class to the project called VisionService. It wraps the functionality of the VisionServiceClient from Cognitive Services and returns only what we currently need.

    using System.IO;
    using System.Linq;
    using System.Threading.Tasks;
    using Microsoft.ProjectOxford.Vision;
    using Microsoft.ProjectOxford.Vision.Contract;

    namespace BotObjectDetection.Service
    {
        public class VisionService : ICaptionService
        {
            /// <summary>
            /// Microsoft Computer Vision API key.
            /// </summary>
            private static readonly string ApiKey = "<API Key>";

            /// <summary>
            /// The set of visual features we want from the Vision API.
            /// </summary>
            private static readonly VisualFeature[] VisualFeatures = { VisualFeature.Description };

            /// <summary>
            /// Gets a caption for the image at the given URL.
            /// </summary>
            public async Task<string> GetCaptionAsync(string url)
            {
                var client = new VisionServiceClient(ApiKey);
                var result = await client.AnalyzeImageAsync(url, VisualFeatures);
                return ProcessAnalysisResult(result);
            }

            /// <summary>
            /// Gets a caption for the image contained in the given stream.
            /// </summary>
            public async Task<string> GetCaptionAsync(Stream stream)
            {
                var client = new VisionServiceClient(ApiKey);
                var result = await client.AnalyzeImageAsync(stream, VisualFeatures);
                return ProcessAnalysisResult(result);
            }

            /// <summary>
            /// Processes the analysis result.
            /// </summary>
            /// <param name="result">The result.</param>
            /// <returns>The caption if found, error message otherwise.</returns>
            private static string ProcessAnalysisResult(AnalysisResult result)
            {
                // Guard every step: the result, its description, or the caption list may be null.
                string message = result?.Description?.Captions?.FirstOrDefault()?.Text;

                return string.IsNullOrEmpty(message) ?
                            "Couldn't find a caption for this one" :
                            "I think it's " + message;
            }
        }
    }

In the above helper class, replace the placeholder with your Vision API key. The class calls the client's AnalyzeImageAsync method to retrieve the image metadata.
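
VisionService implements an ICaptionService interface that the listing does not show. A minimal sketch of what that interface might look like, inferred only from the two public methods of VisionService, is given below. The FixedCaptionService stub is a hypothetical addition for trying the bot without an API key; it is not part of the original project.

```csharp
using System.IO;
using System.Threading.Tasks;

namespace BotObjectDetection.Service
{
    /// <summary>
    /// Assumed shape of the caption service contract, inferred from VisionService.
    /// </summary>
    public interface ICaptionService
    {
        /// <summary>Returns a caption for the image at the given URL.</summary>
        Task<string> GetCaptionAsync(string url);

        /// <summary>Returns a caption for the image contained in the stream.</summary>
        Task<string> GetCaptionAsync(Stream stream);
    }

    /// <summary>
    /// Hypothetical stub that always returns the same caption, so the bot
    /// can be exercised locally without calling the Vision API.
    /// </summary>
    public class FixedCaptionService : ICaptionService
    {
        private readonly string caption;

        public FixedCaptionService(string caption)
        {
            this.caption = caption;
        }

        public Task<string> GetCaptionAsync(string url)
        {
            return Task.FromResult("I think it's " + this.caption);
        }

        public Task<string> GetCaptionAsync(Stream stream)
        {
            return Task.FromResult("I think it's " + this.caption);
        }
    }
}
```

Because the controller depends only on the interface, swapping the stub in for VisionService would be a one-line change in the field initializer.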

Messages Controller

MessagesController is created by default and is the main entry point of the application. It calls our helper service class, which handles the interaction with the Microsoft APIs. Update the “Post” method as shown below.

    using System;
    using System.IO;
    using System.Linq;
    using System.Net;
    using System.Net.Http;
    using System.Text.RegularExpressions;
    using System.Threading.Tasks;
    using System.Web.Http;
    using BotObjectDetection.Service;
    using Microsoft.Bot.Connector;

    namespace BotObjectDetection
    {
        [BotAuthentication]
        public class MessagesController : ApiController
        {
            private readonly ICaptionService captionService = new VisionService();

            /// <summary>
            /// POST: api/Messages
            /// Receive a message from a user and reply to it
            /// </summary>
            public async Task<HttpResponseMessage> Post([FromBody]Activity activity)
            {
                if (activity.Type == ActivityTypes.Message)
                {
                    var connector = new ConnectorClient(new Uri(activity.ServiceUrl));
                    string message;
                    try
                    {
                        message = await this.GetCaptionAsync(activity, connector);
                    }
                    catch (Exception)
                    {
                        // No usable image was found or the analysis failed: explain what the bot expects.
                        message = "I am an object detection bot. You can upload an image or share an image URL.";
                    }

                    Activity reply = activity.CreateReply(message);
                    await connector.Conversations.ReplyToActivityAsync(reply);
                }
                else
                {
                    HandleSystemMessage(activity);
                }

                var response = Request.CreateResponse(HttpStatusCode.OK);
                return response;
            }

            private Activity HandleSystemMessage(Activity message)
            {
                if (message.Type == ActivityTypes.DeleteUserData)
                {
                    // Implement user deletion here
                    // If we handle user deletion, return a real message
                }
                else if (message.Type == ActivityTypes.ConversationUpdate)
                {
                    // Handle conversation state changes, like members being added and removed
                    // Use Activity.MembersAdded and Activity.MembersRemoved and Activity.Action for info
                    // Not available in all channels
                }
                else if (message.Type == ActivityTypes.ContactRelationUpdate)
                {
                    // Handle add/remove from contact lists
                    // Activity.From + Activity.Action represent what happened
                }
                else if (message.Type == ActivityTypes.Typing)
                {
                    // Handle knowing that the user is typing
                }
                else if (message.Type == ActivityTypes.Ping)
                {
                }

                return null;
            }

            private async Task<string> GetCaptionAsync(Activity activity, ConnectorClient connector)
            {
                // 1. Prefer an uploaded image attachment.
                var imageAttachment = activity.Attachments?.FirstOrDefault(
                    a => a.ContentType != null && a.ContentType.Contains("image"));
                if (imageAttachment != null)
                {
                    using (var stream = await GetImageStream(connector, imageAttachment))
                    {
                        return await this.captionService.GetCaptionAsync(stream);
                    }
                }

                // 2. Some channels wrap a pasted URL in an anchor tag.
                string url;
                if (TryParseAnchorTag(activity.Text, out url))
                {
                    return await this.captionService.GetCaptionAsync(url);
                }

                // 3. Otherwise accept the message text if it is a plain absolute URL.
                if (Uri.IsWellFormedUriString(activity.Text, UriKind.Absolute))
                {
                    return await this.captionService.GetCaptionAsync(activity.Text);
                }

                // If we reach here then the activity is neither an image attachment nor an image URL.
                throw new ArgumentException("The activity doesn't contain a valid image attachment or an image URL.");
            }

            private static async Task<Stream> GetImageStream(ConnectorClient connector, Attachment imageAttachment)
            {
                using (var httpClient = new HttpClient())
                {
                    var uri = new Uri(imageAttachment.ContentUrl);

                    // Buffer the image so the returned stream stays readable after httpClient is disposed.
                    byte[] bytes = await httpClient.GetByteArrayAsync(uri);
                    return new MemoryStream(bytes);
                }
            }

            private static bool TryParseAnchorTag(string text, out string url)
            {
                // A message without text (for example, an attachment only) cannot contain an anchor tag.
                if (string.IsNullOrEmpty(text))
                {
                    url = null;
                    return false;
                }

                var regex = new Regex("^<a href=\"(?<href>[^\"]*)\">[^<]*</a>$", RegexOptions.IgnoreCase);
                url = regex.Matches(text).OfType<Match>().Select(m => m.Groups["href"].Value).FirstOrDefault();
                return url != null;
            }
        }
    }
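
To see how the controller decides whether a text message carries an image URL, the standalone sketch below reuses the same regular expression and URI check outside the bot. The InputParsingDemo class and its Classify helper are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class InputParsingDemo
{
    // Same pattern as the controller: a single anchor tag whose href is captured.
    private static readonly Regex AnchorTag =
        new Regex("^<a href=\"(?<href>[^\"]*)\">[^<]*</a>$", RegexOptions.IgnoreCase);

    public static bool TryParseAnchorTag(string text, out string url)
    {
        // Treat a missing message text as "no anchor tag" instead of throwing.
        url = AnchorTag.Matches(text ?? string.Empty)
            .OfType<Match>()
            .Select(m => m.Groups["href"].Value)
            .FirstOrDefault();
        return url != null;
    }

    // Hypothetical helper: reports which branch of the controller logic would handle the text.
    public static string Classify(string text)
    {
        string url;
        if (TryParseAnchorTag(text, out url))
        {
            return "anchor:" + url;
        }

        if (Uri.IsWellFormedUriString(text, UriKind.Absolute))
        {
            return "url:" + text;
        }

        return "none";
    }
}
```

For example, Classify("<a href=\"http://example.com/cat.jpg\">cat</a>") yields "anchor:http://example.com/cat.jpg", a bare "http://example.com/cat.jpg" yields "url:http://example.com/cat.jpg", and ordinary chat text such as "hello bot" yields "none", which is the case where the bot replies with its help message.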

Run Bot Application

The emulator is a desktop application that lets us test and debug our bot on localhost. Now, click "Run the application" in Visual Studio to launch the bot in the browser.

Test Application on Bot Emulator

Follow the steps below to test your bot application.

  1. Open Bot Emulator.
  2. Copy the above localhost URL and paste it into the emulator, e.g. http://localhost:3979
  3. Append /api/messages to the URL, e.g. http://localhost:3979/api/messages
  4. You don't need to specify the Microsoft App ID and Microsoft App Password for localhost testing, so click "Connect".


Related Articles

I have explained Bot Framework installation, deployment, and implementation in the articles below:

  1. Getting Started with Chatbot Using Azure Bot Service
  2. Getting Started with Bots Using Visual Studio 2017
  3. Deploying A Bot to Azure Using Visual Studio 2017
  4. How to Create ChatBot In Xamarin
  5. Getting Started with Dialog Using Microsoft Bot Framework
  6. Getting Started with Prompt Dialog Using Microsoft Bot Framework
  7. Getting Started With Conversational Forms And FormFlow Using Microsoft Bot Framework
  8. Getting Started With Customizing A FormFlow Using Microsoft Bot Framework
  9. Sending Bot Reply Message With Attachment Using Bot Framework
  10. Getting Started With Hero Card Design Using Microsoft Bot Framework
  11. Getting Started With Thumbnail Card Design Using Microsoft Bot Framework
  12. Getting Started With Adaptive Card Design Using Microsoft Bot Framework
  13. Getting Started with Receipt Card Design Using Microsoft Bot Framework
  14. Building Bot Application With Azure AD Login Authentication Using AuthBot
  15. Building Chat Bots With Bing Search Results Using Bot Framework

Summary

In this article, you learned how to create an intelligent image object detection bot using the Microsoft Cognitive Services Computer Vision API. If you have any questions, feedback, or issues, please write in the comment box.

