Building an Image Processor Using Microsoft Bot Framework

Hari Haran

Introduction

Microsoft Bot Framework is a game-changer for many applications, and it is very useful for performing valuable operations on top of data. Bots and machine learning have been in huge demand recently. I recently participated in the Global Azure Bootcamp as a presenter, which gave me the chance to learn a lot about building a bot, and now I am sharing it with you all.

Prerequisites

Microsoft Visual Studio ✌️
Azure Bot Framework Emulator 🎮
An Azure Subscription 🎆
Patience to read this article 🍁

Objectives or Project Goals

My goal is simple: give the bot an image or an image URL, and have it reply with an analysis of that image.

Getting Started:

Let's go to Visual Studio and create a Bot Framework V4 project.

But wait, you won't have the Bot Framework template straight away. You need to install the Bot Framework v4 SDK Templates for Visual Studio from the marketplace.

Now that you have the template installed, select it and create an Echo Bot. The generated solution is explored below.

Exploring the Solution

This looks like the same structure as a .NET Core Web API application. It has a controller for interacting with the bot, decorated with [Route("api/messages")]. It has a Startup class for injecting your dependencies, and also an ErrorHandler class.

EchoBot.cs

If you have a look at EchoBot.cs under the Bots folder, this is where the actual bot processing and logic is done. The main things to note here are the OnMessageActivityAsync and OnMembersAddedAsync methods.

OnMembersAddedAsync

As the name suggests, this method runs whenever a new member joins the conversation, including the first time the bot is connected, so it is the place to greet the user. Let's modify this first.

var welcomeText = "Hello and welcome to ImageDescriber Bot. Enter an image URL or upload an image to begin analysis.";

foreach (var member in membersAdded) {
    // Greet everyone except the bot itself.
    if (member.Id != turnContext.Activity.Recipient.Id) {
        await turnContext.SendActivityAsync(MessageFactory.Text(welcomeText), cancellationToken);
    }
}

I have removed CreateActivityWithTextAndSpeak and replaced it with the code above. All it does is welcome the user.

OnMessageActivityAsync

This is where we need to process the image or the URL. Let's see the possibilities. In case you are not aware, you can make use of Azure Cognitive Services for AI operations like this.

There are always two ways to interact with Azure Services.

REST API
Client SDK

I'm a lazy guy, so I won't go exploring the REST API browser to find the suitable API, its headers, its mandatory parameters, and so on. This is the same reason I created my own library for interacting with the StackExchange API. It's called StackExchange.NET and you can also find it on NuGet.

Azure Cognitive Services SDK

So I'm going to install the Azure Cognitive Services SDK in my project for processing the image. Before that, however, you need to create an Azure Cognitive Services resource.

Open Azure portal and click on Add new resource
Select the AI & Machine Learning category and click on Computer Vision

Give it any name you like, and when selecting the pricing plan, choose F0 because it's free and it will serve our purpose. We're not going to production, so the free tier should be fine.

Note

If you already have a free Computer Vision resource, you cannot create another free one. You can reuse the existing resource instead.

Connecting to the Cognitive service

Once the resource creation is completed, you can open the resource and you will find a Key and an Endpoint.

Now navigate back to the solution, open appsettings.json, and shape it like the JSON below. Copy the key and endpoint from the portal and paste them in.

{
  "MicrosoftAppId": "",
  "MicrosoftAppPassword": "",
  "Credentials": {
    "ComputerVisionKey": "enter your Key",
    "ComputerVisionEndpoint": "enter endpoint URL"
  }
}

Injecting the credentials

Create a new class like the one below.

public class Credentials {
    public string ComputerVisionKey { get; set; }
    public string ComputerVisionEndpoint { get; set; }
}

Now open Startup.cs and add the line below to the ConfigureServices method.

services.Configure<Credentials>(Configuration.GetSection("Credentials"));

I hope you are aware of how we get values from appsettings.json. It is the same thing here.

Installing the SDK

Install the Microsoft.Azure.CognitiveServices.Vision.ComputerVision package from the NuGet package manager.
We will be creating a new class to perform operations using the client SDK.

using Microsoft.Extensions.Options;

public class ImageAnalyzer {
    private readonly string _computerVisionEndpoint;
    private readonly string _computerVisionKey;

    // The options are bound from the "Credentials" section of appsettings.json.
    public ImageAnalyzer(IOptions<Credentials> options) {
        _computerVisionKey = options.Value.ComputerVisionKey;
        _computerVisionEndpoint = options.Value.ComputerVisionEndpoint;
    }
}

This is a simple class whose constructor argument, IOptions<Credentials>, is supplied automatically by dependency injection at runtime.
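The article does not show how ImageAnalyzer and the HttpClient used later reach the bot. One possible wiring, sketched under the assumption that the bot receives both through its constructor, is to register them in ConfigureServices next to the Credentials binding. The extension method name AddImageAnalysis is hypothetical, and the fragment relies on the ImageAnalyzer class defined above, so it is not a standalone program.

```csharp
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;

public static class ImageAnalysisServiceExtensions
{
    // Hypothetical helper, not part of the original project.
    // Call it from Startup.ConfigureServices: services.AddImageAnalysis();
    public static IServiceCollection AddImageAnalysis(this IServiceCollection services)
    {
        // One analyzer for the whole app; it only holds the key and endpoint.
        services.AddSingleton<ImageAnalyzer>();

        // A single shared HttpClient instance for downloading attachments.
        services.AddSingleton(new HttpClient());

        return services;
    }
}
```

With these registrations in place, a bot constructor that takes an ImageAnalyzer and an HttpClient can be resolved by the container.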

Computing with the SDK

Any API or SDK that we use needs to authenticate first, so I have created a method like this:

public static ComputerVisionClient Authenticate(string endpoint, string key) {
    var client = new ComputerVisionClient(new ApiKeyServiceClientCredentials(key)) {
        Endpoint = endpoint
    };
    return client;
}

We are going to analyze either a Stream or a URL. I am creating two methods for it.

// Analyzes an image supplied as a stream (for example, an uploaded attachment).
public async Task<ImageAnalysis> AnalyzeImageAsync(Stream image) {
    var client = Authenticate(_computerVisionEndpoint, _computerVisionKey);
    var analysis = await client.AnalyzeImageInStreamAsync(image, Features);
    return analysis;
}

// Analyzes an image that the service fetches from a public URL.
public async Task<ImageAnalysis> AnalyzeUrl(string url) {
    var client = Authenticate(_computerVisionEndpoint, _computerVisionKey);
    var result = await client.AnalyzeImageWithHttpMessagesAsync(url, Features);
    return result.Body;
}
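AnalyzeUrl passes whatever string it receives straight to the service, so a chat message such as "hello" would be sent as if it were a URL. A minimal sketch of a guard that could run first is shown here; the class and method names are hypothetical, and it only checks that the text is an absolute http or https URL, not that it points to an image.

```csharp
using System;

public static class ImageUrlGuard
{
    // Hypothetical helper: returns true only for absolute http/https URLs.
    public static bool TryParseImageUrl(string text, out Uri uri)
    {
        uri = null;
        if (string.IsNullOrWhiteSpace(text))
            return false;

        // Reject anything that is not a well-formed absolute URI.
        if (!Uri.TryCreate(text.Trim(), UriKind.Absolute, out var parsed))
            return false;

        // Reject schemes such as file: or ftp: that the bot should not forward.
        if (parsed.Scheme != Uri.UriSchemeHttp && parsed.Scheme != Uri.UriSchemeHttps)
            return false;

        uri = parsed;
        return true;
    }
}
```

The bot could call ImageUrlGuard.TryParseImageUrl on the incoming text and reply with a short hint instead of calling AnalyzeUrl when it returns false.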

So that's it. The SDK operations are done. The thing to note is the second parameter, Features, which is passed to both client calls. What is it?

It is a list of VisualFeatureTypes enum values accepted by the SDK, which tells the service which kinds of analysis to return. I have copied it from the docs.

private static readonly List<VisualFeatureTypes> Features = new List<VisualFeatureTypes> {
    VisualFeatureTypes.Categories,
    VisualFeatureTypes.Description,
    VisualFeatureTypes.Faces,
    VisualFeatureTypes.ImageType,
    VisualFeatureTypes.Tags
};

Interacting with the Bot

The ITurnContext<IMessageActivity> turnContext is the main object here; it contains whatever you share with the bot. Have a look at the code below. I have kept it simple so that it is understandable.

If the message carries an image attachment, download it and call AnalyzeImageAsync. Otherwise, treat the message text as a URL and call AnalyzeUrl. Either way, return the results.

var result = new ImageAnalysis();
if (turnContext.Activity.Attachments?.Count > 0) {
    // Download the first attachment and analyze it as a stream.
    var attachment = turnContext.Activity.Attachments[0];
    var image = await _httpClient.GetStreamAsync(attachment.ContentUrl);
    if (image != null) {
        result = await _imageAnalyzer.AnalyzeImageAsync(image);
    }
}
else {
    // No attachment: treat the message text as an image URL.
    result = await _imageAnalyzer.AnalyzeUrl(turnContext.Activity.Text);
}
var stringResponse = $"I think the image you uploaded is a {result.Tags[0].Name.ToUpperInvariant()} and it is {result.Description.Captions[0].Text.ToUpperInvariant()}";
return stringResponse;
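The last two lines index Tags[0] and Captions[0] directly, which throws if the service returns no tags or no captions. A more defensive way to build the reply might look like the sketch below. ReplyBuilder and Describe are hypothetical names, and the helper takes plain strings rather than the SDK types so that it can be tried on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReplyBuilder
{
    // Hypothetical helper: builds the reply text from the first tag and the
    // first caption, falling back gracefully when either one is missing.
    public static string Describe(IReadOnlyList<string> tags, IReadOnlyList<string> captions)
    {
        var tag = tags?.FirstOrDefault();
        var caption = captions?.FirstOrDefault();

        if (tag == null && caption == null)
            return "Sorry, I could not work out what is in that image.";
        if (caption == null)
            return $"I think the image you uploaded is a {tag.ToUpperInvariant()}.";
        if (tag == null)
            return $"I think the image shows {caption.ToUpperInvariant()}.";

        return $"I think the image you uploaded is a {tag.ToUpperInvariant()} and it is {caption.ToUpperInvariant()}.";
    }
}
```

In the bot, the arguments could be produced with result.Tags?.Select(t => t.Name).ToList() and result.Description?.Captions?.Select(c => c.Text).ToList(), so a missing Description or an empty list leads to a friendly message rather than an exception.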

Demo

Now it's time to see if the bot actually works. To do so:

Build the solution and press F5.
As mentioned in the prerequisites, I already have the Azure Bot Framework Emulator installed, so let's open it.
When you open it, you will see the emulator's start page.

Now let's upload an image.

Booyah! MISSION ACCOMPLISHED! 🔥

The full solution can be downloaded from GitHub.

Thanks for reading. Please stay tuned for more blogs.