Intel OpenVINO Inference Engine

Rohit Gupta

Introduction

In this article, I will help you understand how an application communicates with the Intel OpenVINO Inference Engine to run a model and get its output. Since Intel OpenVINO is best suited to computer vision applications, I will use a computer vision application for the demonstration.

Key Terms

1. Inference Engine

It provides a library of computer vision functions, supports calls to other computer vision libraries such as OpenCV, and runs optimized inference on Intermediate Representation (IR) models. It works with different plugins to allow further optimization for specific hardware.

2. Synchronous

These requests wait for a given request to be fulfilled before the next request begins. For example, the program has to wait for user feedback before more data can be analyzed.

3. Asynchronous

These requests can run at the same time, meaning that the next one doesn't have to wait until the prior one has completed. For example, the user can keep using the application while it waits for a reply to a network call to a server with uncertain latency.

4. IE Core

The principal Python wrapper for working with the Inference Engine. It is used to check the supported layers of a specific network and to add any required CPU extensions before loading an IENetwork (CPU extensions are removed in versions from 2020R1 onwards).

5. IE Network

A class that holds a model read from IR. You load it into an IECore, which returns it as an Executable Network.

6. Executable Network

An instance of a network loaded into an IECore and ready for inference. It supports synchronous and asynchronous requests and holds a collection of InferRequest objects.

7. InferRequest

Individual inference requests made to the Inference Engine, for example image by image. Each one holds its inputs as well as the outputs of the request.
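To make the synchronous and asynchronous key terms above more concrete, here is a minimal sketch that uses only the Python standard library. The fake_inference function is a made-up stand-in for an inference call and is not part of OpenVINO; the thread pool only illustrates the idea of submitting several requests before collecting any result.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_inference(frame_id, delay=0.05):
    """Hypothetical stand-in for one inference call; sleeps to simulate latency."""
    time.sleep(delay)
    return frame_id * 2


def run_sync(frame_ids):
    # Each call blocks until its result is ready before the next one starts.
    return [fake_inference(f) for f in frame_ids]


def run_async(frame_ids, workers=4):
    # All requests are submitted up front; results are collected afterwards.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(fake_inference, f) for f in frame_ids]
        return [future.result() for future in futures]


frames = [1, 2, 3, 4]
sync_results = run_sync(frames)
async_results = run_async(frames)
```

Both functions return the same results; the difference is only in whether the caller waits after every single request or once at the end.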

Intel OpenVINO Python API

Intel OpenVINO has its own Python API that can be used to get the desired result. I will explain only the classes and functions that I use in the demo. To learn about every class of the API, refer to the official documentation.

1. ie_api.IECore

This class represents an Inference Engine entity and lets you work with plugins through a unified interface.

__init__(self, xml_config_file="")
The class constructor, which returns an instance of the IECore class.
- xml_config_file is the full path to the .xml file containing the plugin configuration. If it is left empty, the default configuration is used.
add_extension(self, extension_path, device_name)
Loads an extension library into the plugin with the specified device name. This method returns nothing.
- extension_path represents the path to the extension library file to load into the plugin
- device_name represents the device for which the extension is loaded
load_network(self, network, device_name, config=None, num_requests=1)
Loads an IENetwork object that was read from IR into the plugin with the specified device name and returns an ExecutableNetwork object.
- device_name represents the name of the target edge device. Values can be CPU, FPGA.0, FPGA.1, MYRIAD, GPU
- config represents a dict of plugin configuration keys and their values
- num_requests represents the number of infer requests to create for the returned object. 0 means that the optimal number of requests is chosen
query_network(self, network, device_name, config=None)
Queries the plugin with the specified device name to find out which network layers it supports in the current configuration. It returns a dictionary that maps each supported layer name to a device name.

Note:

Here num_requests can be understood as the number of infer requests that can be processed in parallel.
Multiple devices can be used through the HETERO plugin, which I will cover in a coming article.
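One common use of the dictionary returned by query_network is to check for layers the device cannot run. The sketch below assumes you already have a list of the network's layer names; the helper name find_unsupported_layers and the example data are made up for illustration, and how you obtain the layer names depends on your OpenVINO version.

```python
def find_unsupported_layers(layer_names, supported_layers):
    """Return the layer names that are missing from a query_network() result.

    layer_names: iterable of layer names in the network.
    supported_layers: dict shaped like the query_network() return value,
        mapping each supported layer name to a device name.
    """
    return [name for name in layer_names if name not in supported_layers]


# Made-up example data; real values come from the network and the plugin.
example_layers = ["conv1", "relu1", "custom_op"]
example_supported = {"conv1": "CPU", "relu1": "CPU"}
unsupported = find_unsupported_layers(example_layers, example_supported)
```

If the resulting list is not empty on an older release, that is usually the point where a CPU extension would be added before calling load_network.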

2. ie_api.IENetwork

This class holds information about the network read from IR and lets you modify some model parameters, such as layer affinity and output layers.

__init__(self, model, weights, init_from_buffer)
The IENetwork class constructor, which returns an instance of IENetwork.
- model represents the path to the .xml file
- weights represents the path to the .bin file
- If init_from_buffer is False, model and weights are interpreted as path strings; if it is True, they are interpreted as Python bytes holding the file contents.
reshape(self, input_shapes)
Reshapes the network to change its spatial dimensions, batch size, or any other dimension.
- input_shapes represents a dict that maps input layer names to tuples with the target shape
serialize(self, path_to_xml, path_to_bin)
Serializes the network and stores it in files. This method returns nothing.
- path_to_xml represents the file where the serialized model will be stored
- path_to_bin represents the file where the serialized weights will be stored

Note:

The model and weights values can be path strings or bytes holding the file content.

3. ie_api.ExecutableNetwork

The class represents a network instance loaded to plugin and ready for inference.

__init__(self)
The ExecutableNetwork class constructor, which returns an instance of ExecutableNetwork.
infer(self, inputs=None)
Starts synchronous inference for the first infer request of the executable network and returns the output data as a dictionary that maps output layer names to numpy.ndarray objects with the layer outputs.
- inputs represents a dict that maps input layer names to numpy.ndarray objects of proper shape with input data for the layer

start_async(self, request_id, inputs=None)
Starts asynchronous inference for the specified infer request. It returns a handler of the InferRequest class for that request.
- request_id represents the index of the infer request to start
- inputs represents a dict that maps input layer names to numpy.ndarray objects of proper shape with input data for the layer

wait(self, num_requests=None, timeout=None)
A blocking call that waits until the result of any request becomes available. It returns the request status, "RESULT_NOT_READY" or "OK".
- num_requests represents the number of idle requests to wait for. By default, it is set to the total number of requests.
- timeout represents the time to wait in milliseconds, or one of the special values 0 and -1. The default value is -1.
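The start_async and wait calls described above are typically used together. The following is a minimal sketch of that pattern. So that it runs without OpenVINO installed, it uses two small stub classes (StubRequest and StubExecutableNetwork, both made up here) that only mimic the interface; the helper infer_async_and_wait is also hypothetical. The sketch assumes that a status of 0 corresponds to "OK".

```python
class StubRequest:
    """Stand-in for an InferRequest so the sketch runs without OpenVINO."""

    def __init__(self):
        self.outputs = {}

    def wait(self, timeout=-1):
        return 0  # assumed to correspond to the OK status


class StubExecutableNetwork:
    """Stand-in that mimics the start_async/requests interface."""

    def __init__(self, num_requests=1):
        self.requests = [StubRequest() for _ in range(num_requests)]

    def start_async(self, request_id, inputs=None):
        request = self.requests[request_id]
        # Pretend the "model" simply echoes its input under an output name.
        request.outputs = {"out": next(iter(inputs.values()))}
        return request


def infer_async_and_wait(exec_network, input_name, data, request_id=0):
    """Start one asynchronous request, block until it finishes, return outputs."""
    request = exec_network.start_async(request_id=request_id,
                                       inputs={input_name: data})
    status = request.wait(-1)  # -1 blocks until the result is available
    if status != 0:
        raise RuntimeError("inference did not finish with OK status: %s" % status)
    return request.outputs


result = infer_async_and_wait(StubExecutableNetwork(), "data", [1, 2, 3])
```

With a real ExecutableNetwork, the same helper could be called with the object returned by load_network in place of the stub.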

4. ie_api.InferRequest

This class provides an interface to the infer requests of an ExecutableNetwork. It is used to run a request and collect its outputs.

__init__(self)
This class has no explicit constructor. Use ie_api.IECore.load_network to create an ExecutableNetwork, whose requests are valid InferRequest instances.
async_infer(self, inputs=None)
Starts asynchronous inference for the infer request and fills the outputs array.
- inputs represents a dict that maps input layer names to numpy.ndarray objects of proper shape with input data for the layer

get_perf_counts(self)
Queries per-layer performance measures to find out which layer is the most time-consuming. It returns a dictionary containing execution information for each layer.
infer(self, inputs=None)
Starts synchronous inference for the infer request and fills the outputs array.
- inputs represents a dict that maps input layer names to numpy.ndarray objects of proper shape with input data for the layer
wait(self, timeout=None)
A blocking call that waits until the result of the request becomes available.
- timeout represents the time to wait in milliseconds, or one of the special values 0 and -1. The default value is -1. Here 0 means that the status is returned immediately without blocking, and -1 means wait until the inference result becomes available.

Demo Application

Now I will demonstrate how to use the concepts covered so far. Before reading on, I recommend going through the previous articles in this series.

I will be using the following pre-trained models:

Human Pose Estimation: human-pose-estimation-0001
Text Detection: text-detection-0004
Determining Car Type & Color: vehicle-attributes-recognition-barrier-0039

The application aims to annotate the given image input with its features. Now let's start programming.

app.py

import argparse
import cv2
import numpy as np
from handle_models import handle_output, preprocessing
from inference import Network

In the above code, we import the required libraries:

argparse is used to define the command-line arguments
cv2 is the OpenCV library for Python
numpy is used to perform array operations on numpy.ndarray objects
handle_models is the Python script that contains all the processing logic
inference is the Python script that communicates with the OpenVINO Python API

CAR_COLORS = ["white", "gray", "yellow", "red", "green", "blue", "black"]
CAR_TYPES = ["car", "bus", "truck", "van"]

The above code is specific to cars: it defines the car colors and car types that the model's output indices are mapped to.

def get_mask(processed_output):
    # Create an empty array for other color channels of the mask
    empty = np.zeros(processed_output.shape)
    # Stack to make a Green mask
    mask = np.dstack((empty, processed_output, empty))
    return mask

In the above code, we build a three-channel mask that places the output in the green channel, so that detections are highlighted in green.
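To see what get_mask produces, here is a small self-contained run on a made-up 2x2 array of detections. The input values are illustrative only.

```python
import numpy as np


def get_mask(processed_output):
    # Zero-filled planes for the two channels that stay dark
    empty = np.zeros(processed_output.shape)
    # Stack along the last axis: (H, W) -> (H, W, 3), data in channel 1
    return np.dstack((empty, processed_output, empty))


# Made-up detections: 255 where something was detected, 0 elsewhere
detections = np.array([[0, 255],
                       [255, 0]])
mask = get_mask(detections)
```

Because the data sits in the middle channel, the mask is green in both the BGR order used by OpenCV and the RGB order.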

if model_type == "POSE":
    # Remove final part of output not used for heatmaps
    output = output[:-1]
    # Get only pose detections above 0.5 confidence, set to 255
    for c in range(len(output)):
        output[c] = np.where(output[c] > 0.5, 255, 0)
    # Sum along the "class" axis
    output = np.sum(output, axis=0)
    # Get a semantic mask
    pose_mask = get_mask(output)
    # Combine with the original image
    image = image + pose_mask
    return image

The above code is intended for human pose estimation: it thresholds each heatmap, merges the heatmaps into one mask, and highlights the detected feature points in green on the image.
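The thresholding and summing steps can be tried in isolation. The sketch below uses three tiny made-up maps of shape (3, 2, 2) instead of real model output; the values are chosen only to show the effect of each step.

```python
import numpy as np

# Two made-up heatmaps plus a trailing map that gets dropped, shape (3, 2, 2)
heatmaps = np.array([
    [[0.9, 0.1], [0.2, 0.3]],
    [[0.1, 0.1], [0.7, 0.4]],
    [[0.5, 0.5], [0.5, 0.5]],
])

kept_maps = heatmaps[:-1]                    # drop the last map
binary = np.where(kept_maps > 0.5, 255, 0)   # 255 where confident, else 0
combined = np.sum(binary, axis=0)            # merge into a single (H, W) map
```

Here combined is a 2x2 array with 255 at the two positions where either kept map exceeded the 0.5 threshold.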

elif model_type == "TEXT":
    # Get only text detections above 0.5 confidence, set to 255
    output = np.where(output[1] > 0.5, 255, 0)
    # Get a semantic mask
    text_mask = get_mask(output)
    # Add the mask to the image
    image = image + text_mask
    return image

The above code is intended for text detection: it thresholds the text detection model's output and highlights the detected text in green.

elif model_type == "CAR_META":
    # Get the color and car type from their lists
    color = CAR_COLORS[output[0]]
    car_type = CAR_TYPES[output[1]]
    # Scale the output text by the image shape
    scaler = max(int(image.shape[0] / 1000), 1)
    # Write the text of color and type onto the image
    image = cv2.putText(image, "Color: {}, Type: {}".format(color, car_type), (50 * scaler, 100 * scaler),
                        cv2.FONT_HERSHEY_SIMPLEX, 2 * scaler, (255, 255, 255), 3 * scaler)
    return image

The above code is intended for car meta feature detection: it writes text containing the detected color and type onto the image, scaled to the image size.

#Create a Network for using the Inference Engine
inference_network = Network()
# Load the model in the network and obtain its input shape
n, c, h, w = inference_network.load_model(args.m, args.d, args.c)

In the above code, we instantiate the Network class and pass load_model the necessary parameters: the location of the IR file, the device on which the application should run, and the CPU extension, if any. It returns the input shape of the model.
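The args object used above comes from argparse, although the parser itself is not shown in this article. The following sketch shows one way it could be defined. The flag names are inferred from args.i, args.t, args.m, args.d, and args.c and from the commands used later; the help texts and default values are my assumptions, not the original code.

```python
import argparse


def build_parser():
    # Hypothetical parser; only the flag names are taken from the article.
    parser = argparse.ArgumentParser(description="Run one of the demo models on an image")
    parser.add_argument("-i", required=True, help="path to the input image")
    parser.add_argument("-m", required=True, help="path to the model .xml file")
    parser.add_argument("-t", required=True, help="model type: POSE, TEXT or CAR_META")
    parser.add_argument("-c", default=None, help="path to a CPU extension library, if any")
    parser.add_argument("-d", default="CPU", help="target device name")
    return parser


example_args = build_parser().parse_args(
    ["-i", "images/blue-car.jpg", "-t", "CAR_META", "-m", "model.xml"]
)
```

Since each option has only a short flag, argparse stores the value under the flag's letter, which is why the code can read args.i, args.m, and so on.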

# Read the input image
image = cv2.imread(args.i)
preprocessed_image = preprocessing(image, h, w)

The above code is intended to preprocess the input image using the handle_models.preprocessing method.

# Perform synchronous inference on the image
inference_network.sync_inference(preprocessed_image)
# Obtain the output of the inference request
output = inference_network.extract_output()
output_func = handle_output(args.t)
processed_output = output_func(output, image.shape)
# Create an output image based on network
try:
    output_image = create_output_image(args.t, image, processed_output)
    print("Success")
except Exception:
    # Fall back to the unannotated image if the annotation step fails
    output_image = image
    print("Failure")
# Save the resulting image
cv2.imwrite("outputs/{}-output.png".format(args.t), output_image)

In the above code, we run inference in synchronous mode and pass the processed output to the create_output_image function, which annotates the image. Finally, the result is saved in the outputs folder as a PNG file named after the model type with an "-output" suffix.

handle_models.py

heatmaps = output["Mconv7_stage2_L2"]
out_heatmap = np.zeros([heatmaps.shape[1], input_shape[0], input_shape[1]])
print(out_heatmap.shape)
for h in range(len(heatmaps[0])):
    out_heatmap[h] = cv2.resize(heatmaps[0][h], input_shape[0:2][::-1])
return out_heatmap

The above code is intended to get the pose estimation output. As the official documentation shows, "Mconv7_stage2_L2" is the output that holds the keypoint heatmaps; each heatmap is resized back to the size of the input image.

first_blob = output["model/link_logits_/add"]
out_blob = np.zeros([first_blob.shape[1], input_shape[0], input_shape[1]])
for h in range(len(first_blob[0])):
    out_blob[h] = cv2.resize(first_blob[0][h], input_shape[0:2][::-1])
print(first_blob.shape[0], first_blob.shape[1], first_blob.shape[2])
return out_blob

The above code is intended to get the text detection output; it reads the "model/link_logits_/add" output listed in the official documentation and resizes each channel back to the size of the input image.

color = np.argmax(output["color"].flatten())
ttype = np.argmax(output["type"].flatten())
return color, ttype

The above code is intended to get the car metadata output. We flatten both the color and type outputs to get linear arrays, and then take the index of the highest score in each.
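The flatten and argmax steps, together with the label lookup done in app.py, can be sketched on made-up scores. The (1, N, 1, 1) shapes and the score values below are assumptions chosen for illustration, not real model output.

```python
import numpy as np

CAR_COLORS = ["white", "gray", "yellow", "red", "green", "blue", "black"]
CAR_TYPES = ["car", "bus", "truck", "van"]

# Made-up scores; the (1, N, 1, 1) shapes are an assumption for illustration
output = {
    "color": np.array([0.01, 0.02, 0.02, 0.05, 0.05, 0.80, 0.05]).reshape(1, 7, 1, 1),
    "type": np.array([0.90, 0.03, 0.03, 0.04]).reshape(1, 4, 1, 1),
}

# Flatten to 1-D, then take the index of the highest score
color_index = np.argmax(output["color"].flatten())
type_index = np.argmax(output["type"].flatten())
label = "Color: {}, Type: {}".format(CAR_COLORS[color_index], CAR_TYPES[type_index])
```

The two indices are exactly what the CAR_META branch in app.py uses to pick entries from CAR_COLORS and CAR_TYPES.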

image = np.copy(input_image)
image = cv2.resize(image, (width, height))
image = image.transpose((2, 0, 1))
image = image.reshape(1, 3, height, width)
return image

The above code preprocesses the input image: it resizes the image to the model's input size, moves the channel axis to the front, and adds a batch dimension.
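The layout change in the last two preprocessing steps can be checked with plain NumPy. The sketch below skips the cv2.resize call and starts from a small made-up array that stands in for an already resized image.

```python
import numpy as np

height, width = 4, 6
# Stand-in for an already resized image in height x width x channels layout
image = np.arange(height * width * 3, dtype=np.uint8).reshape(height, width, 3)

chw = image.transpose((2, 0, 1))            # (H, W, C) -> (C, H, W)
batch = chw.reshape(1, 3, height, width)    # add the batch dimension
```

After these steps, the value at row 1, column 3, channel 2 of the original image is found at batch[0, 2, 1, 3].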

if model_type == "POSE":
    return handle_pose
elif model_type == "TEXT":
    return handle_text
elif model_type == "CAR_META":
    return handle_car
else:
    return None

The above code is the logic that tells the program which method to invoke based on the model type provided.
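The same dispatch can also be written with a dictionary lookup, which some readers find easier to extend with new model types. This is an alternative sketch, not the code used in the demo; the handler bodies are placeholders.

```python
def handle_pose(output, input_shape):
    return "pose"   # placeholder body


def handle_text(output, input_shape):
    return "text"   # placeholder body


def handle_car(output, input_shape):
    return "car"    # placeholder body


HANDLERS = {"POSE": handle_pose, "TEXT": handle_text, "CAR_META": handle_car}


def handle_output(model_type):
    # dict.get returns None for unknown types, like the final else branch
    return HANDLERS.get(model_type)
```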

inference.py

import os
import sys
import logging as log
from openvino.inference_engine import IENetwork, IECore

The above code imports the necessary libraries.

os and sys provide the Python operating system and interpreter functions
logging is optional; it is intended to log errors, warnings, and information
IENetwork and IECore are imported from openvino.inference_engine

model_xml = model
model_bin = os.path.splitext(model_xml)[0] + ".bin"

In the above code, we define the variables for the .xml and .bin files. For the .bin file, we remove ".xml" from the passed file name and append ".bin".
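As a quick check of what os.path.splitext does here, the snippet below applies it to one of the model paths used later in this article.

```python
import os

model_xml = "/home/workspace/models/text-detection-0004.xml"
# splitext returns (path without extension, extension)
model_bin = os.path.splitext(model_xml)[0] + ".bin"
```

This relies on the .xml and .bin files sitting side by side with the same base name.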

# Initialize the plugin
self.plugin = IECore()
# Add a CPU extension, if applicable
if cpu_extension and "CPU" in device:
    self.plugin.add_extension(cpu_extension, device)
# Read the IR as an IENetwork
network = IENetwork(model=model_xml, weights=model_bin)

In the above code, we instantiate the IECore class and, when running on the CPU, attach the CPU extension file if one was supplied. The CPU extension provides implementations of layers that the CPU plugin does not support out of the box (as noted earlier, it is no longer needed from version 2020R1 onwards).

After that, we pass the paths of the .xml and .bin files to IENetwork, which reads the IR. The executable network is created in the next step.

# Load the IENetwork into the plugin
self.exec_network = self.plugin.load_network(network, device)
# Get the input layer
self.input_blob = next(iter(network.inputs))
# Return the input shape (to determine preprocessing)
return network.inputs[self.input_blob].shape

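In the above code, we load the network onto the device to get an executable network, use a Python iterator to fetch the name of the first input layer, and return that layer's shape so the caller knows how to preprocess the image.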

self.exec_network.infer({self.input_blob: image})

In the above code, we perform synchronous inference. Because infer blocks until the result is ready, no separate 'wait' call is required in this application; a 'wait' call becomes necessary only when a request is started asynchronously, for example for large inputs that take some time to process.

self.exec_network.requests[0].outputs

In the above code, we read the outputs of the first infer request (index 0). It is the only request here, because load_network was called with the default num_requests of 1.

To execute the application for the Car Meta Data model, run the following command:

python app.py -i "images/blue-car.jpg" -t "CAR_META" \
-m "/home/workspace/models/vehicle-attributes-recognition-barrier-0039.xml" \
-c "/opt/intel/openvino/deployment_tools/inference_engine/lib/intel64/libcpu_extension_sse4.so"

Output

To execute the application for text detection:

python app.py -i "images/sign.jpg" -t "TEXT" \
-m "/home/workspace/models/text-detection-0004.xml" \
-c "/opt/intel/openvino/deployment_tools/inference_engine/lib/intel64/libcpu_extension_sse4.so"

Output

To execute for human pose estimation:

python app.py -i "images/sitting-on-car.jpg" -t "POSE" \
-m "/home/workspace/models/human-pose-estimation-0001.xml" \
-c "/opt/intel/openvino/deployment_tools/inference_engine/lib/intel64/libcpu_extension_sse4.so"

Output

Note:

I have attached all the code and all 3 input images; you can try the same with different models.
The code is based on what I used during my Udacity Nanodegree.
I used Linux as the base OS, as I had some issues running the application on Windows. You may use the same commands on Windows; just change the paths in the parameters accordingly.

Conclusion

In this article, I tried to explain how the Inference Engine works and how we can use it to create a demo edge application. We will dive deeper in the coming articles, so stay tuned to C# Corner.

If you have any questions, feel free to comment. And if you like the article, do give it a like.