Getting Started With The Video Indexer Using Microsoft Cognitive Services


Microsoft introduced the public preview of Video Indexer as part of Cognitive Services. It replaces the Video API that was used previously. Video Indexer automatically extracts metadata from video and audio, which you can use to build intelligent applications.

Microsoft Cognitive Services

In this article, I will show how to sign in to Video Indexer, upload a video, extract its metadata, and translate the transcript.

Create Account

To develop a Video Indexer application, you must first sign in to Microsoft Video Indexer (VI) or create an account using any of the following: Azure Active Directory, a Microsoft account, LinkedIn, Google, or Facebook.

Upload Video

Next, upload a video to the Video Indexer portal. After signing in, select Upload, then drag and drop your video file or provide the video's web URL.

Provide the basic details (video name, language, and privacy setting) and click the Upload button.
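
The same details can also be sent programmatically. The following is a minimal sketch, not a definitive client: it only builds the request URL for an upload by web URL. The base URL, path layout, and query parameter names are assumptions modeled on the Video Indexer REST API and should be checked against the current API reference; the account ID, token, and video URL are placeholders.

```python
from urllib.parse import urlencode

# Assumed base URL for the Video Indexer REST API; verify before use.
API_BASE = "https://api.videoindexer.ai"


def build_upload_url(location, account_id, access_token, name, video_url,
                     language="en-US", privacy="Private"):
    """Build an upload URL from the same details the portal form asks for.

    All parameter names in the query string are assumptions; this function
    only assembles a string and does not contact any service.
    """
    query = urlencode({
        "accessToken": access_token,
        "name": name,            # video name shown in the portal
        "videoUrl": video_url,   # web URL of the video to index
        "language": language,    # spoken language of the video
        "privacy": privacy,      # "Private" or "Public"
    })
    return f"{API_BASE}/{location}/Accounts/{account_id}/Videos?{query}"


# Placeholder values for illustration only.
url = build_upload_url("trial", "my-account-id", "my-token",
                       "Team demo", "https://example.com/demo.mp4")
print(url)
```

An HTTP POST to a URL of this shape would start the upload; the sketch stops short of sending it so that it can be run without an account.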

After the video is uploaded, Microsoft VI analyzes and indexes it.
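
Because indexing takes some time, code that works with the service usually has to wait for it to finish. Here is a small polling sketch under stated assumptions: `get_state` is a hypothetical function you supply that returns the video's current processing state, and the state names "Processed" and "Failed" are assumed rather than taken from this article.

```python
import time


def wait_until_processed(get_state, interval_seconds=10.0, max_attempts=60,
                         sleep=time.sleep):
    """Poll get_state() until it reports a final state, or give up.

    get_state is a caller-supplied (hypothetical) function returning the
    current processing state as a string. The final state names below are
    assumptions.
    """
    for _ in range(max_attempts):
        state = get_state()
        if state in ("Processed", "Failed"):
            return state
        sleep(interval_seconds)
    raise TimeoutError("video was still being indexed after the last attempt")


# Simulated states so the sketch runs without calling any service.
fake_states = iter(["Uploaded", "Processing", "Processed"])
final = wait_until_processed(lambda: next(fake_states), sleep=lambda s: None)
print(final)
```

Passing `sleep` as a parameter keeps the function easy to try out, since the waiting can be switched off as in the simulated run above.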

Once Video Indexer has finished analyzing the video, you will get an email notification with a link to the video, a short description, and the faces detected in it.

You can edit the privacy setting from the portal. Video Indexer returns the following analysis results.

Face identification

Video Indexer detects the faces that appear in a video and matches them against known celebrities. When a face matches, the celebrity's name is shown; for faces that do not match, you can edit the label yourself.

Speech to Text

Video Indexer includes speech-to-text functionality that transcribes speech in the spoken language. It supports Tamil, English, Hindi, and other languages, and you can edit the resulting text. Video Indexer can also work out which speaker spoke which words, and when.
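
To make the "which speaker spoke which words and when" idea concrete, here is a short sketch that groups transcript lines by speaker. The data shape (speaker label, start and end in seconds, text) is hypothetical and chosen for illustration; it is not the format Video Indexer actually returns.

```python
from collections import defaultdict

# Hypothetical transcript lines; the field names are assumptions.
transcript = [
    {"speaker": "Speaker 1", "start": 0.0, "end": 4.2, "text": "Welcome to the demo."},
    {"speaker": "Speaker 2", "start": 4.5, "end": 7.0, "text": "Thanks for having me."},
    {"speaker": "Speaker 1", "start": 7.3, "end": 11.8, "text": "Let's look at the portal."},
]


def lines_by_speaker(lines):
    """Group (start, end, text) tuples under each speaker label."""
    grouped = defaultdict(list)
    for line in lines:
        grouped[line["speaker"]].append((line["start"], line["end"], line["text"]))
    return dict(grouped)


for speaker, spoken in lines_by_speaker(transcript).items():
    for start, end, text in spoken:
        print(f"{speaker} [{start:.1f}s to {end:.1f}s]: {text}")
```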

Identify Objects

Video Indexer identifies objects in the video, drawing on a predefined set of 2,000 objects.

Keyword Extractions

Keywords make it easier to find a video in a large library. Video Indexer extracts keywords from the transcript of the spoken words and from text recognized by the visual text recognizer.
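
As an illustration of how extracted keywords support search across a library, the sketch below builds a simple keyword-to-video lookup. The video IDs and keyword lists are made up, and the lookup is a plain in-memory index, not how Video Indexer implements search.

```python
from collections import defaultdict

# Hypothetical library: video ID mapped to its extracted keywords.
library = {
    "video-001": ["Azure", "machine learning", "demo"],
    "video-002": ["cooking", "demo"],
    "video-003": ["azure", "storage"],
}


def build_keyword_index(videos):
    """Map each lower-cased keyword to the set of videos that contain it."""
    index = defaultdict(set)
    for video_id, keywords in videos.items():
        for keyword in keywords:
            index[keyword.lower()].add(video_id)
    return index


def search(index, keyword):
    """Return the matching video IDs in sorted order (empty list if none)."""
    return sorted(index.get(keyword.lower(), set()))


index = build_keyword_index(library)
print(search(index, "Azure"))
print(search(index, "demo"))
```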

Sentiment analysis

Video Indexer performs sentiment analysis on the text extracted using speech-to-text and optical character recognition, and provides the results as positive, negative, or neutral sentiments, along with timecodes.
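
Since each sentiment comes with timecodes, one natural use is to total how much of a video falls under each sentiment. The sketch below does that for hypothetical segments; the field names and the `H:MM:SS` timecode format are assumptions for illustration.

```python
def timecode_to_seconds(timecode):
    """Convert an 'H:MM:SS' or 'H:MM:SS.fff' timecode to seconds."""
    hours, minutes, seconds = timecode.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def seconds_per_sentiment(segments):
    """Sum the duration of the segments under each sentiment label."""
    totals = {"Positive": 0.0, "Neutral": 0.0, "Negative": 0.0}
    for segment in segments:
        duration = (timecode_to_seconds(segment["end"])
                    - timecode_to_seconds(segment["start"]))
        totals[segment["sentiment"]] += duration
    return totals


# Hypothetical sentiment segments with start and end timecodes.
segments = [
    {"sentiment": "Positive", "start": "0:00:00", "end": "0:00:10"},
    {"sentiment": "Neutral", "start": "0:00:10", "end": "0:00:25.5"},
    {"sentiment": "Negative", "start": "0:01:00", "end": "0:01:05"},
]

for label, total in seconds_per_sentiment(segments).items():
    print(f"{label}: {total:.1f} seconds")
```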


Video Indexer can translate an audio transcript from one language into another. Supported languages include Tamil, English, and Spanish, among others.

Once Video Indexer has finished processing and analyzing a video, you can review, edit, delete, and publish the video from the Microsoft VI portal.


In this article, we learned how to sign in to Video Indexer, upload a video, extract its metadata, and translate the transcript into another language. If you have any questions, feedback, or issues, please write them in the comments box.
