Getting Started With Custom Modeling Using Form Recognizer Sample Labelling Tool

Introduction

 
This article is for beginners who want to learn about the Azure Form Recognizer Sample Labeling Tool and get some hands-on experience with it. The Sample Labeling Tool is an open source project that provides a simple user interface (UI) for manually labeling forms (documents) for supervised learning.
 
Prerequisites
  • An Azure account for creating an Azure Storage Account.
    (If you don't have one, you can get a free account with ₹13,300 worth of credits from here. If you are a student, you can verify your student status to get it without entering credit card details; otherwise, credit card details are mandatory.)
  • An Azure Storage Account for storing the input data for the labeling tool. If you don't know how to create a Storage Account, see the official documentation.
  • An Azure Form Recognizer resource for testing the tool. If you don't know how to create a Form Recognizer resource, see the official documentation. Note the Form Recognizer credentials, i.e. the Endpoint and the API Key.

Getting the ground ready

  1. Get the files for labeling. Here I'm using the sample files, which you can also download by clicking here. Extract the zip file with WinRAR or a similar unzipping tool.
  2. Upload the files we need to test to Azure Blob Storage. There are several ways to do this; here is the one I use:
    • Go to the Azure Portal and navigate to the storage account you have just created.
    • Select "Storage Explorer" in the left pane.
    • Right-click "Blob Containers" and choose the "create a container" option.
    • Give the container a name (here, test) and click "Create".
    • Click the newly created container, which appears expanded under "Blob Containers".
    • Click the "Upload" button at the top, select the files (if you are using the sample files, upload the files in the Train folder) and click "Upload".
    • After a successful upload, the files are listed in the container.

  3. We need a SAS URI to connect this blob container to the Sample Labeling Tool. Right-click the container we just created (here, test) and click "Get Shared Access Signature". A tab similar to the one below opens. Set an "expiry time" (it must be later than the current time); here I'm setting it to the same time tomorrow. Also grant the Read, Write, Delete and List permissions, then click "Create". A URI is shown; "Copy" it and keep it, as we need it when connecting the Sample Labeling Tool to the blob container.

  4. Now we need to enable CORS on the storage account. Navigate to the CORS tab in the left pane, fill in the following values in the bottom row, and then click Save at the top.
    • Allowed origins = *
    • Allowed methods = [select all]
    • Allowed headers = *
    • Exposed headers = *
    • Max age = 200
      Your screen should now look similar to the one below.

Now we are ready to move on to the Sample Labeling Tool.
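
Before pasting the SAS URI into the tool, it can help to confirm that it carries the permissions and expiry chosen in step 3. The following is a minimal sketch using only the Python standard library. It assumes the URI uses the standard SAS query parameters "sp" (permissions, with the letters r, w, d, l for Read, Write, Delete, List) and "se" (expiry) and that the expiry is written in the full YYYY-MM-DDThh:mm:ssZ form. The function name check_sas_uri and the example URI are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import urlparse, parse_qs

# Read, Write, Delete and List, as granted in step 3.
REQUIRED_PERMISSIONS = set("rwdl")


def check_sas_uri(sas_uri, now=None):
    """Return a list of problems found in a container SAS URI (empty if none)."""
    now = now or datetime.now(timezone.utc)
    query = parse_qs(urlparse(sas_uri).query)
    problems = []

    # "sp" holds the granted permissions as a string of letters, e.g. "rwdl".
    granted = set(query.get("sp", [""])[0])
    missing = REQUIRED_PERMISSIONS - granted
    if missing:
        problems.append("missing permissions: " + "".join(sorted(missing)))

    # "se" holds the expiry time; this sketch assumes the full UTC format.
    expiry_text = query.get("se", [""])[0]
    try:
        expiry = datetime.strptime(expiry_text, "%Y-%m-%dT%H:%M:%SZ")
        expiry = expiry.replace(tzinfo=timezone.utc)
    except ValueError:
        problems.append("expiry time missing or not in the assumed format")
    else:
        if expiry <= now:
            problems.append("SAS has already expired")
    return problems


# Hypothetical URI with a placeholder signature, for illustration only.
example_uri = (
    "https://myaccount.blob.core.windows.net/test"
    "?sv=2019-12-12&sr=c&sp=rwdl&se=2030-01-01T00:00:00Z&sig=PLACEHOLDER"
)
print(check_sas_uri(example_uri) or "SAS URI looks usable")
```

An empty list means the URI at least has the shape the tool needs; it does not prove that the signature itself is valid.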
 

How to Get the Sample Labeling Tool?

 
The Sample Labeling Tool is available as a desktop application or as a web application. It can be deployed in different ways, depending on your organization's data policy. See the official documentation for the different ways to install and use it.
 
Here we are using the ready-to-use web application, which is perfectly fine for learning, evaluation and testing scenarios. Two versions are currently available: a stable one (version 2.0) and a preview (version 2.1). To learn about the differences between them, see "What's new in Form Recognizer Sample Labelling Tool v2.1?".
 
 
For now we are working with the stable version, i.e. Form Recognizer Sample Labelling Tool v2.0. Go to this web application, and you will see a screen similar to the one below. The preferred browser is Google Chrome, as Mozilla Firefox does not support some of the tool's features.

 

Creating A Connection from Labeling Tool to Blob

 
Before creating a project, we first need to create a connection.
  1. Click the "Plug" button, marked with a red square in the figure above. Your screen should look similar to the one below.

  2. Enter a "Display Name", the name by which you want to refer to this connection in the Labeling Tool. Here I use "OCR TEST".
  3. (Optional) Enter a "Description" for the connection.
  4. Paste the SAS URI from the previous section (the URI we copied just after uploading the files). It should have the form: https://<storage account>.blob.core.windows.net/<container name>?<SAS value>.
  5. Click "Save Connection".
  6. You will get a notification "Successfully Saved **".
    Your tool is now connected to your Blob Storage.
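
If the connection fails to save, a common cause is a pasted URI that does not match the form shown in step 4. As a rough illustration, the sketch below splits a URI into the three parts of that form and raises an error when one of them is missing. The function name split_sas_uri and the example values are hypothetical, and the check covers only the shape of the URI, not whether the SAS is valid.

```python
from urllib.parse import urlparse

BLOB_HOST_SUFFIX = ".blob.core.windows.net"


def split_sas_uri(sas_uri):
    """Split a container SAS URI into (storage account, container name, SAS value)."""
    parts = urlparse(sas_uri)
    host = parts.hostname or ""
    if parts.scheme != "https" or not host.endswith(BLOB_HOST_SUFFIX):
        raise ValueError("expected https://<storage account>.blob.core.windows.net/...")

    # Everything before the fixed suffix is the storage account name.
    account = host[: -len(BLOB_HOST_SUFFIX)]
    if not account:
        raise ValueError("the storage account name is missing")

    # The path should contain exactly one segment: the container name.
    container = parts.path.strip("/")
    if not container or "/" in container:
        raise ValueError("expected exactly one path segment: the container name")

    # The SAS value is the whole query string after the "?".
    if not parts.query:
        raise ValueError("the SAS value (query string) is missing")
    return account, container, parts.query


# Hypothetical values, for illustration only.
account, container, sas_value = split_sas_uri(
    "https://myaccount.blob.core.windows.net/test?sp=rwdl&sig=PLACEHOLDER"
)
print(account, container)
```

A URI copied from a single blob instead of the container would fail the one-segment check, which is the mistake this sketch is mainly meant to catch.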

Create a Project

 
Now let's create a project.
  1. Click "Home" in the left pane.
  2. Click "New Project". Your screen should look similar to the one below.

  3. Enter a "Display Name", the name by which you want to refer to this project in the Labeling Tool. Here I use "My Test Project".
  4. Select "Generate New Token", as you are using the tool for the first time. The Security Token helps us work collaboratively and access the project from a different browser without compromising security. To know more about this read Article ""
  5. Select the connection you just created. Here I created "OCR TEST", so I select it from the drop-down list.
  6. We can leave the folder path empty, as we put the files directly in the container. If you uploaded the files to a folder, give its path here.
  7. Paste the Form Recognizer URI (i.e. the Endpoint) and the API Key you copied while creating the Form Recognizer Azure resource. If you have not copied them, go to the Azure Portal and copy them from there.
  8. Click "Save Project".
  9. You will see a notification saying "Successfully Saved ***** Project Settings".
    Your project is now created, and you are taken to the project tab, which looks similar to the screenshot below. The leftmost tab is a scrollable list with previews of the invoices you uploaded to the blob container. The center tab shows the invoice currently open for editing, with Next and Previous buttons where they apply (multi-page documents). The right tab is the place to add, delete and edit tags.

Labeling the Words

  1. Click the "+" in the right pane, highlighted in red in the screenshot above.
  2. Enter the tag (label) names; tag and label mean the same thing. Here I'm labeling the company name and the invoice amount, with the labels "Company Name" and "Invoice Amount". My screen now looks like the screenshot below.

  3. To the right of each label name there is an automatically generated number or character, highlighted with a red square in the screenshot above. This is the keyboard key assigned to that label.
  4. Select a word in the document shown in the center tab and press the corresponding key on the keyboard. The word should then appear under the label name in the right tab. Here I select the word Contoso in the center tab and press "2". I also select the value under Charges in the table inside the document and press "1". "1" and "2" are the keys assigned to "Invoice Amount" and "Company Name" respectively.
  5. Repeat the same for all the documents, selecting each one from the left tab. We need at least 5 labeled documents to train a model. Currently, the tool has no feature for labeling tables.
  6. After labeling 5 documents, it's time to create the model. Click the "Train" button in the left pane, highlighted with a red square in the screenshot below.

  7. You will be taken to the training page, which looks similar to the screenshot below. Click the "Train" button to start training the model.

  8. After the training (it takes about a minute), you will see the training results, with the "Average Accuracy" of the model and the "Estimated Accuracy" of each tag. If these values are low, add more labeled samples and train again, which can improve the accuracy score. Here you also get the "Model ID", which is used to refer to the model from an application. (We are not going to use it here, since we are not integrating this model into an application of our own. My Next Article "" uses it, as that is where we integrate this custom model into our application.) Mozilla Firefox does not support copying this Model ID. My screen now looks like the screenshot below.

Now our model is trained.
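
For readers curious about what training with labels looks like outside the UI, here is a hedged sketch of the request as I understand the Form Recognizer v2.0 REST API: a POST to /formrecognizer/v2.0/custom/models with the container SAS URI as "source" and "useLabelFile" set to true. The article does not confirm that the tool sends exactly this request, so treat the path and body as assumptions to check against the official API reference. The function name build_train_request and all example values are hypothetical, and the request is only built, not sent.

```python
import json
import urllib.request


def build_train_request(endpoint, api_key, container_sas_uri, folder_prefix=""):
    """Build (but do not send) a v2.0 train-with-labels request."""
    # Assumed path of the v2.0 "train custom model" operation.
    url = endpoint.rstrip("/") + "/formrecognizer/v2.0/custom/models"
    body = {
        "source": container_sas_uri,  # the SAS URI from "Getting the ground ready"
        "sourceFilter": {"prefix": folder_prefix, "includeSubFolders": False},
        "useLabelFile": True,  # use the labels saved by the labeling tool
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": api_key,  # the Form Recognizer API Key
        },
        method="POST",
    )


# Hypothetical endpoint, key and SAS URI, for illustration only.
request = build_train_request(
    "https://my-resource.cognitiveservices.azure.com/",
    "MY_API_KEY",
    "https://myaccount.blob.core.windows.net/test?sp=rwdl&sig=PLACEHOLDER",
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would actually start a training run against your resource, so only do that with real credentials and when you intend to train.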
 

Testing the Model

  1. Click the "Bulb" in the left pane, highlighted in the screenshot above. You'll be taken to the Predict page.
  2. Click the "Browse" button in the right tab and select the document you want to test. Here I selected "Invoice_6.pdf" from the Test folder in the sample data.
  3. Click "Run Analysis".
  4. The results appear in the right tab (this takes about a minute), showing each label name (field name/tag) with its confidence % and value.
    You can repeat the same with other invoices/documents too.
If you would like to know how to share the model with another team member or open it on another machine or in another browser, read the article [Accessing A Model In Form Recognizer Sample Labeling Tool By A Different Machine Or In Different Browser (Resuming The Project)].
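
The label, value and confidence shown on the Predict page can also be read from the JSON that an analysis returns. The sketch below assumes the v2.0 result layout, where each labeled field sits under analyzeResult.documentResults[].fields with "text" and "confidence" entries; that layout is my assumption and should be checked against a real response. The function name summarize_fields and the sample result, including its values, are made up for illustration.

```python
def summarize_fields(result):
    """Return (label, text, confidence) tuples from an analyze result dict."""
    rows = []
    for document in result.get("analyzeResult", {}).get("documentResults", []):
        for label, field in document.get("fields", {}).items():
            if field is None:  # label defined, but no value found in this document
                continue
            rows.append((label, field.get("text"), field.get("confidence")))
    return rows


# Made-up result in the assumed v2.0 layout, not real output from the tool.
sample_result = {
    "status": "succeeded",
    "analyzeResult": {
        "documentResults": [
            {
                "fields": {
                    "Company Name": {"text": "Contoso", "confidence": 0.98},
                    "Invoice Amount": {"text": "1,000.00", "confidence": 0.91},
                }
            }
        ]
    },
}

for label, text, confidence in summarize_fields(sample_result):
    print(f"{label}: {text} ({confidence:.0%})")
```

Keeping the extraction in one small function makes it easy to adjust if the real response turns out to be shaped differently.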
 
If you would like to know what's new in Sample Labelling Tool v2.1, read the article [What's New In The Form Recognizer Sample Labelling Tool v2.1?].
 
To learn about more features of the Form Recognizer Labeling Tool, refer to the official documentation.
 
I hope you enjoyed this article. Please comment if you face any difficulty while following it. I'm eagerly waiting for your valuable feedback.