Optical Character Recognition Using Google Vision API On Android

 

Introduction

 
In this tutorial, we will learn how to perform Optical Character Recognition (OCR) on Android using the Mobile Vision API. We will import the Google Vision library into an Android Studio project and use it to retrieve text from an image.
 

Android Mobile Vision API

 
The Mobile Vision API provides a framework for finding objects in photos and videos. The framework includes detectors, which locate and describe visual objects in images or video frames, and an event-driven API that tracks the position of those objects in video. The Mobile Vision API includes face, bar code, and text detectors, which can be applied separately or together.
 
The text detector is used not only to get text from images, but also to structure the retrieved text. It divides the captured text into the following categories.
  • TextBlock - A scanned paragraph (a block of text).
  • Line - A line of text within a TextBlock.
  • Element - A word within a Line.
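
To make the three levels concrete, here is a minimal plain-Java sketch that splits an ordinary string the same way: paragraphs, then lines, then words. The class TextHierarchySketch and its methods are hypothetical names used only for illustration. They are not part of the Vision API, which performs this segmentation on the image itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: hypothetical helpers, not part of the Vision API.
final class TextHierarchySketch {

    // Block level: paragraphs separated by one or more blank lines.
    static List<String> blocks(String text) {
        List<String> result = new ArrayList<>();
        for (String block : text.trim().split("\\n\\s*\\n")) {
            if (!block.trim().isEmpty()) {
                result.add(block.trim());
            }
        }
        return result;
    }

    // Line level: each line of text inside one block.
    static List<String> lines(String block) {
        return Arrays.asList(block.split("\\n"));
    }

    // Element level: each whitespace-separated word inside one line.
    static List<String> elements(String line) {
        return Arrays.asList(line.trim().split("\\s+"));
    }
}
```

For example, passing a two-paragraph string to blocks() yields two entries, and each entry can then be passed to lines() and elements() in turn.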
Coding Part
 
Steps
 
I have split this part into three steps, as follows.
 
Step 1
 
Creating a New Project with Empty Activity and Gradle Setup.
 
Step 2
 
Setting up Manifest for OCR.
 
Step 3
 
Implementing OCR in Application.
 
Step 1
 
We will now start coding for OCR. Create a new Android project with an Empty Activity. Then add the following line to your app-level build.gradle file to import the library.
 
For Android Studio before 3.0
 
compile 'com.google.android.gms:play-services-vision:11.8.0'
 
From Android Studio 3.0
 
implementation 'com.google.android.gms:play-services-vision:11.8.0'
 
Step 2
 
Open your AndroidManifest.xml file and add the following meta-data element inside the <application> element. It instructs Google Play services to download the OCR dependencies when the app is installed.
<meta-data android:name="com.google.android.gms.vision.DEPENDENCIES" android:value="ocr"/>
 
Step 3
 
Open your activity_main.xml file and paste the following code. It is just the layout (designer) part of the application.
<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:orientation="vertical"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:padding="15dp">
    <ImageView
        android:id="@+id/image_view"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:scaleType="centerInside" />
    <Button
        android:id="@+id/btnProcess"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="Process" />
    <TextView
        android:id="@+id/txtView"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:text="No Text"
        android:layout_gravity="center"
        android:textSize="25sp" />
</LinearLayout>
Open your MainActivity.java file and initialize the widgets used in your layout. Then add the following code to start Optical Character Recognition.
// Requires imports from android.graphics (Bitmap, BitmapFactory), android.util (Log, SparseArray),
// com.google.android.gms.vision (Frame) and com.google.android.gms.vision.text (Text, TextBlock, TextRecognizer).
// Get the bitmap from the drawable resources of the application.
Bitmap bitmap = BitmapFactory.decodeResource(getApplicationContext().getResources(), R.drawable.ocr_sample);
// Create the text recognizer.
TextRecognizer txtRecognizer = new TextRecognizer.Builder(getApplicationContext()).build();
if (!txtRecognizer.isOperational())
{
    // Shown if Google Play services is not up to date or OCR is not supported on the device.
    txtView.setText("Detector dependencies are not yet available");
}
else
{
    // Wrap the bitmap in a Frame to perform OCR on it.
    Frame frame = new Frame.Builder().setBitmap(bitmap).build();
    SparseArray<TextBlock> items = txtRecognizer.detect(frame);
    StringBuilder strBuilder = new StringBuilder();
    for (int i = 0; i < items.size(); i++)
    {
        TextBlock item = items.valueAt(i);
        strBuilder.append(item.getValue());
        strBuilder.append("/");
        // The following shows how to use lines and elements as well.
        for (Text line : item.getComponents()) {
            // Extract scanned text lines here.
            Log.v("lines", line.getValue());
            for (Text element : line.getComponents()) {
                // Extract scanned text words here.
                Log.v("element", element.getValue());
            }
        }
    }
    txtView.setText(strBuilder.toString());
}
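txtRecognizer.isOperational() checks whether the detector is ready to use, that is, whether the device supports the Google Vision API and the required dependencies are available. The output of the TextRecognizer is returned as a SparseArray of TextBlock objects, whose values are combined into a single string using a StringBuilder.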
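
The listing above appends "/" after every block, so the displayed string always ends with a separator. As a possible variation, the sketch below joins already-extracted block values without a trailing separator. BlockJoiner is a hypothetical helper written in plain Java; it assumes you have first collected each block's getValue() result into a List<String>.

```java
import java.util.List;

// Hypothetical helper, not part of the Vision API: joins block values
// with a separator and leaves no separator at the end.
final class BlockJoiner {

    static String join(List<String> blockValues, String separator) {
        StringBuilder builder = new StringBuilder();
        for (String value : blockValues) {
            if (value == null || value.isEmpty()) {
                continue; // skip blocks that produced no text
            }
            if (builder.length() > 0) {
                builder.append(separator); // only between two values
            }
            builder.append(value);
        }
        return builder.toString();
    }
}
```

The result of BlockJoiner.join(values, "/") could then be passed to txtView.setText() in place of strBuilder.toString().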
 
TextBlock
 
I have used TextBlock to retrieve the paragraph from the image using OCR.
 
Lines
 
You can get the lines of a TextBlock using
textblockName.getComponents()
 
Element
 
You can get the elements (words) of a Line using
lineName.getComponents()
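
Because each level exposes its children through getComponents(), the whole hierarchy can be walked with one recursive method. The sketch below shows the idea in plain Java so that it can run outside Android. TextNode, SimpleNode, and TextWalker are hypothetical stand-ins that model only the getValue() and getComponents() accessors used in this article; they are not the library's own types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the library's text types (assumption: only
// getValue() and getComponents() are modeled here).
interface TextNode {
    String getValue();
    List<? extends TextNode> getComponents();
}

// Simple immutable node used to build sample data.
final class SimpleNode implements TextNode {
    private final String value;
    private final List<TextNode> components;

    SimpleNode(String value, TextNode... components) {
        this.value = value;
        this.components = Arrays.asList(components);
    }

    @Override
    public String getValue() {
        return value;
    }

    @Override
    public List<? extends TextNode> getComponents() {
        return components;
    }
}

final class TextWalker {
    // depth 0 = the nodes passed in (blocks), 1 = lines, 2 = elements (words).
    static List<String> valuesAtDepth(List<? extends TextNode> nodes, int depth) {
        List<String> out = new ArrayList<>();
        for (TextNode node : nodes) {
            if (depth == 0) {
                out.add(node.getValue());
            } else {
                out.addAll(valuesAtDepth(node.getComponents(), depth - 1));
            }
        }
        return out;
    }
}
```

With a list of blocks as input, a depth of 1 collects every line and a depth of 2 collects every word, in the order the components are stored.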
 
Demo
 
Optical Character Recognition in Android
 
Download Code
 
You can download the full source code for this article from GitHub.

