Optical Character Recognition By Camera Using Google Vision API On Android


Introduction

 
In this tutorial, we will learn how to do Optical Character Recognition with the camera in Android using the Google Vision API. We will import the Google Vision API library in Android Studio and implement OCR to retrieve text from the camera preview.
 
You can find my previous tutorial on Optical Character Recognition using Google Vision API for Recognizing Text from Images here. That tutorial covered the introduction to the Google Vision API, so without any delay we will move straight to the coding part.
 
Steps
 
I have split this part into the following four steps.
  • Step 1 - Creating a New Project with Empty Activity and Gradle Setup.
  • Step 2 - Setting up Manifest for OCR.
  • Step 3 - Implementing Camera View using SurfaceView.
  • Step 4 - Implementing OCR in Application.
Step 1 - Creating a New Project with Empty Activity and Gradle Setup
 
We will start coding for OCR. Create a New Android Project. Add the following line in your app level build.gradle file to import the library.
implementation 'com.google.android.gms:play-services-vision:15.2.0'
 
Step 2 - Setting up Manifest for OCR
 
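Open your manifest file and add the following meta-data element inside the application element. It instructs Google Play services to download the OCR dependencies when the app is installed.
<meta-data android:name="com.google.android.gms.vision.DEPENDENCIES" android:value="ocr"/>
The activity in Step 4 requests the camera permission at runtime, so the permission must also be declared in the manifest, outside the application element.
<uses-permission android:name="android.permission.CAMERA"/>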
Step 3 - Implementing Camera View using SurfaceView
 
Open your activity_main.xml file and paste the following code. It defines the layout: a full-screen SurfaceView for the camera preview and a TextView at the bottom of the screen for the recognized text.
<?xml version="1.0" encoding="utf-8"?>
<android.support.constraint.ConstraintLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:context="com.androidmads.ocrcamera.MainActivity">

    <SurfaceView
        android:id="@+id/surface_view"
        android:layout_width="match_parent"
        android:layout_height="match_parent" />

    <TextView
        android:id="@+id/txtview"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        app:layout_constraintBottom_toBottomOf="parent"
        android:text="No Text"
        android:textColor="@android:color/white"
        android:textSize="20sp"
        android:padding="5dp"/>

</android.support.constraint.ConstraintLayout>
Step 4 - Implementing OCR in Application
 
Open your MainActivity.java file and initialize the widgets used in your layout. Add the following code to start the camera view.
 
Have your Activity implement SurfaceHolder.Callback and Detector.Processor so that it can start the camera preview and receive the detections.
TextRecognizer txtRecognizer = new TextRecognizer.Builder(getApplicationContext()).build();
if (!txtRecognizer.isOperational()) {
    Log.e("Main Activity", "Detector dependencies are not yet available");
} else {
    cameraSource = new CameraSource.Builder(getApplicationContext(), txtRecognizer)
            .setFacing(CameraSource.CAMERA_FACING_BACK)
            .setRequestedPreviewSize(1280, 1024)
            .setRequestedFps(2.0f)
            .setAutoFocusEnabled(true)
            .build();
    cameraView.getHolder().addCallback(this);
    txtRecognizer.setProcessor(this);
}
Here, TextRecognizer performs character recognition on the camera preview frames, and txtRecognizer.isOperational() checks whether the detector's dependencies have been downloaded and are available on the device. The output of the TextRecognizer arrives as a SparseArray of TextBlock items, which can be combined into a single string with a StringBuilder.
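 
The SparseArray and StringBuilder pattern mentioned above can be sketched in plain Java so that it runs off-device. This is only an illustration under assumptions: DetectionJoiner and joinBlocks are hypothetical names, and a List of strings stands in for the values of the SparseArray returned by detections.getDetectedItems(). On Android you would read each entry with items.valueAt(i).getValue() instead.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: joins the text of detected blocks with a separator.
final class DetectionJoiner {

    // blockValues stands in for the values of a SparseArray<TextBlock>.
    static String joinBlocks(List<String> blockValues, String separator) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < blockValues.size(); i++) {
            if (i > 0) {
                builder.append(separator); // separator only between blocks
            }
            builder.append(blockValues.get(i));
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        List<String> blocks = Arrays.asList("Hello world", "Second paragraph");
        System.out.println(joinBlocks(blocks, "/")); // prints "Hello world/Second paragraph"
    }
}
```

The sketch puts the separator only between blocks, which avoids a trailing "/" in the displayed text.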
 

TextBlock

 
A TextBlock represents a paragraph of recognized text. I have used it to retrieve each paragraph from the camera frame.
 

Lines

 
You can get the lines of a TextBlock using
textblockName.getComponents()
 

Element

 
You can get the elements (words) of a line using
lineName.getComponents()
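 
The block, line, and element levels form a small tree that is walked with nested loops. The following is a minimal sketch in plain Java, not the Vision API itself: TextNode is a hypothetical stand-in for the API's Text type (its value and components fields play the role of getValue() and getComponents()), and HierarchyWalker is a hypothetical helper name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a recognized text item (block, line, or element).
final class TextNode {
    final String value;              // plays the role of getValue()
    final List<TextNode> components; // plays the role of getComponents()

    TextNode(String value, TextNode... components) {
        this.value = value;
        this.components = Arrays.asList(components);
    }
}

final class HierarchyWalker {

    // Collects the words of a block by descending block -> lines -> elements.
    static List<String> words(TextNode block) {
        List<String> result = new ArrayList<>();
        for (TextNode line : block.components) {
            for (TextNode element : line.components) {
                result.add(element.value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        TextNode line1 = new TextNode("Hello world", new TextNode("Hello"), new TextNode("world"));
        TextNode line2 = new TextNode("Bye", new TextNode("Bye"));
        TextNode block = new TextNode("Hello world\nBye", line1, line2);
        System.out.println(words(block)); // prints [Hello, world, Bye]
    }
}
```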
 
The CameraSource starts in the surfaceCreated callback and begins the scanning process. The received detections are delivered as a SparseArray and are read in the same way as the results of recognizing text from a bitmap in Android.
 
The TextView at the bottom of the screen displays the scanned text.
 
Full Code
 
The full code of MainActivity.java is given below.
package com.androidmads.ocrcamera;

import android.Manifest;
import android.annotation.SuppressLint;
import android.content.pm.PackageManager;
import android.os.Bundle;
import android.support.annotation.NonNull;
import android.support.v4.app.ActivityCompat;
import android.support.v7.app.AppCompatActivity;
import android.util.Log;
import android.util.SparseArray;
import android.view.SurfaceHolder;
import android.view.SurfaceView;
import android.widget.TextView;

import com.google.android.gms.vision.CameraSource;
import com.google.android.gms.vision.Detector;
import com.google.android.gms.vision.text.Text;
import com.google.android.gms.vision.text.TextBlock;
import com.google.android.gms.vision.text.TextRecognizer;

public class MainActivity extends AppCompatActivity
        implements SurfaceHolder.Callback, Detector.Processor<TextBlock> {

    private static final String TAG = "Main Activity";
    private static final int CAMERA_PERMISSION_REQUEST = 1;

    private SurfaceView cameraView;
    private TextView txtView;
    private CameraSource cameraSource;

    @SuppressLint("MissingPermission")
    @Override
    public void onRequestPermissionsResult(int requestCode, @NonNull String[] permissions,
                                           @NonNull int[] grantResults) {
        // Start the preview once the user grants the camera permission.
        // grantResults can be empty if the request was interrupted.
        if (requestCode == CAMERA_PERMISSION_REQUEST
                && grantResults.length > 0
                && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
            try {
                cameraSource.start(cameraView.getHolder());
            } catch (Exception e) {
                Log.e(TAG, "Could not start the camera source", e);
            }
        }
    }

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        cameraView = findViewById(R.id.surface_view);
        txtView = findViewById(R.id.txtview);
        TextRecognizer txtRecognizer = new TextRecognizer.Builder(getApplicationContext()).build();
        if (!txtRecognizer.isOperational()) {
            Log.e(TAG, "Detector dependencies are not yet available");
        } else {
            cameraSource = new CameraSource.Builder(getApplicationContext(), txtRecognizer)
                    .setFacing(CameraSource.CAMERA_FACING_BACK)
                    .setRequestedPreviewSize(1280, 1024)
                    .setRequestedFps(2.0f)
                    .setAutoFocusEnabled(true)
                    .build();
            cameraView.getHolder().addCallback(this);
            txtRecognizer.setProcessor(this);
        }
    }

    @Override
    public void surfaceCreated(SurfaceHolder holder) {
        try {
            // Ask for the camera permission first if it has not been granted yet.
            if (ActivityCompat.checkSelfPermission(this,
                    Manifest.permission.CAMERA) != PackageManager.PERMISSION_GRANTED) {
                ActivityCompat.requestPermissions(this,
                        new String[]{Manifest.permission.CAMERA}, CAMERA_PERMISSION_REQUEST);
                return;
            }
            cameraSource.start(cameraView.getHolder());
        } catch (Exception e) {
            Log.e(TAG, "Could not start the camera source", e);
        }
    }

    @Override
    public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {
        // Nothing to do here.
    }

    @Override
    public void surfaceDestroyed(SurfaceHolder holder) {
        cameraSource.stop();
    }

    @Override
    public void release() {
        // Nothing to release here.
    }

    @Override
    public void receiveDetections(Detector.Detections<TextBlock> detections) {
        SparseArray<TextBlock> items = detections.getDetectedItems();
        final StringBuilder strBuilder = new StringBuilder();
        for (int i = 0; i < items.size(); i++) {
            // Each TextBlock is one paragraph; only blocks go into the displayed text.
            TextBlock textBlock = items.valueAt(i);
            strBuilder.append(textBlock.getValue());
            strBuilder.append("/");
            // The following loops show how to read lines and elements as well.
            for (Text line : textBlock.getComponents()) {
                // Extract scanned text lines here.
                Log.v("lines", line.getValue());
                for (Text element : line.getComponents()) {
                    // Extract scanned text words here.
                    Log.v("element", element.getValue());
                }
            }
        }
        Log.v(TAG, strBuilder.toString());

        // receiveDetections runs off the main thread, so post the UI update.
        txtView.post(new Runnable() {
            @Override
            public void run() {
                txtView.setText(strBuilder.toString());
            }
        });
    }
}
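 
The permission callback in the listing above starts the camera when the first entry of grantResults is a grant. Android can deliver an empty array when the permission request is interrupted, so a length check is worth having. A minimal sketch of that check as a standalone helper follows, assuming hypothetical names (PermissionResults, firstGranted) and a local constant that copies the value of PackageManager.PERMISSION_GRANTED so the code runs without the Android SDK.

```java
// Hypothetical helper mirroring the check in onRequestPermissionsResult.
final class PermissionResults {

    // Same value as android.content.pm.PackageManager.PERMISSION_GRANTED.
    static final int PERMISSION_GRANTED = 0;

    // Returns true only if at least one result is present and the first is a grant.
    static boolean firstGranted(int[] grantResults) {
        return grantResults != null
                && grantResults.length > 0
                && grantResults[0] == PERMISSION_GRANTED;
    }

    public static void main(String[] args) {
        System.out.println(firstGranted(new int[] {0}));  // prints true
        System.out.println(firstGranted(new int[] {-1})); // prints false
        System.out.println(firstGranted(new int[0]));     // prints false
    }
}
```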
Download Code
 
You can download the full source code for this article from GitHub.

