Windows Phone 8.1 Optical Character Recognition (OCR)

This article shows how to use the OCR library that is now available for Windows Phone 8.1. The library makes it easy to read text from images and returns both the text and its layout information.

Introduction

Wow! The most desired library is now supported on Windows Phone 8.1. Many developers have been waiting for it, and Microsoft finally released it in the last "Preview Program". The Optical Character Recognition (OCR) library reads text from images and returns the text and layout information.

OCR library features

  • Ability to recognize patterns (for example: email, phone and URIs) from image text
  • Launching the patterns (for example: making phone calls, sending mail, visiting a website)
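
To give a feel for what "recognizing patterns" in image text means, here is a minimal sketch that scans already-extracted text for email addresses and URLs. The library's own pattern API is not shown in this article, so the sketch uses System.Text.RegularExpressions as a stand-in; the class name OcrTextPatterns and the two deliberately loose patterns are assumptions made for illustration only.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of the OCR library): finds simple patterns in extracted text.
public static class OcrTextPatterns
{
    // Loose illustrative patterns; they are not complete validators.
    private static readonly Regex Email =
        new Regex(@"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}");
    private static readonly Regex Url =
        new Regex(@"https?://[^\s]+", RegexOptions.IgnoreCase);

    public static List<string> FindEmails(string text)
    {
        return Collect(Email, text);
    }

    public static List<string> FindUrls(string text)
    {
        return Collect(Url, text);
    }

    // Returns every match of the pattern, in the order found; null input yields an empty list.
    private static List<string> Collect(Regex pattern, string text)
    {
        var found = new List<string>();
        foreach (Match match in pattern.Matches(text ?? string.Empty))
        {
            found.Add(match.Value);
        }
        return found;
    }
}
```

On Windows Phone 8.1, a matched URL (or a "mailto:" URI built from a matched address) could then be handed to Windows.System.Launcher.LaunchUriAsync, which is one possible way to cover the "launching" feature.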

OCR Limitations

  • Image dimensions must be >= 40*40 pixels and <= 2600*2600 pixels.
  • All text lines in the image must have the same orientation and the same direction. Fortunately, OCR is able to correct rotation of up to ±40 degrees.

An inaccurate reading may be caused by the following:

  • Blurry images
  • Handwritten or cursive text
  • Artistic font styles
  • Small text size (less than 15 pixels for Western languages, or less than 20 pixels for East Asian languages)
  • Complex backgrounds
  • Shadows or glare over text
  • Perspective distortion
  • Oversized or dropped capital letters at the beginnings of words
  • Subscript, superscript, or strike-through text

The following describes how to build the sample:

  • Be sure you've downloaded and installed the Windows Phone SDK. For more information, see Get the SDK.
  • I assume you're going to test your app on the Windows Phone emulator. If you want to test your app on a phone, you need to use an additional procedure. For more info, see Register your Windows Phone device for development.
  • This article assumes you're using Microsoft Visual Studio Express 2013 for Windows.

Download and install the OCR Library

This library is not included in the Windows Software Development Kit (SDK); it is distributed as a NuGet package. To install it, right-click your project, click "Manage NuGet Packages", select "Online", search for "Microsoft.Windows.Ocr", and then click the "Install" button. See the following image for your reference.



"Any CPU" problem

This library does not work with the "AnyCPU" target platform. To change the build configuration of your project from AnyCPU to x86, x64, or ARM, right-click the solution, click Configuration Properties -> Configuration Manager, and change the active solution platform to x86 (if you are using the emulator) or ARM (if you are using a Windows Phone device).

After you install the OCR library, an "OcrResources" folder containing the "MsOcrRes.orp" file is added to your project.

When you install the package, the file <solution_name>\packages\Microsoft.Windows.Ocr.1.0.0\OcrResources\MsOcrRes.orp is copied and injected into your project in the location <solution_name>\<project_name>\OcrResources\MsOcrRes.orp. This file is consumed by the OcrEngine object for text recognition in a specific language.

OCR supported languages

There are 21 supported languages. Based on recognition accuracy and performance, supported languages are divided into the following three groups:

  • Excellent: Czech, Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Polish, Portuguese, Spanish and Swedish.
  • Very good: Chinese Simplified, Greek, Japanese, Russian and Turkish.
  • Good: Chinese Traditional and Korean.
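
If your app lets the user pick a language, it can be handy to know which accuracy group that language falls into. The following is a small sketch of the grouping above as a lookup table. The names OcrAccuracyTier and OcrLanguageTiers are made up for this example and are not part of the OCR library; the language names are simply the ones listed above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration; the OCR library does not expose these.
public enum OcrAccuracyTier { Excellent, VeryGood, Good }

public static class OcrLanguageTiers
{
    private static readonly Dictionary<string, OcrAccuracyTier> Tiers = Build();

    private static Dictionary<string, OcrAccuracyTier> Build()
    {
        // Case-insensitive keys so "english" and "English" both resolve.
        var map = new Dictionary<string, OcrAccuracyTier>(StringComparer.OrdinalIgnoreCase);
        Add(map, OcrAccuracyTier.Excellent,
            "Czech", "Danish", "Dutch", "English", "Finnish", "French", "German",
            "Hungarian", "Italian", "Norwegian", "Polish", "Portuguese", "Spanish", "Swedish");
        Add(map, OcrAccuracyTier.VeryGood,
            "Chinese Simplified", "Greek", "Japanese", "Russian", "Turkish");
        Add(map, OcrAccuracyTier.Good,
            "Chinese Traditional", "Korean");
        return map;
    }

    private static void Add(Dictionary<string, OcrAccuracyTier> map,
                            OcrAccuracyTier tier, params string[] languages)
    {
        foreach (var language in languages)
        {
            map[language] = tier;
        }
    }

    // Number of languages in the table.
    public static int Count
    {
        get { return Tiers.Count; }
    }

    // Returns false for null or unknown language names.
    public static bool TryGetTier(string language, out OcrAccuracyTier tier)
    {
        if (language == null)
        {
            tier = default(OcrAccuracyTier);
            return false;
        }
        return Tiers.TryGetValue(language, out tier);
    }
}
```

An app could use such a table to warn the user that results for a "Good" language may need more manual correction than results for an "Excellent" one.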

Note: By default English language resources are included in the target project. If you want to use a custom group of languages in your app, use the OCR Resources Generator tool to generate a new OCR resources file and replace the resources that were injected into your project when you installed the package.

To generate OCR resource files:

  • Launch the OCR Resources Generator tool located at <solution_name>\packages\Microsoft.Windows.Ocr.1.0.0\OcrResourcesGenerator\OcrResourcesGenerator.exe. You will then find the following dialog box:



  • Use the buttons in the center of the tool to create a list of the required languages.
  • Click the Generate Resources button. Pick a location to save the new resources file.
  • Replace the existing file <solution_name>\<project_name>\OcrResources\MsOcrRes.orp with the new file that you just generated.

How to extract text from an image

Step 1

In the page constructor, create and initialize a class-level instance of the OcrEngine. The width and height of the image are passed to the engine later as unsigned integers.

Step 2

Load the image and convert it to a WriteableBitmap to get its pixel height and width.

Step 3

Check the image dimensions: they must be >= 40*40 pixels and <= 2600*2600 pixels.
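
The size check can be pulled into a tiny helper so that it is easy to reuse and test on its own. This is only a sketch: the class name OcrImageLimits and its members are made up for this example, and the limits are the 40 and 2600 pixel bounds given in the limitations list.

```csharp
// Hypothetical helper for illustration; not part of the OCR library.
public static class OcrImageLimits
{
    public const int MinDimension = 40;    // smallest supported width or height, in pixels
    public const int MaxDimension = 2600;  // largest supported width or height, in pixels

    // True when both dimensions fall inside the supported range (bounds included).
    public static bool IsSupportedSize(int pixelWidth, int pixelHeight)
    {
        return pixelWidth >= MinDimension && pixelWidth <= MaxDimension
            && pixelHeight >= MinDimension && pixelHeight <= MaxDimension;
    }
}
```

With a WriteableBitmap in hand, the call would be something like OcrImageLimits.IsSupportedSize(bitmap.PixelWidth, bitmap.PixelHeight) before passing the pixels to the engine.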

Step 4

Call the RecognizeAsync method of the OcrEngine class. This method returns an OcrResult object that contains the recognized text and its size and position. The result is split into lines and the lines are split into words.
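
To make the "lines split into words" shape concrete, here is a sketch that rebuilds plain text from such a structure. It does not use the real OcrResult type: the nested string sequences are a stand-in, where the outer sequence plays the role of the lines and each inner sequence holds the text of the words in one line. The class name OcrTextAssembler is made up for this example.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper; 'lines' is a plain-string stand-in for the OCR result's lines and words.
public static class OcrTextAssembler
{
    // Joins the words of each line with spaces and separates lines with '\n'.
    public static string Join(IEnumerable<IEnumerable<string>> lines)
    {
        var builder = new StringBuilder();
        foreach (var words in lines)
        {
            if (builder.Length > 0)
            {
                builder.Append('\n');
            }
            builder.Append(string.Join(" ", words));
        }
        return builder.ToString();
    }
}
```

With the real result object, the same two nested loops would run over the recognized lines and the words of each line, reading the text of each word.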

Step 5

After the preceding steps, your code for extracting the text from an image looks like this.

C# language

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Runtime.InteropServices.WindowsRuntime;
using System.Threading.Tasks;
using Windows.Foundation;
using Windows.Foundation.Collections;
using Windows.Storage;
using Windows.Storage.FileProperties;
using Windows.UI;
using Windows.UI.Xaml;
using Windows.UI.Xaml.Controls;
using Windows.UI.Xaml.Controls.Primitives;
using Windows.UI.Xaml.Data;
using Windows.UI.Xaml.Input;
using Windows.UI.Xaml.Media;
using Windows.UI.Xaml.Media.Imaging;
using Windows.UI.Xaml.Navigation;
using WindowsPreview.Media.Ocr;

namespace OCRImgReadText
{
    public sealed partial class MainPage : Page
    {
        // Bitmap holder of the currently loaded image.
        private WriteableBitmap bitmap;
        // OCR engine instance used to extract text from images.
        private OcrEngine ocrEngine;

        public MainPage()
        {
            this.InitializeComponent();
            ocrEngine = new OcrEngine(OcrLanguage.English);
            TextOverlay.Children.Clear();
        }

        protected override async void OnNavigatedTo(NavigationEventArgs e)
        {
            // Get the local test image that ships with the app package.
            var file = await Windows.ApplicationModel.Package.Current.InstalledLocation.GetFileAsync("TestImages\\SQuotes.jpg");
            await LoadImage(file);
        }

        private async Task LoadImage(StorageFile file)
        {
            ImageProperties imgProp = await file.Properties.GetImagePropertiesAsync();

            using (var imgStream = await file.OpenAsync(FileAccessMode.Read))
            {
                bitmap = new WriteableBitmap((int)imgProp.Width, (int)imgProp.Height);
                bitmap.SetSource(imgStream);
                PreviewImage.Source = bitmap;
            }
        }

        private async void ExtractText_Click(object sender, RoutedEventArgs e)
        {
            // Prevent another OCR request, since only one image can be processed at a time by the same OCR engine instance.
            //ExtractTextButton.IsEnabled = false;

            // Check whether the loaded image is supported for processing.
            // Supported image dimensions are between 40 and 2600 pixels.
            if (bitmap.PixelHeight < 40 ||
                bitmap.PixelHeight > 2600 ||
                bitmap.PixelWidth < 40 ||
                bitmap.PixelWidth > 2600)
            {
                ImageText.Text = "Image size is not supported." +
                                    Environment.NewLine +
                                    "Loaded image size is " + bitmap.PixelWidth + "x" + bitmap.PixelHeight + "." +
                                    Environment.NewLine +
                                    "Supported image dimensions are between 40 and 2600 pixels.";
                //ImageText.Style = (Style)Application.Current.Resources["RedTextStyle"];

                return;
            }

            // Main API call: extract text from the image (height first, then width, then the pixel bytes).
            var ocrResult = await ocrEngine.RecognizeAsync((uint)bitmap.PixelHeight, (uint)bitmap.PixelWidth, bitmap.PixelBuffer.ToArray());

            // If the OCR result does not contain any lines, no text was recognized.
            if (ocrResult.Lines != null)
            {
                // Used for the text overlay.
                // Prepare a scale transform for the words, since the image is not displayed at its original size.
                var scaleTransform = new ScaleTransform
                {
                    CenterX = 0,
                    CenterY = 0,
                    ScaleX = PreviewImage.ActualWidth / bitmap.PixelWidth,
                    ScaleY = PreviewImage.ActualHeight / bitmap.PixelHeight,
                };

                // Rotate the preview by the detected text angle, if one was reported.
                if (ocrResult.TextAngle != null)
                {
                    PreviewImage.RenderTransform = new RotateTransform
                    {
                        Angle = (double)ocrResult.TextAngle,
                        CenterX = PreviewImage.ActualWidth / 2,
                        CenterY = PreviewImage.ActualHeight / 2
                    };
                }

                // Remove overlays left over from a previous extraction.
                TextOverlay.Children.Clear();

                string extractedText = "";

                // Iterate over the recognized lines of text.
                foreach (var line in ocrResult.Lines)
                {
                    // Iterate over the words in the line.
                    foreach (var word in line.Words)
                    {
                        var originalRect = new Rect(word.Left, word.Top, word.Width, word.Height);
                        var overlayRect = scaleTransform.TransformBounds(originalRect);

                        var wordTextBlock = new TextBlock()
                        {
                            Height = overlayRect.Height,
                            Width = overlayRect.Width,
                            FontSize = overlayRect.Height * 0.8,
                            Text = word.Text,
                        };

                        // Define position, background, etc.
                        var border = new Border()
                        {
                            Margin = new Thickness(overlayRect.Left, overlayRect.Top, 0, 0),
                            Height = overlayRect.Height,
                            Width = overlayRect.Width,
                            Background = new SolidColorBrush(Colors.Orange),
                            Opacity = 0.5,
                            HorizontalAlignment = HorizontalAlignment.Left,
                            VerticalAlignment = VerticalAlignment.Top,
                            Child = wordTextBlock,
                        };
                        OverlayTextButton.IsEnabled = true;
                        // Put the filled text block in the overlay grid.
                        TextOverlay.Children.Add(border);
                        extractedText += word.Text + " ";
                    }
                    extractedText += Environment.NewLine;
                }

                ImageText.Text = extractedText;
            }
            else
            {
                ImageText.Text = "No text.";
            }
        }

        // Toggle the visibility of the word overlay drawn on top of the image.
        private void OverlayText_Click(object sender, RoutedEventArgs e)
        {
            if (TextOverlay.Visibility == Visibility.Visible)
            {
                TextOverlay.Visibility = Visibility.Collapsed;
            }
            else
            {
                TextOverlay.Visibility = Visibility.Visible;
            }
        }
    }
}
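
The overlay positioning in the listing above boils down to simple arithmetic: each word rectangle is reported in image pixels and must be scaled to the size at which the image is actually displayed. The following sketch shows that arithmetic without any XAML types. PixelRect and OverlayMath are hypothetical names used only for this illustration; the listing itself relies on ScaleTransform.TransformBounds for the same job.

```csharp
// Hypothetical plain rectangle for illustration; the listing uses Windows.Foundation.Rect.
public struct PixelRect
{
    public double Left;
    public double Top;
    public double Width;
    public double Height;

    public PixelRect(double left, double top, double width, double height)
    {
        Left = left;
        Top = top;
        Width = width;
        Height = height;
    }
}

public static class OverlayMath
{
    // Scales a word rectangle from image pixel coordinates to displayed coordinates.
    public static PixelRect ScaleToDisplay(PixelRect word,
                                           double imageWidth, double imageHeight,
                                           double displayWidth, double displayHeight)
    {
        double scaleX = displayWidth / imageWidth;
        double scaleY = displayHeight / imageHeight;
        return new PixelRect(word.Left * scaleX, word.Top * scaleY,
                             word.Width * scaleX, word.Height * scaleY);
    }
}
```

For example, a word at (100, 200) with size 50 x 20 in a 600 x 400 image that is displayed at 300 x 200 ends up at (50, 100) with size 25 x 10.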

Step 6

Your UI might look like the following.

XAML code

<Grid>
    <Grid.RowDefinitions>
        <RowDefinition Height="Auto"/>
        <RowDefinition Height="Auto"/>
        <RowDefinition Height="*"/>
    </Grid.RowDefinitions>
    <StackPanel Grid.Row="1" x:Name="ControlPanel" Orientation="Vertical">
        <StackPanel Orientation="Horizontal" Margin="10,0,10,0">
            <Button x:Name="ExtractTextButton" Content="Extract Image Text" FontSize="15" MinWidth="90" Click="ExtractText_Click" Margin="0,0,5,0"/>
            <Button x:Name="OverlayTextButton" IsEnabled="False" Content="Overlay Image Text" FontSize="15" MinWidth="90" Click="OverlayText_Click" Margin="0,0,5,0"/>
        </StackPanel>
        <StackPanel Grid.Row="1" Orientation="Horizontal"/>
    </StackPanel>
    <ScrollViewer Grid.Row="2" VerticalScrollMode="Auto" VerticalScrollBarVisibility="Auto" Margin="0, 10, 0, 0">
        <!-- Container for the image preview and the extracted text. -->
        <StackPanel x:Name="Output" Margin="10,0,10,0" Orientation="Vertical" Visibility="Visible">

            <StackPanel x:Name="Content" Orientation="Vertical" Visibility="Visible">
                <Grid x:Name="Image">
                    <Image x:Name="PreviewImage" Margin="0,0,10,10" Source="" Stretch="Uniform" Width="300" HorizontalAlignment="Left" VerticalAlignment="Top"/>
                    <Grid x:Name="TextOverlay" Visibility="Collapsed" Margin="0,0,10,10" HorizontalAlignment="Left" VerticalAlignment="Top"/>
                </Grid>
                <!-- This Grid contains the extracted text output. -->
                <Grid x:Name="Result" HorizontalAlignment="Left" VerticalAlignment="Top">
                    <Grid.RowDefinitions>
                        <RowDefinition Height="Auto"/>
                        <RowDefinition Height="Auto"/>
                    </Grid.RowDefinitions>
                    <TextBlock Grid.Row="0" FontSize="25" Text="Extracted image text:"/>
                    <TextBlock Name="ImageText" Grid.Row="1" Foreground="#FF1CD399" FontSize="25" Text="Text not yet extracted."/>
                </Grid>
            </StackPanel>
        </StackPanel>
    </ScrollViewer>
</Grid>

Output





Note:
If you download and run this code as-is, you will get an error until you install the OCR library from "Manage NuGet Packages".


Summary

In this article we have learned how the OCR library has made it easy to read text from images in Windows Phone 8.1.