Rotate Bounding Box Based on Angle: Azure OCR API/Read/Form Recognizer

Introduction

If you have worked with Azure Cognitive Services APIs such as the OCR API, Read API, or Form Recognizer API, you have probably come across the boundingBox field in the read results of the response. If the input image is slightly tilted, the bounding boxes in the response are tilted too. The response also reports the angle by which the input page is tilted. To work with the results, we sometimes need to rotate each boundingBox in the response by that angle.

Here we look at how this rotation can be done. The sample input is a receipt from KSEB photographed at an angle. We first send it to the Azure Read API to get the JSON output.

Sample Input is given to Azure Read API


Sample Output from Azure Read API

{
   "status": "succeeded",
   "createdDateTime": "2020-12-21T15:13:55Z",
   "lastUpdatedDateTime": "2020-12-21T15:13:56Z",
   "analyzeResult": {
       "version": "3.0.0",
       "readResults": [
          {
               "page": 1,
               "angle": 7.6307,
               "width": 445,
               "height": 1242,
               "unit": "pixel",
               "lines": [
                   ....
                   ....
                   ....
                   
                  {
                       "boundingBox": [
                           185,
                           91,
                           366,
                           113,
                           364,
                           131,
                           183,
                           109
                      ],
                       "text": "Demand/Disconnection Notice",
                       "words": [
                          {
                               "boundingBox": [
                                   186,
                                   92,
                                   319,
                                   109,
                                   318,
                                   126,
                                   184,
                                   109
                              ],
                               "text": "Demand/Disconnection",
                               "confidence": 0.751
                          },
                          {
                               "boundingBox": [
                                   323,
                                   109,
                                   366,
                                   114,
                                   365,
                                   131,
                                   321,
                                   126
                              ],
                               "text": "Notice",
                               "confidence": 0.907
                          }
                      ]
                  },
                   ....
                   ....
                   ....
              ]
          }
      ]
  }
}

The JSON response above from the Azure Read API says the receipt is tilted by an angle of 7.6307 degrees. To work with the coordinates in the bounding boxes, we need to rotate them by 7.6307 degrees.
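As a small illustration, the angle reported for each page can be read straight from the parsed response. The sketch below uses a trimmed copy of the sample above; the variable name response_text and the helper page_angles are assumptions made for this example, not part of the Azure SDK.

```python
import json

# Trimmed copy of the sample response above; only the fields used here are kept.
response_text = """
{
  "status": "succeeded",
  "analyzeResult": {
    "readResults": [
      {"page": 1, "angle": 7.6307, "width": 445, "height": 1242,
       "unit": "pixel", "lines": []}
    ]
  }
}
"""

analysis = json.loads(response_text)

def page_angles(analysis):
    """Hypothetical helper: map each page number to its reported angle in degrees."""
    return {
        page["page"]: page["angle"]
        for page in analysis["analyzeResult"]["readResults"]
    }

print(page_angles(analysis))
```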

Rotate a point p by n degrees with respect to the origin o

Suppose we need to rotate a point p by n degrees about an origin o. We can do this by calling a custom function as follows.

rotatedPoint = rotate(p, o, n)

The custom function rotate is shown below. It requires NumPy.

import numpy as np

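def rotate(point, origin, degrees):
    # Rotate point about origin by the given angle in degrees.
    # In image coordinates (y axis pointing down) a positive angle
    # turns the point counter-clockwise on screen.
    radians = np.deg2rad(degrees)
    x, y = point
    offset_x, offset_y = origin
    # Shift so that the origin of rotation becomes (0, 0).
    adjusted_x = (x - offset_x)
    adjusted_y = (y - offset_y)
    cos_rad = np.cos(radians)
    sin_rad = np.sin(radians)
    # Apply the rotation, then shift back.
    qx = offset_x + cos_rad * adjusted_x + sin_rad * adjusted_y
    qy = offset_y + -sin_rad * adjusted_x + cos_rad * adjusted_y
    return qx, qy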
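To see what the function does to real data, the sketch below applies it to the two top corners of the sample line "Demand/Disconnection Notice". The function is repeated so the example runs on its own; the variable names are chosen for illustration only.

```python
import numpy as np

def rotate(point, origin, degrees):
    # Same function as above, repeated so this example is self-contained.
    radians = np.deg2rad(degrees)
    x, y = point
    offset_x, offset_y = origin
    adjusted_x = x - offset_x
    adjusted_y = y - offset_y
    cos_rad = np.cos(radians)
    sin_rad = np.sin(radians)
    qx = offset_x + cos_rad * adjusted_x + sin_rad * adjusted_y
    qy = offset_y - sin_rad * adjusted_x + cos_rad * adjusted_y
    return qx, qy

angle = 7.6307
left_top = (185, 91)     # top-left corner of the sample line
right_top = (366, 113)   # top-right corner of the sample line

new_left_top = rotate(left_top, (0, 0), angle)
new_right_top = rotate(right_top, (0, 0), angle)

# Vertical gap between the two top corners before and after the rotation.
gap_before = right_top[1] - left_top[1]
gap_after = new_right_top[1] - new_left_top[1]
print(gap_before, gap_after)
```

Before the rotation the top edge drops 22 pixels from left to right. Afterwards the gap shrinks to a couple of pixels, so the edge is close to horizontal. It is not exactly zero because the reported angle describes the page as a whole rather than this one line.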

Building a function to correct the angle of bounding box coordinates in the response

The bounding box is a list of 8 values, the x and y coordinates of its 4 corners, in the following order.

[
    "left-top-x",
    "left-top-y",
    "right-top-x",
    "right-top-y",
    "right-bottom-x",
    "right-bottom-y",
    "left-bottom-x",
    "left-bottom-y"
]
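The ordering above can be made explicit by grouping the flat list into (x, y) pairs. This is a minimal sketch; to_corners is a hypothetical helper name introduced here, not something the Read API provides.

```python
def to_corners(bounding_box):
    """Group a flat [x1, y1, ..., x4, y4] list into four (x, y) tuples."""
    if len(bounding_box) != 8:
        raise ValueError("expected 8 values (4 corners)")
    # Even positions hold x values, odd positions hold y values.
    return list(zip(bounding_box[0::2], bounding_box[1::2]))

# Bounding box of the sample line "Demand/Disconnection Notice".
line_box = [185, 91, 366, 113, 364, 131, 183, 109]
left_top, right_top, right_bottom, left_bottom = to_corners(line_box)
print(left_top, right_top, right_bottom, left_bottom)
```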

So to rotate a bounding box we loop through the list in steps of 2, rotating one (x, y) pair per iteration. The angle is read from the page entry in the response, where index is the position of the page in readResults:

angle = analysis["analyzeResult"]["readResults"][index]["angle"]

Here angle = 7.6307, and the rotation is about the origin of the image, so the origin is (0, 0).

for ind in range(0, 7, 2):
    bBox[ind], bBox[ind + 1] = rotate((bBox[ind], bBox[ind + 1]), (0, 0), angle)
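The same per-corner loop can also be written as a single matrix product. The sketch below is one possible alternative, not the approach used in the rest of this article; rotate_box is a hypothetical name, and unlike the loop it returns a new list instead of modifying the input.

```python
import numpy as np

def rotate_box(bounding_box, degrees, origin=(0, 0)):
    """Rotate a flat 8-value bounding box and return a new flat list."""
    radians = np.deg2rad(degrees)
    cos_rad, sin_rad = np.cos(radians), np.sin(radians)
    # Row-vector form of the transform used by rotate():
    #   qx =  cos * x + sin * y
    #   qy = -sin * x + cos * y
    matrix = np.array([[cos_rad, -sin_rad],
                       [sin_rad,  cos_rad]])
    corners = np.asarray(bounding_box, dtype=float).reshape(4, 2)
    origin = np.asarray(origin, dtype=float)
    rotated = (corners - origin) @ matrix + origin
    return rotated.reshape(-1).tolist()

line_box = [185, 91, 366, 113, 364, 131, 183, 109]
rotated_box = rotate_box(line_box, 7.6307)
print(rotated_box)
```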

Now we know how to rotate one bounding box. Next, let's rotate all the bounding boxes of the lines in readResults. Nesting the snippet above inside a loop over lines, which is itself inside a loop over pages, achieves this. The code is below.

def correctAngle(analysis):
    for page in analysis["analyzeResult"]["readResults"]:
        # Each page reports its own tilt angle.
        angle = page["angle"]
        for line in page['lines']:
            bBox = line['boundingBox']
            for ind in range(0, 7, 2):
                bBox[ind], bBox[ind + 1] = rotate((bBox[ind], bBox[ind + 1]), (0, 0), angle)
            line['boundingBox'] = bBox
    return analysis

Now we have rotated the bounding boxes of the lines, but the words have bounding boxes too, so we apply the same logic to them inside the line loop. Only pages with a non-zero angle need rotating, so we check the angle before doing any work. After rotating a page we set its angle to 0, so the response no longer reports a tilt. The code is below.

def correctAngle(analysis):
    for page in analysis["analyzeResult"]["readResults"]:
        if page["angle"] != 0:
            for line in page['lines']:
                bBox = line['boundingBox']
                for ind in range(0, 7, 2):
                    bBox[ind], bBox[ind + 1] = rotate((bBox[ind], bBox[ind + 1]), (0, 0), page["angle"])
                line['boundingBox'] = bBox
                for word in line['words']:
                    wbBox = word['boundingBox']
                    for ind in range(0, 7, 2):
                        wbBox[ind], wbBox[ind + 1] = rotate((wbBox[ind], wbBox[ind + 1]), (0, 0), page["angle"])
                    word['boundingBox'] = wbBox
            page["angle"] = 0
    return analysis
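Note that the function above modifies analysis in place. Because it resets page["angle"] to 0, calling it a second time on the same object leaves the coordinates unchanged. The sketch below illustrates this with a trimmed response and shows one way to keep the original untouched using copy.deepcopy. The trimmed data, the condensed form of correctAngle, and the helper name corrected_copy are assumptions made for this example.

```python
import copy
import numpy as np

def rotate(point, origin, degrees):
    radians = np.deg2rad(degrees)
    x, y = point
    ox, oy = origin
    cos_rad, sin_rad = np.cos(radians), np.sin(radians)
    qx = ox + cos_rad * (x - ox) + sin_rad * (y - oy)
    qy = oy - sin_rad * (x - ox) + cos_rad * (y - oy)
    return qx, qy

def correctAngle(analysis):
    # Condensed version of the function above: line and word boxes share one loop.
    for page in analysis["analyzeResult"]["readResults"]:
        if page["angle"] != 0:
            for line in page["lines"]:
                boxes = [line["boundingBox"]] + [w["boundingBox"] for w in line["words"]]
                for box in boxes:
                    for ind in range(0, 7, 2):
                        box[ind], box[ind + 1] = rotate(
                            (box[ind], box[ind + 1]), (0, 0), page["angle"])
            page["angle"] = 0
    return analysis

def corrected_copy(analysis):
    """Hypothetical helper: rotate a deep copy so the caller's data is untouched."""
    return correctAngle(copy.deepcopy(analysis))

# Trimmed version of the sample response, keeping one line and one word.
original = {"analyzeResult": {"readResults": [{
    "page": 1, "angle": 7.6307,
    "lines": [{
        "boundingBox": [185, 91, 366, 113, 364, 131, 183, 109],
        "text": "Demand/Disconnection Notice",
        "words": [{"boundingBox": [323, 109, 366, 114, 365, 131, 321, 126],
                   "text": "Notice"}],
    }],
}]}}

once = corrected_copy(original)
snapshot = copy.deepcopy(once)
twice = correctAngle(once)  # the angle is already 0, so nothing is rotated again

print(twice == snapshot)
print(original["analyzeResult"]["readResults"][0]["angle"])
```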

Sample Output

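The output from processing the JSON response above with this function has the following form. Note that the coordinate values listed here match a rotation of twice the reported angle (about 15.26 degrees), as would result from running the rotation a second time on already-rotated boxes; a single pass maps the first line corner (185, 91) to approximately (195.45, 65.63).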

{
   "status": "succeeded",
   "createdDateTime": "2020-12-21T15:13:55Z",
   "lastUpdatedDateTime": "2020-12-21T15:13:56Z",
   "analyzeResult": {
       "version": "3.0.0",
       "readResults": [
          {
               "page": 1,
               "angle": 0,
               "width": 445,
               "height": 1242,
               "unit": "pixel",
               "lines": [
                   ....
                   ....
                   ....
                   
                  {
                       "boundingBox": [
                           202.42927798617825,
                           39.094595713404814,
                           382.83721732032967,
                           12.675371175104381,
                           385.64576445111106,
                           30.567046977392415,
                           205.23782511695967,
                           56.98627151569286
                      ],
                       "text": "Demand/Disconnection Notice",
                       "words": [
                          {
                               "boundingBox": [
                                   203.65723612684363,
                                   39.796107512859024,
                                   336.4417810450956,
                                   21.187920313329045,
                                   339.95183997533115,
                                   37.85163797495165,
                                   206.2025600870195,
                                   56.72304834508725
                              ],
                               "text": "Demand/Disconnection",
                               "confidence": 0.751
                          },
                          {
                               "boundingBox": [
                                   340.3007209253349,
                                   20.135027630906585,
                                   383.1004404909352,
                                   13.640106145164204,
                                   386.6104994211709,
                                   30.30382380678678,
                                   342.84604488551065,
                                   37.061968463134804
                              ],
                               "text": "Notice",
                               "confidence": 0.907
                          }
                      ]
                  },
                   
                   ....
                   ....
                   ....
              ]
          }
      ]
  }
}

Final code

import numpy as np

def rotate(point, origin, degrees):
    radians = np.deg2rad(degrees)
    x, y = point
    offset_x, offset_y = origin
    adjusted_x = (x - offset_x)
    adjusted_y = (y - offset_y)
    cos_rad = np.cos(radians)
    sin_rad = np.sin(radians)
    qx = offset_x + cos_rad * adjusted_x + sin_rad * adjusted_y
    qy = offset_y + -sin_rad * adjusted_x + cos_rad * adjusted_y
    return qx, qy

def correctAngle(analysis):
    for page in analysis["analyzeResult"]["readResults"]:
        if page["angle"] != 0:
            for line in page['lines']:
                bBox = line['boundingBox']
                for ind in range(0, 7, 2):
                    bBox[ind], bBox[ind + 1] = rotate((bBox[ind], bBox[ind + 1]), (0, 0), page["angle"])
                line['boundingBox'] = bBox
                for word in line['words']:
                    wbBox = word['boundingBox']
                    for ind in range(0, 7, 2):
                        wbBox[ind], wbBox[ind + 1] = rotate((wbBox[ind], wbBox[ind + 1]), (0, 0), page["angle"])
                    word['boundingBox'] = wbBox
            # Reset the angle so a second call does not rotate the page again.
            page["angle"] = 0
    return analysis
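The final code rotates about (0, 0), the top-left corner of the image, which can move points near the top edge to negative y values. Since rotate already accepts an origin, one possible variation is to rotate about the center of the page instead, which keeps the center fixed. The sketch below only compares the two choices for a single point; page_center is a hypothetical helper, and whether this origin suits your downstream processing is something to decide for your own use case.

```python
import numpy as np

def rotate(point, origin, degrees):
    # Same function as in the final code, repeated so this example is self-contained.
    radians = np.deg2rad(degrees)
    x, y = point
    offset_x, offset_y = origin
    adjusted_x = x - offset_x
    adjusted_y = y - offset_y
    cos_rad = np.cos(radians)
    sin_rad = np.sin(radians)
    qx = offset_x + cos_rad * adjusted_x + sin_rad * adjusted_y
    qy = offset_y + -sin_rad * adjusted_x + cos_rad * adjusted_y
    return qx, qy

def page_center(page):
    """Hypothetical helper: center of the page in the units of the bounding boxes."""
    return page["width"] / 2, page["height"] / 2

# Page metadata taken from the sample response.
page = {"angle": 7.6307, "width": 445, "height": 1242}
center = page_center(page)
corner = (445, 0)  # top-right corner of the page itself

about_origin = rotate(corner, (0, 0), page["angle"])
about_center = rotate(corner, center, page["angle"])
center_after = rotate(center, center, page["angle"])

print(about_origin)
print(about_center)
print(center_after)
```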

Conclusion

This article showed how to rotate the bounding boxes in the response by the tilt angle. I hope you were able to achieve the same.

To learn how to build a Flask application and get started with Azure Cognitive Services, see the article series starting at "Python Flask App And Azure Cognitive Services Read API - Render HTML Page And File Transfer Between Client And Server".

