Introduction
If you have worked with Azure Cognitive Services APIs such as the OCR API, Read API, or Form Recognizer API, you have probably come across boundingBox in the readResults of the response. If the input image is slightly tilted, the coordinates in the response are tilted as well, and the response reports the angle by which the page is tilted. Before working with these coordinates, we sometimes need to rotate each boundingBox by that angle.
This post shows how to do that rotation. The example is a receipt from KSEB photographed at an angle. We first send it to the Azure Read API to get the JSON output.
Sample input given to the Azure Read API
Sample output from the Azure Read API
- {
- "status": "succeeded",
- "createdDateTime": "2020-12-21T15:13:55Z",
- "lastUpdatedDateTime": "2020-12-21T15:13:56Z",
- "analyzeResult": {
- "version": "3.0.0",
- "readResults": [
- {
- "page": 1,
- "angle": 7.6307,
- "width": 445,
- "height": 1242,
- "unit": "pixel",
- "lines": [
- ....
- ....
- ....
-
- {
- "boundingBox": [
- 185,
- 91,
- 366,
- 113,
- 364,
- 131,
- 183,
- 109
- ],
- "text": "Demand/Disconnection Notice",
- "words": [
- {
- "boundingBox": [
- 186,
- 92,
- 319,
- 109,
- 318,
- 126,
- 184,
- 109
- ],
- "text": "Demand/Disconnection",
- "confidence": 0.751
- },
- {
- "boundingBox": [
- 323,
- 109,
- 366,
- 114,
- 365,
- 131,
- 321,
- 126
- ],
- "text": "Notice",
- "confidence": 0.907
- }
- ]
- },
- ....
- ....
- ....
- ]
- }
- ]
- }
- }
The JSON response above reports that the receipt is tilted by 7.6307 degrees. To work with the coordinates in boundingBox, we first need to rotate them by this angle.
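As a minimal sketch of reading these fields, the snippet below parses a trimmed copy of the sample response and prints the page angle and the corner points of each line. The variable name `response_text` and the trimmed JSON are illustrative assumptions; a real response contains many more fields and lines.

```python
import json

# Trimmed copy of the sample response above; only the fields used here are kept.
response_text = """
{
  "analyzeResult": {
    "readResults": [
      {
        "page": 1,
        "angle": 7.6307,
        "lines": [
          {
            "boundingBox": [185, 91, 366, 113, 364, 131, 183, 109],
            "text": "Demand/Disconnection Notice"
          }
        ]
      }
    ]
  }
}
"""

analysis = json.loads(response_text)

for page in analysis["analyzeResult"]["readResults"]:
    print("page", page["page"], "angle", page["angle"])
    for line in page["lines"]:
        flat = line["boundingBox"]
        # Pair up the flat list of 8 numbers into 4 (x, y) corner points.
        corners = list(zip(flat[0::2], flat[1::2]))
        print(line["text"], corners)
```

Pairing the values this way makes it easier to see that each boundingBox describes four corners rather than eight independent numbers.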
Rotate a point p by n degrees with respect to origin o
To rotate a point p by n degrees about an origin o, we call a custom function, defined below, like this:
- rotatedPoint = rotate(p, o, n)
The custom function rotate is shown below. It requires numpy:
- import numpy as np
-
- def rotate(point, origin, degrees):
-     # Convert the angle from degrees to radians for the trig functions.
-     radians = np.deg2rad(degrees)
-     x, y = point
-     offset_x, offset_y = origin
-     # Shift the point so that the rotation origin sits at (0, 0).
-     adjusted_x = x - offset_x
-     adjusted_y = y - offset_y
-     cos_rad = np.cos(radians)
-     sin_rad = np.sin(radians)
-     # Rotate, then shift back to the original origin.
-     qx = offset_x + cos_rad * adjusted_x + sin_rad * adjusted_y
-     qy = offset_y - sin_rad * adjusted_x + cos_rad * adjusted_y
-     return qx, qy
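Written in matrix form, the function above computes the following. The symbols are introduced here for illustration only: the point is p = (x, y), the origin is o = (o_x, o_y), the result is q = (q_x, q_y), and θ is the angle n converted to radians.

```latex
\theta = \frac{n\,\pi}{180},
\qquad
\begin{pmatrix} q_x \\ q_y \end{pmatrix}
=
\begin{pmatrix} o_x \\ o_y \end{pmatrix}
+
\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} x - o_x \\ y - o_y \end{pmatrix}
```

For the sample line, the top edge runs from (185, 91) to (366, 113). After rotating both corners by 7.6307 degrees about (0, 0), they land at y ≈ 65.63 and y ≈ 63.40, so the edge is close to horizontal.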
Building a function to correct the angle of the boundingBox coordinates in the response
The boundingBox is a list of 8 numbers, the x and y coordinates of its 4 corner points, in this order:
- ["left-top-x","left-top-y","right-top-x","right-top-y","right-bottom-x","right-bottom-y","left-bottom-x","left-bottom-y"]
To rotate a boundingBox, we loop through the list in steps of 2 and rotate one (x, y) pair per iteration. The angle comes from the response:
- angle = analysis["analyzeResult"]["readResults"][index]["angle"]
Here angle = 7.6307, and we rotate about the image origin, which is (0, 0):
- for ind in range(0, 7, 2):
-     bBox[ind], bBox[ind + 1] = rotate((bBox[ind], bBox[ind + 1]), (0, 0), angle)
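To see the effect on a single box, here is a self-contained sketch that rotates the boundingBox of the "Demand/Disconnection Notice" line from the sample and compares the slope of its top edge before and after. It uses `math` from the standard library in place of numpy purely to keep the sketch dependency-free; the formula is the same as in rotate above. The names `before` and `after` are illustrative.

```python
import math

def rotate(point, origin, degrees):
    """Same formula as the rotate function above, using math instead of numpy."""
    radians = math.radians(degrees)
    x, y = point
    offset_x, offset_y = origin
    adjusted_x = x - offset_x
    adjusted_y = y - offset_y
    cos_rad = math.cos(radians)
    sin_rad = math.sin(radians)
    qx = offset_x + cos_rad * adjusted_x + sin_rad * adjusted_y
    qy = offset_y - sin_rad * adjusted_x + cos_rad * adjusted_y
    return qx, qy

# boundingBox and angle of the "Demand/Disconnection Notice" line in the sample.
bBox = [185, 91, 366, 113, 364, 131, 183, 109]
angle = 7.6307

# Slope of the top edge (left-top to right-top) before rotation, in degrees.
before = math.degrees(math.atan2(bBox[3] - bBox[1], bBox[2] - bBox[0]))

# Rotate each (x, y) pair in place about the image origin.
for ind in range(0, len(bBox), 2):
    bBox[ind], bBox[ind + 1] = rotate((bBox[ind], bBox[ind + 1]), (0, 0), angle)

# Slope of the same edge after rotation; it should be much closer to 0.
after = math.degrees(math.atan2(bBox[3] - bBox[1], bBox[2] - bBox[0]))

print(round(before, 2), round(after, 2))
print([round(v, 2) for v in bBox])
```

The top edge starts at a slope of roughly 6.9 degrees and ends within about 1 degree of horizontal. The small residual is expected, since the reported angle describes the page as a whole rather than this one line.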
Now we know how to rotate one bounding box. Next, let's rotate all the boundingBox entries for the lines in readResults. Nesting the snippet above inside a loop over lines, which is itself inside a loop over pages, does this. The code is below:
- def correctAngle(analysis):
-     for page in analysis["analyzeResult"]["readResults"]:
-         # Each page reports its own tilt angle.
-         angle = page["angle"]
-         for line in page['lines']:
-             bBox = line['boundingBox']
-             for ind in range(0, 7, 2):
-                 bBox[ind], bBox[ind + 1] = rotate((bBox[ind], bBox[ind + 1]), (0, 0), angle)
-             line['boundingBox'] = bBox
-     return analysis
Now we have rotated the boundingBox entries for lines, but the words have their own boundingBox entries too, so we apply the same logic to them inside the line loop. Only pages with a non-zero angle need rotating, so we check the angle before doing any work. After rotating a page, we set its angle to 0 so the response reflects the corrected state. The code is below:
- def correctAngle(analysis):
-     for page in analysis["analyzeResult"]["readResults"]:
-         if page["angle"] != 0:
-             for line in page['lines']:
-                 bBox = line['boundingBox']
-                 for ind in range(0, 7, 2):
-                     bBox[ind], bBox[ind + 1] = rotate((bBox[ind], bBox[ind + 1]), (0, 0), page["angle"])
-                 line['boundingBox'] = bBox
-                 for word in line['words']:
-                     wbBox = word['boundingBox']
-                     for ind in range(0, 7, 2):
-                         wbBox[ind], wbBox[ind + 1] = rotate((wbBox[ind], wbBox[ind + 1]), (0, 0), page["angle"])
-                     word['boundingBox'] = wbBox
-             page["angle"] = 0
-     return analysis
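The line and word loops above repeat the same steps, so one possible variation is to move the per-box rotation into a helper. The sketch below is an alternative under stated assumptions, not the original implementation: the names `rotate_box` and `correct_angle_copy` are hypothetical, `math` stands in for numpy, and the function returns a deep copy instead of modifying its input.

```python
import copy
import math

def rotate(point, origin, degrees):
    """Same formula as the rotate function above, using math instead of numpy."""
    radians = math.radians(degrees)
    x, y = point
    offset_x, offset_y = origin
    adjusted_x = x - offset_x
    adjusted_y = y - offset_y
    cos_rad = math.cos(radians)
    sin_rad = math.sin(radians)
    qx = offset_x + cos_rad * adjusted_x + sin_rad * adjusted_y
    qy = offset_y - sin_rad * adjusted_x + cos_rad * adjusted_y
    return qx, qy

def rotate_box(box, origin, degrees):
    """Rotate a flat [x1, y1, ..., x4, y4] list and return a new flat list."""
    rotated = []
    for ind in range(0, len(box), 2):
        rotated.extend(rotate((box[ind], box[ind + 1]), origin, degrees))
    return rotated

def correct_angle_copy(analysis):
    """Return a rotated deep copy; the input dictionary is left untouched."""
    result = copy.deepcopy(analysis)
    for page in result["analyzeResult"]["readResults"]:
        angle = page.get("angle", 0)
        if angle == 0:
            continue  # Nothing to rotate on this page.
        for line in page.get("lines", []):
            line["boundingBox"] = rotate_box(line["boundingBox"], (0, 0), angle)
            for word in line.get("words", []):
                word["boundingBox"] = rotate_box(word["boundingBox"], (0, 0), angle)
        # Mark the page as corrected so a second call does not rotate again.
        page["angle"] = 0
    return result

# Trimmed version of the sample response, with one line and one word.
sample = {"analyzeResult": {"readResults": [{
    "page": 1,
    "angle": 7.6307,
    "lines": [{
        "boundingBox": [185, 91, 366, 113, 364, 131, 183, 109],
        "text": "Demand/Disconnection Notice",
        "words": [{
            "boundingBox": [186, 92, 319, 109, 318, 126, 184, 109],
            "text": "Demand/Disconnection",
        }],
    }],
}]}}

once = correct_angle_copy(sample)
twice = correct_angle_copy(once)
print(once["analyzeResult"]["readResults"][0]["angle"])
print(twice == once)
```

Because the angle is reset to 0 after rotation, calling the function on an already corrected response leaves the coordinates unchanged.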
Sample Output
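The output after processing the JSON response above with this function is shown below. Note that the coordinates listed here correspond to a rotation of 15.2614 degrees, twice the reported angle, which is what results when the rotation is applied to the same response twice. A single pass moves the first corner, (185, 91), to approximately (195.45, 65.63).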
- {
- "status": "succeeded",
- "createdDateTime": "2020-12-21T15:13:55Z",
- "lastUpdatedDateTime": "2020-12-21T15:13:56Z",
- "analyzeResult": {
- "version": "3.0.0",
- "readResults": [
- {
- "page": 1,
- "angle": 0,
- "width": 445,
- "height": 1242,
- "unit": "pixel",
- "lines": [
- ....
- ....
- ....
-
- {
- "boundingBox": [
- 202.42927798617825,
- 39.094595713404814,
- 382.83721732032967,
- 12.675371175104381,
- 385.64576445111106,
- 30.567046977392415,
- 205.23782511695967,
- 56.98627151569286
- ],
- "text": "Demand/Disconnection Notice",
- "words": [
- {
- "boundingBox": [
- 203.65723612684363,
- 39.796107512859024,
- 336.4417810450956,
- 21.187920313329045,
- 339.95183997533115,
- 37.85163797495165,
- 206.2025600870195,
- 56.72304834508725
- ],
- "text": "Demand/Disconnection",
- "confidence": 0.751
- },
- {
- "boundingBox": [
- 340.3007209253349,
- 20.135027630906585,
- 383.1004404909352,
- 13.640106145164204,
- 386.6104994211709,
- 30.30382380678678,
- 342.84604488551065,
- 37.061968463134804
- ],
- "text": "Notice",
- "confidence": 0.907
- }
- ]
- },
-
- ....
- ....
- ....
- ]
- }
- ]
- }
- }
Final code
- import numpy as np
-
- def rotate(point, origin, degrees):
-     radians = np.deg2rad(degrees)
-     x, y = point
-     offset_x, offset_y = origin
-     adjusted_x = x - offset_x
-     adjusted_y = y - offset_y
-     cos_rad = np.cos(radians)
-     sin_rad = np.sin(radians)
-     qx = offset_x + cos_rad * adjusted_x + sin_rad * adjusted_y
-     qy = offset_y - sin_rad * adjusted_x + cos_rad * adjusted_y
-     return qx, qy
-
- def correctAngle(analysis):
-     for page in analysis["analyzeResult"]["readResults"]:
-         if page["angle"] != 0:
-             for line in page['lines']:
-                 bBox = line['boundingBox']
-                 for ind in range(0, 7, 2):
-                     bBox[ind], bBox[ind + 1] = rotate((bBox[ind], bBox[ind + 1]), (0, 0), page["angle"])
-                 line['boundingBox'] = bBox
-                 for word in line['words']:
-                     wbBox = word['boundingBox']
-                     for ind in range(0, 7, 2):
-                         wbBox[ind], wbBox[ind + 1] = rotate((wbBox[ind], wbBox[ind + 1]), (0, 0), page["angle"])
-                     word['boundingBox'] = wbBox
-             # Reset the angle so that calling the function again does not rotate twice.
-             page["angle"] = 0
-     return analysis
Conclusion
This post showed how to rotate each boundingBox in the response by the reported tilt angle. I hope it helps you do the same.