Integrating Read API, Converting The Resultant JSON To CSV And Deploying The App To Azure

This is the second part of the article series "Python Flask App and Azure Cognitive Services Read API". In the first article, [Render HTML Page And File Transfer Between Client And Server], we built an HTML page and a basic Python Flask app that lets the user upload and download a file. In this article we take the file uploaded by the user and call the Read API to get the analyzed results from the API as a JSON string.
 

Before We Start

 
Before we start working on the code, we need a subscription key and an endpoint to call the Read API. If you don't know how to get them, follow the official documentation for step-by-step guidance on creating a single-service Computer Vision resource.
 
We also need the requests module to send the POST request to the Read API. The csv module used to create and save CSV files is part of the Python standard library, so it needs no separate installation. Install requests with pip:
    pip install requests

Integrating Read API

 
First, assign the endpoint and subscription key that you get from the Azure portal to variables.
    endpoint = "https://xxxxxxxxx.cognitiveservices.azure.com/"
    subscription_key = "xxxxxxxxxxxxxxxxxxxxxxxxxx"
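Hardcoding the key is fine for a quick test, but it is easy to leak it through source control. As a minimal sketch, the two values could instead be read from environment variables. The names READ_API_ENDPOINT and READ_API_KEY and the helper load_read_api_config are hypothetical, chosen only for this example.

```python
import os


def load_read_api_config(env=os.environ):
    """Return (endpoint, subscription_key) read from environment variables."""
    # READ_API_ENDPOINT and READ_API_KEY are hypothetical names for this sketch.
    # The trailing slash is dropped so that "/vision/..." can be appended safely.
    endpoint = env.get("READ_API_ENDPOINT", "").rstrip("/")
    subscription_key = env.get("READ_API_KEY", "")
    if not endpoint or not subscription_key:
        raise RuntimeError(
            "Set READ_API_ENDPOINT and READ_API_KEY before starting the app")
    return endpoint, subscription_key
```

With both variables set, `endpoint, subscription_key = load_read_api_config()` would replace the two assignments above.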
We need to import the json and csv modules to work with JSON and CSV data, the requests module for the POST request to the Read API, and the time module for its sleep function. The updated import statements are:
    from flask import Flask, request, render_template, send_from_directory, redirect
    import json
    import os
    import requests
    import time
    import csv
Now add the getJSON function below. It takes the file as input, calls the Read API, and returns the analyzed result as parsed JSON.
    def getJSON(image_data):
        # Submit the file to the Read API; the analysis runs asynchronously.
        text_recognition_url = endpoint.rstrip('/') + "/vision/v3.0/read/analyze"
        headers = {'Ocp-Apim-Subscription-Key': subscription_key,
                   'Content-Type': 'application/octet-stream'}
        response = requests.post(
            text_recognition_url, headers=headers, data=image_data)
        response.raise_for_status()
        # The Operation-Location header holds the URL to poll for the result.
        operation_url = response.headers["Operation-Location"]
        analysis = {}
        poll = True
        while poll:
            response_final = requests.get(operation_url, headers=headers)
            analysis = response_final.json()
            if "analyzeResult" in analysis:
                poll = False
            if "status" in analysis and analysis['status'] == 'failed':
                poll = False
            if poll:
                time.sleep(1)  # wait a moment before polling again
        return analysis
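A polling loop like the one in getJSON can run indefinitely if the operation never reaches a final state. The following is a minimal sketch of a bounded version, not the article's implementation. The helper poll_until_done and its parameters are hypothetical; fetch_status stands for any callable that returns the parsed JSON of one status request.

```python
import time


def poll_until_done(fetch_status, interval=1.0, timeout=60.0, sleep=time.sleep):
    """Call fetch_status() until the status is final or the timeout is reached."""
    waited = 0.0
    while True:
        analysis = fetch_status()
        # The Read API reports "succeeded" or "failed" once the work is over.
        if analysis.get("status") in ("succeeded", "failed"):
            return analysis
        if waited >= timeout:
            raise TimeoutError("Read operation did not finish in time")
        sleep(interval)  # pause between requests instead of polling in a tight loop
        waited += interval
```

Inside getJSON it could be used as `poll_until_done(lambda: requests.get(operation_url, headers=headers).json())`. Passing sleep as a parameter keeps the helper easy to test without real waiting.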
Next we write a saveAsCSV function that takes a table (a list of rows) as input and appends it to a file, here output.csv. To clear and rewrite the file on each call instead, replace a+ with w+. The + is optional: both a and w create the file if it does not exist, and the + only adds read access to the open file.
 
CSV stands for comma-separated values: columns are separated by commas and rows by line breaks. A CSV file can be opened with any text editor or with a spreadsheet application such as MS Excel.
    def saveAsCSV(table):
        with open("output.csv", 'a+', newline='') as file:
            writer = csv.writer(file)
            writer.writerows(table)
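To see what csv.writer produces without touching the disk, here is a small sketch that writes the rows into an in-memory buffer. The helper rows_to_csv_text and the sample rows are made up for illustration.

```python
import csv
import io


def rows_to_csv_text(table):
    """Serialize a list of rows to CSV text using the default csv dialect."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerows(table)
    return buffer.getvalue()


# Hypothetical sample: one row per page, one column per recognized line.
sample = [["Invoice 42", "Total: 1,250.00"], ["Page two text"]]
print(rows_to_csv_text(sample))
```

Note that the second value contains a comma, so the writer wraps it in double quotes to keep it in a single column. This is why recognized text with commas still opens correctly in a spreadsheet.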
We now need to update the upload_file function that we created in the previous article. Replace the return redirect statement of the third if (i.e., under if file:) with the following:
    analysis = getJSON(file)
    row = []
    table = []
    for page in analysis['analyzeResult']['readResults']:
        for line in page['lines']:
            row.append(line['text'])
        table.append(row)
        row = []
    saveAsCSV(table)
    return redirect('/uploads/' + 'output.csv')
Explanation
 
We call the getJSON function, passing the file to read (here the file variable) as the argument, and store the response in a variable (here analysis). We then declare row and table as empty lists. The row list stores a single row and the table list stores these rows, so table is a list of lists.

Each item in the analysis['analyzeResult']['readResults'] list corresponds to a page, and the ['lines'] list inside that item holds each sentence identified on the page. Within a line, ['text'] gives the text of the sentence, ['boundingBox'] gives the eight coordinates (four x, y corner points) of the sentence, and ['words'] gives information about each word. In our use case we need only ['text']. We append it to the row for each sentence, then append the row to table once per page and reset row to an empty list. Finally we call the saveAsCSV function we defined earlier, passing the table variable.

We are now done with the changes needed on the server side. Some changes are also needed on the client side, in home.html, to accept the returned file.
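To see how the loop described above shapes the table, here is a minimal sketch run against a hand-written sample. The sample_analysis dictionary is an assumption that only mimics the fields named in the explanation; a real Read API response contains more fields. The helper name lines_to_table is also hypothetical.

```python
def lines_to_table(analysis):
    """Build one row per page, with one column per recognized line of text."""
    table = []
    for page in analysis["analyzeResult"]["readResults"]:
        table.append([line["text"] for line in page["lines"]])
    return table


# Hand-written sample with two pages; the values are made up for illustration.
sample_analysis = {
    "status": "succeeded",
    "analyzeResult": {
        "readResults": [
            {"page": 1, "lines": [
                {"text": "Hello world", "boundingBox": [0, 0, 10, 0, 10, 5, 0, 5], "words": []},
                {"text": "Second line", "boundingBox": [0, 6, 10, 6, 10, 11, 0, 11], "words": []},
            ]},
            {"page": 2, "lines": [
                {"text": "Only line on page two", "boundingBox": [0, 0, 20, 0, 20, 5, 0, 5], "words": []},
            ]},
        ]
    },
}

table = lines_to_table(sample_analysis)
print(table)
```

The result has two rows, one per page, which is the list of lists that saveAsCSV expects.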
 
Yep! We are now done with the application. Our Flask application should be able to accept your uploaded PDF or image and return the CSV file containing the text. The CSV file will have one row per page and one column per sentence.
 

Deploy to Azure

 
Now you can deploy this app to Azure as a free web app. To do that, install the Azure CLI from the official download page. (There are other deployment methods, but I recommend this one.) After installing it, check that the installation succeeded with the command:
    az --version
The version should be 2.0.80 or higher.
 
Now log in to Azure from the CLI using the following command:
    az login
This opens a browser window where you enter your login credentials. After a successful login, the CLI displays the subscription info as a JSON string.
 
Now you are ready to deploy your app. You can do so with the following command:
    az webapp up --sku F1 -n <YourAppName>
F1 refers to the free tier, and you should replace <YourAppName> with your app name. You can choose any name, but it must be unique across Azure and may contain only lowercase letters, numbers and the special character '-'. A common naming practice is to start the name with the company name. Optional arguments such as location can also be included; check the documentation for more info.
 
This process takes 5-20 minutes depending on the file size and bandwidth. After a successful deployment you will see a JSON string with details of the application.
 
Yep! Your app is now deployed as a Python app to App Service on Linux in Azure and is available to users over the internet. You can browse to the deployed application in your web browser at the URL http://<YourAppName>.azurewebsites.net.
 
If you make any changes and would like to update the app on Azure, you can do so with the command below:
    az webapp up
Congrats! We have now built a Python Flask app that is integrated with the Read API and published it on Azure.
 
In the next article, [Passing An HTML Table To Client And Passing Multiple Values As JSON From Server To Client], we will send an HTML table and some values to see how multiple values can be sent from server to client.
 
The code below is the final code in app.py and home.html as of now.
 
app.py
    from flask import Flask, request, render_template, send_from_directory, redirect
    import json
    import os
    import requests
    import time
    import csv

    endpoint = "https://xxxxxx.cognitiveservices.azure.com/"
    subscription_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxx"
    dirname = os.path.dirname(__file__)
    app = Flask(__name__)

    def saveAsCSV(table):
        # Append the rows to output.csv, creating the file if needed.
        with open("output.csv", 'a+', newline='') as file:
            writer = csv.writer(file)
            writer.writerows(table)

    def getJSON(image_data):
        # Submit the file to the Read API; the analysis runs asynchronously.
        text_recognition_url = endpoint.rstrip('/') + "/vision/v3.0/read/analyze"
        headers = {'Ocp-Apim-Subscription-Key': subscription_key,
                   'Content-Type': 'application/octet-stream'}
        response = requests.post(
            text_recognition_url, headers=headers, data=image_data)
        response.raise_for_status()
        # The Operation-Location header holds the URL to poll for the result.
        operation_url = response.headers["Operation-Location"]
        analysis = {}
        poll = True
        while poll:
            response_final = requests.get(operation_url, headers=headers)
            analysis = response_final.json()
            if "analyzeResult" in analysis:
                poll = False
            if "status" in analysis and analysis['status'] == 'failed':
                poll = False
            if poll:
                time.sleep(1)  # wait a moment before polling again
        return analysis

    @app.route('/uploads/<filename>')
    def uploaded_file(filename):
        return send_from_directory(dirname, filename)

    @app.route('/')
    def home():
        return render_template('home.html')

    @app.route('/uploader', methods=['POST'])
    def upload_file():
        if 'file' not in request.files:
            return redirect(request.url)
        file = request.files['file']
        if file.filename == '':
            return redirect(request.url)
        if file:
            analysis = getJSON(file)
            row = []
            table = []
            # One row per page, one column per recognized line.
            for page in analysis['analyzeResult']['readResults']:
                for line in page['lines']:
                    row.append(line['text'])
                table.append(row)
                row = []
            saveAsCSV(table)
            return redirect('/uploads/' + 'output.csv')
        return ''

    if __name__ == '__main__':
        app.run(debug=True)
home.html
    <html>
        <body>
            <form id="fileUploadForm" method='post' enctype='multipart/form-data' action='/uploader'>
                <input type="file" name="file" id="fileSelect">
                <button type="submit" id="btnSubmit">Upload</button>
            </form>
        </body>
    </html>
In the next article we will see how to send an HTML table instead of the CSV file, and how to return multiple values as a JSON string from server to client using AJAX and jQuery.