Write And Read Pandas Dataframe And CSV To And From Azure Storage Table

Introduction

 
This article is about how to read and write Pandas DataFrame and CSV to and from Azure Storage Tables. The Pandas DataFrames are used in many Data Analytics applications. Therefore, storing it in a cloud is a repetitive task in many cases. Here we can see how we can do the same. First, we see how to save data in CSV file to Azure Table Storage and then we'll see how to deal with the same situation with Pandas DataFrame.
 
Prerequisites
  • Azure Account : (If not you can get a free account with ₹13,300 worth of credits from here. If you are a student you can verify student status to get it without entering credit card details else credit card details are mandatory)
  • Azure Storage Account: To know how to create a storage account follow the official documentation.

Saving a CSV file to Azure Storage Table

 
We need a CSV module and Azure cosmosdb table for the same so install the same, You may use pip install CSV and pip install azure-cosmosdb-table for the same.
 
Read CSV File
 
The below function uses a CSV module to read a CSV file at a specified location.
  1. import csv  
  2. def readCSV(location):  
  3.    with open(location) as csv_file:  
  4.        csv_reader = csv.reader(csv_file, delimiter=',')  
  5.        tab=[]  
  6.        for row in csv_reader:  
  7.            tab.append(row)  
  8.        return tab  
Initialize Table Service
 
To initialize table service we need credentials of an active Azure Storage Account. If you don't know how to create and get the credentials, follow official documentation. Replace the values of account_key and account_name.
  1. from azure.cosmosdb.table.tableservice import TableService  
  2. from azure.cosmosdb.table.models import Entity  
  3. table_service = TableService(account_name='xxxx', account_key='xxxxxxxx')  
Create a Azure Storage Table
 
You may create tables via different methods like by using Azure Web Portal, Azure CLI, etc,... Below is a way to create a table using python script. Replace the table-name with your preferred table name.
  1. table_service.create_table(table-name)  
Save table to Azure
 
To save a table (list of list) to Azure we need Partition Key and Row Key. We are just updating the same with incremental values since it's not matters here. Here is the snippet which does the same.
  1. def setTable(tablename,table):  
  2.    table_service = TableService(account_name='xxxx', account_key=xxxx')  
  3.    index=0  
  4.    for row in table:  
  5.        task = {'PartitionKey'"P"+str(index), 'RowKey':  "R"+str(index+1)}  
  6.        index=index+1  
  7.        for ele in row:  
  8.           task["Row"+str(row.index(ele))]=ele  
  9.        table_service.insert_entity(tablename, task)  
  10.    return True   
Final Code
  1. import csv  
  2. from azure.cosmosdb.table.tableservice import TableService  
  3. from azure.cosmosdb.table.models import Entity  
  4. def readCSV(location):  
  5.    with open(location) as csv_file:  
  6.        csv_reader = csv.reader(csv_file, delimiter=',')  
  7.        tab=[]  
  8.        for row in csv_reader:  
  9.            tab.append(row)  
  10.        return tab  
  11. def setTable(tablename,table,table_service):  
  12.    index=0  
  13.    for row in table:  
  14.        task = {'PartitionKey'"P"+str(index), 'RowKey':  "R"+str(index+1)}  
  15.        index=index+1  
  16.        for ele in row:  
  17.           task["Row"+str(row.index(ele))]=ele  
  18.        table_service.insert_entity(tablename, task)  
  19.    return True  
  20. table_service = TableService(account_name='xxxx', account_key='xxxx')  
  21. table_service.create_table(table-name)  
  22. tab=readCSV("<location>")  
  23. res = setTable(table-name, tab,table_service)  

Retrieving data from Azure Storage Table and saving to CSV File

 
Retrieving Data from Azure Storage Table
 
The below code snippet helps you to retrieve the Table:
  1. def getTab(tableName):  
  2.    table_service = TableService(account_name='xxx', account_key=xxxx')  
  3.    tasks = table_service.query_entities(tableName)  
  4.    tab=[]  
  5.    newrow=[]  
  6.    for row in tasks:  
  7.        for ele in row:  
  8.            newrow.append(row[ele])  
  9.        tab.append(newrow)  
  10.        newrow=[]  
  11.    return tab   
Saving the table as CSV File
 
To save the table (list of list) as csv file we can use the csv module. The below snippet helps you to achieve the same.
  1. def saveToCSV(tab, fileName):  
  2.    with open( fileName +"Output.csv"'w+', newline='') as file:  
  3.        writer = csv.writer(file)  
  4.        writer.writerows(tab)   
Final Code
  1. import csv  
  2. from azure.cosmosdb.table.tableservice import TableService  
  3. from azure.cosmosdb.table.models import Entity  
  4. def getTab(tableName):  
  5.    table_service = TableService(account_name='xxx', account_key=xxxx')  
  6.    tasks = table_service.query_entities(tableName)  
  7.    tab=[]  
  8.    newrow=[]  
  9.    for row in tasks:  
  10.        for ele in row:  
  11.            newrow.append(row[ele])  
  12.        tab.append(newrow)  
  13.        newrow=[]  
  14.    return tab  
  15. def saveToCSV(tab, fileName):  
  16.    with open( fileName +"Output.csv"'w+', newline='') as file:  
  17.        writer = csv.writer(file)  
  18.        writer.writerows(tab)  
  19. table_service = TableService(account_name='xxxx', account_key='xxxx')  
  20. table=getTab(table-name)  
  21. saveToCSV(tab, fileName)   

Pandas DataFrame to and from Azure Storage Table

 
We know Pandas DataFrames can be converted to the table (list of list) directly by df.values.tolist(). We have already discussed how to store the list of lists to Azure Storage Table. A sample of the main function is given below:
  1. import pandas as pd  
  2. from azure.cosmosdb.table.tableservice import TableService  
  3. from azure.cosmosdb.table.models import Entity  
  4. ​  
  5. l = [[1 , 2 , 3],[4,5,6] , [8 , 7 , 9]]  
  6. df = pd.DataFrame (l)  
  7. list_of_lists = df.values.tolist()  
  8. ​  
  9. table_service = TableService(account_name='xxxx', account_key='xxxxx')  
  10. table_service.create_table(table-name)#creating table  
  11. res = setTable(table-name, list_of_lists)#inserting csv to cloud   

Conclusion

 
Saving data to Azure Cloud from CSV file and pandas dataframe is discussed in this article. This is one of many methods to achieve the same. You can also save the CSV file as such in an Azure Blob also. I hope you found this article useful.