Introduction
This article shows how to read and write Pandas DataFrames and CSV files to and from Azure Storage Tables. Pandas DataFrames are used in many data analytics applications, so storing them in the cloud is a recurring task. First we look at how to save data from a CSV file to Azure Table Storage, and then we handle the same situation with a Pandas DataFrame.
Prerequisites
- Azure Account: If you don't have one, you can get a free account with ₹13,300 worth of credits from here. (If you are a student, you can verify your student status to get it without entering credit card details; otherwise, credit card details are mandatory.)
- Azure Storage Account: To learn how to create a storage account, follow the official documentation.
Saving a CSV file to Azure Storage Table
We need Python's built-in csv module and the azure-cosmosdb-table package. The csv module ships with the standard library, so only the Azure package has to be installed: pip install azure-cosmosdb-table.
Read CSV File
The function below uses the csv module to read a CSV file from a specified location into a list of lists.
    import csv

    def readCSV(location):
        with open(location) as csv_file:
            csv_reader = csv.reader(csv_file, delimiter=',')
            tab = []
            for row in csv_reader:
                tab.append(row)
            return tab
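As a quick check, here is a hypothetical usage example (the file name data.csv is only an assumption) showing that readCSV returns a plain list of lists:

    # Hypothetical example: assumes a local file named data.csv exists
    tab = readCSV("data.csv")
    print(tab)  # e.g. [['a', 'b', 'c'], ['1', '2', '3']]
    # Note that csv.reader returns every field as a string.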
Initialize Table Service
To initialize the table service we need the credentials of an active Azure Storage Account. If you don't know how to create one and get the credentials, follow the official documentation. Replace the values of account_name and account_key with your own.
    from azure.cosmosdb.table.tableservice import TableService
    from azure.cosmosdb.table.models import Entity

    table_service = TableService(account_name='xxxx', account_key='xxxxxxxx')
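A minimal sketch for verifying that the credentials work is to list the tables that already exist in the account:

    # Optional sanity check: list the tables in the storage account.
    # If the credentials are wrong this call will raise an authentication error.
    for table in table_service.list_tables():
        print(table.name)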
Create an Azure Storage Table
You may create tables in several ways, e.g. through the Azure Web Portal or the Azure CLI. Below is a way to create a table using a Python script. Replace tablename with your preferred table name (Azure table names may contain only alphanumeric characters and cannot begin with a number).

    table_service.create_table('tablename')
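If the script may run more than once, one option (a sketch based on the azure-cosmosdb-table client; the fail_on_exist flag and the exists call are assumptions about that API) is to avoid an error when the table already exists:

    # Returns True if the table was created, False if it already exists,
    # instead of raising an error.
    created = table_service.create_table('tablename', fail_on_exist=False)

    # Alternatively, check for the table explicitly before creating it.
    if not table_service.exists('tablename'):
        table_service.create_table('tablename')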
Save table to Azure
To save a table (a list of lists) to Azure, every entity needs a PartitionKey and a RowKey. Their actual values do not matter here, so we simply fill them with incremental values. Here is the snippet that does this.
    def setTable(tablename, table):
        table_service = TableService(account_name='xxxx', account_key='xxxx')
        index = 0
        for row in table:
            task = {'PartitionKey': "P" + str(index), 'RowKey': "R" + str(index + 1)}
            index = index + 1
            # enumerate gives the column position directly; row.index(ele) would
            # return the wrong position when a row contains duplicate values
            for col, ele in enumerate(row):
                task["Row" + str(col)] = ele
            table_service.insert_entity(tablename, task)
        return True
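One design note: insert_entity fails if an entity with the same PartitionKey and RowKey already exists, so re-running the script against the same table raises an error. A hedged alternative is insert_or_replace_entity, which overwrites the existing row:

    # Sketch: overwrite an existing entity instead of failing on a duplicate key.
    # The task dict follows the same shape used in setTable above.
    task = {'PartitionKey': 'P0', 'RowKey': 'R1', 'Row0': 'value'}
    table_service.insert_or_replace_entity('tablename', task)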
Final Code
    import csv
    from azure.cosmosdb.table.tableservice import TableService
    from azure.cosmosdb.table.models import Entity

    def readCSV(location):
        with open(location) as csv_file:
            csv_reader = csv.reader(csv_file, delimiter=',')
            tab = []
            for row in csv_reader:
                tab.append(row)
            return tab

    def setTable(tablename, table, table_service):
        index = 0
        for row in table:
            task = {'PartitionKey': "P" + str(index), 'RowKey': "R" + str(index + 1)}
            index = index + 1
            for col, ele in enumerate(row):
                task["Row" + str(col)] = ele
            table_service.insert_entity(tablename, task)
        return True

    table_service = TableService(account_name='xxxx', account_key='xxxx')
    table_service.create_table('tablename')
    tab = readCSV("<location>")
    res = setTable('tablename', tab, table_service)
Retrieving data from Azure Storage Table and saving to CSV File
Retrieving Data from Azure Storage Table
The code snippet below retrieves all rows of the table:
    def getTab(tableName):
        table_service = TableService(account_name='xxx', account_key='xxxx')
        tasks = table_service.query_entities(tableName)
        tab = []
        newrow = []
        # Properties added by the service itself; skip them so that only
        # the data columns written by setTable end up in the output
        meta = ('PartitionKey', 'RowKey', 'Timestamp', 'etag')
        for row in tasks:
            for ele in row:
                if ele not in meta:
                    newrow.append(row[ele])
            tab.append(newrow)
            newrow = []
        return tab
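query_entities can also narrow the result set with an OData filter and a list of properties to select. A rough sketch (the partition key value and the Row0/Row1 property names just follow the naming used by setTable above):

    # Sketch: fetch only two properties of the entities in one partition.
    tasks = table_service.query_entities(
        'tablename',
        filter="PartitionKey eq 'P0'",
        select='Row0,Row1'
    )
    for task in tasks:
        print(task.Row0, task.Row1)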
Saving the table as CSV File
To save the table (a list of lists) as a CSV file we can use the csv module. The snippet below achieves this.
    def saveToCSV(tab, fileName):
        with open(fileName + "Output.csv", 'w+', newline='') as file:
            writer = csv.writer(file)
            writer.writerows(tab)
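For example, a hypothetical call with the name mytable writes the rows to a file named mytableOutput.csv:

    # Hypothetical call: creates "mytableOutput.csv" in the current directory
    saveToCSV([['a', 'b'], ['1', '2']], "mytable")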
Final Code
    import csv
    from azure.cosmosdb.table.tableservice import TableService
    from azure.cosmosdb.table.models import Entity

    def getTab(tableName):
        table_service = TableService(account_name='xxx', account_key='xxxx')
        tasks = table_service.query_entities(tableName)
        tab = []
        newrow = []
        meta = ('PartitionKey', 'RowKey', 'Timestamp', 'etag')
        for row in tasks:
            for ele in row:
                if ele not in meta:
                    newrow.append(row[ele])
            tab.append(newrow)
            newrow = []
        return tab

    def saveToCSV(tab, fileName):
        with open(fileName + "Output.csv", 'w+', newline='') as file:
            writer = csv.writer(file)
            writer.writerows(tab)

    table_service = TableService(account_name='xxxx', account_key='xxxx')
    tab = getTab('tablename')
    saveToCSV(tab, 'tablename')
Pandas DataFrame to and from Azure Storage Table
A Pandas DataFrame can be converted to a table (a list of lists) directly with df.values.tolist(). We have already discussed how to store a list of lists in an Azure Storage Table. A sample of the main code is given below (setTable is the function defined earlier):
    import pandas as pd
    from azure.cosmosdb.table.tableservice import TableService
    from azure.cosmosdb.table.models import Entity

    l = [[1, 2, 3], [4, 5, 6], [8, 7, 9]]
    df = pd.DataFrame(l)
    list_of_lists = df.values.tolist()

    table_service = TableService(account_name='xxxx', account_key='xxxxx')
    table_service.create_table('tablename')
    res = setTable('tablename', list_of_lists)
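For the opposite direction, the same building blocks can be reused: getTab returns a list of lists, which pd.DataFrame turns straight back into a DataFrame. A minimal sketch, assuming the getTab function from the previous section and the same placeholder table name:

    # Sketch: load the table back into a Pandas DataFrame.
    tab = getTab('tablename')
    df = pd.DataFrame(tab)
    print(df.head())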
Conclusion
Saving data to the Azure cloud from a CSV file and from a Pandas DataFrame was discussed in this article. This is just one of many ways to achieve it; you could also store the CSV file as-is in Azure Blob Storage. I hope you found this article useful.