Upload, Download, Or Delete A File Stream In Azure Blob Storage Using C# .NET

Problem

I have a database table with 15 columns. One of these columns is a Blob/Binary Column that we used to store the PDF binary data. Day by day, the storage and the size of the database grew, and now, the Oracle/SQL Server file size is close to 1.5 TB with 10 Million records. The size is growing 200GB each year with 1 million records.

Solution

I decided to move the Blob column (PDF File Stream) to Azure Blob Storage.

How

Fetch the blob column from the Oracle/SQL Server and upload record by record with GUID reference. Please follow the below steps to be more specific.

So, the concept here is, the blob/binary column is in byte[] format. When we upload a file to a blob storage, we need to convert and save the binary/file stream information with a unique name. For that, we will be using the additional column (GUID) as mentioned below. The GUID column will act as a Primary and Foreign key relationship between Oracle/SQL Server databases and the Blob Storage.

Results

After moving the Blob/Binary column to Azure Blob Storage, I deleted that column from the table. It saved us 1.4 TB space.
 
How did I do that?

Step 1

Create a blob storage account in Azure. Here, I'm not going over blob storage. If you're not familiar with Azure Blob Storage, get the Understanding of Blob Storage in Azure
 
My focus is, how to upload, download, and delete file streams from a blob storage.
 
Finalize the database activity before moving to C#

Create a column Varchar (50) in your table. I created the 16th column and named it BlobStorageRefKey. Now, update the column with System_GUID that we are going to use to refer between Blob storage and Oracle/SQL Server table.

Here is the SQL query.

  1. Update TableName set BlobStorageRefKey = Sys_GUID();  

C#.NET Project.

Step 2

Create a web application using Visual Studio.
 
Azure Blob Storage

Step 3

Create a connection or app setting for Azure Blob Storage connection.

Ex,

  1. <appSettings>  
  2.    <addkey="AzureConnection"value="DefaultEndpointsProtocol=https;AccountName= etc….. />  
  3. </appSettings>  

Step 4

We are going to read the blob column from database and upload with the BlobStorageRefKey column as reference to blob storage system.

FYI

When you have a large number of records in the database, if you fetch all records at a time, you may end off with “Out of Memory” issue.

I recommend fetching 5000 records at a time, processing them, and fetching the next 5000 and so on.

Statistics

Reading 5000 records (Blob Column) from the database and uploading it to the Azure Blob Storage took 1.8 minutes for me. I executed from the server like Azure Virtual Machine/client remote machines. If you do it from your local machine, it may take up to 10 minutes for every 5000 records.

First of all, how do you identify the first 5000 and next 5000…. When you have 1 million records, here is the solution.

Fetch the total number of records (count (BlobStorageRefKey)) from the table and pass a List< BlobStorageRefKey> as an input to the below method (Parameter: Locations) and use the below method to split the count for every 5000.

  1. ///<summary>  
  2. /// Split the total number of rows to multiple of 5k  
  3. ///</summary>  
  4. ///<typeparam name="T"></typeparam>  
  5. ///<param name="locations"></param>  
  6. ///<param name="nSize"></param>  
  7. ///<returns></returns>  
  8. publicstatic IEnumerable < List < T >> SplitList < T > (List < T > locations, int nSize = 5000) {  
  9.     for (var i = 0; i < locations.Count; i += nSize) {  
  10.         yieldreturn locations.GetRange(i, Math.Min(nSize, locations.Count - i));  
  11.     }  
  12. }  

The above method will give you the List of List count and your first index in your startingIndex and the list last index in your EndIndex so, use these 2 indexes to fetch the records from the main table using below query.

Ex

If you have 100K records in the database, the above methods result like List<List<ColumnName>> so, you will have 2 foreach() loops. The first foreach() loop will give you the first List<T> from List<List<T>> which 5000 and once you are done, your next List<T> will also have 5000 which are next available 5000 for you so and so forth...

The below method will help you to pull the Blob/Binary Column with BlobReferenceKey to a list of object/class.

  1. ///<summary>  
  2. /// Retrieves a PDF stored in the maintable. Returns a byte array, or null  
  3. /// if the PDF does not exist.  
  4. ///</summary>  
  5. publicstatic List < ClassName > RetrieveBlobFromDatabase(string strConnection, int startIndex, int endIndex, ILog log) {  
  6.     try {  
  7.         log.Info("Database records fetching start Time: " + DateTime.Now);  
  8.         Console.WriteLine("Database records fetching start Time: " + DateTime.Now);  
  9.         var lstClassName = new List < ClassName which has Binary and String Properties > ();  
  10.         var con = new OracleConnection(strConnection);  
  11.         con.Open();  
  12.         var commandText = "SELECT PDFColumn, BlobReferenceKeyFROM Maintable WHERE ID between startIndex and EndIndex";  
  13.         //ID: Is your Table Primary Key (Identity Column), if not exists please create one, this will //help you to pull accurate results.  
  14.         var cmd = new OracleCommand {  
  15.             CommandText = commandText,  
  16.                 Connection = con  
  17.         };  
  18.         var dr = cmd.ExecuteReader();  
  19.         if (dr.HasRows) {  
  20.             while (dr.Read()) {  
  21.                 var objInfo = new ClassName {  
  22.                     PDF = (byte[]) dr["PDFColumn"],  
  23.                         BLOBSTORAGEKEY = dr["BlobReferenceKey"].ToString()  
  24.                 };  
  25.                 lstClassName.Add(objInfo);  
  26.             }  
  27.         }  
  28.         con.Close();  
  29.         log.Info("Database records fetching End Time: " + DateTime.Now);  
  30.         Console.WriteLine("Database records fetching End Time: " + DateTime.Now);  
  31.         return lstClassName;  
  32.     } catch (Exception e) {  
  33.         log.Error(e);  
  34.         throw;  
  35.     }  
  36. }  

The lstClassName contains your blob/binary column data and BLOBSTORAGEKEY is a GUID associated with that row. Now we are going to upload this binary information to blob storage in Azure.

UPLOAD

Step 5

The below method will help you upload your binary data to the Azure Blob Storage account.

Do foreach to the above lstClassName and pass first property to byte[] parameter and 2nd property to filename.

  1. foreach(var objResult in lstClassName) {  
  2.     UploadBinaryAsync(objResult.PDF, objResult.BLOBSTORAGEKEY).Wait();  
  3. }  
  4. ///<summary>  
  5. /// Uploading to Azure blob storage  
  6. ///</summary>  
  7. ///<param name="bytes"></param>  
  8. ///<param name="fileName"></param>  
  9. ///<returns></returns>  
  10. privatestaticasync Task UploadBinaryAsync(byte[] bytes, string fileName) {  
  11.     try {  
  12.         var cloudStorageAccount = CloudStorageAccount.Parse(ConfigurationManager.AppSettings["AzureConnection"]);  
  13.         //create a block blob CloudBlobClient cloudBlobClient = cloudStorageAccount.CreateCloudBlobClient();  
  14.         var cloudBlobClient = cloudStorageAccount.CreateCloudBlobClient();  
  15.         //create a container CloudBlobContainer cloudBlobContainer = cloudBlobClient.GetContainerReference("appcontainer");  
  16.         var cloudBlobContainer = cloudBlobClient.GetContainerReference("Your Container Name");  
  17.         //get Blob reference  
  18.         var cloudBlockBlob = cloudBlobContainer.GetBlockBlobReference(fileName);  
  19.         cloudBlockBlob.Properties.ContentType = System.Net.Mime.MediaTypeNames.Application.Pdf;  
  20.         if (!cloudBlockBlob.Exists()) {  
  21.             using(Stream stream = new MemoryStream(bytes)) {  
  22.                 await cloudBlockBlob.UploadFromStreamAsync(stream);  
  23.             }  
  24.         }  
  25.     } catch (Exception e) {  
  26.         Console.WriteLine("Exception on UploadBinaryAsync : " + fileName);  
  27.         log.Error(e);  
  28.     }  
  29. }  

Download a file from blob Storage

To download a blob from the Azure Blob Storage, you can use your Oracle/SQL Server table column blobReferenceKey as reference to download that from Azure.

  1. ///<summary>  
  2. /// Download Sds from blob storage  
  3. ///</summary>  
  4. ///<param name="blobReferenceKey"></param>  
  5. ///<returns></returns>  
  6. publicbyte[] DownloadFileFromBlob(string blobReferenceKey) {  
  7.     var storageAccount = CloudStorageAccount.Parse(ConfigurationManager.ConnectionStrings["blobConnection"].ToString());  
  8.     var blobClient = storageAccount.CreateCloudBlobClient();  
  9.     // Get Blob Container  
  10.     var container = blobClient.GetContainerReference("Your Container Name");  
  11.     // Get reference to blob (binary content)  
  12.     var blockBlob = container.GetBlockBlobReference(blobReferenceKey);  
  13.     // Read content  
  14.     using(var ms = new MemoryStream()) {  
  15.         if (blockBlob.Exists()) {  
  16.             blockBlob.DownloadToStream(ms);  
  17.         }  
  18.         return ms.ToArray();  
  19.     }  
  20. }  

If you are using a MVC application class this method from FileResult looks like below.

  1. [HttpGet]  
  2. public FileResult DownloadSds(string blobStorageReference) {  
  3.     var resultInfo = DownloadFileFromBlob(blobStorageReference);  
  4.     return File(resultInfo, System.Net.Mime.MediaTypeNames.Application.Pdf, fileName + ".pdf");  
  5. }  

Delete a Blob from Blob Storage

  1. ///<summary>  
  2. /// delete from Azure blob storage  
  3. ///</summary>  
  4. ///<param name="cloudStorageAccount"></param>  
  5. ///<param name="dbConnection"></param>  
  6. ///<param name="log"></param>  
  7. ///<returns></returns>  
  8. privatestaticvoid DeleteBlob(string blobReferenceKey, ILog log) {  
  9.     try {  
  10.         var cloudStorageAccount = CloudStorageAccount.Parse(ConfigurationManager.AppSettings["AzureConnection"]);  
  11.         var cloudBlobClient = cloudStorageAccount.CreateCloudBlobClient();  
  12.         var cloudBlobContainer = cloudBlobClient.GetContainerReference("your container name");  
  13.         var blobReference = cloudBlobContainer.GetBlobReference(blobReferenceKey);  
  14.         if (blobReference.Exists()) {  
  15.             log.Info(blobRef);  
  16.             blobReference.DeleteIfExistsAsync();  
  17.         }  
  18.     } catch (Exception e) {  
  19.         Console.WriteLine(e);  
  20.         throw;  
  21.     }  
  22. }  
Finally, this is how it looks in the Azure Blob Storage.
 
Azure Blob Storage