Upload Large Files To MVC / WebAPI Using Partitioning

Introduction

Sending large files to an MVC/Web API server can be problematic. This article describes an alternative: break a large file into small chunks, upload the chunks, then merge them back together on the server (file transfer by partitioning). The article shows how to send files to an MVC server both from a web page using JavaScript and from a Windows Forms application using HttpClient. The server side can be implemented using either MVC or Web API.

In my experience, the larger the file you need to upload to a website/API, the bigger the potential problems you encounter. Even when you put the right settings in place, adjust your web.config, make certain you use the right multiplier for maxRequestLength (specified in kilobytes) and maxAllowedContentLength (specified in bytes), and of course don't forget about executionTimeout (eek!), things can still go wrong. Connections can fail when the file is *almost* transferred, servers can unexpectedly (Murphy's law) run out of space, and the list goes on. The diagram below demonstrates the basic concept discussed in this article.


Background 

The concept for this solution is very simple. The attached code works (I have it in production), and can be improved by you in many ways. For example, for the purposes of this article the original large file is broken into approximately 1 MB chunks and uploaded to the server sequentially, one chunk at a time. This could be made more efficient by threading and sending chunks in parallel. It could also be made more robust by adding fault tolerance, auto-resume built into a REST API architecture, and so on. I leave you to implement these features yourself if you need them.

The code consists of two parts: the initial file split/partitioning into chunks, and the final merge of the chunks back into the original file. I will demonstrate the file split using both C# in a Windows Forms application and JavaScript, and the file merge using C# server-side.

File split

The concept of splitting a file is very basic. We traverse the file as a binary stream, from position zero up to the last byte in the file, copying out chunks of binary data along the way and transferring them. Generally we set an arbitrary (or carefully thought out!) chunk size and use this as the amount of data to take at a time. Anything left over at the end is the final chunk.

In the example below, a chunk size of 128 bytes is set. For the file shown, this gives us 3 x 128-byte chunks and 1 x 32-byte chunk, so four file chunks result from the split and are transferred to the server.
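
The arithmetic behind that example can be sketched in a few lines of JavaScript. This is only an illustration: the function name chunkRanges is made up for this sketch, and the 416-byte file size is simply what 3 x 128 + 32 implies.

```javascript
// Compute the byte ranges that a file of `fileSize` bytes splits into
// for a fixed `chunkSize`. The last range holds whatever is left over.
// (chunkRanges is a hypothetical helper, not part of the attached demos.)
function chunkRanges(fileSize, chunkSize) {
    if (chunkSize <= 0) {
        throw new Error("chunkSize must be positive");
    }
    var ranges = [];
    for (var start = 0; start < fileSize; start += chunkSize) {
        // `end` is exclusive and never runs past the end of the file
        var end = Math.min(start + chunkSize, fileSize);
        ranges.push({ start: start, end: end, length: end - start });
    }
    return ranges;
}

// A 416-byte file with a 128-byte chunk size
var ranges = chunkRanges(416, 128);
console.log(ranges.length);                                  // 4
console.log(ranges.map(function (r) { return r.length; }));  // [ 128, 128, 128, 32 ]
```

Both the C# and the JavaScript split code later in the article follow this same pattern: advance by the chunk size and let the final chunk be shorter.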

C# File Split

The accompanying demo "WinFileUpload" is a simple Windows Forms application. Its sole function is to demonstrate splitting a sample large file (50 MB) in C#, and using an HttpClient to post the file to a web server (in this case, an MVC server).

For this C# example, I have a class called Utils that takes some input variables such as maximum file chunk size, temporary folder location, and the name of the file to split. To split the file into chunks, we call the method "SplitFile". SplitFile works its way through the input file and breaks it into separate file chunks. We then upload each file chunk using "UploadFile".

    Utils ut = new Utils();
    ut.FileName = "hs-2004-15-b-full_tif.bmp"; // hard coded for demo
    ut.TempFolder = Path.Combine(CurrentFolder, "Temp");
    ut.MaxFileSizeMB = 1;
    ut.SplitFile();

    // upload each chunk file in turn, stopping at the first failure
    bool allUploaded = true;
    foreach (string filePart in ut.FileParts)
    {
        if (!UploadFile(filePart))
        {
            allUploaded = false;
            break;
        }
    }
    MessageBox.Show(allUploaded ? "Upload complete!" : "Upload failed.");


The file upload method takes an input file name and uses an HttpClient to upload the file. Note that we are sending MultipartFormDataContent to carry the payload.

    // requires: using System; using System.IO;
    //           using System.Net.Http; using System.Net.Http.Headers;
    public bool UploadFile(string FileName)
    {
        bool rslt = false;
        using (var client = new HttpClient())
        {
            using (var content = new MultipartFormDataContent())
            {
                // read the whole chunk into memory; chunks are small by design
                var fileContent = new ByteArrayContent(System.IO.File.ReadAllBytes(FileName));
                fileContent.Headers.ContentDisposition = new
                    ContentDispositionHeaderValue("attachment")
                    {
                        FileName = Path.GetFileName(FileName)
                    };
                content.Add(fileContent);

                var requestUri = "http://localhost:8170/Home/UploadFile/";
                try
                {
                    // block until the POST completes (synchronous for simplicity)
                    var result = client.PostAsync(requestUri, content).Result;
                    // only report success if the server returned a 2xx status
                    rslt = result.IsSuccessStatusCode;
                }
                catch (Exception)
                {
                    // log error
                    rslt = false;
                }
            }
        }
        return rslt;
    }

So, that's the supporting code out of the way. One of the critical things to be aware of next is the file naming convention being used. It consists of the original file name plus a code-parsable tail, ".part_N.X", that is used server-side to merge the different file chunks back into a single contiguous file. This is simply the convention I put together. You can change it to suit your own requirements, just be sure you are consistent with it.

The convention for this example is:

Name = original name + ".part_N.X" (N = file part number, X = total files).

Here is an example of a picture file split into three parts.

    MyPictureFile.jpg.part_1.3
    MyPictureFile.jpg.part_2.3
    MyPictureFile.jpg.part_3.3

It doesn't matter in what order the file chunks are sent to the server. The important thing is that some convention, like the one above, is used, so that the server knows (a) which file part it is dealing with and (b) when all parts have been received and can be merged back into the original large file.
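
To make the convention concrete, here is a minimal JavaScript sketch that builds a part name and parses one back into its pieces. The function names buildPartName and parsePartName are invented for this illustration and do not appear in the attached demos. The same logic applies in C#.

```javascript
var PART_TOKEN = ".part_";

// Build a chunk name: original name + ".part_N.X"
// (buildPartName and parsePartName are hypothetical helper names)
function buildPartName(originalName, partNumber, totalParts) {
    return originalName + PART_TOKEN + partNumber + "." + totalParts;
}

// Parse a chunk name back into its pieces. Returns null when the
// name does not follow the ".part_N.X" convention.
function parsePartName(partName) {
    // lastIndexOf copes with original names that themselves contain ".part_"
    var tokenPos = partName.lastIndexOf(PART_TOKEN);
    if (tokenPos < 0) {
        return null;
    }
    var match = /^(\d+)\.(\d+)$/.exec(partName.substring(tokenPos + PART_TOKEN.length));
    if (!match) {
        return null;
    }
    var partNumber = parseInt(match[1], 10);
    var totalParts = parseInt(match[2], 10);
    if (partNumber < 1 || partNumber > totalParts) {
        return null; // part numbers run from 1 to X
    }
    return {
        baseName: partName.substring(0, tokenPos),
        partNumber: partNumber,
        totalParts: totalParts
    };
}

console.log(buildPartName("MyPictureFile.jpg", 2, 3)); // MyPictureFile.jpg.part_2.3
console.log(parsePartName("MyPictureFile.jpg.part_2.3").partNumber); // 2
```

Rejecting names that do not match the pattern is a choice made in this sketch. It gives the receiving side a cheap sanity check before it touches the file system.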

Next, here is the meat of the C# code that scans the file, creating multiple chunk files ready to transfer.

    // requires: using System; using System.IO;
    public bool SplitFile()
    {
        bool rslt = false;
        string BaseFileName = Path.GetFileName(FileName);
        // set the size of file chunk we are going to split into
        int BufferChunkSize = MaxFileSizeMB * (1024 * 1024);
        // set a buffer size and an array to store the buffer data as we read it
        const int READBUFFER_SIZE = 1024;
        byte[] FSBuffer = new byte[READBUFFER_SIZE];
        // open the file to read it into chunks
        using (FileStream FS = new FileStream(FileName, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            // calculate the number of files that will be created, using integer
            // ceiling division (float division loses precision on large files)
            int TotalFileParts = (int)Math.Max(1, (FS.Length + BufferChunkSize - 1) / BufferChunkSize);

            int FilePartCount = 0;
            // scan through the file, and each time we get enough data to fill a chunk, write out that file
            while (FS.Position < FS.Length)
            {
                string FilePartName = String.Format("{0}.part_{1}.{2}",
                    BaseFileName, (FilePartCount + 1).ToString(), TotalFileParts.ToString());
                FilePartName = Path.Combine(TempFolder, FilePartName);
                FileParts.Add(FilePartName);
                using (FileStream FilePart = new FileStream(FilePartName, FileMode.Create))
                {
                    int bytesRemaining = BufferChunkSize;
                    int bytesRead = 0;
                    // copy up to one chunk's worth of data, one read buffer at a time
                    while (bytesRemaining > 0 && (bytesRead = FS.Read(FSBuffer, 0,
                        Math.Min(bytesRemaining, READBUFFER_SIZE))) > 0)
                    {
                        FilePart.Write(FSBuffer, 0, bytesRead);
                        bytesRemaining -= bytesRead;
                    }
                }
                // file written, loop for next chunk
                FilePartCount++;
            }
            // every chunk was written without an exception
            rslt = true;
        }
        return rslt;
    }

That's it for the C# client side. We will see the result and how to handle things server-side later in the article. Next, let's look at how to do the same thing in JavaScript, from a web browser.

JavaScript File Split

NB - The JavaScript code, and the C# Merge code are contained in the attached demo file "MVCServer"

In our browser, we have an input control of type "file", and a button to call a method that initiates the file-split and data transfer.

    <input type="file" id="uploadFile" name="file" />  <a class="btn btn-primary" href="#" id="btnUpload">Upload file</a>

On document ready, we bind to the click event of the button to call the main method.

    $(document).ready(function () {
        $('#btnUpload').click(function (e) {
            e.preventDefault(); // stop the "#" link from navigating
            UploadFile($('#uploadFile')[0].files);
        });
    });

Our UploadFile method does the work of splitting the file into chunks and, as in our C# example, passing the chunks off to another method for transfer. The main difference is that in C# we created individual files, whereas in JavaScript we take the chunks from an array instead.

    function UploadFile(TargetFile)
    {
        // create array to store the buffer chunks
        var FileChunk = [];
        // the file object itself that we will work with
        var file = TargetFile[0];
        if (!file) { return; } // no file selected
        // set up other initial vars
        var MaxFileSizeMB = 1;
        var BufferChunkSize = MaxFileSizeMB * (1024 * 1024);
        var FileStreamPos = 0;
        // set the initial chunk end position
        var EndPos = BufferChunkSize;
        var Size = file.size;

        // add to the FileChunk array until we get to the end of the file
        while (FileStreamPos < Size)
        {
            // "slice" the file from the starting offset up to (not including) EndPos;
            // slice clamps EndPos to the end of the file for the final chunk
            FileChunk.push(file.slice(FileStreamPos, EndPos));
            FileStreamPos = EndPos; // jump by the amount read
            EndPos = FileStreamPos + BufferChunkSize; // set next chunk end position
        }
        // get total number of "files" we will be sending
        var TotalParts = FileChunk.length;
        var PartCount = 0;
        var chunk; // declared locally to avoid creating an implicit global
        // loop through, pulling the first item from the array each time and sending it
        while (chunk = FileChunk.shift())
        {
            PartCount++;
            // file name convention
            var FilePartName = file.name + ".part_" + PartCount + "." + TotalParts;
            // send the file
            UploadFileChunk(chunk, FilePartName);
        }
    }

The UploadFileChunk function takes the part of the file handed to it by the previous method, and posts it to the server in a similar manner to the C# example.

    function UploadFileChunk(Chunk, FileName)
    {
        // wrap the chunk in multipart form data, using the part name as the file name
        var FD = new FormData();
        FD.append('file', Chunk, FileName);
        $.ajax({
            type: "POST",
            url: 'http://localhost:8170/Home/UploadFile/',
            contentType: false, // let the browser set the multipart boundary
            processData: false, // send the FormData object as-is
            data: FD
        });
    }
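
Because $.ajax is asynchronous, the loop in UploadFile starts every POST without waiting for the previous one to finish. If you want strictly sequential uploads with some fault tolerance (one of the improvements suggested in the Background section), one possible shape is sketched below. This is an assumption-laden illustration, not the demo's code: uploadSequentially and its send parameter are hypothetical, and send is expected to return a Promise or other thenable.

```javascript
// Send chunks one at a time, retrying a failed chunk up to `maxAttempts` times.
// `chunks` is an array of { name, data } objects. `send` is a caller-supplied
// function (hypothetical) taking (data, name) and returning a Promise/thenable.
function uploadSequentially(chunks, send, maxAttempts) {
    function sendWithRetry(chunk, attempt) {
        // Promise.resolve normalises any thenable into a standard Promise
        return Promise.resolve(send(chunk.data, chunk.name)).catch(function (err) {
            if (attempt >= maxAttempts) {
                throw err; // give up on this chunk, which fails the whole upload
            }
            return sendWithRetry(chunk, attempt + 1);
        });
    }
    // chain the chunks so each one starts only after the previous one succeeds
    return chunks.reduce(function (previous, chunk) {
        return previous.then(function () {
            return sendWithRetry(chunk, 1);
        });
    }, Promise.resolve());
}

// Example with a stand-in sender that only logs (no network involved)
var demoParts = [
    { name: "MyPictureFile.jpg.part_1.2", data: "first half" },
    { name: "MyPictureFile.jpg.part_2.2", data: "second half" }
];
uploadSequentially(demoParts, function (data, name) {
    console.log("sending " + name);
    return Promise.resolve();
}, 3).then(function () {
    console.log("all parts sent");
});
```

To use something like this with the code above, UploadFileChunk would need to return the result of its $.ajax call so that it can serve as the send function.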

File merge

NB - The JavaScript code, and the C# Merge code are contained in the attached demo file "MVCServer"

Over on the Server, be that MVC or Web-API, we receive the individual file chunks and need to merge them back together again into the original file.

The first thing we do is put a standard POST handler in place to receive the file chunks being posted up to the Server. This code takes the input stream, and saves it to a temp folder using the file-name created by the client (C# or JavaScript). Once the file is saved, the code then calls the "MergeFile" method which checks if it has enough file chunks available yet to merge the file together. Note that this is simply the method I have used for this article. You may decide to handle the merge trigger differently, for example, running a job on a timer every few minutes, passing off to another process, etc. It should be changed depending on your own required implementation.

    // requires: using System.IO; using System.Net.Http;
    [HttpPost]
    public HttpResponseMessage UploadFile()
    {
        foreach (string file in Request.Files)
        {
            var FileDataContent = Request.Files[file];
            if (FileDataContent != null && FileDataContent.ContentLength > 0)
            {
                // take the input stream, and save it to a temp folder using
                // the original file.part name posted
                var stream = FileDataContent.InputStream;
                // GetFileName strips any directory information sent by the client
                var fileName = Path.GetFileName(FileDataContent.FileName);
                var UploadPath = Server.MapPath("~/App_Data/uploads");
                Directory.CreateDirectory(UploadPath);
                string path = Path.Combine(UploadPath, fileName);
                try
                {
                    // overwrite any earlier copy of this part
                    if (System.IO.File.Exists(path))
                        System.IO.File.Delete(path);
                    using (var fileStream = System.IO.File.Create(path))
                    {
                        stream.CopyTo(fileStream);
                    }
                    // Once the file part is saved, see if we have enough to merge it
                    Shared.Utils UT = new Shared.Utils();
                    UT.MergeFile(path);
                }
                catch (IOException)
                {
                    // handle
                }
            }
        }
        return new HttpResponseMessage()
        {
            StatusCode = System.Net.HttpStatusCode.OK,
            Content = new StringContent("File uploaded.")
        };
    }

Each time we call the MergeFile method, it first checks whether we have all of the file chunks required to merge the original file back together. It determines this by parsing the file names. If all files are present, the method sorts them into the correct order and then appends one to another until the original file is back together again.

    // requires: using System.Collections.Generic; using System.IO; using System.Linq;
    // SortedFile and MergeFileManager are helper types that are not listed in this article
    /// <summary>
    /// original name + ".part_N.X" (N = file part number, X = total files)
    /// Objective = enumerate files in folder, look for all matching parts of
    /// split file. If found, merge and return true.
    /// </summary>
    /// <param name="FileName">Full path of the chunk file that was just saved</param>
    /// <returns>true if all parts were present and the merge ran</returns>
    public bool MergeFile(string FileName)
    {
        bool rslt = false;
        // parse out the different tokens from the filename according to the convention
        string partToken = ".part_";
        string baseFileName = FileName.Substring(0, FileName.IndexOf(partToken));
        string trailingTokens = FileName.Substring(FileName.IndexOf(partToken) + partToken.Length);
        int FileIndex = 0;
        int FileCount = 0;
        int.TryParse(trailingTokens.Substring(0, trailingTokens.IndexOf(".")), out FileIndex);
        int.TryParse(trailingTokens.Substring(trailingTokens.IndexOf(".") + 1), out FileCount);
        // get a list of all file parts in the temp folder
        string Searchpattern = Path.GetFileName(baseFileName) + partToken + "*";
        string[] FilesList = Directory.GetFiles(Path.GetDirectoryName(FileName), Searchpattern);
        //  merge .. improvement would be to confirm individual parts are there / correctly in
        // sequence, a security check would also be important
        // only proceed if we have received all the file chunks
        if (FilesList.Length == FileCount)
        {
            // use a singleton to stop overlapping processes
            if (!MergeFileManager.Instance.InUse(baseFileName))
            {
                MergeFileManager.Instance.AddFile(baseFileName);
                if (File.Exists(baseFileName))
                    File.Delete(baseFileName);
                // add each file located to a list so we can get them into
                // the correct order for rebuilding the file
                List<SortedFile> MergeList = new List<SortedFile>();
                // loop variable renamed so it no longer hides System.IO.File
                foreach (string partFile in FilesList)
                {
                    SortedFile sFile = new SortedFile();
                    sFile.FileName = partFile;
                    trailingTokens = partFile.Substring(partFile.IndexOf(partToken) + partToken.Length);
                    int.TryParse(trailingTokens.
                        Substring(0, trailingTokens.IndexOf(".")), out FileIndex);
                    sFile.FileOrder = FileIndex;
                    MergeList.Add(sFile);
                }
                // sort by the file-part number to ensure we merge back in the correct order
                var MergeOrder = MergeList.OrderBy(s => s.FileOrder).ToList();
                using (FileStream FS = new FileStream(baseFileName, FileMode.Create))
                {
                    // merge each file chunk back into one contiguous file stream
                    foreach (var chunk in MergeOrder)
                    {
                        try
                        {
                            using (FileStream fileChunk =
                                new FileStream(chunk.FileName, FileMode.Open))
                            {
                                fileChunk.CopyTo(FS);
                            }
                        }
                        catch (IOException)
                        {
                            // handle
                        }
                    }
                }
                rslt = true;
                // unlock the file from singleton
                MergeFileManager.Instance.RemoveFile(baseFileName);
            }
        }
        return rslt;
    }
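
The comment in MergeFile notes that one improvement would be to confirm that each individual part is present and in sequence, rather than only comparing the number of files found with the expected total. A minimal sketch of that check follows, written in JavaScript to keep it short. The function name mergeOrder is hypothetical, and the sketch works on plain file names rather than full paths.

```javascript
// Given the names of the chunk files currently in the upload folder, decide
// whether every part of `baseName` has arrived. Returns the names in merge
// order, or null while any part is still missing.
// (mergeOrder is a hypothetical helper, not part of the attached demos.)
function mergeOrder(fileNames, baseName) {
    var prefix = baseName + ".part_";
    var byNumber = {};
    var total = null;
    fileNames.forEach(function (name) {
        if (name.indexOf(prefix) !== 0) { return; } // belongs to another file
        var match = /^(\d+)\.(\d+)$/.exec(name.substring(prefix.length));
        if (!match) { return; } // does not follow the ".part_N.X" convention
        var partNumber = parseInt(match[1], 10);
        var partTotal = parseInt(match[2], 10);
        if (total === null) { total = partTotal; }
        // assumption: parts that disagree about the total are ignored
        if (partTotal === total) { byNumber[partNumber] = name; }
    });
    if (total === null) { return null; }
    var ordered = [];
    for (var i = 1; i <= total; i++) {
        if (!byNumber[i]) { return null; } // part i has not arrived yet
        ordered.push(byNumber[i]);
    }
    return ordered;
}

var received = ["MyPictureFile.jpg.part_3.3", "MyPictureFile.jpg.part_1.3"];
console.log(mergeOrder(received, "MyPictureFile.jpg")); // null (part 2 is missing)
received.push("MyPictureFile.jpg.part_2.3");
console.log(mergeOrder(received, "MyPictureFile.jpg"));
// [ 'MyPictureFile.jpg.part_1.3', 'MyPictureFile.jpg.part_2.3', 'MyPictureFile.jpg.part_3.3' ]
```

Indexing by the parsed part number, rather than sorting the names as strings, keeps part 10 after part 2.
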
Using the file split on the client side and the file merge on the server side, we now have a very workable solution for uploading large files in a more robust manner than simply sending them up in one large block of data. For testing, I used some large image files converted to BMP from a Hubble picture.
 
That's it - happy uploading!