Introduction
When dealing with large files (typically over 100 MB) in AWS S3, using Multipart Upload is critical. Multipart Upload allows you to upload a single object as a set of parts, improving performance, reliability, and enabling pause-resume capabilities.
In this article, we'll explore:
- What Multipart Upload is and why it matters
- How to implement Multipart Upload with the AWS SDK for .NET (C#)
- Best practices and optimization tips
- Integration tips for production systems
Let's get started!
1. What is Multipart Upload?
Multipart Upload is a feature of S3 that lets you upload a file in discrete parts. Each part is uploaded independently and can be re-uploaded if it fails. Once all parts are uploaded, S3 assembles the object.
Key Benefits
- Parallel uploads: Improve upload speed by uploading parts simultaneously.
- Resumable uploads: Resume failed uploads without starting over.
- Efficient retries: Retry only the failed part.
Multipart upload is strongly recommended for any file larger than 100 MB and required for files larger than 5 GB.
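These limits interact: each part must be at least 5 MiB (except the last), and an object can have at most 10,000 parts. The following minimal sketch (the `PartSizing` helper is ours, not part of the AWS SDK) picks a part size that satisfies both constraints by doubling from the minimum:

```csharp
// Hypothetical helper: pick a part size that satisfies S3's multipart limits
// (minimum 5 MiB per part, maximum 10,000 parts per object).
public static class PartSizing
{
    private const long MinPartSize = 5L * 1024 * 1024; // 5 MiB
    private const int MaxParts = 10_000;

    public static long ChoosePartSize(long fileSize)
    {
        // Start at the minimum and grow until the file fits in 10,000 parts.
        long partSize = MinPartSize;
        while (fileSize > partSize * MaxParts)
            partSize *= 2;
        return partSize;
    }
}
```

For most files the minimum suffices (5 MiB parts cover objects up to roughly 48 GB); only very large objects force a bigger part size.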
2. Basic Multipart Upload Example
The TransferUtility helper in the AWS SDK for .NET makes this straightforward.
Code Example
using Amazon;
using Amazon.S3;
using Amazon.S3.Transfer;

public static async Task MultipartUploadAsync(string filePath, string bucketName, string keyName)
{
    var s3Client = new AmazonS3Client(RegionEndpoint.USEast1);
    var fileTransferUtility = new TransferUtility(s3Client);

    var uploadRequest = new TransferUtilityUploadRequest
    {
        BucketName = bucketName,
        FilePath = filePath,
        Key = keyName,
        PartSize = 5 * 1024 * 1024, // 5 MB per part (the S3 minimum)
        StorageClass = S3StorageClass.Standard,
        CannedACL = S3CannedACL.Private
    };

    await fileTransferUtility.UploadAsync(uploadRequest);
    Console.WriteLine("Multipart Upload Completed!");
}
Explanation
- TransferUtility automatically manages parts, uploads, and retries.
- PartSize can be adjusted based on file size and available bandwidth.
- CannedACL sets the object's permissions.
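The same request object also exposes an UploadProgressEvent, which is how the progress tracking mentioned later is typically wired up. A minimal sketch (the bucket name, file path, and key are placeholders):

```csharp
using Amazon;
using Amazon.S3;
using Amazon.S3.Transfer;

var s3Client = new AmazonS3Client(RegionEndpoint.USEast1);
var transferUtility = new TransferUtility(s3Client);

var request = new TransferUtilityUploadRequest
{
    BucketName = "my-bucket",        // placeholder
    FilePath = @"C:\data\large.zip", // placeholder
    Key = "backups/large.zip",       // placeholder
    PartSize = 16 * 1024 * 1024
};

// Fires as parts are transferred; useful for console or UI progress bars.
request.UploadProgressEvent += (sender, e) =>
    Console.WriteLine($"{e.TransferredBytes}/{e.TotalBytes} bytes ({e.PercentDone}%)");

await transferUtility.UploadAsync(request);
```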
3. Manual Multipart Upload (More Control)
For maximum control, manually manage the upload process.
Manual Multipart Upload Steps
- Initiate a Multipart Upload
- Upload Parts
- Complete Multipart Upload
- Handle exceptions and abort if needed
Full Code Example
using Amazon;
using Amazon.S3;
using Amazon.S3.Model;

public static async Task ManualMultipartUploadAsync(string filePath, string bucketName, string keyName)
{
    var s3Client = new AmazonS3Client(RegionEndpoint.USEast1);

    var initiateRequest = new InitiateMultipartUploadRequest
    {
        BucketName = bucketName,
        Key = keyName
    };
    var initResponse = await s3Client.InitiateMultipartUploadAsync(initiateRequest);

    var partETags = new List<PartETag>();

    try
    {
        const long partSize = 5 * 1024 * 1024; // 5 MB (the minimum for every part except the last)
        long filePosition = 0;
        int partNumber = 1;

        using var fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read);
        while (filePosition < fileStream.Length)
        {
            var uploadRequest = new UploadPartRequest
            {
                BucketName = bucketName,
                Key = keyName,
                UploadId = initResponse.UploadId,
                PartNumber = partNumber++,
                PartSize = partSize,
                InputStream = fileStream,
                FilePosition = filePosition,
                IsLastPart = filePosition + partSize >= fileStream.Length
            };

            var uploadResponse = await s3Client.UploadPartAsync(uploadRequest);
            partETags.Add(new PartETag(uploadResponse.PartNumber, uploadResponse.ETag));

            filePosition += partSize;
        }

        var completeRequest = new CompleteMultipartUploadRequest
        {
            BucketName = bucketName,
            Key = keyName,
            UploadId = initResponse.UploadId,
            PartETags = partETags
        };

        await s3Client.CompleteMultipartUploadAsync(completeRequest);
        Console.WriteLine("Upload complete!");
    }
    catch (Exception ex)
    {
        Console.WriteLine($"An error occurred: {ex.Message}");

        await s3Client.AbortMultipartUploadAsync(new AbortMultipartUploadRequest
        {
            BucketName = bucketName,
            Key = keyName,
            UploadId = initResponse.UploadId
        });
        Console.WriteLine("Multipart upload aborted.");
    }
}
Explanation
- You manage each upload part manually.
- You must complete the upload explicitly.
- In case of failure, you should abort the upload to avoid orphaned uploads.
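Because the manual flow keeps the UploadId, a failed upload can alternatively be resumed rather than aborted: ask S3 which parts it already has, then re-upload only the missing ones. A sketch using ListPartsAsync (the helper name is ours; error handling omitted):

```csharp
using System.Linq;
using Amazon.S3;
using Amazon.S3.Model;

// Hypothetical helper: fetch the parts S3 already received for a saved uploadId.
public static async Task<List<PartETag>> GetUploadedPartsAsync(
    IAmazonS3 s3Client, string bucketName, string keyName, string uploadId)
{
    var response = await s3Client.ListPartsAsync(new ListPartsRequest
    {
        BucketName = bucketName,
        Key = keyName,
        UploadId = uploadId
    });

    // These ETags can seed partETags; upload only the part numbers not listed here.
    return response.Parts
        .Select(p => new PartETag(p.PartNumber, p.ETag))
        .ToList();
}
```

Note that ListParts responses are paginated (up to 1,000 parts per call); for larger uploads you would follow IsTruncated/NextPartNumberMarker.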
4. Best Practices for Multipart Uploads
- Part size: Choose part sizes wisely (minimum 5 MB and maximum 5 GB per part, with at most 10,000 parts per object).
- Parallelism: Upload multiple parts in parallel to speed up the overall upload.
- Retries: Implement retry logic for failed parts.
- Abort uploads: Always abort incomplete uploads to avoid unnecessary S3 charges.
- Monitoring: Use Amazon CloudWatch to monitor upload progress and detect failures.
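Applying the parallelism tip to the manual flow above means uploading several parts concurrently. One common design (a sketch, not the only option) gives each part its own read stream, since a single FileStream cannot safely be shared by concurrent readers; the helper name is ours:

```csharp
using Amazon.S3;
using Amazon.S3.Model;

// Hypothetical helper: upload one part from its own stream so parts can run in parallel.
public static async Task<PartETag> UploadOnePartAsync(
    IAmazonS3 s3Client, string filePath, string bucketName, string keyName,
    string uploadId, int partNumber, long offset, long partSize, bool isLastPart)
{
    // FileShare.Read lets multiple tasks open the same file concurrently.
    using var stream = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.Read);

    var response = await s3Client.UploadPartAsync(new UploadPartRequest
    {
        BucketName = bucketName,
        Key = keyName,
        UploadId = uploadId,
        PartNumber = partNumber,
        PartSize = partSize,
        InputStream = stream,
        FilePosition = offset,
        IsLastPart = isLastPart
    });

    return new PartETag(response.PartNumber, response.ETag);
}

// Usage sketch: build one task per part, then
//   var etags = await Task.WhenAll(partTasks);
// and pass the ETags (sorted by part number) to CompleteMultipartUploadRequest.
```

In practice you would cap concurrency (for example with SemaphoreSlim) rather than launching every part at once.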
5. Integration Tips for Production Systems
- Security: Use IAM roles and least privilege policies.
- Resumable uploads: Save UploadId and ETags so uploads can resume after failure.
- Progress tracking: Display upload progress using ProgressEvent handlers.
- Exception handling: Gracefully handle edge cases like network interruptions.
- Cost optimization: Clean up incomplete multipart uploads regularly (use S3 Lifecycle Policies).
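The cost-optimization tip can be automated with a bucket lifecycle rule instead of manual cleanup. A sketch using the SDK (the rule id, empty prefix, and 7-day window are our choices, and this assumes the AbortIncompleteMultipartUpload lifecycle setting available in recent AWS SDK for .NET versions):

```csharp
using Amazon.S3;
using Amazon.S3.Model;

public static Task AddAbortRuleAsync(IAmazonS3 s3Client, string bucketName)
{
    // Abort multipart uploads that have not completed within 7 days of initiation.
    return s3Client.PutLifecycleConfigurationAsync(new PutLifecycleConfigurationRequest
    {
        BucketName = bucketName,
        Configuration = new LifecycleConfiguration
        {
            Rules = new List<LifecycleRule>
            {
                new LifecycleRule
                {
                    Id = "abort-incomplete-multipart-uploads", // our rule name
                    Status = LifecycleRuleStatus.Enabled,
                    // Empty prefix applies the rule to the whole bucket.
                    Filter = new LifecycleFilter
                    {
                        LifecycleFilterPredicate = new LifecyclePrefixPredicate { Prefix = "" }
                    },
                    AbortIncompleteMultipartUpload = new LifecycleRuleAbortIncompleteMultipartUpload
                    {
                        DaysAfterInitiation = 7
                    }
                }
            }
        }
    });
}
```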
Conclusion
Multipart Upload is a powerful feature of Amazon S3 that enables efficient uploading of large files. With the AWS SDK for .NET, you can automate and control the upload process easily. By following best practices and using a well-structured implementation, you can build robust and scalable file transfer features into your C# applications.
Whether you're building a cloud backup tool, media uploader, or large data archive system, mastering multipart uploads with S3 is essential!