SqlDataAdapter: Alternative (Performance)

In this article you will learn how to increase SQL performance when inserting a large amount of data.

Background

Recently I was working on an application that reads a large CSV file, validates each row and inserts them into the database. This was a legacy code and performed pathetically. You can judge the pathetic performance from the time it took to process around 4,500,000 records. It nearly took 3-4 hours! There were many points that I looked at trying to optimize and I wanted to share some points related to efficiently inserting data into a SQL Server database here so that others can get some help.

Algorithm

  1. Create an in-memory DataTable. This was un-typed data table.
  2. Read CSV file as a Stream.
  3. For each row, validate whether it matches a RegEx.
  4. If its valid, create a new row under the in-memory DataTable.
  5. Repeat this process of creating rows in the DataTable for each valid row in the CSV.
  6. In the end, use SqlDataAdapter to send those in-memory rows to the database.

The problem

There were many issues other than the data adapter being used in this program. However for the sake of simplicity, I will not include those details here. So the problem area as of now was how to ensure that 4,500,000 records are inserted in a flash to the database. SQL Server/C# already had an answer to it, that we will look at in a moment. The code that was supposed to push those in-memory rows to the database looked something like the following.

database

Kindly note that I have removed the table and columns names for safety. Also you may find other points of optimization in this code. However the focus of this article does not cover them.

The Solution

As said earlier, I said that there was already a solution provided, however it was not being used. The solution is to use the SqlBukCopy class instead. You can read more details on MSDN. Using this class, I was able to insert all my data into the database nearly in 1 minute (or even less, I have not bothered about the exact figure because this is outstanding performance compared to the previous pathetic code). So the new code looked something like this:

new code

Conclusion

Please choose appropriate approaches/classes when trying to write a solution for specific issues. Chances are that you are not the first person facing the same issues and there could be a solution that you are not aware of. Try your searching skills and you will definitely reach a solution that suits your needs. In my case, it was legacy code done in Framework 2.0 and was not touched by anyone until it failed to work.

I am interested in getting feedback about my article. This is my second article on c-sharpcorner. You can read my previous article on updating data via views.