Last week, we got an introduction to Efficient Data Transfer with SQL Server BulkCopy and C# using System.Data.SqlClient. If you haven’t read that yet, check it out here. There, we looked at how we can pump large amounts of data into a SQL Server database in an efficient manner by using BulkCopy. As an example, we made some dummy data in a DataTable and passed that construct off to the System.Data.SqlClient library to transfer that data into SQL Server. Today, we’ll look at a common, related use case in this space — having to parse large text-based files (TSV, CSV, etc.) and input that data into tables in SQL Server.
Enter LumenWorksCsvReader
You can of course write some C# code to open your CSV/TSV/TXT data files manually, read through the text contents and parse those out and covert the data into a type that implements IDataReader for sending them to System.Data.SqlClient for a BulkCopy but all that drudgery is already taken care of by LumenWorksCsvReader, a library built just for that.
You can install LumenWorksCsvReader through NuGet:
dotnet add package LumenWorksCsvReader
This library gives you access to a CsvReader
class which is a type that implements IDataReader
, a compatible data type for the System.Data.SqlClient’s BulkCopy method. An example of the basic usage is shown below.
Example Code
using (var csv = new CsvReader(new StreamReader(File.Open("yourData.csv", FileMode.Open, FileAccess.Read, FileShare.ReadWrite)), true)) { csv.GetFieldHeaders(); // Specify the columns that are in your CSV file csv.Columns = new List<Column> { new Column { Name = "EmployeeId", Type = typeof(string), }, new Column { Name = "FirstName", Type = typeof(string) }, new Column { Name = "LastName", Type = typeof(string) }, new Column { Name = "FavoriteColor", Type = typeof(string) }, }; // Now use SQL Bulk Copy to move the data using SqlConnection con = new(sqlServerConnectionString); using var sbc = new SqlBulkCopy(con, SqlBulkCopyOptions.UseInternalTransaction, null); con.Open(); sbc.DestinationTableName = "Employees"; sbc.BatchSize = 1000; sbc.EnableStreaming = true; sbc.BulkCopyTimeout = 0; // Add mappings to correlate your CSV columns to your database table colum sbc.AddColumnMapping("EmployeeId", "EmployeeId"); sbc.AddColumnMapping("FirstName", "FirstName"); sbc.AddColumnMapping("LastName", "LastName"); sbc.AddColumnMapping("FavoriteColor", "FavoriteColor"); // Can create an event handler to get updates on the progress of the data transer // Can be useful for handling very large files with lots of rows sbc.NotifyAfter = 1000; sbc.SqlRowsCopied += new SqlRowsCopiedEventHandler(SqlRowsCopiedNotifier.OnSqlRowsCopied); await sbc.WriteToServerAsync(csv); // When done, close out the connection if (con.State == ConnectionState.Open) { con.Close(); } }
In the example above, we are specifying the columns that exist in the CSV file; open a connection to SQL Server; specify settings for the bulk transfer; map CSV columns to database columns; initiate the transfer; close connection after done.
Conclusion
Loading data from CSV files to insert into RDBMS systems is a common use case in the enterprise and these libraries make that tasks a little less arduous. Check these out to see if it will fit your particular use case.