Retrieving failed records after an SqlBulkCopy exception
Let me start by saying that the idea I used in this article is not originally mine, but since I have only heard of it and have not been able to find any actual examples of its implementation anywhere I wrote the code to handle it.
With that out of the way - here's what this is about: Anyone who's worked with .NET's SqlBulkCopy class knows how fast and powerful it is. It beats other mechanisms for pumping large quantities of data into an SQL Server database by huge factors, and one of the reasons it is so fast is that it does not log anything it does.
The lack of logging definitely speeds things up, but when you are pumping hundreds of thousands of rows and suddenly have a failure on one of them because of a constraint, you're stuck. All the SqlException will tell you is that something went wrong with a given constraint (you'll get the constraint's name at least), but that's about it. You're then stuck having to go back to your source, run separate SELECT statements on it (or do manual searches), and find the culprit rows on your own.
On top of that, it can be a very long and iterative process if you've got data with several potential failures in it becauseSqlBulkCopy will stop as soon as the first failure is hit. Once you correct that one, you need to rerun the load to find the second error, etc.
The approach described in this article has the following advantages:
- Reports all possible errors that the
SqlBulkCopywould encounter - Reports all culprit data rows, along with the exception that row would be causing
- The entire thing is run in a transaction that is rolled back at the end, so no changes are committed.
... and disadvantages:
- For extremely large amounts of data it might take a couple of minutes.
- This solution is reactive; i.e. the errors are not returned as part of the exception raised by your
SqlBulkCopy.WriteToServer()process. Instead, this helper method is executed after the exception is raised to try and capture all possible errors along with their related data. This means that in case of an exception, your process will take longer to run than just running the bulk copy. - You cannot reuse the same
DataReaderobject from the failedSqlBulkCopy, as readers are forward only fire hoses that cannot be reset. You'll need to create a new reader of the same type (e.g. re-issue the originalSqlCommand, recreate the reader based on the same DataTable, etc).
Background
The main idea is quite simple. Rerun the bulk copy, but only process one row at a time. As the rows are processed, capture the individual exceptions that copying them raises (if any) and add both the message and the row's data to an incremental message, but don't stop copying the data to the server. When all is said and done, your final error message is a nice log showing all the issues and the data that caused them. From that point it's easy to go back to the source, find those records, fix the issues and then reissue the bulk copy.
Using the code
It's important to note that not all failures on a bulk copy happen because of data. You might have connectivity issues, authentication failures, timeouts, etc. None of these cases would be explained by your data, so there's no point in calling this helper method if this is your case. You need to take this into account when calling the helper method, and only call it for specific types of exceptions (the sample code below takes care of this).
Also consider that the exception you're catching may not necessarily be the one raised by SqlServer and could be contained within an inner Exception. So if you plan on calling the helper method only if a data-related issue occurred, the exception (and all inner exceptions) needs to be inspected for this. The sample code below takes care of this, even though the Exception is coming directly from the server; in your case, you might be handling it at a higher level after the it has been wrapped in other exceptions.
Test bulk copy method
TestMethod() below is a simple method that sets up for a bulk copy operation and encloses it in a try/catch block. It is this bulk copy that supposedly fails because of some data issue, so within the catch block we then check the exception (and all inner exceptions) for a message containing the word "constraint" (which is apparently the only way to find a constraint failure, as all exceptions from SqlServer are of type SqlException). If such an exception message is found, we call GetBulkCopyFailedData() in order to get the failed rows. This latter method would ideally reside in a separate helper-type class.
Granted, this checking could have been done within the helper, but I was trying to keep it generic enough so that would show all exceptions and not assume what the caller wanted to filter out.
private void TestMethod()
{
// new code
SqlConnection connection = null;
SqlBulkCopy bulkCopy = null; DataTable dataTable = new DataTable();
// load some sample data into the DataTable
IDataReader reader = dataTable.CreateDataReader(); try
{
connection = new SqlConnection("connection string goes here ...");
connection.Open();
bulkCopy = new SqlBulkCopy(connection);
bulkCopy.DestinationTableName = "Destination table name";
bulkCopy.WriteToServer(reader);
}
catch (Exception exception)
{
// loop through all inner exceptions to see if any relate to a constraint failure
bool dataExceptionFound = false;
Exception tmpException = exception;
while (tmpException != null)
{
if (tmpException is SqlException
&& tmpException.Message.Contains("constraint"))
{
dataExceptionFound = true;
break;
}
tmpException = tmpException.InnerException;
} if (dataExceptionFound)
{
// call the helper method to document the errors and invalid data
string errorMessage = GetBulkCopyFailedData(
connection.ConnectionString,
bulkCopy.DestinationTableName,
dataTable.CreateDataReader());
throw new Exception(errorMessage, exception);
}
}
finally
{
if (connection != null && connection.State == ConnectionState.Open)
{
connection.Close();
}
}
}
Documenting the errors and faulty data rows
GetBulkCopyFailedData() then opens a new connection to the database, creates a transaction, and begins bulk copying the data one row at a time. It does so by reading through the supplied DataReader and copying each row into an empty DataTable. The DataTable is then bulk copied into the destination database, and any exceptions resulting from this are caught, documented (along with the DataRow that caused it), and the cycle then repeats itself with the next row.
At the end of the DataReader we rollback the transaction and return the complete error message. Fixing the problems in the data source should now be a breeze.
/// <summary>
/// Build an error message with the failed records and their related exceptions.
/// </summary>
/// <param name="connectionString">Connection string to the destination database</param>
/// <param name="tableName">Table name into which the data will be bulk copied.</param>
/// <param name="dataReader">DataReader to bulk copy</param>
/// <returns>Error message with failed constraints and invalid data rows.</returns>
public static string GetBulkCopyFailedData(
string connectionString,
string tableName,
IDataReader dataReader)
{
StringBuilder errorMessage = new StringBuilder("Bulk copy failures:" + Environment.NewLine);
SqlConnection connection = null;
SqlTransaction transaction = null;
SqlBulkCopy bulkCopy = null;
DataTable tmpDataTable = new DataTable(); try
{
connection = new SqlConnection(connectionString);
connection.Open();
transaction = connection.BeginTransaction();
bulkCopy = new SqlBulkCopy(connection, SqlBulkCopyOptions.CheckConstraints, transaction);
bulkCopy.DestinationTableName = tableName; // create a datatable with the layout of the data.
DataTable dataSchema = dataReader.GetSchemaTable();
foreach (DataRow row in dataSchema.Rows)
{
tmpDataTable.Columns.Add(new DataColumn(
row["ColumnName"].ToString(),
(Type)row["DataType"]));
} // create an object array to hold the data being transferred into tmpDataTable
//in the loop below.
object[] values = new object[dataReader.FieldCount]; // loop through the source data
while (dataReader.Read())
{
// clear the temp DataTable from which the single-record bulk copy will be done
tmpDataTable.Rows.Clear(); // get the data for the current source row
dataReader.GetValues(values); // load the values into the temp DataTable
tmpDataTable.LoadDataRow(values, true); // perform the bulk copy of the one row
try
{
bulkCopy.WriteToServer(tmpDataTable);
}
catch (Exception ex)
{
// an exception was raised with the bulk copy of the current row.
// The row that caused the current exception is the only one in the temp
// DataTable, so document it and add it to the error message.
DataRow faultyDataRow = tmpDataTable.Rows[];
errorMessage.AppendFormat("Error: {0}{1}", ex.Message, Environment.NewLine);
errorMessage.AppendFormat("Row data: {0}", Environment.NewLine);
foreach (DataColumn column in tmpDataTable.Columns)
{
errorMessage.AppendFormat(
"\tColumn {0} - [{1}]{2}",
column.ColumnName,
faultyDataRow[column.ColumnName].ToString(),
Environment.NewLine);
}
}
}
}
catch (Exception ex)
{
throw new Exception(
"Unable to document SqlBulkCopy errors. See inner exceptions for details.",
ex);
}
finally
{
if (transaction != null)
{
transaction.Rollback();
}
if (connection.State != ConnectionState.Closed)
{
connection.Close();
}
}
return errorMessage.ToString();
}
Conclusion
I've certainly wasted more than enough time trying to figure out what was wrong with my data because the bulk copy operation wouldn't help me out there, so I hope this helps avoid wasted time for someone else as well.
As always - comments, questions and suggestions are always welcome. And please don't forget to vote!
refer to:http://www.codeproject.com/Articles/387465/Retrieving-failed-records-after-an-SqlBulkCopy-exc
Retrieving failed records after an SqlBulkCopy exception的更多相关文章
- 使用 QQ 邮箱发送邮件报错:java.net.SocketTimeoutException: Read timed out. Failed messages: javax.mail.MessagingException: Exception reading response
使用 QQ 邮箱发送邮件报错:java.net.SocketTimeoutException: Read timed out. Failed messages: javax.mail.Messagin ...
- Failed to create AppDomain 'xxx'. Exception has been Failed to create AppDomain
一服务器上的数据库全部被置于紧急模式(EMERGENCY),在错误日志里面能看到大量下面的错误 Failed to create AppDomain "YourSQLDba.dbo[runt ...
- jmeter linux分布式压测Server failed to start: java.rmi.server.ExportException: Listen failed on port: 0; nested exception is: java.io.FileNotFoundException: rmi_keystore.jks
在路径\apache-jmeter-5.0\bin下启动jmeter-server.bat时抛出了如下异常: 1.jmeter 1099端口 被占用,修改端口号 使用netstat -lntp|gre ...
- JMeter分布式压测-常见问题之(Server failed to start: java.rmi.server.ExportException: Listen failed on port: 0; nested exception )
问题描述: 在Linux环境启动jmeter-server时抛出了如下异常: 问题描述: 1.可能监听的端口被占用,修改端口号2.Server相关的rmi配置需要调整 解决方案: 在目录/apache ...
- nested exception is org.springframework.beans.factory.NoSuchBeanDefinitionException
You should autowire interface AbstractManager instead of class MailManager. If you have different im ...
- 7.6 Models -- Finding Records
Ember Data的store为检索一个类型的records提供一个接口. 一.Retrieving a single record(检索单记录) 1. 通过type和ID使用store.findR ...
- Android Studio 编译错误 Error:Execution failed for task ':app:buildInfoDebugLoader'.
今天来到打开昨天的项目运行正常,然后改动了一点代码编译报错: Error:Execution failed for task ':app:buildInfoDebugLoader'. > Exc ...
- JMeter4.0以上 分布式测试报错 "server failed start Listen failed on port"
使用JMeter4.0做分布式测试的是否,我的电脑作为肉鸡(执行机),双击jmeter-server.bat后显示失败 Found ApacheJMeter_core.jarUsing local p ...
- Professional C# 6 and .NET Core 1.0 - 38 Entity Framework Core
本文内容为转载,重新排版以供学习研究.如有侵权,请联系作者删除. 转载请注明本文出处:Professional C# 6 and .NET Core 1.0 - 38 Entity Framework ...
随机推荐
- 学习并发包常用的接口----java.util.concurrent
1.常用的相关的接口 Callable.(Runnable).Futrue.RunnableFuture.RunnableSheduledFuture.ScheduledFuture.Executor ...
- 数据库分库分表(一)常见分布式主键ID生成策略
主键生成策略 系统唯一ID是我们在设计一个系统的时候常常会遇见的问题,下面介绍一些常见的ID生成策略. Sequence ID UUID GUID COMB Snowflake 最开始的自增ID为了实 ...
- 剑指offer(11-20)编程题
二进制中1的个数 数值的整数次方 调整数组顺序使奇数位于偶数前面 链表中倒数第k个结点 反转链表 合并两个排序的链表 树的子结构 二叉树的镜像 顺时针打印矩阵 包含min函数的栈 11.输入一个整数, ...
- 使用Microsoft Azure云平台中的Service Bus 中继 Intanet环境下的WCF服务。
之前写的一篇文章:) 看起来好亲切. http://www.cnblogs.com/developersupport/archive/2013/05/23/WCF-ON-IIS-Azure-Servi ...
- 得到DataGrid列的值
<mx:DataGridColumn headerText="状态" dataField="D30120_ZH" width="80" ...
- Linq to SharePoint与权限提升(转)
转自http://www.cnblogs.com/kaneboy/archive/2012/01/25/2437086.html SharePoint 2010支持Linq to SharePoint ...
- 【转】IIS网站浏览时提示需要用户名密码登录-解决方法
打开iis,站点右键----属性----目录安全性----编辑----允许匿名访问钩选 IIS连接127.0.0.1要输入用户名密码的解决办法原因很多,请尝试以下操作: 1.查看网站属性——文档看看启 ...
- hibernate的inverse用法
Inverse和cascade是Hibernate映射中最难掌握的两个属性.两者都在对象的关联操作中发挥作用. 1.明确inverse和cascade的作用 inverse 决定是否把对对象中集合的改 ...
- 八、window搭建spark + IDEA开发环境
本文将简单搭建一个spark的开发环境,如下: 1)操作系统:window os 2)IDEA开发工具以及scala插件(IDEA和插件版本要对应): 2-1)IDEA2018.2.1:https:/ ...
- spss C# 二次开发 学习笔记(五)——Spss系统集成模式
Spss官方不支持Server2008R2等Server系列,但做Spss的二次开发,调用Spss的Web系统,一般部署在Server系列上,例如Server2008R2. 起初,在Server上安装 ...