So let us engage in a little SQL exercise. Suppose that we need to copy data from one table to another. A task like this, on its own, can be tackled with a single INSERT...SELECT query. For example:
insert into Employee(firstName, lastName,...,postalCode) select firstName, lastName,...,postalCode from EmployeeImportFor the sake of discussion, we'll refer to EmployeesImport as the table we need to copy rows from and Employees as their destination. Imagine that, instead of just copying the rows from EmployeeImport into Employees, we also need to generate a special hash value generated from the content of each row that's inserted into the Employees table. Top that off with the fact that this special index value can only be generated via a stored procedure and, voilĂ , you've now got an interesting situation on your hands. We can't use the simple INSERT...SELECT query like we did previously because. as we all well know, you can't execute stored procedures inline with SELECT queries. That means that we are going to need to iterate through each row of the EmployeesImport table and insert them one at a time into the Employees table. Here's one way I've seen this task accomplished:
declare @firstName varchar(50), @lastName varchar(50), @birthDt datetime, @addressLine1 varchar(100), @city varchar(80), @province varchar(80), @country varchar(80), @postalCode varchar(10), @hashKey char(10) declare @err bit set @err=0 begin transaction importEmployees -- Setup the cursor declare cur_employee cursor for select * from EmployeesImport open cur_employee fetch next from cur_employee into @firstName,@lastName,@birthDt,@addressLine1,@city,@province,@country,@postalCode -- Main cursor loop while(@@FETCH_STATUS=0) begin -- Generate the hash exec GetEmployeeHash @hashKey output , @lastName , @firstName , @postalCode , @birthDt -- Insert record into Employees table insert into Employees(EmployeeID,FirstName,LastName,BirthDt,AddressLine1,City,Province,Country,PostalCode) select @hashKey,@firstName,@lastName,@birthDt,@addressLine1,@city,@province,@country,@postalCode if(@@ROWCOUNT=0) begin set @err=1 goto RollBackTransaction end -- Retrieve the next record using the cursor fetch next from cur_employee into @firstName,@lastName,@birthDt,@addressLine1,@city,@province,@country,@postalCode end -- Cleanup close cur_employee deallocate cur_employee if(@err=0) commit transaction importEmployees RollBackTransaction: if(@err>1) rollback transaction importEmployeesNow there are a number of issues that I have with this code. For one, lines 21-22 show that the cur_employee cursor is fetching rows directly from the EmployeesImport table and not from a temporary table. This means that it will create locks against the table that can cause blocking issues for other queries that might be working against the same table. This can lead to performance degradation in high-volume environments. Not only that, but developers using cursors must always be mindful to de-allocate and close any they've declare to avoid memory leaks. Personally, I prefer to avoid cursors at all costs as I've found them to be slow, tedious, and high on overhead.
So how would I solve the problem? Like this:
declare @err bit set @err=0 declare @EmployeesTemp as table ( RowID int identity(1,1), FirstName varchar(50), LastName varchar(50), BirthDt datetime, AddressLine1 varchar(100), City varchar(80), province varchar(80), country varchar(80), postalCode varchar(10) ) insert into @EmployeesTemp(FirstName,LastName,BirthDt,AddressLine1,City,Province,Country,PostalCode) select * from EmployeesImport declare @firstName varchar(50), @lastName varchar(50), @birthDt datetime, @addressLine1 varchar(100), @city varchar(80), @province varchar(80), @country varchar(80), @postalCode varchar(10), @hashKey char(10) declare @total int declare @rowID int select @total=count(*) from @EmployeesTemp set @rowID=1 begin transaction importEmployees while (@rowID<=@total) begin select @firstName=FirstName , @lastName=LastName , @birthDt=birthDt , @addressLine1=AddressLine1 , @city=City , @province=Province , @country=Country , @postalCode=PostalCode from @EmployeesTemp where RowID=@rowID -- Generate the hash exec GetEmployeeHash @hashKey output , @lastName , @firstName , @birthDt , @postalCode -- Insert record into Employee table insert into Employees(EmployeeID,FirstName,LastName,BirthDt,AddressLine1,City,Province,Country,PostalCode) select @hashKey,@firstName,@lastName,@birthDt,@addressLine1,@city,@province,@country,@postalCode if(@@ROWCOUNT=0) begin set @err=1 goto RollBackTransaction end -- Increment the loop counter set @rowID=@rowID+1 end if(@err=0) commit transaction importEmployees RollBackTransaction: if(@err>1) rollback transaction importEmployeesMy solution presents a number of improvements over the previous. Instead of repeatedly querying against the source table, the needed data is cached in a table variable. I love table variables because, unlike temp tables, they are scoped to the execution of the stored procedure and are disposed automatically once execution is complete. While I'm using the table variable, I also don't have to worry about locks against the source table. Temp tables also work for the same purpose (for more on the difference between temp tables and table variables, see this excellent Stack Overflow post)
Another cool feature of my solution is that I don't need to maintain cursors in order to move sequentially though my data set. Because I used an identity column in the temp table, I can just increment my counter so that I know the next rowID I need to query. Overall, I find it much simpler to use table variables effectively than to remember the complexities of managing cursors correctly.
On the whole, 95% of the work that you'll ever need to do in SQL can and should be done using set-based queries. That's the whole point of using relational databases in the first place. But for the small minority of tasks that require row-by-row sequential processing, table variables are definitely the better alternative to cursors.
For a great article on cursor performance, check out the article "Performance Tuning SQL Server Cursors" by Brad McGee.
So that's how it works. Nice. Note: Sorry if you get a duplicate comment.
ReplyDelete