Read first how not to do(set-based) running total at http://sqlblog.com/blogs/adam_machanic/archive/2006/07/12/running-sums-redux.aspx
Given Sql Server 2008 windowing capability limitation, most would give their new-fangled CTE skills a shot by writing running total query in a recursive manner:
with T AS ( select ROW_NUMBER() over(order by OrderID) as rn, * from test ) ,R(Rn, OrderId, Qty, RunningTotal) as ( select Rn, OrderID, Qty, Qty from t where rn = 1 union all select t.Rn, t.OrderId, t.Qty, p.RunningTotal + t.Qty from t t join r p on t.rn = p.rn + 1 ) select R.OrderId, R.Qty, R.RunningTotal from r option(maxrecursion 0)
All is fine and dandy, except when that query is ran on a production database, that query will not scale, that query took a good 9 seconds on a 5,000 rows table. (DDL at the bottom of this post)
Now let's try another approach, let's think for a while we are back in time, say Sql Server 2000. What would your DBA grandpa would do to facilitate that running total report?
create function TestRunningTotal() returns @ReturnTable table( OrderId int, Qty int, RunningTotal int ) as begin insert into @ReturnTable(OrderID, Qty, RunningTotal) select OrderID, Qty, 0 from Test order by OrderID; declare @RunningTotal int = 0; update @ReturnTable set RunningTotal = @RunningTotal, @RunningTotal = @RunningTotal + Qty; return; end;
And that query took 0 second.
And go back further in time, say Sql Server 7, what would he do? He would follow Adam Machanic's approach: http://sqlblog.com/blogs/adam_machanic/archive/2006/07/12/running-sums-redux.aspx. Cursor is one of the rarity cases where running total is a very good choice.
The following is every developer's heaven, works on Postgresql 8.4, Sql Server 2012 and Oracle 9(or 8.1.7 ?):
select OrderID, Qty, sum(Qty) over(order by OrderID) as RunningTotal from test
Now back to regular programming(Sql Server 2008).
Some called that kind of update that relies on physical sort a quirky update. If you feel uneasy with quirky update, you might want to put a guard statement to prevent wrong update order.
create function TestRunningTotalGuarded() returns @ReturnTable table( OrderId int, Qty int, RunningTotal int not null, RN int identity(1,1) not null ) as begin insert into @ReturnTable(OrderID, Qty, RunningTotal) select OrderID, Qty, 0 from Test order by OrderID; declare @RunningTotal int = 0; declare @RN_check INT = 0; update @ReturnTable set @RN_check = @RN_check + 1, @RunningTotal = (case when RN = @RN_check then @RunningTotal + Qty else 1/0 end), RunningTotal = @RunningTotal; return; end;
If UPDATE really update rows in unpredictable order, the @RN_Check will not be equal to RN(identity order) anymore, the code will raise a divide-by-zero error then.
DDL
create table Test( OrderID int primary key, Qty int not null ); declare @i int = 1; while @i <= 5000 begin insert into Test(OrderID, Qty) values (@i * 2,rand() * 10); set @i = @i + 1; end;
Running total example results:
OrderId Qty RunningTotal ----------- ----------- ------------ 2 4 4 4 8 12 6 4 16 8 5 21 10 3 24 12 8 32 14 2 34 16 9 43 18 1 44 20 2 46 22 0 46 24 2 48 26 6 54 28 2 56 30 8 64 32 6 70 34 0 70 36 4 74 38 2 76 40 5 81 42 4 85
UPDATE: May 29, 2012
Use cursor, quirky update is well.. quirky. Until further notice, please use something predictable, like cursor. http://www.ienablemuch.com/2012/05/recursive-cte-is-evil-and-cursor-is.html
UPDATE: May 29, 2012 8:27 PM
Just by putting a clustered primary key on the table variable, it makes the updating of rows in order. Check the test(looped 100 times) on the bottom of this article, it has no divide-by-zero(guard statement) error anymore: http://www.ienablemuch.com/2012/05/recursive-cte-is-evil-and-cursor-is.html
Until further notice, I would say quirky update is not really quirky.
No comments:
Post a Comment