Let's create the tables and populate them:
create table Country ( CountryId int identity(1,1) not null primary key, CountryCode varchar(4) not null unique, CountryName varchar(30) not null unique ); insert into Country(CountryCode,CountryName) values('PH','Philippines'); insert into Country(CountryCode,CountryName) values('CN','China'); insert into Country(CountryCode,CountryName) values('JP','Japan'); create table City ( CityId int identity(1,1) not null primary key, CityCode varchar(3) not null unique, CityName varchar(30) not null unique, CountryId int not null references Country(CountryId) ); insert into City(CityCode, CityName, CountryId) values('MKT','Makati',1); insert into City(CityCode, CityName, CountryId) values('MNL','Manila',1); insert into City(CityCode, CityName, CountryId) values('BEI','Beijing',2); insert into City(CityCode, CityName, CountryId) values('SHA','Shanghai',2); insert into City(CityCode, CityName, CountryId) values('TKY','Tokyo',3); insert into City(CityCode, CityName, CountryId) values('KYT','Kyoto',3); create table Sales ( OrNo int identity(1,1) not null primary key, OrYear int not null, CityId int not null references City(CityId), OrAmount numeric(18,6) not null ); declare @i int; set @i = 0; while @i < 500000 begin insert into Sales(OrYear,CityId,OrAmount) values(2010,1,2); set @i = @i + 1; end; set @i = 0; while @i < 500000 begin insert into Sales(OrYear,CityId,OrAmount) values(2010,2,2); set @i = @i + 1; end; set @i = 0; while @i < 500000 begin insert into Sales(OrYear,CityId,OrAmount) values(2011,1,2); set @i = @i + 1; end; set @i = 0; while @i < 500000 begin insert into Sales(OrYear,CityId,OrAmount) values(2011,2,2); set @i = @i + 1; end; set @i = 0; while @i < 500000 begin insert into Sales(OrYear,CityId,OrAmount) values(2010,3,2); set @i = @i + 1; end; set @i = 0; while @i < 500000 begin insert into Sales(OrYear,CityId,OrAmount) values(2010,4,2); set @i = @i + 1; end; set @i = 0; while @i < 500000 begin insert into Sales(OrYear,CityId,OrAmount) values(2011,3,2); set @i = @i + 1; end; set @i = 0; while @i < 500000 begin insert into Sales(OrYear,CityId,OrAmount) values(2011,4,2); set @i = @i + 1; end; set @i = 0; while @i < 500000 begin insert into Sales(OrYear,CityId,OrAmount) values(2010,5,2); set @i = @i + 1; end; set @i = 0; while @i < 500000 begin insert into Sales(OrYear,CityId,OrAmount) values(2010,6,2); set @i = @i + 1; end; set @i = 0; while @i < 500000 begin insert into Sales(OrYear,CityId,OrAmount) values(2011,5,2); set @i = @i + 1; end; set @i = 0; while @i < 500000 begin insert into Sales(OrYear,CityId,OrAmount) values(2011,6,2); set @i = @i + 1; end;Then let's create a view for the summary:
create view SalesOnEachCountry as select x.CountryId, x.CountryName, s.OrYear, COUNT_BIG(*) AS SalesCountPerCountryPerYear, sum(s.OrAmount) as TotalSales from dbo.Sales s join dbo.City c on c.CityId = s.CityId join dbo.Country x on c.CountryId = x.CountryId group by x.CountryId, s.OrYear ,x.CountryName -- ancillary fieldRunning this...
select * from SalesOnEachCountry order by CountryName, OrYear, has a result of...
CountryId CountryName OrYear SalesCountPerCountryPerYear TotalSales ----------- ------------------------------ ----------- --------------------------- --------------------------------------- 2 China 2010 1000000 2000000.000000 2 China 2011 1000000 2000000.000000 3 Japan 2010 1000000 2000000.000000 3 Japan 2011 1000000 2000000.000000 1 Philippines 2010 1000000 2000000.000000 1 Philippines 2011 1000000 2000000.000000 (6 row(s) affected)...which took 4 seconds. My benchmarking is muddled up a bit, as my development machine has SSD(with read speed of 250 MB/s), 7200 rpm drive has 35 MB/s read speed only. Anyway, for the sake of discussion, we are interested with how fast can materialized view(indexed view in Sql Server parlance) optimize aggregations. Now that 4 seconds, could really hurt the web response time, it's time to rectify that. We shall materialize that 6 rows view. It's just a two step process, first we must schema-bind the views:
alter view SalesOnEachCountry with SchemaBinding as select x.CountryId, x.CountryName, s.OrYear, COUNT_BIG(*) AS SalesCountPerCountryPerYear, sum(s.OrAmount) as TotalSales from dbo.Sales s join dbo.City c on c.CityId = s.CityId join dbo.Country x on c.CountryId = x.CountryId group by x.CountryId, s.OrYear ,x.CountryName -- ancillary fieldSecond, we put index on those fact columns(i.e. CountryId, OrYear):
CREATE UNIQUE CLUSTERED INDEX ux_Country on SalesOnEachCountry(CountryId, OrYear);Then try querying the view again:
select * from SalesOnEachCountry order by CountryName, OrYear;Surprise! That query took 0 second(I don't know how to switch the display to millisecond) :-) And even if you try to insert/update/delete a row in Sales table, Sql Server will not re-aggregate the 3 million and 1 rows, instead it will just increase/decrease the count and summary column. Likewise, updating the OrAmount won't cause re-aggregation on rows; instead it will efficiently increment/decrement the summary column based on the delta of updated OrAmount. Your query will not visit the million rows to produce information for 6 rows when using
Ancillary field(s)
No comments:
Post a Comment