I came across a SQL query that I thought would be best written in an ORM-friendly way, so that when the time comes to port it to an ORM query, the porting would be easier. Or so I thought.
I saw this question on http://stackoverflow.com/questions/10646018/can-you-use-case-to-specify-a-join-to-use-in-mysql:
I am wondering if it is possible to use some sort of case in a mysql query to effectively achieve the following and reduce the amount of data that has to be looked up each time ..
SELECT um.message, UNIX_TIMESTAMP(um.time), um.read, user.username
FROM usermessages um
CASE
WHEN um.friendId = 1
INNER JOIN user ON um.sourceUserId = user.id
WHEN um.sourceUserId = 1
INNER JOIN user ON um.friendId = user.id
END
WHERE (um.friendId = 1 OR um.sourceUserId = 1)
I offered this solution: http://stackoverflow.com/questions/10646018/can-you-use-case-to-specify-a-join-to-use-in-mysql/10646233#10646233
SELECT um.message, UNIX_TIMESTAMP(um.time), um.read, user.username
FROM usermessages um
INNER JOIN user ON (um.friendId = 1 AND um.sourceUserId = user.id) OR um.friendId = user.id
WHERE (um.friendId = 1 OR um.sourceUserId = 1);
Then another stackoverfellow offered this solution: http://stackoverflow.com/questions/10646018/can-you-use-case-to-specify-a-join-to-use-in-mysql/10646078#10646078
SELECT um.message, UNIX_TIMESTAMP(um.time), um.read, user.username
FROM usermessages um
INNER JOIN user ON CASE WHEN um.friendId = 1 THEN um.sourceUserId ELSE um.friendId END = user.id
WHERE (um.friendId = 1 OR um.sourceUserId = 1)
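To check whether the two ON conditions above really select the same rows, here is a minimal sketch that runs them through Python's built-in sqlite3 module instead of MySQL. The three users and three messages are made-up sample data, UNIX_TIMESTAMP and the time/read columns are left out for brevity, and the third "symmetric_or" condition is an illustrative assumption that appears in neither answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "user" (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE usermessages (message TEXT, friendId INTEGER, sourceUserId INTEGER);
    -- Made-up sample data: user 1 is the user we are querying for.
    INSERT INTO "user" (id, username) VALUES (1, 'me'), (2, 'alice'), (3, 'bob');
    INSERT INTO usermessages (message, friendId, sourceUserId) VALUES
        ('msg A', 1, 2),
        ('msg B', 3, 1),
        ('msg C', 3, 2);
""")

# Same SELECT shape as the answers above; only the join condition varies.
QUERY = """
    SELECT um.message, u.username
    FROM usermessages um
    INNER JOIN "user" u ON ({condition})
    WHERE (um.friendId = 1 OR um.sourceUserId = 1)
    ORDER BY um.message, u.username
"""

CONDITIONS = {
    # The CASE WHEN answer.
    "case": "(CASE WHEN um.friendId = 1 THEN um.sourceUserId "
            "ELSE um.friendId END) = u.id",
    # The OR answer, as posted.
    "or": "(um.friendId = 1 AND um.sourceUserId = u.id) OR um.friendId = u.id",
    # Hypothetical variant (assumption): guard both branches of the OR.
    "symmetric_or": "(um.friendId = 1 AND um.sourceUserId = u.id) "
                    "OR (um.sourceUserId = 1 AND um.friendId = u.id)",
}

results = {
    name: conn.execute(QUERY.format(condition=cond)).fetchall()
    for name, cond in CONDITIONS.items()
}
for name, rows in results.items():
    print(name, rows)
```

On this toy data the OR condition as posted also pairs the message whose friendId is 1 with user 1's own row, because um.friendId = user.id holds there as well, so it returns one row more than the CASE WHEN form. Guarding the second branch with um.sourceUserId = 1 makes the two agree on this sample.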
I disliked his solution initially. As I embrace ORMs more and more, I'm of the opinion that we should write portable code, and what's more portable than writing the query with constructs that exist in both SQL and the ORM? I disliked the CASE WHEN approach to the problem because I did not see an analogous construct for it in ORMs, Linq in particular.
With that in mind, I tried to dissuade the user from using the CASE WHEN approach. But we need to wise up: portability is not a panacea that magically brings our applications more customers. How can a customer be happy if bad performance spoils the fun of using your application? So I profiled both queries to see which one is faster. I expected the OR to be slower than CASE WHEN (an RDBMS does not promise short-circuit evaluation; it uses cost-based analysis to decide how to execute the query), but I did not expect a big difference. Much to my chagrin, the difference was staggering: the CASE WHEN approach finished the job in 88 milliseconds, while the OR approach took 4.7 seconds. The performance of the OR approach is not acceptable.
The CASE WHEN approach's query cost is only 3%, while the OR approach's is 97%. So there go our portability ideals; they are not so ideal after all. We need to tap the strengths of a given tool to get our job done efficiently. I will no longer be dead set on making a given code construct compatible with another platform or API.
The code used for benchmarking: http://www.sqlfiddle.com/#!6/29531/2
create table usermessages(
    message varchar(30),
    friendId int,
    sourceUserId int
);
create table "user"(
    userId int identity(1,1) primary key,
    userName varchar(30)
);
create index ix_usermessages__friendid on usermessages(friendid);
create index ix_usermessages__combined on usermessages(friendid,sourceUserId);
create index ix_usermessages__sourceuserid on usermessages(sourceuserid);
insert into usermessages
select 'hello', row_number() over(order by x.message_id) % 50, 1
from sys.messages x;
insert into "user"(userName)
select 'hello' from sys.messages;
select count(*)
from usermessages um
join "user" u on (case when um.friendId = 1 then um.sourceUserId else um.friendId end) = u.userId;
select count(*)
from usermessages um
join "user" u on (um.friendId = 1 and um.sourceUserId = u.userId) or (um.friendId = u.userId);
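For readers without a SQL Server instance at hand, a scaled-down version of the benchmark can be approximated with Python's built-in sqlite3 module. This is only a sketch under assumptions: the row counts N_MESSAGES and N_USERS are arbitrary stand-ins for sys.messages, the helper timed_count is a hypothetical name, and SQLite's query planner is not SQL Server's, so the timings it prints should not be expected to reproduce the 88 ms versus 4.7 s figures.

```python
import sqlite3
import time

# Assumed sizes; the original fills both tables from sys.messages instead.
N_MESSAGES = 2000
N_USERS = 2000

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE usermessages (message TEXT, friendId INTEGER, sourceUserId INTEGER);
    CREATE TABLE "user" (userId INTEGER PRIMARY KEY, userName TEXT);
    CREATE INDEX ix_usermessages__friendid ON usermessages (friendId);
    CREATE INDEX ix_usermessages__combined ON usermessages (friendId, sourceUserId);
    CREATE INDEX ix_usermessages__sourceuserid ON usermessages (sourceUserId);
""")

# friendId cycles through 1..49, 0 like row_number() % 50; sourceUserId is always 1.
conn.executemany(
    "INSERT INTO usermessages (message, friendId, sourceUserId) VALUES ('hello', ?, 1)",
    [(n % 50,) for n in range(1, N_MESSAGES + 1)],
)
# userId is assigned automatically, starting at 1, like identity(1,1).
conn.executemany('INSERT INTO "user" (userName) VALUES (?)', [("hello",)] * N_USERS)
conn.commit()

CASE_SQL = """
    SELECT count(*) FROM usermessages um
    JOIN "user" u
      ON (CASE WHEN um.friendId = 1 THEN um.sourceUserId ELSE um.friendId END) = u.userId
"""
OR_SQL = """
    SELECT count(*) FROM usermessages um
    JOIN "user" u
      ON (um.friendId = 1 AND um.sourceUserId = u.userId) OR (um.friendId = u.userId)
"""

def timed_count(sql):
    """Run a count(*) query and return (count, elapsed seconds)."""
    start = time.perf_counter()
    (count,) = conn.execute(sql).fetchone()
    return count, time.perf_counter() - start

case_count, case_seconds = timed_count(CASE_SQL)
or_count, or_seconds = timed_count(OR_SQL)
print(f"CASE WHEN: {case_count} rows in {case_seconds:.4f}s")
print(f"OR:        {or_count} rows in {or_seconds:.4f}s")
```

Both queries should report the same count here, since every row has sourceUserId = 1; only the elapsed time is of interest, and it is worth rerunning a few times before reading anything into it.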