First, let's create test data
create table tank as select * from (values('01',7),('02',8),('03',9),('04',10),('05',11)) as x(tank_num, volume); create table tank_fish as select * from (values ('01','Black Moor'), ('01','Bubble Eye'), ('01','Comet'), ('02','Bubble Eye'), ('02','Lion Head'), ('02','Pompom'), ('02','Ryukin'), ('03','Comet'), ('03','Lion Head'), ('03','Ranchu'), ('04','Betta')) as x(tank_num, fish);
test=# select * from tank order by tank_num; tank_num | volume ----------+-------- 01 | 7 02 | 8 03 | 9 04 | 10 05 | 11 (5 rows) test=# select * from tank_fish order by tank_num, fish; tank_num | fish ----------+------------ 01 | Black Moor 01 | Bubble Eye 01 | Comet 02 | Bubble Eye 02 | Lion Head 02 | Pompom 02 | Ryukin 03 | Comet 03 | Lion Head 03 | Ranchu 04 | Betta (11 rows)
Then let's use some nifty function to facilitate reporting rows in comma-delimited format.
CREATE AGGREGATE array_accum ( sfunc = array_append, basetype = anyelement, stype = anyarray, initcond = '{}' );
Then let's count the fish in each tank
select t.tank_num, array_to_string(array_accum(f.fish), ', ') as fishes, count(*) from tank t left join tank_fish f using(tank_num) group by t.tank_num order by t.tank_num
Output:
tank_num | fishes | count ----------+---------------------------------------+------- 01 | Black Moor, Bubble Eye, Comet | 3 02 | Bubble Eye, Lion Head, Pompom, Ryukin | 4 03 | Comet, Lion Head, Ranchu | 3 04 | Betta | 1 05 | | 1 (5 rows)
As we can see there's something wrong in output, the tank# 5 indicates one fish, while there is none.
So make it a practice to use COUNT(field_name_here) instead of COUNT(*)
Amend the query:
select t.tank_num, array_to_string(array_accum(f.fish), ', ') as fishes, count(f.fish) from tank t left join tank_fish f using(tank_num) group by t.tank_num order by t.tank_num
Output:
tank_num | fishes | count ----------+---------------------------------------+------- 01 | Black Moor, Bubble Eye, Comet | 3 02 | Bubble Eye, Lion Head, Pompom, Ryukin | 4 03 | Comet, Lion Head, Ranchu | 3 04 | Betta | 1 05 | | 0 (5 rows)
Now let's find the tank which has Comet and Ranchu in it:
select t.tank_num, array_to_string(array_accum(f.fish), ', ') as fishes, count(f.fish in ('Comet', 'Ranchu')) from tank t left join tank_fish f using(tank_num) group by t.tank_num order by t.tank_num
Output:
tank_num | fishes | count ----------+---------------------------------------+------- 01 | Black Moor, Bubble Eye, Comet | 3 02 | Bubble Eye, Lion Head, Pompom, Ryukin | 4 03 | Comet, Lion Head, Ranchu | 3 04 | Betta | 1 05 | | 0 (5 rows)
Oops.. there's something wrong with the query, it should only show tank# 1 with count of 1, and tank# 3 with count of 2.
COUNT also counts false, it only exclude NULL.
So this query:
select count(null),count(0),count(1),count(2),count(false),count(true)
Outputs:
count | count | count | count | count | count -------+-------+-------+-------+-------+------- 0 | 1 | 1 | 1 | 1 | 1 (1 row)
Here's how the GROUP BY query above looks like when expanded
select t.tank_num, f.fish, f.fish in ('Comet', 'Ranchu') from tank t left join tank_fish f using(tank_num) order by t.tank_num, f.fish
Here's the output:
tank_num | fish | ?column? ----------+------------+---------- 01 | Black Moor | f 01 | Bubble Eye | f 01 | Comet | t 02 | Bubble Eye | f 02 | Lion Head | f 02 | Pompom | f 02 | Ryukin | f 03 | Comet | t 03 | Lion Head | f 03 | Ranchu | t 04 | Betta | f 05 | | (12 rows)
By the way, f.fish in ('Comet','Ranchu') is equivalent to: f.fish = 'Comet' or f.fish = 'Ranchu'.
False is also countable, like for example there's a survey "Do you think lochness monster is real?" some will answer True, some will answer False, some are undecided (NULL). Say the answers are False, True, True, Null, False, True, Null, True. If you want to count how many has an answer to the question, it should only yield 6, not 8, COUNT(Interviewed.Answer) excludes the undecided, don't do COUNT(*) nor COUNT(Interviewed.*). But if you want to count how many have answered, count the person, COUNT(Interviewed.PersonName).
And why 0 is also countable, well number 0 could be a primary key, Steve Jobs employee ID is numbered 0. So if you want to count how many employees in Apple has been issued an ID, you should do COUNT(Employee.EmployeeID). COUNT(Employee.EmployeeID) instead of COUNT(*), also takes care of not counting employees which don't have an ID yet.
So how to exclude those false? Just use NULLIF function. NULLIF converts the first parameter to null if its value is same as parameter 2. Typical uses are NULLIF(field_or_expression_here, 0),
NULLIF(field_or_expression_here, false)
select t.tank_num, array_to_string(array_accum(f.fish), ', ') as fishes, count(nullif( f.fish in ('Comet', 'Ranchu'), false )) from tank t left join tank_fish f using(tank_num) group by t.tank_num order by t.tank_num
Output:
tank_num | fishes | count ----------+---------------------------------------+------- 01 | Black Moor, Bubble Eye, Comet | 1 02 | Bubble Eye, Lion Head, Pompom, Ryukin | 0 03 | Comet, Lion Head, Ranchu | 2 04 | Betta | 0 05 | | 0 (5 rows)
Now the output is correct, so finding the tank which has Comet and Ranchu fish in it, should be as simple matter as moving the count statement from SELECT to GROUP BY's HAVING clause:
select t.tank_num, array_to_string(array_accum(f.fish), ', ') as fishes from tank t left join tank_fish f using(tank_num) group by t.tank_num having count(nullif( f.fish in ('Comet', 'Ranchu'), false )) >= 2 order by t.tank_num
Output:
tank_num | fishes ----------+-------------------------- 03 | Comet, Lion Head, Ranchu (1 row)
If you want to take advantage of the fact that booleans can be casted to integer in Postgresql, cast the boolean to integer, then just summarize results using SUM function, SUM accepts numbers only.
Query
select t.tank_num, array_to_string(array_accum(f.fish), ', ') as fishes from tank t left join tank_fish f using(tank_num) group by t.tank_num having sum(f.fish in ('Comet', 'Ranchu') ::int) >= 2 order by t.tank_num
No comments:
Post a Comment