MySQL commands for In-Class practice
===========================================================================
IN CLASS PRACTICE
===========================================================================
Q.27 Ramakrishana p152, WRONG ANSWER
Find the name and age of the oldest sailor
SELECT S.sname, MAX(S.age)
FROM sailors S;
Wrong because cannot select single record when doing aggregate.
I.e. if you do max you can only report on aggregate (aspects of the
whole, not the individual). To get a the sailor's name who has the
maximum age, we could do the following nested query (review how
nesting works)...
SELECT S.sname, S.age
FROM sailors S
WHERE S.age = (SELECT MAX(S2.age)
FROM sailors S2);
Note if you used MySQL prior to version 4.1 you cannot do this
because nested queries were NOT supported. We'll look at examples of
this later...
Q.30 Ramakrishnan p 153
Find the names of sailors who are older than the oldest sailor
sailer with a rating of 3.
SELECT S.sname
FROM sailors S
WHERE S.age >
(SELECT MAX(S2.age)
FROM sailors S2
WHERE S2.rating = 3);
SELECT S.sname
FROM sailors S
WHERE S.age > ALL
(SELECT S2.age
FROM sailors S2
WHERE S2.rating = 3);
5.2.9: Find the SIDs of suppliers who supply a red part or a green part.
----------------------------------------------------------------------------
5.2.9 Book Answer, good use of UNION; works in MySQL
----------------------------------------------------------------------------
SELECT DISTINCT C.sid
FROM catalog C, parts P
WHERE C.pid = P.pid AND P.color = 'Red'
UNION
SELECT DISTINCT C1.sid
FROM catalog C1, parts P1
WHERE C1.pid = P1.pid AND P1.color = 'Green';
5.2.6: "for each part, find the sname of the supplier who charges the
most for that part"
---------------------------------------------------------------------
5.2.6 Book Answer in MySQL:
---------------------------------------------------------------------
SELECT P.pid, P.pname, S.sname, C.Cost
FROM parts P, suppliers S, catalog C
WHERE C.pid = P.pid
AND C.sid = S.sid
AND C.cost = (
SELECT MAX(C1.cost)
FROM catalog C1
WHERE C1.pid = P.pid)
ORDER BY P.pid;
---------------------------------------------------------------------
5.2.6 Brad's version of book answer, but shorter, without Parts Table
---------------------------------------------------------------------
SELECT C.pid, S.sname, C.Cost
FROM suppliers S, catalog C
WHERE C.sid = S.sid
AND C.cost = (
SELECT MAX(C1.cost)
FROM catalog C1
WHERE C1.pid = C.pid)
ORDER BY C.pid;
---------------------------------------------------------------------
5.2.6 WRONG ANSWER. This will only report those that are max over
all supplied parts, i.e. it will not limit max() check to just the
current PID from the outer loop because the nested query is not
CORRELATED like the answer just above this one.
---------------------------------------------------------------------
SELECT P.pid, S.sname, C.Cost
FROM parts P, suppliers S, catalog C
WHERE C.pid = P.pid
AND C.sid = S.sid
AND C.cost = (
SELECT MAX(C1.cost)
FROM catalog C1
)
ORDER BY P.pid;
---------------------------------------------------------------------
5.2.6 full list of part costs (just for reference)
---------------------------------------------------------------------
SELECT P.pid, S.sname, C.Cost
FROM parts P, suppliers S, catalog C
WHERE C.pid = P.pid
AND C.sid = S.sid
ORDER BY P.pid;
---------------------------------------------------------------------
5.2.6 Short Answer Almost...Incorrect SQL use because it associated
S.sname with aggregrate function (not a unique association). Looks
like it works in MySQL; show why it doesn't by adding value to
table..
---------------------------------------------------------------------
SELECT S.sname, C.pid, MAX(C.Cost)
FROM suppliers S, catalog C
WHERE S.sid = C.sid
GROUP BY C.pid;
Now let's test by
INSERT INTO catalog () VALUES (3,3,0.55);
Then try Short answer above (which yields only single supplier Acme
Widgets for part 3. Compare with correct answer Book or Brad from
above which lists both Acme Widgets and Perfunctory Parts for part 3
at $0.55.
----------------------------------------------------------------------------
Related incorrect answer; like above, but difference is that we used
our trick of including the attributes we want to report (P.pname,
S.sname) in the group by so that we can report them in the SELECT.
This is OK for P.pname as it doesn't changing the groupings.
However, if we also add S.sname this will modify the groupings and
the answer (basically lists every part by every supplier and you
lost the whole intent of max price per part across suppliers.
MORAL: be careful if using this trick to not change the groupings.
----------------------------------------------------------------------------
SELECT S.sname, C.pid, MAX(C.Cost)
FROM parts P, suppliers S, catalog C
WHERE C.pid = P.pid AND C.sid = S.sid
GROUP BY P.pid, P.pname, S.sname;
----------------------------------------------------------------------------
5.2.6 Temporary Table version that works in MySQL. This is how we
used to do things in MySQL before support for nested queries. Note
the syntax for Temporary tables. This is very useful to store
intermediate results.
----------------------------------------------------------------------------
CREATE TEMPORARY TABLE IF NOT EXISTS maxpartcost
SELECT P.pid, MAX(C.cost) as maxcost
FROM parts P, suppliers S, catalog C
WHERE C.pid = P.pid
AND C.sid = S.sid
GROUP BY P.pid;
SELECT C.pid, S.sname, maxcost
FROM suppliers S, maxpartcost, catalog C
WHERE C.sid = S.sid
AND C.cost = maxpartcost.maxcost
AND maxpartcost.pid = C.pid
ORDER BY C.pid;
----------------------------------------------------------------------------
Made-up Query Y: show use of "IN" in nested queries. What's the
minimum price of all green parts?
----------------------------------------------------------------------------
SELECT MIN(C.cost)
FROM catalog C
WHERE C.pid IN (
SELECT DISTINCT P.pid
FROM parts P
WHERE P.color = 'Green');
Since most "IN" nested queries are just like a nested query, often
don't use "IN" syntax, especially if it's not a correlated query, in
which case it can be reduced to just a single query.
SELECT MIN(C.cost)
FROM catalog C, parts P
WHERE C.pid = P.pid AND P.color = 'Green';
More common is to use >,<, etc, or NOT IN to achieve something the
simple nested queries can't as easily do. Or to use IN when
generating complicated nested queries (see later exercises,
especially constraints).
----------------------------------------------------------------------------
Q.22 Ramakrishnan p148, find sailors with better rating than sailor
'horatio' (note horatio doesn't exist in our database, so use jones
first)
----------------------------------------------------------------------------
SELECT S.sid, S.sname
FROM sailors S
WHERE S.rating > ANY ( SELECT S2.rating
FROM sailors S2
WHERE S2.sname = 'jones');
/* NOTE THAT MYSQL sometimes has issues with white space when you
tab in like I did, to avoid this you can do the following... */
SELECT S.sid, S.sname
FROM sailors S
WHERE S.rating > ANY (
SELECT S2.rating
FROM sailors S2
WHERE S2.sname = 'jones'
);
/* now do with horatio--since horatio doesn't exist the subquery
returns FALSE so nothing will match against this result, so empty
set is returned overall */
SELECT S.sid, S.sname
FROM sailors S
WHERE S.rating > ANY (
SELECT S2.rating
FROM sailors S2
WHERE S2.sname = 'horatio'
);
/* The previous example finds sailors rated better than "ANY"
example sailor (jones/horatio). As another example if we want
sailors better than all the sailors named "jones" we could do the
following */
SELECT S.sid, S.sname
FROM sailors S
WHERE S.rating > ALL (
SELECT S2.rating
FROM sailors S2
WHERE S2.sname = 'jones'
);
To see the difference you could add
INSERT INTO sailors () VALUES (72, 'jones', 6, 55.0);
Now you'll just get 'ahab', instead of jonah, ahab, jones.
?? Now if we run, we're returning a SET of values (because of TWO
?? jones) so what happens if you had written the SQL as
SELECT S.sid, S.sname
FROM sailors S
WHERE S.rating > (
SELECT S2.rating
FROM sailors S2
WHERE S2.sname = 'jones'
);
You'll get the following...
ERROR 1242 (21000): Subquery returns more than 1 row
Would this work even if when the sailor doesn't exist, like
horatio, i.e.
SELECT S.sid, S.sname
FROM sailors S
WHERE S.rating > ALL (
SELECT S2.rating
FROM sailors S2
WHERE S2.sname = 'horatio'
);
Book says yes...but what does MYSQL result give? What about ACCESS?
Oracle?
MySQL returns empty set
ACCESS: returns full set
ORACLE: returns full set
Why?? (Let's play around with NULLs, and empty sets some)
/* Moby which has NULL returns NULL */
SELECT S2.rating
FROM sailors S2
WHERE S2.sname = 'moby';
/* Horatio, which doesn't exist, returns empty set */
SELECT S2.rating
FROM sailors S2
WHERE S2.sname = 'horatio';
/* works, NULL matches NULL */
SELECT (SELECT S2.rating
FROM sailors S2
WHERE S2.sname = 'moby') IS NULL;
/* works, EMPTY SET matches NULL, not sure why */
SELECT (SELECT S2.rating
FROM sailors S2
WHERE S2.sname = 'horatio') IS NULL;
SELECT (NULL) IS NULL;
some websites to look at....
http://www.dbdebunk.com/page/page/2296478.htm
http://sql-info.de/mysql/gotchas.html
http://www.oracle.com/technology/oramag/oracle/05-jul/o45sql.html
/* how aggregates handle empty sets: returning 0 versus NULL */
SELECT COUNT(S.sid),AVG(S.sid),SUM(S.sid),MAX(S.sid),MIN(S.sid)
FROM sailors S
WHERE 1 = 2;
----------------------------------------------------------------------------
Chapter Example: Q32 variation
Find the age of the youngest sailor who is eligible to vote for each
rating level that has less than two sailors [and that doesn't have
any sailors over 60].
----------------------------------------------------------------------------
/* BASIC Query (without the < 60 part) */
SELECT S.rating, MIN(S.age) as minage
FROM sailors S
WHERE S.age >= 18
GROUP BY S.rating
HAVING COUNT(*) < 2;
+--------+--------+
| rating | minage |
+--------+--------+
| 5 | 63.0 |
| 7 | 44.0 |
| 8 | 53.0 |
+--------+--------+
3 rows in set (0.00 sec)
/* the following would answer the part about over 60 as well, but it
does not work in MySQL because of the every() */
SELECT S.rating, MIN(S.age) as minage
FROM sailors S
WHERE S.age >= 18
GROUP BY S.rating
HAVING COUNT(*) < 2 AND EVERY(S.age <= 60);
/* since we cannot use EVERY, can we use something else? "MAX" */
SELECT S.rating, MIN(S.age) as minage
FROM sailors S
WHERE S.age >= 18
GROUP BY S.rating
HAVING COUNT(*) < 2 AND MAX(S.age) <= 60;
Note the difference in logic (and results) between the above query
(slightly modified to report number of sailors in group) and a
similar but DIFFERENT query (in logic and results). The second
query is different in that it excludes the > 60 year olds prior to
them getting into the groups, and as a result returns an empty set.
This is shown well in the figures 5.10 - 5.15 on pages 156-158 of
Ramakrishnan.
CORRECT
SELECT S.rating, MIN(S.age) as minage, COUNT(*) as NumberInGroup
FROM sailors S
WHERE S.age >= 18
GROUP BY S.rating
HAVING COUNT(*) < 2 AND MAX(S.age) <= 60;
+--------+--------+---------------+
| rating | minage | NumberInGroup |
+--------+--------+---------------+
| 7 | 44.0 | 1 |
| 8 | 53.0 | 1 |
+--------+--------+---------------+
2 rows in set (0.00 sec)
INCORRECT--excluses 60+ year olds too early
SELECT S.rating, MIN(S.age) as minage, COUNT(*) as NumberInGroup
FROM sailors S
WHERE S.age >= 18 AND S.age <= 60
GROUP BY S.rating
HAVING COUNT(*) > 2;
+--------+--------+---------------+
| rating | minage | NumberInGroup |
+--------+--------+---------------+
| 3 | 30.0 | 3 |
| 6 | 55.0 | 3 |
+--------+--------+---------------+
2 rows in set (0.00 sec)
----------------------------------------------------------------------------
Madeup-Query-Example: Should only have attributes in SELECT line
that are in GROUP BY or are aggregate functions. WRONG--ILLEGAL SQL
since DNAME not in GROUP BY
List the department name, and the number of employees in each department.
----------------------------------------------------------------------------
/* incorrect as you are not allowed to report dname since not part
of GROUP BY or aggregate function. However, some SQL implementions,
including MySQL allow this to occur. They just fill in the first
dname they match (not guaranteed to be correct) */
SELECT dname, count(*)
FROM dept, works
WHERE dept.did=works.did
GROUP BY dept.did
HAVING COUNT(*) < 3;
/* incorrect as you want to group by depts, and dname is not PK for
depts */
SELECT dname, count(*)
FROM dept, works
WHERE dept.did=works.did
GROUP BY dept.dname
HAVING COUNT(*) < 3;
/* common work around, just add dname if doesn't affect grouping */
SELECT dname, count(*)
FROM dept, works
WHERE dept.did=works.did
GROUP BY dept.did, dept.dname
HAVING COUNT(*) < 3;
----------------------------------------------------------------------------
Chapter Example: Q37 Ramakrishnan p161.
"Find those ratings for which the average age of sailors is a
minimum over all ratings."
----------------------------------------------------------------------------
SELECT Temp.rating, Temp.avgage
FROM (
SELECT S.rating, AVG(S.age) AS avgage
FROM sailors S
GROUP BY S.rating) AS Temp
WHERE Temp.avgage = (
SELECT MIN(Temp.avgage)
FROM Temp)
;
This doesn't work in MySQL because the last select/from on Temp
doesn't know about Temp table; I believe because Temp is only known
to it's own nested subquery, and the containing nesting subquery
(outer SELECT), but not to other subqueries nested at the same level
as the Temp table defition. One can however do this using
nesting--see later example.
----------------------------------------------------------------------------
Q37: (almost) Working example using two temporary tables.
----------------------------------------------------------------------------
CREATE TEMPORARY TABLE IF NOT EXISTS temp
SELECT S.rating, AVG (S.age) AS avgage
FROM sailors S
Group BY S.rating;
CREATE TEMPORARY TABLE IF NOT EXISTS min
SELECT MIN(temp.avgage) as avgage
FROM temp;
SELECT temp.rating, min.avgage
FROM temp, min
WHERE temp.avgage = min.avgage;
EXCEPT....that it doesn't preclude NULL ratings from being included,
which probably wasn't the intent. So how do you exclude NULL
ratings?
----------------------------------------------------------------------------
Q37: Same as Last, but fixed to handle NULLs, CORRECT Temp Table example
----------------------------------------------------------------------------
CREATE TEMPORARY TABLE IF NOT EXISTS temp
SELECT S.rating, AVG (S.age) AS avgage
FROM sailors S
WHERE S.rating IS NOT NULL
Group BY S.rating;
CREATE TEMPORARY TABLE IF NOT EXISTS min
SELECT MIN(temp.avgage) as avgage
FROM temp;
SELECT temp.rating, min.avgage
FROM temp, min
WHERE temp.avgage = min.avgage;
CREATE TEMPORARY TABLE IF NOT EXISTS temp
SELECT S.rating, AVG (S.age) AS avgage
FROM sailors S
Group BY S.rating
HAVING S.rating IS NOT NULL;
CREATE TEMPORARY TABLE IF NOT EXISTS min
SELECT MIN(temp.avgage) as avgage
FROM temp;
SELECT temp.rating, min.avgage
FROM temp, min
WHERE temp.avgage = min.avgage;
+--------+----------+
| rating | avgage |
+--------+----------+
| 3 | 30.00000 |
+--------+----------+
1 row in set (0.00 sec)
----------------------------------------------------------------------------
Q37 Almost Working example using nesting (more elegant)
----------------------------------------------------------------------------
/* Close but not quite; not mistake in how HAVING MIN is used */
SELECT s1.rating, s1.avg_age
FROM (SELECT AVG(s.age) AS avg_age, s.rating
FROM sailors s
WHERE s.rating IS NOT NULL
GROUP BY s.rating) AS s1
HAVING MIN(s1.avg_age);
Why doesn't this work? HAVING takes T/F condition. How can we
interpret MIN(s1.avg_age)? Is it true or false? It's not T/F.
MySQL parses is as false, so returns empty set. Also, we want MIN
across the groups. HAVING clauses evaluates WITHIN that group.
PS....can you use HAVING without GROUP BY? YES
/* returns because evaluates 1 as TRUE *?
SELECT rating
FROM sailors
HAVING (1);
/* returns empty set because evaluates 0 as FALSE *?
SELECT rating
FROM sailors
HAVING (0);
----------------------------------------------------------------------------
Q37 Another close to Working example using nesting (more elegant)
----------------------------------------------------------------------------
SELECT s1.rating, MIN(s1.avg_age)
FROM (SELECT AVG(s.age) AS avg_age, s.rating
FROM sailors s
WHERE s.rating IS NOT NULL
GROUP BY s.rating) AS s1 ;
Close, but problem is that MIN can report aggreate but can't associate with
proper s1.rating level. (same problem we saw earlier with MAX().
So this is messy two step process because we need to calculate group
by average, then go back through this list to ge the minimum value
of average age. Nice short cut to this is to do group by, then add
order by and asc (if not large file).
Some other almost working examples:
This one does in one step nicely through ORDER BY and ASC, but only
captures ONE result when there many be more than one, so not totally
correct.
select rating,avg(age)
from sailors
where rating is not null
group by rating
order by avg(age) asc limit 1;