10 things in MySQL (that won’t work as expected)
#10. Searching for a NULL

1
2
3
| SELECT *FROM aWHERE a.column = NULL |
NULL is never equal to anything, even another NULL. This query won't return anything and in fact will be thrown out by the optimizer when building the plan.NULL values, use this instead:
1
2
3
| SELECT *FROM aWHERE a.column IS NULL |
#9. LEFT JOIN with additional conditions

1
2
3
4
5
6
| SELECT *FROM aLEFT JOIN bON b.a = a.idWHERE b.column = 'something' |
LEFT JOIN is like INNER JOIN except that it will return each record from a at least once, substituting missing fields from b with NULL values, if there are no actual matching records.WHERE condition, however, is evaluated after the LEFT JOIN so the query above checks column after it had been joined. And as we learned earlier, no NULL value can satisfy an equality condition, so the records from awithout corresponding record from b will unavoidably be filtered out.INNER JOIN, only less efficient.b.column = 'something' (while still returning all records from a), this condition should be moved into ON clause:
1
2
3
4
5
6
| SELECT *FROM aLEFT JOIN bON b.a = a.id AND b.column = 'something' |
#8. Less than a value but not a NULL

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
| SELECT *FROM bWHERE b.column < 'something' AND b.column IS NOT NULL[/sourcecode]This is actually not an error: this query is valid and will do what's intended. However, <code>IS NOT NULL</code> here is redundant.If <code>b.column</code> is a <code>NULL</code>, then <code>b.column < 'something'</code> will never be satisfied, since any comparison to <code>NULL</code> evaluates to a boolean <code>NULL</code> and does not pass the filter.It is interesting that this additional <code>NULL</code> check is never used for <q>greater than</q> queries (like in <code>b.column > 'something'</code>).This is because <code>NULL</code> go first in <code>ORDER BY</code> in <strong>MySQL</strong> and hence are incorrectly considered <q>less</q> than any other value by some people.This query can be simplified:SELECT *FROM bWHERE b.column < 'something'[/sourcecode]and will still never return a <code>NULL</code> in <code>b.column</code>.<h3 class="cracked">#7. Joining on NULL</h3><img src="https://explainextended.com/wp-content/uploads/2010/11/MG_3163-e1288839302867.jpg" alt="" title="Helicopter" width="700" height="467" class="aligncenter size-full wp-image-5105 noborder" />SELECT *FROM aJOIN bON a.column = b.column |
column is nullable in both tables, this query won't return a match of two NULLs for the reasons described above: no NULLs are equal.
1
2
3
4
5
| SELECT *FROM aJOIN bON a.column = b.column OR (a.column IS NULL AND b.column IS NULL) |
ref_or_null.#6. NOT IN with NULL values

1
2
3
4
5
6
7
| SELECT a.*FROM aWHERE a.column NOT IN ( SELECT column FROM b ) |
NULL in b.column. As with other predicates, both IN and NOT IN against NULL evaluate to NULL.NOT EXISTS:
1
2
3
4
5
6
7
8
| SELECT a.*FROM aWHERE NOT EXISTS ( SELECT NULL FROM b WHERE b.column = a.column ) |
IN, EXISTS always evaluates to either true or false.#5. Ordering random samples

1
2
3
4
5
| SELECT *FROM aORDER BY RAND(), columnLIMIT 10 |
column.ORDER BY orders the output lexicographically: that is, the records are only ordered on the second expression when the values of the first expression are equal.RAND() are, well, random. It's infeasible that the values of RAND() will match, so ordering on column after RAND() is quite useless.
1
2
3
4
5
6
7
8
9
10
| SELECT *FROM ( SELECT * FROM mytable ORDER BY RAND() LIMIT 10 ) qORDER BY column |
#4. Sampling arbitrary record from a group

column from each group (defined by grouper)
1
2
| SELECT DISTINCT(grouper), a.*FROM a |
DISTINCT is not a function, it's a part of SELECT clause. It applies to all columns in the SELECT list, and the parentheses here may just be omitted. This query may and will select the duplicates on grouper (if the values in at least one of the other columns differ).GROUP BY):
1
2
3
4
| SELECT a.*FROM aGROUP BY grouper |
index_union after GROUP BY). So it's not safe to rely on this behavior.PRIMARY KEY defined:
1
2
3
4
5
6
7
8
9
10
11
12
13
| SELECT a.*FROM ( SELECT DISTINCT grouper FROM a ) aoJOIN aON a.id = ( SELECT id FROM a ai WHERE ai.grouper = ao.grouper LIMIT 1 ) |
#3. Sampling first record from a group

1
2
3
4
5
6
| SELECT a.*FROM aGROUP BY grouperORDER BY MIN(id) DESC |
id.a.* will belong to a record holding MIN(id) (or even to a single record at all).
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
| SELECT a.*FROM ( SELECT DISTINCT grouper FROM a ) aoJOIN aON a.id = ( SELECT id FROM a ai WHERE ai.grouper = ao.grouper ORDER BY ai.grouper, ai.id LIMIT 1 ) |
ORDER BY added to ensure that the first record in id order will be returned.#2. IN and comma-separated list of values

column against any of those provided in a comma-separated string:
1
2
3
| SELECT *FROM aWHERE column IN ('1, 2, 3') |
IN list.column is a VARCHAR, it is compared (as a string) to the whole list (also as a string), and of course will never match. If column is of a numeric type, the list is cast into the numeric type as well (and only the first item will match, at best).IN list
1
2
3
| SELECT *FROM aWHERE column IN (1, 2, 3) |
1
2
3
4
5
6
7
8
9
10
| SELECT *FROM ( SELECT 1 AS id UNION ALL SELECT 2 AS id UNION ALL SELECT 3 AS id ) qJOIN aON a.column = q.id |
FIND_IN_SET:
1
2
3
| SELECT *FROM aWHERE FIND_IN_SET(column, '1,2,3') |
a.#1. LEFT JOIN with COUNT(*)

1
2
3
4
5
6
7
| SELECT a.id, COUNT(*)FROM aLEFT JOIN bON b.a = a.idGROUP BY a.id |
b for each record in a.COUNT(*) will never return a 0 in such a query. If there is no match for a certain record in a, the record will be still returned and counted.COUNT should be made to count only the actual records in b. Since COUNT(*), when called with an argument, ignores NULLs, we can pass b.a to it. As a join key, it can never be a null in an actual match, but will be if there were no match:
1
2
3
4
5
6
7
| SELECT a.id, COUNT(b.a)FROM aLEFT JOIN bON b.a = a.idGROUP BY a.id |
