Tuesday, March 24, 2009

MySQL: WHERE versus HAVING

Using the WHERE clause to restrict a result set is familiar to anyone who regularly uses SQL, but the use of the HAVING clause may not be. So what's the difference between the two? The WHERE clause is used to restrict records, and is also used by the query optimizer to determine which indexes and tables to use; only actual fields represented in the query table(s) can be utilized. The HAVING clause is a filter on the final result set, and is applied after ORDER BY and GROUP BY. So MySQL cannot use it to optimize the query, but it can filter the results based on a calculated or aggregate value obtained with a GROUP BY clause or similar means. As a rule, use WHERE when you can in order to utilize a table's indexes and use HAVING to restrict a result set otherwise. For instance, take the following query which demonstrates a common mistake in MySQL:

SELECT sum(salary) AS amount 
FROM employees
GROUP BY department
WHERE amount > 500000;


The idea with this query is to find departments at a given company where the total salary expenditure is greater than $500,000. However, it won't work because the 'amount' field is a calculated value and cannot be used by the WHERE clause. You'll get an error like this: Unknown column 'amount' in 'where clause'. The correct way to write this query is by using HAVING as follows:

SELECT sum(salary) AS amount 
FROM employees
GROUP BY department
HAVING amount > 500000;


This time, MySQL will calculate all the 'amount' values then use the HAVING clause to filter out any records where the amount is <= 500000.

Here we use WHERE and HAVING in combination:

SELECT sum(salary) AS amount 
FROM employees
WHERE salary < 50000
GROUP BY department
HAVING amount > 500000;


In this query we find departments at a given company where the total salary expenditure is greater than $500,000 but only including employees who make less than $50,000 a year.

No comments:

Post a Comment