SQL & how different databases implement it

Some vaguely structured thoughts on the differences that you see between SQL as implemented by one DBMS and that of another.

05 November 2007

 

ORDER BY clause in statements with a UNION

Imagine you write a SELECT statement with a UNION or UNION ALL in it with each subselect having a GROUP BY clause, then add an ORDER BY clause to present the results set in a particular order. How do you construct the ORDER BY clause so that the data is presented in order of the group function or aggregate function?

Let me take this one step at a time.

Take a simple SELECT statement with a GROUP BY and an ORDER BY:

SELECT a_status, COUNT(*) AS cnt
FROM device
GROUP BY a_status
ORDER BY cnt

That is pretty straightforward. You could also write the ORDER BY as:
ORDER BY 2
or even:
ORDER BY COUNT(*)
and the statement will work as expected.

Now let's add a UNION ALL:

SELECT a_status, COUNT(*) AS cnt
FROM device
GROUP BY a_status
UNION ALL
SELECT b_status, COUNT(*) AS cnt_b
FROM device
GROUP BY b_status
ORDER BY cnt

This works fine. But if you change the ORDER BY clause in the previous statement so that it reads:
ORDER BY cnt_b
both Oracle and SQL Server will give error messages. In Oracle:
SQL Error: ORA-00904: "CNT_B": invalid identifier
In SQL Server:
Server: Msg 207, Level 16, State 3, Line 1
Invalid column name 'cnt_b'.
Msg 104, Level 16, State 1, Line 8
ORDER BY items must appear in the select list if the statement contains a UNION, INTERSECT or EXCEPT operator.

Now change the ORDER BY clause to read:
ORDER BY COUNT(*)

In SQL Server, the statement will work just fine. In Oracle the statement will give an error message.
SQL Error: ORA-01785: ORDER BY item must be the number of a SELECT-list expression

Is this an important difference in behaviour? Almost certainly not. It has never been advisable or correct to specify an aggregate function in an ORDER BY clause. The developer should have given the aggregate function the same name in both select lists of the UNION (i.e. both COUNT(*)'s should have been called cnt). Or they should have specified the name of the select list item taken from the first part of the UNION. Or alternatively the numeric method of specifying an ORDER BY could have been used.

Comments:
Does the SQL standard specify default ordering of union (all) statements?

SELECT 1 AS foo
UNION
SELECT 2 AS foo

Should return the first results then the second results ie
1
2

Postgres returns them backward ie
2
1
I am guessing the uniqueness of UNION is causing this.

However this fixes changing it to a UNION ALL on Postgres seems to fix the issue.
SELECT 1 AS foo
UNION ALL
SELECT 2 AS foo

Shouldn't the SQL specification specify the order of results of queries aggergated with a UNION (All) statement, if it does I couldn't find where it specifys.
 
SQL explicitly does not guarantee the order of results unless you have an ORDER BY clause.

The fact that some RDBMS give you predictable results without an order by clause (like the ones you have provided) does not give you freedom to assume this will always be the case.

It is interesting that Postgres returns results in reverse order when executing a UNION - you are right this is to do with sorting the results to guarantee uniqueness, whereas the UNION ALL does not need this sort and the results come out in the order you have written them.
 
Post a Comment

Subscribe to Post Comments [Atom]





<< Home

Links

Archives

September 2005   October 2005   December 2005   January 2006   March 2006   May 2006   June 2006   September 2006   May 2007   June 2007   November 2007   December 2007   March 2008   April 2008   December 2008   January 2011  

This page is powered by Blogger. Isn't yours?

Subscribe to Posts [Atom]