Re: "for SELECT DISTINCT, ORDER BY expressions must appear in select list" - is that the standart or a limitation of postgresql?

Lists: pgsql-sql
From: "Walter Cruz" <walter(dot)php(at)gmail(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: "for SELECT DISTINCT, ORDER BY expressions must appear in select list" - is that the standart or a limitation of postgresql?
Date: 2007-02-15 11:19:28
Message-ID: 32cabba0702150319v427d5a25s9a6c7f851576baa2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

CREATE TABLE test
(
id int4 NOT NULL DEFAULT nextval('teste_id_seq'::regclass),
name varchar,
number int4
)
WITHOUT OIDS;

The data:

1;"walter";1
2;"walter";1
3;"walter";1
4;"walter";1
5;"walter";2
6;"walter";3
7;"rodrigo";1
8;"rodrigo";2
9;"rodrigo";3

The query:

SELECT distinct name from test order by number

(well, I think that que query doesn't make any sense, but raises the error :) )

The error: ERROR: for SELECT DISTINCT, ORDER BY expressions must
appear in select list is due to a standart implementarion or a design
decision of postgres?

The comentary on parse_clause.c looks like the second option. I'm right?

[]'s
- Walter


From: Richard Huxton <dev(at)archonet(dot)com>
To: Walter Cruz <walter(dot)php(at)gmail(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: "for SELECT DISTINCT, ORDER BY expressions must appear in select list" - is that the standart or a limitation of postgresql?
Date: 2007-02-15 13:35:18
Message-ID: 45D46196.2090003@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Walter Cruz wrote:
>
> SELECT distinct name from test order by number
>
> (well, I think that que query doesn't make any sense, but raises the
> error :) )
>
> The error: ERROR: for SELECT DISTINCT, ORDER BY expressions must
> appear in select list is due to a standart implementarion or a design
> decision of postgres?

I think ORDER BY is defined to take place after DISTINCT, which means
there is no meaningful "number" for it to order by. You could
arbitrarily choose the first number encountered, but I can't see what
sense it would make to order by them.

--
Richard Huxton
Archonet Ltd


From: Michael Glaesemann <grzm(at)seespotcode(dot)net>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: Walter Cruz <walter(dot)php(at)gmail(dot)com>, pgsql-sql(at)postgresql(dot)org
Subject: Re: "for SELECT DISTINCT, ORDER BY expressions must appear in select list" - is that the standart or a limitation of postgresql?
Date: 2007-02-15 14:04:04
Message-ID: 57FD2B24-DB43-4EF7-92D1-3C467641AF9D@seespotcode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql


On Feb 15, 2007, at 22:35 , Richard Huxton wrote:

> Walter Cruz wrote:
>> SELECT distinct name from test order by number
>> (well, I think that que query doesn't make any sense, but raises
>> the error :) )
>> The error: ERROR: for SELECT DISTINCT, ORDER BY expressions must
>> appear in select list is due to a standart implementarion or a design
>> decision of postgres?
>
> I think ORDER BY is defined to take place after DISTINCT, which
> means there is no meaningful "number" for it to order by. You could
> arbitrarily choose the first number encountered, but I can't see
> what sense it would make to order by them.

I believe the reason is that DISTINCT depends on a sort to determine
uniqueness (distinctness), so it's a implementation detail that if
you're going to include an ORDER BY, you also need to include the
same columns in the ORDER BY in the DISTINCT clause. Though I suspect
Richard is right that ORDER BY takes place after DISTINCT. (My
cursory attempt at parsing the SQL 2003 draft failed me.)

On further thought, I bet

SELECT DISTINCT name
FROM test
ORDER BY name, number

fails with a different error, one directly supporting Richard's
conclusion.

Michael Glaesemann
grzm seespotcode net


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Glaesemann <grzm(at)seespotcode(dot)net>
Cc: Richard Huxton <dev(at)archonet(dot)com>, Walter Cruz <walter(dot)php(at)gmail(dot)com>, pgsql-sql(at)postgresql(dot)org
Subject: Re: "for SELECT DISTINCT, ORDER BY expressions must appear in select list" - is that the standart or a limitation of postgresql?
Date: 2007-02-15 17:10:17
Message-ID: 27009.1171559417@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Michael Glaesemann <grzm(at)seespotcode(dot)net> writes:
> On Feb 15, 2007, at 22:35 , Richard Huxton wrote:
>> Walter Cruz wrote:
>>> The error: ERROR: for SELECT DISTINCT, ORDER BY expressions must
>>> appear in select list is due to a standart implementarion or a design
>>> decision of postgres?
>>
>> I think ORDER BY is defined to take place after DISTINCT, which
>> means there is no meaningful "number" for it to order by. You could
>> arbitrarily choose the first number encountered, but I can't see
>> what sense it would make to order by them.

> I believe the reason is that DISTINCT depends on a sort to determine
> uniqueness (distinctness), so it's a implementation detail that if
> you're going to include an ORDER BY, you also need to include the
> same columns in the ORDER BY in the DISTINCT clause.

No, there's actually a definitional reason for it. Consider

SELECT DISTINCT x FROM tab ORDER BY y;

For any particular x-value in the table there might be many different y
values. Which one will you use to sort that x-value in the output?

Back in SQL92 they avoided this problem by specifying that ORDER BY
entries had to reference output columns. SQL99 has some messy verbiage
that I think comes out at the same place as our restriction:

A) If K(i) is not equivalent to a <value expression>
immediately contained in any <derived column> in the
<select list> SL of <query specification> QS contained
in QE, then:

I) T shall not be a grouped table.

II) QS shall not specify the <set quantifier> DISTINCT
or directly contain one or more <set function
specification>s.

regards, tom lane


From: "Walter Cruz" <walter(dot)php(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Michael Glaesemann" <grzm(at)seespotcode(dot)net>, "Richard Huxton" <dev(at)archonet(dot)com>, pgsql-sql(at)postgresql(dot)org
Subject: Re: "for SELECT DISTINCT, ORDER BY expressions must appear in select list" - is that the standart or a limitation of postgresql?
Date: 2007-02-15 22:59:58
Message-ID: 32cabba0702151459m4829c489y67b40f53e27134a1@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-sql

Thanks Tom, Thank all :)

Maybe the commentary on parse_clase.c (beggining with "Now, DISTINCT
list consists of all non-resjunk") needs to be updated - In the
comment, looks likes this is a postgresql limitation.

[]'s
- Walter

On 2/15/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Michael Glaesemann <grzm(at)seespotcode(dot)net> writes:
> > On Feb 15, 2007, at 22:35 , Richard Huxton wrote:
> >> Walter Cruz wrote:
> >>> The error: ERROR: for SELECT DISTINCT, ORDER BY expressions must
> >>> appear in select list is due to a standart implementarion or a design
> >>> decision of postgres?
> >>
> >> I think ORDER BY is defined to take place after DISTINCT, which
> >> means there is no meaningful "number" for it to order by. You could
> >> arbitrarily choose the first number encountered, but I can't see
> >> what sense it would make to order by them.
>
> > I believe the reason is that DISTINCT depends on a sort to determine
> > uniqueness (distinctness), so it's a implementation detail that if
> > you're going to include an ORDER BY, you also need to include the
> > same columns in the ORDER BY in the DISTINCT clause.
>
> No, there's actually a definitional reason for it. Consider
>
> SELECT DISTINCT x FROM tab ORDER BY y;
>
> For any particular x-value in the table there might be many different y
> values. Which one will you use to sort that x-value in the output?
>
> Back in SQL92 they avoided this problem by specifying that ORDER BY
> entries had to reference output columns. SQL99 has some messy verbiage
> that I think comes out at the same place as our restriction:
>
> A) If K(i) is not equivalent to a <value expression>
> immediately contained in any <derived column> in the
> <select list> SL of <query specification> QS contained
> in QE, then:
>
> I) T shall not be a grouped table.
>
> II) QS shall not specify the <set quantifier> DISTINCT
> or directly contain one or more <set function
> specification>s.
>
>
> regards, tom lane
>