Re: DISTINCT ON

Lists: pgsql-hackers
From: Emmanuel Cecchet <manu(at)asterdata(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: DISTINCT ON
Date: 2009-11-04 03:17:06
Message-ID: 4AF0F232.7020405@asterdata.com

Hi all,

It looks like Postgres has a restriction on DISTINCT ON queries: the DISTINCT ON expressions must match the leftmost expressions of the ORDER BY list. The issue is that when a DISTINCT ON list contains multiple instances of the same expression, this check doesn't seem to fire correctly.

For example, this query returns an error (but I guess it shouldn't):

SELECT DISTINCT ON ('1'::varchar, '1'::varchar) a FROM (SELECT 1 AS a) AS a ORDER BY '1'::varchar, '1'::varchar, '2'::varchar;

And this query doesn't return an error (but I guess it should):

SELECT DISTINCT ON ('1'::varchar, '2'::varchar, '1'::varchar) a FROM (SELECT 1 AS a) AS a ORDER BY '1'::varchar, '2'::varchar, '2'::varchar;
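
For contrast, the restriction itself is easier to see with ordinary columns than with repeated constants. The following is only a minimal sketch: the temporary table t and its columns are hypothetical and not part of the report.

```sql
-- Hypothetical table, for illustration only.
CREATE TEMP TABLE t (a int, b int, c int);

-- Accepted: the DISTINCT ON expressions (a, b) are the leftmost ORDER BY items.
SELECT DISTINCT ON (a, b) a, b, c FROM t ORDER BY a, b, c;

-- Rejected: ORDER BY starts with c, which is not a DISTINCT ON expression.
-- ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY expressions
SELECT DISTINCT ON (a, b) a, b, c FROM t ORDER BY c, a, b;
```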

Am I misunderstanding something or is there a bug?

Thanks for the help
Emmanuel

--
Emmanuel Cecchet
Aster Data
Web: http://www.asterdata.com


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Emmanuel Cecchet <manu(at)asterdata(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: DISTINCT ON
Date: 2009-11-04 04:06:53
Message-ID: 407d949e0911032006w21b1654dqc75ba3e0e462a496@mail.gmail.com

On Wed, Nov 4, 2009 at 3:17 AM, Emmanuel Cecchet <manu(at)asterdata(dot)com> wrote:

> For example, this query returns an error (but I guess it shouldn't):
>
> SELECT DISTINCT ON ('1'::varchar,  '1'::varchar) a FROM (SELECT 1 AS a) AS a
> ORDER BY '1'::varchar, '1'::varchar, '2'::varchar;

This sounds familiar. What version of Postgres are you testing this on?

--
greg


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Emmanuel Cecchet <manu(at)asterdata(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: DISTINCT ON
Date: 2009-11-04 04:36:28
Message-ID: A673D168-A7AF-4F79-96EC-698F7BB62C7E@gmail.com

On Nov 3, 2009, at 10:17 PM, Emmanuel Cecchet <manu(at)asterdata(dot)com>
wrote:

> Hi all,
>
> It looks like Postgres has a restriction in DISTINCT ON queries
> where the DISTINCT ON expressions must match the left side of the
> ORDER BY list. The issue is that if a DISTINCT ON ... has multiple
> instances of a particular expression, this check doesn't seem to
> fire correctly.
>
> For example, this query returns an error (but I guess it shouldn't):
>
> SELECT DISTINCT ON ('1'::varchar, '1'::varchar) a FROM (SELECT 1 AS
> a) AS a ORDER BY '1'::varchar, '1'::varchar, '2'::varchar;
>
> And this query doesn't return an error (but I guess it should):
>
> SELECT DISTINCT ON ('1'::varchar, '2'::varchar, '1'::varchar) a FROM
> (SELECT 1 AS a) AS a ORDER BY '1'::varchar, '2'::varchar,
> '2'::varchar;
>
>
> Am I misunderstanding something or is there a bug?

I'm guessing this is the result of some subtly flaky equivalence
class handling. At first glance ISTM that discarding duplicates is
legit and therefore both examples ought to work...
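
To make that reasoning concrete: if duplicate expressions are discarded from both lists, the two reported queries would reduce to roughly the forms below. This is a sketch of the argument only, not output from any server, and whether the planner treats them exactly this way is an assumption.

```sql
-- First query with duplicates dropped: DISTINCT ON ('1') against ORDER BY '1', '2'.
SELECT DISTINCT ON ('1'::varchar) a
FROM (SELECT 1 AS a) AS a
ORDER BY '1'::varchar, '2'::varchar;

-- Second query with duplicates dropped: DISTINCT ON ('1', '2') against ORDER BY '1', '2'.
SELECT DISTINCT ON ('1'::varchar, '2'::varchar) a
FROM (SELECT 1 AS a) AS a
ORDER BY '1'::varchar, '2'::varchar;
```

In both reduced forms the DISTINCT ON expressions are exactly the leading ORDER BY expressions, which is why both originals would be expected to pass the check.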

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Emmanuel Cecchet <manu(at)asterdata(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: DISTINCT ON
Date: 2009-11-04 05:56:02
Message-ID: 9132.1257314162@sss.pgh.pa.us

Greg Stark <gsstark(at)mit(dot)edu> writes:
> On Wed, Nov 4, 2009 at 3:17 AM, Emmanuel Cecchet <manu(at)asterdata(dot)com> wrote:
>> SELECT DISTINCT ON ('1'::varchar, '1'::varchar) a FROM (SELECT 1 AS a) AS a
>> ORDER BY '1'::varchar, '1'::varchar, '2'::varchar;

> This sounds familiar. What version of Postgres are you testing this on?

Presumably something before this bug
http://archives.postgresql.org/pgsql-sql/2008-07/msg00123.php
got fixed
http://archives.postgresql.org/pgsql-committers/2008-07/msg00341.php

regards, tom lane


From: Emmanuel Cecchet <manu(at)asterdata(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Emmanuel Cecchet <manu(at)asterdata(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: DISTINCT ON
Date: 2009-11-04 13:41:35
Message-ID: 4AF1848F.8010602@asterdata.com

Tom Lane wrote:
> Greg Stark <gsstark(at)mit(dot)edu> writes:
>
>> On Wed, Nov 4, 2009 at 3:17 AM, Emmanuel Cecchet <manu(at)asterdata(dot)com> wrote:
>>
>>> SELECT DISTINCT ON ('1'::varchar, '1'::varchar) a FROM (SELECT 1 AS a) AS a
>>> ORDER BY '1'::varchar, '1'::varchar, '2'::varchar;
>>>
>
>
>> This sounds familiar. What version of Postgres are you testing this on?
>>
>
> Presumably something before this bug
> http://archives.postgresql.org/pgsql-sql/2008-07/msg00123.php
> got fixed
> http://archives.postgresql.org/pgsql-committers/2008-07/msg00341.php
>
I am using 8.3.6 and it looks like the fix was only integrated in 8.4.
So using 8.4 should solve the problem.
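
For anyone wanting to check which behavior a given server has, a minimal sketch (assuming you can run ad hoc queries against it) would be:

```sql
-- Report the server version. Per this thread, 8.3.6 shows the problem
-- and the fix is believed to be in 8.4.
SELECT version();

-- The first query from the original report, reformatted across lines.
-- It raised an error on 8.3.6; it is expected to run once the fix is present.
SELECT DISTINCT ON ('1'::varchar, '1'::varchar) a
FROM (SELECT 1 AS a) AS a
ORDER BY '1'::varchar, '1'::varchar, '2'::varchar;
```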

Thanks
Emmanuel

--
Emmanuel Cecchet
Aster Data
Web: http://www.asterdata.com