Re: Warts with SELECT DISTINCT

From: Bruno Wolff III <bruno(at)wolff(dot)to>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Warts with SELECT DISTINCT
Date: 2006-05-04 03:10:57
Message-ID: 20060504031057.GA30219@wolff.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, May 03, 2006 at 17:58:07 -0400,
Greg Stark <gsstark(at)mit(dot)edu> wrote:
>
> Though it's optimized poorly and does a superfluous sort step:
>
> stark=> explain select col1 from test group by col1 order by upper(col1);
> QUERY PLAN
> ---------------------------------------------------------------------------
> Sort (cost=99.72..100.22 rows=200 width=32)
> Sort Key: upper(col1)
> -> Group (cost=85.43..92.08 rows=200 width=32)
> -> Sort (cost=85.43..88.50 rows=1230 width=32)
> Sort Key: col1
> -> Seq Scan on test (cost=0.00..22.30 rows=1230 width=32)
> (6 rows)
>
>
> Whereas it shouldn't be hard to prove that this is equivalent:
>
> stark=> explain select col1 from test group by upper(col1),col1 order by upper(col1);
> QUERY PLAN
> ---------------------------------------------------------------------
> Group (cost=88.50..98.23 rows=200 width=32)
> -> Sort (cost=88.50..91.58 rows=1230 width=32)
> Sort Key: upper(col1), col1
> -> Seq Scan on test (cost=0.00..25.38 rows=1230 width=32)
> (4 rows)

I don't think you can assume that that will be true for any locale. If there
are two different characters that both have the same uppercase version, this
will break things.

And while you would expect that x = y => upper(x) = upper(y) I am not sure
that is guarenteed for locales. I can imagine having two different characters
that are treated the same for ordering purposes, but have uppercase versions
that are considered different for ordering purposes.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Stark 2006-05-04 04:05:16 Re: Warts with SELECT DISTINCT
Previous Message Christopher Kings-Lynne 2006-05-04 01:53:08 Re: Rethinking locking for database create/drop vs connection