Re: Question about sorting internals

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: depesz(at)depesz(dot)com
Cc: PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Question about sorting internals
Date: 2013-12-11 15:30:07
Message-ID: 15974.1386775807@sss.pgh.pa.us
Lists: pgsql-hackers

hubert depesz lubaczewski <depesz(at)depesz(dot)com> writes:
> There are two simple queries: ...
> They differ only in the order of the queries in the UNION ALL part.
> The thing is that they return the same result. Why isn't one of them returning
> "2005" for 6th "miesiac"?

With such a small amount of data, you're getting an in-memory quicksort,
and a well-known property of quicksort is that it isn't stable --- that
is, there are no guarantees about the order in which it will return items
that have equal keys. In this case it's evidently making different
partitioning choices, as a consequence of the different arrival order of
the rows, that just by chance end up with the 6/2004/6 row being returned
before the 6/2005/6 row in both cases. You could trace through the logic
and see exactly how that's happening, but I doubt it'd be a very edifying
exercise.
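
For anyone who does want a feel for the effect without tracing the real code, here is a toy sketch. It is a plain Lomuto-partition quicksort written for illustration, not PostgreSQL's sort implementation, and the (miesiac, rok) rows and the names part_a and part_b are made up to stand in for the two halves of the UNION ALL.

```python
# Toy illustration only: a plain Lomuto-partition quicksort, NOT PostgreSQL's
# actual sort code. Rows are hypothetical (miesiac, rok) pairs.

def quicksort_by_key(rows, key):
    """Return a copy of rows sorted by key(row); equal keys may be reordered."""
    rows = list(rows)

    def sort(lo, hi):
        if lo >= hi:
            return
        pivot = key(rows[hi])          # last element of the range is the pivot
        i = lo
        for j in range(lo, hi):
            if key(rows[j]) < pivot:   # only the sort key is ever compared
                rows[i], rows[j] = rows[j], rows[i]
                i += 1
        rows[i], rows[hi] = rows[hi], rows[i]
        sort(lo, i - 1)
        sort(i + 1, hi)

    sort(0, len(rows) - 1)
    return rows


def miesiac(row):
    return row[0]


# Hypothetical stand-ins for the two branches of the UNION ALL.
part_a = [(6, 2005), (5, 2004)]
part_b = [(6, 2004)]

first = quicksort_by_key(part_a + part_b, key=miesiac)
second = quicksort_by_key(part_b + part_a, key=miesiac)
print(first)
print(second)
```

Both calls print [(5, 2004), (6, 2004), (6, 2005)], even though the two rows with miesiac = 6 arrive in opposite relative order. In the first call the sort swaps the tied rows; in the second it happens to leave them as they came. Nothing in the comparison distinguishes them, so either outcome is legitimate.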

If you want to get well-defined results with DISTINCT ON, you should
make the ORDER BY sort by a candidate key. Anything less opens you to
uncertainty about which rows the DISTINCT will select.
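
A rough way to see why the candidate key matters, again as a sketch rather than anything PostgreSQL does internally: the helper distinct_on below is a hypothetical emulation (sort, then keep the first row of each group), and it assumes that (miesiac, rok) is unique in the made-up sample rows. Python's sorted() is stable, so ordering by miesiac alone makes the chosen row depend on arrival order, while ordering by the full key does not.

```python
from itertools import groupby

# Hypothetical emulation of DISTINCT ON: sort the rows, then keep the first
# row of each run that shares the same distinct_key. Not PostgreSQL code.
def distinct_on(rows, distinct_key, order_key):
    ordered = sorted(rows, key=order_key)   # sorted() is stable
    return [next(group) for _, group in groupby(ordered, key=distinct_key)]


def miesiac(row):
    return row[0]


def miesiac_rok(row):
    # Assumed candidate key for the sample rows: (miesiac, rok) is unique.
    return (row[0], row[1])


# Hypothetical stand-ins for the two branches of the UNION ALL.
part_a = [(6, 2005), (5, 2004)]
part_b = [(6, 2004)]

# ORDER BY miesiac only: ties are broken by arrival order.
loose_1 = distinct_on(part_a + part_b, miesiac, miesiac)
loose_2 = distinct_on(part_b + part_a, miesiac, miesiac)

# ORDER BY miesiac, rok: no ties are left, so arrival order is irrelevant.
strict_1 = distinct_on(part_a + part_b, miesiac, miesiac_rok)
strict_2 = distinct_on(part_b + part_a, miesiac, miesiac_rok)

print(loose_1, loose_2)
print(strict_1, strict_2)
```

Here loose_1 picks (6, 2005) and loose_2 picks (6, 2004) for miesiac = 6, whereas strict_1 and strict_2 both pick (6, 2004). With a non-stable sort the loose variant is not even predictable from arrival order, which is the situation described above.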

regards, tom lane
