Re: DISTINCT vs. GROUP BY

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Greg Stark <gsstark(at)mit(dot)edu>
Cc:	Neil Conway <neilc(at)samurai(dot)com>, Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: DISTINCT vs. GROUP BY
Date:	2005-09-19 19:50:47
Message-ID:	29125.1127159447@sss.pgh.pa.us
Lists:	pgsql-hackers

Greg Stark <gsstark(at)mit(dot)edu> writes:
> DISTINCT is really just a special case of GROUP BY. Even DISTINCT ON is just
> GROUP BY with a kind of "first()" aggregate function. What would be really
> neat would be to teach GROUP BY about first() and last() and how it can skip
> over some index entries and still satisfy the query. Then make DISTINCT and
> DISTINCT ON be handled through the exact same code path.

You've missed the point entirely.

first() is not a substitute for sorting the input; it is only useful
if the input comes pre-sorted. And if you are going to sort the input,
you might as well use the current implementation of DISTINCT ON and
skip the effort and memory-overflow-risk associated with a hashtable.
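
[Illustration added in editing, not part of the original message. A minimal Python sketch of the point above, assuming a hypothetical two-column input of (key, value) rows. The function names `hash_first` and `sort_then_first` are invented for this example and are not PostgreSQL internals. It suggests why a "first()" aggregate over hashed groups only matches DISTINCT ON when the input already arrives sorted.]

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical unsorted input: (key, value) rows.
rows = [("b", 3), ("a", 2), ("b", 1), ("a", 5)]


def hash_first(rows):
    """Hash-group on key and keep the first value seen for each key.

    The result depends on input order, and the table holds one entry
    per distinct key, so it grows with the number of groups.
    """
    seen = {}
    for key, value in rows:
        seen.setdefault(key, value)  # keep only the first value per key
    return seen


def sort_then_first(rows):
    """Sort on (key, value), then keep the first row of each run of equal keys.

    This is roughly the sort-and-unique approach: after sorting, only the
    previous key needs to be remembered, not a table of all keys.
    """
    ordered = sorted(rows)
    return {key: next(group)[1]
            for key, group in groupby(ordered, key=itemgetter(0))}


unsorted_result = hash_first(rows)          # "first" means arrival order
sorted_result = sort_then_first(rows)       # "first" means smallest value
presorted_result = hash_first(sorted(rows)) # agrees only once input is sorted
```

On unsorted input the two functions disagree for key "b"; they agree only when `hash_first` is fed sorted rows, at which point the hash table has added nothing the sort did not already provide.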

I do think hash aggregation is a plausible alternative implementation of
plain DISTINCT, but I don't see the case for using it for DISTINCT ON.

regards, tom lane

In response to

Re: DISTINCT vs. GROUP BY at 2005-09-19 15:45:10 from Greg Stark

Responses

Re: DISTINCT vs. GROUP BY at 2005-09-19 21:00:35 from Greg Stark
