Re: PoC: Duplicate Tuple Elidation during External Sort for DISTINCT

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PoC: Duplicate Tuple Elidation during External Sort for DISTINCT
Date: 2014-01-22 03:53:27
Message-ID: 26684.1390362807@sss.pgh.pa.us
Lists: pgsql-hackers

Jon Nelson <jnelson+pgsql(at)jamponi(dot)net> writes:
> A rough summary of the patch follows:

> - a GUC variable enables or disables this capability
> - in nodeAgg.c, eliding duplicate tuples is enabled if the number of
> distinct columns is equal to the number of sort columns (and both are
> greater than zero).
> - in createplan.c, eliding duplicate tuples is enabled if we are
> creating a unique plan which involves sorting first
> - ditto planner.c
> - all of the remaining changes are in tuplesort.c, and consist of:
>   + a new macro, DISCARDTUP, and a new structure member, discardtup, are
>     both defined and operate similarly to COMPARETUP, COPYTUP, etc...
>   + in puttuple_common, when state is TSS_BUILDRUNS, we *may* simply
>     throw out the new tuple if it compares as identical to the tuple at
>     the top of the heap. Since we're already performing this comparison,
>     this is essentially free.
>   + in mergeonerun, we may discard a tuple if it compares as identical
>     to the *last written tuple*. This is a comparison that did not take
>     place before, so it's not free, but it saves a write I/O.
>   + we perform the same logic in dumptuples
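
The merge-time step in the quoted summary can be sketched in isolation. This is a minimal toy, not the patch and not tuplesort.c code: ToyTuple, toy_compare and toy_merge_elide are hypothetical names, tuples are reduced to a single integer key, and the two runs are plain in-memory arrays rather than tapes. It only illustrates the idea of comparing each candidate against the last written tuple and skipping it on equality.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a sort tuple: just one integer key. */
typedef struct
{
    int key;
} ToyTuple;

/* Three-way comparison, loosely playing the role of COMPARETUP. */
static int
toy_compare(const ToyTuple *a, const ToyTuple *b)
{
    return (a->key > b->key) - (a->key < b->key);
}

/*
 * Merge two already-sorted runs into out, skipping any tuple that
 * compares equal to the last tuple written.  out must have room for
 * n1 + n2 tuples.  Returns the number of tuples actually written.
 */
static size_t
toy_merge_elide(const ToyTuple *r1, size_t n1,
                const ToyTuple *r2, size_t n2,
                ToyTuple *out)
{
    size_t i = 0, j = 0, nout = 0;

    assert(out != NULL);

    while (i < n1 || j < n2)
    {
        const ToyTuple *next;

        /* Ordinary two-way merge step: take the smaller head. */
        if (j >= n2 || (i < n1 && toy_compare(&r1[i], &r2[j]) <= 0))
            next = &r1[i++];
        else
            next = &r2[j++];

        /*
         * The extra comparison described above: it is not free, but a
         * match means one fewer tuple written to the output run.
         */
        if (nout > 0 && toy_compare(&out[nout - 1], next) == 0)
            continue;

        out[nout++] = *next;
    }
    return nout;
}
```

Because both inputs are sorted, equal tuples arrive adjacent to each other in the merged stream, so a single comparison against the previous output is enough to catch them in this toy.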

[ raised eyebrow ... ] And what happens if the planner drops the
unique step and then the sort doesn't actually go to disk?
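
One way to read that question, as a toy model only: every elision point listed in the summary sits on an external-sort path, so a sort that fits in memory would return its input with duplicates intact. In the sketch below, toy_sort, toy_unique and mem_limit are hypothetical, and the "spilled" path is assumed to remove all duplicates, which is stronger than the "may discard" wording in the summary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int
cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/*
 * Toy sort: sorts v in place and returns the number of surviving
 * elements.  Duplicates are elided only when the input "spills",
 * i.e. when n exceeds the hypothetical mem_limit.
 */
static size_t
toy_sort(int *v, size_t n, size_t mem_limit)
{
    size_t i, nout = 0;

    if (n == 0)
        return 0;
    assert(v != NULL);

    qsort(v, n, sizeof(int), cmp_int);

    if (n <= mem_limit)
        return n;               /* in-memory path: nothing is elided */

    for (i = 0; i < n; i++)     /* "external" path: drop adjacent equals */
        if (nout == 0 || v[nout - 1] != v[i])
            v[nout++] = v[i];
    return nout;
}

/* Separate unique step over sorted input, independent of the sort path. */
static size_t
toy_unique(int *v, size_t n)
{
    size_t i, nout = 0;

    for (i = 0; i < n; i++)
        if (nout == 0 || v[nout - 1] != v[i])
            v[nout++] = v[i];
    return nout;
}
```

In this model, output is duplicate-free on both paths only if toy_unique still runs after toy_sort; relying on the sort alone gives different results depending on whether the input spilled.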

regards, tom lane
