Re: Get more from indices.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: robertmhaas(at)gmail(dot)com
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Get more from indices.
Date: 2013-11-15 06:57:16
Message-ID: 20131115.155716.184031283.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, I might made insufficient explanation. Surely it is
overdone for the example.

>> uniquetest=# create table t (a int, b int, c int, d text);
>> uniquetest=# create unique index i_t_pkey on t(a, b);
>> uniquetest=# insert into t
>> (select a % 10, a / 10, a, 't' from generate_series(0, 100000) a);
>> uniquetest=# analyze;
>>
>> uniquetest=# explain (costs off, analyze on) select distinct * from t;

This is the simplest example for this patch.

The main intention of this patch is to widen application of my
another 'Index for UNION' patch.

https://commitfest.postgresql.org/action/patch_view?id=1279

> ISTM the right way to deal with this is not what you've done
> here, but just to deem the DISTINCT a no-op if there's a unique
> index on some subset of the distinct-ed columns.

It is not sufficient only to deem DISTINCT no-op in order to get
more utility of MergeAppends on IndexScans for UNION. The
following example (omitting data insertion..)

| create table cu11 (a int not null, b int not null, c int, d text);
| create unique index i_cu11_pk on cu11 (a, b);
| create table cu12 (like cu11 including all);
| explain select a, b from cu11 union select a, b from cu12
| order by a, b limit 100;

yields following plan without this patch.

| Limit
| -> Sort (Sort Key: cu11.a, cu11.b)
| -> HashAggregate
| -> Append
| -> Limit
| -> Unique
| -> Index Only Scan using cu11_a_b_idx on cu11
| -> Limit
| -> Unique
| -> Index Only Scan using cu12_a_b_idx on cu12

But, this could be far more effecient when the plan were as
follows if the rows are fully-ordered on the sort key.

| Limit
| -> Unique
| -> Merge Append (Sort Key: cu11.a, cu11.b)
| -> Index Only Scan using cu11_a_b_idx on cu11
| -> Index Only Scan using cu12_a_b_idx on cu12

Propagation of uniqueness on MergeAppend is required to do this
but not for the first expample on this thread.

> This query is actually legally satisfied by a simple seqscan,
> which would be faster than either of the plans you mention. In
> any case, it seems like a bad idea to me to conflate
> distinct-ness with ordering, so I don't like what you did to
> PathKeys.

Umm. Reconsidering on that and the discussion of the purose above
and the last patch of this thread, I've been wrong in where to
split the patches. I'll repost the reorganized patches on both
threads.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2013-11-15 06:58:47 Re: Using indices for UNION.
Previous Message David Rowley 2013-11-15 06:54:25 Re: Improve code in tidbitmap.c