Re: Get more from indices.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org, robertmhaas(at)gmail(dot)com
Subject: Re: Get more from indices.
Date: 2014-01-07 04:45:56
Message-ID: 20140107.134556.94242730.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

> Tom Lane wrote:
> > I started to look at this patch. I don't understand the reason for the
> > foreach loop in index_pathkeys_are_extensible (and the complete lack of
> > comments in the patch isn't helping). Isn't it sufficient to check that
> > the index is unique/immediate/allnotnull and its ordering is a prefix of
> > query_pathkeys? If not, what's the rationale for the specific tests being
> > made on the pathkeys --- this code doesn't make much sense to me.
>
> Thank you for taking time to look at this patch. I think it's not
> sufficient to check those things. Let me explain the reason why this patch
> has that code. The patch has that code in order to prevent
> build_join_pathkeys() from building incorrect join pathkeys', where the
> pathkeys for a join relation constructed by mergejoin or nestloop join are
> built normally just by using the outer path's pathkeys. Without that code,
> the patch would produce an incorrect result for such a case. An example
> will be shown below.

> A simple approach to avoid this problem would be to
> apply this idea only when each pathkey in query_pathkeys references the
> indexed relation in addition to that the index is
> unique/immediate/allnotnull and its ordering is a prefix of query_pathkeys.
> That's the reason.

Utterly disregarding the chances of joins - the patch (v7)
already does so in some extent, ignoring the possibility of
partial extension for multi-table'd pathkeys - it is also
avoidable by simply passing a boolean
'extend_pathkeys_if_possible', or splitting into two functions
regarding the boolean. The check was not a yes-or-no decision but
a how-long-it-can-be-extended measuring in the previous version
(pathkey_and_uniqueindx_v5). It has been simplified and splitted
out as individual function after.

> [Data]
> CREATE TABLE t (a int not null, b int not null, c int, d text);
> CREATE UNIQUE INDEX i_t_ab ON t (a, b);
> INSERT INTO t (SELECT a / 5, 4 - (a % 5), a, 't' FROM
> generate_series(000000, 099999) a);
> ANALYZE t;
> CREATE TABLE t2 (e text, f int);
> INSERT INTO t2 VALUES ('t', 2);
> INSERT INTO t2 VALUES ('t', 1);
> ANALYZE t2;
>
> [Query]
> EXPLAIN SELECT * FROM t, t2 WHERE t.a < 10 AND t.d = t2.e ORDER BY t.a, t.b,
> t.c, t.d, t2.f LIMIT 4;
> QUERY PLAN
> ----------------------------------------------------------------------------
> ----
> Limit (cost=0.29..3.96 rows=4 width=20)
> -> Nested Loop (cost=0.29..110.17 rows=120 width=20)
> Join Filter: (t.d = t2.e)
> -> Index Scan using i_t_ab on t (cost=0.29..107.34 rows=60
> width=14)
> Index Cond: (a < 10)
> -> Materialize (cost=0.00..1.03 rows=2 width=6)
> -> Seq Scan on t2 (cost=0.00..1.02 rows=2 width=6)
> (7 rows)
>
> SELECT * FROM t, t2 WHERE t.a < 10 AND t.d = t2.e ORDER BY t.a, t.b, t.c,
> t.d, t2.f LIMIT 4;
> a | b | c | d | e | f
> ---+---+---+---+---+---
> 0 | 0 | 4 | t | t | 2
> 0 | 0 | 4 | t | t | 1
> 0 | 1 | 3 | t | t | 2
> 0 | 1 | 3 | t | t | 1
> (4 rows)
>
> (Note the column f is sorted in the descending order.)
>
> Sorry for the delay.
>
> Best regards,
> Etsuro Fujita

With best wishes for a happy New Yaar.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-01-07 04:49:42 Re: Re: Fixing bug #8228 ("set-valued function called in context that cannot accept a set")
Previous Message Tom Lane 2014-01-07 04:40:53 Re: [COMMITTERS] pgsql: Upgrade to Autoconf 2.69