Re: Get more from indices.

From: "Etsuro Fujita" <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: "'Kyotaro HORIGUCHI'" <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, <pgsql-hackers(at)postgresql(dot)org>
Cc: <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <robertmhaas(at)gmail(dot)com>
Subject: Re: Get more from indices.
Date: 2013-11-26 03:30:03
Message-ID: 004a01ceea57$cbab0440$63010cc0$@etsuro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Kyotaro HORIGUCHI wrote:
> the attched pathkey_and_uniqueindx_v4_20131122.patch is changed as
> follows.

The patch is compiled successfully and passes all regression tests. Also,
the patch works well for the tests shown in an earlier email from
Horiguchi-san. But in this version I found an incorrect behavior.

postgres=# CREATE TABLE t (a int not null, b int not null, c int, d text);
CREATE TABLE
postgres=# CREATE UNIQUE INDEX i_t_ab ON t (a, b);
CREATE INDEX
postgres=# INSERT INTO t (SELECT a / 5, 4 - (a % 5), a, 't' FROM
generate_series(000000, 099999) a);
INSERT 0 100000
postgres=# ANALYZE t;
ANALYZE
postgres=# CREATE TABLE t2 (e text, f int);
CREATE TABLE
postgres=# INSERT INTO t2 VALUES ('t', 2);
INSERT 0 1
postgres=# INSERT INTO t2 VALUES ('t', 1);
INSERT 0 1
postgres=# ANALYZE t2;
ANALYZE
postgres=# EXPLAIN SELECT * FROM t, t2 WHERE t.a < 10 AND t.d = t2.e ORDER
BY t.a, t.b, t.c, t.d, t2.f LIMIT 10;
QUERY PLAN
----------------------------------------------------------------------------
----
Limit (cost=0.29..9.30 rows=10 width=20)
-> Nested Loop (cost=0.29..129.99 rows=144 width=20)
Join Filter: (t.d = t2.e)
-> Index Scan using i_t_ab on t (cost=0.29..126.80 rows=72
width=14)
Index Cond: (a < 10)
-> Materialize (cost=0.00..1.03 rows=2 width=6)
-> Seq Scan on t2 (cost=0.00..1.02 rows=2 width=6)
(7 rows)

postgres=# SELECT * FROM t, t2 WHERE t.a < 10 AND t.d = t2.e ORDER BY t.a,
t.b, t.c, t.d, t2.f LIMIT 10;
a | b | c | d | e | f
---+---+---+---+---+---
0 | 0 | 4 | t | t | 2
0 | 0 | 4 | t | t | 1
0 | 1 | 3 | t | t | 2
0 | 1 | 3 | t | t | 1
0 | 2 | 2 | t | t | 2
0 | 2 | 2 | t | t | 1
0 | 3 | 1 | t | t | 2
0 | 3 | 1 | t | t | 1
0 | 4 | 0 | t | t | 2
0 | 4 | 0 | t | t | 1
(10 rows)

(Note the column f is sorted in the descending order.)

ISTM this was brought by the following change.

> In truncate_useless_pathkeys, if query_pathkeys is longer than pathkeys
> made from index columns old patch picked up the latter as IndexPath's
> pathkeys. But the former is more suitable according to the context here.

> - truncate_useless_pathkeys returns root->query_pathkeys when
> the index is fully-ordered and query_pathkeys contains the
> pathkeys made from index columns.

I think it would be required to modify the patch so that the transformation
of the pathkeys is performed only when each pathkey of query_pathkeys
references the indexed relation. (The above change might have been made
according to my comment in an earlier email I sent. But I have to admit my
explanation there was not adequate. I'm sorry for that.)

Here are random comments.

* In grouping_planner(), the patch resets the pathkeys of the cheapest
presorted index-path to query_pathkeys through
get_cheapest_fractional_path_for_pathkeys(). Is this necessary? ISTM the
transformation of the pathkeys after the scan/join optimization would be no
longer necessary once it has been done at the index-path creation time, ie,
build_index_paths(). Why does the patch do that thing?

* In get_relation_info(), the patch determines the branch condition
"keyattno != ObjectIdAttributeNumber". I fail to understand the meaning of
this branch condition. Could you explain about it?

Thanks,

Best regards,
Etsuro Fujita

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Rodolfo Campero 2013-11-26 03:46:48 Re: PL/Python: domain over array support
Previous Message Bruce Momjian 2013-11-26 03:12:43 Re: Suggestion: Issue warning when calling SET TRANSACTION outside transaction block