UNION ALL on partitioned tables won't use indices.

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: UNION ALL on partitioned tables won't use indices.
Date: 2013-10-24 10:39:53
Message-ID: 20131024.193953.233464126.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, Sorry that it's been a while..

1. Observed symptom

As you know, UNION ALL accompanied with ORDER BY uses indexes if
available.

> uniontest=# EXPLAIN SELECT * FROM c11 UNION ALL SELECT * FROM c12 ORDER BY a;
> QUERY PLAN
> ---------------------------------------------------------------------------
> Merge Append (cost=0.59..10214.10 rows=200000 width=16)
> Sort Key: c11.a
> -> Index Scan using ... on c11 (cost=0.29..3857.04 rows=100000 width=16)
> -> Index Scan using ... on c12 (cost=0.29..3857.04 rows=100000 width=16)

So do simple queries on partitioned (inheritance) tables.

> uniontest=# EXPLAIN SELECT * FROM p1 ORDER BY a;
> QUERY PLAN
> ------------------------------------------------------------------------------
> Merge Append (cost=0.73..11392.19 rows=200001 width=16)
> Sort Key: p1.a
> -> Index Scan using ... on p1 (cost=0.12..8.14 rows=1 width=44)
> -> Index Scan using ... on c11 (cost=0.29..3857.04 rows=100000 width=16)
> -> Index Scan using ... on c12 (cost=0.29..3857.04 rows=100000 width=16)

Nevertheless, UNION ALL on partitioned tables doesn't. This is
quite unfavourable behavior especially having LIMIT.

>uniontest=# EXPLAIN ANALYZE SELECT * FROM p1
> UNION ALL SELECT * FROM p2 ORDER BY a LIMIT 10;
> QUERY PLAN
>------------------------------------------------------------------------
> Limit (actual time=182.732..182.735 rows=10 loops=1)
> -> Sort (actual time=182.729..182.730 rows=10 loops=1)
> Sort Key: p1.a
> Sort Method: top-N heapsort Memory: 25kB
> -> Append (actual time=0.029..108.109 rows=400000 loops=1)
> -> Seq Scan on p1 (actual time=0.001..0.001 rows=0 loops=1)
> -> Seq Scan on c11 (actual time=0.027..19.074 rows=100000 loops=1)
> -> Seq Scan on c12 (actual time=0.014..16.653 rows=100000 loops=1)
> -> Seq Scan on p2 (actual time=0.000..0.000 rows=0 loops=1)
> -> Seq Scan on c21 (actual time=0.012..16.677 rows=100000 loops=1)
> -> Seq Scan on c22 (actual time=0.012..16.794 rows=100000 loops=1)
> Total runtime: 182.857 ms

2. The cause

In grouping_planner, flattern_simple_union_all creates
appendrelinfos for each subqueries then expand_inherited_tables
furthur expands the parent tables in each subqueries. Finally,
this sequence leaves following structure. Where rte[2] and [3]
are subqueries abandoned on the way pulling up rte[4] and [5].

rte[1] Subquery "SELECT*1", inh = 1
+- appinfo[0] - rte[4] Relation "p1", inh = 1
| +- appinfo[2] - rte[6] Relation "p1", inh = 0
| +- appinfo[3] - rte[7] Relation "c11", inh = 0
| +- appinfo[4] - rte[8] Relation "c12", inh = 0
+- appinfo[1] - rte[5] Relation "p2", inh = 1
+- appinfo[5] - rte[9] Relation "p1", inh = 0
+- appinfo[6] - rte[10] Relation "c11", inh = 0
+- appinfo[7] - rte[11] Relation "c12", inh = 0

On the other hand, ec member finally has exprs only for varno =
1, 4 and 5, in other words, it lacks the members for the most
descendant RTEs. This is because add_child_rel_equivalences()
always inhibits add_eq_member from registering the child_rel's
relid on EC. Conequently these grandchild relations does not find
index pathkeys for themselves.

3. Measures

I could thought up three approaches for the problem.

One is to simplly modifying here to give child flag in the
parameters of add_eq_member accordig to whether the child_rel is
inh or not. This seems to work fine although leaves two levels of
MergeAppends (PATCH-1). So the additional patch is attached to
collapse these MergeAppends (PATCH-2). This gives the same plan
as PATCH-3.

> uniontest=# explain analyze select * from p1 union all
> select * from p2 order by a limit 10;
> QUERY PLAN
> ------------------------------------------------------------------------
> Limit (actual time=0.208..0.224 rows=10 loops=1)
> -> Merge Append (actual time=0.205..0.218 rows=10 loops=1)
> Sort Key: p1.a
> -> Merge Append (actual time=0.110..0.120 rows=10 loops=1)
> Sort Key: p1.a
> -> Index Scan .. on p1 (actual time=0.006..0.006 rows=0 loops=1)
> -> Index Scan .. on c11 (actual time=0.054..0.060 rows=10 loops=1)
> -> Index Scan .. on c12 (actual time=0.046..0.046 rows=1 loops=1)
> -> Merge Append (actual time=0.093..0.093 rows=1 loops=1)
> Sort Key: p2.a
> -> Index Scan .. on p2 (actual time=0.002..0.002 rows=0 loops=1)
> -> Index Scan .. on c21 (actual time=0.047..0.047 rows=1 loops=1)
> -> Index Scan .. on c22 (actual time=0.043..0.043 rows=1 loops=1)
> Total runtime: 0.448 ms

The second is to collapse the appendrel structure shown above to
have only single level inheritance. This also seems to work
fine. This makes the plan with single level
MergeAppend. (PATCH-3)

> uniontest=# EXPLAIN ANALYZE SELECT * FROM p1 UNION ALL
> SELECT * FROM p2 ORDER BY a LIMIT 10;
> QUERY PLAN
> -------------------------------------------------------------------
> Limit (actual time=0.095..0.107 rows=10 loops=1)
> -> Merge Append (actual time=0.092..0.102 rows=10 loops=1)
> Sort Key: p1.a
> -> Index Scan ... on p1 (actual time=0.005..0.005 rows=0 loops=1)
> -> Index Scan ... on c11 (actual time=0.023..0.030 rows=10 loops=1)
> -> Index Scan ... on c12 (actual time=0.020..0.020 rows=1 lops=1)
> -> Index Scan ... on p2 (actual time=0.001..0.001 rows=0 loops=1)
> -> Index Scan ... on c21 (actual time=0.019..0.019 rows=1 loops=1)
> -> Index Scan ... on c22 (actual time=0.019..0.019 rows=1 loops=1)
> Total runtime: 0.241 ms

The last one is most strait-forward in some sense, and conversely
is most ad hoc measure. Modifying expand_inherited_rtentry() to
pull up adding appendrels if needed, using the similar manner to
PATCH-3. This is added as PATCH-4.

what do you think about this?

Four patches following are attached to this message.

1. union_inh_idx_typ1_v1_20131024.patch .. PATCH-1 described above.
2. union_inh_idx_typ1_add_v1_20131024.patch .. PATCH-2 described above.
3. union_inh_idx_typ2_v1_20131024.patch .. PATCH-3 described above.
4. union_inh_idx_typ3_v1_20131024.patch .. PATCH-4 described above.

regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
union_inh_idx_typ1_v1_20131024.patch text/x-patch 987 bytes
union_inh_idx_typ1_add_v1_20131024.patch text/x-patch 1.7 KB
union_inh_idx_typ2_v1_20131024.patch text/x-patch 6.1 KB
union_inh_idx_typ3_v1_20131024.patch text/x-patch 3.6 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thom Brown 2013-10-24 10:41:11 Re: 9.4 regression
Previous Message Haribabu kommi 2013-10-24 10:20:48 Ident context leak during reloading of conf files when no ident information is present in the file