Re: UNION ALL on partitioned tables won't use indices.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UNION ALL on partitioned tables won't use indices.
Date: 2013-10-25 13:28:23
Message-ID: CA+Tgmob1PVQK16JbJOjeHZhqq0vsBpwkNSkD+Up+X+m9YByCuQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 24, 2013 at 6:39 AM, Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> Hello, Sorry that it's been a while..
>
> 1. Observed symptom
>
> As you know, UNION ALL accompanied with ORDER BY uses indexes if
> available.
>
>> uniontest=# EXPLAIN SELECT * FROM c11 UNION ALL SELECT * FROM c12 ORDER BY a;
>> QUERY PLAN
>> ---------------------------------------------------------------------------
>> Merge Append (cost=0.59..10214.10 rows=200000 width=16)
>> Sort Key: c11.a
>> -> Index Scan using ... on c11 (cost=0.29..3857.04 rows=100000 width=16)
>> -> Index Scan using ... on c12 (cost=0.29..3857.04 rows=100000 width=16)
>
> So do simple queries on partitioned (inheritance) tables.
>
>> uniontest=# EXPLAIN SELECT * FROM p1 ORDER BY a;
>> QUERY PLAN
>> ------------------------------------------------------------------------------
>> Merge Append (cost=0.73..11392.19 rows=200001 width=16)
>> Sort Key: p1.a
>> -> Index Scan using ... on p1 (cost=0.12..8.14 rows=1 width=44)
>> -> Index Scan using ... on c11 (cost=0.29..3857.04 rows=100000 width=16)
>> -> Index Scan using ... on c12 (cost=0.29..3857.04 rows=100000 width=16)
>
> Nevertheless, UNION ALL on partitioned tables doesn't. This is
> quite unfavourable behavior especially having LIMIT.
>
>>uniontest=# EXPLAIN ANALYZE SELECT * FROM p1
>> UNION ALL SELECT * FROM p2 ORDER BY a LIMIT 10;
>> QUERY PLAN
>>------------------------------------------------------------------------
>> Limit (actual time=182.732..182.735 rows=10 loops=1)
>> -> Sort (actual time=182.729..182.730 rows=10 loops=1)
>> Sort Key: p1.a
>> Sort Method: top-N heapsort Memory: 25kB
>> -> Append (actual time=0.029..108.109 rows=400000 loops=1)
>> -> Seq Scan on p1 (actual time=0.001..0.001 rows=0 loops=1)
>> -> Seq Scan on c11 (actual time=0.027..19.074 rows=100000 loops=1)
>> -> Seq Scan on c12 (actual time=0.014..16.653 rows=100000 loops=1)
>> -> Seq Scan on p2 (actual time=0.000..0.000 rows=0 loops=1)
>> -> Seq Scan on c21 (actual time=0.012..16.677 rows=100000 loops=1)
>> -> Seq Scan on c22 (actual time=0.012..16.794 rows=100000 loops=1)
>> Total runtime: 182.857 ms
>
>
> 2. The cause
>
> In grouping_planner, flattern_simple_union_all creates
> appendrelinfos for each subqueries then expand_inherited_tables
> furthur expands the parent tables in each subqueries. Finally,
> this sequence leaves following structure. Where rte[2] and [3]
> are subqueries abandoned on the way pulling up rte[4] and [5].
>
> rte[1] Subquery "SELECT*1", inh = 1
> +- appinfo[0] - rte[4] Relation "p1", inh = 1
> | +- appinfo[2] - rte[6] Relation "p1", inh = 0
> | +- appinfo[3] - rte[7] Relation "c11", inh = 0
> | +- appinfo[4] - rte[8] Relation "c12", inh = 0
> +- appinfo[1] - rte[5] Relation "p2", inh = 1
> +- appinfo[5] - rte[9] Relation "p1", inh = 0
> +- appinfo[6] - rte[10] Relation "c11", inh = 0
> +- appinfo[7] - rte[11] Relation "c12", inh = 0
>
> On the other hand, ec member finally has exprs only for varno =
> 1, 4 and 5, in other words, it lacks the members for the most
> descendant RTEs. This is because add_child_rel_equivalences()
> always inhibits add_eq_member from registering the child_rel's
> relid on EC. Conequently these grandchild relations does not find
> index pathkeys for themselves.
>
>
> 3. Measures
>
> I could thought up three approaches for the problem.
>
> One is to simplly modifying here to give child flag in the
> parameters of add_eq_member accordig to whether the child_rel is
> inh or not. This seems to work fine although leaves two levels of
> MergeAppends (PATCH-1). So the additional patch is attached to
> collapse these MergeAppends (PATCH-2). This gives the same plan
> as PATCH-3.
>
>> uniontest=# explain analyze select * from p1 union all
>> select * from p2 order by a limit 10;
>> QUERY PLAN
>> ------------------------------------------------------------------------
>> Limit (actual time=0.208..0.224 rows=10 loops=1)
>> -> Merge Append (actual time=0.205..0.218 rows=10 loops=1)
>> Sort Key: p1.a
>> -> Merge Append (actual time=0.110..0.120 rows=10 loops=1)
>> Sort Key: p1.a
>> -> Index Scan .. on p1 (actual time=0.006..0.006 rows=0 loops=1)
>> -> Index Scan .. on c11 (actual time=0.054..0.060 rows=10 loops=1)
>> -> Index Scan .. on c12 (actual time=0.046..0.046 rows=1 loops=1)
>> -> Merge Append (actual time=0.093..0.093 rows=1 loops=1)
>> Sort Key: p2.a
>> -> Index Scan .. on p2 (actual time=0.002..0.002 rows=0 loops=1)
>> -> Index Scan .. on c21 (actual time=0.047..0.047 rows=1 loops=1)
>> -> Index Scan .. on c22 (actual time=0.043..0.043 rows=1 loops=1)
>> Total runtime: 0.448 ms
>
>
> The second is to collapse the appendrel structure shown above to
> have only single level inheritance. This also seems to work
> fine. This makes the plan with single level
> MergeAppend. (PATCH-3)
>
>> uniontest=# EXPLAIN ANALYZE SELECT * FROM p1 UNION ALL
>> SELECT * FROM p2 ORDER BY a LIMIT 10;
>> QUERY PLAN
>> -------------------------------------------------------------------
>> Limit (actual time=0.095..0.107 rows=10 loops=1)
>> -> Merge Append (actual time=0.092..0.102 rows=10 loops=1)
>> Sort Key: p1.a
>> -> Index Scan ... on p1 (actual time=0.005..0.005 rows=0 loops=1)
>> -> Index Scan ... on c11 (actual time=0.023..0.030 rows=10 loops=1)
>> -> Index Scan ... on c12 (actual time=0.020..0.020 rows=1 lops=1)
>> -> Index Scan ... on p2 (actual time=0.001..0.001 rows=0 loops=1)
>> -> Index Scan ... on c21 (actual time=0.019..0.019 rows=1 loops=1)
>> -> Index Scan ... on c22 (actual time=0.019..0.019 rows=1 loops=1)
>> Total runtime: 0.241 ms
>
>
>
> The last one is most strait-forward in some sense, and conversely
> is most ad hoc measure. Modifying expand_inherited_rtentry() to
> pull up adding appendrels if needed, using the similar manner to
> PATCH-3. This is added as PATCH-4.
>
>
> what do you think about this?
>
>
> Four patches following are attached to this message.
>
> 1. union_inh_idx_typ1_v1_20131024.patch .. PATCH-1 described above.
> 2. union_inh_idx_typ1_add_v1_20131024.patch .. PATCH-2 described above.
> 3. union_inh_idx_typ2_v1_20131024.patch .. PATCH-3 described above.
> 4. union_inh_idx_typ3_v1_20131024.patch .. PATCH-4 described above.

Please add your patches to the currently-open CommitFest so that we
don't lose track of them.

https://commitfest.postgresql.org/action/commitfest_view/open

I'm not sure which approach to this problem is best, but I agree that
it is worth solving.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2013-10-25 13:35:05 Re: proposal: lob conversion functionality
Previous Message Robert Haas 2013-10-25 13:26:29 Re: CLUSTER FREEZE