Re: UNION ALL on partitioned tables won't use indices.

From: Noah Misch <noah(at)leadboat(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, peter_e(at)gmx(dot)net, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: UNION ALL on partitioned tables won't use indices.
Date: 2014-02-27 04:29:35
Message-ID: 20140227042935.GA3260493@tornado.leadboat.com
Lists: pgsql-hackers

On Mon, Feb 03, 2014 at 07:36:22PM +0900, Kyotaro HORIGUCHI wrote:
> > > create table parent (a int, b int);
> > > create table child () inherits (parent);
> > > insert into parent values (1,10);
> > > insert into child values (2,20);
> > > select a, b from parent union all select a, b from child;
> >
> > Mmm. I had the same result. Please let me have a bit more time.
>
> This turned out to be a correct result. The two tables have
> following records after the two INSERTs.
>
> | =# select * from only parent;
> | 1 | 10
> | (1 row)
> |
> | =# select * from child;
> | 2 | 20
> | (1 row)
>
> Then it is natural that the parent-side in the UNION ALL returns
> following results.
>
> | =# select * from parent;
> | a | b
> | ---+----
> | 1 | 10
> | 2 | 20
> | (2 rows)
>
> Finally, the result we got has proved not to be a problem.

The first union branch should return two rows, and the second union branch
should return one row, for a total of three. In any case, I see later in your
mail that you fixed this. The larger point is that this patch has no business
changing the output rows of *any* query. Its goal is to pick a more-efficient
plan for arriving at the same answer. If there's a bug in our current output
for some query, that's a separate discussion from the topic of this thread.
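
For reference, a sketch of the output I would expect for the quoted test case.
It assumes a fresh database, and the ORDER BY is my addition (not part of the
original query), there only to make the row order deterministic:

```sql
CREATE TABLE parent (a int, b int);
CREATE TABLE child () INHERITS (parent);
INSERT INTO parent VALUES (1, 10);
INSERT INTO child VALUES (2, 20);

-- First branch scans parent plus its child: (1,10), (2,20).
-- Second branch scans only child: (2,20).
SELECT a, b FROM parent
UNION ALL
SELECT a, b FROM child
ORDER BY a, b;
--  a | b
-- ---+----
--  1 | 10
--  2 | 20
--  2 | 20
-- (3 rows)
```

Any plan the patch chooses should produce these same three rows.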

> Second, about the crash in this sql,
>
> > select parent from parent union all select parent from parent;
>
> It is ignored whole-row reference (=0) which makes the index of
> child translated-vars list invalid (-1). I completely ignored it
> in spite that myself referred to before.
>
> Unfortunately ConvertRowtypeExpr prevents appendrels from being
> removed currently, and such a case don't seem to take place so
> often, so I decided to exclude the case.

> + /*
> + * Appendrels which does whole-row-var conversion cannot be
> + * removed. ConvertRowtypeExpr can convert only RELs which can
> + * be referred to using relid.

We have parent and child relids, so it is not clear to me how imposing that
restriction helps us. I replaced transvars_merge_mutator() with a call to
adjust_appendrel_attrs(). This reduces code duplication, and it handles
whole-row references. (I don't think the other node types that
adjust_appendrel_attrs() handles matter to this caller. translated_vars will
never contain join tree nodes, and I doubt it could contain a PlaceHolderVar
with phrels requiring translation.)

The central design question for this patch seems to be how to represent, in
the range table, the fact that we expanded an inheritance parent despite its
children ending up as appendrel children of a freestanding UNION ALL. The v6
patch detaches the original RTE from the join tree and clears its "inh" flag.
This breaks sepgsql_dml_privileges(), which looks for RTE_RELATION with inh =
true and consults selinux concerning every child table. We could certainly
change the way sepgsql discovers inheritance hierarchies, but nothing clearly
better comes to mind. I selected the approach of preserving the RTE's "inh"
flag, removing the AppendRelInfo connecting that RTE to its enclosing UNION
ALL, and creating no AppendRelInfo children for that RTE. An alternative was
to introduce a new RTE flag, say "append". An inheritance parent under a
UNION ALL would have append = false, inh = true; other inheritance parent RTEs
would have append = true, inh = true; an RTE for UNION ALL itself would have
append = true, inh = false.

> > > > > > The attached two patches are rebased to current 9.4dev HEAD and
> > > > > > make check at the topmost directory and src/test/isolation are
> > > > > > passed without error. One bug was found and fixed on the way. It
> > > > > > was an assertion failure caused by probably unexpected type
> > > > > > conversion introduced by collapse_appendrels() which leads
> > > > > > implicit whole-row cast from some valid reltype to invalid
> > > > > > reltype (0) into adjust_appendrel_attrs_mutator().
> > > > >
> > > > > What query demonstrated this bug in the previous type 2/3 patches?
> > >
> > > I would still like to know the answer to the above question.
>
> I remembered after some struggles. It failed during 'make check',
> on the following query in inherit.sql.

> [details]

Interesting. Today, the parent_reltype and child_reltype fields of
AppendRelInfo are either both valid or both invalid. Your v6 patch allowed us
to have a valid child_reltype with an invalid parent_reltype. At the moment,
we can't benefit from just one valid reltype. I restored the old invariant.

If the attached patch version looks reasonable, I will commit it.

Incidentally, I tried adding an assertion that append_rel_list does not show
one appendrel as a direct child of another. The following query, off-topic
for the patch at hand, triggered that assertion:

SELECT 0 FROM (SELECT 0 UNION ALL SELECT 0) t0
UNION ALL
SELECT 0 FROM (SELECT 0 UNION ALL SELECT 0) t0;

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
unionall_inh_idx_typ3_v7.patch text/plain 9.1 KB
