Re: BUG #14646: performance hint to remove

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: boshomi(at)gmail(dot)com, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14646: performance hint to remove
Date: 2017-05-11 22:32:43
Message-ID: 12219.1494541963@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

"David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
> Confirmed 9.6.2 on Ubuntu. Not sure this is properly classified as a bug
> but its definitely an area where improvement would seem desirable. I am a
> particularly heavy user of the join predicate "USING" and never really
> thought to look at this dynamic (without the WHERE clause it doesn't
> matter, both tables up end sequentially scanned).

Yeah, if you use USING the parser tries to define the merged join variable
as equivalent to just one or the other input variable. With an inner join
it'll arbitrarily pick the left input, with a left or right join it will
use the outer-side variable (since the inner-side one might go to NULL).
With a FULL join, the merged variable is actually defined as
COALESCE(left-input, right-input), which is necessary for semantic
correctness but is pretty awful for optimization purposes.

If you put an equality constraint on the merged join variable, everything
works pretty well anyway because of deduction of implied equalities (ie,
we have left-input = right-input from the USING's join clause, and then
if we also know e.g. left-input = 42 we can deduce right-input = 42).
There are some hacks that make that carry through for outer join cases
as well.

The OP's WHERE clause isn't a simple equality so the equivalence-class
machinery doesn't help there. There's been some speculation about
improving that, but there are a bunch of semantic and practical problems
in the way. One of the biggest is that we can't simply duplicate the
restriction for each input variable, because that would cause the planner
to misestimate their selectivity and produce bad plans of a different
sort (not knowing that the restrictions are redundant). The eclass
machinery knows enough to compensate for that, but it's not clear how
to do it for arbitrary restrictions.

Anyway, this isn't a bug but a missed optimization opportunity, and
I would counsel not holding your breath waiting for an improvement.
It's a difficult problem and it's not tremendously high priority.

regards, tom lane

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2017-05-12 03:59:56 Re: [HACKERS] Concurrent ALTER SEQUENCE RESTART Regression
Previous Message David G. Johnston 2017-05-11 21:59:25 Re: BUG #14646: performance hint to remove