Outer joins and equivalence

Lists: pgsql-hackerspgsql-performance
From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Outer joins and equivalence
Date: 2008-05-27 19:59:18
Message-ID: 1211918358.4489.257.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance


I have a complex query where making a small change to the SQL increases
run-time by > 1000 times.

The first SQL statement is of the form

A JOIN B ON (a.id = b.id) LEFT JOIN C ON (a.id = c.id)

and the second is like this

A JOIN B ON (a.id = b.id) LEFT JOIN C ON (b.id = c.id)

the only difference is the substitution of a -> b

This has been verified by examining EXPLAIN of SQL1, SQL2 and SQL1. The
first and third EXPLAINs are equivalent. All ANALYZE etc has been run.
All relevant join columns are INTEGERs. So we have a repeatable
difference in plans attributable to a single change.

The difference in run time occurs because the second form of the query
uses a SeqScan of a large table, whereas the first form is able to use a
nested loops join to access the large table, which then allows it to
access just 3 rows rather than 85 million rows.

There is a clear equivalence between the two forms of SQL, since the
equivalence a = b is derived from a natural rather than an outer join.
This can be applied from the left side to the right side of the join.

So this looks to me like either a bug or just an un-implemented
optimizer feature. The code I've just looked at for equivalent class
relationships appears to refer to using this to propagate constant info
only, so I'm thinking it is not a bug. and hence why it is reported here
and not to pgsql-bugs.

I do recognise that we would *not* be able to deduce this form of SQL

A JOIN B ON (a.id = c.id) LEFT JOIN C ON (b.id = c.id)

though that restriction on outer join equivalence is not relevant here.

(SQL, EXPLAINs etc available off-list only, by request).

I'm looking into this more now.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Outer joins and equivalence
Date: 2008-05-27 21:43:12
Message-ID: 14085.1211924592@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> I have a complex query where making a small change to the SQL increases
> run-time by > 1000 times.

> The first SQL statement is of the form

> A JOIN B ON (a.id = b.id) LEFT JOIN C ON (a.id = c.id)

> and the second is like this

> A JOIN B ON (a.id = b.id) LEFT JOIN C ON (b.id = c.id)

> the only difference is the substitution of a -> b

Please provide an actual test case.

regards, tom lane


From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Outer joins and equivalence
Date: 2008-05-28 10:45:14
Message-ID: Pine.LNX.4.64.0805281144200.16756@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Tue, 27 May 2008, Simon Riggs wrote:
> I do recognise that we would *not* be able to deduce this form of SQL
>
> A JOIN B ON (a.id = c.id) LEFT JOIN C ON (b.id = c.id)

Surely that would not be valid SQL?

Matthew

--
Psychotics are consistently inconsistent. The essence of sanity is to
be inconsistently inconsistent.


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Matthew Wakeling <matthew(at)flymine(dot)org>
Cc: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Outer joins and equivalence
Date: 2008-05-28 13:39:10
Message-ID: 1211981950.4489.473.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance


On Wed, 2008-05-28 at 11:45 +0100, Matthew Wakeling wrote:
> On Tue, 27 May 2008, Simon Riggs wrote:
> > I do recognise that we would *not* be able to deduce this form of SQL
> >
> > A JOIN B ON (a.id = c.id) LEFT JOIN C ON (b.id = c.id)
>
> Surely that would not be valid SQL?

You are right, but my point was about inferences during SQL planning,
not about initial analysis of the statement.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Outer joins and equivalence
Date: 2008-06-02 17:10:44
Message-ID: 1212426644.4120.378.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance


On Tue, 2008-05-27 at 17:43 -0400, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > I have a complex query where making a small change to the SQL increases
> > run-time by > 1000 times.
>
> > The first SQL statement is of the form
>
> > A JOIN B ON (a.id = b.id) LEFT JOIN C ON (a.id = c.id)
>
> > and the second is like this
>
> > A JOIN B ON (a.id = b.id) LEFT JOIN C ON (b.id = c.id)
>
> > the only difference is the substitution of a -> b
>
> Please provide an actual test case.

Getting closer, but still not able to produce a moveable test case.

Symptoms are

* using partitioning
* when none of the partitions are excluded
* when equivalence classes ought to be able to reconcile join

Still working on it

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Outer joins and equivalence
Date: 2008-06-02 19:47:11
Message-ID: 1212436031.4120.400.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance


On Mon, 2008-06-02 at 18:10 +0100, Simon Riggs wrote:
> On Tue, 2008-05-27 at 17:43 -0400, Tom Lane wrote:
> > Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > > I have a complex query where making a small change to the SQL increases
> > > run-time by > 1000 times.
> >
> > > The first SQL statement is of the form
> >
> > > A JOIN B ON (a.id = b.id) LEFT JOIN C ON (a.id = c.id)
> >
> > > and the second is like this
> >
> > > A JOIN B ON (a.id = b.id) LEFT JOIN C ON (b.id = c.id)
> >
> > > the only difference is the substitution of a -> b
> >
> > Please provide an actual test case.
>
> Getting closer, but still not able to produce a moveable test case.

I've got a test case which shows something related and weird, though not
the exact case.

The queries shown here have significantly different costs, depending
upon whether we use tables a or b in the query. Since a and b are
equivalent this result isn't expected at all.

I suspect the plan variation in the original post is somehow cost
related and we are unlikely to discover the exact plan.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

Attachment Content-Type Size
q.sql text/x-sql 1.4 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [PERFORM] Outer joins and equivalence
Date: 2008-06-05 02:18:12
Message-ID: 8599.1212632292@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

[ redirecting thread from -performance to -hackers ]

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> I've got a test case which shows something related and weird, though not
> the exact case.

> The queries shown here have significantly different costs, depending
> upon whether we use tables a or b in the query. Since a and b are
> equivalent this result isn't expected at all.

Hmm. I had been guessing that there was something about your original
query that prevented the system from applying best_appendrel_indexscan,
but after fooling with this a bit, I don't believe that's the issue
at all. The problem is that these two cases "should" be equivalent:

select ... from a join b on (a.id = b.id) left join c on (a.id = c.id);

select ... from a join b on (a.id = b.id) left join c on (b.id = c.id);

but they are not seen that way by the current planner. It correctly
forms an EquivalenceClass consisting of a.id and b.id, but it cannot put
c.id into that same class, and so the clause a.id = c.id is just left
alone; there is noplace that can generate "b.id = c.id" as an
alternative join condition. This means that (for the first query)
we can consider the join orders "(a join b) leftjoin c" and
"(a leftjoin c) join b", but there is no way to consider the join
order "(b leftjoin c) join a"; to implement that we'd need to have the
alternative join clause available. So if that join order is
significantly better than the other two, we lose.

This is going to take a bit of work to fix :-(. I am toying with the
idea that we could go ahead and put c.id into the EquivalenceClass
as a sort of second-class citizen that's labeled as associated with this
particular outer join --- the implication being that we can execute the
outer join using a generated clause that equates c.id to any one of the
first-class members of the EquivalenceClass, but above the outer join
we can't assume that c.id hasn't gone to null, so it's not really equal
to anything else in the class. I think it might also be possible
to get rid of the reconsider_outer_join_clauses() kluge in favor of
driving transitive-equality-to-a-constant off of this representation.

However there's a larger issue here, which is the very identity of an
outer join :-(. Currently, for the first query above, the left join
is defined as being between a and c, with a being the minimum
left-hand-side needed to form the join. To be able to handle a case
like this, it seems that the notion of a "minimum left hand side"
falls apart altogether. We can execute the OJ using either a or b
as left hand side. So the current representation of OuterJoinInfo,
and the code that uses it to enforce valid join orders, needs a serious
rethink.

Looks like 8.4 development material to me, rather than something we
can hope to back-patch a fix for...

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [PERFORM] Outer joins and equivalence
Date: 2008-06-05 05:17:47
Message-ID: 1212643067.4148.244.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance


On Wed, 2008-06-04 at 22:18 -0400, Tom Lane wrote:
> [ redirecting thread from -performance to -hackers ]
>
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > I've got a test case which shows something related and weird, though not
> > the exact case.
>
> > The queries shown here have significantly different costs, depending
> > upon whether we use tables a or b in the query. Since a and b are
> > equivalent this result isn't expected at all.
>
> Hmm. I had been guessing that there was something about your original
> query that prevented the system from applying best_appendrel_indexscan,
> but after fooling with this a bit, I don't believe that's the issue
> at all. The problem is that these two cases "should" be equivalent:
>
> select ... from a join b on (a.id = b.id) left join c on (a.id = c.id);
>
> select ... from a join b on (a.id = b.id) left join c on (b.id = c.id);
>
> but they are not seen that way by the current planner. It correctly
> forms an EquivalenceClass consisting of a.id and b.id, but it cannot put
> c.id into that same class, and so the clause a.id = c.id is just left
> alone; there is noplace that can generate "b.id = c.id" as an
> alternative join condition. This means that (for the first query)
> we can consider the join orders "(a join b) leftjoin c" and
> "(a leftjoin c) join b", but there is no way to consider the join
> order "(b leftjoin c) join a"; to implement that we'd need to have the
> alternative join clause available. So if that join order is
> significantly better than the other two, we lose.
>
> This is going to take a bit of work to fix :-(. I am toying with the
> idea that we could go ahead and put c.id into the EquivalenceClass
> as a sort of second-class citizen that's labeled as associated with this
> particular outer join --- the implication being that we can execute the
> outer join using a generated clause that equates c.id to any one of the
> first-class members of the EquivalenceClass, but above the outer join
> we can't assume that c.id hasn't gone to null, so it's not really equal
> to anything else in the class. I think it might also be possible
> to get rid of the reconsider_outer_join_clauses() kluge in favor of
> driving transitive-equality-to-a-constant off of this representation.

Yes, EquivalenceClass allows an implication to be made either way
around, which is wrong for this class of problem. I was imagining a
higher level ImplicationClass that was only of the form A => B but not B
=> A. So we end up with an ImplicationTree, rather than a just a flat
Class. Which is where I punted...

> However there's a larger issue here, which is the very identity of an
> outer join :-(. Currently, for the first query above, the left join
> is defined as being between a and c, with a being the minimum
> left-hand-side needed to form the join. To be able to handle a case
> like this, it seems that the notion of a "minimum left hand side"
> falls apart altogether. We can execute the OJ using either a or b
> as left hand side. So the current representation of OuterJoinInfo,
> and the code that uses it to enforce valid join orders, needs a serious
> rethink.

Hadn't seen that at all.

> Looks like 8.4 development material to me, rather than something we
> can hope to back-patch a fix for...

Definitely.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support