join ordering via Simulated Annealing

Lists: pgsql-hackers
From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: join ordering via Simulated Annealing
Date: 2009-12-23 01:23:55
Message-ID: 4B31712B.4040205@wulczer.org

Hi,

I've been playing with using a Simulated Annealing-type algorithm for
determining join ordering for relations. To get into context, see
http://archives.postgresql.org/pgsql-hackers/2009-05/msg00098.php
(there's also a TODO in the wiki). There's a nice paper on that in
http://reference.kfupm.edu.sa/content/h/e/heuristic_and_randomized_optimization_fo_87585.pdf
(also linked from that thread) and someone even wrote a patch:
http://archives.postgresql.org/pgsql-hackers/2009-05/msg00736.php

This generally aims at being able to replace GEQO.

I have some rough prototype code, but I'm not even asking people to look
at it. It's stuffed with silly things and writes three Graphviz-style
.dot files in /tmp for each algorithm step. But I'd like to at least
present the idea.

There are three main problems that have to be solved to get a good
Simulated Annealing implementation:
o) choosing the starting point for the algorithm
o) generating the next point
o) computing the cost in the current point

The code I have now creates the initial plan by doing something similar
to what gimme_tree does in GEQO, but without any kind of heuristic to
try to favour joins with join restrictions (the idea is that it doesn't
matter, since we only want to get *any* plan and we only do it once),
but ideally it would be somehow chosen randomly from the space of all
possible join orderings.

I keep a binary tree of relations in memory, where leaves are
RelOptInfos with 1 relid and the root is a RelOptInfo with all relids.
Each iteration of the algorithm picks two nodes at random (with some
restrictions, but that's not important) and tries to swap them around,
meaning that a tree like (use a monospace font for best results):

(1 2 3 4)
*(1 2) (3 4)
(1) (2) *(3) (4)

where the parenthesised things are the two chosen nodes would get
transformed into

(1 2 3 4)
(3) (1 2 4)
(1 2) (4)
(1) (2)

that is, the (1 2) node and its subtree gets swapped with the (3) node
and the upper-level nodes get changed accordingly. Sometimes a swap like
that will produce an invalid join ordering - the swap is then reverted.
If the join order given by the tree after the swap is legal, the paths
are recomputed, much like in geqo_eval, and if the root node's cheapest
path is not cheaper than before the swap, the swap gets reverted.
Simulated Annealing algorithms permit uphill moves in terms of cost,
with some probability that's decreasing as time passes, but that's not
implemented yet. After a fixed amount of moves, the algorithm stops and
returns the root node of the tree as the single RelOptInfo.
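The swap move described above can be sketched with a toy join tree. This is a hypothetical Python model, not the actual saio C code: a RelOptInfo is reduced to a bare set of relids, and the legality/path-recomputation steps are left out.

```python
# Toy stand-in for the in-memory join tree: each node carries the set of
# base relids under it, much like a RelOptInfo's relids.
class Node:
    def __init__(self, relids, left=None, right=None):
        self.relids = set(relids)
        self.left, self.right, self.parent = left, right, None
        for child in (left, right):
            if child is not None:
                child.parent = self

def fix_upwards(node):
    """After a swap, recompute the relid set of every ancestor."""
    while node is not None:
        node.relids = node.left.relids | node.right.relids
        node = node.parent

def swap_subtrees(a, b):
    """Swap the subtrees rooted at a and b.  Neither may be an ancestor
    of the other -- that is one of the 'restrictions' on picking nodes."""
    pa, pb = a.parent, b.parent
    if pa.left is a: pa.left = b
    else:            pa.right = b
    if pb.left is b: pb.left = a
    else:            pb.right = a
    a.parent, b.parent = pb, pa
    fix_upwards(pa)
    fix_upwards(pb)

# Build ((1 2) (3 4)) as in the example above.
n1, n2, n3, n4 = (Node({i}) for i in (1, 2, 3, 4))
n12 = Node({1, 2}, n1, n2)
n34 = Node({3, 4}, n3, n4)
root = Node({1, 2, 3, 4}, n12, n34)

swap_subtrees(n12, n3)              # swap (1 2) with (3)

assert root.relids == {1, 2, 3, 4}  # root still covers all relids
assert root.left is n3              # (3) now hangs where (1 2) was
assert root.right.relids == {1, 2, 4}   # the new (1 2 4) node
```

Note that `fix_upwards` only touches ancestors of the two swapped nodes, which is exactly the partial-recalculation opportunity mentioned among the issues below.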

The issues with the approach are:

o) the initial state is not really a random plan, it's usually a
left-deep tree (and is very inefficient) and this might skew results.

o) is swapping random nodes like that a good way of exploring the
solution space? The SA paper suggests something much simpler, but some
of the moves proposed there don't really make sense when using the
make_join_rel machinery:
*) changing the inner and outer relation of a join doesn't make
sense, because make_join_rel is symmetrical
*) changing the join executing strategy (nested loop, merge join,
etc.) doesn't make sense, because make_join_rel considers all possible
paths for a join

o) each swap needs a full recalculation of the whole solution
(geqo_eval does that, for instance), maybe it's possible to leave the
subtrees of swapped nodes intact and only recalculate the nodes above
the two swapped nodes?

o) because make_join_rel scribbles on the PlannerInfo, the same hack as
in geqo_eval is necessary: truncating join_rel_list after building all
joinrels and nulling out the join_rel_hash

o) I use make_join_rel to determine whether a join is legal, which does
lots of other things. That looks like wasted effort, although in the end
each time I need to build the final rel to assess the resulting path's
cost. To follow the SA algorithm spirit more closely it would be
necessary to only consider one, random path for each relation and make
using different paths a move that the algorithm can choose while
exploring the solution space. A cheaper way of determining the current
solution's cost would be nice, too - fully rebuilding the final
RelOptInfo after each move is annoying.

Lastly, I'm lacking good testcases or even a testing approach: I'm
generating silly queries and looking at how they get optimised, but if
someone has a real dataset and examples of joins that cannot be planned
with the standard planner, I would be interested to compare the results
my prototype gets with those produced by GEQO.

The code, a module that you can LOAD into a backend, is here (if you
really want to see it):
http://git.wulczer.org/?p=saio.git
Set the saio GUC to true and saio_cutoff to the number of iterations you
want.

Cheers,
Jan


From: bin wang <comx0330(at)gmail(dot)com>
To: Jan Urbaski <wulczer(at)wulczer(dot)org>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: join ordering via Simulated Annealing
Date: 2009-12-23 01:57:55
Message-ID: 323338680912221757p3028180co317b923e71950f98@mail.gmail.com
Lists: pgsql-hackers

I will follow it.
Thank you.

2009/12/23 Jan Urbaski <wulczer(at)wulczer(dot)org>

> Hi,
>
> I've been playing with using a Simulated Annealing-type algorithm for
> determining join ordering for relations. [...]
>
> Cheers,
> Jan


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: join ordering via Simulated Annealing
Date: 2009-12-23 03:48:41
Message-ID: 3975.1261540121@sss.pgh.pa.us
Lists: pgsql-hackers

Jan Urbański <wulczer(at)wulczer(dot)org> writes:
> I've been playing with using a Simulated Annealing-type algorithm for
> determinig join ordering for relations.

Cool.

> The code I have now creates the initial plan by doing something similar
> to what gimme_tree does in GEQO, but without any kind of heuristic to
> try to favour joins with join restrictions (the idea is that it doesn't
> matter, since we only want to get *any* plan and we only do it once),
> but ideally it would be somehow chosen randonly from the space of all
> possible join orderings.

FWIW, I think that's probably a bad idea. In a query where there are a
lot of join order constraints, you can waste enormous amounts of time
trying to find a join order that is even legal at all, let alone any
good. Pure randomness is not as nice as it seems. You might want to
study the CVS history of GEQO a bit and try to avoid falling into some
of the traps we already fell into with that ;-)

> o) each swap needs a full recalcualtion of the whole solution
> (geqo_eval does that, for instance), maybe it's possible to leave the
> subtrees of swapped nodes intact and only recalculate the nodes above
> the two swapped nodes?

Even if you could make this work, it'd probably be an infeasible
space-for-time tradeoff, because you'd have to keep around all the Path
infrastructure of the lower nodes in order to not start from scratch.
As we were forcibly reminded a few months ago, one of the important
properties of the GEQO code is to be able to do planning with a limited
amount of memory even for very large relation sets.

regards, tom lane


From: Adriano Lange <alange0001(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: join ordering via Simulated Annealing
Date: 2009-12-24 02:00:16
Message-ID: 4B32CB30.3070100@gmail.com
Lists: pgsql-hackers

On 22-12-2009 22:23, Jan Urbański wrote:
> o) the initial state is not really a random plan, it's usualy a
> left-deep tree (and is very inefficient) and this might skew results.

Maybe a QuickPick + SA.
http://www.springerlink.com/content/garn64gt61ju5xfa/
http://portal.acm.org/citation.cfm?doid=1559845.1559889

Regards,
Adriano Lange


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Jan Urbański <wulczer(at)wulczer(dot)org>
Subject: Re: join ordering via Simulated Annealing
Date: 2009-12-26 16:30:54
Message-ID: 200912261730.54730.andres@anarazel.de
Lists: pgsql-hackers

On Wednesday 23 December 2009 02:23:55 Jan Urbański wrote:
> Hi,
>
> I've been playing with using a Simulated Annealing-type algorithm for
> determinig join ordering for relations.
Very cool.

> Lastly, I'm lacking good testcases or even a testing approach: I'm
> generating silly queries and looking at how they get optimised, but if
> someone has a real dataset and examples of joins that cannot be planned
> with the standard planner, I would be interested to compare the results
> my prototype gets with those produced by GEQO.
If you want to see some queries which are rather hard to plan with random
search, you can look at
http://archives.postgresql.org/message-id/200907091700(dot)43411(dot)andres(at)anarazel(dot)de
which Tom analyzed and improved here:
http://archives.postgresql.org/message-id/17807(dot)1247932094(at)sss(dot)pgh(dot)pa(dot)us

They are hard to plan because they have lots and lots of join order
restrictions. While this example is rather extreme, I have found quite a
few such queries so far.

Robert had another example in
603c8f070911271205r4d4534edt1cebcb76ff5066a5(at)mail(dot)gmail(dot)com that might be
interesting.

Andres


From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: join ordering via Simulated Annealing
Date: 2009-12-28 12:13:18
Message-ID: 4B38A0DE.6040002@wulczer.org
Lists: pgsql-hackers

Andres Freund wrote:
> On Wednesday 23 December 2009 02:23:55 Jan Urbański wrote:

>> Lastly, I'm lacking good testcases

> If you want to see some queries which are rather hard to plan with random
> search you can look at
> http://archives.postgresql.org/message-id/200907091700(dot)43411(dot)andres(at)anarazel(dot)de
> which Tom analyzed and improved here:
> http://archives.postgresql.org/message-id/17807(dot)1247932094(at)sss(dot)pgh(dot)pa(dot)us

Thanks, these look like good testing candidates, not least because they
trigger assertion errors with my code :( I'll report back when they're
fixed...

> Robert had another example in
> 603c8f070911271205r4d4534edt1cebcb76ff5066a5(at)mail(dot)gmail(dot)com that might be
> interesting.

Yes, I remembered this one, will try to put them through the mill as soon
as I fix my code.

Cheers,
Jan


From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: join ordering via Simulated Annealing
Date: 2010-01-08 14:41:42
Message-ID: 4B474426.8060906@wulczer.org
Lists: pgsql-hackers

Andres Freund wrote:
> On Wednesday 23 December 2009 02:23:55 Jan Urbański wrote:
>> Lastly, I'm lacking good testcases or even a testing approach:

> If you want to see some queries which are rather hard to plan with random
> search you can look at
> http://archives.postgresql.org/message-id/200907091700.43411.andres@anarazel.de
> which tom analyzed and improved here http://archives.postgresql.org/message-id/17807.1247932094@sss.pgh.pa.us

Here's the current state of affairs. I managed to get the module into
sufficiently good shape that it at least no longer errors out on Andres'
example query. I tried it with SAIO, GEQO and the standard planner.
from_collapse_limit and join_collapse_limit were set to 14. This was on an
--enable-debug --disable-cassert -O2 build. Here are the results:

SAIO, standard values

saio_equilibrium_factor = 16
saio_temperature_reduction_factor = 0.95
time = too big

SAIO, tweaked

saio_equilibrium_factor = 12
saio_temperature_reduction_factor = 0.3
cost = 13376.10..19692.53
time = 86866.276 ms

GEQO

cost = 13936.53..19479.38
time = 182000.097 ms

STANDARD

cost = 17807.57..19339.64
time = 39361.248 ms

A couple of remarks. The standard planner found a decent plan and only ate
around 550 MB of memory while doing so. Reading this mail
http://archives.postgresql.org/pgsql-hackers/2009-07/msg01219.php
I was expecting it to start swapping, but for some reason it didn't.

GEQO produced a plan comparable to SAIO's (even a bit better) and took
quite some more time. OTOH, with the default parameters, SAIO takes crazy
amounts of time and I never had the patience to even wait until it
finishes. Memory usage stayed low in both cases, because both do their
calculations in a separate memory context that gets periodically reset.

A short explanation of the saio_* parameter values and the algorithm
itself. The SAIO algorithm is composed of two loops:
The SAIO algorithm is composed of two loops:

do {
    do {
        random_move();
    } while (!equilibrium());
    reduce_temperature();
} while (!frozen());

The inner loop does random moves until it reaches "equilibrium". Moves that
improve the plan are always accepted; uphill moves are accepted with a
probability that decreases with how much worse the new plan is and
increases with the current temperature. The paper I quoted earlier says
that equilibrium is reached after iterating the inner loop
16 * number_of_joins times. The "16" parameter is what I called
saio_equilibrium_factor. In each outer loop iteration the temperature of
the system is reduced by a factor of saio_temperature_reduction_factor,
which is 0.9 in the paper.
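The two loops above can be fleshed out into a small Python sketch. The acceptance rule exp(-delta/temperature), the initial/frozen temperatures and the cost-only move generator are my assumptions for illustration, not taken from the saio code:

```python
import math
import random

def anneal(initial_cost, propose, equilibrium_factor=16,
           temperature_reduction_factor=0.9, n_joins=10,
           initial_temp=100.0, frozen_temp=0.1):
    """Generic SA skeleton matching the two-loop structure above.
    propose() returns the cost of a random neighbouring solution, or
    None for an illegal move; the actual move generator (subtree swaps,
    join legality checks) is not modelled here."""
    cost, temp = initial_cost, initial_temp
    while temp > frozen_temp:                        # !frozen()
        # "equilibrium" after equilibrium_factor * n_joins inner moves
        for _ in range(equilibrium_factor * n_joins):
            new_cost = propose()
            if new_cost is None:
                continue    # illegal join order: rejected, but still
                            # consumes one inner-loop iteration
            delta = new_cost - cost
            # downhill moves always accepted; uphill moves accepted
            # with probability exp(-delta/temp), shrinking as temp drops
            if delta <= 0 or random.random() < math.exp(-delta / temp):
                cost = new_cost
        temp *= temperature_reduction_factor         # reduce_temperature()
    return cost

random.seed(0)
final = anneal(1000.0, lambda: random.uniform(500.0, 1500.0))
assert 500.0 <= final < 1000.0   # has drifted toward the cheap end
```

Note that a rejected illegal move still counts as an inner-loop iteration here, which matches the "catch" described below.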

The catch is that, because of join order constraints, lots of random moves
simply turn out to be invalid and are rejected, even though they are still
counted as full iterations of the inner loop. ISTM that it doesn't skew
the results too much, though.

I followed Tom's advice to mimic GEQO's way of choosing joins with join
clauses first, but that only is done when building the starting tree. After
that the moves are random. A related question: why does GEQO try to put the
largest "clumps" at the beginning of the list? Is it to achieve maximum
left-deepness? I didn't copy that in SAIO, I'm just adding the new clump
at the
beginning of the list.

The big question for me now is: what are reasonable values for those two
GUCs? I think that because of the amount of work make_join_rel does (handle
the symmetric case, consider all access paths) they can be made stricter
than in the paper, so 16/0.9 seems wrong. OTOH, I have no idea why it takes
so long with those parameters cranked up; I would think they influence the
running time linearly, but maybe I'm just way off base.

For the record, I'm attaching oprofile results for the standard planner,
GEQO and SAIO (standard and tweaked; I ctrl+c'd the standard run). I was
profiling the whole system, but what I'm attaching is the grepped-out
"postgres" part. What I don't get is why cranking up the GUCs results in
generate_join_implied_equalities moving to the top of the list. I've spent
countless hours trying to understand why that happens and am currently at
a loss.

> Robert had another example in
> 603c8f070911271205r4d4534edt1cebcb76ff5066a5(at)mail(dot)gmail(dot)com that might be
> interesting.

I'll give it a shot soon, and then hopefully will do some plan quality
comparison on less pathological queries.

Cheers,
Jan

Attachment Content-Type Size
oprofile-callgraph.saio.tweaked text/plain 60.3 KB
oprofile-callgraph.saio text/plain 59.5 KB
oprofile.std application/vnd.sun.xml.draw.template 1.3 KB
oprofile.saio.tweaked text/plain 1.3 KB
oprofile.saio text/plain 1.3 KB
oprofile.geqo text/plain 1.3 KB