GEQO status?

Lists: pgsql-hackers
From: Ed Loehr <eloehr(at)austin(dot)rr(dot)com>
To: pghackers <pgsql-hackers(at)postgreSQL(dot)org>
Subject: GEQO status?
Date: 2000-12-22 22:46:26
Message-ID: 3A43D9C2.7AD483D1@austin.rr.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

What is the status of the genetic algorithm query optimizer? Is this
supposed to work well on many-table joins, or has it fallen out of favor
or in disrepair? [I'm needing to optimize some large, many-table-join
queries and wondering time spent configuring/understanding geqo would be
fruitful...]

Regards,
Ed Loehr


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ed Loehr <eloehr(at)austin(dot)rr(dot)com>
Cc: pghackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GEQO status?
Date: 2000-12-22 23:08:38
Message-ID: 21495.977526518@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Ed Loehr <eloehr(at)austin(dot)rr(dot)com> writes:
> What is the status of the genetic algorithm query optimizer? Is this
> supposed to work well on many-table joins, or has it fallen out of favor
> or in disrepair?

It's supposed to work ;-). I'm not sure that the default parameters are
optimal, however. If you experiment with other settings, please post your
results.

regards, tom lane


From: Ed Loehr <eloehr(at)austin(dot)rr(dot)com>
To: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
Cc: pghackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GEQO status?
Date: 2000-12-22 23:10:47
Message-ID: 3A43DF77.14FABB8D@austin.rr.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thomas Lockhart wrote:
>
> > What is the status of the genetic algorithm query optimizer? Is this
> > supposed to work well on many-table joins, or has it fallen out of favor
> > or in disrepair? [I'm needing to optimize some large, many-table-join
> > queries and wondering time spent configuring/understanding geqo would be
> > fruitful...]
>
> It is the only techique we have to achieve adequate performance on
> many-table joins. It has received little work recently, but that may be
> due to having received no complaints or discussions that I can recall.

I'm having some trouble, not sure its related to GEQO. Is there a
PGOPTIONS flag to turn it off to attempt isolate the problem?

Ed


From: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
To: Ed Loehr <eloehr(at)austin(dot)rr(dot)com>
Cc: pghackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GEQO status?
Date: 2000-12-22 23:14:25
Message-ID: 3A43E051.181A21A@alumni.caltech.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> What is the status of the genetic algorithm query optimizer? Is this
> supposed to work well on many-table joins, or has it fallen out of favor
> or in disrepair? [I'm needing to optimize some large, many-table-join
> queries and wondering time spent configuring/understanding geqo would be
> fruitful...]

It is the only techique we have to achieve adequate performance on
many-table joins. It has received little work recently, but that may be
due to having received no complaints or discussions that I can recall.

- Thomas


From: Ed Loehr <eloehr(at)austin(dot)rr(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pghackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GEQO status?
Date: 2000-12-22 23:39:49
Message-ID: 3A43E645.B84685E2@austin.rr.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
>
> Ed Loehr <eloehr(at)austin(dot)rr(dot)com> writes:
> > What is the status of the genetic algorithm query optimizer? Is this
> > supposed to work well on many-table joins, or has it fallen out of favor
> > or in disrepair?
>
> It's supposed to work ;-). I'm not sure that the default parameters are
> optimal, however. If you experiment with other settings, please post your
> results.

Query time dropped from many minutes to 13 seconds on a 12-table join
with a little tweaking from the default params:

My $PGDATA/pg_geqo:
-------------------
Pool_Size 1024
# Effort high
Generations 100
Random_Seed 330418761
Selection_Bias 2.00

Similar performance with Generations setting of 800 derived from Effort.

Regards,
Ed Loehr


From: Ed Loehr <eloehr(at)austin(dot)rr(dot)com>
To: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>
Cc: pghackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GEQO status?
Date: 2000-12-22 23:53:34
Message-ID: 3A43E97E.26469F66@austin.rr.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thomas Lockhart wrote:
>
> > What is the status of the genetic algorithm query optimizer? Is this
> > supposed to work well on many-table joins, or has it fallen out of favor
> > or in disrepair? [I'm needing to optimize some large, many-table-join
> > queries and wondering time spent configuring/understanding geqo would be
> > fruitful...]
>
> It is the only techique we have to achieve adequate performance on
> many-table joins. It has received little work recently, but that may be
> due to having received no complaints or discussions that I can recall.

At risk of being off-topic here, is there a reason why GEQO is off by
default in the ODBC driver (postdrv.exe)? I vaguely recall something
about this from a year ago, but can't find it.

Regards,
Ed Loehr


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ed Loehr <eloehr(at)austin(dot)rr(dot)com>
Cc: Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu>, pghackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GEQO status?
Date: 2000-12-23 00:04:49
Message-ID: 21912.977529889@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Ed Loehr <eloehr(at)austin(dot)rr(dot)com> writes:
> is there a reason why GEQO is off by
> default in the ODBC driver (postdrv.exe)?

There may once have been a good reason for that, but it sounds like a
mighty bad idea nowadays.

AFAICT ODBC's default setting has been that way for as long as ODBC has
been in our CVS tree, so no way to know who chose to do that, when, or
why.

regards, tom lane


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Ed Loehr <eloehr(at)austin(dot)rr(dot)com>
Cc: pghackers <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: GEQO status?
Date: 2000-12-23 00:30:40
Message-ID: Pine.LNX.4.30.0012230122210.1505-100000@peter.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Ed Loehr writes:

> What is the status of the genetic algorithm query optimizer? Is this
> supposed to work well on many-table joins, or has it fallen out of favor
> or in disrepair? [I'm needing to optimize some large, many-table-join
> queries and wondering time spent configuring/understanding geqo would be
> fruitful...]

I've seen a number of bug reports that would indicate to me the GEQO works
less than perfectly. I vividly recall how, while working on my own code,
mere additions of dummy clauses like '... AND 5=5' altered query results
in seemingly random ways. That was admittedly quite a while ago, but the
GEQO code hasn't changed since. I'd advise you to be *very* careful.

--
Peter Eisentraut peter_e(at)gmx(dot)net http://yi.org/peter-e/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Ed Loehr <eloehr(at)austin(dot)rr(dot)com>, pghackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GEQO status?
Date: 2000-12-23 00:51:18
Message-ID: 22156.977532678@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> I've seen a number of bug reports that would indicate to me the GEQO works
> less than perfectly. I vividly recall how, while working on my own code,
> mere additions of dummy clauses like '... AND 5=5' altered query results
> in seemingly random ways.

The choices made by GEQO are intentionally random, so I would expect
variation in tuple output order even for repetitions of the identical
query. If you got a semantically different result, that would indeed
be a bug. But it would most likely be a bug in the core planner, since
GEQO has essentially no influence over whether the produced plan is
correct or not. GEQO merely forces specific choices of join order.
All else is in the core planner.

> That was admittedly quite a while ago, but the
> GEQO code hasn't changed since.

The planner has changed quite markedly over the past couple releases,
so I don't put a lot of stock in old anecdotes. Let's see a test case.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ed Loehr <eloehr(at)austin(dot)rr(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pghackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GEQO status?
Date: 2000-12-23 01:00:19
Message-ID: 22196.977533219@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Ed Loehr <eloehr(at)austin(dot)rr(dot)com> writes:
> You can remove the randomness by setting the Seed configuration value,

True, but that's not the default setup.

regards, tom lane


From: Ed Loehr <eloehr(at)austin(dot)rr(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pghackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GEQO status?
Date: 2000-12-23 01:04:39
Message-ID: 3A43FA27.603534A2@austin.rr.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
>
> The choices made by GEQO are intentionally random, so I would expect
> variation in tuple output order even for repetitions of the identical
> query. If you got a semantically different result, that would indeed
> be a bug. But it would most likely be a bug in the core planner, since
> GEQO has essentially no influence over whether the produced plan is
> correct or not. GEQO merely forces specific choices of join order.
> All else is in the core planner.

You can remove the randomness by setting the Seed configuration value, if
the docs are correct.

Regards,
Ed Loehr