Re: SDP query optimizer

Lists: pgsql-hackers
From: Adriano Lange <alange0001(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: SDP query optimizer
Date: 2013-03-22 23:35:43
Message-ID: 514CEACF.1070402@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi all,

I have developed a new query optimizer for PostgreSQL and I would like
to share it with the community. The optimizer's name is Sampling and
Dynamic Programming (SDP). I put it into a plugin developed some years
ago, named LJQO:

https://github.com/alange0001/ljqo.git

This plugin was configured to compile only against PostgreSQL 9.2.
However, I guess it may be easily adjusted for other versions of PostgreSQL.

I would be glad for any feedback about SDP or even about LJQO.

I have some numbers about the SDP in comparison with GEQO. If
interested, see a diff between the two ".out2" files attached. The
schema and query are from a previous email posted by Andres Freund in
this list.

Thanks for attention.
Adriano Lange

Attachment Content-Type Size
query_2_geqo.out2 text/plain 72.9 KB
query_2_sdp_1.out2 text/plain 72.8 KB
query_2.sql text/x-sql 2.2 KB
schema.sql text/x-sql 6.3 KB
views.sql text/x-sql 55.2 KB

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SDP query optimizer
Date: 2013-03-23 00:22:28
Message-ID: 514CF5C4.2090603@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Adriano,

> I have developed a new query optimizer for PostgreSQL and I would like
> to share it with the community. The optimizer's name is Sampling and
> Dynamic Programming (SDP). I put it into a plugin developed some years
> ago, named LJQO:

Woah! Way cool.

As a warning, we're in the closing throes of version 9.3 right now, so
if you code/ideas doesn't get the attention it deserves, that's why.

There is an incomplete project from a few years back to make the
non-exhaustive query planner pluggable so that we could use different
algorithms. Unfortunately, it was never finished and merged with the
core code. Your planner is yet another reason it would be great to
complete this.

Keep up the good work!

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Ants Aasma <ants(at)cybertec(dot)at>
To: Adriano Lange <alange0001(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SDP query optimizer
Date: 2013-03-23 00:46:20
Message-ID: CA+CSw_vFE1_+Y0d2Pij10XcQg30kw3vyddyJayw4G8hYNhs3=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Mar 23, 2013 at 1:35 AM, Adriano Lange <alange0001(at)gmail(dot)com> wrote:
> I have developed a new query optimizer for PostgreSQL and I would like to
> share it with the community. The optimizer's name is Sampling and Dynamic
> Programming (SDP). I put it into a plugin developed some years ago, named
> LJQO:
>
> https://github.com/alange0001/ljqo.git

Looks great. Do you happen to have any papers or insight into why SDP
works as well as it does? It isn't immediately clear to me why the
cheapest left-deep tree is a good heuristic start point for the
dynamic programming phase.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


From: Adriano Lange <alange0001(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SDP query optimizer
Date: 2013-03-23 10:17:53
Message-ID: 514D8151.2040809@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 22-03-2013 21:22, Josh Berkus wrote:
> Woah! Way cool.
>
> As a warning, we're in the closing throes of version 9.3 right now, so
> if you code/ideas doesn't get the attention it deserves, that's why.

Ok. No problem. :-)

> There is an incomplete project from a few years back to make the
> non-exhaustive query planner pluggable so that we could use different
> algorithms. Unfortunately, it was never finished and merged with the
> core code. Your planner is yet another reason it would be great to
> complete this.

Yes. I looked at the Julius and Tomas' project in pgFoundry [1] some
years ago, but it was inactive. Therefore, I decided to start a new one.

[1] - http://pgfoundry.org/projects/optimizer/

Anyway, good work for all of you.

--
Adriano Lange


From: Adriano Lange <alange0001(at)gmail(dot)com>
To: Ants Aasma <ants(at)cybertec(dot)at>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SDP query optimizer
Date: 2013-03-23 11:44:15
Message-ID: 514D958F.7060200@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 22-03-2013 21:46, Ants Aasma wrote:
> On Sat, Mar 23, 2013 at 1:35 AM, Adriano Lange <alange0001(at)gmail(dot)com> wrote:
>> I have developed a new query optimizer for PostgreSQL and I would like to
>> share it with the community. The optimizer's name is Sampling and Dynamic
>> Programming (SDP). I put it into a plugin developed some years ago, named
>> LJQO:
>>
>> https://github.com/alange0001/ljqo.git
>
> Looks great. Do you happen to have any papers or insight into why SDP
> works as well as it does? It isn't immediately clear to me why the
> cheapest left-deep tree is a good heuristic start point for the
> dynamic programming phase.

Hi Ants Aasma,

I think it is difficult to say that SDP is completely better than GEQO.
There is a huge variety of situations where these kinds of optimizers
can be used.

I have made a series of experiments on SDP in a separate environment.
The main objective was to parallelize the optimizer. Thus, I called it
PSDP. I developed this environment from scratch in c++ and I used a
simplified version of the cost model used by PostgreSQL. Therefore, the
SDP is a sequential version of an experimental parallel optimizer.

For PSDP in its experimental environment, I have a comparison between
the costs obtained by the sampling phase and the costs obtained by the
complete PSDP. See the figure grafico-scaled_costs-dp_psdp attached. I
think that the dynamic programming phase can bring some robustness to
the optimizer. However, I still need to perform these same experiments
on PostgreSQL.

Considering the heuristic used in sampling phase, I have made another
series of experiments on PostgreSQL 9.0 comparing this sampling
(Sampling-L) with both a sampling on bushy trees (Sampling-B) and the
sampling used by GEQO. See as example the file "sampling...eps" also
attached. By these experiments I observed that Sampling-L was
statistically more efficient to get good join orders in a short time.

Regards,
Adriano Lange

Attachment Content-Type Size
grafico-scaled_costs-dp_psdp.eps image/x-eps 25.4 KB
sampling:chain,100,12,14,2:freq.eps image/x-eps 24.8 KB

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Adriano Lange <alange0001(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SDP query optimizer
Date: 2013-03-23 13:15:38
Message-ID: 20130323131538.GC12686@alap2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2013-03-22 20:35:43 -0300, Adriano Lange wrote:
> Hi all,
>
> I have developed a new query optimizer for PostgreSQL and I would like to
> share it with the community. The optimizer's name is Sampling and Dynamic
> Programming (SDP). I put it into a plugin developed some years ago, named
> LJQO:
>
> https://github.com/alange0001/ljqo.git
>
> This plugin was configured to compile only against PostgreSQL 9.2. However,
> I guess it may be easily adjusted for other versions of PostgreSQL.
>
> I would be glad for any feedback about SDP or even about LJQO.
>
> I have some numbers about the SDP in comparison with GEQO. If interested,
> see a diff between the two ".out2" files attached. The schema and query are
> from a previous email posted by Andres Freund in this list.

I just want to mention that unless you skew the statistics for the individual
tables from their empty/default state this mostly measures a pretty degenerate
case where optima are very rare and not very differentiated. Thats a useful
thing to test, but not to have as the target to optimize for.
So it might be interesting to run that thing with some table
stats/contents stats set up.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Adriano Lange <alange0001(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SDP query optimizer
Date: 2013-03-23 17:51:18
Message-ID: 514DEB96.1010806@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 23-03-2013 10:15, Andres Freund wrote:
> I just want to mention that unless you skew the statistics for the individual
> tables from their empty/default state this mostly measures a pretty degenerate
> case where optima are very rare and not very differentiated. Thats a useful
> thing to test, but not to have as the target to optimize for.
> So it might be interesting to run that thing with some table
> stats/contents stats set up.

Yes, the search space obtained from this experiment may be very simpler
than a real case. Beyond this experiment, I can construct classical
queries used to evaluate this kind of algorithm, as stars, cliques,
chains and cycles. Beyond these queries I have no idea how can I further
test it.

Regards,

Adriano Lange