Re: SDP query optimizer

From: Adriano Lange <alange0001(at)gmail(dot)com>
To: Ants Aasma <ants(at)cybertec(dot)at>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SDP query optimizer
Date: 2013-03-23 11:44:15
Message-ID: 514D958F.7060200@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 22-03-2013 21:46, Ants Aasma wrote:
> On Sat, Mar 23, 2013 at 1:35 AM, Adriano Lange <alange0001(at)gmail(dot)com> wrote:
>> I have developed a new query optimizer for PostgreSQL and I would like to
>> share it with the community. The optimizer's name is Sampling and Dynamic
>> Programming (SDP). I put it into a plugin developed some years ago, named
>> LJQO:
>>
>> https://github.com/alange0001/ljqo.git
>
> Looks great. Do you happen to have any papers or insight into why SDP
> works as well as it does? It isn't immediately clear to me why the
> cheapest left-deep tree is a good heuristic start point for the
> dynamic programming phase.

Hi Ants Aasma,

I think it is difficult to say that SDP is completely better than GEQO.
There is a huge variety of situations where these kinds of optimizers
can be used.

I have made a series of experiments on SDP in a separate environment.
The main objective was to parallelize the optimizer. Thus, I called it
PSDP. I developed this environment from scratch in c++ and I used a
simplified version of the cost model used by PostgreSQL. Therefore, the
SDP is a sequential version of an experimental parallel optimizer.

For PSDP in its experimental environment, I have a comparison between
the costs obtained by the sampling phase and the costs obtained by the
complete PSDP. See the figure grafico-scaled_costs-dp_psdp attached. I
think that the dynamic programming phase can bring some robustness to
the optimizer. However, I still need to perform these same experiments
on PostgreSQL.

Considering the heuristic used in sampling phase, I have made another
series of experiments on PostgreSQL 9.0 comparing this sampling
(Sampling-L) with both a sampling on bushy trees (Sampling-B) and the
sampling used by GEQO. See as example the file "sampling...eps" also
attached. By these experiments I observed that Sampling-L was
statistically more efficient to get good join orders in a short time.

Regards,
Adriano Lange

Attachment Content-Type Size
grafico-scaled_costs-dp_psdp.eps image/x-eps 25.4 KB
sampling:chain,100,12,14,2:freq.eps image/x-eps 24.8 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ants Aasma 2013-03-23 12:02:41 Re: Page replacement algorithm in buffer cache
Previous Message Adriano Lange 2013-03-23 10:17:53 Re: SDP query optimizer