From: | Adriano Lange <alange0001(at)gmail(dot)com> |
---|---|
To: | Ants Aasma <ants(at)cybertec(dot)at> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: SDP query optimizer |
Date: | 2013-03-23 11:44:15 |
Message-ID: | 514D958F.7060200@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 22-03-2013 21:46, Ants Aasma wrote:
> On Sat, Mar 23, 2013 at 1:35 AM, Adriano Lange <alange0001(at)gmail(dot)com> wrote:
>> I have developed a new query optimizer for PostgreSQL and I would like to
>> share it with the community. The optimizer's name is Sampling and Dynamic
>> Programming (SDP). I put it into a plugin developed some years ago, named
>> LJQO:
>>
>> https://github.com/alange0001/ljqo.git
>
> Looks great. Do you happen to have any papers or insight into why SDP
> works as well as it does? It isn't immediately clear to me why the
> cheapest left-deep tree is a good heuristic start point for the
> dynamic programming phase.
Hi Ants Aasma,
I think it is difficult to say that SDP is completely better than GEQO.
There is a huge variety of situations where these kinds of optimizers
can be used.
I have made a series of experiments on SDP in a separate environment.
The main objective was to parallelize the optimizer. Thus, I called it
PSDP. I developed this environment from scratch in c++ and I used a
simplified version of the cost model used by PostgreSQL. Therefore, the
SDP is a sequential version of an experimental parallel optimizer.
For PSDP in its experimental environment, I have a comparison between
the costs obtained by the sampling phase and the costs obtained by the
complete PSDP. See the figure grafico-scaled_costs-dp_psdp attached. I
think that the dynamic programming phase can bring some robustness to
the optimizer. However, I still need to perform these same experiments
on PostgreSQL.
Considering the heuristic used in sampling phase, I have made another
series of experiments on PostgreSQL 9.0 comparing this sampling
(Sampling-L) with both a sampling on bushy trees (Sampling-B) and the
sampling used by GEQO. See as example the file "sampling...eps" also
attached. By these experiments I observed that Sampling-L was
statistically more efficient to get good join orders in a short time.
Regards,
Adriano Lange
Attachment | Content-Type | Size |
---|---|---|
grafico-scaled_costs-dp_psdp.eps | image/x-eps | 25.4 KB |
sampling:chain,100,12,14,2:freq.eps | image/x-eps | 24.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Ants Aasma | 2013-03-23 12:02:41 | Re: Page replacement algorithm in buffer cache |
Previous Message | Adriano Lange | 2013-03-23 10:17:53 | Re: SDP query optimizer |