From: | Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, Antonin Houska <ah(at)cybertec(dot)at>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] [PATCH] Incremental sort |
Date: | 2018-04-09 15:56:10 |
Message-ID: | CAPpHfdt14t5fsLQ2Pd5m2WQcRZHaHHPNyx9Fko6x_4Uk-Buq4g@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Apr 7, 2018 at 11:57 PM, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:
> On 04/07/2018 06:23 PM, Tom Lane wrote:
> > Teodor Sigaev <teodor(at)sigaev(dot)ru> writes:
> >>> I dunno, how would you estimate whether this is actually a win or not?
> >>> I don't think our model of sort costs is anywhere near refined enough
> >>> or accurate enough to reliably predict whether this is better than
> >>> just doing it in one step. Even if the cost model is good, it's not
> >>> going to be better than our statistics about the number/size of the
> >>> groups in the first column(s), and that's a notoriously unreliable
> stat.
> >
> >> I think that improvement in cost calculation of sort should be a
> >> separate patch, not directly connected to this one. Postpone patches
> >> till other part will be ready to get max improvement for postponed ones
> >> doesn't seem to me very good, especially if it suggests some improvement
> >> right now.
> >
> > No, you misunderstand the point of my argument. Without a reasonably
> > reliable cost model, this patch could easily make performance *worse*
> > not better for many people, due to choosing incremental-sort plans
> > where they were really a loss.
> >
>
> Yeah. Essentially the patch could push the planner to pick a path that
> has low startup cost (and very high total cost), assuming it'll only
> need to read small part of the input. But if the number of groups in the
> input is low (perhaps just one huge group), that would be a regression.
>
Yes, I think the biggest risk here is too small number of groups. More
precisely the risk is too large groups while total number of groups might
be large enough.
> If we were at the start of a development cycle and work were being
> > promised to be done later in the cycle to improve the planning aspect,
> > I'd be more charitable about it. But this isn't merely the end of a
> > cycle, it's the *last day*. Now is not the time to commit stuff that
> > needs, or even just might need, follow-on work.
> >
>
> +1 to that
>
> FWIW I'm willing to spend some time on the patch for PG12, particularly
> on the planner / costing part. The potential gains are too interesting
>
Thank you very much for your efforts on reviewing this patch.
I'm looking forward work with you on this feature for PG12.
FWIW, I think that we're moving this patch in the right direction by
providing separate paths for incremental sort. It's much better than
deciding between full or incremental sort in-place. For sure, considereing
extra paths might cause planning time regression. But I think the
same statement is true about many other planning optimizations.
One thing be can do is to make enable_incrementalsort = off by
default. Then only users who understand improtance of incremental
sort will get extra planning time.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2018-04-09 16:06:43 | Re: [HACKERS] Runtime Partition Pruning |
Previous Message | Greg Stark | 2018-04-09 15:29:36 | Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS |