Re: [PATCH] Incremental sort (was: PoC: Partial sort)

From: James Coleman <jtc331(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Shaun Thomas <shaun(dot)thomas(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Andreas Karlsson <andreas(at)proxel(dot)se>
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date: 2020-04-19 13:46:55
Message-ID: CAAaqYe829tk-mQ_ePn1zZqKTcocd9LVDnv8mxZ1UufNq034qhA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Apr 18, 2020 at 10:36 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>
> On Tue, Apr 07, 2020 at 10:53:05AM -0500, Justin Pryzby wrote:
> > On Tue, Apr 07, 2020 at 08:40:30AM -0400, James Coleman wrote:
> > > > And, should it use two spaces before "Sort Method", "Memory" and "Pre-sorted
> > ...
> > > I read through that subthread, and the ending seemed to be Peter
> > > wanting things to be unified. Was there a conclusion beyond that?
> >
> > This discussion is ongoing. I think let's wait until that's settled before
> > addressing this more complex and even newer case. We can add "explain, two
> > spaces and equals vs colon" to the "Open items" list if need be - I hope the
> > discussion will not delay the release.
>
> The change proposed on the WAL thread is minimal, and makes new explain(WAL)
> output consistent with the that of explain(BUFFERS).
>
> That uses a different format from "Sort", which is what incremental sort should
> follow. (Hashjoin also uses the Sort's format of two-spaces and colons rather
> than equals).

I think it's not great that buffers/sort are different, but I agree
that we should follow sort.

> So the attached 0001 makes explain output for incremental sort more consistent
> with sort:
>
> - Two spaces;
> - colons rather than equals;
> - Don't use semicolon, which isn't in use anywhere else;
>
> I tested with this:
> template1=# DROP TABLE t; CREATE TABLE t(i int, j int); INSERT INTO t SELECT a-(a%100), a%1000 FROM generate_series(1,99999)a; CREATE INDEX ON t(i); VACUUM VERBOSE ANALYZE t;
> template1=# explain analyze SELECT * FROM t a ORDER BY i,j;
> ...
> Full-sort Groups: 1000 Sort Method: quicksort Average Memory: 28kB Peak Memory: 28kB Pre-sorted Groups: 1000 Sort Method: quicksort Average Memory: 30kB Peak Memory: 30kB
>
> On Tue, Apr 07, 2020 at 05:34:15PM +0200, Tomas Vondra wrote:
> > On Tue, Apr 07, 2020 at 08:40:30AM -0400, James Coleman wrote:
> > > On Tue, Apr 7, 2020 at 12:25 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> > > > Should "Pre-sorted Groups:" be on a separate line ?
> > > > | Full-sort Groups: 1 Sort Method: quicksort Memory: avg=28kB peak=28kB Pre-sorted Groups: 1 Sort Method: quicksort Memory: avg=30kB peak=30kB
> > >
> > > I'd originally had that, but Tomas wanted it to be more compact. It's
> > > easy to adjust though if the consensus changes on that.
> >
> > I'm OK with changing the format if there's a consensus. The current
> > format seemed better to me, but I'm not particularly attached to it.
>
> I still think Pre-sorted groups should be on a separate line, as in 0002.
> In addition to looking better (to me), and being easier to read, another reason
> is that there are essentially key=>values here, but the keys are repeated (Sort
> Method, etc).

I collapsed this into 0001 because I think that if we're going to do
away with the key=value style then we effectively to have to do this
to avoid the repeated values being confusing (with key=value it kinda
worked, because that made it seem like the avg/peak were clearly a
subset of the Sort Groups info).

I also cleaned up the newline patch a bit in the process (we already
have a way to indent with a flag so don't need to do it directly).

> I also suggested to rename: s/Presorted/Pre-sorted/, but I didn't do that here.

I went ahead and did that too; we already use "Full-sort", so the
proposed change makes both parallel.

> Michael already patched most of the comment typos, the remainder I'm including
> here as a "nearby patch"..

Modified slightly.

James

Attachment Content-Type Size
v2-0002-comment-typos-Incremental-Sort.patch text/x-patch 3.9 KB
v2-0001-Fix-explain-output-for-incr-sort.patch text/x-patch 9.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2020-04-19 14:22:26 Re: WAL usage calculation patch
Previous Message Tom Lane 2020-04-19 13:37:08 Re: HEAPDEBUGALL is broken