Re: [PATCH] Incremental sort (was: PoC: Partial sort)

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Shaun Thomas <shaun(dot)thomas(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andreas Karlsson <andreas(at)proxel(dot)se>
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date: 2020-04-08 15:57:03
Message-ID: 20200408155703.evmgiyjeffyhklui@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 08, 2020 at 11:13:26AM -0400, James Coleman wrote:
>On Wed, Apr 8, 2020 at 11:02 AM Tomas Vondra
><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>
>> On Wed, Apr 08, 2020 at 04:08:39PM +0200, Tomas Vondra wrote:
>> >On Wed, Apr 08, 2020 at 09:54:42AM -0400, James Coleman wrote:
>> >>On Wed, Apr 8, 2020 at 9:43 AM Tomas Vondra
>> >><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> >>>
>> >>>On Wed, Apr 08, 2020 at 12:51:05PM +0200, Tomas Vondra wrote:
>> >>>>On Tue, Apr 07, 2020 at 11:54:23PM -0400, Tom Lane wrote:
>> >>>>>hyrax is not too happy with this test:
>> >>>>>
>> >>>>>https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hyrax&dt=2020-04-07%2004%3A55%3A15
>> >>>>>
>> >>>>>It's not too clear to me why CLOBBER_CACHE_ALWAYS would be breaking
>> >>>>>EXPLAIN output, but it evidently is.
>> >>>>>
>> >>>>
>> >>>>Thanks, I'll investigate. It's not clear to me either what might be
>> >>>>causing this, but I guess something must have gone wrong in
>> >>>>estimation/planning.
>> >>>>
>> >>>
>> >>>OK, I know what's going on - it's a rather embarassing issue in the
>> >>>regression test. There's no analyze on the test tables, so it uses
>> >>>default estimates for number of groups etc. But with clobber cache the
>> >>>test runs long enough for autoanalyze to kick in and collect stats, so
>> >>>we generate better estimates which changes the plan.
>> >>>
>> >>>I'll get this fixed - explicit analyze and tweaking the data a bit
>> >>>should do the trick.
>> >>
>> >>Looking at the tests that failed, I think we should consider just adding:
>> >>set enable_sort = off;
>> >>because several of those tests have very specific amounts of data to
>> >>ensure we test the transition points around the different modes in the
>> >>incremental sort node.
>> >>
>> >
>> >Maybe, but I'd much rather tweak the data so that we test both the
>> >costing and execution part.
>> >
>>
>> I do think this does the trick by increasing the number of rows a bit
>> (from 100 to 1000) to make the Sort more expensive than Incremental
>> Sort, while still testing the transition points.
>>
>> James, can you verify it that's still true?
>
>Those changes all look good to me from a "testing correctness" POV.
>Also I like that we now test multiple sort methods in the explain
>output, like: "Sort Methods: top-N heapsort, quicksort".
>

OK, good. I'll push the fix.

>I personally find the `i/100` notation harder to read than a case, but
>that's just an opinion...
>

Yeah, but with 1000 rows we'd need a more complex CASE statement (I
don't think simply having two groups - small+large would work).

>Should we change `analyze` to `analyze t` to avoid unnecessarily
>re-analyzing all other tables in the regression db?
>

Ah, definitely. That was a mistake. Thanks for noticing.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2020-04-08 15:57:08 Re: Vacuum o/p with (full 1, parallel 0) option throwing an error
Previous Message David Steele 2020-04-08 15:49:11 Re: Function to track shmem reinit time