Re: Memory usage during sorting

From: Greg Stark <stark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Memory usage during sorting
Date: 2012-03-18 15:56:27
Message-ID: CAM-w4HN8H264xvTH9Nw-ZxUNRECB_0qzcEeoaJoDgD6GxQTv9Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Mar 18, 2012 at 3:25 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> No, it's about reducing the number of comparisons needed to maintain
>>> the heap property.
>
>> That sounds very interesting.  I didn't know it was even theoretically
>> possible to do that.
>
> So has somebody found a hole in the n log n lower bound on the cost of
> comparison-based sorting?  I thought that had been proven pretty
> rigorously.

I think what he's describing is, for example, say you have a heap of
1,000 elements and that lets you do the entire sort in a single pass
-- i.e. it generates a single sorted run after the first pass.
Increasing the heap size to 2,000 will just mean doing an extra
comparison for each tuple to sift up. And increasing the heap size
further will just do more and more work for nothing. It's still nlogn
but the constant factor is slightly higher. Of course to optimize this
you would need to be able to see the future.

I always thought the same behaviour held for if the heap was larger
than necessary to do N merge passes. that is, making the heap larger
might generate fewer tapes but the still enough to require the same
number of passes. However trying to work out an example I'm not too
sure. If you generate fewer tapes then the heap you use to do the
merge is smaller so it might work out the same (or more likely, you
might come out ahead).

--
greg

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-03-18 16:13:23 Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)
Previous Message Tom Lane 2012-03-18 15:45:47 Re: Memory usage during sorting