Re: Fix for seg picksplit function

From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix for seg picksplit function
Date: 2010-11-20 12:36:20
Message-ID: 4CE7C0C4.6070902@gmail.com
Lists: pgsql-hackers

On 2010-11-20 04:46, Robert Haas wrote:
> On Tue, Nov 16, 2010 at 6:07 AM, Alexander Korotkov
> <aekorotkov(at)gmail(dot)com> wrote:
>> On Tue, Nov 16, 2010 at 3:07 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> But on a broader note, I'm not very certain the sorting algorithm is
>>> sensible. For example, suppose you have 10 segments that are exactly
>>> '0' and 20 segments that are exactly '1'. Maybe I'm misunderstanding,
>>> but it seems like this will result in a 15/15 split when we almost
>>> certainly want a 10/20 split. I think there will be problems in more
>>> complex cases as well. The documentation says about the less-than and
>>> greater-than operators that "These operators do not make a lot of
>>> sense for any practical purpose but sorting."
>> To illustrate a real problem, we should think about GiST behavior
>> with a large enough amount of data. For example, I tried extrapolating
>> this case to 100000 segs, where 40% are (0,1) segs and 60% are (1,2)
>> segs, and this case doesn't seem to be a problem to me.
> Well, the problem with just comparing on < is that it takes very
> little account of the upper bounds. I think the cases where a simple
> split would hurt you the most are those where examining the upper
> bound is necessary to get a good split.
With the current 8K default block size, I'd put my money on the sorting
algorithm for any seg case. The R-tree algorithm's performance is
probably influenced more by the chain large buckets -> low tree depth ->
generic keys on non-leaf nodes.
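
To make the 10/20 example quoted above concrete, here is a toy sketch in
Python. It is not the seg picksplit code or the patch under discussion;
the helper names are hypothetical, and plain tuple ordering (lower bound
first, then upper bound) is assumed as a stand-in for seg ordering. It
only shows what "sort, then cut in the middle" does to 10 segments at
'0' and 20 segments at '1'.

```python
def split_sorted_in_middle(segments):
    """Sort (lower, upper) pairs and cut the sorted list in half.

    Hypothetical helper: a simplified model of a sort-based split,
    not PostgreSQL's implementation.
    """
    ordered = sorted(segments)  # tuples compare on lower bound, then upper
    mid = len(ordered) // 2
    return ordered[:mid], ordered[mid:]


def bounding_interval(group):
    """Smallest interval covering every segment in the group."""
    return (min(lo for lo, _ in group), max(hi for _, hi in group))


# 10 segments that are exactly '0' and 20 that are exactly '1'.
segments = [(0.0, 0.0)] * 10 + [(1.0, 1.0)] * 20

left, right = split_sorted_in_middle(segments)

print(len(left), len(right))        # 15 15
print(bounding_interval(left))      # (0.0, 1.0)
print(bounding_interval(right))     # (1.0, 1.0)
```

In this toy model the left half receives five of the '1' segments, so
its bounding interval grows to cover both values and touches the right
half's interval. A 10/20 split would leave the two intervals disjoint.
How much this matters at realistic page fill with 8K blocks is exactly
the question being debated here; the sketch does not settle it.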

regards,
Yeb Havinga
