Re: GIN improvements part 1: additional information

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN improvements part 1: additional information
Date: 2014-01-24 08:03:04
Message-ID: CAPpHfduaLDYjXRw8q5+Fnjbe_3SgvKZbNHXJ952VbFLi5H6HDQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 23, 2014 at 9:27 AM, Alexander Korotkov <aekorotkov(at)gmail(dot)com>wrote:

> On Wed, Jan 22, 2014 at 9:28 PM, Heikki Linnakangas <
> hlinnakangas(at)vmware(dot)com> wrote:
>
>> On 01/22/2014 02:17 PM, Alexander Korotkov wrote:
>>
>>> We already spent a lot of time with compression. Now we need to figure
>>> out
>>> the result we want see. I spent quite long time debugging varbyte
>>> encoding
>>> without segments. Finally, it passed very many tests without any
>>> problems.
>>> Now, it is just piece of junk. I'm afraid that we will have to
>>> reimplement
>>> everything from scratch another two or three times because code doesn't
>>> look perfect. For sure, it's incompatible with getting something into
>>> 9.4.
>>>
>>
>> That's a bit harsh :-). But yes, I understand what you're saying. It's
>> quite common for large patches like this to be rewritten several times
>> before being committed; you just don't know what works best until you've
>> tried many designs.
>>
>>
>> Remember we have also subsequent fast-scan which is very needed for
>>> hstore
>>> and other application.
>>> I propose to do final decisions now and concentrate forces on making
>>> committable patch with these decisions. And don't change these decisions
>>> even if somebody have idea how to improve code readability in 100 times
>>> and
>>> potential extendability in 1000 times.
>>> I propose following decisions:
>>> 1) Leave uncompressed area, allow duplicates in it
>>> 2) Don't introduce Items on page.
>>>
>>
>> Well, I got the insertions to work now without the uncompressed area, so
>> in the spirit of Just Getting This Damn Patch Committed, I'm going to go
>> ahead with that. We can add the uncompressed area later if performance
>> testing shows it to be necessary. And I agree about the Items on page idea
>> - we can come back to that too still in 9.4 timeframe if necessary, but
>> probably not.
>>
>> So, committed. It's the same design as in the most recent patch I posted,
>> but with a bunch of bugs fixed, and cleaned up all over. I'm going to move
>> to the fast scan patch now, but please do test and review the committed
>> version to see if I missed something.
>
>
> Great! Thanks a lot!
> Assertion in dataPlaceToPageLeafRecompress is too strong. Page can contain
> GinDataLeafMaxContentSize bytes. Patch is attached.
> My test-suite don't run correctly. I'm debugging now.
>

ITSM I found this bug. ginVacuumPostingTreeLeaf re-encodes only some
segments. Others are not even re-palloced. They are moved left
in dataPlaceToPageLeafRecompress by memcpy. But it's incorrect to with
memcpy, proper solution is memmove. Using memcpy platform dependently can
lead to page corruption. Another solution is to re-palloc segments in
ginVacuumPostingTreeLeaf.

------
With best regards,
Alexander Korotkov.

Attachment Content-Type Size
gin-memmove-fix.patch application/octet-stream 611 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2014-01-24 08:33:32 Re: ALTER TABLE lock strength reduction patch is unsafe
Previous Message Dean Rasheed 2014-01-24 07:47:33 Re: [PATCH] Negative Transition Aggregate Functions (WIP)