Re: GIN improvements part 1: additional information

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN improvements part 1: additional information
Date: 2013-11-15 06:49:37
Message-ID: CAPpHfdsj-hg9knfD+FXPS-+330qjXZKiuR-_a15EFCKiNVi7pQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 14, 2013 at 3:00 PM, Alexander Korotkov <aekorotkov(at)gmail(dot)com>wrote:

> On Thu, Nov 14, 2013 at 2:17 PM, Alexander Korotkov <aekorotkov(at)gmail(dot)com>wrote:
>
>> On Tue, Nov 5, 2013 at 9:49 PM, Heikki Linnakangas <
>> hlinnakangas(at)vmware(dot)com> wrote:
>>
>>> On 04.11.2013 23:44, Alexander Korotkov wrote:
>>>
>>>> On Mon, Oct 21, 2013 at 11:12 PM, Alexander Korotkov
>>>> <aekorotkov(at)gmail(dot)com>wrote:
>>>>
>>>> Attached version of patch is debugged. I would like to note that
>>>>> number of
>>>>> bugs was low and it wasn't very hard to debug. I've rerun tests on it.
>>>>> You
>>>>> can see that numbers are improved as the result of your refactoring.
>>>>>
>>>>> event | period
>>>>> -----------------------+-----------------
>>>>> index_build | 00:01:45.416822
>>>>> index_build_recovery | 00:00:06
>>>>> index_update | 00:05:17.263606
>>>>> index_update_recovery | 00:01:22
>>>>> search_new | 00:24:07.706482
>>>>> search_updated | 00:26:25.364494
>>>>> (6 rows)
>>>>>
>>>>> label | blocks_mark
>>>>> ----------------+-------------
>>>>> search_new | 847587636
>>>>> search_updated | 881210486
>>>>> (2 rows)
>>>>>
>>>>> label | size
>>>>> ---------------+-----------
>>>>> new | 419299328
>>>>> after_updates | 715915264
>>>>> (2 rows)
>>>>>
>>>>> Beside simple bugs, there was a subtle bug in incremental item indexes
>>>>> update. I've added some more comments including ASCII picture about how
>>>>> item indexes are shifted.
>>>>>
>>>>> Now, I'm trying to implement support of old page format. Then we can
>>>>> decide which approach to use.
>>>>>
>>>>>
>>>> Attached version of patch has support of old page format. Patch still
>>>> needs
>>>> more documentation and probably refactoring, but I believe idea is clear
>>>> and it can be reviewed. In the patch I have to revert change of null
>>>> category placement for compatibility with old posting list format.
>>>>
>>>
>>> Thanks, just glanced at this quickly.
>>>
>>> If I'm reading the patch correctly, old-style non-compressed posting
>>> tree leaf pages are not vacuumed at all; that's clearly not right.
>>>
>>
>> Fixed. Now separate function handles uncompressed posting lists and
>> compress them if as least one TID is deleted.
>>
>>
>>> I argued earlier that we don't need to handle the case that compressing
>>> a page makes it larger, so that the existing items don't fit on the page
>>> anymore. I'm having some second thoughts on that; I didn't take into
>>> account the fact that the "mini-index" in the new page format takes up some
>>> space. I think it's still highly unlikely that there isn't enough space on
>>> a 8k page, but it's not totally crazy with a non-standard small page size.
>>> So at least that situation needs to be checked for with an ereport(),
>>> rather than Assert.
>>>
>>
>> Now this situation is ereported before any change in page.
>>
>> To handle splitting a non-compressed page, it seems that you're relying
>>> on the fact that before splitting, we try to insert, which compresses the
>>> page. The problem with that is that it's not correctly WAL-logged. The
>>> split record that follows includes a full copy of both page halves, but if
>>> the split fails for some reason, e.g you run out of disk space, there is no
>>> WAL record at all of the the compression. I'd suggest doing the compression
>>> in the insert phase on a temporary copy of the page, leaving the original
>>> page untouched if there isn't enough space for the insertion to complete.
>>> (You could argue that this case can't happen because the compression must
>>> always create enough space to insert one more item. maybe so, but at least
>>> there should be an explicit check for that.)
>>
>>
>> Good point. Yes, if we don't handle specially inserting item indexes, I
>> see no point to do special handling for single TID which is much smaller.
>> In the attached patch dataCompressLeafPage just reserves space for single
>> TID.
>>
>> Also, many changes in comments and README.
>>
>> Unfortunally, I didn't understand what is FIXME about in
>> ginVacuumEntryPage. So, it's not fixed :)
>>
>
> Sorry, I posted buggy version of patch. Attached version is correct.
>

Another bug fix by report from Rod Taylor.

------
With best regards,
Alexander Korotkov.
>
>

Attachment Content-Type Size
gin-packed-postinglists-16.patch.gz application/x-gzip 28.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2013-11-15 06:51:00 Re: GIN improvements part2: fast scan
Previous Message Sameer Thakur 2013-11-15 06:40:00 Re: pg_stat_statements: calls under-estimation propagation