Re: GIN improvements part 1: additional information

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GIN improvements part 1: additional information
Date: 2013-09-17 08:58:42
Message-ID: CAPpHfduiTGZ+xGUPYXjckYVe2cuqx7SVaTsvjVJq5j0ywx6-NQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 16, 2013 at 11:43 AM, Heikki Linnakangas <
hlinnakangas(at)vmware(dot)com> wrote:

> On 15.09.2013 12:14, Alexander Korotkov wrote:
>
>> On Sat, Jun 29, 2013 at 12:56 PM, Heikki Linnakangas<
>> hlinnakangas(at)vmware(dot)com> wrote:
>>
>> There's a few open questions:
>>>
>>> 1. How are we going to handle pg_upgrade? It would be nice to be able to
>>> read the old page format, or convert on-the-fly. OTOH, if it gets too
>>> complicated, might not be worth it. The indexes are much smaller with the
>>> patch, so anyone using GIN probably wants to rebuild them anyway, sooner
>>> or
>>> later. Still, I'd like to give it a shot.
>>>
>>> 2. The patch introduces a small fixed 32-entry index into the packed
>>> items. Is that an optimal number?
>>>
>>> 3. I'd like to see some performance testing of insertions, deletions, and
>>> vacuum. I suspect that maintaining the 32-entry index might be fairly
>>> expensive, as it's rewritten on every update to a leaf page.
>>>
>>
>> It appears that maintaining 32-entry index is really expensive because it
>> required re-decoding whole page. This issue is fixed in attached version
>> of
>> patch by introducing incremental updating of that index. Benchmarks will
>> be
>> posted later today..
>>
>
> Great! Please also benchmark WAL replay; you're still doing
> non-incremental update of the 32-entry index in ginRedoInsert.
>

Yes

> A couple of quick comments after a quick glance (in addition to the above):
>
> The varbyte encoding stuff is a quite isolated piece of functionality. And
> potentially useful elsewhere, too. It would be good to separate that into a
> separate .c/.h files. I'm thinking of src/backend/access/gin/**packeditemptr.c,
> which would contain ginDataPageLeafReadItemPointer**,
> ginDataPageLeafWriteItemPointe**r and ginDataPageLeafGetItemPointerS**ize
> functions. A new typedef for varbyte-encoded things would probably be good
> too, something like "typedef char *PackedItemPointer". In the new .c file,
> please also add a file-header comment explaining the encoding.
>
> The README really needs to be updated. The posting tree page structure is
> a lot more complicated now, there needs to be a whole new section to
> explain it. A picture would be nice, similar to the one in bufpage.h.
>
> It's a bit funny that we've clumped together all different kinds of GIN
> insertions into one WAL record type. ginRedoInsert does completely
> different things depending on what kind of a page it is. And the
> ginXlogInsert struct also contains different kinds of things depending on
> what kind of an insertion it is. It would be cleaner to split it into
> three. (this isn't your patch's fault - it was like that before, too.)

Finally, I've debugged index update.

There are benchmark scripts attached which I used for testing. bench.sh is
doing following:
1) Switches ~/postgres to given branch, configures and compiles it
2) Initializes cluster, runs postgres, imports mailing lists archives which
could be downloaded from here:
http://mira.sai.msu.ru/~megera/tmp/pg-mailing/archives-9.1-main.dump.gz
3) Runs index creation measuring taken time and index size.
4) Runs set of index search queries measuring overall taken time and number
of used blocks. Queries was extracted from mailing lists web server logs.
So, they are real-life.
5) Runs huge updates and vacuums measuring overall taken time and final
index size.
6) Rerun set of queries.

The results of master branch:

Time taken by index build, update and search:
event | period
----------------+-----------------
index_build | 00:01:52.154299
index_update | 00:10:42.982413
search_new | 00:26:14.294872
search_updated | 00:27:06.10298
(4 rows)

Numbers of blocks used in search (not very representative, because it's
mostly consumed by heap fetches):
label | blocks_mark
----------------+-------------
search_new | 848156708
search_updated | 885122373
(2 rows)

Size of index newly created and after updates:
label | size
---------------+------------
new | 884514816
after_updates | 1595252736
(2 rows)

The results of packed posting lists branch.

Time taken by index build, update and search:
event | period
----------------+-----------------
index_build | 00:01:54.363988
index_update | 00:08:55.291099
search_new | 00:26:06.262403
search_updated | 00:27:07.501142
(4 rows)

Numbers of blocks used in search:
label | blocks_mark
----------------+-------------
search_new | 847591514
search_updated | 883928608
(2 rows)

Size of index newly created and after updates:
label | size
---------------+-----------
new | 421330944
after_updates | 718921728
(2 rows)

We can see there is no significant slowdown. Updates even becomes faster
probably because of reduced index size.

Now, I'm going to take care about WAL, refactoring and documentation.

------
With best regards,
Alexander Korotkov.

Attachment Content-Type Size
bench.tar.gz application/x-gzip 446.5 KB
gin-packed-postinglists-4.patch.gz application/x-gzip 19.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeevan Chalke 2013-09-17 08:58:44 Re: PL/pgSQL, RAISE and error context
Previous Message Thom Brown 2013-09-17 08:30:12 Re: Minmax indexes