Re: gincostestimate

From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: gincostestimate
Date: 2010-07-30 17:19:43
Message-ID: 4C5309AF.9030100@wulczer.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

OK, here's a review, as much as I was able to do it without
understanding deeply how GIN works.

The patch is context, applies cleanly to HEAD, compiles without warnings
and passes regression tests.

Using the script from
http://archives.postgresql.org/pgsql-performance/2009-10/msg00393.php I
was able to get an index scan with commonterm40, while with the
unpatched source I was getting an index scan only for commonterm20, so
it indeed improves the situation as far as cost estimation is concerned.

Codewise I have one question: the patch changes a loop in
ginvacuumcleanup from

for (blkno = GIN_ROOT_BLKNO + 1; blkno < npages; blkno++)

to

for (blkno = GIN_ROOT_BLKNO; blkno < npages; blkno++)

why should it now go through all blocks? I couldn't immediately see why
was it not OK to do it before and why is it OK to do it now, but I don't
really get how GIN works internally. I guess a comment would be good to
have there in any case.

The patch has lots of statements like if ( GinPageIsLeaf(page) ), that
is with extra space between the outer parenthesis and the condition,
which AFAIK is not the project style. I guess pgindent fixes that, so
it's no big deal.

There are lines with elog(NOTICE) that are #if 0, they probably should
either become elog(DEBUGX) or get removed.

As for performance, I tried running the attached script a couple of
times. I used the standard config file, only changed checkpoint_segments
to 30 and shared_buffers to 512MB. The timings were:

HEAD

INSERT 0 500000
Time: 13487.450 ms
VACUUM
Time: 337.673 ms

INSERT 0 500000
Time: 13751.110 ms
VACUUM
Time: 315.812 ms

INSERT 0 500000
Time: 12691.259 ms
VACUUM
Time: 312.320 ms

HEAD + gincostestimate

INSERT 0 500000
Time: 13961.894 ms
VACUUM
Time: 355.798 ms

INSERT 0 500000
Time: 14114.975 ms
VACUUM
Time: 341.822 ms

INSERT 0 500000
Time: 13679.871 ms
VACUUM
Time: 340.576 ms

so there is no immediate slowdown for a quick test with one client.

Since there was no stability or performance issues and it solves the
problem, I am marking this as ready for committer, although it might be
beneficial if someone more acquianted with GIN takes another look at it
before the committer review.

I will be travelling during the whole August and will only have
intermittent email access, so in case of any questions with regards to
review the respionse time can be a few days.

Cheers,
Jan

Attachment Content-Type Size
test.sql text/plain 210 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alex Hunsaker 2010-07-30 17:34:45 Re: patch for check constraints using multiple inheritance
Previous Message Andres Freund 2010-07-30 16:33:37 Re: merge command - GSoC progress