Re: WIP: Fast GiST index build

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Fast GiST index build
Date: 2011-08-12 15:04:55
Message-ID: CA+TgmoZoON6aAE3NB9=GF9oAdF=Kkz21+MmoE5P_kA9aKN-OQA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 11, 2011 at 6:21 AM, Alexander Korotkov
<aekorotkov(at)gmail(dot)com> wrote:
> [ new patch ]

Some random comments:

- It appears that the "noFollowFight" flag is really supposed to be
called "noFollowRight".

- In gist_private.h you've written "halt-filled" where you really mean
"half-filled".

- It seems like you're using reloptions to set parameters that are
only going to do anything at index creation time. IIUC, "BUFFERING",
"LEVELSTEP" and "BUFFERSIZE" have no permanent meaning for that index;
they're just used ephemerally while constructing it. If we're going
to expose such things as options, maybe they should be GUCs, not
reloptions.

- Function names should begin with "gist" or some other, appropriate
prefix, especially if they are non-static. decreasePathRefcount(),
getNodeBuffer(), relocateBuildBuffersOnSplit(), adn
getNodeBufferBusySize() violate this rule, and it might be good to
change the static functions to follow it, too, just for consistency,
and to avoid renaming things if something that's currently static
later needs to be made non-static.

- validateBufferOption needs to use ereport(), not elog().

- This needs a bit of attention:

+ /* TODO: Write the WAL record */
+ if (RelationNeedsWAL(state->r))
+ recptr = gistXLogSplit(state->r->rd_node,
blkno, is_leaf,
+ dist,
oldrlink, oldnsn, InvalidBuffer, true);
+ else
+ recptr = GetXLogRecPtrForTemp();
+

I don't think the comment matches the code, since gistXLogSplit() does
in fact call XLogInsert(). Also, you should probably move the
RelationNeedsWAL() test inside gistXLogSplit(). Otherwise, every
single caller of gistXLogSplit() is going to need to do the same
dance.

- In gistBufferingPlaceToPage, you've got a series of loops that look like this:

+ for (ptr = dist; ptr; ptr = ptr->next)

The first loop allocates a bunch of buffers. The second loop sets up
downlink pointers. Then there's some other code. Then there's a
third loop, which adds items to each page in turn and sets up right
links. Then there's a fourth loop, which marks all those buffers
dirty. Then you write XLOG. Then there's a fifth loop, which sets
all the LSNs and TLIs, and a sixth loop, which does
UnlockReleaseBuffer() on each valid buffer in the list. All of this
seems like it could be simplified. In particular, the third and
fourth loops can certainly be merged - you should set the dirty bit at
the same time you're adding items to the page. And the fifth and
sixth loops can also be merged. You certainly don't need to set all
the LSNs and TLIs before releasing any of the buffer locks & pins.
I'm not sure if there's any more merging that can be done than that,
but you might want to have a look.

I'm also wondering how long this linked list can be. It's not good to
have large numbers of buffers locked for a long period of time. At
the very least, some comments are in order here.

Another general comment about this function is that it seems like it
is backwards. The overall flow of the function is:

if (is_split)
{
/* complicated stuff */
}
else
{
/* simple case */
}

It seems like it might be better to flip that around and do this:

if (!is_split)
{
/* simple case */
return result;
}
/* complicated stuff */

It's easier to read and avoids having the indentation get too deep.

- As I look at this more, I see that a lot of the logic in
gistBufferingBuildPlaceToPage is copied from gistplacetopage(). It
would be nice to move the common bits to common subroutines that both
functions can call, instead of duplicating the code.

- On a related note, gistBufferingBuildPlaceToPage needs to do
START_CRIT_SECTION and END_CRIT_SECTION at appropriate points in the
sequence, as gistplacetopage() does.

- gistFindCorrectParent() seems to rely heavily on the assumption that
there's no concurrent activity going on in this index. Otherwise,
it's got to be unsafe to release the buffer lock before using the
answer the function computes. Some kind of comment seems like it
would be a good idea.

- On a more algorithmic note, I don't really understand why we attach
buffers to all pages on a level or none of them. If it isn't
necessary to have buffers on every internal page in the tree, why do
we have them on every other level or every third level rather than,
say, creating them on the fly in whatever parts of the tree end up
busy?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-08-12 15:13:57 Re: index-only scans
Previous Message Robert Haas 2011-08-12 14:57:47 Re: bgwriter and checkpoints