Re: Stating the significance of Lehman & Yao in the nbtree README

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Stating the significance of Lehman & Yao in the nbtree README
Date: 2014-09-11 20:47:43
Message-ID: CAM3SWZSvV2YQkJR_hYzv14v_qpMGNu+HfVKrd-Wnp+KFZEmGfg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 9, 2014 at 12:01 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
>> + Lehman and Yao don't require read locks, but assume that in-memory
>> + copies of tree pages are unshared.

> This is the existing paragraph, just moved to different place in the README.

That's right - it seemed to make just as much sense to talk about that
here, while doing so establishes the context of talking about how what
we do differs from the canonical algorithm (which I think is true of
real-world L&Y implementations generally). Which makes the next
paragraph easier to understand:

>> + Although it could be argued that Lehman and Yao isn't followed to the
>> + letter because single pages are read locked as the tree is descended,
>> + this is at least necessary to support deletion, a common requirement
>> + which L&Y hardly acknowledge. Read locks also ensure that B-tree
>> + pages are self-consistent (L&Y appear to assume atomic page reads and
>> + writes).
>
> This is just duplicating the existing paragraph. I don't see the point of
> this.

The point is to make clear the reason for the differences - evidently,
Amit felt it was unclear why we don't follow L&Y. I am suggesting that
L&Y's talk of having no read locks is unrealistic (it might be
realistic in the 21st century, but that's a whole other story).

>> Even with these read locks, following L&Y obviates the need
>> + of earlier schemes to hold multiple locks concurrently when descending
>> + the tree as part of servicing index scans (pessimistic lock coupling).
>> + The addition of right-links at all levels, as well as the addition of
>> + a page "high key" allows detection and dynamic recovery from
>> + concurrent page splits (that is, splits between unlocking an internal
>> + page, and subsequently locking its child page during a descent).

> This explains in a few sentences what a L&Y B-tree looks like. The current
> README assumes that you're already familiar with what a L&Y tree looks like,
> or that you go read the paper mentioned at the top of the README. It might
> be a good idea to expand on that, and add an introduction like this so that
> an unfamiliar reader doesn't need to read the L&Y paper first. Is that the
> purpose of this patch? Please make it more explicit.

Yes, although even L&Y don't get around to telling the reader why they
should care, the mistake that many good papers make. We now know its
significance, both in general and to Postgres. Framing the discussion
like this aids understanding more than you'd think.

> And please make the
> sentences simpler - the above reads like a Shakespeare play.

Out, damned lock!

> Something like:
> The basic Lehman & Yao Algorithm
> ================================
>
> Compared to a classic B-tree, L&Y adds a right-link pointer to each page, to
> the page's right sibling. It also adds a "high key" to each page, which is
> an upper bound on the keys that are allowed on that page. These two
> additions make it possible detect a concurrent page split, which allows the
> tree to be searched without holding any read locks (except to keep a single
> page from being modified while reading it).
>
> When a search follows a downlink to a child page, it compares the page's
> high key with the search key. If the search key is greater than the high
> key, the page must've been split concurrently, and you must follow the
> right-link to find the new page containing the key range you're looking for.
> This might need to be repeated, if the page has been split more than once.
>
> Differences to the Lehman & Yao algorithm
> =========================================
>
> (current "Lehamn and Yao Algorithm and Insertions section)
>
>
>
> I think that's pretty much the same information you added, but it's in a
> separate section, with the clear purpose that it explains what a L&Y tree
> looks like. You can skip over it, if you have read the paper or are
> otherwise already familiar with it. It still assumes that you're familiar
> with B-trees in general.

That seems fair enough - I'd just expand on why we don't completely
avoid read locks, which L&Y suppose we can get away with. That is
clearly a point of confusion.

> Anyway, I see that you had resurrected this in the commitfest app after
> three weeks of inactivity. I'm going to mark this back to "Returned with
> Feedback". Please don't resurrect it again, this patch has received more
> than its fair share of attention.

I didn't mean to suggest that it deserves much attention. I didn't
know how to interpret the fact that you changed the status, since in
fact not much additional work was required. I was busy throughout, for
reasons that are perhaps obvious. I am not fussed about when this
happens, but I really think we should get around to it.

> Instead, please help by signing up to
> review a patch. The commitfest progress is glacial at the moment, and we
> really need experienced reviewers like you to get closure to people's
> patches.

I'm back from holidays now. I plan to look at the Tomas Vondra's
patch. If you think I should look at some particular patch, please let
me know.
--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2014-09-11 20:50:24 Re: B-Tree support function number 3 (strxfrm() optimization)
Previous Message Stephen Frost 2014-09-11 20:36:10 Re: RLS Design