Re: Stating the significance of Lehman & Yao in the nbtree README

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Stating the significance of Lehman & Yao in the nbtree README
Date: 2014-07-23 05:49:41
Message-ID: CAM3SWZRUdMBoCr5=MT0k6n=k724=g5S3PmMsSA1MVXwmKnBGKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 22, 2014 at 8:59 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Okay, but how does this justify to add below new text in README.
> + Even with these read locks, Lehman and Yao's approach obviates the
> + need of earlier schemes to hold multiple read locks concurrently when
> + descending the tree as part of servicing index scans (pessimistic lock
> + coupling).
>
> Actually I think putting it can lead to inconsistency in the README.
> Currently it indicates that our algorithm is different from L&Y w.r.t taking
> Read Locks and has given explanation for same.

Not really. Firstly, that sentence acknowledges that there are read
locks where L&Y assume there will not be. "Even with these read locks"
references the first paragraph, where it is stated the Postgres
B-Trees still acquire read locks while descending the tree. Secondly,
I'm pretty sure that even Lehman and Yao realized that their apparent
assumption that real implementations would not require read locks
isn't realistic. Their handling of deletion seems perfunctory to me.
They say "In situations where excessive deletions cause the storage
utilization of tree nodes to be unacceptably low, a batch
reorganization or an underflow operation which locks the entire tree
can be performed". I'm pretty sure that that sounded almost as bad in
1980 as it does now. We don't have a "not quite L&Y" implementation
just because there are single read locks acquired while descending the
tree. Prior schemes needed multiple *concurrent* exclusive locks.
B-Trees were around for about 10 years before L&Y.

There is reason to think that pretty much every practical
implementation uses read locks for many years, because there is a well
received 2001 paper [1] that describes a scheme where L&Y style B-link
trees can *actually* be made to not require read locks, which
discusses things like caching effects on contemporary hardware - it
involves playing tricks with detecting and recovering from page level
inconsistencies, IIRC. Furthermore, it references a scheme from the
late 90s involving local copies of B-Link pages. I thought about
pursuing something like that myself, but the cost of "latching"
(buffer locking) B-Trees in PostgreSQL is likely to be reduced before
too long anyway, which makes the general idea seem unenticing right
now.

> Basically I think it will be better if you can explain in bit more detail
> that
> how does "right-links at all levels and high-key" helps to detect and
> recover from concurrent page splits.

You might be right about that - perhaps I should go into more detail.

[1] http://www.vldb.org/conf/2001/P181.pdf
--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Viswanatham kirankumar 2014-07-23 06:14:22 Re: [TODO] Process pg_hba.conf keywords as case-insensitive
Previous Message Pavel Stehule 2014-07-23 05:36:34 Re: proposal (9.5) : psql unicode border line styles