Re: Bug: Buffer cache is not scan resistant

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Jim Nasby <decibel(at)decibel(dot)org>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Luke Lonergan <LLonergan(at)greenplum(dot)com>, Grzegorz Jaskiewicz <gj(at)pointblue(dot)com(dot)pl>, PGSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Doug Rady <drady(at)greenplum(dot)com>, Sherry Moore <sherry(dot)moore(at)sun(dot)com>
Subject: Re: Bug: Buffer cache is not scan resistant
Date: 2007-03-06 17:56:03
Message-ID: 1173203763.13722.414.camel@dogma.v10.wvs
Lists: pgsql-hackers

On Mon, 2007-03-05 at 21:02 -0700, Jim Nasby wrote:
> On Mar 5, 2007, at 2:03 PM, Heikki Linnakangas wrote:
> > Another approach I proposed back in December is to not have a
> > variable like that at all, but scan the buffer cache for pages
> > belonging to the table you're scanning to initialize the scan.
> > Scanning all the BufferDescs is a fairly CPU and lock heavy
> > operation, but it might be ok given that we're talking about large
> > I/O bound sequential scans. It would require no DBA tuning and
> > would work more robustly in varying conditions. I'm not sure where
> > you would continue after scanning the in-cache pages. At the
> > highest in-cache block number, perhaps.
>
> If there was some way to do that, it'd be what I'd vote for.
>

I still don't know how to make this take advantage of the OS buffer
cache.

However, requiring no DBA tuning is a huge advantage; I agree with that.

If I were to implement this idea, I think Heikki's bitmap of pages
already read is the way to go. Can you guys give me some pointers about
how to walk through the shared buffers, reading the pages that I need,
while being sure not to read a page that's been evicted, and also not
potentially causing a performance regression somewhere else?
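
To make the question concrete, here is a rough C sketch of just the
bookkeeping I have in mind: one pass over the buffer descriptors that
marks the table's cached pages in a bitmap, then a helper that finds the
next block still to be read from disk. This is not PostgreSQL code.
ToyBufferDesc, PageBitmap, and every function name below are made up for
illustration, and the sketch ignores the pinning and locking that a real
walk over the shared buffers would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer descriptor (not PostgreSQL's BufferDesc). */
typedef struct ToyBufferDesc
{
    uint32_t rel_id;    /* table the cached page belongs to */
    uint32_t block_no;  /* block number within that table */
    bool     valid;     /* false if the slot is empty or already evicted */
} ToyBufferDesc;

/* One bit per block of the table being scanned: set means "already read". */
typedef struct PageBitmap
{
    uint32_t nblocks;
    uint8_t *bits;
} PageBitmap;

static PageBitmap *
bitmap_create(uint32_t nblocks)
{
    PageBitmap *bm = malloc(sizeof(*bm));

    if (bm == NULL)
        return NULL;
    bm->nblocks = nblocks;
    /* +1 byte so the allocation is never zero-sized */
    bm->bits = calloc((size_t) nblocks / 8 + 1, 1);
    if (bm->bits == NULL)
    {
        free(bm);
        return NULL;
    }
    return bm;
}

static void
bitmap_free(PageBitmap *bm)
{
    if (bm != NULL)
    {
        free(bm->bits);
        free(bm);
    }
}

static bool
bitmap_test(const PageBitmap *bm, uint32_t blk)
{
    assert(blk < bm->nblocks);
    return (bm->bits[blk / 8] >> (blk % 8)) & 1;
}

static void
bitmap_set(PageBitmap *bm, uint32_t blk)
{
    assert(blk < bm->nblocks);
    bm->bits[blk / 8] |= (uint8_t) (1u << (blk % 8));
}

/*
 * Pass 1: walk every descriptor and mark the pages of rel_id that are
 * valid at the moment we look at them. Returns the number of distinct
 * pages marked. Evicted (invalid) slots are skipped, so they stay unread.
 */
static uint32_t
mark_cached_pages(const ToyBufferDesc *bufs, size_t nbufs,
                  uint32_t rel_id, PageBitmap *seen)
{
    uint32_t found = 0;

    for (size_t i = 0; i < nbufs; i++)
    {
        const ToyBufferDesc *b = &bufs[i];

        if (!b->valid || b->rel_id != rel_id)
            continue;
        if (b->block_no >= seen->nblocks || bitmap_test(seen, b->block_no))
            continue;
        /* A real scan would pin the buffer and process the page here. */
        bitmap_set(seen, b->block_no);
        found++;
    }
    return found;
}

/* Pass 2 helper: first block >= start not yet read, or nblocks if none. */
static uint32_t
next_unread_block(const PageBitmap *seen, uint32_t start)
{
    uint32_t blk = start;

    while (blk < seen->nblocks && bitmap_test(seen, blk))
        blk++;
    return blk;
}
```

The second pass would then loop on next_unread_block() and read only the
blocks whose bits are still clear. What the sketch does not answer is
the part I am actually asking about: how to do the first pass safely and
cheaply against the real shared buffers.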

> Given the partitioning of the buffer lock that Tom did it might not
> be that horrible for many cases, either, since you'd only need to
> scan through one partition.
>
> We also don't need an exact count, either. Perhaps there's some way
> we could keep a counter or something...

Exact count of what? The pages already in cache?

Regards,
Jeff Davis
