Re: Extended Prefetching using Asynchronous IO - proposal and patch

From: John Lumby <johnlumby(at)hotmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Claudio Freire <klaussfreire(at)gmail(dot)com>, pgsql hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Subject: Re: Extended Prefetching using Asynchronous IO - proposal and patch
Date: 2014-05-29 22:11:26
Message-ID: BAY175-W432649F7090F849C2871BAA3240@phx.gbl
Lists: pgsql-general pgsql-hackers

> From: tgl(at)sss(dot)pgh(dot)pa(dot)us
> To: klaussfreire(at)gmail(dot)com
> CC: hlinnakangas(at)vmware(dot)com; johnlumby(at)hotmail(dot)com; pgsql-hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
> Date: Thu, 29 May 2014 17:56:57 -0400
>
> Claudio Freire <klaussfreire(at)gmail(dot)com> writes:
> > On Thu, May 29, 2014 at 6:43 PM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
> >> On Thu, May 29, 2014 at 6:19 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >>> "ampeeknexttuple"? That's a bit scary. It would certainly be unsafe
> >>> for non-MVCC snapshots (read about vacuum vs indexscan interlocks in
> >>> nbtree/README).
>
> >> It's not really the tuple, just the tid
>
> > And, furthermore, it's used only to do prefetching, so even if the tid
> > was invalid when the tuple needs to be accessed, it wouldn't matter,
> > because the indexam wouldn't use the result of ampeeknexttuple to do
> > anything at that time.
>
> Nonetheless, getting the next tid out of the index may involve stepping
> to the next index page, at which point you've lost your interlock

I think we are OK, as peeknexttuple (yes, bad name, sorry, can change it ...) never advances to another page:

 *  btpeeknexttuple() -- peek at the next tuple different from any blocknum in pfch_list
 *      without reading a new index page
 *      and without causing any side-effects such as altering values in control blocks
 *      if found, store blocknum in next element of pfch_list
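
To make the contract in that comment concrete, here is a minimal, self-contained sketch in C. It is not the code from the patch: LeafPage, PfchList, PFCH_MAX and peek_next_block() are hypothetical names invented for illustration, and the leaf page is modelled as a plain array of heap block numbers. It only shows the intended behaviour: look ahead within the page that is already loaded, skip heap blocks already in the prefetch list, and give up rather than step to the next index page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PFCH_MAX 8                      /* hypothetical prefetch-list capacity */

/* Hypothetical stand-in for one index leaf page that is already in memory. */
typedef struct {
    const uint32_t *heap_blocks;        /* heap block number of each index item */
    size_t          nitems;
} LeafPage;

/* Hypothetical prefetch list: heap blocks already queued for prefetch. */
typedef struct {
    uint32_t blocks[PFCH_MAX];
    size_t   used;
} PfchList;

static bool
pfch_contains(const PfchList *pfch, uint32_t blk)
{
    for (size_t i = 0; i < pfch->used; i++)
        if (pfch->blocks[i] == blk)
            return true;
    return false;
}

/*
 * Peek past item 'cur' on the current page for a heap block that is not yet
 * in the prefetch list.  The page and the scan position are read-only here;
 * the only state that changes is the prefetch list itself.
 */
bool
peek_next_block(const LeafPage *page, size_t cur, PfchList *pfch)
{
    assert(page != NULL && pfch != NULL);

    if (pfch->used >= PFCH_MAX)
        return false;                   /* list is full, nothing to add */

    for (size_t i = cur + 1; i < page->nitems; i++) {
        uint32_t blk = page->heap_blocks[i];

        if (!pfch_contains(pfch, blk)) {
            pfch->blocks[pfch->used++] = blk;
            return true;
        }
    }
    return false;                       /* page exhausted: do not read another page */
}
```

Calling peek_next_block() repeatedly with the same cur fills the list with the distinct heap blocks referenced by the rest of the page, then returns false once the page has nothing new to offer.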

> guaranteeing that the *previous* tid will still mean something by the time
> you arrive at its heap page. I presume that the ampeeknexttuple call is
> issued before trying to visit the heap (otherwise you're not actually
> getting much I/O overlap), so I think there's a real risk here.
>
> Having said that, it's probably OK as long as this mode is only invoked
> for user queries (with MVCC snapshots) and not for system indexscans.
>
> regards, tom lane
