Re: fallocate / posix_fallocate for new WAL file creation (etc...)

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fallocate / posix_fallocate for new WAL file creation (etc...)
Date: 2013-05-30 15:45:31
Message-ID: 51A7741B.4010506@2ndQuadrant.com
Lists: pgsql-hackers

On 5/30/13 11:21 AM, Alvaro Herrera wrote:
> Greg Smith wrote:
>
>> The messy part of extending relations in larger chunks
>> is how to communicate that back into the buffer manager usefully.
>> The extension path causing trouble is RelationGetBufferForTuple
>> calling ReadBufferBI. All of that is passing a single buffer
>> around. There's no simple way I can see to rewrite it to handle
>> more than one at a time.
>
> No, but we can have it create several pages and insert them into the
> FSM. So they aren't returned to the original caller but are available
> to future users.

There's actually a code comment wondering about this very question for the
pages that are already created, in src/backend/access/heap/hio.c:

"Remember the new page as our target for future insertions.
XXX should we enter the new page into the free space map immediately, or
just keep it for this backend's exclusive use in the short run (until
VACUUM sees it)? Seems to depend on whether you expect the current
backend to make more insertions or not, which is probably a good bet
most of the time. So for now, don't add it to FSM yet."

We have to be careful about touching too much at that particular point,
because the obvious spot to make a change is one where the code is
holding a relation extension lock.

There's an interesting overlap with the questions about how files are
extended, too. The same file has this comment just before the one above:

"XXX This does an lseek - rather expensive - but at the moment it is the
only way to accurately determine how many blocks are in a relation. Is
it worth keeping an accurate file length in shared memory someplace,
rather than relying on the kernel to do it for us?"
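For anyone who hasn't looked at what that comment is describing, here is a minimal standalone sketch of the idea: seek to the end of the file and divide the offset by the block size. This is not the code in hio.c or the storage manager. The names sketch_nblocks, sketch_make_file and SKETCH_BLCKSZ are hypothetical, and 8192 is just the default PostgreSQL block size.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_BLCKSZ 8192      /* assumption: default PostgreSQL block size */

/*
 * Count whole blocks in an open file by seeking to its end.
 * Costs one system call per lookup, which is the expense the comment
 * worries about. Returns -1 on error.
 */
long sketch_nblocks(int fd)
{
    off_t len = lseek(fd, 0, SEEK_END);

    if (len < 0)
        return -1;
    return (long) (len / SKETCH_BLCKSZ);
}

/*
 * Hypothetical test helper: create an anonymous temp file holding
 * nblocks zero-filled blocks and return its descriptor, or -1 on error.
 */
int sketch_make_file(int nblocks)
{
    static const char zeros[SKETCH_BLCKSZ];     /* zero-initialized */
    char path[] = "/tmp/sketch_relXXXXXX";
    int fd;

    assert(nblocks >= 0);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);               /* file vanishes once fd is closed */
    for (int i = 0; i < nblocks; i++)
    {
        if (write(fd, zeros, SKETCH_BLCKSZ) != SKETCH_BLCKSZ)
        {
            close(fd);
            return -1;
        }
    }
    return fd;
}
```

Calling sketch_nblocks() on a descriptor from sketch_make_file(3) gives 3. A trailing partial block is not counted, because the division truncates.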

That whole sequence of code took the easy way forward when it was
written, but it's obvious the harder one (also touching the FSM) was
considered even then. The whole sequence needs to be revisited to pull
off multiple page extension. I wouldn't say it's hard, but it's enough
work that I haven't been able to find a block of time to go through the
whole thing.
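To make the shape of Alvaro's suggestion concrete, here is a toy in-memory model of it, not PostgreSQL code: extend by several blocks at once, hand the first one back to the caller, and record the rest somewhere other inserters can find them. Everything here (ToyRelation, toy_extend, toy_get_block, the fixed-size array standing in for the FSM) is hypothetical, and it leaves out the relation extension lock and the buffer manager entirely, which is where the real work is.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FREE 64         /* arbitrary capacity for the toy free list */

typedef struct ToyRelation
{
    unsigned    nblocks;                    /* current length in blocks */
    unsigned    free_blocks[TOY_MAX_FREE];  /* toy stand-in for the FSM */
    size_t      nfree;                      /* entries used in free_blocks */
} ToyRelation;

/*
 * Extend the relation by count blocks. The first new block goes back to
 * the caller; the remaining ones are recorded as free so that later
 * callers can use them without extending again.
 */
unsigned toy_extend(ToyRelation *rel, unsigned count)
{
    unsigned    first = rel->nblocks;

    assert(count >= 1);
    rel->nblocks += count;
    for (unsigned b = first + 1;
         b < rel->nblocks && rel->nfree < TOY_MAX_FREE;
         b++)
        rel->free_blocks[rel->nfree++] = b;
    return first;
}

/*
 * Pick a block to insert into: reuse a recorded free block if there is
 * one, otherwise extend the relation by extend_by blocks.
 */
unsigned toy_get_block(ToyRelation *rel, unsigned extend_by)
{
    if (rel->nfree > 0)
        return rel->free_blocks[--rel->nfree];
    return toy_extend(rel, extend_by);
}
```

With a zeroed ToyRelation and extend_by = 4, the first toy_get_block() call grows the relation to 4 blocks and returns block 0; the next three calls are served from the free list and only the fifth call extends again. The open question from the hio.c comment maps onto this directly: whether the extra blocks should be visible to every backend right away or kept for the backend that did the extension.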

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
