Improving the Performance of Full Table Updates

From: "Gokulakannan Somsundaram" <gokul007(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Improving the Performance of Full Table Updates
Date: 2007-09-20 07:11:20
Message-ID: 9362e74e0709200011j7b155bb9p74912a450c5b2f84@mail.gmail.com
Lists: pgsql-hackers

Hi,
The current architecture of updates in PostgreSQL is:
1) Make a select query out of the update. This involves a read lock
   (BUFFER_SHARE).
2) Get the tuple id.
3) Go to the buffer containing the tuple id and take a BUFFER_EXCLUSIVE lock
   on it.
4) Update it.
5) Repeat the above process for the subsequent rows (a rough sketch follows).
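
Illustrative only: the real UPDATE path goes through the executor and
heap_update rather than simple_heap_update, and make_new_version() is a
hypothetical stand-in for the projection step.

#include "postgres.h"
#include "access/heapam.h"
#include "utils/rel.h"

/* Sketch: update every row of rel, one tuple at a time.  Each
 * simple_heap_update() call re-locates the target buffer and takes its own
 * exclusive content lock, i.e. the per-row overhead described above. */
static void
full_table_update_sketch(Relation rel, Snapshot snapshot)
{
    HeapScanDesc scan = heap_beginscan(rel, snapshot, 0, NULL);
    HeapTuple    tup;

    while ((tup = heap_getnext(scan, ForwardScanDirection)) != NULL)
    {
        /* build the new row version; projection details omitted */
        HeapTuple newtup = make_new_version(rel, tup);   /* hypothetical */

        simple_heap_update(rel, &tup->t_self, newtup);
    }
    heap_endscan(scan);
}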

I propose to change this row-by-row approach when it is a full-table update.
I plan to send an extra flag (which will be set for full-table
deletes/updates); this would make the access method directly acquire the
exclusive lock and update the existing record.

For deletes this is simple, but for updates the projected tuple has to be
built before re-inserting it. So there will be a list of heap tuples held in
memory for each page being updated; these tuples will be inserted after the
deletion part of the update is done. This is just a rough design. I will work
out a detailed design once I get a nod from the mailing-list community.

Thanks,
Gokul.


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Gokulakannan Somsundaram" <gokul007(at)gmail(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improving the Performance of Full Table Updates
Date: 2007-09-20 07:20:06
Message-ID: 46F21F26.7030407@enterprisedb.com
Lists: pgsql-hackers

Gokulakannan Somsundaram wrote:
> Hi,
> The current architecture of updates in PostgreSQL is:
> 1) Make a select query out of the update. This involves a read lock
>    (BUFFER_SHARE).
> 2) Get the tuple id.
> 3) Go to the buffer containing the tuple id and take a BUFFER_EXCLUSIVE
>    lock on it.
> 4) Update it.
> 5) Repeat the above process for the subsequent rows.
>
> I propose to change this row-by-row approach when it is a full-table
> update. I plan to send an extra flag (which will be set for full-table
> deletes/updates); this would make the access method directly acquire the
> exclusive lock and update the existing record.
>
> For deletes this is simple, but for updates the projected tuple has to be
> built before re-inserting it. So there will be a list of heap tuples held
> in memory for each page being updated; these tuples will be inserted after
> the deletion part of the update is done. This is just a rough design. I
> will work out a detailed design once I get a nod from the mailing-list
> community.

I doubt the locking overhead is that significant. Have you done any
profiling to show that it's worth it?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: "Gokulakannan Somsundaram" <gokul007(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improving the Performance of Full Table Updates
Date: 2007-09-20 08:00:19
Message-ID: 9362e74e0709200100y56ccc16ao42b74b26c6407e95@mail.gmail.com
Lists: pgsql-hackers

The obvious advantages are:
a) Avoidance of one read lock per page.
b) One big write lock instead of multiple write locks.

But as you said, I will do some initial profiling and get back.

Thanks,
Gokul.



From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Gokulakannan Somsundaram" <gokul007(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improving the Performance of Full Table Updates
Date: 2007-09-20 14:35:25
Message-ID: 29928.1190298925@sss.pgh.pa.us
Lists: pgsql-hackers

"Gokulakannan Somsundaram" <gokul007(at)gmail(dot)com> writes:
> I propose to change this row-by-row approach when it is a full-table
> update. I plan to send an extra flag (which will be set for full-table
> deletes/updates); this would make the access method directly acquire the
> exclusive lock and update the existing record.

This sounds like a recipe for utter destruction of what little
modularity and layering we've got left. And I rather doubt it will buy
anything interesting performance-wise.

To cite just one concrete objection: surely the tuple-fetch code has got
no business firing triggers.

regards, tom lane


From: "Gokulakannan Somsundaram" <gokul007(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improving the Performance of Full Table Updates
Date: 2007-09-21 08:05:31
Message-ID: 9362e74e0709210105x40878e38m1c57ab9fda764e68@mail.gmail.com
Lists: pgsql-hackers

Hi Tom,
Thanks for the feedback. Let me clarify my intention again. This was intended
to improve the performance of data-warehousing applications, for full-table
updates similar to:

    UPDATE dd SET n2 = n2 + 1;

When you mentioned firing triggers, I looked into the trigger implementation,
and the approach I suggested may not work well with triggers. Since we cannot
hold write locks and fire triggers, triggers would have to be disabled for
this. But remember that in data-warehousing applications, people don't
usually have row-level triggers.

My alternative suggestion would be to make this kind of update optional, to
provide the speed-up where it applies.

If a page contains 100 rows, then the current scenario takes 1 read lock +
100 write locks to complete the full-table update. In the suggested scenario,
it takes one write lock. It also reduces the 101 logical I/Os to 1 logical
I/O. This might provide the same kind of benefit that the bitmap index scan
provided.
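
To get a very rough feel for what the raw lock traffic alone costs, here is a
toy standalone comparison using pthread read-write locks (these are not
PostgreSQL's buffer LWLocks, and the page/row counts are made up, so the
numbers are only indicative):

/* Toy model: 1 shared + 100 exclusive acquisitions per "page" (current
 * scheme) versus a single exclusive acquisition per page (suggested scheme).
 * Build with: cc -O2 -pthread (older glibc may also need -lrt). */
#include <pthread.h>
#include <stdio.h>
#include <time.h>

#define PAGES          100000
#define ROWS_PER_PAGE  100

static double seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;
    double t0 = seconds();

    for (long p = 0; p < PAGES; p++)
    {
        pthread_rwlock_rdlock(&lock);          /* read lock for the scan */
        pthread_rwlock_unlock(&lock);
        for (int r = 0; r < ROWS_PER_PAGE; r++)
        {
            pthread_rwlock_wrlock(&lock);      /* one write lock per row */
            pthread_rwlock_unlock(&lock);
        }
    }

    double t1 = seconds();

    for (long p = 0; p < PAGES; p++)
    {
        pthread_rwlock_wrlock(&lock);          /* one write lock per page */
        pthread_rwlock_unlock(&lock);
    }

    double t2 = seconds();

    printf("per-row locking:  %.3f s\n", t1 - t0);
    printf("per-page locking: %.3f s\n", t2 - t1);
    return 0;
}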

Again, there are some specific cases:
a) If the tuple is locked / concurrently being updated:
        Currently we release the buffer, take a lock on the tuple, and wait
for the transaction to complete in case of concurrent updates. In the
full-table update we will do the same.

b) If the page contains a lot of deleted tuples:
        This is a tricky scenario. Say we have 100 tuples and 20% of them are
deleted. In the current scenario we find that out during the read lock and do
not waste time on those tuples during the write lock. But in the suggested
scenario we would waste time on them while holding the write lock on the
buffer. To circumvent that, we can resort to a one-read-lock + one-write-lock
combination.

Again, if these full-table updates are considered with OLTP applications in
mind, then this is not a suitable option at all. It will only benefit people
with data warehouses.

Looking forward to some more replies...

Thanks,
Gokul.



From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Gokulakannan Somsundaram" <gokul007(at)gmail(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improving the Performance of Full Table Updates
Date: 2007-09-21 10:24:43
Message-ID: 46F39BEB.1040205@enterprisedb.com
Lists: pgsql-hackers

Gokulakannan Somsundaram wrote:
> Again, if these full-table updates are considered with OLTP applications
> in mind, then this is not a suitable option at all. It will only benefit
> people with data warehouses.
>
> Looking forward to some more replies...

Start with profiling.

I just ran a quick oprofile run of a full-table UPDATE on a simple table
with one index, and it looks like RelationGetBufferForTuple uses 4.53%
of the CPU time. Out of that, 2.86 percentage points are spent in
ReadBuffer_common. That means that write-locking the heap pages takes at
most 4.53 - 2.86 = 1.67% of the total CPU time.

That's the upper limit of the benefit from the scheme you're proposing.
Surely the effort would be better spent on something else. For example, if
you kept the insertion target page just pinned over the calls, which wouldn't
have the problems with triggers etc., you could save that 2.86%. Which still
isn't much. Or take a look at WAL logging. XLogInsert took 16.06% of the CPU
time. Earlier tests have suggested that a big chunk of that time is spent in
CRC calculation. Alternative CRC methods have been suggested in the past, or
perhaps that time could be offloaded to the WAL writer process, speeding up
the UPDATE on a multi-CPU server.

Also, if we're talking about data warehousing, we're talking about big
tables that don't fit in memory. That means that you're likely bottlenecked
by I/O speed, not CPU. If that's the case, saving some CPU time makes no
difference whatsoever. What would help with an I/O bottleneck is to try to
make the disk footprint smaller, or to make better use of the available I/O
bandwidth.

Three steps to improve throughput:

1. Identify the hardware component that's the bottleneck.
2. Profile the workload to see what's using the bottlenecked resource
the most.
3. Figure out how to make that piece of code cheaper.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improving the Performance of Full Table Updates
Date: 2007-09-26 13:34:21
Message-ID: 9362e74e0709260634q352a6aaejbc503cedc554c7d3@mail.gmail.com
Lists: pgsql-hackers

Hi Tom / Heikki,
Thanks for the suggestions. After profiling, I got similar results. So I am
thinking of a design like this to get the performance improvement:

a) We can get one page for insert (during update) and hold the write lock on
it until the page gets filled. In this way, RelationGetBufferForTuple will be
called only once per page of inserts.

b) Do you think we can optimize XLogInsert so that it writes a whole page
instead of writing all the individual records for that page? I think we would
need to write a recovery routine for this. Currently the page gets flushed to
the WAL if it is modified after a checkpoint, so I still need to understand
those pieces of code. But do you think it is wise to continue working along
this line?

Thanks,
Gokul.



From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Gokulakannan Somasundaram <gokul007(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improving the Performance of Full Table Updates
Date: 2007-09-26 19:02:26
Message-ID: 46FAACC2.5090900@enterprisedb.com
Lists: pgsql-hackers

Gokulakannan Somasundaram wrote:
> Hi Tom / Heikki,
> Thanks for the suggestions. After profiling, I got similar results. So I am
> thinking of a design like this to get the performance improvement:
>
> a) We can get one page for insert (during update) and hold the write lock
> on it until the page gets filled. In this way, RelationGetBufferForTuple
> will be called only once per page of inserts.

The locking actually isn't that expensive once you have the page pinned.
For starters, keep the page pinned over calls to heap_update, and just
relock it instead of calling RelationGetBufferForTuple. Unsurprisingly,
this is not a new idea:
http://archives.postgresql.org/pgsql-patches/2007-05/msg00499.php.

> b) Do you think we can optimize XLogInsert so that it writes a whole page
> instead of writing all the individual records for that page? I think we
> would need to write a recovery routine for this. Currently the page gets
> flushed to the WAL if it is modified after a checkpoint, so I still need to
> understand those pieces of code. But do you think it is wise to continue
> working along this line?

It's going to be very difficult, at least. There are a lot of race conditions
lurking if you try to coalesce multiple updates into a single WAL record.

That said, making XLogInsert faster would help a lot of use cases, not only
full-table updates. Most of the time is spent calculating the CRC, but it has
some other non-trivial overhead as well. Profiling XLogInsert in more detail
and figuring out how to make it faster would be very nice.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improving the Performance of Full Table Updates
Date: 2007-10-08 06:31:19
Message-ID: 9362e74e0710072331w4b348ef8ge115918a650608a2@mail.gmail.com
Lists: pgsql-hackers

Hi Heikki,
I finally found some time to look more into the CRC code. The time is mostly
spent when there is a backup block in the XLOG record: when the CRC is
calculated over the entire block (there is already some optimization for the
holes), the time goes into the CRC macro. I tried some small optimizations,
like changing the int into uint8, thinking that the exponentiation operation
might get slightly faster, but with no success. I don't know whether changing
the CRC algorithm is a good option.
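
For reference, the per-byte work is essentially one table lookup plus a shift
and an XOR. The following is a generic, self-contained table-driven CRC-32
(reflected form, polynomial 0xEDB88320) over an 8 kB block; it is not
PostgreSQL's actual CRC table or macro, just an illustration of the work
involved when a whole backup block has to be summed:

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

static uint32_t crc_table[256];

static void crc32_init(void)
{
    for (uint32_t i = 0; i < 256; i++)
    {
        uint32_t c = i;
        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        crc_table[i] = c;
    }
}

static uint32_t crc32_calc(const unsigned char *buf, size_t len)
{
    uint32_t c = 0xFFFFFFFFu;

    /* one table lookup, one shift and one xor per input byte */
    for (size_t i = 0; i < len; i++)
        c = crc_table[(c ^ buf[i]) & 0xFF] ^ (c >> 8);
    return c ^ 0xFFFFFFFFu;
}

int main(void)
{
    unsigned char block[8192];          /* one 8 kB "backup block" */

    for (size_t i = 0; i < sizeof(block); i++)
        block[i] = (unsigned char) i;

    crc32_init();
    printf("crc = %08x\n", (unsigned) crc32_calc(block, sizeof(block)));
    return 0;
}
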
I have some other observations on including snapshot information in the
indexes, on which I will make a separate proposal. Please provide me guidance
on that.

Thanks,
Gokul.



From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improving the Performance of Full Table Updates
Date: 2007-10-08 09:18:52
Message-ID: 4709F5FC.20803@enterprisedb.com
Lists: pgsql-hackers

Gokulakannan Somasundaram wrote:
> I finally found some time to look more into the CRC code. The time is
> mostly spent when there is a backup block in the XLOG record: when the CRC
> is calculated over the entire block (there is already some optimization for
> the holes), the time goes into the CRC macro. I tried some small
> optimizations, like changing the int into uint8, thinking that the
> exponentiation operation might get slightly faster, but with no success. I
> don't know whether changing the CRC algorithm is a good option.

Yeah, I believe the current implementation is quite well-optimized, so
it's unlikely that you'll find any micro-optimizations like that that
make any significant difference. I'm afraid you're going to need bigger
changes to make it faster. Some ideas that I've had in the past that
might be worth investigating:

* Use a much bigger WAL block size, to avoid splitting each big record,
like one with a backup block. One block / one WAL segment might not be a
bad idea.

* mmap the WAL segments, instead of using the slru buffers. Even if it
didn't help with performance per se, we could get rid of the wal_buffers
setting and let the OS manage the caching.

* Use a cheaper CRC algorithm.

* Calculate the CRC when flushing the WAL, instead of when inserting it. This
would allow the WAL writer to pick up much of the CRC work, increasing
throughput on multi-core systems, particularly on bulk loads.

* Cover multiple WAL records with one CRC. Presumably calculating the CRC of
a large contiguous chunk is cheaper than doing n smaller chunks.

* Allow buffering of multiple WAL segments, to avoid the WAL flush on
every xlog segment switch.

* Reorder WAL records on the fly, so that if you have something like
"full page image + update + update + update + update" all happening on a
single page, we could skip writing the individual update records, and
take a full page image of the page after the last update instead.
(getting all the locking right is quite tricky...)

* Change the locking so that you don't need to hold the WALInsertLock
while memcpying the record to the WAL block, but just to allocate the
slot for it. Wouldn't increase single-process performance, but would
make it more scalable.

I haven't pursued any of these further, so some or all of them might
turn out to be not helpful or not feasible.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improving the Performance of Full Table Updates
Date: 2007-10-08 14:04:43
Message-ID: 18439.1191852283@sss.pgh.pa.us
Lists: pgsql-hackers

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> * mmap the WAL segments, instead of using the slru buffers.

This one's almost certainly a nonstarter, for lack of any portable way
to control when writes happen.

regards, tom lane


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improving the Performance of Full Table Updates
Date: 2007-10-08 14:22:44
Message-ID: 470A3D34.1020201@enterprisedb.com
Lists: pgsql-hackers

Tom Lane wrote:
> "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
>> * mmap the WAL segments, instead of using the slru buffers.
>
> This one's almost certainly a nonstarter, for lack of any portable way
> to control when writes happen.

For flushing, there's msync, which I believe is quite portable.

We mustn't write data pages before the WAL hits the disk, but we don't
have any such limitation for WAL.
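
For what it's worth, the system calls involved are simple enough. A minimal
standalone illustration of mmap plus a synchronous msync over the written
range (this is not a WAL implementation; the file name and 16 MB size below
are just placeholders):

/* mmap a pre-sized file, dirty part of it, and force that range to disk. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SEG_SIZE (16 * 1024 * 1024)     /* segment-sized file */

int main(void)
{
    int fd = open("fake_wal_segment", O_RDWR | O_CREAT, 0600);

    if (fd < 0 || ftruncate(fd, SEG_SIZE) != 0)
    {
        perror("open/ftruncate");
        return 1;
    }

    char *seg = mmap(NULL, SEG_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (seg == MAP_FAILED)
    {
        perror("mmap");
        return 1;
    }

    memset(seg, 'x', 8192);             /* "insert" some record data */

    /* The flush-control question: force the dirtied range to disk now. */
    if (msync(seg, 8192, MS_SYNC) != 0)
        perror("msync");

    munmap(seg, SEG_SIZE);
    close(fd);
    unlink("fake_wal_segment");
    return 0;
}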

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improving the Performance of Full Table Updates
Date: 2007-10-08 14:33:31
Message-ID: 18887.1191854011@sss.pgh.pa.us
Lists: pgsql-hackers

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> We mustn't write data pages before the WAL hits the disk, but we don't
> have any such limitation for WAL.

Um, you're right, I was remembering discussions about trying to mmap
data files. Still, googling "msync performance" leaves me full of
concern about whether this would be a portable improvement.

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, "Gokulakannan Somasundaram" <gokul007(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improving the Performance of Full Table Updates
Date: 2007-10-08 15:19:11
Message-ID: 87odf963dc.fsf@oxford.xeocode.com
Lists: pgsql-hackers

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
>> * mmap the WAL segments, instead of using the slru buffers.
>
> This one's almost certainly a nonstarter, for lack of any portable way
> to control when writes happen.

I think mlock and msync(MS_SYNC) ought to be enough. I'm not sure every OS
implements them as specified though.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com