Re: WAL format changes

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Subject: Re: WAL format changes
Date: 2012-06-14 21:58:12
Message-ID: 201206142358.12431.andres@2ndquadrant.com
Lists: pgsql-hackers

On Thursday, June 14, 2012 11:01:42 PM Heikki Linnakangas wrote:
> As I threatened earlier
> (http://archives.postgresql.org/message-id/4FD0B1AB.3090405@enterprisedb.com),
> here are three patches that change the WAL format. The goal is to
> change the format so that when you're inserting a WAL record of a given
> size, you know exactly how much space it requires in the WAL.
>
> 1. Use a 64-bit segment number, instead of the log/seg combination. And
> don't waste the last segment on each logical 4 GB log file. The concept
> of a "logical log file" is now completely gone. XLogRecPtr is unchanged,
> but it should now be understood as a plain 64-bit value, just split into
> two 32-bit integers for historical reasons. On disk, this means that
> there will be log files ending in FF, those were skipped before.
What's the reason for keeping that awkward split now? There aren't that many
users of xlogid/xrecoff, and many of those would be better served by using
helper macros.
API compatibility isn't a great argument either, as code manually playing
around with those fields needs to be checked anyway. I think there might be
some code around that does XLogRecPtr addition manually and such.
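
To illustrate the kind of helpers meant here, a minimal sketch follows. It
assumes the current two-field XLogRecPtr layout (xlogid as the high 32 bits,
xrecoff as the low 32 bits) and the default 16 MB segment size. The names
recptr_to_u64, u64_to_recptr, recptr_advance, recptr_segno and
SKETCH_XLOG_SEG_SIZE are hypothetical and not part of any patch in this thread.

```c
#include <stdint.h>

/* Stand-in for the two-field XLogRecPtr discussed above. */
typedef struct XLogRecPtr
{
    uint32_t xlogid;    /* high 32 bits of the WAL position */
    uint32_t xrecoff;   /* low 32 bits of the WAL position */
} XLogRecPtr;

/* Assumption: default segment size of 16 MB. */
#define SKETCH_XLOG_SEG_SIZE ((uint64_t) 16 * 1024 * 1024)

/* Hypothetical helper: view the pointer as one plain 64-bit value. */
static inline uint64_t
recptr_to_u64(XLogRecPtr p)
{
    return ((uint64_t) p.xlogid << 32) | p.xrecoff;
}

/* Hypothetical helper: split a 64-bit value back into the two halves. */
static inline XLogRecPtr
u64_to_recptr(uint64_t v)
{
    XLogRecPtr p;

    p.xlogid = (uint32_t) (v >> 32);
    p.xrecoff = (uint32_t) v;
    return p;
}

/* Pointer addition with the carry into xlogid handled in one place. */
static inline XLogRecPtr
recptr_advance(XLogRecPtr p, uint64_t nbytes)
{
    return u64_to_recptr(recptr_to_u64(p) + nbytes);
}

/* 64-bit segment number: no segment is skipped at the 4 GB boundary. */
static inline uint64_t
recptr_segno(XLogRecPtr p)
{
    return recptr_to_u64(p) / SKETCH_XLOG_SEG_SIZE;
}
```

With helpers like these, callers would never touch xlogid/xrecoff directly, so
switching XLogRecPtr to a real 64-bit integer later would only mean changing
the helpers.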

> 2. Always include the xl_rem_len field, used for continuation records,
> in the xlog page header. A continuation log record only contained that
> one field, it's now included straight in the page header, so the concept
> of a continuation record doesn't exist anymore. Because of alignment,
> this wastes 4 bytes on every page that contains continued data from a
> previous record, and 8 bytes on pages that don't. That's not very much,
> and the next step will buy that back:
>
> 3. Allow WAL record header to be split across pages. Per Tom's
> suggestion, move xl_tot_len to be the first field in XLogRecord, so that
> even if the header is split, xl_tot_len is always on the first page.
> xl_crc is moved to be the last field, and xl_prev is the second to last.
> This has the advantage that you can calculate the CRC for all the other
> fields before acquiring WALInsertLock. For xl_prev, you need to know
> where exactly the record is inserted, so it's handy that it's the last
> field before CRC. This patch doesn't try to take advantage of that,
> however, and I'm not sure if that makes any difference once I finish the
> patch to make XLogInsert scale better, which is the ultimate goal of all
> this.
>
> Those are the three patches I'd like to get committed in this
> commitfest. To see where all this is leading to, I've included a rough
> WIP version of the XLogInsert scaling patch. This version is quite
> different from the one I posted in spring, it takes advantage of the WAL
> format changes, and I'm also experimenting with a different method of
> tracking how far each WAL insertion has progressed. But more on that later.
>
> (Note to self: remember to bump XLOG_PAGE_MAGIC)
Will review.

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
