Re: Scaling XLog insertion (was Re: Moving more work outside WALInsertLock)

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: 'PostgreSQL-development' <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Scaling XLog insertion (was Re: Moving more work outside WALInsertLock)
Date: 2012-03-05 08:31:56
Message-ID: 4F5479FC.3030304@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 20.02.2012 08:00, Amit Kapila wrote:
> I was trying to understand this patch and had few doubts:
>
> 1. In PerformXLogInsert(), why there is need to check freespace when
> already during ReserveXLogInsertLocation(), the space is reserved. Is
> it possible that the record size is more than actually calculted in
> ReserveXLogInsertLocation(), if so in that case what I understand is
> it is moving to next page to write, however isn't it possible that
> some other backend had already reserved that space.

The calculations between PerformXLogInsert (called CopyXLogRecordToWAL()
in the latest patch version) and ReserveXLogInsertLocation() must always
match, otherwise we have reserved incorrect amount of WAL and you get
corrupt WAL. They both need to do the same calculations of how the WAL
record is split across pages, which depends on how much free space there
is on the first page. There is an assertion in CopyXLogRecordToWAL() to
check that once it's finished writing the WAL record, the last byte
landed on the position that ReserveXLogInsertLocation() calculated it
would.

Another way to do that would be to remember the calculations done in
ReserveXLogInsertLocation(), in an extra array or something. But we want
to keep ReserveXLogInsertLocation() as simple as possible, as that runs
while holding the spinlock. Any extra CPU cycles there will hurt
scalability.

> 2. In function WaitForXLogInsertionSlotToBecomeFree(), chances are
> there such that when nextslot equals lastslot, all new backends try
> to reserve a slot will start waiting on same last slot which can
> lead to serialization for those backends and can impact latency.

True. That warrants some performance testing to see if that effect is
significant. (it's surely better than the current situation, anyway,
where all WAL insertions block on the single lock)

> 3. GetXlogBuffer - This will get called twice, once for normal
> buffer, second time for when there is not enough space in current
> page, and both times it can lead to I/O whereas in earlier algorithm,
> the chances of I/O is only once.

I don't see any difference to the previous situation. In both cases, if
you need a new page to copy the WAL record to, you need to first flush
out some old pages from the WAL buffers if they're all dirty. The patch
doesn't change the number of WAL buffers consumed. Note that
GetXLogBuffer() is very cheap when it doesn't need to do I/O, extra
calls to it don't matter if the page is already initialized.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Shigeru Hanada 2012-03-05 09:21:19 Re: pgsql_fdw, FDW for PostgreSQL server
Previous Message Michael Meskes 2012-03-05 08:26:15 Re: ECPG FETCH readahead