Re: Proposal for CSN based snapshots

From: Greg Stark <stark(at)mit(dot)edu>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>, Ants Aasma <ants(at)cybertec(dot)at>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Markus Wanner <markus(at)bluegap(dot)ch>
Subject: Re: Proposal for CSN based snapshots
Date: 2014-08-26 18:25:57
Message-ID: CAM-w4HMJReV6DLZC6v+-TR3_S0eBkxJtiuMrTkb92sdOssJezA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 26, 2014 at 11:45 AM, Heikki Linnakangas
<hlinnakangas(at)vmware(dot)com> wrote:
>> It appears that this patch weakens the idea of hint bits. Even if
>> HEAP_XMIN_COMMITTED is set, it still needs to find out if it's in the
>> snapshot.
>>
>> That's fast if the xid is less than snap->xmin, but otherwise it needs
>> to do a fetch from the csnlog, which is exactly what the hint bits are
>> designed to avoid. And we can't get around this, because the whole point
>> of this patch is to remove the xip array from the snapshot.
>
>
> Yeah. This patch in the current state is likely much much slower than
> unpatched master, except in extreme cases where you have thousands of
> connections and short transactions so that without the patch, you spend most
> of the time acquiring snapshots.

Interesting analysis.

I suppose the equivalent of hint bits would be to actually write the
CSN of the transaction into the record when the hint bit is set.

I don't immediately see how to make that practical. One thought would
be to have a list of xids in the page header with their corresponding
csn -- which starts to sound a lot like Oralce's "Interested
Transaction List". But I don't see how to make that work for the
hundreds of possible xids on the page.

The worst case for visibility resolution is you have a narrow table
that has random access DDL happening all the time, each update is a
short transaction and there are a very high rate of such transactions
spread out uniformly over a very large table. That means any given
page has over 200 rows with random xids spread over a very large range
of xids.

Currently the invariant hint bits give us is that each xid needs to be
looked up in the clog only a more or less fixed number of times, in
that scenario only once since the table is very large and the
transactions short lived.

--
greg

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-08-26 18:27:45 Re: Compute attr_needed for child relations (was Re: inherit support for foreign tables)
Previous Message Jeff Davis 2014-08-26 17:58:55 Re: Proposal for CSN based snapshots