Re: WIP patch for hint bit i/o mitigation

From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP patch for hint bit i/o mitigation
Date: 2012-11-14 20:31:53
Message-ID: CAOeZVid7zd9V-9WsEtf9wRCd0H4yw_1OgCLm7=CdvGPCxZcJJw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 15, 2012 at 12:42 AM, Atri Sharma <atri(dot)jiit(at)gmail(dot)com> wrote:

>
>
>
> On Wed, Nov 7, 2012 at 9:47 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>
>> On Wed, Nov 7, 2012 at 6:01 AM, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
>> wrote:
>> >> >> Following the sig is a first cut at a patch (written by Atri) that
>> >> >> attempts to mitigate hint bit i/o penalty when many pages worth of
>> >> >> tuples are sequentially written out with the same transaction id.
>> >> >> There have been other attempts to deal with this problem that fit
>> >> >> niche cases (especially those that create the table in the same
>> >> >> transaction as the one inserting) that work but don't really solve
>> >> the
>> >> >> problem generally.
>> >>
>> >> As we are now dealing with only the last xid(please refer to the
>> details
>> >> of the patch attached), the invalidation issues are not significant any
>> >> more.
>> >
>> > I think you are right, but please check below the scenario I have in
>> mind
>> > due to which I got confused:
>> >
>> > Session -1
>> > 1. let's say for tup-1 on page, cachedVisibilityXid is not set, and it
>> go
>> > inside SetHinbits and set it to xid of tuple which is let's say = 708
>> > 2. now for all consecutive tuples which have same xmin (708), it can
>> > directly refer
>> > cached id and cached status, even if hint bit is not set.
>> >
>> > Other Sessions
>> > 3. now from other sessions, large operations happened on all other
>> tables
>> > except this table.
>> > 4. The situation can reach where xid can wrap around.
>> >
>> > Session -1
>> > 5. It again tries to query the same table, at this point it will compare
>> > the xid in tuple with cachedVisibilityXid.
>> >
>> > Now if all tuple's xid's for that particular table are frozen in all
>> cases
>> > then it seems to be okay, otherwise it might be problem.
>> > I am not fully aware about this wrap around and frozed xid concept,
>> thats
>> > why I had doubted
>> > that it might create problem, on further thought, it appears that I was
>> > wrong.
>>
>> Well there's that. But more to the point for the cache to fail you'd
>> have to have a scenario where a table didn't scan any records for 1
>> billion+ transactions. See note [3] above for reasoning why this is
>> implausible. Also we're already relying on this effect in transam.c.
>>
>> merlin
>>
>
> PFA below the sig the updated patch for the same.It maintains a cache
> cachedVisibilityXid which holds the results of clog visibility check.
> cachedVisibilityXidStatus holds the commit status of the XID in
> cachedVisibilityXid.
>
> In each visibility function (except HeapTupleSatisfiesVacuum() ), an
> addition check has been added to check if the commit status of Xmin or Xmax
> of a tuple can be retrieved from the cache.
>
> So, in place of
>
> if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
>
> the condition is now
>
>
> if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED)
> && !(cachedVisibilityXid == HeapTupleHeaderGetXmin(tuple)
> && cachedVisibilityXidStatus == HEAP_XMIN_COMMITTED))
>
> This checks if the commit status can be known from the cache or not before
> proceeding.
>
> I will be posting the patch to commit fest.
>
> Thoughts/Feedback?
>
> Atri
>
> --
> Regards,
>
> Atri
> *l'apprenant
>
>
> *patch:
>
> *** tqual--unchanged.c 2012-09-20 03:17:58.000000000 +0530
> --- tqual.c 2012-11-14 23:27:30.470499857 +0530
> ***************
> *** 75,80 ****
> --- 75,88 ----
>
> /* local functions */
> static bool XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot);
>
> + /*
> + * Single-item cache for results of clog visibility check. It's worth
> having
> + * such a cache to help reduce the amount of hint bit traffic when
> + * many sequentially touched tuples have the same XID.
> + */
> + static TransactionId cachedVisibilityXid;
> + /* Visibility status cache stores the commit status of the XID in
> cachedVisibilityXid */
> + static uint16 cachedVisibilityXidStatus;
>
> /*
> * SetHintBits()
> ***************
> *** 117,122 ****
> --- 125,133 ----
>
>
> if (XLogNeedsFlush(commitLSN) && BufferIsPermanent(buffer))
> return; /* not flushed yet, so don't set hint
> */
> +
> + cachedVisibilityXid = xid;
> + cachedVisibilityXidStatus = infomask;
> }
>
> tuple->t_infomask |= infomask;
> ***************
> *** 164,170 ****
> bool
> HeapTupleSatisfiesSelf(HeapTupleHeader tuple, Snapshot snapshot, Buffer
> buffer)
>
> {
> ! if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
> {
> if (tuple->t_infomask & HEAP_XMIN_INVALID)
> return false;
> --- 175,183 ----
> bool
> HeapTupleSatisfiesSelf(HeapTupleHeader tuple, Snapshot snapshot, Buffer
> buffer)
>
> {
> ! if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED)
> ! && !(cachedVisibilityXid == HeapTupleHeaderGetXmin(tuple)
> ! && cachedVisibilityXidStatus == HEAP_XMIN_COMMITTED))
> {
> if (tuple->t_infomask & HEAP_XMIN_INVALID)
> return false;
> ***************
> *** 247,253 ****
> if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid or
> aborted */
> return true;
>
> ! if (tuple->t_infomask & HEAP_XMAX_COMMITTED)
> {
> if (tuple->t_infomask & HEAP_IS_LOCKED)
> return true;
> --- 260,268 ----
> if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid or
> aborted */
> return true;
>
> ! if ((tuple->t_infomask & HEAP_XMAX_COMMITTED)
> ! || (cachedVisibilityXid == HeapTupleHeaderGetXmax(tuple)
> ! && cachedVisibilityXidStatus == HEAP_XMAX_COMMITTED))
> {
> if (tuple->t_infomask & HEAP_IS_LOCKED)
> return true;
> ***************
> *** 327,333 ****
> bool
> HeapTupleSatisfiesNow(HeapTupleHeader tuple, Snapshot snapshot, Buffer
> buffer)
>
> {
> ! if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
> {
> if (tuple->t_infomask & HEAP_XMIN_INVALID)
> return false;
> --- 342,350 ----
> bool
> HeapTupleSatisfiesNow(HeapTupleHeader tuple, Snapshot snapshot, Buffer
> buffer)
>
> {
> ! if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED)
> ! && !(cachedVisibilityXid == HeapTupleHeaderGetXmin(tuple)
> ! && cachedVisibilityXidStatus == HEAP_XMIN_COMMITTED))
> {
> if (tuple->t_infomask & HEAP_XMIN_INVALID)
> return false;
> ***************
> *** 416,422 ****
> if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid or
> aborted */
> return true;
>
> ! if (tuple->t_infomask & HEAP_XMAX_COMMITTED)
> {
> if (tuple->t_infomask & HEAP_IS_LOCKED)
> return true;
> --- 433,441 ----
> if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid or
> aborted */
> return true;
>
> ! if ((tuple->t_infomask & HEAP_XMAX_COMMITTED)
> ! || (cachedVisibilityXid == HeapTupleHeaderGetXmax(tuple)
> ! && cachedVisibilityXidStatus == HEAP_XMAX_COMMITTED))
> {
> if (tuple->t_infomask & HEAP_IS_LOCKED)
> return true;
> ***************
> *** 493,499 ****
> HeapTupleSatisfiesToast(HeapTupleHeader tuple, Snapshot snapshot,
>
> Buffer buffer)
> {
> ! if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
> {
> if (tuple->t_infomask & HEAP_XMIN_INVALID)
> return false;
> --- 512,520 ----
> HeapTupleSatisfiesToast(HeapTupleHeader tuple, Snapshot snapshot,
>
> Buffer buffer)
> {
> ! if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED)
> ! && !(cachedVisibilityXid == HeapTupleHeaderGetXmin(tuple)
> ! && cachedVisibilityXidStatus == HEAP_XMIN_COMMITTED))
> {
> if (tuple->t_infomask & HEAP_XMIN_INVALID)
> return false;
> ***************
> *** 574,580 ****
> HeapTupleSatisfiesUpdate(HeapTupleHeader tuple, CommandId curcid,
>
> Buffer buffer)
> {
> ! if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
> {
> if (tuple->t_infomask & HEAP_XMIN_INVALID)
> return HeapTupleInvisible;
> --- 595,603 ----
> HeapTupleSatisfiesUpdate(HeapTupleHeader tuple, CommandId curcid,
>
> Buffer buffer)
> {
> ! if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED)
> ! && !(cachedVisibilityXid == HeapTupleHeaderGetXmin(tuple)
> ! && cachedVisibilityXidStatus == HEAP_XMIN_COMMITTED))
> {
> if (tuple->t_infomask & HEAP_XMIN_INVALID)
> return HeapTupleInvisible;
> ***************
> *** 663,669 ****
> if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid or
> aborted */
> return HeapTupleMayBeUpdated;
>
> ! if (tuple->t_infomask & HEAP_XMAX_COMMITTED)
> {
> if (tuple->t_infomask & HEAP_IS_LOCKED)
> return HeapTupleMayBeUpdated;
> --- 686,694 ----
> if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid or
> aborted */
> return HeapTupleMayBeUpdated;
>
> ! if ((tuple->t_infomask & HEAP_XMAX_COMMITTED)
> ! || (cachedVisibilityXid == HeapTupleHeaderGetXmax(tuple)
> ! && cachedVisibilityXidStatus == HEAP_XMAX_COMMITTED))
> {
> if (tuple->t_infomask & HEAP_IS_LOCKED)
> return HeapTupleMayBeUpdated;
> ***************
> *** 743,749 ****
> {
> snapshot->xmin = snapshot->xmax = InvalidTransactionId;
>
>
> ! if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
> {
> if (tuple->t_infomask & HEAP_XMIN_INVALID)
> return false;
> --- 768,776 ----
> {
> snapshot->xmin = snapshot->xmax = InvalidTransactionId;
>
>
> ! if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED)
> ! && !(cachedVisibilityXid == HeapTupleHeaderGetXmin(tuple)
> ! && cachedVisibilityXidStatus == HEAP_XMIN_COMMITTED))
> {
> if (tuple->t_infomask & HEAP_XMIN_INVALID)
> return false;
> ***************
> *** 830,836 ****
> if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid or
> aborted */
> return true;
>
> ! if (tuple->t_infomask & HEAP_XMAX_COMMITTED)
> {
> if (tuple->t_infomask & HEAP_IS_LOCKED)
> return true;
> --- 857,865 ----
> if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid or
> aborted */
> return true;
>
> ! if ((tuple->t_infomask & HEAP_XMAX_COMMITTED)
> ! || (cachedVisibilityXid == HeapTupleHeaderGetXmax(tuple)
> ! && cachedVisibilityXidStatus == HEAP_XMAX_COMMITTED))
> {
> if (tuple->t_infomask & HEAP_IS_LOCKED)
> return true;
>
> ***************
> *** 904,910 ****
> HeapTupleSatisfiesMVCC(HeapTupleHeader tuple, Snapshot snapshot,
> Buffer buffer)
> {
> ! if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
> {
> if (tuple->t_infomask & HEAP_XMIN_INVALID)
> return false;
> --- 933,941 ----
>
> HeapTupleSatisfiesMVCC(HeapTupleHeader tuple, Snapshot snapshot,
> Buffer buffer)
> {
> ! if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED)
> ! && !(cachedVisibilityXid == HeapTupleHeaderGetXmin(tuple)
> ! && cachedVisibilityXidStatus == HEAP_XMIN_COMMITTED))
> {
> if (tuple->t_infomask & HEAP_XMIN_INVALID)
> return false;
> ***************
> *** 1008,1014 ****
> return true;
> }
>
> ! if (!(tuple->t_infomask & HEAP_XMAX_COMMITTED))
> {
> if
> (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tuple)))
> {
> --- 1039,1047 ----
>
> return true;
> }
>
> ! if (!(tuple->t_infomask & HEAP_XMAX_COMMITTED)
> ! && !(cachedVisibilityXid == HeapTupleHeaderGetXmax(tuple)
> ! && cachedVisibilityXidStatus == HEAP_XMAX_COMMITTED))
> {
> if
> (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tuple)))
> {
> ***************
> *** 1240,1246 ****
> * invalid, then we assume it's still alive (since the presumption
> is that
> * all relevant hint bits were just set moments ago).
> */
>
> ! if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
> return (tuple->t_infomask & HEAP_XMIN_INVALID) != 0 ? true :
> false;
>
> /*
> --- 1273,1281 ----
> * invalid, then we assume it's still alive (since the presumption
> is that
> * all relevant hint bits were just set moments ago).
> */
>
> ! if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED)
> ! && !(cachedVisibilityXid == HeapTupleHeaderGetXmin(tuple)
> ! && cachedVisibilityXidStatus == HEAP_XMIN_COMMITTED))
> return (tuple->t_infomask & HEAP_XMIN_INVALID) != 0 ? true :
> false;
>
> /*
> ***************
> *** 1253,1259 ****
> return false;
>
> /* If deleter isn't known to have committed, assume it's still
> running. */
>
> ! if (!(tuple->t_infomask & HEAP_XMAX_COMMITTED))
> return false;
>
> /* Deleter committed, so tuple is dead if the XID is old enough. */
> --- 1288,1296 ----
> return false;
>
> /* If deleter isn't known to have committed, assume it's still
> running. */
>
> ! if (!(tuple->t_infomask & HEAP_XMAX_COMMITTED)
> ! && !(cachedVisibilityXid == HeapTupleHeaderGetXmax(tuple)
> ! && cachedVisibilityXidStatus == HEAP_XMAX_COMMITTED))
> return false;
>
> /* Deleter committed, so tuple is dead if the XID is old enough. */
>
>
Attached is the patch for review.

Atri

--
Regards,

Atri
*l'apprenant*

Attachment Content-Type Size
tqual_xid_cache_v3_1.patch application/octet-stream 8.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Karl O. Pinc 2012-11-14 20:35:54 Re: Doc patch, further describe and-mask nature of the permission system
Previous Message Andrew Dunstan 2012-11-14 20:21:29 Re: Enabling Checksums