Confusing with commit time usage in logical decoding

Lists: pgsql-hackers
From: Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Confusing with commit time usage in logical decoding
Date: 2016-02-29 11:18:48
Message-ID: 56D42918.1010108@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

I read this message
http://www.postgresql.org/message-id/56D4197E.9050706@informatik.uni-kl.de

Is this a bug or a typo? In DecodeCommit() in decode.c instead of:

if (parsed->xinfo & XACT_XINFO_HAS_ORIGIN)
{
origin_lsn = parsed->origin_lsn;
commit_time = parsed->origin_timestamp;
}

should be:

if (parsed->xinfo & XACT_XINFO_HAS_ORIGIN)
{
origin_lsn = parsed->origin_lsn;
commit_time = parsed->origin_timestamp;
}
else
commit_time = parsed->xact_time;

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


From: Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Confusing with commit time usage in logical decoding
Date: 2016-03-01 09:09:26
Message-ID: 56D55C46.2040101@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello, Andres

You have introduced a large replication progress tracking infrastructure
last year. And there is a problem described at the link in the quote below.

Attached patch fix this issue. Is this patch correct? I will be grateful
if it is and if it will be committed.

Thanks.

On 29.02.2016 14:18, Artur Zakirov wrote:
> Hello,
>
> I read this message
> http://www.postgresql.org/message-id/56D4197E.9050706@informatik.uni-kl.de
>
> Is this a bug or a typo? In DecodeCommit() in decode.c instead of:
>
> if (parsed->xinfo & XACT_XINFO_HAS_ORIGIN)
> {
> origin_lsn = parsed->origin_lsn;
> commit_time = parsed->origin_timestamp;
> }
>
> should be:
>
> if (parsed->xinfo & XACT_XINFO_HAS_ORIGIN)
> {
> origin_lsn = parsed->origin_lsn;
> commit_time = parsed->origin_timestamp;
> }
> else
> commit_time = parsed->xact_time;
>

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

Attachment Content-Type Size
logical-decoding.patch text/x-patch 422 bytes

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Confusing with commit time usage in logical decoding
Date: 2016-03-01 16:57:55
Message-ID: 20160301165755.GA356079@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Artur Zakirov wrote:
> Hello, Andres
>
> You have introduced a large replication progress tracking infrastructure
> last year. And there is a problem described at the link in the quote below.
>
> Attached patch fix this issue. Is this patch correct? I will be grateful if
> it is and if it will be committed.

AFAICS this is clearly a bug introduced in 5aa235042:

/* replay actions of all transaction + subtransactions in order */
ReorderBufferCommit(ctx->reorder, xid, buf->origptr, buf->endptr,
- parsed->xact_time);
+ commit_time, origin_id, origin_lsn);
}

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Confusing with commit time usage in logical decoding
Date: 2016-03-01 17:09:28
Message-ID: 56D5CCC8.9080703@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/03/16 17:57, Alvaro Herrera wrote:
> Artur Zakirov wrote:
>> Hello, Andres
>>
>> You have introduced a large replication progress tracking infrastructure
>> last year. And there is a problem described at the link in the quote below.
>>
>> Attached patch fix this issue. Is this patch correct? I will be grateful if
>> it is and if it will be committed.
>
> AFAICS this is clearly a bug introduced in 5aa235042:
>
> /* replay actions of all transaction + subtransactions in order */
> ReorderBufferCommit(ctx->reorder, xid, buf->origptr, buf->endptr,
> - parsed->xact_time);
> + commit_time, origin_id, origin_lsn);
> }
>

Well yeah but the commit_time is set few lines above as Artur pointed
out, I think the proposed fix is correct one.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Confusing with commit time usage in logical decoding
Date: 2016-03-01 17:13:28
Message-ID: 20160301171328.GA357017@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Petr Jelinek wrote:
> On 01/03/16 17:57, Alvaro Herrera wrote:
> >Artur Zakirov wrote:
> >>Hello, Andres
> >>
> >>You have introduced a large replication progress tracking infrastructure
> >>last year. And there is a problem described at the link in the quote below.
> >>
> >>Attached patch fix this issue. Is this patch correct? I will be grateful if
> >>it is and if it will be committed.
> >
> >AFAICS this is clearly a bug introduced in 5aa235042:
> >
> > /* replay actions of all transaction + subtransactions in order */
> > ReorderBufferCommit(ctx->reorder, xid, buf->origptr, buf->endptr,
> >- parsed->xact_time);
> >+ commit_time, origin_id, origin_lsn);
> > }
> >
>
> Well yeah but the commit_time is set few lines above as Artur pointed out, I
> think the proposed fix is correct one.

Err, yes, that's exactly what I am saying. Sorry for being unclear.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Andres Freund <andres(at)anarazel(dot)de>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Confusing with commit time usage in logical decoding
Date: 2016-03-01 17:18:04
Message-ID: 20160301171802.6ismn7iumoaw77uw@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2016-03-01 18:09:28 +0100, Petr Jelinek wrote:
> On 01/03/16 17:57, Alvaro Herrera wrote:
> >Artur Zakirov wrote:
> >>Hello, Andres
> >>
> >>You have introduced a large replication progress tracking infrastructure
> >>last year. And there is a problem described at the link in the quote below.
> >>
> >>Attached patch fix this issue. Is this patch correct? I will be grateful if
> >>it is and if it will be committed.
> >
> >AFAICS this is clearly a bug introduced in 5aa235042:
> >
> > /* replay actions of all transaction + subtransactions in order */
> > ReorderBufferCommit(ctx->reorder, xid, buf->origptr, buf->endptr,
> >- parsed->xact_time);
> >+ commit_time, origin_id, origin_lsn);
> > }
> >
>
> Well yeah but the commit_time is set few lines above as Artur pointed out, I
> think the proposed fix is correct one.

I'd rather just initialize commit_time to parsed->xact_time.

This indeed is clearly a bug. I do wonder if anybody has a good idea
about how to add regression tests for this? It's rather annoying that
we have to suppress timestamps in the test_decoding tests, because
they're obviously not reproducible...

Andres


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Petr Jelinek <petr(at)2ndquadrant(dot)com>, Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Confusing with commit time usage in logical decoding
Date: 2016-03-01 17:28:35
Message-ID: 20160301172835.GA358846@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund wrote:

> I'd rather just initialize commit_time to parsed->xact_time.

That also works.

Probably also change its declaration to actually be TimestampTz ...

> This indeed is clearly a bug. I do wonder if anybody has a good idea
> about how to add regression tests for this? It's rather annoying that
> we have to suppress timestamps in the test_decoding tests, because
> they're obviously not reproducible...

Maybe commit two transactions with a 1s sleep in between, and verify
that the difference between the two timestamps is >= 1s and <= now()?
(I don't know the test_decoding test suite)

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Confusing with commit time usage in logical decoding
Date: 2016-03-01 17:31:42
Message-ID: 56D5D1FE.8090704@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 01/03/16 18:18, Andres Freund wrote:
> Hi,
>
> On 2016-03-01 18:09:28 +0100, Petr Jelinek wrote:
>> On 01/03/16 17:57, Alvaro Herrera wrote:
>>> Artur Zakirov wrote:
>>>> Hello, Andres
>>>>
>>>> You have introduced a large replication progress tracking infrastructure
>>>> last year. And there is a problem described at the link in the quote below.
>>>>
>>>> Attached patch fix this issue. Is this patch correct? I will be grateful if
>>>> it is and if it will be committed.
>>>
>>> AFAICS this is clearly a bug introduced in 5aa235042:
>>>
>>> /* replay actions of all transaction + subtransactions in order */
>>> ReorderBufferCommit(ctx->reorder, xid, buf->origptr, buf->endptr,
>>> - parsed->xact_time);
>>> + commit_time, origin_id, origin_lsn);
>>> }
>>>
>>
>> Well yeah but the commit_time is set few lines above as Artur pointed out, I
>> think the proposed fix is correct one.
>
> I'd rather just initialize commit_time to parsed->xact_time.
>
> This indeed is clearly a bug. I do wonder if anybody has a good idea
> about how to add regression tests for this? It's rather annoying that
> we have to suppress timestamps in the test_decoding tests, because
> they're obviously not reproducible...
>

The test for commit timestamps checks that the timestamps are within
reasonable time frame (for example, bigger than value of a timestamp
column in the table since that's assigned before commit obviously) ,
it's not perfect but similar approach should catch issues like this one.

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Andres Freund <andres(at)anarazel(dot)de>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Confusing with commit time usage in logical decoding
Date: 2016-03-03 07:47:27
Message-ID: 20160303074727.hvpitilqrl4fb3a2@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2016-03-01 18:31:42 +0100, Petr Jelinek wrote:
> On 01/03/16 18:18, Andres Freund wrote:
> >I'd rather just initialize commit_time to parsed->xact_time.
> >
> >This indeed is clearly a bug. I do wonder if anybody has a good idea
> >about how to add regression tests for this? It's rather annoying that
> >we have to suppress timestamps in the test_decoding tests, because
> >they're obviously not reproducible...
> >
>
> The test for commit timestamps checks that the timestamps are within
> reasonable time frame (for example, bigger than value of a timestamp column
> in the table since that's assigned before commit obviously) , it's not
> perfect but similar approach should catch issues like this one.

Fixed, including such a test. Thanks for the report; and for the idea of
the fix!

Andres