From: | Steve Singer <steve(at)ssinger(dot)info> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: logical changeset generation v6 |
Date: | 2013-09-28 17:25:09 |
Message-ID: | BLU0-SMTP992D12C8DC135752910F19DC2A0@phx.gbl |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 09/27/2013 05:18 PM, Andres Freund wrote:
> Hi Steve,
>
> On 2013-09-27 17:06:59 -0400, Steve Singer wrote:
>>> I've determined that when in this test the walsender seems to be hitting
>>> this when it is decode the transactions that are behind the slonik
>>> commands to add tables to replication (set add table, set add sequence).
>>> This is before the SUBSCRIBE SET is submitted.
>>>
>>> I've also noticed something else that is strange (but might be unrelated).
>>> If I stop my slon process and restart it I get messages like:
>>>
>>> WARNING: Starting logical replication from 0/a9321360
>>> ERROR: cannot stream from 0/A9321360, minimum is 0/A9320B00
>>>
>>> Where 0/A9321360 was sent in the last packet my slon received from the
>>> walsender before the restart.
> Uh, that looks like I fumbled some comparison. Let me check.
>
>> I've further narrowed this down to something (or the combination of) what
>> the _disorder_replica.altertableaddTriggers(1);
>> stored function does. (or @SLONYNAMESPACE(at)(dot)altertableaddTriggers(int);
>>
>> Which is essentially
>> * Get an exclusive lock on sl_config_lock
>> * Get an exclusive lock on the user table in question
>> * create a trigger (the deny access trigger)
>> * create a truncate trigger
>> * create a deny truncate trigger
>>
>> I am not yet able to replicate the error by issuing the same SQL commands
>> from psql, but I must be missing something.
>>
>> I can replicate this when just using the test_decoding plugin.
> Thanks. That should get me started with debugging. Unless it's possibly
> fixed in the latest version, one bug fixed there might cause something
> like this if the moon stands exactly right?
The latest version has NOT fixed the problem.
Also, I was a bit inaccurate in my previous descriptions. To clarify:
1. I sometimes am getting that 'unexpected duplicate' error
2. The 'set add table ' which triggers those functions that create and
configure triggers is actually causing the walsender to hit the
following assertion
2 0x0000000000773d47 in ExceptionalCondition (
conditionName=conditionName(at)entry=0x8cf400 "!(ent->cmin ==
change->tuplecid.cmin)", errorType=errorType(at)entry=0x7ab830
"FailedAssertion",
fileName=fileName(at)entry=0x8cecc3 "reorderbuffer.c",
lineNumber=lineNumber(at)entry=1162) at assert.c:54
#3 0x0000000000665480 in ReorderBufferBuildTupleCidHash (txn=0x1b6e610,
rb=<optimized out>) at reorderbuffer.c:1162
#4 ReorderBufferCommit (rb=0x1b6e4f8, xid=<optimized out>,
commit_lsn=3461001952, end_lsn=<optimized out>) at reorderbuffer.c:1285
#5 0x000000000065f0f7 in DecodeCommit (xid=<optimized out>,
nsubxacts=<optimized out>, sub_xids=<optimized out>, ninval_msgs=16,
msgs=0x1b637c0, buf=0x7fff54d01530, buf=0x7fff54d01530, ctx=0x1adb928,
ctx=0x1adb928) at decode.c:477
I had added an assert(false) to the code where the 'unknown duplicate'
error was logged to make spotting this easier but yesterday I didn't
double check that I was hitting the assertion I added versus this other
one. I can't yet say if this is two unrelated issues or if I'd get to
the 'unknown duplicate' message immediately after.
> Greetings,
>
> Andres Freund
>
From | Date | Subject | |
---|---|---|---|
Next Message | Kevin Grittner | 2013-09-28 19:03:20 | Re: record identical operator - Review |
Previous Message | Steve Singer | 2013-09-28 16:55:44 | Re: logical changeset generation v6.1 |