Re: pg_basebackup caused FailedAssertion

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Jeff Davis <pgsql(at)j-davis(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: pg_basebackup caused FailedAssertion
Date: 2020-12-14 08:51:45
Message-ID: 3d57bc29-4459-578b-79cb-7641baf53c57@iki.fi
Lists: pgsql-hackers

On 12/12/2020 00:47, Jeff Davis wrote:
> On Wed, 2013-02-27 at 19:29 +0200, Heikki Linnakangas wrote:
>> Right. I fixed that by adding WL_SOCKET_READABLE, and handling any
>> messages that might arrive after the frontend already sent CopyEnd.
>> The frontend shouldn't send any messages after CopyEnd, until it
>> receives a CopyEnd from the backend.
>
> It looks like 4bad60e3 may have fixed the problem, is it possible to
> just revert 3a9e64aa and allow the case?

Yes, I think you're right.

> Also, the comment added by 3a9e64aa is misleading, because waiting for
> a CopyDone from the server is not enough. It's possible that the client
> receives the CopyDone from the server and the client sends a new query
> before the server breaks from the loop. The client needs to wait until
> at least the first CommandComplete.

Good point. I think that's a bug in the implementation rather than the
comment, though. ProcessRepliesIfAny() should exit the loop immediately
if (streamingDoneReceiving && streamingDoneSending). But that's moot if
we revert 3a9e64aa altogether. I think we could backpatch the revert,
because the current behavior is not quite right as it is, and 3a9e64aa
is present in all the supported versions.
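
To illustrate the early exit described above, here is a toy model of the
loop. It is only a sketch, not the walsender code: ToyWalSender and
toy_process_replies() are made-up names, the message queue is just a
string of protocol type bytes ('d' = CopyData, 'c' = CopyDone,
'Q' = Query), and only the two flag names are taken from the real code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical): one byte per pending frontend message. */
typedef struct
{
    const char *msgs;               /* e.g. "dcQ", NUL-terminated */
    size_t      pos;                /* index of the next unread message */
    bool        streamingDoneSending;
    bool        streamingDoneReceiving;
} ToyWalSender;

/*
 * Consume pending messages until the input runs out or COPY has ended in
 * both directions. Returns the number of messages consumed.
 */
static size_t
toy_process_replies(ToyWalSender *ws)
{
    size_t      consumed = 0;

    while (ws->msgs[ws->pos] != '\0')
    {
        /*
         * The early exit: once both sides are done, whatever follows
         * belongs to the next command and must not be read here.
         */
        if (ws->streamingDoneReceiving && ws->streamingDoneSending)
            break;

        char        type = ws->msgs[ws->pos++];

        consumed++;

        if (type == 'c')
        {
            /* Frontend sent CopyDone; the model answers with its own. */
            ws->streamingDoneReceiving = true;
            ws->streamingDoneSending = true;
        }
        /* 'd' (CopyData) needs no action in this model. */
    }

    return consumed;
}
```

With the queue "dcQ", the model consumes the CopyData and the CopyDone
and leaves the Query unread for the main command loop, which is the case
a pipelining client would produce.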

>> In theory, the frontend could already send the next query before
>> receiving the CopyEnd, but libpq doesn't currently allow that. Until
>> someone writes a client that actually tries to do that, I'm not
>> going to try to support that in the backend. It would be a lot more
>> work, and likely be broken anyway, without any way to test it.
>
> I tried to add streaming replication support (still a work in progress)
> to the rust client[1], and I ran into this problem.
>
> The core of the rust client is fully pipelined and async, so it's a bit
> annoying to work around this problem.

Since you have the means to test this, would you like to do the honors
and revert 3a9e64aa?

- Heikki
