Re: Logical decoding and walsender timeouts

From: Andres Freund <andres(at)anarazel(dot)de>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Subject: Re: Logical decoding and walsender timeouts
Date: 2016-10-31 08:52:23
Message-ID: 20161031085223.zjexqkuau5t32bfl@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2016-10-31 16:34:38 +0800, Craig Ringer wrote:
> TL;DR: Logical decoding clients need to generate their own keepalives
> and not rely on the server requesting them to prevent timeouts. Or
> admins should raise the wal_sender_timeout by a LOT when using logical
> decoding on DBs with any big rows.

Unconvinced.

> When sending a big message, WalSndWriteData() notices that it's
> approaching timeout and tries to send a keepalive request, but the
> request just gets buffered behind the remaining output plugin data and
> isn't seen by the client until the client has received the rest of the
> pending data.

That only applies to individual messages, though, not to the entire
transaction. Are you sure the problem at hand is that we're sending a
keepalive, but too late? It might very well be that the actual issue is
that we're never sending keepalives, because the network is fast enough /
the TCP window is large enough. IIRC we only send a keepalive if we're
blocked on network IO?
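
To make that hypothesis concrete, here is a toy model of a sender that
only considers a keepalive when a write is left pending. It is a sketch
under stated assumptions, not PostgreSQL source: `ToySender` and
`toy_write` are invented names, and the "half the timeout" threshold is
an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy state; none of these names exist in PostgreSQL. */
typedef struct
{
    int64_t last_reply_ms;       /* when the client last replied */
    int64_t timeout_ms;          /* stands in for wal_sender_timeout */
    bool    keepalive_requested; /* a request is already outstanding */
} ToySender;

/*
 * Model one write. Returns true if a keepalive request was queued.
 * The timeout check is only reached when the send did not complete,
 * i.e. when the sender is blocked on network IO.
 */
bool
toy_write(ToySender *s, int64_t now_ms, bool send_pending)
{
    assert(s != NULL);

    if (!send_pending)
        return false;           /* fast path: everything was flushed */
    if (s->keepalive_requested)
        return false;           /* already asked, do not ask again */
    if (now_ms - s->last_reply_ms < s->timeout_ms / 2)
        return false;           /* assumed threshold: half the timeout */

    s->keepalive_requested = true;
    return true;
}
```

In this model, a sender whose writes always complete immediately never
requests a keepalive, no matter how long the client has been silent.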

> So: We could ask output plugins to deal with this for us, by chunking
> up their data in small pieces and calling OutputPluginPrepareWrite()
> and OutputPluginWrite() more than once per output plugin callback if
> they expect to send a big message. But this pushes the complexity of
> splitting up and handling big rows, and big Datums, onto each plugin.
> It's awkward to do well and hard to avoid splitting things up
> unnecessarily.

There's a decent reason for doing that independently, though: it's a lot
more efficient from a memory management POV.
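
As a rough illustration of the chunking idea, the sketch below hands a
large payload to a sink in bounded pieces, so no single write needs a
buffer bigger than the chunk size. All names (`emit_in_chunks`,
`chunk_sink`, `ChunkStats`, `count_chunk`) are hypothetical; in a real
output plugin the sink would presumably wrap one
OutputPluginPrepareWrite() / OutputPluginWrite() pair per chunk, which
is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback invoked once per chunk. */
typedef void (*chunk_sink) (const char *data, size_t len, void *arg);

/* Example sink state: tracks total bytes and the largest chunk seen. */
typedef struct
{
    size_t total;
    size_t largest;
} ChunkStats;

void
count_chunk(const char *data, size_t len, void *arg)
{
    ChunkStats *st = (ChunkStats *) arg;

    (void) data;                /* payload contents are not inspected */
    st->total += len;
    if (len > st->largest)
        st->largest = len;
}

/*
 * Pass payload to sink in pieces of at most chunk_size bytes.
 * Returns the number of chunks emitted (0 if chunk_size is 0).
 */
size_t
emit_in_chunks(const char *payload, size_t len, size_t chunk_size,
               chunk_sink sink, void *arg)
{
    size_t nchunks = 0;

    assert(sink != NULL);
    if (chunk_size == 0)
        return 0;

    for (size_t off = 0; off < len; off += chunk_size)
    {
        size_t n = (len - off < chunk_size) ? len - off : chunk_size;

        sink(payload + off, n, arg);
        nchunks++;
    }
    return nchunks;
}
```

Each chunk boundary is also a point where the sender gets a chance to
run its timeout handling.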

I don't think the "unrequested keepalive" approach really solves the
problem on a fundamental enough level.

> (A separate issue is that we can also time out when doing logical
> _replication_ if the downstream side blocks on a lock, since it's not
> safe to send on a socket from a signal handler ... )

Strictly speaking, that's not true: write() and sendmsg() are
async-signal-safe functions. There are good reasons not to do that,
however, namely that the non-signal-handler code might be busy writing
data itself.
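
One common way around that interleaving problem is to have the handler
only set a flag and let the normal code path do the write once it is
between complete messages. The sketch below shows that pattern in
isolation. It is not how walsender is written; the function names are
invented, and signal() is used only to keep the example in standard C
(real code would more likely use sigaction()).

```c
#include <signal.h>

/* Set by the handler, consumed by the main loop. */
static volatile sig_atomic_t keepalive_wanted = 0;

/* The handler does no IO: it cannot tear a message in flight. */
static void
on_keepalive_signal(int signo)
{
    (void) signo;
    keepalive_wanted = 1;
}

/* Install the handler for signo. Returns 0 on success, -1 on failure. */
int
install_keepalive_handler(int signo)
{
    return (signal(signo, on_keepalive_signal) == SIG_ERR) ? -1 : 0;
}

/*
 * Called by the main loop between complete messages.
 * Returns 1 (and clears the flag) if a keepalive should be sent now.
 */
int
take_keepalive_request(void)
{
    if (!keepalive_wanted)
        return 0;
    keepalive_wanted = 0;
    return 1;
}
```

The main loop would call take_keepalive_request() after finishing each
message and write the keepalive itself, so the bytes never land in the
middle of another message.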

Greetings,

Andres Freund
