Re: Optimization for updating foreign tables in Postgres FDW

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: ashutosh(dot)bapat(at)enterprisedb(dot)com
Cc: robertmhaas(at)gmail(dot)com, michael(dot)paquier(at)gmail(dot)com, fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp, noah(at)leadboat(dot)com, rushabh(dot)lathia(at)gmail(dot)com, thom(at)linux(dot)com, sfrost(at)snowman(dot)net, laurenz(dot)albe(at)wien(dot)gv(dot)at, tgl(at)sss(dot)pgh(dot)pa(dot)us, shigeru(dot)hanada(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Optimization for updating foreign tables in Postgres FDW
Date: 2017-04-18 04:00:22
Message-ID: 20170418.130022.222486079.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Tue, 18 Apr 2017 09:12:07 +0530, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote in <CAFjFpRcRjohUZaCNon1zkP7c6fo2qd5hnfHCT6t_39SGtS_oKQ(at)mail(dot)gmail(dot)com>
> On Tue, Apr 18, 2017 at 8:12 AM, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> > At Mon, 17 Apr 2017 17:50:58 +0530, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote in <CAFjFpRdcWw4h0a-zrL-EiaekkPj8O0GR2M1FwZ1useSRfRm3-g(at)mail(dot)gmail(dot)com>
> >> On Mon, Apr 17, 2017 at 1:53 PM, Kyotaro HORIGUCHI
> >> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >> > At Thu, 13 Apr 2017 13:04:12 -0400, Robert Haas <robertmhaas(at)gmail(dot)com> wrote in <CA+TgmoaxnNmuONgP=bXJojrgbnMPTi6Ms8OSwZBC2YQ2ueUiSg(at)mail(dot)gmail(dot)com>
> >> >> On Thu, Apr 21, 2016 at 10:53 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >> >> > On Thu, Apr 21, 2016 at 8:44 AM, Michael Paquier
> >> >> > <michael(dot)paquier(at)gmail(dot)com> wrote:
> >> >> >> On Thu, Apr 21, 2016 at 5:22 PM, Etsuro Fujita
> >> >> >> <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >> >> >>> Attached is an updated version of the patch, which modified Michael's
> >> >> >>> version of the patch, as I proposed in [1] (see "Other changes:"). I
> >> >> >>> modified comments for pgfdw_get_result/pgfdw_exec_query also, mainly because
> >> >> >>> words like "non-blocking mode" there seems confusing (note that we have
> >> >> >>> PQsetnonbloking).
> >> >> >>
> >> >> >> OK, so that is what I sent except that the comments mentioning PG_TRY
> >> >> >> are moved to their correct places. That's fine for me. Thanks for
> >> >> >> gathering everything in a single patch and correcting it.
> >> >> >
> >> >> > I have committed this patch. Thanks for working on this. Sorry for the delay.
> >> >>
> >> >> This 9.6-era patch, as it turns out, has a problem, which is that we
> >> >> now respond to an interrupt by sending a cancel request and a
> >> >> NON-interruptible ABORT TRANSACTION command to the remote side. If
> >> >> the reason that the user is trying to interrupt is that the network
> >> >> connection has been cut, they interrupt the original query only to get
> >> >> stuck in a non-interruptible wait for ABORT TRANSACTION. That is
> >> >> non-optimal.
> >> >
> >> > Agreed.
> >> >
> >> >> It is not exactly clear to me how to fix this. Could we get by with
> >> >> just slamming the remote connection shut, instead of sending an
> >> >> explicit ABORT TRANSACTION? The remote side ought to treat a
> >> >> disconnect as equivalent to an ABORT anyway, I think, but maybe our
> >> >> local state would get confused. (I have not checked.)
> >> >>
> >> >> Thoughts?
> >> >
> >> > Perhaps we will get stuck at query cancellation before ABORT
> >> > TRANSACTION in the case. A connection will be shut down when
> >> > anything wrong (PQstatus(conn) != CONNECTION_OK and so) on
> >> > aborting local transactoin . So I don't think fdw gets confused
> >> > or sholdn't be confused by shutting down there.
> >> >
> >> > The most significant issue I can see is that the same thing
> >> > happens in the case of graceful ABORT TRANSACTION. It could be a
> >> > performance issue.
> >> >
> >> > We could set timeout here but maybe we can just slamming the
> >> > connection down instead of sending a query cancellation. It is
> >> > caused only by timeout or interrupts so I suppose it is not a
> >> > problem *with a few connections*.
> >> >
> >> >
> >> > Things are a bit diffent with hundreds of connections. The
> >> > penalty of reconnection would be very high in the case.
> >> >
> >> > If we are not willing to pay such high penalty, maybe we are to
> >> > manage busy-idle time of each connection and trying graceful
> >> > abort if it is short enough, maybe having a shoft timeout.
> >> >
> >> > Furthermore, if most or all of the hundreds of connections get
> >> > stuck, such timeout will accumulate up like a mountain...
> >>
> >> Even when the transaction is aborted because a user cancels a query,
> >> we do want to preserve the connection, if possible, to avoid
> >
> > Yes.
> >
> >> reconnection. If the request to cancel the query itself fails, we
> >> should certainly drop the connection. Here's the patch to do that.
> >
> > A problem I think on this would be that we still try to make
> > another connection for canceling and it would stall for several
> > minutes per connection on a packet stall, which should be done in
> > a second on ordinary circumstances. Perhaps we might want here is
> > async canceling with timeout.
>
> I am not sure what do you mean "make another connection for
> cancelling.". Do you mean to say that PQcancel would make another
> connection?

Yes. It will take about 3 minutes on standard settings when no
ACK returned. I thought that this discussion is based on such
situation.

> The patch proposed simply improves the condition when PQcancel has
> returned a failure. Right now we ignore that failure and try to ABORT
> the transaction, which is most probably going to get stalled or fail
> to be dispatched. With the patch such a connection will be reset.

Ah, I understand that. It is surely an improvement since it
avoids useless ABORT TRANSACTION that is known to stall.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2017-04-18 04:16:57 Re: snapbuild woes
Previous Message Ashutosh Bapat 2017-04-18 03:42:07 Re: Optimization for updating foreign tables in Postgres FDW