Re: RFC: Async query processing

From: Florian Weimer <fweimer(at)redhat(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: RFC: Async query processing
Date: 2013-12-18 16:50:42
Message-ID: 52B1D262.701@redhat.com
Lists: pgsql-hackers

On 11/04/2013 02:51 AM, Claudio Freire wrote:
> On Sun, Nov 3, 2013 at 3:58 PM, Florian Weimer <fweimer(at)redhat(dot)com> wrote:
>> I would like to add truly asynchronous query processing to libpq, enabling
>> command pipelining. The idea is to allow applications to auto-tune to
>> the bandwidth-delay product and reduce the number of context switches when
>> running against a local server.
> ...
>> If the application is not interested in intermediate query results, it would
>> use something like this:
> ...
>> If there is no need to exit from the loop early (say, because errors are
>> expected to be extremely rare), the PQgetResultNoWait call can be left out.
>
> It doesn't seem wise to me making such a distinction. It sounds like
> you're oversimplifying, and that's why you need "modes", to overcome
> the evidently restrictive limits of the simplified interface, and that
> it would only be a matter of (a short) time when some other limitation
> requires some other mode.

I need modes because I want to avoid unbounded buffering, which means that
result data has to be consumed in the order in which the queries are issued.

>> PGAsyncMode oldMode = PQsetsendAsyncMode(conn, PQASYNC_RESULT);
>> bool more_data;
>> do {
>>     more_data = ...;
>>     if (more_data) {
>>         int ret = PQsendQueryParams(conn,
>>             "INSERT ... RETURNING ...", ...);
>>         if (ret == 0) {
>>             // handle low-level error
>>         }
>>     }
>>     // Consume all pending results.
>>     while (1) {
>>         PGresult *res;
>>         if (more_data) {
>>             res = PQgetResultNoWait(conn);
>>         } else {
>>             res = PQgetResult(conn);
>>         }
>
> Somehow, that code looks backwards. I mean, really backwards. Wouldn't
> that be !more_data?

No. If more data is available to transfer to the server, the no-wait
variant has to be used to avoid a needless synchronization with the
server. Blocking in PQgetResult only makes sense once there is nothing
left to send.
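
To illustrate the point, here is a toy model of the loop above. It is only
a sketch, not libpq code: the function name and the assumption that a
non-blocking poll never finds a result are made up for the example. It
counts how often the client has to stall and wait for the server.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model (hypothetical, not part of libpq): count how often the client
 * blocks while issuing n_commands commands.
 *
 * nowait_while_sending == true models calling a no-wait result function
 * as long as more commands remain, and blocking only for the final drain.
 * nowait_while_sending == false models blocking after every command.
 */
static size_t count_sync_points(size_t n_commands, bool nowait_while_sending)
{
    size_t sync_points = 0;
    size_t in_flight = 0;   /* commands sent whose results are not consumed */
    size_t sent = 0;
    bool more_data;

    do {
        more_data = sent < n_commands;
        if (more_data) {
            sent++;         /* models sending one command */
            in_flight++;
        }
        if (more_data && nowait_while_sending) {
            /* Non-blocking poll; the model assumes nothing has arrived. */
        } else if (in_flight > 0) {
            /* Blocking drain: one stall, after which all results are in. */
            sync_points++;
            in_flight = 0;
        }
    } while (more_data);

    assert(in_flight == 0);
    return sync_points;
}
```

In this model, 1000 commands cost 1000 stalls when every result is awaited
with the blocking call, and a single stall when the no-wait variant is used
while more_data is true.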

> In any case, pipelining like that, without a clear distinction, in the
> wire protocol, of which results pertain to which query, could be a
> recipe for trouble when subtle bugs, either in lib usage or
> implementation, mistakenly treat one query's result as another's.

We already use pipelining in libpq (see pqFlush, PQsendQueryGuts and
pqParseInput3), and the server is supposed to support it. There is no
clear tit-for-tat response mechanism anyway, because of NOTIFY/LISTEN
and the way certain errors are reported.

>> Instead of buffering the results, we could buffer the encoded command
>> messages in PQASYNC_RESULT mode. This means that PQsendQueryParams would
>> not block when it cannot send the (complete) command message, but store it in
>> the connection object so that the subsequent PQgetResultNoWait and
>> PQgetResult would send it. This might work better with single-tuple result
>> mode. We cannot avoid buffering either multiple queries or multiple
>> responses if we want to utilize the link bandwidth, or we'd risk deadlocks.
>
> This is a non-solution. Such an implementation, at least as described,
> would remove neither network latency nor context switches; it
> would be purely an API change with no externally visible behavior
> change.

Ugh, why?

> An effective solution must include multi-command packets. Without
> knowing the wire protocol in detail, something like:
>
> PARSE: INSERT blah
> BIND: args
> EXECUTE with DISCARD
> PARSE: INSERT blah
> BIND: args
> EXECUTE with DISCARD
> PARSE: SELECT blah
> BIND: args
> EXECUTE with FETCH ALL
>
> All in one packet, would be efficient and error-free (IMO).

No, because this doesn't scale automatically with the bandwidth-delay
product. It also requires that the client buffers queries and their
parameters even though the network has to do that anyway.
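
As a rough illustration of the bandwidth-delay argument (a sketch with
made-up helper names and units, not libpq code): the number of command
messages that must be in flight to keep the link busy is approximately the
bandwidth-delay product divided by the message size.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bandwidth-delay product in bytes: how much data fits "on the wire"
 * during one round trip. Units are assumptions chosen for the example:
 * bandwidth in bytes per second, round-trip time in microseconds.
 */
static uint64_t bdp_bytes(uint64_t bandwidth_bytes_per_sec, uint64_t rtt_usec)
{
    return bandwidth_bytes_per_sec * rtt_usec / 1000000u;
}

/*
 * Number of command messages of message_bytes each that must be
 * outstanding to fill the link (rounded up, never less than one).
 */
static uint64_t commands_in_flight(uint64_t bandwidth_bytes_per_sec,
                                   uint64_t rtt_usec,
                                   uint64_t message_bytes)
{
    assert(message_bytes > 0);
    uint64_t bdp = bdp_bytes(bandwidth_bytes_per_sec, rtt_usec);
    uint64_t n = (bdp + message_bytes - 1) / message_bytes;
    return n > 0 ? n : 1;
}
```

With 125000000 bytes/s (1 Gbit/s), a 1 ms round trip and 250-byte commands,
commands_in_flight(125000000, 1000, 250) yields 500. A fixed batch size in
a multi-command packet would have to guess that number for every link.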

In any case, I don't want to change the wire protocol; I just want to
enable libpq clients to use more of its capabilities.

--
Florian Weimer / Red Hat Product Security Team
