Re: Pipelining executions to postgresql server

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Mikko Tiihonen <Mikko(dot)Tiihonen(at)nitorcreations(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Pipelining executions to postgresql server
Date: 2014-11-03 14:27:36
Message-ID: 545790D8.9090201@2ndquadrant.com
Lists: pgsql-hackers pgsql-jdbc

On 11/02/2014 09:27 PM, Mikko Tiihonen wrote:
> Is the following summary correct:
> - the network protocol supports pipelining

Yes.

All you have to do is *not* send a Sync message after each query. Be
aware that after an error the server will discard all input until the
next Sync, so pipelining + autocommit doesn't make a ton of sense for
error handling reasons.
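
To make the error-handling point concrete, here is a toy model of that
discard-until-Sync behaviour. It is only a sketch: the Msg enum and the
outcome strings are made up for illustration and this is not real
protocol or driver code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of how the server treats a pipelined message stream. Not real protocol code. */
public class SyncDiscardModel {
    /** Hypothetical message kinds: a step that succeeds, a step that fails, and Sync. */
    enum Msg { OK_QUERY, BAD_QUERY, SYNC }

    /** Returns one outcome string per input message, in order. */
    static List<String> process(List<Msg> stream) {
        List<String> out = new ArrayList<>();
        boolean skipping = false; // true after an error, until the next Sync arrives
        for (Msg m : stream) {
            if (m == Msg.SYNC) {
                skipping = false;          // Sync ends error recovery
                out.add("ReadyForQuery");
            } else if (skipping) {
                out.add("discarded");      // everything up to the next Sync is dropped
            } else if (m == Msg.BAD_QUERY) {
                skipping = true;
                out.add("ErrorResponse");
            } else {
                out.add("executed");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Msg> stream = Arrays.asList(
            Msg.OK_QUERY, Msg.BAD_QUERY, Msg.OK_QUERY, Msg.SYNC, Msg.OK_QUERY, Msg.SYNC);
        // prints [executed, ErrorResponse, discarded, ReadyForQuery, executed, ReadyForQuery]
        System.out.println(process(stream));
    }
}
```

The third message is lost even though it was valid, which is why the
client has to decide where its Sync boundaries go.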

> - the server handles operations in order, starting the processing of next operation only after fully processing the previous one - thus pipelining is invisible to the server

As far as I know, yes. The server just doesn't care.

> - libpq driver does not support pipelining, but that is due to internal limitations

Yep.

> - if proper error handling is done by the client then there is no reason why pipelining could not be supported by any pg client

Indeed, and most should support it. Sending batches of related queries
would make things a LOT faster.

PgJDBC's batch support is currently write-oriented. There is no
fundamental reason it can't be expanded for reads. I've already written
a patch to do just that for the case of returning generated keys.

https://github.com/ringerc/pgjdbc/tree/batch-returning-support

and just need to rebase it so I can send a pull request to upstream
PgJDBC. It's already linked in the issues documenting the limitations in
batch support.

If you want to have more general support for batches that return rowsets
there's no fundamental technical reason why it can't be added. It just
requires some tedious refactoring of the driver to either:

- Sync and wait before it fills its *send* buffer, rather than trying
to manage its receive buffer (the server send buffer), so it can
reliably avoid deadlocks; or

- Do async I/O in a select()-like loop over a protocol state machine,
so it can simultaneously read and write on the wire.
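
As a rough sketch of the first option, assuming the driver can estimate
the size of each outgoing message: keep a running total of unsynced
bytes and Sync-and-wait before the total would exceed a budget. The
class name, method name and budget value are hypothetical, not anything
that exists in PgJDBC.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of send-side budgeting: decide after which messages to Sync and wait. */
public class SendBudgetPlanner {
    /**
     * Returns the indexes of the messages after which the client should
     * Sync and drain the replies, so that no more than sendBudgetBytes
     * are ever in flight unsynced (a single oversized message is sent alone).
     */
    static List<Integer> syncPoints(int[] messageSizes, int sendBudgetBytes) {
        List<Integer> syncAfter = new ArrayList<>();
        int pending = 0; // bytes sent since the last Sync
        for (int i = 0; i < messageSizes.length; i++) {
            // If this message would overflow the budget, Sync after the previous one.
            if (pending > 0 && pending + messageSizes[i] > sendBudgetBytes) {
                syncAfter.add(i - 1);
                pending = 0;
            }
            pending += messageSizes[i];
        }
        if (pending > 0) {
            syncAfter.add(messageSizes.length - 1); // final Sync closes the batch
        }
        return syncAfter;
    }

    public static void main(String[] args) {
        // Five 30-byte messages with a 100-byte budget: prints [2, 4]
        System.out.println(syncPoints(new int[] {30, 30, 30, 30, 30}, 100));
    }
}
```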

I might need to do some of that myself soon, but it's a big (and
therefore error-prone) job I've so far avoided by making smaller, more
targeted changes.

Doing async I/O using Java nio channels is by far the better approach,
but also the more invasive one. The driver currently sends data on the
wire where it generates it and blocks to receive expected data.
Switching to send-side buffer management doesn't have the full
performance gains that doing bidirectional I/O via channels does,
though, and may be a significant performance _loss_ if you're sending
big queries but getting small replies.
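
For illustration, a minimal sketch of the channel-based idea: one
Selector loop that writes whenever the socket can take more data and
reads whenever replies are available, so neither side's buffer can
wedge the other. The peer here is a local echo thread standing in for
the server; DuplexLoop, startEcho and exchange are made-up names and
none of this is PgJDBC code.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class DuplexLoop {
    /** Starts a blocking echo peer on a loopback port (stand-in for the server). */
    static ServerSocketChannel startEcho() throws IOException {
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0));
        Thread t = new Thread(() -> {
            try (SocketChannel c = server.accept()) {
                ByteBuffer buf = ByteBuffer.allocate(8192);
                while (c.read(buf) >= 0) {
                    buf.flip();
                    while (buf.hasRemaining()) c.write(buf); // blocks if the client stops reading
                    buf.clear();
                }
            } catch (IOException ignored) {
                // peer went away; nothing to do in a demo
            }
        });
        t.setDaemon(true);
        t.start();
        return server;
    }

    /** Sends payload while concurrently draining the echoed bytes; returns bytes received. */
    static int exchange(SocketChannel ch, byte[] payload) throws IOException {
        ch.configureBlocking(false);
        ByteBuffer out = ByteBuffer.wrap(payload);
        ByteBuffer in = ByteBuffer.allocate(8192);
        int received = 0;
        try (Selector sel = Selector.open()) {
            SelectionKey key = ch.register(sel, SelectionKey.OP_READ | SelectionKey.OP_WRITE);
            while (received < payload.length) {
                sel.select();
                sel.selectedKeys().clear();
                if (key.isWritable() && out.hasRemaining()) {
                    ch.write(out); // writes as much as the socket will accept right now
                    if (!out.hasRemaining()) key.interestOps(SelectionKey.OP_READ);
                }
                if (key.isReadable()) {
                    in.clear();
                    int n = ch.read(in);
                    if (n < 0) throw new IOException("peer closed early");
                    received += n;
                }
            }
        }
        return received;
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocketChannel server = startEcho();
             SocketChannel ch = SocketChannel.open(server.getLocalAddress())) {
            System.out.println(exchange(ch, new byte[4 * 1024 * 1024]) + " bytes echoed");
        }
    }
}
```

A client that wrote the whole payload with blocking I/O before reading
anything could stall here once both socket buffers fill up, which is
the deadlock the driver has to avoid.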

For JDBC, the batch interface is the right place to do this, and IMO you
should not attempt to add pipelining outside that interface.
(Multiple open resultsets from portals, yes, but not pipelining of queries).
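
For reference, this is roughly what the standard JDBC batch interface
looks like from the application side. It uses only java.sql calls
(setString, addBatch, executeBatch); the class name, table and column
are hypothetical, and how the driver puts the batch on the wire is up
to the driver.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

/** Sketch of application code using the JDBC batch API. */
public class BatchInsertSketch {
    /**
     * Queues one parameter set per name and submits them as a single batch.
     * The statement is assumed to have been prepared by the caller with
     * exactly one parameter, e.g. (hypothetical table):
     *   conn.prepareStatement("INSERT INTO people(name) VALUES (?)")
     */
    static int[] insertAll(PreparedStatement ps, List<String> names) throws SQLException {
        for (String name : names) {
            ps.setString(1, name);
            ps.addBatch();         // queue locally, nothing is executed yet
        }
        return ps.executeBatch();  // the driver decides how to send the queued statements
    }
}
```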

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
