Re: jdbc spec violation for autocommit=true & addbatch/executeBatch

From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Quartz <quartz12h(at)yahoo(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: jdbc spec violation for autocommit=true & addbatch/executeBatch
Date: 2011-01-18 22:20:29
Message-ID: 4D36122D.9030308@opencloud.com
Lists: pgsql-jdbc

Quartz wrote:

> It is expected that the 'sync' at every statement that you described would be less performant than a transaction, but it is still more performant than separate statements/connections, especially with prepared statements.

I don't know without implementing and benchmarking it (which,
unfortunately, I have no time to do).

> IMHO, you should avoid breaking the spec even if it means some performance loss, which "might"* be recovered another time. The main issue here is to impose a transaction that MAY fail when the calling code isn't designed to handle retries because it didn't need to in the first place.

We certainly do avoid breaking the spec wherever possible, but please
remember that much of the driver's implementation dates back to the days
of JDBC2, and back then the JDBC spec was a terrible spec in terms of
precisely describing the required behaviour. (It's somewhat better now,
but still leaves a lot to be desired.)

Your particular case is a bit of an edge case too - I don't remember
hearing of anyone else using batch updates with autocommit=on.
Typically you are using batch updates because you have a lot of
data to stream in, and if you have a lot of data to stream in you want
it in one big transaction to avoid repeated transaction costs - so
having autocommit=on somewhat defeats the purpose.

> *To preserve performance, I guess the sync is too aggressive, so there should be an agreement to add a lighter new protocol directive that denotes the intent of performing each statement alone rather than in a transaction, although the server can buffer these statements. Some kind of 'enqueue' directive, sort of.

I think it's quite unlikely that the FE/BE protocol is going to change
merely because of a quirk in the JDBC spec that can be handled by the
driver. That protocol has been stable for a number of years now - dating
back to 7.4, IIRC.

You might want to read up on the query protocol at
http://www.postgresql.org/docs/9.0/static/protocol-flow.html as a
starting point. (The JDBC driver uses the "extended query" flow)

The simplest fix I can see is to just have a special case at the
Statement level that falls back to executing each batch entry
individually when autocommit=on. That would match the spec better and be
no worse than having the caller execute statements individually.
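
To make that fallback concrete, here is a minimal sketch of the idea, not the driver's actual code. The class and method names (AutoCommitBatchFallback, executeIndividually) are hypothetical, and the choice to continue after a failed entry and record Statement.EXECUTE_FAILED is an assumption; a real driver would also need to raise a BatchUpdateException carrying these counts.

```java
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

// Hypothetical helper, not part of the PostgreSQL JDBC driver.
final class AutoCommitBatchFallback {
    private AutoCommitBatchFallback() {}

    /**
     * Runs each batch entry as its own statement. With autocommit=on,
     * each executeUpdate call commits (or fails) independently, so one
     * bad entry does not roll back the entries around it.
     *
     * Assumption: processing continues after a failure, and the failed
     * slot is marked with Statement.EXECUTE_FAILED.
     */
    static int[] executeIndividually(Statement stmt, List<String> batch) {
        int[] counts = new int[batch.size()];
        for (int i = 0; i < batch.size(); i++) {
            try {
                counts[i] = stmt.executeUpdate(batch.get(i));
            } catch (SQLException e) {
                counts[i] = Statement.EXECUTE_FAILED;
            }
        }
        return counts;
    }
}
```

The cost is one server roundtrip per entry, which is the same as the caller looping over executeUpdate itself.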

Wiring different logic into the query executor itself to handle this
case would be more complicated, and would only be a win if the server
roundtrip latency was a bigger factor than the cost of transaction
setup/teardown (which pretty much means "server on the other end of a WAN").
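
The trade-off can be illustrated with a rough back-of-the-envelope model. This is a simplification under stated assumptions, not a benchmark: it assumes a fixed per-statement transaction cost, a fixed roundtrip time, and that a pipelined batch pays only one roundtrip in total. The class name BatchCostModel and all parameters are hypothetical.

```java
// Hypothetical cost model; all inputs are illustrative, not measured.
final class BatchCostModel {
    private BatchCostModel() {}

    /** n statements sent one at a time: each pays a roundtrip plus a transaction. */
    static double individualMs(int n, double roundTripMs, double txCostMs) {
        return n * (roundTripMs + txCostMs);
    }

    /** n statements pipelined: one roundtrip overall, still one transaction each. */
    static double pipelinedMs(int n, double roundTripMs, double txCostMs) {
        return roundTripMs + n * txCostMs;
    }

    /** Fraction of the individual-execution time that pipelining would save. */
    static double relativeSaving(int n, double roundTripMs, double txCostMs) {
        double individual = individualMs(n, roundTripMs, txCostMs);
        return (individual - pipelinedMs(n, roundTripMs, txCostMs)) / individual;
    }
}
```

With made-up numbers of 100 statements and a 1 ms transaction cost, relativeSaving gives about 9% for a 0.1 ms (LAN-like) roundtrip and about 97% for a 50 ms (WAN-like) roundtrip, which is the shape of the argument above.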

Oliver
