Re: jdbc spec violation for autocommit=true & addbatch/executeBatch

From: Quartz <quartz12h(at)yahoo(dot)com>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: jdbc spec violation for autocommit=true & addbatch/executeBatch
Date: 2011-01-18 21:55:44
Message-ID: 728504.34967.qm@web33202.mail.mud.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

--- On Tue, 1/18/11, Oliver Jowett <oliver(at)opencloud(dot)com> wrote:

> From: Oliver Jowett <oliver(at)opencloud(dot)com>
> Subject: Re: [JDBC] jdbc spec violation for autocommit=true & addbatch/executeBatch
> To: "Quartz" <quartz12h(at)yahoo(dot)com>
> Cc: pgsql-jdbc(at)postgresql(dot)org
> Received: Tuesday, January 18, 2011, 4:31 PM
> Quartz wrote:
>
> > Postgres with transaction (autocommit=false &
> commit()) does deadlock as expected.
> > Postgres without transaction (autocommit=true)
> deadlocks UNEXPECTEDLY.
> >
> > The jdbc javadoc says clearly:
> > http://download.oracle.com/javase/6/docs/api/java/sql/Connection.html#setAutoCommit%28boolean%29
> >
> > "If a connection is in auto-commit mode, then all its
> SQL statements will be executed and committed as individual
> transactions."
>
> It's worth noting that the javadoc changes without warning
> between releases - there have been many "clarifications"
> over the years that are actually unannounced behavioral
> changes - have you checked historical javadoc to see what it
> says? Maybe we implemented against a spec that said
> something different.
>
> That said - yes, all statements within addBatch() are run
> in a single transaction. (Autocommit is triggered by the
> final Sync message, which is only sent at the end of the
> batch)
>
> A related gotcha is that if you execute a multiple
> statement query (i.e. execute("select 1; select 2; select
> 3") then, again, there is only one enclosing autocommit
> transaction. (This was deliberate, as it mirrored the
> behavior of protocol-version-2 drivers back in the 7.x
> days)
>
> I'm not sure if we can implement this according to the
> current spec without losing any performance benefit of
> addBatch()/executeBatch() - we'd have to send a Sync after
> every individual query. As a workaround I suggest you just
> run the queries individually, not in a batch - there won't
> be much performance difference. (And if you really don't
> care about ordering or atomicity as you imply, you could
> parallelize it across multiple connections if latency is a
> problem)
>
> Oliver
>

This is very sad. The batch never mean transaction. It happens to be the supporting fixture for the transaction, but also to avoid roundtrips with connection pools and such overhead when not using transaction as autocommit=true suggest.

Is is expected that your said 'sync' at every statement would be less performant than a transaction, but it is still more performant than separated statements/connections, especially with prepared statements.

IMHO, you should avoid breaking the spec even if it means some performance loss, which "might"* be recovered another time. The main issue here is to impose a transaction that MAY fail when the calling code isn't designed to handle retries because it didn't need to in the first place.

*For attempting to preserve performance, I guess the sync is too aggressive, so there should be an agreement to make a lighter new protocol directive to denote the intent of performing the statement alone rather than in a transaction, although the server can buffer these statement. Some kind of 'enqueue' directive, sort of.

In response to

Responses

Browse pgsql-jdbc by date

  From Date Subject
Next Message Oliver Jowett 2011-01-18 22:20:29 Re: jdbc spec violation for autocommit=true & addbatch/executeBatch
Previous Message Oliver Jowett 2011-01-18 21:31:57 Re: jdbc spec violation for autocommit=true & addbatch/executeBatch