Re: JDBC, prepared queries, and partitioning

Lists: pgsql-jdbc
From: Josh Berkus <josh(at)agliodbs(dot)com>
To: PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: JDBC, prepared queries, and partitioning
Date: 2008-02-13 18:05:30
Message-ID: 200802131005.30553.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

All,

Since JDBC always prepares and plan-caches queries, it makes it hard to use
partitioning with a database connected to a JDBC connection pool; successive
queries in the same session will select the wrong partition, or scan the
whole partition set.

Is there any workaround for this, either on 8.2 or 8.3?

--
Josh Berkus
PostgreSQL @ Sun
San Francisco


From: Kris Jurka <books(at)ejurka(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 18:51:24
Message-ID: Pine.BSO.4.64.0802131348050.27322@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Wed, 13 Feb 2008, Josh Berkus wrote:

> Since JDBC always prepares and plan-caches queries, it makes it hard to use
> partitioning with a database connected to a JDBC connection pool; successive
> queries in the same session will select the wrong partition, or scan the
> whole partition set.
>

It will never scan the wrong partition. It will always scan all
partitions. The server is smart enough to not generate a plan that is not
valid for only one parameter set.

Using protocol version 2 will not prepare queries so constraint exclusion
can be used. Just append protocolVersion=2 to your URL.

Kris Jurka


From: Kris Jurka <books(at)ejurka(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 19:18:54
Message-ID: Pine.BSO.4.64.0802131404090.6785@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Wed, 13 Feb 2008, Kris Jurka wrote:

> Using protocol version 2 will not prepare queries so constraint exclusion can
> be used. Just append protocolVersion=2 to your URL.
>

Also, to add what a real fix would look like:

Right now the JDBC driver always uses server side prepared statements for
Java PreparedStatements. There are two reasons to use a
PreparedStatement: to send parameters out of line and to avoid replanning.
Neither the JDBC API nor the fe-be protocol really allow the user to
specify why they are using a prepared statement, so the driver has to
guess and provide a weak hint to the server.

We have a prepare threshold (settable per connection or per statement)
that indicates the number of times the statement has been executed to
consider it as a prepared statement that's used to avoid replanning.
Once we hit that the threshold we tell the server to use a named statement
and it comes up with a generic plan that should theoretically be
acceptable for a wide variety of parameter values. Prior to this
switchover we use an unnamed statement that indicates to the server that
it should use the first set of parameters it gets to construct the best
plan for those parameters.

So on an unnamed statement the server considers the parameter values in
the plan cost estimates, but it can't do constraint exclusion because
there is no guarantee that the plan generated won't be used again with
other parameters (even though the JDBC driver won't). So the disconnect
currently is that the JDBC driver has no way of telling the server, "I
promise I'll never use this plan for anything else" to allow it to do
constraint exclusion. So this requires an extension to the protocol
to fix properly.

Kris Jurka


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 20:34:24
Message-ID: 1202934864.16770.486.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Wed, 2008-02-13 at 13:51 -0500, Kris Jurka wrote:
>
> On Wed, 13 Feb 2008, Josh Berkus wrote:
>
> > Since JDBC always prepares and plan-caches queries, it makes it hard to use
> > partitioning with a database connected to a JDBC connection pool; successive
> > queries in the same session will select the wrong partition, or scan the
> > whole partition set.
> >
>
> It will never scan the wrong partition. It will always scan all
> partitions. The server is smart enough to not generate a plan that is not
> valid for only one parameter set.
>
> Using protocol version 2 will not prepare queries so constraint exclusion
> can be used. Just append protocolVersion=2 to your URL.

What about the prepare threshold?

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Kris Jurka <books(at)ejurka(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 20:43:46
Message-ID: Pine.BSO.4.64.0802131537500.26092@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Wed, 13 Feb 2008, Simon Riggs wrote:

> On Wed, 2008-02-13 at 13:51 -0500, Kris Jurka wrote:
>>
>> Using protocol version 2 will not prepare queries so constraint exclusion
>> can be used. Just append protocolVersion=2 to your URL.
>
> What about the prepare threshold?
>

I'm not sure what your question is, you might be asking:

1) Do I need to be aware of prepare threshold with V2 connections?

No, prepareThreshold only affects V3 protocol connections. Prior to the
8.0 release there was rudimentary support for converting
PreparedStatements to SQL level PREPARE/EXECUTE that worked with V2, but
that was ripped out in the major 8.0 rewrite and V2 always interpolates
parameters into the query and executes it directly.

2) Can't I use prepare threshold with V3 instead of switching to V2 and
achieve the same effect?

No, in V3 all prepared statements are prepared regardless of the
threshold. The difference is in the level of "preparedness" by using
either a named or unnamed statement at the protocol level. See my
previous email for more details on that topic.

Kris Jurka


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Kris Jurka <books(at)ejurka(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 20:47:34
Message-ID: 23221.1202935654@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On Wed, 2008-02-13 at 13:51 -0500, Kris Jurka wrote:
>> Using protocol version 2 will not prepare queries so constraint exclusion
>> can be used. Just append protocolVersion=2 to your URL.

> What about the prepare threshold?

Won't help; the problem is that the planner will never consider a Param
value as a true constant. So if you're trying to use out-of-line params
to avoid quoting/escaping/SQL-injection-risk issues, you're shut out of
constraint exclusion, as well as some other cases such as LIKE
optimization.

We could imagine having a different operating mode where a generated
plan is only used once and so the Param values can be taken as true
constants.

I suppose this could be implemented with a GUC variable rather than an
explicit protocol change, but I'm not sure if that'd be a good idea or
not. It's not entirely clear to me which layers on the client side need
to be aware of such behavior.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 20:56:21
Message-ID: 1202936181.16770.493.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Wed, 2008-02-13 at 15:43 -0500, Kris Jurka wrote:

> No, in V3 all prepared statements are prepared regardless of the
> threshold.

Neat. Strange to not carry forward such a useful feature.

I'll implement an equivalent server-side, cos I really like(d) it.

Am I reading the wrong documentation?
http://jdbc.postgresql.org/documentation/83/connect.html#connection-parameters

Or have the docs really not been updated in 4+ years?

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Kris Jurka <books(at)ejurka(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 21:06:08
Message-ID: Pine.BSO.4.64.0802131602540.9961@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Wed, 13 Feb 2008, Simon Riggs wrote:

> On Wed, 2008-02-13 at 15:43 -0500, Kris Jurka wrote:
>
>> No, in V3 all prepared statements are prepared regardless of the
>> threshold.
>
> Am I reading the wrong documentation?
> http://jdbc.postgresql.org/documentation/83/connect.html#connection-parameters
>

The documentation is just simplifying the complexities of named vs unnamed
statements. Aside from a couple of exceptions, an unnamed prepared
statement should perform equivalently to direct execution of the literal
query. Describing it as semi-prepared is complicated and not particularly
useful if there is no way to get non-prepared behavior.

Kris Jurka


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kris Jurka <books(at)ejurka(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 21:13:42
Message-ID: 1202937222.16770.505.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Wed, 2008-02-13 at 15:47 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > On Wed, 2008-02-13 at 13:51 -0500, Kris Jurka wrote:
> >> Using protocol version 2 will not prepare queries so constraint exclusion
> >> can be used. Just append protocolVersion=2 to your URL.
>
> > What about the prepare threshold?

> I suppose this could be implemented with a GUC variable rather than an
> explicit protocol change, but I'm not sure if that'd be a good idea or
> not. It's not entirely clear to me which layers on the client side need
> to be aware of such behavior.

What I liked about prepare threshold was that it gave you the ability to
control the re-optimization.

ISTM we need a way to say "this query's parameters vary wildly at
execution time, so I really need to re-plan it every time". We don't
like hints, but that isn't something the database can easily assess.

I don't really like the idea of an on/off "replan every time switch".
DB2 has that and its a little clunky. I do like the idea of being able
to specify re-plan every time (-1) and prepare always (0) or switch from
re-plan every time to prepare after (N) executions.

Planning with the first set of parameters is OK, but only as long as the
first parameter it receives is a "good" one. There's no way to guarantee
that without jerking around more than I think we should.

We talked about this in 2006 but I think the "Option Node" approach
doesn't really cut it from an application developer perspective.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 21:26:45
Message-ID: 1202938005.16770.514.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Wed, 2008-02-13 at 16:06 -0500, Kris Jurka wrote:
>
> On Wed, 13 Feb 2008, Simon Riggs wrote:
>
> > On Wed, 2008-02-13 at 15:43 -0500, Kris Jurka wrote:
> >
> >> No, in V3 all prepared statements are prepared regardless of the
> >> threshold.
> >
> > Am I reading the wrong documentation?
> > http://jdbc.postgresql.org/documentation/83/connect.html#connection-parameters
> >
>
> The documentation is just simplifying the complexities of named vs unnamed
> statements. Aside from a couple of exceptions, an unnamed prepared
> statement should perform equivalently to direct execution of the literal
> query. Describing it as semi-prepared is complicated and not particularly
> useful if there is no way to get non-prepared behavior.

OK, but AFAICS the docs relating to prepareThreshold are wrong. They
describe behaviour that doesn't happen all of the time, yet there is
nothing to indicate that.

Happy to provide a patch, if you'd like.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Kris Jurka <books(at)ejurka(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 21:41:26
Message-ID: Pine.BSO.4.64.0802131630420.22592@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Wed, 13 Feb 2008, Simon Riggs wrote:

> OK, but AFAICS the docs relating to prepareThreshold are wrong. They
> describe behaviour that doesn't happen all of the time, yet there is
> nothing to indicate that.

That's spelled out in detail if you follow the link:

http://jdbc.postgresql.org/documentation/83/server-prepare.html

> Happy to provide a patch, if you'd like.
>

If you think it's important, go ahead.

Kris Jurka


From: "Stephen Denne" <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Kris Jurka" <books(at)ejurka(dot)com>, "Josh Berkus" <josh(at)agliodbs(dot)com>, "PostgreSQL - JDBC" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 21:45:52
Message-ID: F0238EBA67824444BC1CB4700960CB4804A4F683@dmpeints002.isotach.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Simon Riggs wrote
> ISTM we need a way to say "this query's parameters vary wildly at
> execution time, so I really need to re-plan it every time".

If the wild variations don't affect the cost estimate, they won't affect the plan?

Isn't it more a case of
"even a minor variation in just one of this query's parameters can result in wildly different cost estimate,
so I really need to re-plan it every time."

This is something that the database may be able to determine from currently collected statistics.

However I suspect that any means of trying to automatically determine that would *always* result in
"Yep, minor changes can wildly affect the cost estimate"

(I'm ignorant of details and history but...)

Could this be added (conditionally):
With the stored plan, store the cost estimate,
Re-estimate the cost of the chosen plan with the newly supplied parameters,
If the new cost is significantly higher then re-plan.

Checking the cost of a plan every time should be a lot cheaper than always re-planning.

I can already see problems with this...
the query will get slower and slower over time as it settles onto the plan that best handles the worst cases.

Stephen Denne.

Disclaimer:
At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachments is confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately, destroy it and do not copy, disclose or use it in any way.

__________________________________________________________________
This email has been scanned by the DMZGlobal Business Quality
Electronic Messaging Suite.
Please see http://www.dmzglobal.com/services/bqem.htm for details.
__________________________________________________________________


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Stephen Denne" <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Kris Jurka" <books(at)ejurka(dot)com>, "Josh Berkus" <josh(at)agliodbs(dot)com>, "PostgreSQL - JDBC" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 21:50:38
Message-ID: 24210.1202939438@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

"Stephen Denne" <Stephen(dot)Denne(at)datamail(dot)co(dot)nz> writes:
> Isn't it more a case of
> "even a minor variation in just one of this query's parameters can result in wildly different cost estimate,
> so I really need to re-plan it every time."

The $64 problem is not that; it is that certain optimizations will not
be applied, period, unless the planner is dealing with values known
at plan time.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Stephen Denne <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>
Cc: Kris Jurka <books(at)ejurka(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 21:55:56
Message-ID: 1202939756.16770.517.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Thu, 2008-02-14 at 10:45 +1300, Stephen Denne wrote:
> Simon Riggs wrote
> > ISTM we need a way to say "this query's parameters vary wildly at
> > execution time, so I really need to re-plan it every time".
>
> If the wild variations don't affect the cost estimate, they won't affect the plan?

Cost estimates are made at planning time. If you don't replan, no, they
won't.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: "Stephen Denne" <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Kris Jurka" <books(at)ejurka(dot)com>, "Josh Berkus" <josh(at)agliodbs(dot)com>, "PostgreSQL - JDBC" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 22:00:26
Message-ID: F0238EBA67824444BC1CB4700960CB4804A4F68C@dmpeints002.isotach.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Tom Lane wrote
> "Stephen Denne" <Stephen(dot)Denne(at)datamail(dot)co(dot)nz> writes:
> > Isn't it more a case of
> > "even a minor variation in just one of this query's
> parameters can result in wildly different cost estimate,
> > so I really need to re-plan it every time."
>
> The $64 problem is not that; it is that certain optimizations will not
> be applied, period, unless the planner is dealing with values known
> at plan time.

Are there two problems:
1) My query could always be faster had the planner known the parameter values. (re Josh's inital question)
2) The chosen plan works fairly well in the general case, but for some parameter values it takes forever.

Stephen Denne.

Disclaimer:
At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachments is confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately, destroy it and do not copy, disclose or use it in any way.

__________________________________________________________________
This email has been scanned by the DMZGlobal Business Quality
Electronic Messaging Suite.
Please see http://www.dmzglobal.com/services/bqem.htm for details.
__________________________________________________________________


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 22:04:43
Message-ID: 1202940283.16770.525.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Wed, 2008-02-13 at 16:41 -0500, Kris Jurka wrote:
>
> On Wed, 13 Feb 2008, Simon Riggs wrote:
>
> > OK, but AFAICS the docs relating to prepareThreshold are wrong. They
> > describe behaviour that doesn't happen all of the time, yet there is
> > nothing to indicate that.
>
> That's spelled out in detail if you follow the link:
>
> http://jdbc.postgresql.org/documentation/83/server-prepare.html

I'd read it before and again before I posted this evening. Which part of
that says prepareTheshold only applies to v2 protocol?

The code example doesn't explicitly state protocolVersion as you say is
required either.

> > Happy to provide a patch, if you'd like.

> If you think it's important, go ahead.

You don't think its important?

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: "Stephen Denne" <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Kris Jurka" <books(at)ejurka(dot)com>, "Josh Berkus" <josh(at)agliodbs(dot)com>, "PostgreSQL - JDBC" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 22:10:56
Message-ID: F0238EBA67824444BC1CB4700960CB4804A4F69D@dmpeints002.isotach.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Simon Riggs wrote
> On Thu, 2008-02-14 at 10:45 +1300, Stephen Denne wrote:
> > Simon Riggs wrote
> > > ISTM we need a way to say "this query's parameters vary wildly at
> > > execution time, so I really need to re-plan it every time".
> >
> > If the wild variations don't affect the cost estimate, they
> won't affect the plan?
>
> Cost estimates are made at planning time. If you don't
> replan, no, they
> won't.

What I meant was that for some queries, replanning with wildly different parameter values will always select the same plan.
That'd happen if the distribution from which the parameter value is chosen was flat, and the operation involving the parameter doesn't affect selectivity (eg "equals" rather than "greater than").
Though I guess in that case that it isn't possible to have a "wildly" different value.

Stephen Denne.

Disclaimer:
At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachments is confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately, destroy it and do not copy, disclose or use it in any way.

__________________________________________________________________
This email has been scanned by the DMZGlobal Business Quality
Electronic Messaging Suite.
Please see http://www.dmzglobal.com/services/bqem.htm for details.
__________________________________________________________________


From: Kris Jurka <books(at)ejurka(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 22:15:10
Message-ID: Pine.BSO.4.64.0802131706100.16718@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Wed, 13 Feb 2008, Simon Riggs wrote:

> I'd read it before and again before I posted this evening. Which part of
> that says prepareTheshold only applies to v2 protocol?

Perhaps not the clearest wording (too many version numbers), but the below
is supposed to say that:

"Enabling server-prepared statements will have no affect when connected to
a 7.3 server or when explicitly using the V2 protocol to connect to a 7.4
server."

> The code example doesn't explicitly state protocolVersion as you say is
> required either.
>

Well the default is V3, so it's not necessary to explicity state it.

> You don't think its important?
>

It's not important to me. If I was going to doing some doc editing, I'd
probably hit this area, but for the moment I think it's close enough to
the truth and I've got other things to do.

Kris Jurka


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Stephen Denne <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>, Kris Jurka <books(at)ejurka(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 22:23:23
Message-ID: 200802131423.24613.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc


> > If the wild variations don't affect the cost estimate, they won't
> > affect the plan?
>
> Cost estimates are made at planning time. If you don't replan, no, they
> won't.

So, just to confirm: there is no way with v3 to make it not use a cached
plan, either on 8.2 or 8.3, correct?

Can I use v2 against 8.3?

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco


From: Kris Jurka <books(at)ejurka(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Stephen Denne <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 22:29:08
Message-ID: Pine.BSO.4.64.0802131724400.2890@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Wed, 13 Feb 2008, Josh Berkus wrote:

> So, just to confirm: there is no way with v3 to make it not use a cached
> plan, either on 8.2 or 8.3, correct?

Correct.

> Can I use v2 against 8.3?
>

Yes, but you then lose some of the features that are only available with
the V3 protocol: ParameterMetaData, some ResultSetMetaData calls, OUT
parameters for CallableStatements. I think that's the list, but couldn't
swear to it.

Kris Jurka


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 22:32:45
Message-ID: 1202941965.16770.540.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Wed, 2008-02-13 at 17:15 -0500, Kris Jurka wrote:
>
> On Wed, 13 Feb 2008, Simon Riggs wrote:
>
> > I'd read it before and again before I posted this evening. Which part of
> > that says prepareTheshold only applies to v2 protocol?
>
> Perhaps not the clearest wording (too many version numbers), but the below
> is supposed to say that:
>
> "Enabling server-prepared statements will have no affect when connected to
> a 7.3 server or when explicitly using the V2 protocol to connect to a 7.4
> server."
>
> > The code example doesn't explicitly state protocolVersion as you say is
> > required either.
> >
>
> Well the default is V3, so it's not necessary to explicity state it.

But the code example shows explicit use of

pgconn.setPrepareThreshold(5);

which clearly does nothing, given the default setting of protocol V3.

So the program shown only works as stated with 7.3, the last time the
default connection protocol was v2.

> > You don't think its important?
> >
>
> It's not important to me. If I was going to doing some doc editing, I'd
> probably hit this area, but for the moment I think it's close enough to
> the truth and I've got other things to do.

Should we be asking for volunteers to overhaul the docs? I'm concerned
that some important facts aren't just wrong, they've been wrong for 4+
years. That's a pretty serious situation for customer performance as
well as advocacy.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: "Stephen Denne" <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>
To: <josh(at)agliodbs(dot)com>
Cc: "Kris Jurka" <books(at)ejurka(dot)com>, "PostgreSQL - JDBC" <pgsql-jdbc(at)postgresql(dot)org>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 22:33:25
Message-ID: F0238EBA67824444BC1CB4700960CB4804A4F6C4@dmpeints002.isotach.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Josh Berkus wrote
> So, just to confirm: there is no way with v3 to make it not
> use a cached
> plan, either on 8.2 or 8.3, correct?

8.3 server release notes mention a couple of options:
- Automatically re-plan cached queries when table definitions change or statistics are updated (Tom)
- Implement new commands DISCARD ALL, DISCARD PLANS, DISCARD TEMPORARY, CLOSE ALL, and DEALLOCATE ALL (Marko Kreen, Neil)

I've no idea about the extent of that first one, but perhaps you could include a tiny table in the query, and update statistics on the table.

Stephen Denne.

Disclaimer:
At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachments is confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately, destroy it and do not copy, disclose or use it in any way.

__________________________________________________________________
This email has been scanned by the DMZGlobal Business Quality
Electronic Messaging Suite.
Please see http://www.dmzglobal.com/services/bqem.htm for details.
__________________________________________________________________


From: "Stephen Denne" <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>
To: <josh(at)agliodbs(dot)com>
Cc: "Kris Jurka" <books(at)ejurka(dot)com>, "PostgreSQL - JDBC" <pgsql-jdbc(at)postgresql(dot)org>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 22:38:27
Message-ID: F0238EBA67824444BC1CB4700960CB4804A4F6CE@dmpeints002.isotach.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

I wrote:
> Josh Berkus wrote
> > So, just to confirm: there is no way with v3 to make it not
> > use a cached
> > plan, either on 8.2 or 8.3, correct?
>
> 8.3 server release notes mention a couple of options:
> - Automatically re-plan cached queries when table definitions
> change or statistics are updated (Tom)
> - Implement new commands DISCARD ALL, DISCARD PLANS, DISCARD
> TEMPORARY, CLOSE ALL, and DEALLOCATE ALL (Marko Kreen, Neil)
>
> I've no idea about the extent of that first one, but perhaps
> you could include a tiny table in the query, and update
> statistics on the table.

Sorry, these seem to allow re-planning, instead of not caching a plan in the first place (which is the cause of not being able to restrict the plan to a subset of your tables).

Stephen.

Disclaimer:
At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachments is confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately, destroy it and do not copy, disclose or use it in any way.

__________________________________________________________________
This email has been scanned by the DMZGlobal Business Quality
Electronic Messaging Suite.
Please see http://www.dmzglobal.com/services/bqem.htm for details.
__________________________________________________________________


From: Kris Jurka <books(at)ejurka(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 22:47:14
Message-ID: Pine.BSO.4.64.0802131737100.28484@leary.csoft.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Wed, 13 Feb 2008, Simon Riggs wrote:

> But the code example shows explicit use of
>
> pgconn.setPrepareThreshold(5);
>
> which clearly does nothing, given the default setting of protocol V3.
>
> So the program shown only works as stated with 7.3, the last time the
> default connection protocol was v2.

No, you've got it all backwards. prepareThreshold only does something
with V3. With V2, you *never* get prepared execution and this setting
does nothing. With V3, prepareThreshold switches between semi-prepared
and fully-prepared execution. For most cases semi-prepared is equivalent
to not-prepared (just not for partitioning which didn't exist when this
parameter + documentation were added).

> Should we be asking for volunteers to overhaul the docs? I'm concerned
> that some important facts aren't just wrong, they've been wrong for 4+
> years. That's a pretty serious situation for customer performance as
> well as advocacy.
>

Help is always appreciated, but I fear we don't have a big pool of people
with the right knowledge to do it.

Kris Jurka


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: josh(at)agliodbs(dot)com
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Stephen Denne <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>, Kris Jurka <books(at)ejurka(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 22:55:05
Message-ID: 47B37549.3@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Josh Berkus wrote:
>>> If the wild variations don't affect the cost estimate, they won't
>>> affect the plan?
>> Cost estimates are made at planning time. If you don't replan, no, they
>> won't.
>
> So, just to confirm: there is no way with v3 to make it not use a cached
> plan, either on 8.2 or 8.3, correct?

I think there's a lot of confusion going on here.. "Cached plan" is
pretty vague :( I'll try to summarize here:

The JDBC driver, when using the v3 protocol, will use either a named or
unnamed statement with parameter placeholders, and send the parameter
values out of line from the query.

If the unnamed statement is used:

* Planning occurs at bind time, not parse time, and uses actual bound
parameter values for cost estimation.

* However the planner still produces a plan that is correct for all
possible parameter values, as the protocol allows for the unnamed
statement to be reused with different parameter values. So some
optimizations are not possible.

* The JDBC driver never actually reuses the unnamed statement with
different values in this way. The unnamed statement is discarded after
one use.

If a named statement is used:

* Planning occurs at parse time. No actual parameter values are
available for cost estimation. A general plan is generated.

* The JDBC driver may reuse the statement (and associated plan) for
future query execution with different parameter values.

The JDBC driver selects use of a named or unnamed statement based on the
prepareThreshold connection parameter. For a particular
PreparedStatement, a use count is maintained. If the use count is below
the prepare threshold, an unnamed statement is used. If the use count
exceeds the threshold, a named statement is used (and reused for
subsequent execution of that same PreparedStatement object). This is
done because many applications don't actually reuse their
PreparedStatements; we want to make sure they're actually being reused
before we pay the extra cost of using a named statement.

Does that clear things up a bit ?

To try to answer your specific question: You can disable use of a
"cached plan" by forcing the driver to always use unnamed statements. In
that case, every new query execution re-parses and re-plans. To do this,
either don't reuse your PreparedStatement objects, or set
prepareThreshold=0 (a special value meaning "never use named statements")

However even when plans are not cached, you will still run into the
issue that an unnamed statement using out-of-line parameter values may
generate a less efficient plan than an unnamed statement using inline
parameter values, because there is no way to tell the planner at the
protocol level "I am really only ever using this query once, please give
me a specific plan for these values and don't worry about generating a
plan that is correct for other values too".

The problem is that the current extended query protocol does two
different things - out-of-line parameter passing and statement/plan
caching - but provides no way to independently manage them. The unnamed
statement behaviour is a compromise I suggested for 8.0 that gets rid of
the worst problems for JDBC without an incompatible protocol change, but
it isn't perfect. To get it 100% right it seems that a protocol change
is needed.

...

Using the version 2 protocol means that parameter values are sent inline
with the query, which avoids the planning issue (but raises a number of
other problems since the v2 protocol is much less flexible than v3)

> Can I use v2 against 8.3?

I think so. You obviously lose any behaviour that depends on v3, e.g.
parameter metadata and smarter bytea streaming.

-O


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Kris Jurka <books(at)ejurka(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 22:57:01
Message-ID: 47B375BD.9040202@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Simon Riggs wrote:

> But the code example shows explicit use of
>
> pgconn.setPrepareThreshold(5);
>
> which clearly does nothing, given the default setting of protocol V3.

You are misreading something somewhere, you have that backwards.

Prepare threshold works only on V3. It does not work with V2.

-O


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 23:13:26
Message-ID: 1202944406.16770.562.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Wed, 2008-02-13 at 17:47 -0500, Kris Jurka wrote:
>
> On Wed, 13 Feb 2008, Simon Riggs wrote:
>
> > But the code example shows explicit use of
> >
> > pgconn.setPrepareThreshold(5);
> >
> > which clearly does nothing, given the default setting of protocol V3.
> >
> > So the program shown only works as stated with 7.3, the last time the
> > default connection protocol was v2.
>
> No, you've got it all backwards. prepareThreshold only does something
> with V3.

OK. I was confused by your statement that "in V3 all prepared statements
are prepared regardless of the threshold". I took that to mean that the
prepareThreshold did nothing at all in V3, which threw everything I
understood on its head.

Many thanks for clarifications from yourself and Oliver.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: josh(at)agliodbs(dot)com, Stephen Denne <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>, Kris Jurka <books(at)ejurka(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 23:19:12
Message-ID: 1202944752.16770.568.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Thu, 2008-02-14 at 11:55 +1300, Oliver Jowett wrote:

> Using the version 2 protocol means that parameter values are sent inline
> with the query, which avoids the planning issue (but raises a number of
> other problems since the v2 protocol is much less flexible than v3)
>
> > Can I use v2 against 8.3?
>
> I think so. You obviously lose any behaviour that depends on v3, e.g.
> parameter metadata and smarter bytea streaming.

What we need is a "send parameters inline" mode that can be set as an
option for a PreparedStatement. Nobody wants to use V2, we just want the
ability to re-plan a query every time. Perhaps that is best implemented
as a server side option that can be exposed via JDBC option, since this
is an SQL requirement and nothing to do with Java.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: josh(at)agliodbs(dot)com, Stephen Denne <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>, Kris Jurka <books(at)ejurka(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 23:23:06
Message-ID: 47B37BDA.8020702@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Simon Riggs wrote:

> What we need is a "send parameters inline" mode that can be set as an
> option for a PreparedStatement. Nobody wants to use V2, we just want the
> ability to re-plan a query every time. Perhaps that is best implemented
> as a server side option that can be exposed via JDBC option, since this
> is an SQL requirement and nothing to do with Java.

Unfortunately you can't have your cake and eat it too. Many of the
features specific to V3 *require* that parameters be sent out of line
(e.g. parameter metadata, bytea streaming, possibly OUT parameters)

As Kris said earlier in this thread what we really need is a way to tell
the server "I will only ever use this statement once, with these
particular parameter values, go ahead and optimize the plan on that basis"

-O


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: josh(at)agliodbs(dot)com, Stephen Denne <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>, Kris Jurka <books(at)ejurka(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-13 23:37:53
Message-ID: 47B37F51.2010003@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Oliver Jowett wrote:
> Simon Riggs wrote:
>
>> What we need is a "send parameters inline" mode that can be set as an
>> option for a PreparedStatement. Nobody wants to use V2, we just want the
>> ability to re-plan a query every time. Perhaps that is best implemented
>> as a server side option that can be exposed via JDBC option, since this
>> is an SQL requirement and nothing to do with Java.
>
> Unfortunately you can't have your cake and eat it too. Many of the
> features specific to V3 *require* that parameters be sent out of line
> (e.g. parameter metadata, bytea streaming, possibly OUT parameters)
>
> As Kris said earlier in this thread what we really need is a way to tell
> the server "I will only ever use this statement once, with these
> particular parameter values, go ahead and optimize the plan on that basis"

Another possibility is to make the unnamed statement behaviour in the
server more aggressive.

The current behaviour is that planning only happens on the first Bind of
a particular unnamed statement, and parameter placeholders are left as
placeholders in the plan tree, with logic to pull in actual parameter
values when doing cost estimates involving those nodes.

Instead, we could change that so that whenever a Bind for the unnamed
statement is received, the query is replanned. Before the normal
optimization is run, the actual parameter values are substituted into
the parse/plan tree as constants (and so are subject to all the normal
optimizations such as constant folding and the constraint/partition
stuff). After execution we should discard the plan, since it's tied to
specific values. If that same unnamed statement gets reexecuted with
different parameter values later, no harm done - we replan it again with
the new values.

I vaguely remember suggesting something like this back in my changes to
8.0, but it was going to be too much of a change to the protocol
behaviour. Given the hindsight of a few major versions, are there
actually clients that reuse the unnamed statement in use, and would they
be badly affected by this change?

-O


From: "Guillaume Smet" <guillaume(dot)smet(at)gmail(dot)com>
To: "Kris Jurka" <books(at)ejurka(dot)com>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Stephen Denne" <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>, "PostgreSQL - JDBC" <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-14 00:35:03
Message-ID: 1d4e0c10802131635h65bb34cbt9b7c558299c77be3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Feb 13, 2008 11:29 PM, Kris Jurka <books(at)ejurka(dot)com> wrote:
> Yes, but you then lose some of the features that are only available with
> the V3 protocol: ParameterMetaData, some ResultSetMetaData calls, OUT
> parameters for CallableStatements. I think that's the list, but couldn't
> swear to it.

I usually advise people to use protocol v2 for typical web usage and
we experienced another problem. A few exceptions are ambiguous with v2
and perfectly identifiable with v3 (foreign key violation for
example). There are apps relying on that and we were forced to use v3
for them (and we have lower performances).

--
Guillaume


From: "Stephen Denne" <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>
To: "Oliver Jowett" <oliver(at)opencloud(dot)com>
Cc: <josh(at)agliodbs(dot)com>, "Kris Jurka" <books(at)ejurka(dot)com>, "PostgreSQL - JDBC" <pgsql-jdbc(at)postgresql(dot)org>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-14 00:48:41
Message-ID: F0238EBA67824444BC1CB4700960CB4804A4F75D@dmpeints002.isotach.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Oliver Jowett wrote
> Instead, we could change that so that whenever a Bind for the unnamed
> statement is received, the query is replanned.

I like the idea.
A while back I investigated whether partitioning would be helpful in my circumstance, and the inability to get the query plans I was expecting was one of the reasons I didn't make use of it.
I had other reasons why I didn't chose to use partitioning, which would probably mean that even with this change I still wouldn't use partitioning.
However I would be able to make good use of such a change if, for example, I'd be able to use indexes for parameterised LIKE expressions.

> are there
> actually clients that reuse the unnamed statement in use, and would they
> be badly affected by this change?

If such clients exist, then yes they would be.
At the least, they'd start getting poorer performance due to repeated planning.
At worst, they may be desiring to reuse the plan generated by the first bind.

But with a bit more work, this could be new behaviour that defaults to the existing process unless a configuration variable is set.

Is this the kind of change that could go into an 8.3.1 release or would it have to wait till 8.4?

Stephen Denne.

Disclaimer:
At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachments is confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately, destroy it and do not copy, disclose or use it in any way.

__________________________________________________________________
This email has been scanned by the DMZGlobal Business Quality
Electronic Messaging Suite.
Please see http://www.dmzglobal.com/services/bqem.htm for details.
__________________________________________________________________


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, josh(at)agliodbs(dot)com, Stephen Denne <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>, Kris Jurka <books(at)ejurka(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-14 05:59:26
Message-ID: 1442.1202968766@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Oliver Jowett <oliver(at)opencloud(dot)com> writes:
> Another possibility is to make the unnamed statement behaviour in the
> server more aggressive.

Upthread I suggested making that conditional on a GUC variable ...
does that seem reasonable?

We've touched on how a "clean" solution to this would require a protocol
change, but I'm not convinced that a change at the protocol-message
level is all that great a solution. That way means that only the lowest
layer on the client side has any direct access to the options; and that
layer is really the least likely to understand what the best choice for
a given query is. So you immediately are faced with having to design
client-side APIs to expose the behavior upward. For instance JDBC would
need to expose an API to calling applications, and the same for libpq
and other client libraries.

If we drive this off a GUC variable then the extra signaling is already
solved, or at worst soluble with a general-purpose API addition that can
address other issues too.

regards, tom lane


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, josh(at)agliodbs(dot)com, Stephen Denne <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>, Kris Jurka <books(at)ejurka(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-14 06:46:06
Message-ID: 47B3E3AE.9050201@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Tom Lane wrote:
> Oliver Jowett <oliver(at)opencloud(dot)com> writes:
>> Another possibility is to make the unnamed statement behaviour in the
>> server more aggressive.
>
> Upthread I suggested making that conditional on a GUC variable ...
> does that seem reasonable?

That would be fine.

> For instance JDBC would
> need to expose an API to calling applications

I'm not sure why JDBC would need to do this .. in the current driver,
unnamed statements are only used for a single query execution anyway, so
turning on the GUC unconditionally would seem to be the way to go.

-O


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: josh(at)agliodbs(dot)com, Stephen Denne <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>, Kris Jurka <books(at)ejurka(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-14 07:13:23
Message-ID: 1202973203.16770.581.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Thu, 2008-02-14 at 12:23 +1300, Oliver Jowett wrote:
> Simon Riggs wrote:
>
> > What we need is a "send parameters inline" mode that can be set as an
> > option for a PreparedStatement. Nobody wants to use V2, we just want the
> > ability to re-plan a query every time. Perhaps that is best implemented
> > as a server side option that can be exposed via JDBC option, since this
> > is an SQL requirement and nothing to do with Java.
>
> Unfortunately you can't have your cake and eat it too. Many of the
> features specific to V3 *require* that parameters be sent out of line
> (e.g. parameter metadata, bytea streaming, possibly OUT parameters)
>
> As Kris said earlier in this thread what we really need is a way to tell
> the server "I will only ever use this statement once, with these
> particular parameter values, go ahead and optimize the plan on that basis"

But V3 supports having no parameters at all right?

All we have to do is create a new property called "reOptimize", whose
default value is false.

When reOptimize is true we
- use the code from V2 to fold parameters inline
- re-execute the protocol sequence for Parse/Bind/Execute just as if we
have never seen this SQL statement before (with the parameters inline,
we literally haven't)

We have the code already for this, so no complex coding required.

We can put in server side code for 8.4 to do this, as Tom suggests.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: josh(at)agliodbs(dot)com, Stephen Denne <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>, Kris Jurka <books(at)ejurka(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-14 07:18:19
Message-ID: 47B3EB3B.5050605@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Simon Riggs wrote:

> But V3 supports having no parameters at all right?

Yes, but you're missing the point. If you want to, for example, ask for
metadata about the 4th parameter to the query, there has to *be* a 4th
parameter. If you fold everything inline, you lose the various
advantages that having the parameters out of line give you, such as the
ability to query the server about their characteristics or stream their
actual values in a binary form separate to the query, etc.

-O


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: josh(at)agliodbs(dot)com, Stephen Denne <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>, Kris Jurka <books(at)ejurka(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-14 07:43:33
Message-ID: 1202975013.16770.594.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Thu, 2008-02-14 at 20:18 +1300, Oliver Jowett wrote:
> Simon Riggs wrote:
>
> > But V3 supports having no parameters at all right?
>
> Yes, but you're missing the point. If you want to, for example, ask for
> metadata about the 4th parameter to the query, there has to *be* a 4th
> parameter. If you fold everything inline, you lose the various
> advantages that having the parameters out of line give you, such as the
> ability to query the server about their characteristics or stream their
> actual values in a binary form separate to the query, etc.

Right now, people are choosing to use V2 because of the
parameter-folding characteristics. Those people are giving up the things
you mention, plus much much more. I'm suggesting that we make the
parameter folding an option, to allow people to choose. If they need
access to parameter metadata (or other possibilities) then they won't
take advantage of the parameter-folding option.

I suggest we put the parameter folding option in now, then in 8.4 keep
the option but make it work via a GUC as Tom suggests.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: josh(at)agliodbs(dot)com, Stephen Denne <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>, Kris Jurka <books(at)ejurka(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-14 11:28:57
Message-ID: 47B425F9.4050101@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Simon Riggs wrote:
> On Thu, 2008-02-14 at 20:18 +1300, Oliver Jowett wrote:
>> Simon Riggs wrote:
>>
>>> But V3 supports having no parameters at all right?
>> Yes, but you're missing the point. If you want to, for example, ask for
>> metadata about the 4th parameter to the query, there has to *be* a 4th
>> parameter. If you fold everything inline, you lose the various
>> advantages that having the parameters out of line give you, such as the
>> ability to query the server about their characteristics or stream their
>> actual values in a binary form separate to the query, etc.
>
> Right now, people are choosing to use V2 because of the
> parameter-folding characteristics. Those people are giving up the things
> you mention, plus much much more. I'm suggesting that we make the
> parameter folding an option, to allow people to choose. If they need
> access to parameter metadata (or other possibilities) then they won't
> take advantage of the parameter-folding option.
>
> I suggest we put the parameter folding option in now, then in 8.4 keep
> the option but make it work via a GUC as Tom suggests.

To do this in the driver basically needs a separate protocol path. The
infrastructure is there for that, but it's not trivial to implement. You
could use the existing v3 code as a basis, but there would be a lot of
changes necessary as all of the current v3 code assumes it is working
with parameterized queries at the protocol level.

Does someone who is affected by this want to pick up this approach and
send through some patches?

I'm a bit nervous about introducing another protocol path since we have
enough trouble trying to maintain the existing ones..

-O


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: josh(at)agliodbs(dot)com, Stephen Denne <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>, Kris Jurka <books(at)ejurka(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-14 12:07:56
Message-ID: 1202990876.16770.635.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Fri, 2008-02-15 at 00:28 +1300, Oliver Jowett wrote:
> Simon Riggs wrote:
> > On Thu, 2008-02-14 at 20:18 +1300, Oliver Jowett wrote:
> >> Simon Riggs wrote:
> >>
> >>> But V3 supports having no parameters at all right?
> >> Yes, but you're missing the point. If you want to, for example, ask for
> >> metadata about the 4th parameter to the query, there has to *be* a 4th
> >> parameter. If you fold everything inline, you lose the various
> >> advantages that having the parameters out of line give you, such as the
> >> ability to query the server about their characteristics or stream their
> >> actual values in a binary form separate to the query, etc.
> >
> > Right now, people are choosing to use V2 because of the
> > parameter-folding characteristics. Those people are giving up the things
> > you mention, plus much much more. I'm suggesting that we make the
> > parameter folding an option, to allow people to choose. If they need
> > access to parameter metadata (or other possibilities) then they won't
> > take advantage of the parameter-folding option.
> >
> > I suggest we put the parameter folding option in now, then in 8.4 keep
> > the option but make it work via a GUC as Tom suggests.
>
> To do this in the driver basically needs a separate protocol path. The
> infrastructure is there for that, but it's not trivial to implement. You
> could use the existing v3 code as a basis, but there would be a lot of
> changes necessary as all of the current v3 code assumes it is working
> with parameterized queries at the protocol level.
>
> Does someone who is affected by this want to pick up this approach and
> send through some patches?
>
> I'm a bit nervous about introducing another protocol path since we have
> enough trouble trying to maintain the existing ones..

Seems heavy way of doing it.

Why not just fold in parameters if option is set and can continue to use
normal V3 route, just with zero parameters? Any JDBC calls that want to
inspect parameters can throw an exception when the option is set. So
JDBC thinks there were parameters, yet Postgres server thinks there were
no parameters.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com


From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: josh(at)agliodbs(dot)com, Stephen Denne <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>, Kris Jurka <books(at)ejurka(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-14 12:22:05
Message-ID: 47B4326D.3080907@opencloud.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

Simon Riggs wrote:

> Why not just fold in parameters if option is set and can continue to use
> normal V3 route, just with zero parameters? Any JDBC calls that want to
> inspect parameters can throw an exception when the option is set. So
> JDBC thinks there were parameters, yet Postgres server thinks there were
> no parameters.

Well, yes, that's essentially how our protocol-level abstraction works -
the main driver deals in terms of abstracted Query and ParameterList
objects, and the protocol layer maps those to something the server
understands. This is how we support the v2 and v3 protocols in the same
driver. I was suggesting a third protocol path ("v3simple" or something)
that did pretty much what you described .. but I fear you are
underestimating the work needed to implement it.

If you want to put together a patch, though, I'm happy to take a look at it.

-O


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: josh(at)agliodbs(dot)com, Stephen Denne <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>, Kris Jurka <books(at)ejurka(dot)com>, PostgreSQL - JDBC <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: JDBC, prepared queries, and partitioning
Date: 2008-02-14 13:04:19
Message-ID: 1202994259.16770.640.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-jdbc

On Fri, 2008-02-15 at 01:22 +1300, Oliver Jowett wrote:
> Simon Riggs wrote:
>
> > Why not just fold in parameters if option is set and can continue to use
> > normal V3 route, just with zero parameters? Any JDBC calls that want to
> > inspect parameters can throw an exception when the option is set. So
> > JDBC thinks there were parameters, yet Postgres server thinks there were
> > no parameters.
>
> Well, yes, that's essentially how our protocol-level abstraction works -
> the main driver deals in terms of abstracted Query and ParameterList
> objects, and the protocol layer maps those to something the server
> understands. This is how we support the v2 and v3 protocols in the same
> driver. I was suggesting a third protocol path ("v3simple" or something)
> that did pretty much what you described .. but I fear you are
> underestimating the work needed to implement it.

OK, I'll spend the time on providing server-side facilities in the next
release. That way we won't need to do the driver side work at all.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com