Patch to fix search_path defencies with pg_bench

Lists: pgsql-hackers
From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Patch to fix search_path defencies with pg_bench
Date: 2009-05-05 22:04:24
Message-ID: 1241561064.4278.13.camel@jd-laptop.pragmaticzealot.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello,

I have been doing some testing with pgbench and I realized that it
forces the use of public as its search_path. This is bad if:

* You want to run multiple pgbench instances within the same database
* You don't want to use public (for whatever reason)

This patch removes that ability and thus will defer to the default
search_path for the connecting user.

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index ad20cac..1f25921 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -357,8 +357,6 @@ doConnect(void)
return NULL;
}

- executeStatement(conn, "SET search_path = public");
-
return conn;
}

--
PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: jd(at)commandprompt(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-05 23:07:51
Message-ID: 14219.1241564871@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Joshua D. Drake" <jd(at)commandprompt(dot)com> writes:
> I have been doing some testing with pgbench and I realized that it
> forces the use of public as its search_path. This is bad if:

> * You want to run multiple pgbench instances within the same database
> * You don't want to use public (for whatever reason)

> This patch removes that ability and thus will defer to the default
> search_path for the connecting user.

Hmm. The search_path setting seems to have been added here
http://archives.postgresql.org/pgsql-committers/2002-10/msg00118.php
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/contrib/pgbench/pgbench.c.diff?r1=1.20;r2=1.21

as part of a mass patch to make everything in contrib work in the public
schema. I agree that it probably wasn't considered carefully whether
pg_bench should do that; but does anyone see a reason not to change it?

regards, tom lane


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-06 06:17:43
Message-ID: alpine.GSO.2.01.0905060141250.13274@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 5 May 2009, Tom Lane wrote:

> I agree that it probably wasn't considered carefully whether pg_bench
> should do that; but does anyone see a reason not to change it?

I thought of one pretty weak use-case for not making this change, but
would wager the additional flexibility here is far more likely to be
appreciated. I'd say it's a clear net improvement.

As for that case...many good database designs put all the user relations
into a schema, so that it's easier to do bulk operations on all of them
while avoiding catalog tables etc.--less work to filter out pg_class to
find them for example.

I once did some pgbench testing on a system that included a real
"accounts" table in a named schema. "pgbench -i" will execute "drop table
if exists accounts". It had already accidentally wiped out the copy of
the accounts table on the system during an earlier test, before the schema
policy was in place, leaving everyone wary of it.

I was able to defend the risk for running pgbench with the new schema
layout by saying "that can only execute against public.accounts no matter
what the user search_path is, so you're safe now". That made everybody
happy. Anyone counting on such behavior could be rudely surprised at this
change. For all I know I'm the only person to ever actually run into that
particular situation though.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-06 13:37:19
Message-ID: 3700.1241617039@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith <gsmith(at)gregsmith(dot)com> writes:
> I once did some pgbench testing on a system that included a real
> "accounts" table in a named schema. "pgbench -i" will execute "drop table
> if exists accounts". It had already accidentally wiped out the copy of
> the accounts table on the system during an earlier test, before the schema
> policy was in place, leaving everyone wary of it.

Seems like the right policy for that is "run pgbench in its own
database". I doubt that either adding or removing the "set search_path"
command changes the risk of trouble very much.

regards, tom lane


From: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-06 18:53:10
Message-ID: 1241635990.4141.21.camel@analise3.cresoltec.com.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Em Qua, 2009-05-06 às 09:37 -0400, Tom Lane escreveu:
> Greg Smith <gsmith(at)gregsmith(dot)com> writes:
> > I once did some pgbench testing on a system that included a real
> > "accounts" table in a named schema. "pgbench -i" will execute "drop table
> > if exists accounts". It had already accidentally wiped out the copy of
> > the accounts table on the system during an earlier test, before the schema
> > policy was in place, leaving everyone wary of it.
>
> Seems like the right policy for that is "run pgbench in its own
> database".

A text warning about this could be shown at start of pgbench if the
target database isn't named "pgbench", for examplo, or just some text
could be added to the docs.

regards.
--
Dickson S. Guedes
mail/xmpp: guedes(at)guedesoft(dot)net - skype: guediz
http://guedesoft.net - http://planeta.postgresql.org.br


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-06 19:13:05
Message-ID: 20090506191305.GJ4476@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dickson S. Guedes wrote:
> Em Qua, 2009-05-06 às 09:37 -0400, Tom Lane escreveu:

> > Seems like the right policy for that is "run pgbench in its own
> > database".
>
> A text warning about this could be shown at start of pgbench if the
> target database isn't named "pgbench", for examplo, or just some text
> could be added to the docs.

I think it would be better that the schema is specified on the command
line.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-06 19:18:05
Message-ID: 24357.1241637485@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Dickson S. Guedes" <listas(at)guedesoft(dot)net> writes:
> Em Qua, 2009-05-06 s 09:37 -0400, Tom Lane escreveu:
>> Seems like the right policy for that is "run pgbench in its own
>> database".

> A text warning about this could be shown at start of pgbench if the
> target database isn't named "pgbench", for examplo, or just some text
> could be added to the docs.

There already is a prominent warning in the pgbench docs:

Caution

pgbench -i creates four tables accounts, branches, history, and
tellers, destroying any existing tables of these names. Be very
careful to use another database if you have tables having these
names!

regards, tom lane


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-06 19:29:49
Message-ID: 1241638189.4278.67.camel@jd-laptop.pragmaticzealot.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2009-05-06 at 15:13 -0400, Alvaro Herrera wrote:
> Dickson S. Guedes wrote:
> > Em Qua, 2009-05-06 às 09:37 -0400, Tom Lane escreveu:
>
> > > Seems like the right policy for that is "run pgbench in its own
> > > database".
> >
> > A text warning about this could be shown at start of pgbench if the
> > target database isn't named "pgbench", for examplo, or just some text
> > could be added to the docs.
>
> I think it would be better that the schema is specified on the command
> line.

I could see that as an option but applications that use a role should
adhere to the rules the DBA sets forth for that role. In this particular
case I explicitly said that role bench01 was to connect to the database
bench and that his search path was bench01 (thus all tables would be
created under the schema bench01). Public should never come into play in
that scenario.

Sincerely,

Joshua D. Drake

--
PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-06 20:27:18
Message-ID: 26920.1241641638@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> I think it would be better that the schema is specified on the command
> line.

Surely that's more work than the issue is worth. It's also inconvenient
to use, because you'd have to remember to give the switch both for the
-i run and the normal test runs.

regards, tom lane


From: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-06 20:42:39
Message-ID: 1241642559.4141.45.camel@analise3.cresoltec.com.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Em Qua, 2009-05-06 às 16:27 -0400, Tom Lane escreveu:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> > I think it would be better that the schema is specified on the command
> > line.
>
> Surely that's more work than the issue is worth. It's also inconvenient
> to use, because you'd have to remember to give the switch both for the
> -i run and the normal test runs.

So, in my opinion, the Joshua alternative is a good little change that
let "pgbench" runs in a more flexible way.

But, there is the possibility that someone are using an automated script
that could be broken by this change?

--
Dickson S. Guedes
mail/xmpp: guedes(at)guedesoft(dot)net - skype: guediz
http://guedesoft.net - http://planeta.postgresql.org.br


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-06 20:49:10
Message-ID: 1241642950.4278.112.camel@jd-laptop.pragmaticzealot.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2009-05-06 at 17:42 -0300, Dickson S. Guedes wrote:
> Em Qua, 2009-05-06 às 16:27 -0400, Tom Lane escreveu:
> > Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> > > I think it would be better that the schema is specified on the command
> > > line.
> >
> > Surely that's more work than the issue is worth. It's also inconvenient
> > to use, because you'd have to remember to give the switch both for the
> > -i run and the normal test runs.
>
> So, in my opinion, the Joshua alternative is a good little change that
> let "pgbench" runs in a more flexible way.
>
> But, there is the possibility that someone are using an automated script
> that could be broken by this change?

Only if the role pgbench is using as an explicit search_path set.

Sincerely,

Joshua D. Drake

--
PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997


From: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>
To: jd(at)commandprompt(dot)com
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-06 21:01:57
Message-ID: 1241643717.4141.52.camel@analise3.cresoltec.com.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Em Qua, 2009-05-06 às 13:49 -0700, Joshua D. Drake escreveu:
> On Wed, 2009-05-06 at 17:42 -0300, Dickson S. Guedes wrote:
> > Em Qua, 2009-05-06 às 16:27 -0400, Tom Lane escreveu:
> > > Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> > > > I think it would be better that the schema is specified on the command
> > > > line.
> > >
> > > Surely that's more work than the issue is worth. It's also inconvenient
> > > to use, because you'd have to remember to give the switch both for the
> > > -i run and the normal test runs.
> >
> > So, in my opinion, the Joshua alternative is a good little change that
> > let "pgbench" runs in a more flexible way.
> >
> > But, there is the possibility that someone are using an automated script
> > that could be broken by this change?
>
> Only if the role pgbench is using as an explicit search_path set.

So, in a way to avoid the scenario where a ROLE has an explicit
search_path set to schemes that already have tables named same as the
pgbench's tables, doesn't makes sense also create a "pgbench_" suffix
for them?

--
Dickson S. Guedes
mail/xmpp: guedes(at)guedesoft(dot)net - skype: guediz
http://guedesoft.net - http://planeta.postgresql.org.br


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: jd(at)commandprompt(dot)com
Cc: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-06 21:04:57
Message-ID: 27748.1241643897@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Joshua D. Drake" <jd(at)commandprompt(dot)com> writes:
> On Wed, 2009-05-06 at 17:42 -0300, Dickson S. Guedes wrote:
>> But, there is the possibility that someone are using an automated script
>> that could be broken by this change?

> Only if the role pgbench is using as an explicit search_path set.

Even then, it's not a problem from the point of view of pgbench ---
the tables will still get created and used correctly. The only problem
shows up if someone is ignoring the existing warning in the docs and
running pgbench in a database that has application tables named accounts
&etc. If you're doing that you're at considerable risk anyway, no
matter *what* we do or don't do with pgbench's search path.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>
Cc: jd(at)commandprompt(dot)com, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-06 21:13:27
Message-ID: 27949.1241644407@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Dickson S. Guedes" <listas(at)guedesoft(dot)net> writes:
> So, in a way to avoid the scenario where a ROLE has an explicit
> search_path set to schemes that already have tables named same as the
> pgbench's tables, doesn't makes sense also create a "pgbench_" suffix
> for them?

Hm, just rename the standard scenario's tables to pgbench_accounts
etc? Sure, but then we break custom pgbench scripts that happen
to be using the default tables for their own purposes. There's
no free lunch.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-07 14:12:07
Message-ID: 1241705527.6109.188.camel@ebony.2ndQuadrant
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Wed, 2009-05-06 at 15:18 -0400, Tom Lane wrote:
> "Dickson S. Guedes" <listas(at)guedesoft(dot)net> writes:
> > Em Qua, 2009-05-06 s 09:37 -0400, Tom Lane escreveu:
> >> Seems like the right policy for that is "run pgbench in its own
> >> database".
>
> > A text warning about this could be shown at start of pgbench if the
> > target database isn't named "pgbench", for examplo, or just some text
> > could be added to the docs.
>
> There already is a prominent warning in the pgbench docs:
>
> Caution
>
> pgbench -i creates four tables accounts, branches, history, and
> tellers, destroying any existing tables of these names. Be very
> careful to use another database if you have tables having these
> names!

Holy Handgrenade, what a huge footgun! It doesn't even have a
conceivable upside.

The table names "accounts" and "history" are fairly common and a caution
isn't a sufficient safeguard on production data. We know the manual
rarely gets read *after* a problem, let alone beforehand.

We should check they are the correct tables before we just drop them.
Perhaps check for the comment "Tables for pgbench application. Not
production data" on the tables, which would be nice to add anyway.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-07 15:14:35
Message-ID: 603c8f070905070814w4ee5282j355d5e497e3a18c1@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, May 7, 2009 at 10:12 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> On Wed, 2009-05-06 at 15:18 -0400, Tom Lane wrote:
>> "Dickson S. Guedes" <listas(at)guedesoft(dot)net> writes:
>> > Em Qua, 2009-05-06 s 09:37 -0400, Tom Lane escreveu:
>> >> Seems like the right policy for that is "run pgbench in its own
>> >> database".
>>
>> > A text warning about this could be shown at start of pgbench if the
>> > target database isn't named "pgbench", for examplo, or just some text
>> > could be added to the docs.
>>
>> There already is a prominent warning in the pgbench docs:
>>
>>               Caution
>>
>>       pgbench -i creates four tables accounts, branches, history, and
>>       tellers, destroying any existing tables of these names. Be very
>>       careful to use another database if you have tables having these
>>       names!
>
> Holy Handgrenade, what a huge footgun! It doesn't even have a
> conceivable upside.
>
> The table names "accounts" and "history" are fairly common and a caution
> isn't a sufficient safeguard on production data. We know the manual
> rarely gets read *after* a problem, let alone beforehand.
>
> We should check they are the correct tables before we just drop them.
> Perhaps check for the comment "Tables for pgbench application. Not
> production data" on the tables, which would be nice to add anyway.

I bet it would be just as good and a lot simpler to do what someone
suggested upthread, namely s/^/pgbench_/

...Robert


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-07 15:29:32
Message-ID: 1241710172.6109.235.camel@ebony.2ndQuadrant
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Thu, 2009-05-07 at 11:14 -0400, Robert Haas wrote:

> > We should check they are the correct tables before we just drop them.
> > Perhaps check for the comment "Tables for pgbench application. Not
> > production data" on the tables, which would be nice to add anyway.
>
> I bet it would be just as good and a lot simpler to do what someone
> suggested upthread, namely s/^/pgbench_/

Running pgbench has become more popular now, with various people
recommending running it on every system to test performance. I don't
disagree with that recommendation, but I've had problems myself with a
similar issue - hence earlier patch to prevent pgbench running a
complete database VACUUM before every test run.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Aidan Van Dyk <aidan(at)highrise(dot)ca>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-07 16:46:35
Message-ID: 20090507164635.GF3305@yugib.highrise.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Robert Haas <robertmhaas(at)gmail(dot)com> [090507 11:15]:

> I bet it would be just as good and a lot simpler to do what someone
> suggested upthread, namely s/^/pgbench_/

That has the "legacy compatibility" problem...

But seeing as "legacy" has a:
SET search_path TO public;

And uses plain <table> in it's queries/creates/drops, couldn't we just
make "new" pgbench refer to tables as <schema>.<table> where <schema> is
"public"? If we leave "schema" as public, and leave in the search_path,
we should be identical to what we currently have, except we've
explicliyt scoped was was searched for before.

And it leads to an easy way for people to change public (in the
search path and/or <schema>.<table>) to do other things (although I'm
not saying that's necessarily required or desired either).

a.

--
Aidan Van Dyk Create like a god,
aidan(at)highrise(dot)ca command like a king,
http://www.highrise.ca/ work like a slave.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-07 16:47:28
Message-ID: 20631.1241714848@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On Thu, 2009-05-07 at 11:14 -0400, Robert Haas wrote:
>>> We should check they are the correct tables before we just drop them.
>>> Perhaps check for the comment "Tables for pgbench application. Not
>>> production data" on the tables, which would be nice to add anyway.
>>
>> I bet it would be just as good and a lot simpler to do what someone
>> suggested upthread, namely s/^/pgbench_/

> Running pgbench has become more popular now, with various people
> recommending running it on every system to test performance. I don't
> disagree with that recommendation, but I've had problems myself with a
> similar issue - hence earlier patch to prevent pgbench running a
> complete database VACUUM before every test run.

Well, pgbench has been coded this way since forever and we've only seen
this one report of trouble. Still, I can't object very hard to renaming
the tables to pgbench_accounts etc --- it's a trivial change and the
only thing it could break is custom pgbench scenarios that rely on the
default scenario's tables, which there are probably not many of.

So do we have consensus on dropping the "SET search_path" and renaming
the tables?

regards, tom lane


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-07 16:53:23
Message-ID: 1241715203.11534.39.camel@jd-laptop.pragmaticzealot.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 2009-05-07 at 12:47 -0400, Tom Lane wrote:

> Well, pgbench has been coded this way since forever and we've only seen
> this one report of trouble. Still, I can't object very hard to renaming
> the tables to pgbench_accounts etc --- it's a trivial change and the
> only thing it could break is custom pgbench scenarios that rely on the
> default scenario's tables, which there are probably not many of.
>
> So do we have consensus on dropping the "SET search_path" and renaming
> the tables?

+1 (I hate prefixed table names but I get the idea)

Joshua D. Drake

>
> regards, tom lane
>
--
PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-07 16:53:27
Message-ID: 20761.1241715207@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Aidan Van Dyk <aidan(at)highrise(dot)ca> writes:
> ... couldn't we just
> make "new" pgbench refer to tables as <schema>.<table> where <schema> is
> "public"?

I'd prefer not to do that because it changes the amount of parsing work
demanded by the benchmark. Maybe not by enough to matter ... or maybe
it does. Adjusting the length of the identifiers is a small enough
change that I'm prepared to believe it doesn't invalidate comparisons,
but changing the set of catalog lookups that occur is another question.

regards, tom lane


From: Aidan Van Dyk <aidan(at)highrise(dot)ca>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-07 16:58:50
Message-ID: 20090507165850.GG3305@yugib.highrise.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> [090507 12:53]:
> Aidan Van Dyk <aidan(at)highrise(dot)ca> writes:
> > ... couldn't we just
> > make "new" pgbench refer to tables as <schema>.<table> where <schema> is
> > "public"?
>
> I'd prefer not to do that because it changes the amount of parsing work
> demanded by the benchmark. Maybe not by enough to matter ... or maybe
> it does. Adjusting the length of the identifiers is a small enough
> change that I'm prepared to believe it doesn't invalidate comparisons,
> but changing the set of catalog lookups that occur is another question.

True enough... What about making the prefix be configurable, so by
default, it could be "pgbench_", it could be set to "" (to force it to
use old pgbench names) or set to "something." to get it to use a
different schema (noting that the comparisons to older ones not doing
catalog lookups are void).

But by dropping the search_path, you're necessarily changing the catalog
comparisons and lookups anyways, because your now taking a "random"
search path to follow (which could have multiple entries in it) instead
of one guaranteed to be a single, useable entry.

--
Aidan Van Dyk Create like a god,
aidan(at)highrise(dot)ca command like a king,
http://www.highrise.ca/ work like a slave.


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-07 17:01:57
Message-ID: 1241715717.11534.44.camel@jd-laptop.pragmaticzealot.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 2009-05-07 at 12:58 -0400, Aidan Van Dyk wrote:

> True enough... What about making the prefix be configurable, so by
> default, it could be "pgbench_", it could be set to "" (to force it to
> use old pgbench names) or set to "something." to get it to use a
> different schema (noting that the comparisons to older ones not doing
> catalog lookups are void).

Then you have to pass the prefix on the command line. That seems a bit
over doing it for such a simple utility.

>
> But by dropping the search_path, you're necessarily changing the catalog
> comparisons and lookups anyways, because your now taking a "random"
> search path to follow (which could have multiple entries in it) instead
> of one guaranteed to be a single, useable entry.

Except that it isn't a guaranteed usable entry, which is why I submitted
the patch.

Sincerely,

Joshua D. Drake

--
PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997


From: Aidan Van Dyk <aidan(at)highrise(dot)ca>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-07 17:05:35
Message-ID: 20090507170535.GH3305@yugib.highrise.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Joshua D. Drake <jd(at)commandprompt(dot)com> [090507 13:02]:
> On Thu, 2009-05-07 at 12:58 -0400, Aidan Van Dyk wrote:
>
> > True enough... What about making the prefix be configurable, so by
> > default, it could be "pgbench_", it could be set to "" (to force it to
> > use old pgbench names) or set to "something." to get it to use a
> > different schema (noting that the comparisons to older ones not doing
> > catalog lookups are void).
>
> Then you have to pass the prefix on the command line. That seems a bit
> over doing it for such a simple utility.

Sure, but by putting a sane default (which seems to be leaning towards
"" or "pgbench_"), you don't *need* to do anything on the command line.

> > But by dropping the search_path, you're necessarily changing the catalog
> > comparisons and lookups anyways, because your now taking a "random"
> > search path to follow (which could have multiple entries in it) instead
> > of one guaranteed to be a single, useable entry.
>
> Except that it isn't a guaranteed usable entry, which is why I submitted
> the patch.

Well ya, but at least you didn't have any pgbench result to try and
"compare unevenly" with something else ;-)

a.

--
Aidan Van Dyk Create like a god,
aidan(at)highrise(dot)ca command like a king,
http://www.highrise.ca/ work like a slave.


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: jd(at)commandprompt(dot)com
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-07 17:07:48
Message-ID: 1241716068.6109.291.camel@ebony.2ndQuadrant
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Thu, 2009-05-07 at 09:53 -0700, Joshua D. Drake wrote:
> On Thu, 2009-05-07 at 12:47 -0400, Tom Lane wrote:
>
> > Well, pgbench has been coded this way since forever and we've only seen
> > this one report of trouble. Still, I can't object very hard to renaming
> > the tables to pgbench_accounts etc --- it's a trivial change and the
> > only thing it could break is custom pgbench scenarios that rely on the
> > default scenario's tables, which there are probably not many of.
> >
> > So do we have consensus on dropping the "SET search_path" and renaming
> > the tables?
>
> +1 (I hate prefixed table names but I get the idea)

+1, sorry JD.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: jd(at)commandprompt(dot)com
Cc: Aidan Van Dyk <aidan(at)highrise(dot)ca>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-07 17:08:16
Message-ID: 21179.1241716096@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Joshua D. Drake" <jd(at)commandprompt(dot)com> writes:
> On Thu, 2009-05-07 at 12:58 -0400, Aidan Van Dyk wrote:
>> True enough... What about making the prefix be configurable, so by
>> default, it could be "pgbench_", it could be set to "" (to force it to
>> use old pgbench names) or set to "something." to get it to use a
>> different schema (noting that the comparisons to older ones not doing
>> catalog lookups are void).

> Then you have to pass the prefix on the command line. That seems a bit
> over doing it for such a simple utility.

Yes, this seems like vastly more work than is called for.

>> But by dropping the search_path, you're necessarily changing the catalog
>> comparisons and lookups anyways, because your now taking a "random"
>> search path to follow (which could have multiple entries in it) instead
>> of one guaranteed to be a single, useable entry.

> Except that it isn't a guaranteed usable entry, which is why I submitted
> the patch.

I think this argument is bogus anyway. The tables are always going to be
created in the default creation schema, ie, the first one on the search
path. As long as you don't change the effective search_path between
pgbench -i and the actual test runs, it won't matter whether that is
public or something else.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: jd(at)commandprompt(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-07 22:02:50
Message-ID: 27657.1241733770@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Joshua D. Drake" <jd(at)commandprompt(dot)com> writes:
> --- a/contrib/pgbench/pgbench.c
> +++ b/contrib/pgbench/pgbench.c
> @@ -357,8 +357,6 @@ doConnect(void)
> return NULL;
> }
>
> - executeStatement(conn, "SET search_path = public");
> -
> return conn;
> }

Applied along with changes of table names accounts -> pgbench_accounts
etc, per discussion.

regards, tom lane


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Aidan Van Dyk <aidan(at)highrise(dot)ca>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-08 00:07:37
Message-ID: alpine.GSO.2.01.0905071934220.25224@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 7 May 2009, Aidan Van Dyk wrote:

> But by dropping the search_path, you're necessarily changing the catalog
> comparisons and lookups anyways, because your now taking a "random"
> search path to follow (which could have multiple entries in it) instead
> of one guaranteed to be a single, useable entry.

You are correct here. Right now, pgbench is guaranteed to be running
against a search_path with only one entry in it. If someone runs the new
version against a configuration with something like:

search_path='a,b,c,d,e,f,g,h,i,j,public'

instead, that is going to execute more slowly than the current pgbench
would have.

But it seems pretty unlikely such a change would actually be noticable
relative to how much per-transaction overhead and run to run variation
there already is in pgbench for reasonably sized catalogs. Maybe it's
worth adding a quick comment about the issue in the docs, I don't think
this downside is significant enough to worry about beyond that.

I think Joshua's original suggestion here is worth considering a bug fix
for merging into 8.4. As long as testers don't do anything crazy with
their manually set search_path, results should be identical with the
earlier verions.

The additional suggestion of renaming the tables with a prefix is
reasonable to me, but it seems way out of scope for something to consider
applying right now though. If you look at the pgbench client, there's a
lot of string parsing going on in there that's not particularly efficient.
I'd want to see a benchmark aimed that quantifying whether that part
suffers measurably from making the table names all longer before such a
change got applied. And there's already a couple of us who are in the
middle of 8.4 pgbench tests that really don't need disruption like that
thrown into the mix right now.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Aidan Van Dyk <aidan(at)highrise(dot)ca>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-08 00:11:08
Message-ID: 7736.1241741468@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith <gsmith(at)gregsmith(dot)com> writes:
> On Thu, 7 May 2009, Aidan Van Dyk wrote:
> You are correct here. Right now, pgbench is guaranteed to be running
> against a search_path with only one entry in it. If someone runs the new
> version against a configuration with something like:

> search_path='a,b,c,d,e,f,g,h,i,j,public'

> instead, that is going to execute more slowly than the current pgbench
> would have.

No, it is not. The tables will be created and used in schema 'a',
and the effective search path depth will be the same.

regards, tom lane


From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Aidan Van Dyk <aidan(at)highrise(dot)ca>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-08 18:45:57
Message-ID: alpine.GSO.2.01.0905081443550.1930@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 7 May 2009, Tom Lane wrote:

> The tables will be created and used in schema 'a', and the effective
> search path depth will be the same.

The case to be concerned about here is where the search_path changes
between initialization and the pgbench run, which certainly isn't
impossible. That can leave you with a longer effective path to search.
Pretty unlikely to be a problem in the field though.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Aidan Van Dyk <aidan(at)highrise(dot)ca>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, "Dickson S(dot) Guedes" <listas(at)guedesoft(dot)net>, jd(at)commandprompt(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch to fix search_path defencies with pg_bench
Date: 2009-05-08 19:11:07
Message-ID: 12470.1241809867@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Smith <gsmith(at)gregsmith(dot)com> writes:
> On Thu, 7 May 2009, Tom Lane wrote:
>> The tables will be created and used in schema 'a', and the effective
>> search path depth will be the same.

> The case to be concerned about here is where the search_path changes
> between initialization and the pgbench run, which certainly isn't
> impossible. That can leave you with a longer effective path to search.
> Pretty unlikely to be a problem in the field though.

Yeah. Also, there is another consideration here that hasn't been
brought up AFAIR: the main point of running pgbench in-the-field
is to find out whether your installation is properly tuned. If
you've chosen a search_path setting that *did* have some unexpected
performance issues, wouldn't you want pgbench to reveal that?
It's peculiar to have pgbench forcing this one particular GUC setting
and not any others.

regards, tom lane