Re: InitPostgres and flatfiles question

Lists: pgsql-hackers
From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: pgsql-hackers(at)postgresql(dot)org
Subject: InitPostgres and flatfiles question
Date: 2007-01-03 15:49:51
Message-ID: 459BD09F.30602@bluegap.ch
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I've just run into a race condition with creating a database and
connecting to it immediately afterwards. I'm getting a "database %s not
found" error just after the first flatfiles check in InitPostgres().

What that FindMyDatabase() there does, besides checking if the database
exists, is getting the database and tablespace OIDs, right? Why does the
postmaster not pass those instead of the database name? What would be
the dangers of doing so? (Besides code different from standalone backend)?

Regards

Markus


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Markus Schiltknecht <markus(at)bluegap(dot)ch>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-03 16:07:35
Message-ID: 9742.1167840455@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Markus Schiltknecht <markus(at)bluegap(dot)ch> writes:
> I've just run into a race condition with creating a database and
> connecting to it immediately afterwards. I'm getting a "database %s not
> found" error just after the first flatfiles check in InitPostgres().

In what PG version?

> What that FindMyDatabase() there does, besides checking if the database
> exists, is getting the database and tablespace OIDs, right? Why does the
> postmaster not pass those instead of the database name?

Because the postmaster doesn't have direct database access. If it did,
any corruption of shared memory would risk crashing the postmaster
along with the backends.

regards, tom lane


From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-03 16:20:41
Message-ID: 459BD7D9.6010001@bluegap.ch
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello Tom,

Tom Lane wrote:
> In what PG version?

Postgres-R 8.3devel ;-)

> Because the postmaster doesn't have direct database access. If it did,
> any corruption of shared memory would risk crashing the postmaster
> along with the backends.

Understood, thanks.

Most probably I better go another way, than fiddling with postgres
startup internals. Is there a way to be sure the flatfile has been
written to disk after a CREATE DATABASE? I would like to ensure I can
connect to a newly created database.

Regards

Markus


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Markus Schiltknecht <markus(at)bluegap(dot)ch>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-03 16:28:54
Message-ID: 11702.1167841734@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Markus Schiltknecht <markus(at)bluegap(dot)ch> writes:
> Most probably I better go another way, than fiddling with postgres
> startup internals. Is there a way to be sure the flatfile has been
> written to disk after a CREATE DATABASE? I would like to ensure I can
> connect to a newly created database.

It should happen automatically at commit of the CREATE DATABASE ... and
you'd not be able to see the pg_database row before that anyway. So I'm
not clear on what you're worried about.

regards, tom lane


From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-03 16:37:44
Message-ID: 459BDBD8.4090601@bluegap.ch
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> It should happen automatically at commit of the CREATE DATABASE ... and
> you'd not be able to see the pg_database row before that anyway. So I'm
> not clear on what you're worried about.

Okay, thanks. I'll have to investigate on why exactly I still get the
error, then. That's unclear to me, too.

Regards

Markus


From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-04 09:00:32
Message-ID: 459CC230.6080409@bluegap.ch
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Tom Lane wrote:
> It should happen automatically at commit of the CREATE DATABASE ... and
> you'd not be able to see the pg_database row before that anyway. So I'm
> not clear on what you're worried about.

I've just found the stumbling block: the -c option of psql wraps all in
a transaction, as man psql says:

If the command string contains multiple SQL commands, they are
processed in a single transaction, unless there are explicit
BEGIN/COMMIT commands included in the string to divide it into
multiple transactions. This is different from the behavior when
the same string is fed to psql’s standard input.

Thank you for clarification, I wouldn't have expected that (especially
because CREATE DATABASE itself says, it cannot be run inside a
transaction block... A transaction block (with BEGIN and COMMIT) seems
to be more than just a transaction, right?)

Regards

Markus


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Markus Schiltknecht <markus(at)bluegap(dot)ch>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-04 15:36:48
Message-ID: 17180.1167925008@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Markus Schiltknecht <markus(at)bluegap(dot)ch> writes:
> I've just found the stumbling block: the -c option of psql wraps all in
> a transaction, as man psql says:
> ...
> Thank you for clarification, I wouldn't have expected that (especially
> because CREATE DATABASE itself says, it cannot be run inside a
> transaction block... A transaction block (with BEGIN and COMMIT) seems
> to be more than just a transaction, right?)

Hm, that's an interesting point. psql's -c just shoves its whole
argument string at the backend in one PQexec(), instead of dividing
at semicolons as psql does with normal input. And so it winds up as
a single transaction because postgres.c doesn't force a transaction
commit until the end of the querystring. But that's not a "transaction
block" in the normal sense and so it doesn't trigger the
PreventTransactionChain defense in CREATE DATABASE and elsewhere.

I wonder whether we ought to change that? The point of
PreventTransactionChain is that we don't want the user rolling back
the statement post-completion, but it seems that
psql -c 'CREATE DATABASE foo; ABORT; BEGIN; ...'
would bypass the check.

regards, tom lane


From: Mario Weilguni <mweilguni(at)sime(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Markus Schiltknecht <markus(at)bluegap(dot)ch>
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-04 15:41:36
Message-ID: 200701041641.36860.mweilguni@sime.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Am Donnerstag, 4. Januar 2007 16:36 schrieb Tom Lane:
> Markus Schiltknecht <markus(at)bluegap(dot)ch> writes:
> Hm, that's an interesting point. psql's -c just shoves its whole
> argument string at the backend in one PQexec(), instead of dividing
> at semicolons as psql does with normal input. And so it winds up as
> a single transaction because postgres.c doesn't force a transaction
> commit until the end of the querystring. But that's not a "transaction
> block" in the normal sense and so it doesn't trigger the
> PreventTransactionChain defense in CREATE DATABASE and elsewhere.
>
> I wonder whether we ought to change that? The point of
> PreventTransactionChain is that we don't want the user rolling back
> the statement post-completion, but it seems that
> psql -c 'CREATE DATABASE foo; ABORT; BEGIN; ...'
> would bypass the check.

Maybe not directly related to that problem, but I had a problem with "-c" last
month, when I noticed that this will not work:

psql -c "set client_encoding=iso-8859-1; select name from customer" (UTF8
database, output is hmmm... broken german umlauts).

Best regards
Mario Weilguni


From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-04 16:28:39
Message-ID: 459D2B37.4080704@bluegap.ch
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Tom Lane wrote:
> Hm, that's an interesting point. psql's -c just shoves its whole
> argument string at the backend in one PQexec(), instead of dividing
> at semicolons as psql does with normal input. And so it winds up as
> a single transaction because postgres.c doesn't force a transaction
> commit until the end of the querystring. But that's not a "transaction
> block" in the normal sense and so it doesn't trigger the
> PreventTransactionChain defense in CREATE DATABASE and elsewhere.

Is there a good reason to not let psql -c behave exactly like psql from
STDIN? I found this exception to be quite confusing. Of course that
could break compatibility with certain scripts, but can this be fixed
without doing so?

Regards

Markus


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Markus Schiltknecht <markus(at)bluegap(dot)ch>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-04 16:44:36
Message-ID: 18111.1167929076@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Markus Schiltknecht <markus(at)bluegap(dot)ch> writes:
> Tom Lane wrote:
>> Hm, that's an interesting point. psql's -c just shoves its whole
>> argument string at the backend in one PQexec(), instead of dividing
>> at semicolons as psql does with normal input.

> Is there a good reason to not let psql -c behave exactly like psql from
> STDIN?

Backwards compatibility, mostly --- there seems to be a considerable
risk of subtly breaking people's scripts if we change the transactional
boundaries for psql -c commands.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Markus Schiltknecht <markus(at)bluegap(dot)ch>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-05 03:25:05
Message-ID: 200701050325.l053P5X13691@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Markus Schiltknecht <markus(at)bluegap(dot)ch> writes:
> > Tom Lane wrote:
> >> Hm, that's an interesting point. psql's -c just shoves its whole
> >> argument string at the backend in one PQexec(), instead of dividing
> >> at semicolons as psql does with normal input.
>
> > Is there a good reason to not let psql -c behave exactly like psql from
> > STDIN?
>
> Backwards compatibility, mostly --- there seems to be a considerable
> risk of subtly breaking people's scripts if we change the transactional
> boundaries for psql -c commands.

True, but if we keep hitting people who don't expect this behavior, I
wonder if we should just fix it and mention it in the release notes.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Markus Schiltknecht <markus(at)bluegap(dot)ch>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-05 03:38:18
Message-ID: 9236.1167968298@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Tom Lane wrote:
>> Markus Schiltknecht <markus(at)bluegap(dot)ch> writes:
>>> Is there a good reason to not let psql -c behave exactly like psql from
>>> STDIN?
>>
>> Backwards compatibility, mostly --- there seems to be a considerable
>> risk of subtly breaking people's scripts if we change the transactional
>> boundaries for psql -c commands.

> True, but if we keep hitting people who don't expect this behavior, I
> wonder if we should just fix it and mention it in the release notes.

One other point is that if we change -c's behavior, there won't be
*any* way to submit multiple queries in a single PQexec using plain
psql --- it will require hacking up a special test program using
libpq directly. Unless we have plans to obsolete
multi-queries-per-PQexec altogether, this doesn't seem like a good idea.

OTOH, you could argue that forbidding multiple queries in one PQexec
isn't a bad idea; it would provide an additional defense against
SQL-injection attacks. We did that already in the "extended" query
protocol and I've not heard many complaints.

I'd be willing to buy into doing both together, perhaps.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Markus Schiltknecht <markus(at)bluegap(dot)ch>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-05 03:43:22
Message-ID: 200701050343.l053hMg16287@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > Tom Lane wrote:
> >> Markus Schiltknecht <markus(at)bluegap(dot)ch> writes:
> >>> Is there a good reason to not let psql -c behave exactly like psql from
> >>> STDIN?
> >>
> >> Backwards compatibility, mostly --- there seems to be a considerable
> >> risk of subtly breaking people's scripts if we change the transactional
> >> boundaries for psql -c commands.
>
> > True, but if we keep hitting people who don't expect this behavior, I
> > wonder if we should just fix it and mention it in the release notes.
>
> One other point is that if we change -c's behavior, there won't be
> *any* way to submit multiple queries in a single PQexec using plain
> psql --- it will require hacking up a special test program using
> libpq directly. Unless we have plans to obsolete
> multi-queries-per-PQexec altogether, this doesn't seem like a good idea.

What value is allowing multiple queies via PQexec() via psql, aside from
avoiding BEGIN/END around your -c query string?

> OTOH, you could argue that forbidding multiple queries in one PQexec
> isn't a bad idea; it would provide an additional defense against
> SQL-injection attacks. We did that already in the "extended" query
> protocol and I've not heard many complaints.
>
> I'd be willing to buy into doing both together, perhaps.

True.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Markus Schiltknecht <markus(at)bluegap(dot)ch>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-05 03:48:40
Message-ID: 9335.1167968920@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> What value is allowing multiple queies via PQexec()

The only argument I can think of is that it allows applications to be
sloppy about parsing a SQL script into individual commands before they
send it. (I think initdb may be guilty of exactly that BTW...) At the
same time you could argue that such sloppiness is inherently a Bad Idea.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Markus Schiltknecht <markus(at)bluegap(dot)ch>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-05 04:07:13
Message-ID: 200701050407.l0547DH19721@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > What value is allowing multiple queries via PQexec()
>
> The only argument I can think of is that it allows applications to be
> sloppy about parsing a SQL script into individual commands before they
> send it. (I think initdb may be guilty of exactly that BTW...) At the
> same time you could argue that such sloppiness is inherently a Bad Idea.

I thought the idea was that psql was going to split multiple -c commands
into separate PQexec() calls, so sloppy was OK, and if they want a
single transaction, add BEGIN/END to the string.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Markus Schiltknecht <markus(at)bluegap(dot)ch>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-05 04:13:39
Message-ID: 9568.1167970419@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Tom Lane wrote:
>> The only argument I can think of is that it allows applications to be
>> sloppy about parsing a SQL script into individual commands before they
>> send it. (I think initdb may be guilty of exactly that BTW...) At the
>> same time you could argue that such sloppiness is inherently a Bad Idea.

> I thought the idea was that psql was going to split multiple -c commands
> into separate PQexec() calls, so sloppy was OK, and if they want a
> single transaction, add BEGIN/END to the string.

No, psql isn't the point: we can certainly make its behavior match the
backend's. What I'm wondering about is the effect on random PG-using
applications: should we forbid them from sending multiple SQL commands
per PQexec (or equivalent in other client library APIs)?

Backwards compatibility says no, but you can make some decent arguments
for forbidding it anyway.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Markus Schiltknecht <markus(at)bluegap(dot)ch>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-05 04:30:11
Message-ID: 200701050430.l054UBO22537@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > Tom Lane wrote:
> >> The only argument I can think of is that it allows applications to be
> >> sloppy about parsing a SQL script into individual commands before they
> >> send it. (I think initdb may be guilty of exactly that BTW...) At the
> >> same time you could argue that such sloppiness is inherently a Bad Idea.
>
> > I thought the idea was that psql was going to split multiple -c commands
> > into separate PQexec() calls, so sloppy was OK, and if they want a
> > single transaction, add BEGIN/END to the string.
>
> No, psql isn't the point: we can certainly make its behavior match the
> backend's. What I'm wondering about is the effect on random PG-using
> applications: should we forbid them from sending multiple SQL commands
> per PQexec (or equivalent in other client library APIs)?
>
> Backwards compatibility says no, but you can make some decent arguments
> for forbidding it anyway.

Yea, I was trying to separate the psql case from the PQexec() case. For
psql, I think it is clear that -c _should_ act like a normal stdin
query. That would eliminate confusion, and I don't see a large loss of
functionality.

The PQexec() case, the problem is we don't know who is using
multi-statement PQexec() calls, and users can't always add BEGIN/END to
fix them if they are embedded in applications.

What we could do it do both and see what pushback we get during beta.
We could always revert it before the final release.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-05 20:33:36
Message-ID: 459EB620.6050705@cheapcomplexdevices.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>> What value is allowing multiple queies via PQexec()
>
> The only argument I can think of is that it allows applications to be
> sloppy about parsing a SQL script into individual commands before they
> send it. (I think initdb may be guilty of exactly that BTW...) At the
> same time you could argue that such sloppiness is inherently a Bad Idea.

Doesn't it also avoid some network(?) overhead when you have
a large number of small inserts or updates?

I seem to recall a previous company where we had a major performance
by concatenating a bunch of updates with ";"s in between and sending
them to postgresql as a single command.


From: Jim Nasby <decibel(at)decibel(dot)org>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Markus Schiltknecht <markus(at)bluegap(dot)ch>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-05 21:24:41
Message-ID: 96B7153F-29F7-4FB8-B399-A7D3CE4AEB1D@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jan 4, 2007, at 11:30 PM, Bruce Momjian wrote:
>> No, psql isn't the point: we can certainly make its behavior match
>> the
>> backend's. What I'm wondering about is the effect on random PG-using
>> applications: should we forbid them from sending multiple SQL
>> commands
>> per PQexec (or equivalent in other client library APIs)?
>>
>> Backwards compatibility says no, but you can make some decent
>> arguments
>> for forbidding it anyway.
>
> Yea, I was trying to separate the psql case from the PQexec()
> case. For
> psql, I think it is clear that -c _should_ act like a normal stdin
> query. That would eliminate confusion, and I don't see a large
> loss of
> functionality.

Heh, something I hadn't expected to work:

decibel=# select 1
decibel-# ; select 2
?column?
----------
1
(1 row)

decibel-# ;
?column?
----------
2
(1 row)

> The PQexec() case, the problem is we don't know who is using
> multi-statement PQexec() calls, and users can't always add BEGIN/
> END to
> fix them if they are embedded in applications.
>
> What we could do it do both and see what pushback we get during beta.
> We could always revert it before the final release.

There is one (weak) argument for allowing multiple commands in a
single call to the backend; it's going to perform better in an OLTP
environment because of fewer round-trips between the client and server..

Actually, there's some cases there that might not fit well into
wrapping them into a function, ie: multiple selects issued in one go.
So maybe the argument isn't that weak afterall...
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-06 01:45:51
Message-ID: 24377.1168047951@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com> writes:
> Tom Lane wrote:
>> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>>> What value is allowing multiple queies via PQexec()
>>
>> The only argument I can think of is that it allows applications to be
>> sloppy about parsing a SQL script into individual commands before they
>> send it. (I think initdb may be guilty of exactly that BTW...) At the
>> same time you could argue that such sloppiness is inherently a Bad Idea.

> Doesn't it also avoid some network(?) overhead when you have
> a large number of small inserts or updates?

> I seem to recall a previous company where we had a major performance
> by concatenating a bunch of updates with ";"s in between and sending
> them to postgresql as a single command.

These days you'd probably be better off using a multi-row VALUES() list
if relevant. Also, if you really want to send multiple statements like
that, there's a cleaner way to do it: use the extended query protocol
and don't Sync or wait for a reply until you've sent them all.

regards, tom lane


From: "Andrew Dunstan" <andrew(at)dunslane(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-06 03:40:28
Message-ID: 3845.24.211.165.134.1168054828.squirrel@www.dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


>> Tom Lane wrote:
>>> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>>>> What value is allowing multiple queies via PQexec()
>>>
>>> The only argument I can think of is that it allows applications to be
>>> sloppy about parsing a SQL script into individual commands before they
>>> send it. (I think initdb may be guilty of exactly that BTW...) At the
>>> same time you could argue that such sloppiness is inherently a Bad
>>> Idea.

initdb doesn't use libpq at all ... are you saying it should only pass
individual statements to the backend? Adding smarts to enable that would
add complexity without any benefit I can see.

cheers

andrew


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-06 21:58:11
Message-ID: 200701062158.l06LwBD03325@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Ron Mayer wrote:
> Tom Lane wrote:
> > Bruce Momjian <bruce(at)momjian(dot)us> writes:
> >> What value is allowing multiple queies via PQexec()
> >
> > The only argument I can think of is that it allows applications to be
> > sloppy about parsing a SQL script into individual commands before they
> > send it. (I think initdb may be guilty of exactly that BTW...) At the
> > same time you could argue that such sloppiness is inherently a Bad Idea.
>
> Doesn't it also avoid some network(?) overhead when you have
> a large number of small inserts or updates?
>
> I seem to recall a previous company where we had a major performance
> by concatenating a bunch of updates with ";"s in between and sending
> them to postgresql as a single command.

Added to TODO list:

> o Consider parsing the -c string into individual queries so each
> is run in its own transaction
>
> o Consider disallowing multiple queries in PQexec() as an
> additional barrier to SQL injection attacks

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Markus Schiltknecht <markus(at)bluegap(dot)ch>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-06 22:01:57
Message-ID: 200701062201.l06M1vJ03895@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Do we need a TODO for this?

---------------------------------------------------------------------------

Tom Lane wrote:
> Markus Schiltknecht <markus(at)bluegap(dot)ch> writes:
> > I've just found the stumbling block: the -c option of psql wraps all in
> > a transaction, as man psql says:
> > ...
> > Thank you for clarification, I wouldn't have expected that (especially
> > because CREATE DATABASE itself says, it cannot be run inside a
> > transaction block... A transaction block (with BEGIN and COMMIT) seems
> > to be more than just a transaction, right?)
>
> Hm, that's an interesting point. psql's -c just shoves its whole
> argument string at the backend in one PQexec(), instead of dividing
> at semicolons as psql does with normal input. And so it winds up as
> a single transaction because postgres.c doesn't force a transaction
> commit until the end of the querystring. But that's not a "transaction
> block" in the normal sense and so it doesn't trigger the
> PreventTransactionChain defense in CREATE DATABASE and elsewhere.
>
> I wonder whether we ought to change that? The point of
> PreventTransactionChain is that we don't want the user rolling back
> the statement post-completion, but it seems that
> psql -c 'CREATE DATABASE foo; ABORT; BEGIN; ...'
> would bypass the check.
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: elein <elein(at)varlena(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-07 01:40:07
Message-ID: 20070107014007.GN24367@varlena.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 05, 2007 at 08:45:51PM -0500, Tom Lane wrote:
> Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com> writes:
> > Tom Lane wrote:
> >> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> >>> What value is allowing multiple queies via PQexec()
> >>
> >> The only argument I can think of is that it allows applications to be
> >> sloppy about parsing a SQL script into individual commands before they
> >> send it. (I think initdb may be guilty of exactly that BTW...) At the
> >> same time you could argue that such sloppiness is inherently a Bad Idea.
>
> > Doesn't it also avoid some network(?) overhead when you have
> > a large number of small inserts or updates?
>
> > I seem to recall a previous company where we had a major performance
> > by concatenating a bunch of updates with ";"s in between and sending
> > them to postgresql as a single command.
>
> These days you'd probably be better off using a multi-row VALUES() list
> if relevant. Also, if you really want to send multiple statements like
> that, there's a cleaner way to do it: use the extended query protocol
> and don't Sync or wait for a reply until you've sent them all.
>
> regards, tom lane
>
In shell scripts that do things in the database I often put >1 statement
in the line. Since it is the shell, I want quick results. Usually it
is an INSERT/UPDATE followed by a SELECT.

It would be very frustrating not to be able to send multiple commands
with one -c in psql.

--elein


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: elein <elein(at)varlena(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-07 02:02:49
Message-ID: 200701070202.l0722nP27481@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

elein wrote:
> On Fri, Jan 05, 2007 at 08:45:51PM -0500, Tom Lane wrote:
> > Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com> writes:
> > > Tom Lane wrote:
> > >> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > >>> What value is allowing multiple queies via PQexec()
> > >>
> > >> The only argument I can think of is that it allows applications to be
> > >> sloppy about parsing a SQL script into individual commands before they
> > >> send it. (I think initdb may be guilty of exactly that BTW...) At the
> > >> same time you could argue that such sloppiness is inherently a Bad Idea.
> >
> > > Doesn't it also avoid some network(?) overhead when you have
> > > a large number of small inserts or updates?
> >
> > > I seem to recall a previous company where we had a major performance
> > > by concatenating a bunch of updates with ";"s in between and sending
> > > them to postgresql as a single command.
> >
> > These days you'd probably be better off using a multi-row VALUES() list
> > if relevant. Also, if you really want to send multiple statements like
> > that, there's a cleaner way to do it: use the extended query protocol
> > and don't Sync or wait for a reply until you've sent them all.
> >
> > regards, tom lane
> >
> In shell scripts that do things in the database I often put >1 statement
> in the line. Since it is the shell, I want quick results. Usually it
> is an INSERT/UPDATE followed by a SELECT.
>
> It would be very frustrating not to be able to send multiple commands
> with one -c in psql.

We aren't going to disable that --- we are considering disabling the
backend from treating it as a single transaction.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Markus Schiltknecht <markus(at)bluegap(dot)ch>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-07 02:09:17
Message-ID: 1955.1168135757@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Do we need a TODO for this?

Well, if we *don't* change the backend to disallow multi statements per
PQexec, then we'd probably better do something about this. If we do
make that change then it's not a problem anymore.

>> Hm, that's an interesting point. psql's -c just shoves its whole
>> argument string at the backend in one PQexec(), instead of dividing
>> at semicolons as psql does with normal input. And so it winds up as
>> a single transaction because postgres.c doesn't force a transaction
>> commit until the end of the querystring. But that's not a "transaction
>> block" in the normal sense and so it doesn't trigger the
>> PreventTransactionChain defense in CREATE DATABASE and elsewhere.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: elein <elein(at)varlena(dot)com>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-07 02:11:44
Message-ID: 1983.1168135904@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> We aren't going to disable that --- we are considering disabling the
> backend from treating it as a single transaction.

Or even more specifically, making sure that that only happens if you
explicitly put begin/commit into the -c string.

One thing I think we probably ought *not* change is that if any of the
statements fail the rest of the -c string is abandoned --- that is,
-c would work like a script with ON_ERROR_STOP enabled.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Markus Schiltknecht <markus(at)bluegap(dot)ch>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: InitPostgres and flatfiles question
Date: 2007-01-11 04:48:00
Message-ID: 200701110448.l0B4m0m27026@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Markus Schiltknecht <markus(at)bluegap(dot)ch> writes:
> > I've just found the stumbling block: the -c option of psql wraps all in
> > a transaction, as man psql says:
> > ...
> > Thank you for clarification, I wouldn't have expected that (especially
> > because CREATE DATABASE itself says, it cannot be run inside a
> > transaction block... A transaction block (with BEGIN and COMMIT) seems
> > to be more than just a transaction, right?)
>
> Hm, that's an interesting point. psql's -c just shoves its whole
> argument string at the backend in one PQexec(), instead of dividing
> at semicolons as psql does with normal input. And so it winds up as
> a single transaction because postgres.c doesn't force a transaction
> commit until the end of the querystring. But that's not a "transaction
> block" in the normal sense and so it doesn't trigger the
> PreventTransactionChain defense in CREATE DATABASE and elsewhere.
>
> I wonder whether we ought to change that? The point of
> PreventTransactionChain is that we don't want the user rolling back
> the statement post-completion, but it seems that
> psql -c 'CREATE DATABASE foo; ABORT; BEGIN; ...'
> would bypass the check.

Added to TODO:

o Fix transaction restriction checks for CREATE DATABASE and
other commands

http://archives.postgresql.org/pgsql-hackers/2007-01/msg00133.php

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +