Re: Allowing line-continuation in pgbench custom scripts

Lists: pgsql-hackers
From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Allowing line-continuation in pgbench custom scripts
Date: 2014-05-26 09:50:40
Message-ID: CA+HiwqEMGL3TJknmZBd7hmPtTnqkrkarusi9hDQEkNnaynnv7g@mail.gmail.com

Hi,

In a custom pgbench script, it seems convenient to be able to split a
really long query to span multiple lines using an escape character
(bash-style). Attached adds that capability to read_line_from_file()
in pgbench.c

For example,

BEGIN;
\setrandom id 1 16500000
UPDATE table \
SET col2 = (clock_timestamp() + '10s'::interval * random() * 1000), \
col3 = (clock_timestamp() + '10s'::interval * sin(random() * (2*pi()) ) * 1000) \
WHERE col1 = :id;
COMMIT;

instead of:

BEGIN;
\setrandom id 1 16500000
UPDATE table SET col2 = (clock_timestamp() + '10s'::interval *
random() * :id), col3 = (clock_timestamp() + '10s'::interval *
sin(random() * (2*pi()) ) * 100000) WHERE col1 = :id;
COMMIT;

Thoughts?

--
Amit

Attachment Content-Type Size
pgbench-custom-script-line-continuation.patch application/octet-stream 527 bytes

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allowing line-continuation in pgbench custom scripts
Date: 2014-05-26 13:59:49
Message-ID: CAHGQGwGfywr87qU5FdwdO_9wRBZacNUgpg+F5w1tTWhMuutd8w@mail.gmail.com

On Mon, May 26, 2014 at 6:50 PM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> Hi,
>
> In a custom pgbench script, it seems convenient to be able to split a
> really long query to span multiple lines using an escape character
> (bash-style). Attached adds that capability to read_line_from_file()
> in pgbench.c
>
> For example,
>
> BEGIN;
> \setrandom id 1 16500000
> UPDATE table \
> SET col2 = (clock_timestamp() + '10s'::interval * random() * 1000), \
> col3 = (clock_timestamp() + '10s'::interval * sin(random() * (2*pi()) ) * 1000) \
> WHERE col1 = :id;
> COMMIT;
>
> instead of:
>
> BEGIN;
> \setrandom id 1 16500000
> UPDATE table SET col2 = (clock_timestamp() + '10s'::interval *
> random() * :id), col3 = (clock_timestamp() + '10s'::interval *
> sin(random() * (2*pi()) ) * 100000) WHERE col1 = :id;
> COMMIT;
>
> Thoughts?

IMO it's better if we can write SQL in multiple lines *without* a trailing
escape character, like psql's input file.

Regards,

--
Fujii Masao


From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allowing line-continuation in pgbench custom scripts
Date: 2014-05-26 15:01:58
Message-ID: CA+HiwqF-bRgCTheuZxc5Y747mNHDwS_f-7+5-ZMjrGK_fg+sAQ@mail.gmail.com

On Mon, May 26, 2014 at 10:59 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Mon, May 26, 2014 at 6:50 PM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>> Thoughts?
>
> IMO it's better if we can write SQL in multiple lines *without* a trailing
> escape character, like psql's input file.
>

Yeah, that would be much cleaner.

--
Amit


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allowing line-continuation in pgbench custom scripts
Date: 2014-05-26 15:37:52
Message-ID: 20140526153752.GY7857@eldon.alvh.no-ip.org

Amit Langote wrote:
> On Mon, May 26, 2014 at 10:59 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> > On Mon, May 26, 2014 at 6:50 PM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> >> Thoughts?
> >
> > IMO it's better if we can write SQL in multiple lines *without* a trailing
> > escape character, like psql's input file.
>
> Yeah, that would be much cleaner.

But that would require duplicating the lexing stuff to determine where
quotes are and where commands end. There are already some cases where
pgbench itself is the bottleneck; adding a lexing step would be more
expensive, no? Whereas simply detecting line continuations would be
cheaper.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
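
To make the "lexing stuff" concrete, here is a minimal sketch of what finding the end of a command involves once newlines no longer terminate it. The function name find_statement_end() is hypothetical and is not taken from psql or pgbench. It only tracks single-quoted strings, double-quoted identifiers and "--" comments; dollar quoting, /* */ comments and E'' escapes are deliberately left out, so it is an illustration of the problem rather than a usable lexer.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the offset just past the first ';' that is
 * not inside a quoted string, a quoted identifier or a "--" comment.
 * Returns -1 if no such semicolon exists.
 */
long
find_statement_end(const char *sql)
{
    enum { NORMAL, IN_SQUOTE, IN_DQUOTE, IN_LINE_COMMENT } state = NORMAL;
    size_t      i;

    assert(sql != NULL);

    for (i = 0; sql[i] != '\0'; i++)
    {
        char        c = sql[i];

        switch (state)
        {
            case NORMAL:
                if (c == ';')
                    return (long) (i + 1);
                else if (c == '\'')
                    state = IN_SQUOTE;
                else if (c == '"')
                    state = IN_DQUOTE;
                else if (c == '-' && sql[i + 1] == '-')
                    state = IN_LINE_COMMENT;    /* sql[i + 1] is at worst the NUL */
                break;
            case IN_SQUOTE:
                /* A doubled '' leaves and immediately re-enters the string. */
                if (c == '\'')
                    state = NORMAL;
                break;
            case IN_DQUOTE:
                if (c == '"')
                    state = NORMAL;
                break;
            case IN_LINE_COMMENT:
                if (c == '\n')
                    state = NORMAL;
                break;
        }
    }
    return -1;
}
```

Even this toy version needs four states, and it still splits a dollar-quoted body at the first semicolon inside it. Covering the remaining cases is where the duplication of psql's lexer would start.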


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allowing line-continuation in pgbench custom scripts
Date: 2014-05-26 15:44:31
Message-ID: 26266.1401119071@sss.pgh.pa.us

Amit Langote <amitlangote09(at)gmail(dot)com> writes:
> In a custom pgbench script, it seems convenient to be able to split a
> really long query to span multiple lines using an escape character
> (bash-style). Attached adds that capability to read_line_from_file()
> in pgbench.c

This seems pretty likely to break existing scripts that happen to contain
backslashes. Is it really worth the compatibility risk?

The patch as written has got serious problems even discounting any
compatibility risk: it will be fooled by a backslash near the end of a
bufferload that doesn't end with a newline, and it doesn't allow for
DOS-style newlines (\r\n), and it indexes off the array if the buffer
contains *only* a newline (and, assuming that it fails to crash in that
case, it'd also fail to note a backslash that had been in the previous
bufferload).

regards, tom lane
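
For reference, the three failure modes listed above can be avoided by checking for the backslash on the accumulated line rather than on the buffer that was just read. The following is a sketch under that assumption, not the attached patch and not pgbench's actual read_line_from_file(); the name read_logical_line() and the tiny CHUNK size are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK 16    /* deliberately tiny so lines span several fgets() calls */

/*
 * Hypothetical helper: read one logical line, joining physical lines that
 * end in backslash-newline.  Returns a malloc'd string without the line
 * terminator, or NULL at EOF or on out-of-memory.
 */
char *
read_logical_line(FILE *fp)
{
    char        chunk[CHUNK];
    char       *line = NULL;
    size_t      len = 0;

    assert(fp != NULL);

    while (fgets(chunk, sizeof(chunk), fp) != NULL)
    {
        size_t      n = strlen(chunk);
        char       *tmp = realloc(line, len + n + 1);

        if (tmp == NULL)
        {
            free(line);
            return NULL;
        }
        line = tmp;
        memcpy(line + len, chunk, n + 1);
        len += n;

        /* No newline yet: the buffer filled up mid-line, so keep reading. */
        if (len == 0 || line[len - 1] != '\n')
            continue;

        /* Strip "\n" or "\r\n"; every index below is guarded by len > 0. */
        line[--len] = '\0';
        if (len > 0 && line[len - 1] == '\r')
            line[--len] = '\0';

        /*
         * Test the accumulated line, so a backslash that arrived in an
         * earlier bufferload is still seen when the newline shows up alone.
         */
        if (len > 0 && line[len - 1] == '\\')
        {
            line[--len] = '\0';
            continue;
        }
        return line;
    }
    return line;    /* last line without a newline, or NULL at EOF */
}
```

A backslash that merely happens to sit at the end of a full buffer is left alone because the line has no newline yet, an empty line comes back as an empty string, and the caller is expected to free() each result.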


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allowing line-continuation in pgbench custom scripts
Date: 2014-05-26 15:52:03
Message-ID: 26629.1401119523@sss.pgh.pa.us

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Amit Langote wrote:
>> On Mon, May 26, 2014 at 10:59 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>> IMO it's better if we can write SQL in multiple lines *without* a trailing
>>> escape character, like psql's input file.

>> Yeah, that would be much cleaner.

> But that would require duplicating the lexing stuff to determine where
> quotes are and where commands end. There are already some cases where
> pgbench itself is the bottleneck; adding a lexing step would be more
> expensive, no? Whereas simply detecting line continuations would be
> cheaper.

Well, we only parse the script file(s) once at run start, and that time
isn't included in the TPS timing, so I don't think performance is really
an issue here. But yeah, the amount of code that would have to be
duplicated out of psql is pretty daunting --- it'd be a maintenance
nightmare, for what seems like not a lot of gain. There would also
be a compatibility issue if we went this way, because existing scripts
that haven't bothered with semicolon line terminators would break.

regards, tom lane


From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allowing line-continuation in pgbench custom scripts
Date: 2014-05-26 16:12:50
Message-ID: CA+HiwqEf2UZ9E3h=tmz_JRqDbKKv_k+z76VaYqBRXrXhiqgPhw@mail.gmail.com

On Tue, May 27, 2014 at 12:44 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Amit Langote <amitlangote09(at)gmail(dot)com> writes:
>> In a custom pgbench script, it seems convenient to be able to split a
>> really long query to span multiple lines using an escape character
>> (bash-style). Attached adds that capability to read_line_from_file()
>> in pgbench.c
>
> This seems pretty likely to break existing scripts that happen to contain
> backslashes. Is it really worth the compatibility risk?
>
> The patch as written has got serious problems even discounting any
> compatibility risk: it will be fooled by a backslash near the end of a
> bufferload that doesn't end with a newline, and it doesn't allow for
> DOS-style newlines (\r\n), and it indexes off the array if the buffer
> contains *only* a newline (and, assuming that it fails to crash in that
> case, it'd also fail to note a backslash that had been in the previous
> bufferload).
>

Sorry, the patch was in really bad shape. I should have considered these
points before submitting it.

Even if I dropped the backslash line-continuation idea and used
semicolons as SQL command separators instead, that would not be
worthwhile either, given the compatibility issues mentioned elsewhere
in the thread.

--
Amit


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allowing line-continuation in pgbench custom scripts
Date: 2014-05-26 17:50:03
Message-ID: CAMkU=1wLSMBDY0YoxZH+weMS3Mjm6d1YphPoFi4jjAo10k0Yig@mail.gmail.com

On Monday, May 26, 2014, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Amit Langote <amitlangote09(at)gmail(dot)com> writes:
> > In a custom pgbench script, it seems convenient to be able to split a
> > really long query to span multiple lines using an escape character
> > (bash-style). Attached adds that capability to read_line_from_file()
> > in pgbench.c
>
> This seems pretty likely to break existing scripts that happen to contain
> backslashes. Is it really worth the compatibility risk?
>

Do you mean due to the bugs you point out, or in general? Is it really at
all likely that someone has ended a line of their custom benchmark file
with a backslash? I'm having a hard time seeing what, other than malice,
would prod someone to do that.

Cheers,

Jeff


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allowing line-continuation in pgbench custom scripts
Date: 2014-05-26 17:53:02
Message-ID: 20140526175302.GB11572@awork2.anarazel.de

On 2014-05-26 11:44:31 -0400, Tom Lane wrote:
> Amit Langote <amitlangote09(at)gmail(dot)com> writes:
> > In a custom pgbench script, it seems convenient to be able to split a
> > really long query to span multiple lines using an escape character
> > (bash-style). Attached adds that capability to read_line_from_file()
> > in pgbench.c
>
> This seems pretty likely to break existing scripts that happen to contain
> backslashes. Is it really worth the compatibility risk?

Weaknesses in the implementation aside, I don't think pgbench has to be
held to the same level of compatibility as many of our other
tools. It's just a benchmark tool after all.

I've more than once wished for the capability.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allowing line-continuation in pgbench custom scripts
Date: 2014-05-26 18:15:26
Message-ID: 1159.1401128126@sss.pgh.pa.us

Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> On Monday, May 26, 2014, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> This seems pretty likely to break existing scripts that happen to contain
>> backslashes. Is it really worth the compatibility risk?

> Do you mean due to the bugs you point out, or in general? Is it really at
> all likely that someone has ended a line of their custom benchmark file
> with a backslash? I'm having a hard time seeing what, other than malice,
> would prod someone to do that.

No, I was worried that the feature would pose such a risk even when
correctly implemented. But on reflection, you're right, it seems a bit
hard to credit that any existing script file would have a backslash just
before EOL. There's certainly no SQL syntax in which that could be
valid; perhaps someone would do it in a "--" comment but that seems a
tad far-fetched.

regards, tom lane
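
Anyone who wants to check their own script files rather than rely on intuition could count the lines that end in a backslash. A rough sketch follows, with a made-up function name and the assumption that a trailing "\r" belongs to the line terminator; it does not try to tell comments from SQL.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: count physical lines whose last character before
 * the line terminator ("\n" or "\r\n") is a backslash.  A final line
 * without a newline is counted as well.
 */
int
count_trailing_backslash_lines(FILE *fp)
{
    int         c;
    int         last = 0;       /* most recent character on this line */
    int         before = 0;     /* the character preceding it */
    int         count = 0;

    assert(fp != NULL);

    while ((c = getc(fp)) != EOF)
    {
        if (c == '\n')
        {
            /* Look through a trailing CR so DOS-style files work too. */
            int         end = (last == '\r') ? before : last;

            if (end == '\\')
                count++;
            last = before = 0;
        }
        else
        {
            before = last;
            last = c;
        }
    }

    /* Handle a last line that has no terminating newline. */
    if (((last == '\r') ? before : last) == '\\')
        count++;

    return count;
}
```

A non-zero result for an existing script means its meaning would change under the proposed continuation rule, which is exactly the case being weighed here.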


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allowing line-continuation in pgbench custom scripts
Date: 2014-05-26 18:35:22
Message-ID: 20140526183522.GZ7857@eldon.alvh.no-ip.org

Andres Freund wrote:

> I've more than once wished for the capability.

+1

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allowing line-continuation in pgbench custom scripts
Date: 2014-05-26 21:19:05
Message-ID: 5383AFC9.1060901@agliodbs.com

On 05/26/2014 08:52 AM, Tom Lane wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>> Amit Langote wrote:
>>> On Mon, May 26, 2014 at 10:59 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>>> IMO it's better if we can write SQL in multiple lines *without* a trailing
>>>> escape character, like psql's input file.
>
>>> Yeah, that would be much cleaner.
>
>> But that would require duplicating the lexing stuff to determine where
>> quotes are and where commands end. There are already some cases where
>> pgbench itself is the bottleneck; adding a lexing step would be more
>> expensive, no? Whereas simply detecting line continuations would be
>> cheaper.
>
> Well, we only parse the script file(s) once at run start, and that time
> isn't included in the TPS timing, so I don't think performance is really
> an issue here. But yeah, the amount of code that would have to be
> duplicated out of psql is pretty daunting --- it'd be a maintenance
> nightmare, for what seems like not a lot of gain. There would also
> be a compatibility issue if we went this way, because existing scripts
> that haven't bothered with semicolon line terminators would break.

What if we make using semicolons or not a config option in the file? i.e.:

\multiline

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Christoph Berg <cb(at)df7cb(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allowing line-continuation in pgbench custom scripts
Date: 2014-05-26 21:29:07
Message-ID: 20140526212907.GA24735@msgid.df7cb.de

Re: Tom Lane 2014-05-26 <26629(dot)1401119523(at)sss(dot)pgh(dot)pa(dot)us>
> >> Yeah, that would be much cleaner.
>
> > But that would require duplicating the lexing stuff to determine where
> > quotes are and where commands end. There are already some cases where
> > pgbench itself is the bottleneck; adding a lexing step would be more
> > expensive, no? Whereas simply detecting line continuations would be
> > cheaper.
>
> Well, we only parse the script file(s) once at run start, and that time
> isn't included in the TPS timing, so I don't think performance is really
> an issue here. But yeah, the amount of code that would have to be
> duplicated out of psql is pretty daunting --- it'd be a maintenance
> nightmare, for what seems like not a lot of gain. There would also
> be a compatibility issue if we went this way, because existing scripts
> that haven't bothered with semicolon line terminators would break.

Fwiw, I would love to have some \ line continuation thing also for
.psqlrc. I have some dozen \set in there containing queries for
looking into stats/locks/whatever I can invoke just typing e.g.
:user_tables, and these are pretty hard to edit as they are squeezed
on one line.

I agree that putting an SQL parser into the backslash command parser
is overkill, but there's hardly a chance backslashes at the end of a
backslash command line would break anything, except for meeting what
most people would expect.

Christoph
--
cb(at)df7cb(dot)de | http://www.df7cb.de/


From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allowing line-continuation in pgbench custom scripts
Date: 2014-05-27 00:52:32
Message-ID: CA+HiwqHgWrFYuLBZMimRdzdmU8ViBEq+h+1_trp=Mx97B9q7gQ@mail.gmail.com

On Tue, May 27, 2014 at 6:19 AM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 05/26/2014 08:52 AM, Tom Lane wrote:
>> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>>> Amit Langote wrote:
>>>> On Mon, May 26, 2014 at 10:59 PM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>>>> IMO it's better if we can write SQL in multiple lines *without* a trailing
>>>>> escape character, like psql's input file.
>>
>>>> Yeah, that would be much cleaner.
>>
>>> But that would require duplicating the lexing stuff to determine where
>>> quotes are and where commands end. There are already some cases where
>>> pgbench itself is the bottleneck; adding a lexing step would be more
>>> expensive, no? Whereas simply detecting line continuations would be
>>> cheaper.
>>
>> Well, we only parse the script file(s) once at run start, and that time
>> isn't included in the TPS timing, so I don't think performance is really
>> an issue here. But yeah, the amount of code that would have to be
>> duplicated out of psql is pretty daunting --- it'd be a maintenance
>> nightmare, for what seems like not a lot of gain. There would also
>> be a compatibility issue if we went this way, because existing scripts
>> that haven't bothered with semicolon line terminators would break.
>
> What if we make using semicolons or not a config option in the file? i.e.:
>
> \multiline
>
>

And perhaps make 'off' the default, if I understand correctly?

It would apply only to the SQL commands though, no?

--
Amit