Re: COPY and Volatile default expressions

Lists: pgsql-hackers
From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: COPY and Volatile default expressions
Date: 2013-04-15 14:00:34
Message-ID: CA+U5nMKyXfbUTFLT4Y5tq44dN_k3bFmEvkf=2O7SN1Pm-KTnyQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

COPY cannot be optimised correctly if we have before triggers or
volatile default expressions.

The multi-insert code detects those cases and falls back to the single
row mechanism in those cases.

There a common class of volatile functions that wouldn't cause
problems: any volatile function that doesn't touch the table being
loaded and still works correctly when called with alternately ordered
data.

I claim this is a common class, since sequence next_val functions and
uuid generators meet that criteria and most common forms of auditing
trigger, as well as any other form of data-reformatting trigger. Since
this is a common case, it seems worth optimising.

What I'd like to do is to invent a new form of labelling that allows
us to understand that COPY can still be optimised. I'm thinking to add
a new function label, something like one of
* IDEMPOTENT
* ORDER INDEPENDENT
* BATCHABLE
* NON SELF REFERENCING
* GO FASTER DAMMIT
etc

I'm sure many people will have a much more exact description and a
better name than I do.

This becomes more important when we think about parallelising SQL,
since essentially the same problem exists with parallel SQL calling
volatile functions. Oracle handles that by having a pragma to allow a
function to be declared as parallel safe.

I was also thinking that the best way to do this would be to invent a
new flexible function labelling scheme, so use something like hstore
to store a list of function attributes. Something that would mean we
don't have to invent new keywords every time we have a new function
label.

Suggestions please.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: David Fetter <david(at)fetter(dot)org>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 15:24:27
Message-ID: 20130415152427.GH5337@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Apr 15, 2013 at 03:00:34PM +0100, Simon Riggs wrote:
> COPY cannot be optimised correctly if we have before triggers or
> volatile default expressions.
>
> The multi-insert code detects those cases and falls back to the
> single row mechanism in those cases.
>
> There a common class of volatile functions that wouldn't cause
> problems: any volatile function that doesn't touch the table being
> loaded and still works correctly when called with alternately
> ordered data.

"Doesn't touch already-existing rows?" Makes a lot of sense :)

> I claim this is a common class, since sequence next_val functions and
> uuid generators meet that criteria and most common forms of auditing
> trigger, as well as any other form of data-reformatting trigger. Since
> this is a common case, it seems worth optimising.

Do you have numbers on this, or ways to gather same? In other words,
how do we know what resources (time, CPU cycles, disk seeks, etc.) are
being consumed here?

> What I'd like to do is to invent a new form of labelling that allows
> us to understand that COPY can still be optimised. I'm thinking to add
> a new function label, something like one of
> * IDEMPOTENT
> * ORDER INDEPENDENT
> * BATCHABLE
> * NON SELF REFERENCING
> * GO FASTER DAMMIT
> etc
>
> I'm sure many people will have a much more exact description and a
> better name than I do.
>
> This becomes more important when we think about parallelising SQL,
> since essentially the same problem exists with parallel SQL calling
> volatile functions. Oracle handles that by having a pragma to allow a
> function to be declared as parallel safe.

What happens when you misinform Oracle about this? Does it attempt to
check? More importantly, what *should* happen?

> I was also thinking that the best way to do this would be to invent a
> new flexible function labelling scheme, so use something like hstore
> to store a list of function attributes. Something that would mean we
> don't have to invent new keywords every time we have a new function
> label.
>
> Suggestions please.

JSON's in core. How about using that?

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 15:41:00
Message-ID: 516C1F8C.2000903@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 15.04.2013 17:00, Simon Riggs wrote:
> COPY cannot be optimised correctly if we have before triggers or
> volatile default expressions.
>
> The multi-insert code detects those cases and falls back to the single
> row mechanism in those cases.
>
> There a common class of volatile functions that wouldn't cause
> problems: any volatile function that doesn't touch the table being
> loaded and still works correctly when called with alternately ordered
> data.
>
> I claim this is a common class, since sequence next_val functions and
> uuid generators meet that criteria and most common forms of auditing
> trigger, as well as any other form of data-reformatting trigger. Since
> this is a common case, it seems worth optimising.
>
> What I'd like to do is to invent a new form of labelling that allows
> us to understand that COPY can still be optimised.

It would be even nicer to detect at runtime, when a default expression
or before trigger tries to access the same table. When that happens, we
could immediately flush all the tuples buffered that far to disk, so
that they are visible to the expression, and then proceed with it.

- Heikki


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 15:49:10
Message-ID: CA+U5nMJT5scKzFo6DQuvXqjRoc0LKOHBhSucp385ALZnGBZ_Ag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 15 April 2013 16:24, David Fetter <david(at)fetter(dot)org> wrote:

>> I claim this is a common class, since sequence next_val functions and
>> uuid generators meet that criteria and most common forms of auditing
>> trigger, as well as any other form of data-reformatting trigger. Since
>> this is a common case, it seems worth optimising.
>
> Do you have numbers on this, or ways to gather same? In other words,
> how do we know what resources (time, CPU cycles, disk seeks, etc.) are
> being consumed here?

The multi-insert optimisation for COPY is already there and works well
enough to have been committed.

All we have to do to allow it to be used is to persuade COPY that come
kinds of volatile function need not prevent the optimisation. So once
we have a mechanism for appropriately labelling a function, it will be
a one-line change in copy.c to enable the optimisation.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 15:49:42
Message-ID: 15279.1366040982@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> COPY cannot be optimised correctly if we have before triggers or
> volatile default expressions.

> The multi-insert code detects those cases and falls back to the single
> row mechanism in those cases.

> There a common class of volatile functions that wouldn't cause
> problems: any volatile function that doesn't touch the table being
> loaded and still works correctly when called with alternately ordered
> data.

> I claim this is a common class, since sequence next_val functions and
> uuid generators meet that criteria and most common forms of auditing
> trigger, as well as any other form of data-reformatting trigger.

I don't believe that it's a good idea to consider nextval() to be
reorderable, so I'm not convinced by your argument here.

> What I'd like to do is to invent a new form of labelling that allows
> us to understand that COPY can still be optimised.

And I don't want to invent impossible-to-verify function attributes with
such a tiny use-case as this.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 15:53:13
Message-ID: CA+U5nMK9FTVt3KEau7YOyOEfaENhzWV+M2XDnf=atSoqBzbzUw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 15 April 2013 16:41, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:

>> What I'd like to do is to invent a new form of labelling that allows
>> us to understand that COPY can still be optimised.
>
> It would be even nicer to detect at runtime, when a default expression or
> before trigger tries to access the same table. When that happens, we could
> immediately flush all the tuples buffered that far to disk, so that they are
> visible to the expression, and then proceed with it.

Maybe, but do we really want an extra test every time we access a
table? And if we did that, how would we pass control back to the COPY
command to allow it flush the buffer before continuing with the
function?

How would we cope with times when a subtle problem makes a function
unusable, even though it passes automatic detection?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: David Fetter <david(at)fetter(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 15:55:06
Message-ID: 15416.1366041306@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> On 15 April 2013 16:24, David Fetter <david(at)fetter(dot)org> wrote:
>> Do you have numbers on this, or ways to gather same? In other words,
>> how do we know what resources (time, CPU cycles, disk seeks, etc.) are
>> being consumed here?

> The multi-insert optimisation for COPY is already there and works well
> enough to have been committed.

You seem to not have answered the question. Exactly what sort of
performance gain might be possible, bearing in mind that anything that
invokes a trigger (for instance) is unlikely to be amazingly fast
anyway?

regards, tom lane


From: David Fetter <david(at)fetter(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 16:03:59
Message-ID: 20130415160359.GK5337@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Apr 15, 2013 at 11:49:42AM -0400, Tom Lane wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> > COPY cannot be optimised correctly if we have before triggers or
> > volatile default expressions.
>
> > The multi-insert code detects those cases and falls back to the single
> > row mechanism in those cases.
>
> > There a common class of volatile functions that wouldn't cause
> > problems: any volatile function that doesn't touch the table being
> > loaded and still works correctly when called with alternately ordered
> > data.
>
> > I claim this is a common class, since sequence next_val functions and
> > uuid generators meet that criteria and most common forms of auditing
> > trigger, as well as any other form of data-reformatting trigger.
>
> I don't believe that it's a good idea to consider nextval() to be
> reorderable, so I'm not convinced by your argument here.

We tell people very clearly in the docs and elsewhere that nextval()
guarantees uniqueness and very specifically not ordering, so with
greatest respect, I agree with Simon on its reorderability.

> > What I'd like to do is to invent a new form of labelling that
> > allows us to understand that COPY can still be optimised.
>
> And I don't want to invent impossible-to-verify function attributes
> with such a tiny use-case as this.

Are you referring to the Halting Problem?

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Fetter <david(at)fetter(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 16:04:16
Message-ID: CA+U5nM+To5yv1m+pgT1TG-4yz1mKXNxR0CbLr8fTmrqGshVKdA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 15 April 2013 16:55, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>> On 15 April 2013 16:24, David Fetter <david(at)fetter(dot)org> wrote:
>>> Do you have numbers on this, or ways to gather same? In other words,
>>> how do we know what resources (time, CPU cycles, disk seeks, etc.) are
>>> being consumed here?
>
>> The multi-insert optimisation for COPY is already there and works well
>> enough to have been committed.
>
> You seem to not have answered the question. Exactly what sort of
> performance gain might be possible, bearing in mind that anything that
> invokes a trigger (for instance) is unlikely to be amazingly fast
> anyway?

Forgive me, I assumed the list would be familiar with the optimization
and so be excited by the need for this.

I will implement as a kluge, test and report the results.

Loading data into a table with a SERIAL or UUID column is the main use
case, so I'll measure that.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: David Fetter <david(at)fetter(dot)org>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 16:08:04
Message-ID: 20130415160804.GL5337@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Apr 15, 2013 at 05:04:16PM +0100, Simon Riggs wrote:
> On 15 April 2013 16:55, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> >> On 15 April 2013 16:24, David Fetter <david(at)fetter(dot)org> wrote:
> >>> Do you have numbers on this, or ways to gather same? In other
> >>> words, how do we know what resources (time, CPU cycles, disk
> >>> seeks, etc.) are being consumed here?
> >
> >> The multi-insert optimisation for COPY is already there and works
> >> well enough to have been committed.
> >
> > You seem to not have answered the question. Exactly what sort of
> > performance gain might be possible, bearing in mind that anything
> > that invokes a trigger (for instance) is unlikely to be amazingly
> > fast anyway?
>
> Forgive me, I assumed the list would be familiar with the
> optimization and so be excited by the need for this.
>
> I will implement as a kluge, test and report the results.
>
> Loading data into a table with a SERIAL or UUID column is the main
> use case, so I'll measure that.

The former is common enough a use case to optimize specifically,
should the numbers come out right. Do you suppose that an in-core
UUID generator would help the latter make more sense as a part of the
same use case?

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 16:20:05
Message-ID: CA+U5nM+12=8BPZMubkcSSbYmXaAGZNNJK99+wUZ3E-9-Aduxrw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 15 April 2013 17:08, David Fetter <david(at)fetter(dot)org> wrote:

>> Loading data into a table with a SERIAL or UUID column is the main
>> use case, so I'll measure that.
>
> The former is common enough a use case to optimize specifically,
> should the numbers come out right. Do you suppose that an in-core
> UUID generator would help the latter make more sense as a part of the
> same use case?

Only if some form of labelling becomes an issue, but I'm ever hopeful.
Let's wait for the test results now.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Fetter <david(at)fetter(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 16:24:48
Message-ID: 516C29D0.20903@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 04/15/2013 06:04 PM, Simon Riggs wrote:
> On 15 April 2013 16:55, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
>>> On 15 April 2013 16:24, David Fetter <david(at)fetter(dot)org> wrote:
>>>> Do you have numbers on this, or ways to gather same? In other words,
>>>> how do we know what resources (time, CPU cycles, disk seeks, etc.) are
>>>> being consumed here?
>>> The multi-insert optimisation for COPY is already there and works well
>>> enough to have been committed.
>> You seem to not have answered the question. Exactly what sort of
>> performance gain might be possible, bearing in mind that anything that
>> invokes a trigger (for instance) is unlikely to be amazingly fast
>> anyway?
> Forgive me, I assumed the list would be familiar with the optimization
> and so be excited by the need for this.
>
> I will implement as a kluge, test and report the results.
Would just declaring nextval() to be a stable function be a good test ?

Hannu
>
> Loading data into a table with a SERIAL or UUID column is the main use
> case, so I'll measure that.
>
> --
> Simon Riggs http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Fetter <david(at)fetter(dot)org>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 16:35:32
Message-ID: 2292.1366043732@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

David Fetter <david(at)fetter(dot)org> writes:
> On Mon, Apr 15, 2013 at 05:04:16PM +0100, Simon Riggs wrote:
>> Loading data into a table with a SERIAL or UUID column is the main
>> use case, so I'll measure that.

> The former is common enough a use case to optimize specifically,
> should the numbers come out right.

Yeah. TBH I would rather see a special-case hack in the COPY code to
accept nextval() than expose anything as dirty and special-purpose as
this proposed flag to users. But in any case, I don't believe that
adequate evidence has been offered to show that we should do anything
at all here.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Fetter <david(at)fetter(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 17:30:55
Message-ID: CA+U5nMKyEJgRKt+CJkDK_7oUAM4OxdMZvpMZo8axij_dYvrVgg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 15 April 2013 17:04, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> I will implement as a kluge, test and report the results.

Test is COPY 1 million rows on a table with 2 columns, both bigint.
Verified no checkpoints triggered during load.
No other work active on database, tests condicted on laptop
Autovacuum disabled.
Results from multiple runs, outliers excluded, rough averages

HEAD
COPY, with sequence ~5500ms
COPY, with sequence, cached ~5000ms
COPY, no sequence ~1600ms

PATCH to allow sequences to use multi-insert optimisation (1 line change)
COPY, with sequence ~1850ms
COPY, with sequence, cached ~1750ms
COPY, no sequence ~1600ms

This shows that
* cacheing the sequence gives a useful improvement currently
* use of multi-insert optimisaton is very important

Proposals
* set CACHE 100 on automatically created SERIAL sequences
* allow some way to use multi-insert optimisation when default expr is
next_val on a sequence

Tests performed without indexes since this is another area of known
performance issues that I hope to cover later. Zero indexes is not
real, but we're trying to measure the effect and benefit of an
isolated change, so in this case it is appropriate.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: David Fetter <david(at)fetter(dot)org>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 17:41:27
Message-ID: 20130415174127.GD19333@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Apr 15, 2013 at 06:30:55PM +0100, Simon Riggs wrote:
> On 15 April 2013 17:04, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> > I will implement as a kluge, test and report the results.
>
> Test is COPY 1 million rows on a table with 2 columns, both bigint.
> Verified no checkpoints triggered during load.
> No other work active on database, tests condicted on laptop
> Autovacuum disabled.
> Results from multiple runs, outliers excluded, rough averages
>
> HEAD
> COPY, with sequence ~5500ms
> COPY, with sequence, cached ~5000ms
> COPY, no sequence ~1600ms
>
> PATCH to allow sequences to use multi-insert optimisation (1 line change)
> COPY, with sequence ~1850ms
> COPY, with sequence, cached ~1750ms
> COPY, no sequence ~1600ms
>
> This shows that
> * cacheing the sequence gives a useful improvement currently
> * use of multi-insert optimisaton is very important
>
> Proposals
> * set CACHE 100 on automatically created SERIAL sequences
> * allow some way to use multi-insert optimisation when default expr is
> next_val on a sequence
>
> Tests performed without indexes since this is another area of known
> performance issues that I hope to cover later. Zero indexes is not
> real, but we're trying to measure the effect and benefit of an
> isolated change, so in this case it is appropriate.

The difference between HEAD and patch in the "COPY, with sequence"
case is pretty remarkable. What's the patch?

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 18:04:55
Message-ID: CA+U5nMJOXi40tDJmf2-niHu+VcM-8-yOVBuAB9Wwt6ii94krXQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 15 April 2013 18:41, David Fetter <david(at)fetter(dot)org> wrote:

> The difference between HEAD and patch in the "COPY, with sequence"
> case is pretty remarkable. What's the patch?

Attached.

This is usable only for this test. It is not anywhere remotely close
to being applied.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
copy_test_kluge.v1.patch application/octet-stream 551 bytes

From: David Fetter <david(at)fetter(dot)org>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 18:27:37
Message-ID: 20130415182737.GF19333@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Apr 15, 2013 at 07:04:55PM +0100, Simon Riggs wrote:
> On 15 April 2013 18:41, David Fetter <david(at)fetter(dot)org> wrote:
>
> > The difference between HEAD and patch in the "COPY, with sequence"
> > case is pretty remarkable. What's the patch?
>
> Attached.

Thanks! :)

> This is usable only for this test. It is not anywhere remotely close
> to being applied.

Of course.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 19:52:21
Message-ID: CA+TgmoYunLHaD66QX2J58YQmj58RFuOMDzp-a_guMzfbbPP3=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Apr 15, 2013 at 11:49 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I claim this is a common class, since sequence next_val functions and
>> uuid generators meet that criteria and most common forms of auditing
>> trigger, as well as any other form of data-reformatting trigger.
>
> I don't believe that it's a good idea to consider nextval() to be
> reorderable, so I'm not convinced by your argument here.

Why not?

I admit that I can't convince myself that it's safe. But I can't
think of a concrete example where it breaks anything, either.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 20:21:31
Message-ID: 7870.1366057291@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, Apr 15, 2013 at 11:49 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I don't believe that it's a good idea to consider nextval() to be
>> reorderable, so I'm not convinced by your argument here.

> Why not?

> I admit that I can't convince myself that it's safe. But I can't
> think of a concrete example where it breaks anything, either.

I think plenty of people would be upset if row serial numbers assigned
with nextval() were not assigned in the order of the incoming rows.
The argument that you can get gaps in the sequence in some corner cases
(none of which apply within a single COPY operation, btw) doesn't
entitle us to violate the POLA to that extent.

After looking again at the code Simon is concerned about, though,
whether we are willing to allow volatile function calls to be reordered
has approximately nothing to do with this COPY optimization. Rather,
the thing that makes it safe is that nextval() isn't going to look at
the COPY target table, and thus whether or not the previous rows have
been physically inserted isn't important. The same goes for the UUID
example. So I think he's done himself a disservice by talking about
reordering and bringing up the question of parallel queries. What we
ought to be thinking about is how we can be certain that a function call
isn't going to look at the uncommitted table rows; no more and no less.

In this context, I think we could do a lot worse than to special-case
nextval(), because it's hard to see a really principled function
attribute definition that would admit it here. It does look at, and
even modify, uncommitted database state. We know it's safe because a
sequence relation couldn't be the target of COPY ... but that reasoning
doesn't fit nicely into anything I think we'd want to expose to users.

OTOH, the notion that a UUID generator doesn't touch *any* database
state seems like it might be worth treating as a general function
property: it's simple to understand and applies to a lot of other
volatile functions such as random() and clock_timestamp().

regards, tom lane


From: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 20:32:29
Message-ID: CAJKUy5jYD+kwBA2oaA9WHdb5QFy3JSYFWkeZUv3cMG9qhWfgCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Apr 15, 2013 at 3:21 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> OTOH, the notion that a UUID generator doesn't touch *any* database
> state seems like it might be worth treating as a general function
> property: it's simple to understand and applies to a lot of other
> volatile functions such as random() and clock_timestamp().
>

Something like the NO SQL access indication mandated by sql standard?

http://www.postgresql.org/message-id/1267473390.7837.9.camel@vanquo.pezone.net

--
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 20:34:14
Message-ID: CA+U5nMLKhqOr8DXMp8QCOvHYEyEouuJQqEX-1z1cWX2Ym0FCmg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 15 April 2013 20:52, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Apr 15, 2013 at 11:49 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> I claim this is a common class, since sequence next_val functions and
>>> uuid generators meet that criteria and most common forms of auditing
>>> trigger, as well as any other form of data-reformatting trigger.
>>
>> I don't believe that it's a good idea to consider nextval() to be
>> reorderable, so I'm not convinced by your argument here.
>
> Why not?
>
> I admit that I can't convince myself that it's safe. But I can't
> think of a concrete example where it breaks anything, either.

I'm not sure exactly if the corect label is "reorderable" or some
other word that describes this specific situation. "Parallel
independent" probably would work, since if we had a DML statement that
worked in 2 separate processes we'd probably want to know that the
rows in each could be divided up without changing the result.

It looks straightforward to put in a special case check for sequences,
which is the most important use case. Sequences are a recognised
database object, so such a special case could be justified... shame
about the other use cases though, e.g. UUIDs.

I guess we can generalise it when we have a better idea of whether
this does indeed make a useful generalisation.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 20:35:02
Message-ID: CA+TgmobLT0XWJJ8tEiP+AzVE4aBrFxSL2qQ2cgGBG5QhZPWjRQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Apr 15, 2013 at 4:21 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I think plenty of people would be upset if row serial numbers assigned
> with nextval() were not assigned in the order of the incoming rows.
> The argument that you can get gaps in the sequence in some corner cases
> (none of which apply within a single COPY operation, btw) doesn't
> entitle us to violate the POLA to that extent.

I agree.

> After looking again at the code Simon is concerned about, though,
> whether we are willing to allow volatile function calls to be reordered
> has approximately nothing to do with this COPY optimization. Rather,
> the thing that makes it safe is that nextval() isn't going to look at
> the COPY target table, and thus whether or not the previous rows have
> been physically inserted isn't important. The same goes for the UUID
> example. So I think he's done himself a disservice by talking about
> reordering and bringing up the question of parallel queries. What we
> ought to be thinking about is how we can be certain that a function call
> isn't going to look at the uncommitted table rows; no more and no less.

Yep.

> In this context, I think we could do a lot worse than to special-case
> nextval(), because it's hard to see a really principled function
> attribute definition that would admit it here. It does look at, and
> even modify, uncommitted database state. We know it's safe because a
> sequence relation couldn't be the target of COPY ... but that reasoning
> doesn't fit nicely into anything I think we'd want to expose to users.
>
> OTOH, the notion that a UUID generator doesn't touch *any* database
> state seems like it might be worth treating as a general function
> property: it's simple to understand and applies to a lot of other
> volatile functions such as random() and clock_timestamp().

I think that's right; and I also think that's something that could be
useful in other contexts. It also has the advantage of being a
*checkable* property. That is, if a function is marked as changing no
database state, we can set a flag on entry into that context and clear
the flag on exit. If in the middle any attempt is made to change the
database state, then we can throw an error. This would be pretty good
insurance against the possibility that future optimizations based
around that flag would cause behavioral differences.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-15 20:53:51
Message-ID: CA+U5nMJbmCw+dR5AEpr2M7vPkmVtBK2DKXOGMbD0BOKfn7n0og@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 15 April 2013 21:32, Jaime Casanova <jaime(at)2ndquadrant(dot)com> wrote:
> On Mon, Apr 15, 2013 at 3:21 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>> OTOH, the notion that a UUID generator doesn't touch *any* database
>> state seems like it might be worth treating as a general function
>> property: it's simple to understand and applies to a lot of other
>> volatile functions such as random() and clock_timestamp().
>>
>
> Something like the NO SQL access indication mandated by sql standard?
>
> http://www.postgresql.org/message-id/1267473390.7837.9.camel@vanquo.pezone.net

That would work for UUIDs, random() etc but not for sequences.

So I'll treat this as two separate cases:
* add special case for sequences
* use the NO SQL mechanism, as described, which implies no reads or
writes of database state. We could test that, but its somewhat harder
and we'd need to test for that on entry to any function, which I don't
much like.

Default to current timestamp is also a common use case - thanks for
mentioning that.

Doing it tha way Tatsuo would be able to parse functions more easily
as requested in the linked post.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Jaime Casanova <jaime(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-16 11:38:16
Message-ID: CA+U5nM+P_MMK7X4CqAszqdrgHLvAZpNK1dS9=yp2T6upn9E9qg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 15 April 2013 21:53, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:

> So I'll treat this as two separate cases:
> * add special case for sequences

Patch attached.

> * use the NO SQL mechanism, as described, which implies no reads or
> writes of database state. We could test that, but its somewhat harder
> and we'd need to test for that on entry to any function, which I don't
> much like.

For another time.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
copy_tuning_with_default_nextval.v1.patch application/octet-stream 6.5 KB

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-16 12:57:15
Message-ID: 516D4AAB.3000809@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 16.04.2013 14:38, Simon Riggs wrote:
> On 15 April 2013 21:53, Simon Riggs<simon(at)2ndquadrant(dot)com> wrote:
>
>> So I'll treat this as two separate cases:
>> * add special case for sequences
>
> Patch attached.
>
> + if (IsA(node, FuncExpr))
> + {
> + FuncExpr *expr = (FuncExpr *) node;
> +
> + /*
> + * For this case only, we want to ignore the volatility of the
> + * nextval() function, since some callers want this. nextval()
> + * has no other args other than sequence name, so we can just
> + * return false immediately in this case.
> + */
> + if (expr->funcid == F_NEXTVAL_OID)
> + return false;
> + else if (func_volatile(expr->funcid) == PROVOLATILE_VOLATILE)
> + return true;
> + /* else fall through to check args */
> + } ...

You still need to check the args, if the function is nextval, otherwise
you incorrectly perform the optimization for something like
"nextval(myvolatilefunc())".

- Heikki


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-16 13:37:33
Message-ID: CA+U5nMKANoSJsZAwY69VLNdm-apobw8Opt+1XLWrnM=xSe4t8Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 16 April 2013 13:57, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:

> You still need to check the args, if the function is nextval, otherwise you
> incorrectly perform the optimization for something like
> "nextval(myvolatilefunc())".

Guess so. At least its an easy change.

Thanks for checking.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
copy_tuning_with_default_nextval.v2.patch application/octet-stream 6.3 KB

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-16 13:58:17
Message-ID: 20130416135817.GG5343@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Apr 16, 2013 at 02:37:33PM +0100, Simon Riggs wrote:
> On 16 April 2013 13:57, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:
>
> > You still need to check the args, if the function is nextval, otherwise you
> > incorrectly perform the optimization for something like
> > "nextval(myvolatilefunc())".
>
> Guess so. At least its an easy change.

I found Simon's nextval()/COPY timings without this patch sobering. I
assume he can apply this for 9.3, right? I believe it is a fix for a
new 9.3 feature.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Jaime Casanova <jaime(at)2ndQuadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-16 14:07:07
Message-ID: 29624.1366121227@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> I found Simon's nextval()/COPY timings without this patch sobering. I
> assume he can apply this for 9.3, right? I believe it is a fix for a
> new 9.3 feature.

It is not a "fix", it is not for a 9.3 feature (the multi-insert thing
went in in 9.2), and personally I'd vote against applying it now,
especially if Simon is expecting anybody to review it.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Jaime Casanova <jaime(at)2ndQuadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-16 14:10:37
Message-ID: 20130416141037.GH5343@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Apr 16, 2013 at 10:07:07AM -0400, Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > I found Simon's nextval()/COPY timings without this patch sobering. I
> > assume he can apply this for 9.3, right? I believe it is a fix for a
> > new 9.3 feature.
>
> It is not a "fix", it is not for a 9.3 feature (the multi-insert thing
> went in in 9.2), and personally I'd vote against applying it now,
> especially if Simon is expecting anybody to review it.

Oh, I thought multi-insert was 9.3.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-04-16 15:43:59
Message-ID: CA+U5nMLOA7EADwMjpS+OoWpvJQNPDYVaexVGY-kY6DFkL8t_pA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 16 April 2013 15:07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>> I found Simon's nextval()/COPY timings without this patch sobering. I
>> assume he can apply this for 9.3, right? I believe it is a fix for a
>> new 9.3 feature.
>
> It is not a "fix", it is not for a 9.3 feature (the multi-insert thing
> went in in 9.2), and personally I'd vote against applying it now,
> especially if Simon is expecting anybody to review it.

I wrote this expecting it to be a 9.4 feature. I'm good with that.

There are other somewhat related optimisations also.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-06-24 09:21:22
Message-ID: CADyhKSWXCJnWBfOYjjj2X3uowUmo63nUHrsWYbh+7VH9ycnyaQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Simon,

I checked this patch. One thing I could comment on is, do you think it is a good
idea to have oid of exception function list on
contain_volatile_functions_walker()?

The walker function is static thus here is no impact for other caller, and its
"context" argument is unused.
My proposition is to enhance 2nd argument of contain_volatile_functions_walker()
to deliver list of exceptional functions, then
contain_volatile_functions_not_nextval()
calls contain_volatile_functions_walker() with list_make1_oid(F_NEXTVAL_OID) to
handle nextval() as exception.
Otherwise, all we need to do is put NIL as 2nd argument.

It kills code duplication and reduces future maintenance burden.
How about your opinion?

Thanks,

2013/4/16 Simon Riggs <simon(at)2ndquadrant(dot)com>:
> On 16 April 2013 13:57, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:
>
>> You still need to check the args, if the function is nextval, otherwise you
>> incorrectly perform the optimization for something like
>> "nextval(myvolatilefunc())".
>
> Guess so. At least its an easy change.
>
> Thanks for checking.
>
> --
> Simon Riggs http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2013-06-29 00:14:37
Message-ID: CA+U5nMKSiiB2NS_PZUKbRSGQdr5f251tGTFUrt0=W-PNmXN2KQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 24 June 2013 10:21, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:

> Hi Simon,
>
> I checked this patch. One thing I could comment on is, do you think it is
> a good
> idea to have oid of exception function list on
> contain_volatile_functions_walker()?
>
> The walker function is static thus here is no impact for other caller, and
> its
> "context" argument is unused.
> My proposition is to enhance 2nd argument of
> contain_volatile_functions_walker()
> to deliver list of exceptional functions, then
> contain_volatile_functions_not_nextval()
> calls contain_volatile_functions_walker() with
> list_make1_oid(F_NEXTVAL_OID) to
> handle nextval() as exception.
> Otherwise, all we need to do is put NIL as 2nd argument.
>
> It kills code duplication and reduces future maintenance burden.
> How about your opinion?
>

That approach is more flexible than the one in the patch, I agree.

Ultimately, I see this as a choice between a special purpose piece of code
(as originally supplied) and a much more generic facility for labelling
functions as to whether they contain SQL or not, per the SQL standard as
Jaime suggests. There's not much mileage in something in between.

So I'm mid way through updating the patch to implement the generic facility.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY and Volatile default expressions
Date: 2014-01-15 15:47:45
Message-ID: CA+U5nM+6b_fT=FeW-SgwcOJsOrxWevAbeG4wYW0RP6N4L6+Frg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 16 April 2013 14:37, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 16 April 2013 13:57, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:
>
>> You still need to check the args, if the function is nextval, otherwise you
>> incorrectly perform the optimization for something like
>> "nextval(myvolatilefunc())".
>
> Guess so. At least its an easy change.
>
> Thanks for checking.

Rebased v3

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
copy_tuning_with_default_nextval.v3.patch application/octet-stream 6.3 KB