Quick Links

Re: [HACKERS] 8.2 features?

Lists:	pgsql-docspgsql-hackerspgsql-patches

From:	Andrew Dunstan <andrew(at)dunslane(dot)net>
To:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	8.2 features?
Date:	2006-07-13 18:56:56
Message-ID:	44B69778.5050104@dunslane.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

What is the state of the following items that have been previously
discussed?

. MERGE (at least in PK case)
. multiple values clauses for INSERT
. recursive WITH queries

Thanks

andrew

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: 8.2 features?
Date:	2006-07-13 19:32:41
Message-ID:	18897.1152819161@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> What is the state of the following items that have been previously
> discussed?

> . MERGE (at least in PK case)

No submitted patch; no one working on it AFAIK; doesn't look like
something that could get done in the next three weeks.

> . multiple values clauses for INSERT

Also not done, but if we are willing to accept a limited patch
(ie, not necessarily everything that SQL92 says you can do with
VALUES, but at least the INSERT case) I think it could get done.
I might even volunteer to do it ... but won't object if someone
else volunteers to.

> . recursive WITH queries

I believe Jonah has given up on fixing the originally-submitted
patch, but he mentioned at the code sprint that non-recursive
WITH is potentially doable in time for 8.2. Not sure if that's
a sufficiently important case to be worth doing.

regards, tom lane

From:	"Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Andrew Dunstan" <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: 8.2 features?
Date:	2006-07-13 19:42:17
Message-ID:	36e682920607131242y36446c3cvec7a6d1203043f83@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On 7/13/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > . recursive WITH queries
>
> I believe Jonah has given up on fixing the originally-submitted
> patch, but he mentioned at the code sprint that non-recursive
> WITH is potentially doable in time for 8.2. Not sure if that's
> a sufficiently important case to be worth doing.

A working WITH clause which will work in most usual use-cases will be submitted.

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation | fax: 732.331.1301
33 Wood Ave S, 2nd Floor | jharris(at)enterprisedb(dot)com
Iselin, New Jersey 08830 | http://www.enterprisedb.com/

From:	Andrew Dunstan <andrew(at)dunslane(dot)net>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: 8.2 features?
Date:	2006-07-13 21:02:52
Message-ID:	44B6B4FC.9030002@dunslane.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:

>>. multiple values clauses for INSERT
>>
>>
>
>Also not done, but if we are willing to accept a limited patch
>(ie, not necessarily everything that SQL92 says you can do with
>VALUES, but at least the INSERT case) I think it could get done.
>I might even volunteer to do it ... but won't object if someone
>else volunteers to.
>
>
>

I would be very happy to see it accepted.

cheers

andrew

From:	"Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
To:	"Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: 8.2 features?
Date:	2006-07-13 21:09:32
Message-ID:	36e682920607131409x2326f2fane5931f37247a3228@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On 7/13/06, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> >>. multiple values clauses for INSERT
>
> I would be very happy to see it accepted.

Same here.

From:	David Fetter <david(at)fetter(dot)org>
To:	"Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
Cc:	Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: 8.2 features?
Date:	2006-07-13 22:11:38
Message-ID:	20060713221138.GP7361@fetter.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On Thu, Jul 13, 2006 at 05:09:32PM -0400, Jonah H. Harris wrote:
> On 7/13/06, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> >>>. multiple values clauses for INSERT
> >
> >I would be very happy to see it accepted.
>
> Same here.

Cheers,
D
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666
Skype: davidfetter

Remember to vote!

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: 8.2 features?
Date:	2006-07-13 22:12:11
Message-ID:	44B6C53B.7060502@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>
>>What is the state of the following items that have been previously
>>discussed?
>>. multiple values clauses for INSERT
>
> Also not done, but if we are willing to accept a limited patch
> (ie, not necessarily everything that SQL92 says you can do with
> VALUES, but at least the INSERT case) I think it could get done.
> I might even volunteer to do it ... but won't object if someone
> else volunteers to.

I'm looking to contribute something useful for the 8.2 release, and it
seems Bernd is going to finish up updateable views himself, so I'd be
glad to take a look (at the limited case, that is). Any landmines I
should watch out for?

Joe

From:	"Peter Eisentraut" <peter_e(at)gmx(dot)net>
To:	"Andrew Dunstan" <andrew(at)dunslane(dot)net>
Cc:	"PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>, "Susanne Ebrecht" <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: 8.2 features?
Date:	2006-07-13 23:23:11
Message-ID:	50461.84.63.113.55.1152832991.squirrel@pezone.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Andrew Dunstan wrote:
> . MERGE (at least in PK case)

I think that died after we figured out that it didn't do the sort of
UPDATE-else-INSERT thing that people wanted out of it.

> . multiple values clauses for INSERT

Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de> was last heard to work on
it. Updates, Susanne?

From:	Stephen Frost <sfrost(at)snowman(dot)net>
To:	Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: 8.2 features?
Date:	2006-07-14 01:30:26
Message-ID:	20060714013026.GH17269@kenobi.snowman.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

* Peter Eisentraut (peter_e(at)gmx(dot)net) wrote:
> Andrew Dunstan wrote:
> > . MERGE (at least in PK case)
>
> I think that died after we figured out that it didn't do the sort of
> UPDATE-else-INSERT thing that people wanted out of it.

I agree that it's probably not going to happen for 8.2 but I certainly
have uses for the SQL spec's definition of MERGE (table-level instead of
the individual-tuple case). I'd like to see the individual-tuple
UPSERT/REPLACE issue handled as well but I don't believe MERGE lacking
that necessairly means MERGE should be ignored..

Thanks,

Stephen

From:	"Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
To:	"Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, "Susanne Ebrecht" <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: 8.2 features?
Date:	2006-07-14 01:50:06
Message-ID:	36e682920607131850m5503ccd5v6b9af0151de49c9a@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On 7/13/06, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> I agree that it's probably not going to happen for 8.2 but I certainly
> have uses for the SQL spec's definition of MERGE (table-level instead of
> the individual-tuple case). I'd like to see the individual-tuple
> UPSERT/REPLACE issue handled as well but I don't believe MERGE lacking
> that necessairly means MERGE should be ignored..

Where does Jan stand on it, I know he was doing some thinking about
how to accomplish it.

From:	Bernd Helmle <mailings(at)oopsware(dot)de>
To:	Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: 8.2 features?
Date:	2006-07-14 14:26:40
Message-ID:	ECCAAC0A3182895D3D12B0E8@[172.26.14.247]
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

--On Freitag, Juli 14, 2006 01:23:11 +0200 Peter Eisentraut
<peter_e(at)gmx(dot)net> wrote:

>> . multiple values clauses for INSERT
>
> Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de> was last heard to work on
> it. Updates, Susanne?

I've talked to Susanne today and she's just back from hospital and not
available
online until next week.
She was working on the SET (col1, col2) = (val1, val2) syntax for UPDATE
commands.
Don't know what the status is on this, though.

--
Thanks

Bernd

From:	Andrew Dunstan <andrew(at)dunslane(dot)net>
To:	Bernd Helmle <mailings(at)oopsware(dot)de>
Cc:	Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: 8.2 features?
Date:	2006-07-14 14:45:16
Message-ID:	44B7ADFC.90205@dunslane.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Bernd Helmle wrote:

>
>
> --On Freitag, Juli 14, 2006 01:23:11 +0200 Peter Eisentraut
> <peter_e(at)gmx(dot)net> wrote:
>
>>> . multiple values clauses for INSERT
>>
>>
>> Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de> was last heard to work on
>> it. Updates, Susanne?
>
>
> I've talked to Susanne today and she's just back from hospital and not
> available
> online until next week.
> She was working on the SET (col1, col2) = (val1, val2) syntax for
> UPDATE commands.
> Don't know what the status is on this, though.
>

Not the same thing, surely. So maybe we should gratefully accept Joe
Conway's offer to work on it.

cheers

andrew

From:	Susanne Ebrecht <miracee(at)miracee(dot)de>
To:	Bernd Helmle <mailings(at)oopsware(dot)de>
Cc:	Peter Eisentraut <peter_e(at)gmx(dot)net>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: 8.2 features?
Date:	2006-07-17 10:27:08
Message-ID:	1153132028.18893.8.camel@localhost
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Am Freitag, den 14.07.2006, 16:26 +0200 schrieb Bernd Helmle:
>
> --On Freitag, Juli 14, 2006 01:23:11 +0200 Peter Eisentraut
> <peter_e(at)gmx(dot)net> wrote:
>
> >> . multiple values clauses for INSERT
> >
> > Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de> was last heard to work on
> > it. Updates, Susanne?
>
> I've talked to Susanne today and she's just back from hospital and not
> available
> online until next week.
> She was working on the SET (col1, col2) = (val1, val2) syntax for UPDATE
> commands.
> Don't know what the status is on this, though.
>

Thanks Peter and Bernd for your postings.
I'am working on
update table set (col1, col2, ...) = (val1, val2, ...), (colx,
coly, ...) = (valx, valy, ...), ...
I hope, it will be finished this week. Most of work is done.

Susanne

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc:	Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: 8.2 features?
Date:	2006-07-17 21:42:10
Message-ID:	44BC0432.3040509@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Andrew Dunstan wrote:
> Bernd Helmle wrote:
>>
>> --On Freitag, Juli 14, 2006 01:23:11 +0200 Peter Eisentraut
>> <peter_e(at)gmx(dot)net> wrote:
>>
>>>> . multiple values clauses for INSERT
>>>
>>> Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de> was last heard to work on
>>> it. Updates, Susanne?
>>
>> I've talked to Susanne today and she's just back from hospital and not
>> available online until next week.
>> She was working on the SET (col1, col2) = (val1, val2) syntax for
>> UPDATE commands.
>> Don't know what the status is on this, though.
>
> Not the same thing, surely. So maybe we should gratefully accept Joe
> Conway's offer to work on it.

I've played with this a bit now, and the grammar changes seem pretty
straightforward, but the other half is kind of ugly. I can't see a good
way to propagate multiple targetlists that isn't a big hack.

The best way might be to fabricate a selectStmt equiv to
"SELECT <targetlist> UNION ALL SELECT <targetlist>...",
but that still feels like a hack.

Have there been any past discussions on how this might be implemented
(FWIW I couldn't find any in the archives)? Any better ideas for an
approach?

Thanks,

Joe

From:	Joe Conway <mail(at)joeconway(dot)com>
To:
Cc:	Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-18 04:39:59
Message-ID:	44BC661F.80609@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway wrote:
>>>
>>>>> . multiple values clauses for INSERT
>
> The best way might be to fabricate a selectStmt equiv to
> "SELECT <targetlist> UNION ALL SELECT <targetlist>...",
> but that still feels like a hack.

Here is a patch pursuant to my earlier post. It has the advantage of
being fairly simple and noninvasive.

The major downside is that somewhere between 9000 and 10000
VALUES-targetlists produces "ERROR: stack depth limit exceeded".
Perhaps for the typical use-case this is sufficient though.

I'm open to better ideas, comments, objections...

Thanks,

Joe

Attachment	Content-Type	Size
multi-insert.diff	text/x-patch	3.8 KB

From:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-18 12:13:32
Message-ID:	44BCD06C.3060409@calorieking.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

> The major downside is that somewhere between 9000 and 10000
> VALUES-targetlists produces "ERROR: stack depth limit exceeded".
> Perhaps for the typical use-case this is sufficient though.
>
> I'm open to better ideas, comments, objections...

If the use case is people running MySQL dumps, then there will be
millions of values-targetlists in MySQL dumps.

Chris

From:	Andrew Dunstan <andrew(at)dunslane(dot)net>
To:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>
Cc:	Joe Conway <mail(at)joeconway(dot)com>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-18 14:36:04
Message-ID:	44BCF1D4.3010807@dunslane.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Christopher Kings-Lynne wrote:

>> The major downside is that somewhere between 9000 and 10000
>> VALUES-targetlists produces "ERROR: stack depth limit exceeded".
>> Perhaps for the typical use-case this is sufficient though.
>>
>> I'm open to better ideas, comments, objections...
>
>
> If the use case is people running MySQL dumps, then there will be
> millions of values-targetlists in MySQL dumps.
>
>

Yeah. The fabricated select hack does feel wrong to me. Taking a quick
2 minute look at the grammar it looks like a better bet would be to make
InsertStmt.targetList a list of lists of values rather than just a list
of values. Of course, that would make the changes more invasive. Even
with that we'd still be reading the whole thing into memory ... is there
a sane way to cache the inline data before statement execution?

I guess we can just say that for true bulk load our supported mechanism
is still just COPY, but it would be a pity to restrict a feature that is
in the standard that way.

cheers

andrew

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-18 15:05:57
Message-ID:	44BCF8D5.5080109@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Andrew Dunstan wrote:
> Christopher Kings-Lynne wrote:
>
>>> The major downside is that somewhere between 9000 and 10000
>>> VALUES-targetlists produces "ERROR: stack depth limit exceeded".
>>> Perhaps for the typical use-case this is sufficient though.
>>>
>>> I'm open to better ideas, comments, objections...
>>
>> If the use case is people running MySQL dumps, then there will be
>> millions of values-targetlists in MySQL dumps.
>
> Yeah. The fabricated select hack does feel wrong to me. Taking a quick
> 2 minute look at the grammar it looks like a better bet would be to make
> InsertStmt.targetList a list of lists of values rather than just a list
> of values. Of course, that would make the changes more invasive. Even
> with that we'd still be reading the whole thing into memory ... is there
> a sane way to cache the inline data before statement execution?

I started down the path of making InsertStmt.targetList a list of
targetlists. The problem is finding a reasonable way to make that
available to the executor. Back to the drawing board I guess.

I have similar concerns with the millions of values-targetlists comment
that Chris made. But I don't see how we can cache the data easily short
of inventing a List alternative that spills to disk.

> I guess we can just say that for true bulk load our supported mechanism
> is still just COPY, but it would be a pity to restrict a feature that is
> in the standard that way.

True

Joe

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql-hackers(at)postgresql(dot)org, Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, Joe Conway <mail(at)joeconway(dot)com>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-18 18:19:01
Message-ID:	5842.1153246741@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

>> If the use case is people running MySQL dumps, then there will be
>> millions of values-targetlists in MySQL dumps.

I did some experimentation just now, and could not get mysql to accept a
command longer than about 1 million bytes. It complains about
Got a packet bigger than 'max_allowed_packet' bytes
which seems a bit odd because max_allowed_packet is allegedly set to
16 million, but anyway I don't think people are going to be loading any
million-row tables using single INSERT commands in mysql either.

regards, tom lane

From:	Chris Browne <cbbrowne(at)acm(dot)org>
To:	pgsql-patches(at)postgresql(dot)org
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-18 18:49:35
Message-ID:	60u05eagu8.fsf@dba2.int.libertyrms.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

chris(dot)kings-lynne(at)calorieking(dot)com (Christopher Kings-Lynne) writes:
>> The major downside is that somewhere between 9000 and 10000
>> VALUES-targetlists produces "ERROR: stack depth limit
>> exceeded". Perhaps for the typical use-case this is sufficient
>> though.
>> I'm open to better ideas, comments, objections...
>
> If the use case is people running MySQL dumps, then there will be
> millions of values-targetlists in MySQL dumps.

Curiosity: How do *does* TheirSQL parse that, and not have the One
Gigantic Query blow up their query parser?
--
output = reverse("gro.gultn" "@" "enworbbc")
http://www.ntlug.org/~cbbrowne/unix.html
JOHN CAGE (strapped to table): Do you really expect me to conduct this
antiquated tonal system?
LEONARD BERNSTEIN: No, Mr. Cage, I expect you to die!
[With apologies to music and James Bond fans the world over...]

From:	Andrew Dunstan <andrew(at)dunslane(dot)net>
To:	Chris Browne <cbbrowne(at)acm(dot)org>
Cc:	pgsql-patches(at)postgresql(dot)org
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-18 19:46:32
Message-ID:	44BD3A98.2050702@dunslane.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Chris Browne wrote:

>chris(dot)kings-lynne(at)calorieking(dot)com (Christopher Kings-Lynne) writes:
>
>
>>>The major downside is that somewhere between 9000 and 10000
>>>VALUES-targetlists produces "ERROR: stack depth limit
>>>exceeded". Perhaps for the typical use-case this is sufficient
>>>though.
>>>I'm open to better ideas, comments, objections...
>>>
>>>
>>If the use case is people running MySQL dumps, then there will be
>>millions of values-targetlists in MySQL dumps.
>>
>>
>
>Curiosity: How do *does* TheirSQL parse that, and not have the One
>Gigantic Query blow up their query parser?
>
>

Experimentation shows that mysqldump breaks up the insert into chunks.

Example with 10m rows:

[ad(at)wired-219 ~]# perl -e 'print "drop table if exists foo; create table
foo (x int);\n"; foreach my $i (0..9_9999) { print "insert into foo
values \n"; foreach my $j (0..99) { print "," if $j; print
"(",100*$i+$j+1,")"; } print ";\n"; } ' > gggggg
[ad(at)wired-219 ~]# mysql test < gggggg
[ad(at)wired-219 ~]# mysqldump test foo > aaaaaa
[ad(at)wired-219 ~]# mysql test < aaaaaa
[ad(at)wired-219 ~]# grep INSERT aaaaaa | wc -l
104

cheers

andrew

From:	Thomas Bley <thbley(at)gmail(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: [PATCHES] 8.2 features?
Date:	2006-07-18 21:39:20
Message-ID:	44BD5508.3050504@gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

from http://dev.mysql.com/doc/refman/4.1/en/blob.html

You can change the message buffer size by changing the value of the
max_allowed_packet variable, but you must do so for both the server and
your client program. For example, both mysql and mysqldump allow you to
change the client-side max_allowed_packet value.

Tom Lane wrote:
>>> If the use case is people running MySQL dumps, then there will be
>>> millions of values-targetlists in MySQL dumps.
>>>
>
> I did some experimentation just now, and could not get mysql to accept a
> command longer than about 1 million bytes. It complains about
> Got a packet bigger than 'max_allowed_packet' bytes
> which seems a bit odd because max_allowed_packet is allegedly set to
> 16 million, but anyway I don't think people are going to be loading any
> million-row tables using single INSERT commands in mysql either.
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>
>

From:	"Matthew D(dot) Fuller" <fullermd(at)over-yonder(dot)net>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: [PATCHES] 8.2 features?
Date:	2006-07-18 23:41:49
Message-ID:	20060718234149.GV98476@over-yonder.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On Tue, Jul 18, 2006 at 02:19:01PM -0400 I heard the voice of
Tom Lane, and lo! it spake thus:
>
> I did some experimentation just now, and could not get mysql to accept a
> command longer than about 1 million bytes. It complains about
> Got a packet bigger than 'max_allowed_packet' bytes
> which seems a bit odd because max_allowed_packet is allegedly set to
> 16 million, but anyway I don't think people are going to be loading any
> million-row tables using single INSERT commands in mysql either.

On the contrary, I've hit it several times by just trying to import
[into another database] the output of a mysqldump I just did. Great
design, that...

--
Matthew Fuller (MF4839) | fullermd(at)over-yonder(dot)net
Systems/Network Administrator | http://www.over-yonder.net/~fullermd/
On the Internet, nobody can hear you scream.

From:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org, Andrew Dunstan <andrew(at)dunslane(dot)net>, Joe Conway <mail(at)joeconway(dot)com>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-19 01:18:12
Message-ID:	44BD8854.9060506@calorieking.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

> I did some experimentation just now, and could not get mysql to accept a
> command longer than about 1 million bytes. It complains about
> Got a packet bigger than 'max_allowed_packet' bytes
> which seems a bit odd because max_allowed_packet is allegedly set to
> 16 million, but anyway I don't think people are going to be loading any
> million-row tables using single INSERT commands in mysql either.

Strange. Last time I checked I thought MySQL dump used 'multivalue
lists in inserts' for dumps, for the same reason that we use COPY

From:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org, Andrew Dunstan <andrew(at)dunslane(dot)net>, Joe Conway <mail(at)joeconway(dot)com>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-19 01:20:29
Message-ID:	44BD88DD.7070507@calorieking.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Ah no, I'm mistaken. It's not by default in mysqldump, but it does seem
"recommended". This is from "man mysqldump":

-e|--extended-insert
Allows utilization of the new, much faster INSERT syntax.

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org, Andrew Dunstan <andrew(at)dunslane(dot)net>, Joe Conway <mail(at)joeconway(dot)com>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-19 03:32:34
Message-ID:	15212.1153279954@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com> writes:
> Strange. Last time I checked I thought MySQL dump used 'multivalue
> lists in inserts' for dumps, for the same reason that we use COPY

I think Andrew identified the critical point upthread: they don't try
to put an unlimited number of rows into one INSERT, only a megabyte
or so's worth. Typical klugy-but-effective mysql design approach ...

regards, tom lane

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-19 04:20:05
Message-ID:	44BDB2F5.7080003@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com> writes:
>
>>Strange. Last time I checked I thought MySQL dump used 'multivalue
>>lists in inserts' for dumps, for the same reason that we use COPY
>
> I think Andrew identified the critical point upthread: they don't try
> to put an unlimited number of rows into one INSERT, only a megabyte
> or so's worth. Typical klugy-but-effective mysql design approach ...

OK, so given that we don't need to be able to do 1 million
multi-targetlist insert statements, here is rev 2 of the patch.

It is just slightly more invasive, but performs *much* better. In fact,
it can handle as many targetlists as you have memory to deal with. It
also deals with DEFAULT values in the targetlist.

I've attached a php script that I used to do crude testing. Basically I
tested 3 cases in this order:

single-INSERT-multi-statement:
------------------------------
"INSERT INTO foo2a (f1,f2) VALUES (1,2);"
-- repeat statement $loopcount times

single-INSERT-at-once:
----------------------
"INSERT INTO foo2b (f1,f2) VALUES (1,2);INSERT INTO foo2a (f1,f2)
VALUES (1,2);INSERT INTO foo2a (f1,f2) VALUES (1,2)..."
-- build a single SQL string by looping $loopcount times,
-- and execute it all at once

multi-INSERT-at-once:
---------------------
"INSERT INTO foo2c (f1,f2) VALUES (1,2),(1,2),(1,2)..."
-- build a single SQL string by looping $loopcount times,
-- and execute it all at once

Here are the results:
$loopcount = 100000;
single-INSERT-multi-statement Elapsed time is 34 seconds
single-INSERT-at-once Elapsed time is 7 seconds
multi-INSERT-at-once Elapsed time is 4 seconds
about 370MB peak memory usage

$loopcount = 200000;
single-INSERT-multi-statement Elapsed time is 67 seconds
single-INSERT-at-once Elapsed time is 12 seconds
multi-INSERT-at-once Elapsed time is 9 seconds
about 750MB peak memory usage

$loopcount = 300000;
single-INSERT-multi-statement Elapsed time is 101 seconds
single-INSERT-at-once Elapsed time is 18 seconds
multi-INSERT-at-once Elapsed time is 13 seconds
about 1.1GB peak memory usage

Somewhere beyond this, my machine goes into swap hell, and I didn't have
the patience to wait for it to complete :-)

It would be interesting to see a side-by-side comparison with MySQL
since that seems to be our benchmark on this feature. I'll try to do
that tomorrow if no one beats me to it.

There is only one downside to the current approach that I'm aware of.
The command-result tag is only set by the "original" query, meaning that
even if you insert 300,000 rows using this method, the command-result
tag looks like "INSERT 0 1"; e.g.:

regression=# create table foo2(f1 int default 42,f2 int default 6);
CREATE TABLE
regression=# insert into foo2 (f1,f2) values
(default,12),(default,10),(115,21);
INSERT 0 1
regression=# select * from foo2;
f1 | f2
-----+----
42 | 12
42 | 10
115 | 21
(3 rows)

Any thoughts on how to fix that?

Thanks,

Joe

Attachment	Content-Type	Size
multi-insert-r2.diff	text/x-patch	4.7 KB
test-insert.php	application/x-php	2.0 KB

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [PATCHES] 8.2 features?
Date:	2006-07-19 21:41:36
Message-ID:	44BEA710.6060700@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway wrote:
> Tom Lane wrote:
>
>> Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com> writes:
>>
>>> Strange. Last time I checked I thought MySQL dump used 'multivalue
>>> lists in inserts' for dumps, for the same reason that we use COPY
>>
>> I think Andrew identified the critical point upthread: they don't try
>> to put an unlimited number of rows into one INSERT, only a megabyte
>> or so's worth. Typical klugy-but-effective mysql design approach ...
>
> OK, so given that we don't need to be able to do 1 million
> multi-targetlist insert statements, here is rev 2 of the patch.

I did some testing today against mysql and found that it will easily
absorb insert statements with 1 million targetlists provided you set
max_allowed_packet high enough for the server. It peaked out at about
600MB, compared to my test similar last night where it was using about
3.8 GB when I killed it.

So the question is, do we care?

If we do, I'll start looking for a new rev 3 strategy (ideas/pointers
etc very welcome). If not, I'll start working on docs and regression test.

Thanks,

Joe

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [PATCHES] 8.2 features?
Date:	2006-07-20 03:58:07
Message-ID:	23198.1153367887@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> I did some testing today against mysql and found that it will easily
> absorb insert statements with 1 million targetlists provided you set
> max_allowed_packet high enough for the server. It peaked out at about
> 600MB, compared to my test similar last night where it was using about
> 3.8 GB when I killed it.

> So the question is, do we care?

What's the performance like relative to mysql? It seems hard to believe
that we can afford the overhead of a separate INSERT statement per row
(duplicating all the work of parse analysis, rewrite, planning, executor
start/stop) ... at least not without looking mighty bad.

regards, tom lane

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [PATCHES] 8.2 features?
Date:	2006-07-20 04:16:02
Message-ID:	44BF0382.2010501@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>
>>I did some testing today against mysql and found that it will easily
>>absorb insert statements with 1 million targetlists provided you set
>>max_allowed_packet high enough for the server. It peaked out at about
>>600MB, compared to my test similar last night where it was using about
>>3.8 GB when I killed it.
>
>>So the question is, do we care?
>
> What's the performance like relative to mysql? It seems hard to believe
> that we can afford the overhead of a separate INSERT statement per row
> (duplicating all the work of parse analysis, rewrite, planning, executor
> start/stop) ... at least not without looking mighty bad.

I don't have the exact numbers handy, but not too great.

As I recall, with last night's patch we did 100K inserts in about 4
seconds, and today mysql did 100K in about 1 second. We never finished
the 1 million insert test due to swapping (I killed it after quite a
while), and mysql did 1 million in about 18 seconds (we did 300K in 13
seconds). The hardware was not identical between last night's test and
today's on mysql, but very similar (similar CPUs and memory, although
the machine I did the mysql tests on had scsi drives, while the pg test
was done on sata).

The difficulty is finding a way to avoid all that extra work without a
very ugly special case kludge just for inserts. I've been banging my
head on that on-and-off for a few days now, and every idea looks uglier
than the last. One suggestion I got off list was to figure out a way to
build a tuplestore and use it to feed the executor. That's starting to
sound better and better to me.

Any ideas or guidance would be greatly appreciated.

Joe

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [PATCHES] 8.2 features?
Date:	2006-07-20 04:36:54
Message-ID:	23454.1153370214@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> The difficulty is finding a way to avoid all that extra work without a
> very ugly special case kludge just for inserts.

[ thinks a bit ... ]

It seems to me that the reason it's painful is exactly that INSERT
... VALUES is a kluge already. We've special-cased the situation where
the INSERT's <query expression> is a <table value constructor> with
exactly one row --- but actually a <table value constructor> with
multiple rows ought to be allowed anywhere you can currently write
"SELECT ...". So ideally fixing this would include eliminating the
current artificial distinction between two types of INSERT command.

I think the place we'd ultimately like to get to involves changing the
executor's Result node type to have a list of targetlists and sequence
through those lists to produce its results (cf Append --- perhaps while
at it, divorce the "gating node" functionality into a different node
type). That part seems clear, what's a bit less clear is what the
ripple effect on the upstream parser/planner data structures should be.
Should *all* occurrences of Query be changed to have a
list-of-targetlists? Sounds ugly, and I don't understand what it would
mean for any Query other than one representing a VALUES construct.

[ thinks some more ... ]

Maybe the right place to put the list-of-targetlists functionality is
not in Query per se, but in a new type of jointree node. This would
localize the impact as far as changing the datastructures go, but I've
not thought hard enough about what the impact would actually be.

regards, tom lane

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [PATCHES] 8.2 features?
Date:	2006-07-20 05:18:39
Message-ID:	44BF122F.1080200@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>
>>The difficulty is finding a way to avoid all that extra work without a
>>very ugly special case kludge just for inserts.
>
> [ thinks a bit ... ]
>
> It seems to me that the reason it's painful is exactly that INSERT
> ... VALUES is a kluge already. We've special-cased the situation where
> the INSERT's <query expression> is a <table value constructor> with
> exactly one row --- but actually a <table value constructor> with
> multiple rows ought to be allowed anywhere you can currently write
> "SELECT ...". So ideally fixing this would include eliminating the
> current artificial distinction between two types of INSERT command.
>
> I think the place we'd ultimately like to get to involves changing the
> executor's Result node type to have a list of targetlists and sequence
> through those lists to produce its results

I was actually just looking at that and ended up thinking that it might
be better to deal with it one level down in ExecProject (because it is
already passing targetlists directly to ExecTargetList).

> That part seems clear, what's a bit less clear is what the
> ripple effect on the upstream parser/planner data structures should be.
> Should *all* occurrences of Query be changed to have a
> list-of-targetlists? Sounds ugly, and I don't understand what it would
> mean for any Query other than one representing a VALUES construct.

There are certainly many places to be looked at if Query.targetList
becomes a list-of-targetlists (about 153 if I grep'd correctly).

>
> [ thinks some more ... ]
>
> Maybe the right place to put the list-of-targetlists functionality is
> not in Query per se, but in a new type of jointree node. This would
> localize the impact as far as changing the datastructures go, but I've
> not thought hard enough about what the impact would actually be.

OK. You've given me a good bit to think about -- thanks!

Joe

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [PATCHES] 8.2 features?
Date:	2006-07-20 05:39:31
Message-ID:	23927.1153373971@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> Tom Lane wrote:
>> I think the place we'd ultimately like to get to involves changing the
>> executor's Result node type to have a list of targetlists and sequence
>> through those lists to produce its results

> I was actually just looking at that and ended up thinking that it might
> be better to deal with it one level down in ExecProject (because it is
> already passing targetlists directly to ExecTargetList).

I'd vote against that, because (a) ExecProject is used by all executor
node types, and we shouldn't add overhead to all of them for the benefit
of one; (b) ExecProject doesn't actually have any internal state at the
moment. To keep track of which targetlist to evaluate next, it would
not only need some internal state, it would have to be told the current
"es_direction". This stuff fits much better at the exec node level ---
again, I'd suggest looking at Append for a comparison.

But really the executor part of this is not the hard part; what we need
to think about first is what's the impact on the Query datastructure
that the parser/rewriter/planner use.

I'm still liking the idea of pushing multi-values into a jointree node
type. Basically this would suggest representing "VALUES ..." as if it
were "SELECT * FROM VALUES ..." (which I believe is actually legal
syntax per spec) --- in the general case you'd need to have a Query node
that has a trivial "col1, col2, col3, ..." targetlist and then the
multiple values lists are in some kind of jointree entry. But possibly
this could be short-circuited somehow, at least for INSERT.

BTW, I noticed an interesting property of historical Postgres behavior:
you can put a table reference into a VALUES targetlist.

regression=# create table foo (like tenk1);
CREATE TABLE
regression=# insert into foo values (tenk1.*);
ERROR: missing FROM-clause entry for table "tenk1"
LINE 1: insert into foo values (tenk1.*);
^
regression=# set add_missing_from to 1;
SET
regression=# insert into foo values (tenk1.*);
NOTICE: adding missing FROM-clause entry for table "tenk1"
LINE 1: insert into foo values (tenk1.*);
^
INSERT 0 10000
regression=#

So that last is really exactly equivalent to

insert into foo select * from tenk1;

I do not feel a need to support this sort of thing when there are
multiple VALUES targetlists, but it'd be nice not to break it for the
single-targetlist case. At least not till we're ready to disable
add_missing_from entirely.

regards, tom lane

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [PATCHES] 8.2 features?
Date:	2006-07-20 23:13:34
Message-ID:	44C00E1E.1000405@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>>I was actually just looking at that and ended up thinking that it might
>>be better to deal with it one level down in ExecProject (because it is
>>already passing targetlists directly to ExecTargetList).
>
> I'd vote against that, because (a) ExecProject is used by all executor
> node types, and we shouldn't add overhead to all of them for the benefit
> of one; (b) ExecProject doesn't actually have any internal state at the
> moment. To keep track of which targetlist to evaluate next, it would
> not only need some internal state, it would have to be told the current
> "es_direction". This stuff fits much better at the exec node level ---
> again, I'd suggest looking at Append for a comparison.

OK.

> But really the executor part of this is not the hard part; what we need
> to think about first is what's the impact on the Query datastructure
> that the parser/rewriter/planner use.

After a quick look, I think changing Query.targetList is too big an
impact, and probably unneeded given your suggestion below.

One of the problems with the current code is that the targetList in the
"VALUES..." case is being used for two purposes -- 1) to define the
column types, and 2) to hold the actual data. By putting the data into a
new node type, I think the targetList reverts to being just a list of
datatypes as it is with INSERT ... SELECT ...

> I'm still liking the idea of pushing multi-values into a jointree node
> type. Basically this would suggest representing "VALUES ..." as if it
> were "SELECT * FROM VALUES ..." (which I believe is actually legal
> syntax per spec) --- in the general case you'd need to have a Query node
> that has a trivial "col1, col2, col3, ..." targetlist and then the
> multiple values lists are in some kind of jointree entry. But possibly
> this could be short-circuited somehow, at least for INSERT.

I'm liking this too. But when you say "jointree node", are you saying to
model the new node type after NestLoop/MergeJoin/HashJoin nodes? These
are referred to as "join nodes" in ExecInitNode. Or as you mentioned a
couple of times, should this look more like an Append node?

Thanks,

Joe

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [PATCHES] 8.2 features?
Date:	2006-07-21 00:46:13
Message-ID:	14047.1153442773@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> I'm liking this too. But when you say "jointree node", are you saying to
> model the new node type after NestLoop/MergeJoin/HashJoin nodes? These
> are referred to as "join nodes" in ExecInitNode. Or as you mentioned a
> couple of times, should this look more like an Append node?

No, I guess I confused you by talking about the executor representation
at the same time. This is really unrelated to the executor. The join
tree I'm thinking of here is the data structure that dangles off
Query.jointree --- it's a representation of the query's FROM clause,
and (at present) can contain RangeTblRef, FromExpr, and JoinExpr nodes.
See the last hundred or so lines of primnodes.h for some details.
The jointree is used by the planner to compute the plan node tree that
the executor will run, but it's not the same thing.

There are basically two ways you could go about this:
1. Make a new jointree leaf node type to represent a VALUES construct,
and dangle the list of lists of expressions off that.
2. Make a new RangeTblEntry type to represent a VALUES construct, and
just put a RangeTblRef to it into the jointree. The expressions
dangle off the RangeTblEntry.

Offhand I'm not certain which of these would be cleanest. The second
way has some similarities to the way we handle set operation trees
(UNION et al), so it might be worth looking at that stuff. However,
being a RangeTblEntry has a lot of baggage (eg, various routines expect
to find an RTE alias, column names, column types, etc) and maybe we
don't need all that for VALUES.

One advantage of the first way is that you could use the same node
type for the "raw parser output" delivered by gram.y. This is a bit of
a type cheat, because raw parser output is logically distinct from what
parse analysis produces, but we do it in lots of other places too
(JoinExpr for instance is used that way). You should in any case have a
clear idea of the difference between the raw and analyzed parser
representations --- for instance, the raw form won't contain any
datatype info, whereas the analyzed form must. This might or might not
need to be visible directly in the VALUES node --- it might be that you
can rely on the datatype info embedded in the analyzed expressions.

regards, tom lane

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [PATCHES] 8.2 features?
Date:	2006-07-21 03:09:47
Message-ID:	44C0457B.1010303@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> No, I guess I confused you by talking about the executor representation
> at the same time. This is really unrelated to the executor. The join
> tree I'm thinking of here is the data structure that dangles off
> Query.jointree --- it's a representation of the query's FROM clause,
> and (at present) can contain RangeTblRef, FromExpr, and JoinExpr nodes.
> See the last hundred or so lines of primnodes.h for some details.
> The jointree is used by the planner to compute the plan node tree that
> the executor will run, but it's not the same thing.

Ah, that helps. Thanks for the explanation. I'll start digging in again...

Thanks,

Joe

From:	"Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Joe Conway <mail(at)joeconway(dot)com>, Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [PATCHES] 8.2 features?
Date:	2006-07-21 14:08:32
Message-ID:	20060721140832.GD83250@pervasive.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On Thu, Jul 20, 2006 at 08:46:13PM -0400, Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
> > I'm liking this too. But when you say "jointree node", are you saying to
> > model the new node type after NestLoop/MergeJoin/HashJoin nodes? These
> > are referred to as "join nodes" in ExecInitNode. Or as you mentioned a
> > couple of times, should this look more like an Append node?
>
> No, I guess I confused you by talking about the executor representation
> at the same time. This is really unrelated to the executor. The join
> tree I'm thinking of here is the data structure that dangles off
> Query.jointree --- it's a representation of the query's FROM clause,
> and (at present) can contain RangeTblRef, FromExpr, and JoinExpr nodes.
> See the last hundred or so lines of primnodes.h for some details.
> The jointree is used by the planner to compute the plan node tree that
> the executor will run, but it's not the same thing.
>
> There are basically two ways you could go about this:
> 1. Make a new jointree leaf node type to represent a VALUES construct,
> and dangle the list of lists of expressions off that.
> 2. Make a new RangeTblEntry type to represent a VALUES construct, and
> just put a RangeTblRef to it into the jointree. The expressions
> dangle off the RangeTblEntry.
>
> Offhand I'm not certain which of these would be cleanest. The second
> way has some similarities to the way we handle set operation trees
> (UNION et al), so it might be worth looking at that stuff. However,
> being a RangeTblEntry has a lot of baggage (eg, various routines expect
> to find an RTE alias, column names, column types, etc) and maybe we
> don't need all that for VALUES.

I misread that to include SRFs, but it got me thinking... another
possibility would be to changes VALUES() so that it was treated as a
function, and allow it to have an arbitrary number of parameters. That
would automatically allow the case of SELECT * FROM VALUES(...). INSERT
would need to learn how to accept SRFs, but that would have the nice
side-effect of allowing INSERT INTO table set_returning_function();

Of course, adding the ability for functions to have an arbitrary
argument list could well be more complex than any of the options
discussed thusfar... though it would be a very handy feature to have.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Values list-of-targetlists patch for comments (was Re: [HACKERS] 8.2 features?)
Date:	2006-07-24 03:57:14
Message-ID:	44C4451A.4010906@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>
>>I'm liking this too. But when you say "jointree node", are you saying to
>>model the new node type after NestLoop/MergeJoin/HashJoin nodes? These
>>are referred to as "join nodes" in ExecInitNode. Or as you mentioned a
>>couple of times, should this look more like an Append node?
>
>
> No, I guess I confused you by talking about the executor representation
> at the same time. This is really unrelated to the executor. The join
> tree I'm thinking of here is the data structure that dangles off
> Query.jointree --- it's a representation of the query's FROM clause,
> and (at present) can contain RangeTblRef, FromExpr, and JoinExpr nodes.
> See the last hundred or so lines of primnodes.h for some details.
> The jointree is used by the planner to compute the plan node tree that
> the executor will run, but it's not the same thing.
>
> There are basically two ways you could go about this:
> 1. Make a new jointree leaf node type to represent a VALUES construct,
> and dangle the list of lists of expressions off that.
> 2. Make a new RangeTblEntry type to represent a VALUES construct, and
> just put a RangeTblRef to it into the jointree. The expressions
> dangle off the RangeTblEntry.
>
> Offhand I'm not certain which of these would be cleanest. The second
> way has some similarities to the way we handle set operation trees
> (UNION et al), so it might be worth looking at that stuff. However,
> being a RangeTblEntry has a lot of baggage (eg, various routines expect
> to find an RTE alias, column names, column types, etc) and maybe we
> don't need all that for VALUES.

Since the feature freeze is only about a week off, I wanted to post this
patch even though it is not yet ready to be applied.

Executive summary:
==================
1. The patch is now large and invasive based on adding new node
types and associated infrastructure. I modelled the nodes largely
on RangeFunction and FunctionScan.
2. Performance is close enough to mysql to not be a big issue (I think,
more data below) as long as the machine does not get into a memory
swapping regime. Memory usage is now better, but not as good as
mysql.
3. I specifically coded with the intent of preserving current insert
statement behavior and code paths for current functionality. So there
*should* be no performance degradation or subtle semantics changes
for "INSERT DEFAULT VALUES", "INSERT ... VALUES (with one target
list)", "INSERT ... SELECT ...". Even Tom's recently discovered
"insert into foo values (tenk1.*)" still works ;-)

Performance:
============
On my development machine (dual core amd64, 2GB RAM) I get the following
results using the php script posted earlier:

Postgres:
---------
$loopcount = 100000;
multi-INSERT-at-once Elapsed time is 1 second

$loopcount = 300000;
multi-INSERT-at-once Elapsed time is 5 seconds

$loopcount = 500000;
multi-INSERT-at-once Elapsed time is 9 seconds

$loopcount = 800000;
multi-INSERT-at-once Elapsed time is 14 seconds

$loopcount = 900000;
multi-INSERT-at-once Elapsed time is 17 seconds

$loopcount = 1000000;
multi-INSERT-at-once Elapsed time is 42 seconds

$loopcount = 2000000;
killed after 5 minutes due to swapping

MySQL:
------
$loopcount = 100000;
multi-INSERT-at-once Elapsed time is 2 seconds

$loopcount = 300000;
INSERT failed:Got a packet bigger than 'max_allowed_packet' bytes
changed max_allowed_packet=64M
multi-INSERT-at-once Elapsed time is 5 seconds

$loopcount = 500000;
multi-INSERT-at-once Elapsed time is 8 seconds

$loopcount = 800000;
multi-INSERT-at-once Elapsed time is 13 seconds

$loopcount = 900000;
multi-INSERT-at-once Elapsed time is 15 seconds

$loopcount = 1000000;
multi-INSERT-at-once Elapsed time is 17 seconds

$loopcount = 2000000;
multi-INSERT-at-once Elapsed time is 36 seconds

$loopcount = 3000000;
multi-INSERT-at-once Elapsed time is 54 seconds

$loopcount = 4000000;
multi-INSERT-at-once Elapsed time is 134 seconds

<table value constructor>:
==========================
Included in this patch is support for <table value constructor> in the
FROM clause, e.g.:

regression=# select * from {values (1,array[1,2]),(2,array[3,4])};
?column? | array
----------+-------
1 | {1,2}
2 | {3,4}
(2 rows)

The strange syntax is a temporary hack to eliminate shift/reduce
conflicts. I'm not entirely sure we want to try to support this (or
something like it) for 8.2, but much of what is needed is now readily
available. More on known issues next.

Known Issues:
=============

General:
--------
1. Several comments in the patch are marked "FIXME". These are areas
where I was uncertain what was the "right thing to do". Any advice
on these specific spots would be very much appreciated.
2. I broke the rules regression test -- still need to look at what I
did to mess that up. Somewhere in the reconstruction of "VALUES ..."
according to the diff.

VALUES multi-targetlist INSERTS:
--------------------------------
3. Not yet quite sure how to get DEFAULT to work for "INSERT ...
multi-values". As noted above, works fine if there is only
one targetlist.

<table value constructor>:
--------------------------
4. I'm getting shift/reduce conflicts that are not easily eliminated.
Making VALUES fully reserved only made it 1 shift/reduce conflict.
5. Column aliases are still not working correctly. Haven't really looked
closely at this yet.
6. Data types are being deduced currently based on the first row,
and not currently getting checked on subsequent rows. So it is
easy to induce a crash:

regression=# select * from {values (1,array[1,2]),(2,3)};
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!>

7. In general, <table value constructor> in the FROM clause needs
more discussion -- among other things, how should we determine and
enforce column types? I think this could be a very useful feature,
but I'm not comfortable I understand it yet.

=================
As usual, review, advise, comments, flames, etc. requested

Joe

Attachment	Content-Type	Size
multi-insert-r6a.diff.gz	application/x-gzip	16.1 KB

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	"Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-07-24 06:22:14
Message-ID:	44C46716.8090207@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway wrote:
> Since the feature freeze is only about a week off, I wanted to post this
> patch even though it is not yet ready to be applied.
>

Sorry -- I just realized that two new files for ValuesScan didn't make
it into the patch posted earlier. Here they are now -- please untar in
your postgres sourcetree root in addition to applying the patch.

(I thought "cvs diff -cN" should have included the new files, since I
had earlier done "cvs add" on them, but it didn't work. I could swear
that worked for me in the past...)

Thanks,

Joe

Attachment	Content-Type	Size
multi-insert-r6a.new.tar.gz	application/x-gzip	2.4 KB

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [HACKERS] 8.2 features?)
Date:	2006-07-24 16:12:11
Message-ID:	11495.1153757531@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> Tom Lane wrote:
>> There are basically two ways you could go about this:
>> 1. Make a new jointree leaf node type to represent a VALUES construct,
>> and dangle the list of lists of expressions off that.
>> 2. Make a new RangeTblEntry type to represent a VALUES construct, and
>> just put a RangeTblRef to it into the jointree. The expressions
>> dangle off the RangeTblEntry.

You seem to have done *both*, which is certainly not what I had in mind.
I'd drop the RangeTblEntry changes, I think.

Shoving all the tuples into a tuplestore is not doing anything for you
from a performance point of view. I was thinking more of evaluating the
targetlists on-the-fly. Basically what I foresaw as the executor
mechanism was something like a Result node, except with a list of
targetlists instead of just one, and part of its runtime state would be
an index saying which one to evaluate next. (The update logic for the
index would be just like Append's logic for which subplan to eval next.)

Result as it currently stands is a pretty queer beast because it can
have a child plan or not. I'm tempted to suggest splitting it into
two node types, perhaps call the one with a child "Filter" and reserve
the name "Result" for the one with no child. The reason for doing this
in this context is that we could just make the no-child case be
multi-targetlist-capable (rather than having separate nearly identical
node types with single and multi tlists). AFAICS multi tlists don't
make any sense for the filter-a-child-plan scenario, so that's why I
want to push that case off to a different node type.

regards, tom lane

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-07-24 16:32:37
Message-ID:	44C4F625.2050304@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
>>
>>>There are basically two ways you could go about this:
>>>1. Make a new jointree leaf node type to represent a VALUES construct,
>>>and dangle the list of lists of expressions off that.
>>>2. Make a new RangeTblEntry type to represent a VALUES construct, and
>>>just put a RangeTblRef to it into the jointree. The expressions
>>>dangle off the RangeTblEntry.
>
> You seem to have done *both*, which is certainly not what I had in mind.
> I'd drop the RangeTblEntry changes, I think.

Good feedback -- thanks! But without the RTE, how would VALUES in the
FROM clause work? Or should I just drop that part and focus on just the
InsertStmt case?

Joe

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [HACKERS] 8.2 features?)
Date:	2006-07-24 17:09:29
Message-ID:	12029.1153760969@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> Good feedback -- thanks! But without the RTE, how would VALUES in the
> FROM clause work?

Is it different from INSERT? I'm just imagining a Values node in
the jointree and nothing in the rangetable.

If I'm reading the spec correctly, VALUES is exactly parallel to SELECT
in the grammar, which means that to use it in FROM you would need
parentheses and an alias:

SELECT ... FROM (SELECT ...) AS foo

SELECT ... FROM (VALUES ...) AS foo

ISTM that this should be represented using an RTE_SUBQUERY node in the
outer query; the alias attaches to that node, not to the VALUES itself.
So I don't think you need that alias field in the jointree entry either.

If we stick with the plan of representing VALUES as if it were SELECT *
FROM (valuesnode), then this approach would make the second query above
have a structure like

Query
.rtable -> RTE_SUBQUERY
.subquery -> Query
.jointree -> Values

(leaving out a ton of detail of course, but those are the key nodes).

To get this to reverse-list in the expected form, we'd need a small
kluge in ruleutils.c that short-circuits the display of "SELECT
... FROM" etc when it sees a Values node at the top of the jointree.
This seems like a fairly small price to pay for keeping Query in
approximately its present form, though.

One thought is that we might allow Query.jointree to point to either
a FromExpr or a Values node, and disallow Values from appearing further
down in the jointree (except perhaps after flattening of subqueries
in the planner). The alternative is that there's a FromExpr atop
the Values node in the jointree even in the simple case; which seems
uglier but it might avoid breaking some code that expects the top level
to always be FromExpr.

regards, tom lane

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-07-24 21:22:29
Message-ID:	44C53A15.9030509@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> ISTM that this should be represented using an RTE_SUBQUERY node in the
> outer query; the alias attaches to that node, not to the VALUES itself.
> So I don't think you need that alias field in the jointree entry either.
>
> If we stick with the plan of representing VALUES as if it were SELECT *
> FROM (valuesnode), then this approach would make the second query above
> have a structure like
>
> Query
> .rtable -> RTE_SUBQUERY
> .subquery -> Query
> .jointree -> Values
>
> (leaving out a ton of detail of course, but those are the key nodes).
>

OK, I'll go try to wrap my mind around that this evening and see where
it takes me.

Thanks,

Joe

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-28 19:47:24
Message-ID:	200607281947.k6SJlOI02533@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Are you going to apply this? Seems it is ready.

---------------------------------------------------------------------------

Joe Conway wrote:
> Tom Lane wrote:
> > Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com> writes:
> >
> >>Strange. Last time I checked I thought MySQL dump used 'multivalue
> >>lists in inserts' for dumps, for the same reason that we use COPY
> >
> > I think Andrew identified the critical point upthread: they don't try
> > to put an unlimited number of rows into one INSERT, only a megabyte
> > or so's worth. Typical klugy-but-effective mysql design approach ...
>
>
> OK, so given that we don't need to be able to do 1 million
> multi-targetlist insert statements, here is rev 2 of the patch.
>
> It is just slightly more invasive, but performs *much* better. In fact,
> it can handle as many targetlists as you have memory to deal with. It
> also deals with DEFAULT values in the targetlist.
>
> I've attached a php script that I used to do crude testing. Basically I
> tested 3 cases in this order:
>
> single-INSERT-multi-statement:
> ------------------------------
> "INSERT INTO foo2a (f1,f2) VALUES (1,2);"
> -- repeat statement $loopcount times
>
> single-INSERT-at-once:
> ----------------------
> "INSERT INTO foo2b (f1,f2) VALUES (1,2);INSERT INTO foo2a (f1,f2)
> VALUES (1,2);INSERT INTO foo2a (f1,f2) VALUES (1,2)..."
> -- build a single SQL string by looping $loopcount times,
> -- and execute it all at once
>
> multi-INSERT-at-once:
> ---------------------
> "INSERT INTO foo2c (f1,f2) VALUES (1,2),(1,2),(1,2)..."
> -- build a single SQL string by looping $loopcount times,
> -- and execute it all at once
>
> Here are the results:
> $loopcount = 100000;
> single-INSERT-multi-statement Elapsed time is 34 seconds
> single-INSERT-at-once Elapsed time is 7 seconds
> multi-INSERT-at-once Elapsed time is 4 seconds
> about 370MB peak memory usage
>
> $loopcount = 200000;
> single-INSERT-multi-statement Elapsed time is 67 seconds
> single-INSERT-at-once Elapsed time is 12 seconds
> multi-INSERT-at-once Elapsed time is 9 seconds
> about 750MB peak memory usage
>
> $loopcount = 300000;
> single-INSERT-multi-statement Elapsed time is 101 seconds
> single-INSERT-at-once Elapsed time is 18 seconds
> multi-INSERT-at-once Elapsed time is 13 seconds
> about 1.1GB peak memory usage
>
> Somewhere beyond this, my machine goes into swap hell, and I didn't have
> the patience to wait for it to complete :-)
>
> It would be interesting to see a side-by-side comparison with MySQL
> since that seems to be our benchmark on this feature. I'll try to do
> that tomorrow if no one beats me to it.
>
> There is only one downside to the current approach that I'm aware of.
> The command-result tag is only set by the "original" query, meaning that
> even if you insert 300,000 rows using this method, the command-result
> tag looks like "INSERT 0 1"; e.g.:
>
> regression=# create table foo2(f1 int default 42,f2 int default 6);
> CREATE TABLE
> regression=# insert into foo2 (f1,f2) values
> (default,12),(default,10),(115,21);
> INSERT 0 1
> regression=# select * from foo2;
> f1 | f2
> -----+----
> 42 | 12
> 42 | 10
> 115 | 21
> (3 rows)
>
> Any thoughts on how to fix that?
>
> Thanks,
>
> Joe
>
>

[ application/x-php is not supported, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Joe Conway <mail(at)joeconway(dot)com>, Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-28 19:54:15
Message-ID:	473.1154116455@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Are you going to apply this? Seems it is ready.

I thought Joe was off in a corner doing a whole new version.
(I'm willing to help if he needs help...)

regards, tom lane

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Joe Conway <mail(at)joeconway(dot)com>, Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-28 19:54:50
Message-ID:	200607281954.k6SJsoC03793@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > Are you going to apply this? Seems it is ready.
>
> I thought Joe was off in a corner doing a whole new version.
> (I'm willing to help if he needs help...)

OK, just checking.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Bruce Momjian <bruce(at)momjian(dot)us>, Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-28 20:42:17
Message-ID:	44CA76A9.8070101@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>
>>Are you going to apply this? Seems it is ready.
>
> I thought Joe was off in a corner doing a whole new version.
> (I'm willing to help if he needs help...)
>

Yeah, I was going to post the latest tonight.

I'm afraid though that after 2 or so days heading down the last path you
suggested (namely making a new jointree leaf node) I was having trouble,
and at the same time came to the conclusion that adding a new RTE was
alot cleaner and made more sense to me. So I'm hoping you won't want to
send me back to the drawing board again. I believe I have cleaned up the
things you objected to:

1. Now I'm not doing both alternative -- the targetlists are only
attached to the RTE from the point of parse analysis onward.
2. I've eliminated the tuplestore in favor of runtime evaluation
of the targetlists which are in an array (allowing forward or
backward scanning -- although I haven't tested the latter yet).

I've also solved the INSERT related issues that I had earlier:

1. Fixed the rules regression test -- now all regression tests pass
2. Fixed evaluation of DEFAULT values
3. Improved memory consumption and speed some more -- basically
we are approximately equal to mysql as long as we don't swap,
and we consume about twice the RAM as mysql instead of several
times as much. I have more analysis of memory use I'd also like
to share later.
4. I think the INSERT part of this is ready to go basically, but
I need a bit more time to test corner cases.

I've made some progress on "SELECT ... FROM (VALUES ...) AS ..."

1. No more shift/reduce issues
2. The ValuesScan work and memory improvements mentioned above
applies here too.
3. This part still needs the most work though.

I'll post a patch in a few hours -- there is some debug code in there
currently that I should clean up before I send it to the list.

BTW, I'm reserving Saturday, Sunday, and Monday (taking Monday off from
my day job) to work on outstanding issues. I can continue to work
through the end of next Friday, 4 August. After that I'm heading to
Germany on a business trip and my "spare" time will evaporate for a few
weeks.

Joe

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Bruce Momjian <bruce(at)momjian(dot)us>, Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-29 06:50:05
Message-ID:	44CB051D.6090609@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway wrote:
> Tom Lane wrote:
>> I thought Joe was off in a corner doing a whole new version.
>> (I'm willing to help if he needs help...)
>
> Yeah, I was going to post the latest tonight.

Sorry for the delay. Ever see the movie "The Money Pit"? This afternoon
I started to think I lived in that house :-(

Anyway, as mentioned below, I think the attached works well for the
"INSERT ... VALUES (...), (...), ..." and related cases. There are still
things wrong that I have not even tried to fix with respect to FROM
clause VALUES lists. Namely column aliases have no effect, and neither
does "ORDER BY" clause (I'm pretty sure addRangeTableEntryForValues
needs work among other places).

From a memory usage standpoint, I got the following using 1,000,000
values targetlists:

sql length = 6000032

NOTICE: enter transformInsertStmt
MessageContext: 478142520 total in 66 blocks; 5750400 free (3 chunks);
472392120 used

NOTICE: enter transformRangeValues
MessageContext: 478142520 total in 66 blocks; 5749480 free (6 chunks);
472393040 used

NOTICE: enter updateTargetListEntry
MessageContext: 629137464 total in 84 blocks; 44742464 free (999991
chunks); 584395000 used

NOTICE: exit transformInsertStmt
MessageContext: 629137464 total in 84 blocks; 44742408 free (999991
chunks); 584395056 used

NOTICE: start ExecInitValuesScan
MessageContext: 1015013432 total in 130 blocks; 6614008 free (8 chunks);
1008399424 used

NOTICE: end ExecInitValuesScan
MessageContext: 1015013432 total in 130 blocks; 6614008 free (8 chunks);
1008399424 used
ExecutorState: 8024632 total in 3 blocks; 21256 free (8 chunks); 8003376
used

This shows original SQL statement is about 6MB, by the time we get to
parse analysis we're at almost 500 MB, and that memory is never
recovered. Transforming from ResTarget to TargetEntry chews up about
100MB. Then between exiting transformInsertStmt and entering
ExecInitValuesScan we double in memory usage to about 1 GB. It isn't
shown here, but we add another 200 MB or so during tuple projection. So
we top out at about 1.2 GB. Note that mysql tops out at about 600 MB for
this same SQL.

I'm not sure what if anything can be done to improve the above -- I'm
open to suggestions.

Please note that this patch requires an initdb, although I have not yet
bothered to bump CATVERSION.

Thanks for help, comments, suggestions, etc...

Joe

>
> I'm afraid though that after 2 or so days heading down the last path you
> suggested (namely making a new jointree leaf node) I was having trouble,
> and at the same time came to the conclusion that adding a new RTE was
> alot cleaner and made more sense to me. So I'm hoping you won't want to
> send me back to the drawing board again. I believe I have cleaned up the
> things you objected to:
>
> 1. Now I'm not doing both alternative -- the targetlists are only
> attached to the RTE from the point of parse analysis onward.
> 2. I've eliminated the tuplestore in favor of runtime evaluation
> of the targetlists which are in an array (allowing forward or
> backward scanning -- although I haven't tested the latter yet).
>
> I've also solved the INSERT related issues that I had earlier:
>
> 1. Fixed the rules regression test -- now all regression tests pass
> 2. Fixed evaluation of DEFAULT values
> 3. Improved memory consumption and speed some more -- basically
> we are approximately equal to mysql as long as we don't swap,
> and we consume about twice the RAM as mysql instead of several
> times as much. I have more analysis of memory use I'd also like
> to share later.
> 4. I think the INSERT part of this is ready to go basically, but
> I need a bit more time to test corner cases.
>
> I've made some progress on "SELECT ... FROM (VALUES ...) AS ..."
>
> 1. No more shift/reduce issues
> 2. The ValuesScan work and memory improvements mentioned above
> applies here too.
> 3. This part still needs the most work though.
>
> I'll post a patch in a few hours -- there is some debug code in there
> currently that I should clean up before I send it to the list.
>
> BTW, I'm reserving Saturday, Sunday, and Monday (taking Monday off from
> my day job) to work on outstanding issues. I can continue to work
> through the end of next Friday, 4 August. After that I'm heading to
> Germany on a business trip and my "spare" time will evaporate for a few
> weeks.
>

Attachment	Content-Type	Size
multi-insert-r17.diff.gz	application/x-gzip	20.8 KB

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES]
Date:	2006-07-30 01:24:54
Message-ID:	44CC0A66.6070903@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> If I'm reading the spec correctly, VALUES is exactly parallel to SELECT
> in the grammar, which means that to use it in FROM you would need
> parentheses and an alias:
>
> SELECT ... FROM (SELECT ...) AS foo
>
> SELECT ... FROM (VALUES ...) AS foo

One of the things I'm struggling with is lack of column aliases. Would
it be reasonable to require something like this?

SELECT ... FROM (VALUES ...) AS foo(col1, col2, ...)

The other issue is how to determine column type. Even better would be to
require (similar to SRF returning record):

SELECT ... FROM (VALUES ...) AS foo(col1 type1, col2 type2, ...)

This would unambiguously identify the column aliases and types. Assuming
we stick with the spec:
SELECT ... FROM (VALUES ...) AS foo

1. How should we assign column names?
values1, values2, ...?
or
col1, col2, ...?
or
???

2. How should we assign datatypes? Use the first "row" and try to coerce
the rest to that type?

Joe

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	"Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-30 01:27:20
Message-ID:	21320.1154222840@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
>> I'm afraid though that after 2 or so days heading down the last path you
>> suggested (namely making a new jointree leaf node) I was having trouble,
>> and at the same time came to the conclusion that adding a new RTE was
>> alot cleaner and made more sense to me. So I'm hoping you won't want to
>> send me back to the drawing board again. I believe I have cleaned up the
>> things you objected to:

I was just objecting to having both a new RTE type and a new jointree
node type --- you only need one or the other. Opting for the new RTE
type is fine with me, and it probably is a bit cleaner at the end of
the day.

I still dislike the way you're doing things in the executor though.
I don't see the point of using the execScan.c machinery; most of the
time that'll be useless overhead. As I said before, I think the right
direction here is to split Result into two single-purpose node types
and make the non-filter version capable of taking a list of targetlists.

As far as reducing memory use goes, it seems to me that there's no need
for the individual "targetlists" to have ResTarget/TargetEntry
decoration. For the simple case where the expressions are just Const
nodes, this could save something like a third of the space (there's also
a List node per item, which we can't do much about). I think we'd have
to gin up a fake targetlist to attach to the Plan node, but there'd be
only one.

Since the result-node split is my hot button, I'm willing to volunteer
to make it happen. Do you want to concentrate on the remaining
parser-area issues and leave the executor part to me?

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES] 8.2 features?)
Date:	2006-07-30 01:35:24
Message-ID:	21398.1154223324@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> One of the things I'm struggling with is lack of column aliases. Would
> it be reasonable to require something like this?

> SELECT ... FROM (VALUES ...) AS foo(col1, col2, ...)

Requiring column aliases is counter to spec ...

> The other issue is how to determine column type. Even better would be to
> require (similar to SRF returning record):

> SELECT ... FROM (VALUES ...) AS foo(col1 type1, col2 type2, ...)

... and this is even further away from it.

As for the names, just use "?column?", same as we do now in INSERT
... VALUES. Anyone who wants to refer to those columns explicitly will
need to assign aliases, but if they don't assign aliases, we don't have
to do anything very intelligent.

As for the types, I believe that the spec pretty much dictates that we
apply the same type resolution algorithm as for a UNION. This is fairly
expensive and we should avoid it in the case of INSERT ... VALUES, but
for VALUES appearing anywhere else I think we have little choice.

regards, tom lane

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-30 01:37:35
Message-ID:	44CC0D5F.7080105@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>
>>>I'm afraid though that after 2 or so days heading down the last path you
>>>suggested (namely making a new jointree leaf node) I was having trouble,
>>>and at the same time came to the conclusion that adding a new RTE was
>>>alot cleaner and made more sense to me. So I'm hoping you won't want to
>>>send me back to the drawing board again. I believe I have cleaned up the
>>>things you objected to:
>
>
> I was just objecting to having both a new RTE type and a new jointree
> node type --- you only need one or the other. Opting for the new RTE
> type is fine with me, and it probably is a bit cleaner at the end of
> the day.

Great!

> I still dislike the way you're doing things in the executor though.
> I don't see the point of using the execScan.c machinery; most of the
> time that'll be useless overhead. As I said before, I think the right
> direction here is to split Result into two single-purpose node types
> and make the non-filter version capable of taking a list of targetlists.

OK.

> As far as reducing memory use goes, it seems to me that there's no need
> for the individual "targetlists" to have ResTarget/TargetEntry
> decoration. For the simple case where the expressions are just Const
> nodes, this could save something like a third of the space (there's also
> a List node per item, which we can't do much about). I think we'd have
> to gin up a fake targetlist to attach to the Plan node, but there'd be
> only one.

OK, I'll take a look at that (actually I was just in that general
vicinity anyway).

> Since the result-node split is my hot button, I'm willing to volunteer
> to make it happen. Do you want to concentrate on the remaining
> parser-area issues and leave the executor part to me?
>

Sure, sounds good to me.

Joe

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES]
Date:	2006-07-30 01:45:44
Message-ID:	44CC0F48.9020805@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>>One of the things I'm struggling with is lack of column aliases. Would
>>it be reasonable to require something like this?

> Requiring column aliases is counter to spec ...

>> SELECT ... FROM (VALUES ...) AS foo(col1 type1, col2 type2, ...)

> ... and this is even further away from it.

I figured as much, but thought I'd ask anyway :-). I did find something
in the appendix to the spec after sending this:

Annex C
(informative)
Implementation-dependent elements

18) Subclause 7.3, “<table value constructor>”:
a) The column names of a <table value constructor> or a <contextually
typed table value constructor>
are implementation-dependent.

> As for the names, just use "?column?", same as we do now in INSERT
> ... VALUES. Anyone who wants to refer to those columns explicitly will
> need to assign aliases, but if they don't assign aliases, we don't have
> to do anything very intelligent.

OK, I just thought "?column?" was ugly and useless.

> As for the types, I believe that the spec pretty much dictates that we
> apply the same type resolution algorithm as for a UNION. This is fairly
> expensive and we should avoid it in the case of INSERT ... VALUES, but
> for VALUES appearing anywhere else I think we have little choice.

Where do I find that algorithm -- somewhere in nodeAppend.c?

Thanks,

Joe

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Christopher Kings-Lynne <chris(dot)kings-lynne(at)calorieking(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Bernd Helmle <mailings(at)oopsware(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Susanne Ebrecht <susanne(dot)ebrecht(at)credativ(dot)de>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES] 8.2 features?)
Date:	2006-07-30 01:50:03
Message-ID:	21588.1154224203@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> Tom Lane wrote:
>> As for the types, I believe that the spec pretty much dictates that we
>> apply the same type resolution algorithm as for a UNION.

> Where do I find that algorithm -- somewhere in nodeAppend.c?

select_common_type(), in the parser.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: [PATCHES] 8.2 features?
Date:	2006-07-30 17:15:41
Message-ID:	26226.1154279741@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

I wrote:
> I still dislike the way you're doing things in the executor though.
> I don't see the point of using the execScan.c machinery; most of the
> time that'll be useless overhead. As I said before, I think the right
> direction here is to split Result into two single-purpose node types
> and make the non-filter version capable of taking a list of targetlists.

After more thought I've reconsidered this. The "ValuesScan" node is
still redundant with Result's non-filter case, but we should probably
get rid of the latter not the former. The reason is that in the general
case of VALUES-in-FROM, we do need all the generality of execScan.
Consider

SELECT x,y,x+y
FROM (VALUES (1,2),(3,4),...) AS foo(x,y)
WHERE x < y;

which AFAICS is perfectly legal SQL. We need a qual condition for the
WHERE and a projection step to form the x+y result. We could make a
non-filtering Result clause do all that but it'd really be reinventing
the execScan wheel.

So what I'm currently thinking is

1. Implement ValuesScan.
2. Convert all existing uses of Result without a child node into
ValuesScan.
3. Rename Result to Filter and rip out whatever code is only used for
the no-child-node case.

Steps 2 and 3 are just in the nature of housekeeping and can wait till
after the VALUES feature is in.

As far as avoiding overhead goes, here's what I'm thinking:

* The Values RTE node should contain a list of lists of bare
expressions, without TargetEntry decoration (you probably do not
need ResTarget in the raw parse tree for VALUES, either).

* The ValuesScan plan node will just reference this list-of-lists
(avoiding making a copy). It will need to contain a targetlist
because all plan nodes do, but the base version of that will just
be a trivial "Var 1", "Var 2", etc. (The planner might replace that
with a nontrivial targetlist in cases such as the example above.)

* At runtime, ValuesScan evaluates each sublist of expressions and
stores the results into a virtual tuple slot which is returned as
the "scan tuple" to execScan. If the targetlist is nontrivial then
it is evaluated with this tuple as input. If the targetlist is
a trivial Var list then the existing "physical tuple" optimization
kicks in and execScan will just return the scan tuple unmodified.
So for INSERT ... VALUES, the execScan layer will cost us nothing
in memory space and not much in execution time.

There are still some things I don't like about the way you did
ValuesScan but I'll work on improving that.

regards, tom lane

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: [PATCHES] 8.2 features?
Date:	2006-07-30 18:21:36
Message-ID:	44CCF8B0.5080406@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> So what I'm currently thinking is
>
> 1. Implement ValuesScan.
> 2. Convert all existing uses of Result without a child node into
> ValuesScan.
> 3. Rename Result to Filter and rip out whatever code is only used for
> the no-child-node case.
>
> Steps 2 and 3 are just in the nature of housekeeping and can wait till
> after the VALUES feature is in.

Sounds good to me.

> As far as avoiding overhead goes, here's what I'm thinking:
>
> * The Values RTE node should contain a list of lists of bare
> expressions, without TargetEntry decoration (you probably do not
> need ResTarget in the raw parse tree for VALUES, either).
>
> * The ValuesScan plan node will just reference this list-of-lists
> (avoiding making a copy). It will need to contain a targetlist
> because all plan nodes do, but the base version of that will just
> be a trivial "Var 1", "Var 2", etc. (The planner might replace that
> with a nontrivial targetlist in cases such as the example above.)

I'll work on that today.

> * At runtime, ValuesScan evaluates each sublist of expressions and
> stores the results into a virtual tuple slot which is returned as
> the "scan tuple" to execScan. If the targetlist is nontrivial then
> it is evaluated with this tuple as input. If the targetlist is
> a trivial Var list then the existing "physical tuple" optimization
> kicks in and execScan will just return the scan tuple unmodified.
> So for INSERT ... VALUES, the execScan layer will cost us nothing
> in memory space and not much in execution time.
>
> There are still some things I don't like about the way you did
> ValuesScan but I'll work on improving that.

OK.

Thanks,

Joe

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-31 18:57:35
Message-ID:	44CE529F.2070701@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> As far as avoiding overhead goes, here's what I'm thinking:
>
> * The Values RTE node should contain a list of lists of bare
> expressions, without TargetEntry decoration (you probably do not
> need ResTarget in the raw parse tree for VALUES, either).
>
> * The ValuesScan plan node will just reference this list-of-lists
> (avoiding making a copy). It will need to contain a targetlist
> because all plan nodes do, but the base version of that will just
> be a trivial "Var 1", "Var 2", etc. (The planner might replace that
> with a nontrivial targetlist in cases such as the example above.)

I wanted to post an updated patch even though there are still things not
working again after conversion to bare expressions. Note that I hacked
enough of the executor stuff so I could test my changes on the parser
area. The basic "INSERT ... VALUES (...), (...), ..." does work, but
without DEFAULT again :-(.

The good news is that from a memory and perfomance standpoint, my simple
test now shows us outperforming mysql:

$loopcount = 1000000;
Postgres:
multi-INSERT-at-once Elapsed time is 12 seconds
~420MB
MySQL:
multi-INSERT-at-once Elapsed time is 17 seconds
~600MB

$loopcount = 2000000;
Postgres:
multi-INSERT-at-once Elapsed time is 29 seconds
~730MB
MySQL:
multi-INSERT-at-once Elapsed time is 37 seconds
~1.2GB (this one is from memory -- I didn't write it in my notes)

Joe

Attachment	Content-Type	Size
multi-insert-r18.diff.gz	application/x-gzip	20.3 KB

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	"Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-31 19:10:20
Message-ID:	9049.1154373020@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> The good news is that from a memory and perfomance standpoint, my simple
> test now shows us outperforming mysql:

Sweet ;-)

I'm up to my *ss in fixing relation locking, but will get back to your
thing as soon as that's done. I think you're close enough to qualify
as having made the feature freeze deadline, in any case.

regards, tom lane

From:	Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Joe Conway <mail(at)joeconway(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-31 20:19:43
Message-ID:	20060731201943.GF20962@alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
> > The good news is that from a memory and perfomance standpoint, my simple
> > test now shows us outperforming mysql:
>
> Sweet ;-)

I love this team. Kudos!

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

From:	Michael Fuhr <mike(at)fuhr(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Joe Conway <mail(at)joeconway(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-31 20:44:56
Message-ID:	20060731204456.GA67809@winnie.fuhr.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On Mon, Jul 31, 2006 at 04:19:43PM -0400, Alvaro Herrera wrote:
> Tom Lane wrote:
> > Joe Conway <mail(at)joeconway(dot)com> writes:
> > > The good news is that from a memory and perfomance standpoint, my simple
> > > test now shows us outperforming mysql:
> >
> > Sweet ;-)
>
> I love this team. Kudos!

So now it's MySQL users' turn to say, "Sure, but speed isn't
everything...." :-)

--
Michael Fuhr

From:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To:	Michael Fuhr <mike(at)fuhr(dot)org>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Joe Conway <mail(at)joeconway(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-07-31 21:46:45
Message-ID:	44CE7A45.70400@commandprompt.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Michael Fuhr wrote:
> On Mon, Jul 31, 2006 at 04:19:43PM -0400, Alvaro Herrera wrote:
>> Tom Lane wrote:
>>> Joe Conway <mail(at)joeconway(dot)com> writes:
>>>> The good news is that from a memory and perfomance standpoint, my simple
>>>> test now shows us outperforming mysql:
>>> Sweet ;-)
>> I love this team. Kudos!
>
> So now it's MySQL users' turn to say, "Sure, but speed isn't
> everything...." :-)

"Sure, but speed isn't everything... We can accept 02/31/2006 as a valid
date. Let's see PostgreSQL do that!"

Joshua D. Drake

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	"Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-08-01 01:21:00
Message-ID:	26167.1154395260@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> I wanted to post an updated patch even though there are still things not
> working again after conversion to bare expressions.

I've been through the planner part of this and it looks OK (one or two
small errors). I'm currently messing with a revised version of the
grammar that supports putting VALUES everyplace that the spec allows,
and is a bit simpler than the old one to boot: it folds VALUES and
SELECT together, so we need fewer cases in the INSERT production.
Of course this breaks most of what you did in the parser :-( ...
I'm working on fixing that.

I'm about to go out to dinner but thought I'd post the gram.y and
parsenodes.h files so you could see where I'm headed. These are
diffs from CVS tip, not from your patch.

regards, tom lane

Attachment	Content-Type	Size
values-incomplete-patch.diff	application/octet-stream	9.8 KB

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-08-01 02:17:48
Message-ID:	44CEB9CC.2060102@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>
>>I wanted to post an updated patch even though there are still things not
>>working again after conversion to bare expressions.
>
> I've been through the planner part of this and it looks OK (one or two
> small errors). I'm currently messing with a revised version of the
> grammar that supports putting VALUES everyplace that the spec allows,
> and is a bit simpler than the old one to boot: it folds VALUES and
> SELECT together, so we need fewer cases in the INSERT production.
> Of course this breaks most of what you did in the parser :-( ...
> I'm working on fixing that.
>
> I'm about to go out to dinner but thought I'd post the gram.y and
> parsenodes.h files so you could see where I'm headed. These are
> diffs from CVS tip, not from your patch.
>

Yup, I can see where you're headed. Looks nice!

In case you can make use of it, here's my latest. I found that I was
being too aggressive at freeing the input nodes to transformExpr() in
transformRangeValues() after using them. In many cases the returned node
is a new palloc'd node, but in some cases it is not.

The other issue I found was that I had neglected to fixup/coerce the raw
expressions ala updateTargetListEntry(). I ended up creating a somewhat
simpler updateValuesExprListEntry() to use on values expression lists.

I have yet to get to the similar/more general issue of coercing values
expression lists to common datatypes (i.e. using select_common_type()).

FWIW, here's a list of non-working cases at the moment:

8<-------------------------------------
create table inserttest (col1 int4, col2 int4 NOT NULL, col3 text
default 'testing');

--doesn't work
---------------
--wrong result
insert into inserttest (col2, col3) values (23, DEFAULT), (24, DEFAULT),
(25, 'hello'), (26, DEFAULT);
select * from (values (3,4),(2,3)) as t1(f1,f2) join (values
(3,8),(2,6)) as t2(f1,f2) using (f1);
select * from (values (3,4),(2,3)) as t1(f1,f2) join (values
(3,8),(2,6)) as t2(f1,f2) using (f1) where t2.f2 = 8;
select * from (values (3,4),(2,3)) as t1(f1,f2) join (values
(3,8),(2,6)) as t2(f1,f2) on t1.f1 = t2.f2 where t1.f1 = 3;

--corrupt result but no crash
select f1,f2 from (values (11,2),(26,'a'),(6,4)) as t(f1,f2) order by 1
desc;

--crash
select f1 from (values (1,2),(2,3)) as t(f1,f2) order by 1 desc;
select f1,f2 from (values (11,'a'),(26,13),(6,'c')) as t(f1,f2) order by
1 desc;
8<-------------------------------------

Joe

Attachment	Content-Type	Size
multi-insert-r19.diff.gz	application/x-gzip	21.6 KB

From:	"Harald Armin Massa" <haraldarminmassa(at)gmail(dot)com>
To:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc:	"Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-08-01 07:15:52
Message-ID:	7be3f35d0608010015j750f0b5axacd9753ec79bd46e@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joshua,

> So now it's MySQL users' turn to say, "Sure, but speed isn't
> > everything...." :-)
> "Sure, but speed isn't everything... We can accept 02/31/2006 as a valid
> date. Let's see PostgreSQL do that!"
>

I got the joke :)

But: it is still a problem when converting. As accepting 2006-02-31 as a
valid date would require brainwashing at least the entire core team, we
should find a "recommended path of date migration from different universes".

My idea would be to:

a) declare date fields as text
b) load the dump of the other db
c) add another column for the date fields, type timestampe (w/wo tz)
d) try to update the column of c) with the converted field from a)
e) replace the failing ones manually

is this really best practice? especially finding the invalid ones would be
challenging :(
idea: sort after the textual date fields; look at hot spots (0000-00-00,
xxxx-02-31)

Are there better ideas? shall we document the best practice somewhere?

Harald
--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Reinsburgstraße 202b
70197 Stuttgart
0173/9409607
-
Let's set so double the killer delete select all.

From:	Michael Glaesemann <grzm(at)seespotcode(dot)net>
To:	"Harald Armin Massa" <haraldarminmassa(at)gmail(dot)com>
Cc:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: [PATCHES] 8.2 features?
Date:	2006-08-01 07:51:39
Message-ID:	54422BE0-F8B8-4ED5-9A60-41261FC739BD@seespotcode.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On Aug 1, 2006, at 16:15 , Harald Armin Massa wrote:

> As accepting 2006-02-31 as a valid date would require brainwashing
> at least the entire core team, we should find a "recommended path
> of date migration from different universes".

Have you checked out the mysql2pgsql[1] or my2postgres projects?
Perhaps they've included this as part of the conversion process.

Michael Glaesemann
grzm seespotcode net

[1](http://gborg.postgresql.org/project/mysql2psql/projdisplay.php)
[2](http://pgfoundry.org/projects/my2postgres/)

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	"Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: [HACKERS] 8.2 features?
Date:	2006-08-01 13:01:53
Message-ID:	1360.1154437313@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> In case you can make use of it, here's my latest. I found that I was
> being too aggressive at freeing the input nodes to transformExpr() in
> transformRangeValues() after using them. In many cases the returned node
> is a new palloc'd node, but in some cases it is not.

Great, I'll incorporate these updates and keep plugging --- should be
done today barring problems. If you have some spare cycles today,
want to work on regression tests and docs?

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	"Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES] 8.2 features?)
Date:	2006-08-01 20:15:51
Message-ID:	16429.1154463351@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

I've found a problem with the VALUES-as-RTE approach:

regression=# create table src(f1 int, f2 int);
CREATE TABLE
regression=# create table log(f1 int, f2 int, tag text);
CREATE TABLE
regression=# insert into src values(1,2);
INSERT 0 1
regression=# create rule r2 as on update to src do
regression-# insert into log values(old.*, 'old'), (new.*, 'new');
CREATE RULE
regression=# update src set f2 = f2 + 1;
server closed the connection unexpectedly

The problem with this is that the rewriter is substituting Vars
referencing "src" into the values lists of the VALUES RTE, within
a query that looks like a Cartesian product of src and *VALUES*:

regression=# explain update src set f2 = f2 + 1;
QUERY PLAN
--------------------------------------------------------------------
Nested Loop (cost=0.00..97.62 rows=3880 width=40)
-> Values Scan on "*VALUES*" (cost=0.00..0.02 rows=2 width=40)
-> Seq Scan on src (cost=0.00..29.40 rows=1940 width=0)

Seq Scan on src (cost=0.00..34.25 rows=1940 width=14)
(5 rows)

The ValuesScan node doesn't have access to the values of the current
row of src ... indeed, the planner doesn't know that it shouldn't
put the VALUES on the outside of the join, as it's done here, so
there *isn't* a current row of src.

AFAICT, the only way to make this work would be to implement SQL99's
LATERAL construct (or something pretty close to it --- I'm not entirely
sure I understand what LATERAL is supposed to do) so that the rewritten
query could be expressed like

insert into log select ... from src, LATERAL VALUES(src.f1, ...)

That's obviously not something we can get done for 8.2.

We could maybe kluge something to work for 8.2 if we were willing to
abandon the VALUES-as-RTE approach and go back to the notion of some
kind of multiple targetlist in a Query. I'm disinclined to do that
though, because as I've been working with your patch I've come to agree
that the RTE solution is a pretty clean one.

What I'm inclined to do for 8.2 is to disallow OLD/NEW references in
multi-element VALUES clauses; the feature is still tremendously useful
without that.

regards, tom lane

From:	Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Joe Conway <mail(at)joeconway(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES] 8.2 features?)
Date:	2006-08-01 20:26:44
Message-ID:	20060801202644.GA19514@alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> I've found a problem with the VALUES-as-RTE approach:
>
> regression=# create table src(f1 int, f2 int);
> CREATE TABLE
> regression=# create table log(f1 int, f2 int, tag text);
> CREATE TABLE
> regression=# insert into src values(1,2);
> INSERT 0 1
> regression=# create rule r2 as on update to src do
> regression-# insert into log values(old.*, 'old'), (new.*, 'new');
> CREATE RULE
> regression=# update src set f2 = f2 + 1;
> server closed the connection unexpectedly

Does it work if you do

regression=# create rule r2 as on update to src do
regression-# insert into log values(old.f1, old.f2, 'old'), (new.f1, new.f2, 'new');
?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES]
Date:	2006-08-01 20:41:28
Message-ID:	44CFBC78.2020508@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> What I'm inclined to do for 8.2 is to disallow OLD/NEW references in
> multi-element VALUES clauses; the feature is still tremendously useful
> without that.

Given the timing, this sounds like a reasonable approach. I agree that
the feature has lots of interesting uses -- I'd hate to see us lose
that. Disallowing OLD/NEW references doesn't contradict the spec in any
way AFAIK either.

Joe

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc:	Joe Conway <mail(at)joeconway(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES] 8.2 features?)
Date:	2006-08-01 20:47:34
Message-ID:	16717.1154465254@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Does it work if you do

> regression=# create rule r2 as on update to src do
> regression-# insert into log values(old.f1, old.f2, 'old'), (new.f1, new.f2, 'new');

No, that's not the problem. "*" expansion works just fine here, it's
the executor that can't deal with the resulting Vars.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	pgsql-patches(at)postgreSQL(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [HACKERS] 8.2 features?)
Date:	2006-08-01 23:16:35
Message-ID:	27359.1154474195@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Here's what I've got so far. I think there's probably more gold to be
mined in terms of reducing runtime memory consumption (I don't like the
list_free_deep bit, we should use a context), but functionally it seems
complete. I'm off to dinner again, it's in your court to look over some
more if you want.

(PS: if you want to apply, go ahead, don't forget catversion bump.)

regards, tom lane

Attachment	Content-Type	Size
values-lists-1.patch.gz	application/octet-stream	27.4 KB

From:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Joe Conway <mail(at)joeconway(dot)com>, pgsql-patches(at)postgresql(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 00:05:31
Message-ID:	Pine.LNX.4.58.0608021005220.14057@linuxworld.com.au
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom,

Is this intentional:

template1=# values(1), (2);
column1
---------
1
2
(2 rows)

This is legal because of:

simple_select:
/* ... */
| values_clause { $$ = $2; }

Also, I am working out some docs and regression tests.

Gavin

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-patches(at)postgreSQL(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 01:17:26
Message-ID:	44CFFD26.8040807@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Here's what I've got so far. I think there's probably more gold to be
> mined in terms of reducing runtime memory consumption (I don't like the
> list_free_deep bit, we should use a context), but functionally it seems
> complete. I'm off to dinner again, it's in your court to look over some
> more if you want.

OK, I'll continue to look at it this week.

> (PS: if you want to apply, go ahead, don't forget catversion bump.)
>

Sure, I'll commit shortly.

Thanks,

Joe

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-patches(at)postgresql(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 01:17:33
Message-ID:	44CFFD2D.8040906@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Gavin Sherry wrote:
> Is this intentional:
>
> template1=# values(1), (2);
> column1
> ---------
> 1
> 2
> (2 rows)
>
> This is legal because of:
>
> simple_select:
> /* ... */
> | values_clause { $$ = $2; }

hmm, not sure about that...

>
> Also, I am working out some docs and regression tests.
>

Oh, cool. I was going to start working on that myself tonight, but if
you're already working on it, don't let me stand in the way ;-)

Actually, if you want me to finish up whatever you have started, I'm
happy to do that too.

Joe

From:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-patches(at)postgresql(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 01:28:47
Message-ID:	Pine.LNX.4.58.0608021128240.17678@linuxworld.com.au
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On Tue, 1 Aug 2006, Joe Conway wrote:

> Gavin Sherry wrote:
> > Is this intentional:
> >
> > template1=# values(1), (2);
> > column1
> > ---------
> > 1
> > 2
> > (2 rows)
> >
> > This is legal because of:
> >
> > simple_select:
> > /* ... */
> > | values_clause { $$ = $2; }
>
> hmm, not sure about that...
>
> >
> > Also, I am working out some docs and regression tests.
> >
>
> Oh, cool. I was going to start working on that myself tonight, but if
> you're already working on it, don't let me stand in the way ;-)
>
> Actually, if you want me to finish up whatever you have started, I'm
> happy to do that too.

I've got to go out but I'll send a complete patch when I get back.

Gavin

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-patches(at)postgreSQL(dot)org, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 02:10:50
Message-ID:	44D009AA.10708@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Committed, with catversion bump.

Joe

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
Cc:	Joe Conway <mail(at)joeconway(dot)com>, pgsql-patches(at)postgresql(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [HACKERS] 8.2 features?)
Date:	2006-08-02 02:38:44
Message-ID:	28402.1154486324@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Gavin Sherry <swm(at)linuxworld(dot)com(dot)au> writes:
> Is this intentional:

> template1=# values(1), (2);
> column1
> ---------
> 1
> 2
> (2 rows)

You bet. VALUES is parallel to SELECT in the SQL grammar, so AFAICS
it should be legal anywhere you can write SELECT.

The basic productions in the spec's grammar are respectively

<query specification> ::=
SELECT [ <set quantifier> ] <select list>
<table expression>

and

<table value constructor> ::=
VALUES <row value expression list>

and both of them link into the rest of the grammar here:

There is no construct I can find in the spec grammar that allows
<query specification> but not <table value constructor>. QED.

Try some stuff like
DECLARE c CURSOR FOR VALUES ...
WHERE foo IN (VALUES ...

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	"Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES] 8.2 features?)
Date:	2006-08-02 02:52:00
Message-ID:	28624.1154487120@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> Tom Lane wrote:
>> What I'm inclined to do for 8.2 is to disallow OLD/NEW references in
>> multi-element VALUES clauses; the feature is still tremendously useful
>> without that.

> Given the timing, this sounds like a reasonable approach. I agree that
> the feature has lots of interesting uses -- I'd hate to see us lose
> that. Disallowing OLD/NEW references doesn't contradict the spec in any
> way AFAIK either.

I don't think rules are in the spec at all ;-) ... so no, that's not
a problem. My example demonstrated a pretty likely use:

create rule r2 as on update to src do
insert into log values(old.*, 'old'), (new.*, 'new');

but for the moment we can tell people to work around it the way
they always have:

create rule r2 as on update to src do
insert into log select old.*, 'old' union all new.*, 'new';

or just use two separate INSERT commands in the rule.

We oughta fix it later, but I don't feel ashamed to have a restriction
like this in the first cut.

regards, tom lane

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Joe Conway <mail(at)joeconway(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES]
Date:	2006-08-02 03:14:37
Message-ID:	200608020314.k723EbQ01079@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Should we wait for someone to actually ask for this before adding it to
the TODO list? Does it cause a crash now?

---------------------------------------------------------------------------

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
> > Tom Lane wrote:
> >> What I'm inclined to do for 8.2 is to disallow OLD/NEW references in
> >> multi-element VALUES clauses; the feature is still tremendously useful
> >> without that.
>
> > Given the timing, this sounds like a reasonable approach. I agree that
> > the feature has lots of interesting uses -- I'd hate to see us lose
> > that. Disallowing OLD/NEW references doesn't contradict the spec in any
> > way AFAIK either.
>
> I don't think rules are in the spec at all ;-) ... so no, that's not
> a problem. My example demonstrated a pretty likely use:
>
> create rule r2 as on update to src do
> insert into log values(old.*, 'old'), (new.*, 'new');
>
> but for the moment we can tell people to work around it the way
> they always have:
>
> create rule r2 as on update to src do
> insert into log select old.*, 'old' union all new.*, 'new';
>
> or just use two separate INSERT commands in the rule.
>
> We oughta fix it later, but I don't feel ashamed to have a restriction
> like this in the first cut.
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES]
Date:	2006-08-02 03:47:42
Message-ID:	44D0205E.9060300@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Bruce Momjian wrote:
> Should we wait for someone to actually ask for this before adding it
> to the TODO list?

Probably worth adding it to the TODO list so it doen't get lost.

> Does it cause a crash now?

Nope:

regression=# create table log(f1 int, f2 int, tag text);
CREATE TABLE
regression=# insert into src values(1,2);
INSERT 0 1
regression=# create rule r2 as on update to src do insert into log
values(old.*, 'old'), (new.*, 'new');
ERROR: VALUES must not contain OLD or NEW references

Joe

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-patches(at)postgreSQL(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 05:26:53
Message-ID:	44D0379D.10805@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

I checked out memory usage, and it had regressed to about 1.4 GB (from
730 MB as reported yesterday) for 2 million inserts of 2 integers (i.e.
with the php script I've been using).

I know you're not too happy with the attached approach to solving this,
but I'm not sure how creating a memory context is going to help. Part of
the problem is that the various transformXXX functions sometimes return
freshly palloc'd memory, and sometimes return the pointer they are given.

Anyway, with the attached diff, the 2 million inserts case is back to
about 730 MB memory use, and speed is pretty much the same as reported
yesterday (i.e both memory use and performance better than mysql with
innodb tables).

Thoughts?

Thanks,

Joe

Attachment	Content-Type	Size
multi-insert-memleak.r00.diff	text/x-patch	3.2 KB

From:	"Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-patches(at)postgresql(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 05:50:06
Message-ID:	44D03D0E.1070901@commandprompt.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

>
> Anyway, with the attached diff, the 2 million inserts case is back to
> about 730 MB memory use, and speed is pretty much the same as reported
> yesterday (i.e both memory use and performance better than mysql with
> innodb tables).

That's all that matters ;)

Joshua D. Drake

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-patches(at)postgreSQL(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 05:52:31
Message-ID:	44D03D9F.80203@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway wrote:
> Tom Lane wrote:
>
>> Here's what I've got so far. I think there's probably more gold to be
>> mined in terms of reducing runtime memory consumption (I don't like the
>> list_free_deep bit, we should use a context), but functionally it seems
>> complete.
>
> I checked out memory usage, and it had regressed to about 1.4 GB (from
> 730 MB as reported yesterday) for 2 million inserts of 2 integers (i.e.
> with the php script I've been using).
>
> I know you're not too happy with the attached approach to solving this,
> but I'm not sure how creating a memory context is going to help. Part of
> the problem is that the various transformXXX functions sometimes return
> freshly palloc'd memory, and sometimes return the pointer they are given.
>
> Anyway, with the attached diff, the 2 million inserts case is back to
> about 730 MB memory use, and speed is pretty much the same as reported
> yesterday (i.e both memory use and performance better than mysql with
> innodb tables).

Of course it also breaks a bunch of regression tests -- I guess that
just points to the fragility of this approach.

This patch retains the memory consumption savings but doesn't break any
regression tests...

Joe

Attachment	Content-Type	Size
multi-insert-memleak.r01.diff	text/x-patch	2.9 KB

From:	Michael Glaesemann <grzm(at)seespotcode(dot)net>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES]
Date:	2006-08-02 06:54:00
Message-ID:	1CD9D95B-E824-4A0D-85BB-279AD06B931C@seespotcode.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On Aug 2, 2006, at 12:47 , Joe Conway wrote:

> regression=# create rule r2 as on update to src do insert into log
> values(old.*, 'old'), (new.*, 'new');
> ERROR: VALUES must not contain OLD or NEW references
>

Would it make sense to add a HINT as well, recommending the UNION
construct Tom mentioned earlier?

On Aug 2, 2006, at 11:52 , Tom Lane wrote:

> create rule r2 as on update to src do
> insert into log select old.*, 'old' union all new.*, 'new';
>
> or just use two separate INSERT commands in the rule.
>

Or is the general case too, uh, general for a HINT?

Michael Glaesemann
grzm seespotcode net

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 13:30:11
Message-ID:	200608021330.k72DUBW09188@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway wrote:
> Bruce Momjian wrote:
> > Should we wait for someone to actually ask for this before adding it
> > to the TODO list?
>
> Probably worth adding it to the TODO list so it doen't get lost.

Added:

o In rules, allow VALUES() to contain a mixture of 'old' and 'new'
references

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Michael Glaesemann <grzm(at)seespotcode(dot)net>
Cc:	Joe Conway <mail(at)joeconway(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 13:30:29
Message-ID:	200608021330.k72DUTp09254@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Michael Glaesemann wrote:
> On Aug 2, 2006, at 12:47 , Joe Conway wrote:
>
>
> > regression=# create rule r2 as on update to src do insert into log
> > values(old.*, 'old'), (new.*, 'new');
> > ERROR: VALUES must not contain OLD or NEW references
> >
>
> Would it make sense to add a HINT as well, recommending the UNION
> construct Tom mentioned earlier?

Agreed.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	pgsql-patches(at)postgreSQL(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 13:43:01
Message-ID:	15263.1154526181@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> This patch retains the memory consumption savings but doesn't break any
> regression tests...

I'm unconvinced that retail pfree's are the way to go. I just did some
profiling of this test case:

insert into foo values
(0,0,0),
(1,1,1),
(2,2,2),
(3,3,3),
... one million rows ...
(999997,999997,999997),
(999998,999998,999998),
(999999,999999,999999);

using CVS tip, and got these oprofile results:

samples % symbol name
39742 10.1656 base_yyparse
38338 9.8065 XLogInsert
28247 7.2253 AllocSetAlloc
20490 5.2411 expression_tree_walker
16822 4.3029 ExecInitExpr
16469 4.2126 base_yylex
14789 3.7829 PageAddItem
11174 2.8582 LWLockAcquire
11167 2.8564 LWLockRelease
9195 2.3520 RewriteQuery
9120 2.3328 AllocSetFree
7788 1.9921 ExecInitValuesScan
7596 1.9430 ExecEvalConst
7586 1.9404 lappend
6860 1.7547 ValuesNext
6261 1.6015 heap_fill_tuple
6141 1.5708 MemoryContextAllocZeroAligned
5619 1.4373 fix_expr_references_walker
5613 1.4357 transformExpressionList
5269 1.3478 heap_insert
5177 1.3242 contain_vars_of_level_walker
4601 1.1769 heap_form_tuple
4345 1.1114 ExecutorRun
4299 1.0996 hash_any
4201 1.0746 MemoryContextAlloc
4061 1.0388 check_stack_depth

It's slightly depressing that there's not more time being spent in
places we can easily tweak, but anyway the salient point to me is
that AllocSetFree is already chewing a nontrivial part of the runtime.

regards, tom lane

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-patches(at)postgreSQL(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 13:58:31
Message-ID:	44D0AF87.2030304@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Joe Conway <mail(at)joeconway(dot)com> writes:
>
>>This patch retains the memory consumption savings but doesn't break any
>>regression tests...
>
>
> I'm unconvinced that retail pfree's are the way to go. I just did some
> profiling of this test case:

<snip>

> It's slightly depressing that there's not more time being spent in
> places we can easily tweak, but anyway the salient point to me is
> that AllocSetFree is already chewing a nontrivial part of the runtime.

That's undoubtedly true, and important for the case that isn't memory
constrained (but where I'm already seeing us perform relatively well).
But once we start the machine swapping, runtime goes in the toilet. And
without addressing the memory leak somehow, we will start a machine
swapping significantly earlier than mysql.

Joe

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Michael Glaesemann <grzm(at)seespotcode(dot)net>, Joe Conway <mail(at)joeconway(dot)com>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 14:07:51
Message-ID:	16503.1154527671@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Michael Glaesemann wrote:
>> Would it make sense to add a HINT as well, recommending the UNION
>> construct Tom mentioned earlier?

> Agreed.

Done.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	pgsql-patches(at)postgreSQL(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 14:25:08
Message-ID:	16892.1154528708@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> That's undoubtedly true, and important for the case that isn't memory
> constrained (but where I'm already seeing us perform relatively well).
> But once we start the machine swapping, runtime goes in the toilet. And
> without addressing the memory leak somehow, we will start a machine
> swapping significantly earlier than mysql.

I'm not arguing that we don't need to work on the memory usage ... just
that I'm not very happy with that particular approach.

I wonder whether there is any reasonable way to determine which data
structures are responsible for how much space ... in my test I'm seeing

MessageContext: 822075440 total in 104 blocks; 4510280 free (1 chunks); 817565160 used
ExecutorState: 8024624 total in 3 blocks; 20592 free (12 chunks); 8004032 used

so it seems mostly not the executor's fault, but that's not much to go
on.

regards, tom lane

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-patches(at)postgreSQL(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 14:42:09
Message-ID:	44D0B9C1.2020608@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
>
> I wonder whether there is any reasonable way to determine which data
> structures are responsible for how much space ... in my test I'm seeing
>
> MessageContext: 822075440 total in 104 blocks; 4510280 free (1 chunks); 817565160 used
> ExecutorState: 8024624 total in 3 blocks; 20592 free (12 chunks); 8004032 used
>
> so it seems mostly not the executor's fault, but that's not much to go
> on.

I was doing it by sprinkling MemoryContextStats() in various places.
I'll spend some time again later today and see if I can narrow it down
to specific data structures using that. It shouldn't be too hard -- the
patch I sent last night only pfrees a few structures, and they represent
the bulk of what we need to clean up.

Joe

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	pgsql-patches(at)postgreSQL(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 15:56:23
Message-ID:	17704.1154534183@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> Tom Lane wrote:
>> I wonder whether there is any reasonable way to determine which data
>> structures are responsible for how much space ... in my test I'm seeing

> I was doing it by sprinkling MemoryContextStats() in various places.

Yeah, I've just been doing that and some hand analysis too. What I get
(on a 64-bit machine) is that essentially all the space goes into

lists of A_Const lists: 320000000
lists of Const lists: 320000000
transformInsertRow extra lists: 144000000

I think we could safely list_free the input list in transformInsertRow
as your patch suggests, which would buy back the 144M part. But I don't
believe it's safe at all to free the raw_parser output --- the grammar
sometimes makes multiple links to the same subtree, eg in BETWEEN.
In any case the patch as proposed wouldn't catch all the detritus for
any case more complicated than a simple integer constant.

The way that the list memory usage works (again, 64-bit machine) is

sizeof(List) = 24
sizeof(ListCell) = 16
sizeof(A_Const) = 32

Each of these nodes will have 16 bytes palloc overhead, and the List
header will be rounded up to 32 bytes as well, so we have total space
for a 3-element integer list of
32+16 + (16+16 + 32+16) * 3
Add in 16+16 for the associated ListCell of the top list-of-lists,
and you come to 320 bytes per sublist. Const happens to also be
32 bytes so the transformed lists are the same size.

It's interesting to reflect on the fact that this comes to 184 bytes
of useful data and 136 bytes of palloc overhead per row ... not sure
if we can do much about the overhead though.

regards, tom lane

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-patches(at)postgreSQL(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 18:10:11
Message-ID:	44D0EA83.7050702@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Yeah, I've just been doing that and some hand analysis too. What I get
> (on a 64-bit machine) is that essentially all the space goes into
>
> lists of A_Const lists: 320000000
> lists of Const lists: 320000000
> transformInsertRow extra lists: 144000000
>
> I think we could safely list_free the input list in transformInsertRow
> as your patch suggests, which would buy back the 144M part. But I don't
> believe it's safe at all to free the raw_parser output --- the grammar
> sometimes makes multiple links to the same subtree, eg in BETWEEN.
> In any case the patch as proposed wouldn't catch all the detritus for
> any case more complicated than a simple integer constant.

:-(

> The way that the list memory usage works (again, 64-bit machine) is
>
> sizeof(List) = 24
> sizeof(ListCell) = 16
> sizeof(A_Const) = 32
>
> Each of these nodes will have 16 bytes palloc overhead, and the List
> header will be rounded up to 32 bytes as well, so we have total space
> for a 3-element integer list of
> 32+16 + (16+16 + 32+16) * 3
> Add in 16+16 for the associated ListCell of the top list-of-lists,
> and you come to 320 bytes per sublist. Const happens to also be
> 32 bytes so the transformed lists are the same size.

What if we built an array of A_Const nodes instead of a List? Maybe we
could use something akin to appendStringInfo()/enlargeStringInfo() to
build the array of nodes and enlarge it in chunks.

Joe

From:	Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-patches(at)postgreSQL(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 18:59:52
Message-ID:	20060802185952.GD6019@alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway wrote:

> What if we built an array of A_Const nodes instead of a List? Maybe we
> could use something akin to appendStringInfo()/enlargeStringInfo() to
> build the array of nodes and enlarge it in chunks.

In inval.c you find this:

/*
* To minimize palloc traffic, we keep pending requests in successively-
* larger chunks (a slightly more sophisticated version of an expansible
* array). All request types can be stored as SharedInvalidationMessage
* records. The ordering of requests within a list is never significant.
*/
typedef struct InvalidationChunk
{
struct InvalidationChunk *next; /* list link */
int nitems; /* # items currently stored in chunk */
int maxitems; /* size of allocated array in this chunk */
SharedInvalidationMessage msgs[1]; /* VARIABLE LENGTH ARRAY */
} InvalidationChunk; /* VARIABLE LENGTH STRUCTURE */

Which might give you an idea ...

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	pgsql-patches(at)postgreSQL(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-02 19:18:26
Message-ID:	19646.1154546306@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> What if we built an array of A_Const nodes instead of a List? Maybe we
> could use something akin to appendStringInfo()/enlargeStringInfo() to
> build the array of nodes and enlarge it in chunks.

For lists this short I'm not sure how much of a win it'd be. It's
interesting though to think about doing something like that within the
List abstraction itself. We did a "fastlist" hack once before and it
was a crock ... don't want to do that again. But now that we've got a
distinction between List and ListCell you could imagine that a List
header has a small private array of ListCells ... tuning the size might
be tricky though.

Another thing we could consider is flattening the double-level list into
a single list ... probably be a pain notationally, but it'd save one
List header and one ListCell per VALUES sublist. And it would offer
more traction for an array-inside-Lists optimization.

regards, tom lane

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	VALUES clause memory optimization (was: Values list-of-targetlists patch...)
Date:	2006-08-03 00:18:39
Message-ID:	44D140DF.5080209@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> I think we could safely list_free the input list in transformInsertRow
> as your patch suggests, which would buy back the 144M part. But I don't
> believe it's safe at all to free the raw_parser output --- the grammar
> sometimes makes multiple links to the same subtree, eg in BETWEEN.

In transformExpr the comment implies that it should be safe to reapply
to an already transformed expression. What if we free the raw_parser
expression list/cells/nodes and replace it with the as-transformed one?

Joe

From:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-patches(at)postgresql(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-03 05:31:06
Message-ID:	Pine.LNX.4.58.0608031523090.4590@linuxworld.com.au
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Docs and regression tests attached.

One slightly annoying thing is this:

---
regression=# declare foo cursor with hold for VALUES(1,2), (3, 4);
DECLARE CURSOR
regression=# declare foo2 cursor with hold for (VALUES(1,2), (3, 4)) as
foo(i, j);
ERROR: syntax error at or near "as"
LINE 1: ...e foo2 cursor with hold for (VALUES(1,2), (3, 4)) as foo(i, ...
---

Now, we can just rewrite the second query as:

---
declare foo2 cursor with hold for select * from (VALUES(1,2), (3, 4)) as
foo(i, j);
---

but it's not immediately obvious. Not worth busting up the grammar for it,
though. And, it's not spec.

Gavin

Attachment	Content-Type	Size
values_doc_test.diff	text/plain	16.6 KB

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	"Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: VALUES clause memory optimization (was: Values list-of-targetlists patch...)
Date:	2006-08-03 12:48:57
Message-ID:	2922.1154609337@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> In transformExpr the comment implies that it should be safe to reapply
> to an already transformed expression. What if we free the raw_parser
> expression list/cells/nodes and replace it with the as-transformed one?

How are you going to do the "replace" bit? The entire problem is that
you don't know where are all the down-links leading to the subexpression
you are currently working on.

The reason we could safely list_free inside transformInsertRow is that
we know all its callers have just built the passed-in list and so there
are no other pointers to it. That doesn't apply in the general case of
grammar output.

I think in the long run we probably ought to fix things so that the
grammar never outputs any multiply-linked trees; that little shortcut
has been a continuing source of grief for many reasons. I can't see
doing that for 8.2 though. My advice is to get that low-hanging fruit
in transformInsertRow and leave the other ideas for 8.3.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
Cc:	Joe Conway <mail(at)joeconway(dot)com>, pgsql-patches(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES]
Date:	2006-08-03 14:58:10
Message-ID:	18685.1154617090@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Gavin Sherry <swm(at)linuxworld(dot)com(dot)au> writes:
> Docs and regression tests attached.

I've applied the regression tests (with a few additions), but I'm
feeling dissatisfied with this approach to documenting VALUES.
It seems to be mostly missing the point about VALUES being usable
whereever SELECT is. I'm not at all sure what I'd do instead though.
Should we give VALUES its own reference page? That doesn't quite
seem helpful either. cc'ing to pgsql-docs for ideas.

regards, tom lane

From:	Michael Glaesemann <grzm(at)seespotcode(dot)net>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Joe Conway <mail(at)joeconway(dot)com>, pgsql-patches(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES]
Date:	2006-08-03 15:56:20
Message-ID:	429A0065-78C7-48C9-B2DC-E898988F93E6@seespotcode.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On Aug 3, 2006, at 23:58 , Tom Lane wrote:

> Should we give VALUES its own reference page? That doesn't quite
> seem helpful either.
>

I think we should go for a separate reference page, as VALUES appears
to be expanding quite a bit. Up till now I've thought of VALUES only
in conjunction with UPDATE, so perhaps a useful alternative would be
to keep all of the information regarding VALUES and its syntax would
be as a large part of the UPDATE reference page, though that would
imply by placement (even if explained otherwise) that VALUES is only
a part of the UPDATE syntax, which it no longer (?) is. That brings
me back to the idea of VALUES deserving its own reference page.

I wonder how soon pretty much the entire SQL spec will be duplicated
in the PostgreSQL documentation. :)

Michael Glaesemann
grzm seespotcode net

From:	Peter Eisentraut <peter_e(at)gmx(dot)net>
To:	pgsql-docs(at)postgresql(dot)org
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Joe Conway <mail(at)joeconway(dot)com>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES]
Date:	2006-08-03 17:11:47
Message-ID:	200608031911.48010.peter_e@gmx.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Should we give VALUES its own reference page? That doesn't quite
> seem helpful either. cc'ing to pgsql-docs for ideas.

This is probably the sort of thing that should be explained part II "The
SQL Language". In particular, section 7.2 is about table expressions,
which this is either a case of or related to.

A few examples on the command references pages to show how it can be
useful are probably OK, but the big picture shoulbd be developed in
chapter 7 or so.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	pgsql-docs(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-03 17:48:36
Message-ID:	44D236F4.2010001@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Peter Eisentraut wrote:
> Tom Lane wrote:
>
>>Should we give VALUES its own reference page? That doesn't quite
>>seem helpful either. cc'ing to pgsql-docs for ideas.
>
> This is probably the sort of thing that should be explained part II "The
> SQL Language". In particular, section 7.2 is about table expressions,
> which this is either a case of or related to.
>
> A few examples on the command references pages to show how it can be
> useful are probably OK, but the big picture shoulbd be developed in
> chapter 7 or so.
>

I'll need some time (perhaps a few weeks because I'm just starting to do
business travel again), but I'll take responsibility to write something
up if you'd like.

Joe

From:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-docs(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-04 00:36:45
Message-ID:	Pine.LNX.4.58.0608041035470.16622@linuxworld.com.au
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On Thu, 3 Aug 2006, Joe Conway wrote:

> Peter Eisentraut wrote:
> > Tom Lane wrote:
> >
> >>Should we give VALUES its own reference page? That doesn't quite
> >>seem helpful either. cc'ing to pgsql-docs for ideas.
> >
> > This is probably the sort of thing that should be explained part II "The
> > SQL Language". In particular, section 7.2 is about table expressions,
> > which this is either a case of or related to.
> >
> > A few examples on the command references pages to show how it can be
> > useful are probably OK, but the big picture shoulbd be developed in
> > chapter 7 or so.
> >
>
> I'll need some time (perhaps a few weeks because I'm just starting to do
> business travel again), but I'll take responsibility to write something
> up if you'd like.

I'm happy to finish what I started.

Gavin

From:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Joe Conway <mail(at)joeconway(dot)com>, pgsql-patches(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES]
Date:	2006-08-04 00:40:44
Message-ID:	Pine.LNX.4.58.0608041037080.16622@linuxworld.com.au
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On Thu, 3 Aug 2006, Tom Lane wrote:

> Gavin Sherry <swm(at)linuxworld(dot)com(dot)au> writes:
> > Docs and regression tests attached.
>
> I've applied the regression tests (with a few additions), but I'm
> feeling dissatisfied with this approach to documenting VALUES.
> It seems to be mostly missing the point about VALUES being usable
> whereever SELECT is. I'm not at all sure what I'd do instead though.
> Should we give VALUES its own reference page? That doesn't quite
> seem helpful either. cc'ing to pgsql-docs for ideas.

Good point. One question: are we happy calling this a 'VALUES list'? It's
better than a 'table value constructor'. I took the lead from a comment in the
source.

Thanks,

gavin

From:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To:	Michael Glaesemann <grzm(at)seespotcode(dot)net>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Joe Conway <mail(at)joeconway(dot)com>, pgsql-patches(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-04 00:42:02
Message-ID:	Pine.LNX.4.58.0608041040550.16622@linuxworld.com.au
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On Fri, 4 Aug 2006, Michael Glaesemann wrote:

> On Aug 3, 2006, at 23:58 , Tom Lane wrote:
>
>
> > Should we give VALUES its own reference page? That doesn't quite
> > seem helpful either.
> >
>
> I think we should go for a separate reference page, as VALUES appears
> to be expanding quite a bit. Up till now I've thought of VALUES only
> in conjunction with UPDATE, so perhaps a useful alternative would be
> to keep all of the information regarding VALUES and its syntax would
> be as a large part of the UPDATE reference page, though that would
> imply by placement (even if explained otherwise) that VALUES is only
> a part of the UPDATE syntax, which it no longer (?) is. That brings
> me back to the idea of VALUES deserving its own reference page.

... with update? I associate it very closely with INSERT. After all,
INSERT is the only statement where we've had VALUES as part of the
grammar.

Thanks,

Gavin

From:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To:	Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	pgsql-docs(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Joe Conway <mail(at)joeconway(dot)com>
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-04 00:52:12
Message-ID:	Pine.LNX.4.58.0608041042551.16622@linuxworld.com.au
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On Thu, 3 Aug 2006, Peter Eisentraut wrote:

> Tom Lane wrote:
> > Should we give VALUES its own reference page? That doesn't quite
> > seem helpful either. cc'ing to pgsql-docs for ideas.
>
> This is probably the sort of thing that should be explained part II "The
> SQL Language". In particular, section 7.2 is about table expressions,
> which this is either a case of or related to.

Yes, good idea.

>
> A few examples on the command references pages to show how it can be
> useful are probably OK, but the big picture shoulbd be developed in
> chapter 7 or so.

What do we want to do about documenting:

regression=# values(1);
column1
---------
1
(1 row)

It seems to me that if this isn't in the command reference we'll see an
email every few months say 'is this an undocumented feature?'. We can
cover it easily in Part II, 7.2 but people wont look there. Then again, it
just doesn't seem right to put it in the command reference as a 'top
level' command.

Gavin

From:	Michael Glaesemann <grzm(at)seespotcode(dot)net>
To:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Joe Conway <mail(at)joeconway(dot)com>, pgsql-patches(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES]
Date:	2006-08-04 01:05:39
Message-ID:	0C6AD0FD-2D94-4916-943A-8A1D8C49816B@seespotcode.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On Aug 4, 2006, at 9:42 , Gavin Sherry wrote:

> ... with update? I associate it very closely with INSERT. After all,
> INSERT is the only statement where we've had VALUES as part of the
> grammar.

Of course! Thanks for catching the glitch. I must have a bad RAM chip.

Michael Glaesemann
grzm seespotcode net

From:	Peter Eisentraut <peter_e(at)gmx(dot)net>
To:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
Cc:	pgsql-docs(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Joe Conway <mail(at)joeconway(dot)com>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES]
Date:	2006-08-04 01:21:47
Message-ID:	200608040321.48560.peter_e@gmx.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Gavin Sherry wrote:
> What do we want to do about documenting:
>
> regression=# values(1);

Out of curiosity, according to what theory should that be allowed?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

From:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To:	Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	pgsql-docs(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Joe Conway <mail(at)joeconway(dot)com>
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-04 01:57:48
Message-ID:	Pine.LNX.4.58.0608041143430.17044@linuxworld.com.au
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On Fri, 4 Aug 2006, Peter Eisentraut wrote:

> Gavin Sherry wrote:
> > What do we want to do about documenting:
> >
> > regression=# values(1);
>
> Out of curiosity, according to what theory should that be allowed?

I asked this too. Tom pointed this out (I'd give the URL if I could find
it on archives):

--
You bet. VALUES is parallel to SELECT in the SQL grammar, so AFAICS
it should be legal anywhere you can write SELECT.

The basic productions in the spec's grammar are respectively

<query specification> ::=
SELECT [ <set quantifier> ] <select list>
<table expression>

and

<table value constructor> ::=
VALUES <row value expression list>
--

which seems to be the case from my reading of the spec.

Gavin

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, pgsql-docs(at)postgresql(dot)org, Joe Conway <mail(at)joeconway(dot)com>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES]
Date:	2006-08-04 02:36:23
Message-ID:	860.1154658983@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> Gavin Sherry wrote:
>> What do we want to do about documenting:
>> regression=# values(1);

> Out of curiosity, according to what theory should that be allowed?

Wrong question. SELECT (for the general case of multi-row results)
and VALUES are exactly parallel in the SQL grammar; the right question
is "according to what theory are you allowed to issue a general SELECT?"

AFAICT the SQL spec only envisions "SELECT ... INTO some-variables",
restricted to a single-row result (cf <select statement: single row>
production) as being something a client can issue directly. Every more-
complex case is apparently supposed to be handled by one-row-at-a-time
fetches from a cursor.

If you can persuade the community that they don't want SELECT in its
current form, you might be able to persuade me that VALUES shouldn't
be allowed either.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql-hackers(at)postgresql(dot)org, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
Cc:	Michael Glaesemann <grzm(at)seespotcode(dot)net>, Joe Conway <mail(at)joeconway(dot)com>, pgsql-patches(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject:	Re: [PATCHES] Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-04 02:50:17
Message-ID:	969.1154659817@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Gavin Sherry <swm(at)linuxworld(dot)com(dot)au> writes:
> On Fri, 4 Aug 2006, Michael Glaesemann wrote:
>> On Aug 3, 2006, at 23:58 , Tom Lane wrote:
>>> Should we give VALUES its own reference page? That doesn't quite
>>> seem helpful either.
>>
>> I think we should go for a separate reference page, as VALUES appears
>> to be expanding quite a bit.

> ... with update? I associate it very closely with INSERT. After all,
> INSERT is the only statement where we've had VALUES as part of the
> grammar.

True, but I think that's just a historical artifact. If you look at the
SQL spec, INSERT ... VALUES and INSERT ... SELECT are not distinct
constructs: they fall out of the fact that VALUES and SELECT are allowed
interchangeably.

<insert statement> ::=
INSERT INTO <table name>
<insert columns and source>

<insert columns and source> ::=
[ <left paren> <insert column list> <right paren> ]
<query expression>
| DEFAULT VALUES

and when you trace down <query expression> you find the SELECT
and VALUES options entering at exactly the same place ...

I'd like to see us refactor the docs as necessary to reflect that idea.
Peter is right that this needs some discussion in syntax.sgml as well as
in the reference pages --- but I'm still not very clear on how the
presentation should go.

regards, tom lane

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: VALUES clause memory optimization
Date:	2006-08-04 04:27:16
Message-ID:	44D2CCA4.8040507@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> The reason we could safely list_free inside transformInsertRow is that
> we know all its callers have just built the passed-in list and so there
> are no other pointers to it. That doesn't apply in the general case of
> grammar output.

What about for the specific case of an InsertStmt? It seems that we
could at least get away with freeing the raw-expression list in that case.

In terms of freeing an entire arbitrary node, could we create a
backend/nodes/freefuncs.c file that does a recursive freeObject()
similar to the way copyObject() does in backend/nodes/copyfuncs.c?

> My advice is to get that low-hanging fruit
> in transformInsertRow and leave the other ideas for 8.3.

OK. This should be safe also, correct?

Thanks,

Joe

8<----------------------------------------
diff -c -r1.341 analyze.c
*** src/backend/parser/analyze.c 2 Aug 2006 01:59:46 -0000 1.341
--- src/backend/parser/analyze.c 2 Aug 2006 05:13:20 -0000
***************
*** 2191,2196 ****
--- 2196,2202 ----
for (i = 0; i < sublist_length; i++)
{
coltypes[i] = select_common_type(coltype_lists[i], "VALUES");
+ list_free(coltype_lists[i]);
}

newExprsLists = NIL;

From:	"Pavel Stehule" <pavel(dot)stehule(at)hotmail(dot)com>
To:	swm(at)linuxworld(dot)com(dot)au, tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc:	mail(at)joeconway(dot)com, pgsql-patches(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES]
Date:	2006-08-04 04:43:08
Message-ID:	BAY20-F143AB0B68CABA5DB334B10F9500@phx.gbl
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

>
>Good point. One question: are we happy calling this a 'VALUES list'? It's
>better than a 'table value constructor'. I took the lead from a comment in
>the
>source.
>

table value constructor is name from ANSI. All people wiil find t.v.c., not
values list. I vote table value constructor.

Regards
Pavel Stehule

_________________________________________________________________
Emotikony a pozadi programu MSN Messenger ozivi vasi konverzaci.
http://messenger.msn.cz/

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	"Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: VALUES clause memory optimization
Date:	2006-08-04 04:56:26
Message-ID:	2050.1154667386@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Joe Conway <mail(at)joeconway(dot)com> writes:
> What about for the specific case of an InsertStmt? It seems that we
> could at least get away with freeing the raw-expression list in that case.

Not sure ... what about rules, BETWEEN, yadda yadda?

> In terms of freeing an entire arbitrary node, could we create a
> backend/nodes/freefuncs.c file that does a recursive freeObject()
> similar to the way copyObject() does in backend/nodes/copyfuncs.c?

We got rid of freefuncs.c years ago, for good and sufficient reasons
that have not gone away. The problem is exactly that you don't know
whether any shortcuts were taken in constructing the node tree:
multiple links, pointers to constants, pointers to stuff that wasn't
supposed to be freed are all severe hazards.

>> My advice is to get that low-hanging fruit
>> in transformInsertRow and leave the other ideas for 8.3.

> OK. This should be safe also, correct?

Yes, but what's your point? The case that seems worth trying to
optimize is "INSERT INTO foo VALUES ... real long list ...". Certainly
the MySQL crowd is not going to be stressing transformValuesClause,
because they don't know it exists.

$ mysql test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 12 to server version: 5.0.22

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> values (1),(2);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'values (1),(2)' at line 1
mysql> select * from (values (1),(2)) as x(y);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'values (1),(2)) as x(y)' at line 1
mysql> select * from foo where x in (values (1),(2));
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '1),(2))' at line 1
mysql>

mysql shortcomings aside, I don't really see the use-case for enormously
long VALUES lists anywhere except the bulk-data-load scenario, ie,
exactly INSERT ... VALUES. So I don't feel a need to be real tense
in transformValuesClause.

regards, tom lane

From:	Peter Eisentraut <peter_e(at)gmx(dot)net>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, pgsql-docs(at)postgresql(dot)org, Joe Conway <mail(at)joeconway(dot)com>
Subject:	Re: Values list-of-targetlists patch for comments (was Re: [PATCHES]
Date:	2006-08-04 07:02:09
Message-ID:	200608040902.09945.peter_e@gmx.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Tom Lane wrote:
> Wrong question. SELECT (for the general case of multi-row results)
> and VALUES are exactly parallel in the SQL grammar; the right
> question is "according to what theory are you allowed to issue a
> general SELECT?"

The "top-level" SELECT for interactive use is

<directly executable statement> ::= <direct SQL data statement> | ...

<direct SQL data statement> ::= <direct select statement: multiple rows> | ...

But this actually does resolve as just VALUES (something), so nevermind.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Joe Conway <mail(at)joeconway(dot)com>, pgsql-patches(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-04 20:35:09
Message-ID:	1154723709.2495.14.camel@localhost.localdomain
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On Thu, 2006-08-03 at 10:58 -0400, Tom Lane wrote:
> Gavin Sherry <swm(at)linuxworld(dot)com(dot)au> writes:
> > Docs and regression tests attached.
>
> I've applied the regression tests (with a few additions), but I'm
> feeling dissatisfied with this approach to documenting VALUES.
> It seems to be mostly missing the point about VALUES being usable
> whereever SELECT is. I'm not at all sure what I'd do instead though.
> Should we give VALUES its own reference page? That doesn't quite
> seem helpful either. cc'ing to pgsql-docs for ideas.

The DB2 manual does exactly that and is not any clearer as a result,
even if it is fully normalised.

If we did that we'd need to emphasise that VALUES is more of a clause
than a new command.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

From:	Peter Eisentraut <peter_e(at)gmx(dot)net>
To:	pgsql-hackers(at)postgresql(dot)org
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Michael Glaesemann <grzm(at)seespotcode(dot)net>, Joe Conway <mail(at)joeconway(dot)com>, pgsql-patches(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject:	Re: [HACKERS] [PATCHES] Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-09 13:05:02
Message-ID:	200608091505.04383.peter_e@gmx.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Am Freitag, 4. August 2006 04:50 schrieb Tom Lane:
> I'd like to see us refactor the docs as necessary to reflect that idea.
> Peter is right that this needs some discussion in syntax.sgml as well as
> in the reference pages --- but I'm still not very clear on how the
> presentation should go.

I'm beginning to think that VALUES might be a separate command after all.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

From:	David Fetter <david(at)fetter(dot)org>
To:	Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Michael Glaesemann <grzm(at)seespotcode(dot)net>, Joe Conway <mail(at)joeconway(dot)com>, pgsql-patches(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject:	Re: [HACKERS] [PATCHES] Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-09 16:22:42
Message-ID:	20060809162242.GA19092@fetter.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On Wed, Aug 09, 2006 at 03:05:02PM +0200, Peter Eisentraut wrote:
> Am Freitag, 4. August 2006 04:50 schrieb Tom Lane:
> > I'd like to see us refactor the docs as necessary to reflect that
> > idea. Peter is right that this needs some discussion in
> > syntax.sgml as well as in the reference pages --- but I'm still
> > not very clear on how the presentation should go.
>
> I'm beginning to think that VALUES might be a separate command after
> all.

What's below definitely bolsters that idea :)

postgres=# VALUES(1);
column1
---------
1
(1 row)

However, there are some oddities:

postgres=# SELECT * FROM (VALUES (1,2)) AS foo(bar,baz);
bar | baz
-----+-----
1 | 2
(1 row)

postgres=# (VALUES (1,2)) AS foo(bar,baz);
ERROR: syntax error at or near "AS"
LINE 1: (VALUES (1,2)) AS foo(bar,baz);

Does the SQL standard have anything to say about assigning identifiers
both to the entire VALUES() statement and to its columns when the
VALUES() statement is by itself?

Cheers,
D
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
phone: +1 415 235 3778 AIM: dfetter666
Skype: davidfetter

Remember to vote!

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	David Fetter <david(at)fetter(dot)org>
Cc:	Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Michael Glaesemann <grzm(at)seespotcode(dot)net>, Joe Conway <mail(at)joeconway(dot)com>, pgsql-patches(at)postgresql(dot)org, pgsql-docs(at)postgresql(dot)org
Subject:	Re: [HACKERS] [PATCHES] Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-09 17:07:30
Message-ID:	27400.1155143250@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

David Fetter <david(at)fetter(dot)org> writes:
> However, there are some oddities:
> postgres=# SELECT * FROM (VALUES (1,2)) AS foo(bar,baz);
> [ works ]
> postgres=# (VALUES (1,2)) AS foo(bar,baz);
> ERROR: syntax error at or near "AS"

This is per spec. Effectively, AS is part of the FROM-clause syntax
not part of a standalone command. You can't write this either:

regression=# (select 1,2) as foo(bar,baz);
ERROR: syntax error at or near "as"
LINE 1: (select 1,2) as foo(bar,baz);
^

regards, tom lane

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
Cc:	Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-docs(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Joe Conway <mail(at)joeconway(dot)com>
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-11 03:07:10
Message-ID:	200608110307.k7B37AA22420@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Gavin Sherry wrote:
> > A few examples on the command references pages to show how it can be
> > useful are probably OK, but the big picture shoulbd be developed in
> > chapter 7 or so.
>
> What do we want to do about documenting:
>
> regression=# values(1);
> column1
> ---------
> 1
> (1 row)
>
> It seems to me that if this isn't in the command reference we'll see an
> email every few months say 'is this an undocumented feature?'. We can
> cover it easily in Part II, 7.2 but people wont look there. Then again, it
> just doesn't seem right to put it in the command reference as a 'top
> level' command.

How many people are going to think of doing a VALUES(1)? Isn't it just
like SELECT 1? We can try making VALUES a non-toplevel command, and if
people get confused, we can promote in the documenation. I think
starting it at top-level is perhaps confusing.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

From:	Michael Glaesemann <grzm(at)seespotcode(dot)net>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-docs(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Joe Conway <mail(at)joeconway(dot)com>
Subject:	Re: Values list-of-targetlists patch for comments (was Re:
Date:	2006-08-11 04:35:46
Message-ID:	6C7F4101-7D91-480A-8540-7185174290B3@seespotcode.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

On Aug 11, 2006, at 12:07 , Bruce Momjian wrote:

> Gavin Sherry wrote:
>>> A few examples on the command references pages to show how it can be
>>> useful are probably OK, but the big picture shoulbd be developed in
>>> chapter 7 or so.
>>
>> What do we want to do about documenting:
>>
>> regression=# values(1);
>> column1
>> ---------
>> 1
>> (1 row)
>>
>> It seems to me that if this isn't in the command reference we'll
>> see an
>> email every few months say 'is this an undocumented feature?'. We can
>> cover it easily in Part II, 7.2 but people wont look there. Then
>> again, it
>> just doesn't seem right to put it in the command reference as a 'top
>> level' command.
>
> How many people are going to think of doing a VALUES(1)? Isn't it
> just
> like SELECT 1? We can try making VALUES a non-toplevel command,
> and if
> people get confused, we can promote in the documenation. I think
> starting it at top-level is perhaps confusing.

Perhaps it may be confusing, but at the same time, it's educational
to put VALUES at the top level. Having not read the SQL spec in
detail, I do rely on the Postgres documentation as a quick reference.
Thanks to the high-level of SQL-spec compliance, the Postgres
documentation is pretty helpful in that way. And I'm always
pleasantly surprised when I learn something new just from looking
indices such as this.

Michael Glaesemann
grzm seespotcode net

From:	Bruce Momjian <bruce(at)momjian(dot)us>
To:	Joe Conway <mail(at)joeconway(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: VALUES clause memory optimization
Date:	2006-08-26 03:35:40
Message-ID:	200608260335.k7Q3ZeJ04955@momjian.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Has this been addressed?

---------------------------------------------------------------------------

Joe Conway wrote:
> Tom Lane wrote:
> > The reason we could safely list_free inside transformInsertRow is that
> > we know all its callers have just built the passed-in list and so there
> > are no other pointers to it. That doesn't apply in the general case of
> > grammar output.
>
> What about for the specific case of an InsertStmt? It seems that we
> could at least get away with freeing the raw-expression list in that case.
>
> In terms of freeing an entire arbitrary node, could we create a
> backend/nodes/freefuncs.c file that does a recursive freeObject()
> similar to the way copyObject() does in backend/nodes/copyfuncs.c?
>
> > My advice is to get that low-hanging fruit
> > in transformInsertRow and leave the other ideas for 8.3.
>
> OK. This should be safe also, correct?
>
> Thanks,
>
> Joe
>
> 8<----------------------------------------
> diff -c -r1.341 analyze.c
> *** src/backend/parser/analyze.c 2 Aug 2006 01:59:46 -0000 1.341
> --- src/backend/parser/analyze.c 2 Aug 2006 05:13:20 -0000
> ***************
> *** 2191,2196 ****
> --- 2196,2202 ----
> for (i = 0; i < sublist_length; i++)
> {
> coltypes[i] = select_common_type(coltype_lists[i], "VALUES");
> + list_free(coltype_lists[i]);
> }
>
> newExprsLists = NIL;
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

From:	Joe Conway <mail(at)joeconway(dot)com>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: VALUES clause memory optimization
Date:	2006-08-26 20:08:03
Message-ID:	44F0AA23.9030800@joeconway.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-docs pgsql-hackers pgsql-patches

Bruce Momjian wrote:
> Has this been addressed?
>
>>Tom Lane wrote:
>>>The reason we could safely list_free inside transformInsertRow is that
>>>we know all its callers have just built the passed-in list and so there
>>>are no other pointers to it. That doesn't apply in the general case of
>>>grammar output.

After another look, even in this case there could be other pointers.
Starting around line 667 of analyze.c:

8<------------------
foreach(lc, selectQuery->targetList)
{
TargetEntry *tle = (TargetEntry *) lfirst(lc);
Expr *expr;

if (tle->resjunk)
continue;
if (tle->expr &&
(IsA(tle->expr, Const) ||IsA(tle->expr, Param)) &&
exprType((Node *) tle->expr) == UNKNOWNOID)
expr = tle->expr;
else
expr = (Expr *) makeVar(rtr->rtindex,
tle->resno,
exprType((Node *) tle->expr),
exprTypmod((Node *) tle->expr),
0);
exprList = lappend(exprList, expr);
}

/* Prepare row for assignment to target table */
exprList = transformInsertRow(pstate, exprList,
stmt->cols,
icolumns, attrnos);

8<------------------

So in the UNKNOWNOID case, it wouldn't be safe to free the node after
transformation because it originates in the grammar output. The *only*
safe thing to free up would be a the input exprList itself. Not much to
be saved there.

>>>My advice is to get that low-hanging fruit
>>>in transformInsertRow and leave the other ideas for 8.3.

This one case doesn't provide that much memory savings by itself anyway.
I guess we'll just live with it until we can come up with a safe way to
free grammar output.

Joe