Re: COPY into a view; help w. design & patch

Lists: pgsql-hackers
From: "Karl O(dot) Pinc" <kop(at)meme(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: COPY into a view; help w. design & patch
Date: 2007-05-18 22:40:43
Message-ID: 1179528043l.12174l.0l@mofo
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I'm attempting a patch that would allow the COPY
command to copy into a view. I needed to code
something so as to get a clue, but the design has
not been discussed so I'm posting here rather
than the patches list.

I had a wee bit of discussion about this on IRC.
There was a suggestion to add generality by
to copying into a INSERT statement. However
the INSERT statement and the COPY statement
both list column names, and other issues came up
and I stopped thinking about it.

Better I think would be to have a syntax like:

COPY INTO ( statement [; , ...]) ( column [, ...] )
FROM { 'filename' | STDIN }
and so forth

Statements would then have $1, $2 type arguments
in them that correspond to the supplied column names.
The column names wouldn't mean much, really they'd
just be an indication of how many columns there are
in the input data. Statements would usually be
INSERT statements, but could be any other
sort as well. (DELETE comes to mind as useful,
as do INSERT statements that use a
SELECT ... WHERE NOT EXISTS ... sort of query
to deal with inserting into 1-to-many relationships.)

I don't know if it would be tricky
to use a semicolon as the token delimiting statements
but I presume not.

If this more general syntax were chosen the copying
into a view would just be syntactic sugar for copying
into an INSERT statement that inserted into the view.

I haven't thought a whole lot about a generalized syntax.
(The return code would be more like that of EXECUTE.)
I've been focusing on copying into a view, which is
what I want. At this point I'd much prefer just implementing
the part that copies into a view as that's already
something I need help with. (!)

As far as the patch itself, it's time to ask for help.

The basic idea is to prepare an INSERT statement into
the view and then run it in a portal for each line
of input. (This should generalize to the "more general"
copy syntax above, I hope.) I've checked
(I think) that my data structures
are the same as what I'd get if I was doing a
PREPARE and then a EXECUTE. There's issues of locking
and snapshots and probably other stuff and I thought
I was following the example of what the prepare/execute
code does but I've obviously got something wrong.

I'm having problems
debugging what I've got. It works up to the point
of the PortalRun() call, then it crashes.
The regression tests are incomplete, done only
to the point where, in copy2.sql, it crashes.

The changes to outfuncs.c can be ignored. It won't
be in the final patch. I've just been using it to
dump various data structures as can be seen from
the various bits of debugging stuff left in my patch.

I've been debugging by running "make check" and
then looking at the copy2.out section of regression.diffs.
This isn't going to work if I'm going to poke
about inside the PortalRun() call. Is there
a make target that will setup the regression
environment so that I can then run the
backend via gdb, or something? I'm working
on a system that has a live pg install on it
and need to be careful not to break that.

Please help with suggestions for design,
code, and how to debug. As you may have noticed,
this is the first time I've messed with the
pg code. I could really use a lot of help.
I've been at this for quite a while and
you can see that I've not even gotten something
to work.

Thanks.

Karl <kop(at)meme(dot)com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein

Attachment Content-Type Size
copyview_v1.patch.gz application/x-gzip 7.7 KB

From: "Robert Haas" <Robert(dot)Haas(at)dyntek(dot)com>
To: "Karl O(dot) Pinc" <kop(at)meme(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: COPY into a view; help w. design & patch
Date: 2007-05-19 00:05:50
Message-ID: 57653AD4C1743546B3EE80B21262E5CB679C52@EXCH01.ds.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I'm not sure exactly why you want to do with this, but it seems very
similar to what you can already do using prepared statements. Instead
of saying

COPY INTO (statement) (column [, ...])
a1,a2,a3,...,an
b1,b2,b3,...,bn
c1,c2,c3,...,cn
\.

You could instead say:

BEGIN WORK
PREPARE somestatementhandle (column[, ...]) AS statement
EXECUTE somestatementhandle ('a1','a2','a3','...','an');
EXECUTE somestatementhandle ('b1','b2','b3','...','bn');
EXECUTE somestatementhandle ('c1','c2','c3','...','cn');
DEALLOCATE somestatementhandle
COMMIT WORK

See http://www.postgresql.org/docs/8.2/interactive/sql-prepare.html

...Robert

-----Original Message-----
From: pgsql-hackers-owner(at)postgresql(dot)org
[mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Karl O. Pinc
Sent: Friday, May 18, 2007 6:41 PM
To: pgsql-hackers(at)postgresql(dot)org
Subject: [HACKERS] COPY into a view; help w. design & patch

Hi,

I'm attempting a patch that would allow the COPY
command to copy into a view. I needed to code
something so as to get a clue, but the design has
not been discussed so I'm posting here rather
than the patches list.

I had a wee bit of discussion about this on IRC.
There was a suggestion to add generality by
to copying into a INSERT statement. However
the INSERT statement and the COPY statement
both list column names, and other issues came up
and I stopped thinking about it.

Better I think would be to have a syntax like:

COPY INTO ( statement [; , ...]) ( column [, ...] )
FROM { 'filename' | STDIN }
and so forth

Statements would then have $1, $2 type arguments
in them that correspond to the supplied column names.
The column names wouldn't mean much, really they'd
just be an indication of how many columns there are
in the input data. Statements would usually be
INSERT statements, but could be any other
sort as well. (DELETE comes to mind as useful,
as do INSERT statements that use a
SELECT ... WHERE NOT EXISTS ... sort of query
to deal with inserting into 1-to-many relationships.)

I don't know if it would be tricky
to use a semicolon as the token delimiting statements
but I presume not.

If this more general syntax were chosen the copying
into a view would just be syntactic sugar for copying
into an INSERT statement that inserted into the view.

I haven't thought a whole lot about a generalized syntax.
(The return code would be more like that of EXECUTE.)
I've been focusing on copying into a view, which is
what I want. At this point I'd much prefer just implementing
the part that copies into a view as that's already
something I need help with. (!)

As far as the patch itself, it's time to ask for help.

The basic idea is to prepare an INSERT statement into
the view and then run it in a portal for each line
of input. (This should generalize to the "more general"
copy syntax above, I hope.) I've checked
(I think) that my data structures
are the same as what I'd get if I was doing a
PREPARE and then a EXECUTE. There's issues of locking
and snapshots and probably other stuff and I thought
I was following the example of what the prepare/execute
code does but I've obviously got something wrong.

I'm having problems
debugging what I've got. It works up to the point
of the PortalRun() call, then it crashes.
The regression tests are incomplete, done only
to the point where, in copy2.sql, it crashes.

The changes to outfuncs.c can be ignored. It won't
be in the final patch. I've just been using it to
dump various data structures as can be seen from
the various bits of debugging stuff left in my patch.

I've been debugging by running "make check" and
then looking at the copy2.out section of regression.diffs.
This isn't going to work if I'm going to poke
about inside the PortalRun() call. Is there
a make target that will setup the regression
environment so that I can then run the
backend via gdb, or something? I'm working
on a system that has a live pg install on it
and need to be careful not to break that.

Please help with suggestions for design,
code, and how to debug. As you may have noticed,
this is the first time I've messed with the
pg code. I could really use a lot of help.
I've been at this for quite a while and
you can see that I've not even gotten something
to work.

Thanks.

Karl <kop(at)meme(dot)com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein


From: "Karl O(dot) Pinc" <kop(at)meme(dot)com>
To: Robert Haas <Robert(dot)Haas(at)dyntek(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: COPY into a view; help w. design & patch
Date: 2007-05-19 02:48:20
Message-ID: 1179542900l.12174l.1l@mofo
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 05/18/2007 07:05:50 PM, Robert Haas wrote:
> I'm not sure exactly why you want to do with this, but it seems very
> similar to what you can already do using prepared statements. Instead
> of saying
>
> COPY INTO (statement) (column [, ...])
> a1,a2,a3,...,an
> b1,b2,b3,...,bn
> c1,c2,c3,...,cn
> \.

I don't really want to do this. I really want my users
to be able to use the COPY statement without worrying
about whether they are copying into a table or a view.

COPY tableorview (column [, ...]) FROM stdin;
a1,a2,a3,...,an
b1,b2,b3,...,bn
c1,c2,c3,...,cn
\.

I just wanted to cover all the options when going over
the design choices, so came up with the COPY INTO
syntax.

> You could instead say:
>
> BEGIN WORK
> PREPARE somestatementhandle (column[, ...]) AS statement
> EXECUTE somestatementhandle ('a1','a2','a3','...','an');
> EXECUTE somestatementhandle ('b1','b2','b3','...','bn');
> EXECUTE somestatementhandle ('c1','c2','c3','...','cn');
> DEALLOCATE somestatementhandle
> COMMIT WORK

The trouble is that my users have data, in excel spreadsheets
and the like, and it needs to get into PostgreSQL.
The data they have corresponds to views made in the db, but
not to tables. I _could_ make tables that "correspond"
to the views and put BEFORE INSERT triggers on them and
have the triggers insert into the views (or the equalivent),
but then the users would have to use the views for most
things and the "corresponding tables" when doing a COPY
or using the application's data import function. That's
not so good for lots of reasons. Of course I could always
write a special application for each view to import into
each view, but why not have the COPY command there to do
it for me?

Karl <kop(at)meme(dot)com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Karl O(dot) Pinc" <kop(at)meme(dot)com>
Cc: Robert Haas <Robert(dot)Haas(at)dyntek(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: COPY into a view; help w. design & patch
Date: 2007-05-19 17:41:47
Message-ID: 14377.1179596507@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Karl O. Pinc" <kop(at)meme(dot)com> writes:
> I don't really want to do this. I really want my users
> to be able to use the COPY statement without worrying
> about whether they are copying into a table or a view.

But ... but ... the proposed feature entirely fails to achieve that.
Copying into an explicit INSERT statement isn't necessarily a bad idea,
but surely it's not transparent in that way.

> I _could_ make tables that "correspond"
> to the views and put BEFORE INSERT triggers on them and
> have the triggers insert into the views (or the equalivent),
> but then the users would have to use the views for most
> things and the "corresponding tables" when doing a COPY
> or using the application's data import function.

There's been previous discussion of allowing BEFORE INSERT triggers
on views, so long as the triggers always return NULL to suppress
the actual insertion attempt (ie, we'd move the "can't insert into
view" test out of the rewriter and put it downstream of trigger firing
in the executor). So far no one's figured out how to make that idea
work for UPDATE/DELETE, but maybe you could argue that even if it
only worked for INSERT it'd be a useful feature. It'd certainly solve
the problem for COPY.

regards, tom lane


From: "Karl O(dot) Pinc" <kop(at)meme(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <Robert(dot)Haas(at)dyntek(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: COPY into a view; help w. design & patch
Date: 2007-05-21 15:16:57
Message-ID: 1179760617l.28279l.0l@mofo
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 05/19/2007 12:41:47 PM, Tom Lane wrote:
> "Karl O. Pinc" <kop(at)meme(dot)com> writes:
> > I don't really want to do this. I really want my users
> > to be able to use the COPY statement without worrying
> > about whether they are copying into a table or a view.
>
> But ... but ... the proposed feature entirely fails to achieve that.
> Copying into an explicit INSERT statement isn't necessarily a bad
> idea,
> but surely it's not transparent in that way.

Sorry to be confusing. The first sentence in my email
starting this thread was "I'm attempting a patch
that would allow the COPY
command to copy into a view." That's what the
patch does.

The part of the first email that talks
about copying into an INSERT statement was my
attempt to summarize the brief discussion that
occurred on IRC when I went there to talk about
copying into a view. As well as the design related
thoughts I've had since.

If you could COPY into an INSERT statement then
you could have COPYing into a view be syntactic
sugar for COPYing into a INSERT statement that
inserts into the view. Then it'd be transparent.
(And that's how my patch actually works now,
it comes up with an INSERT statement and
prepares it and (tries to) execute it in a portal.)

> There's been previous discussion of allowing BEFORE INSERT triggers
> on views, so long as the triggers always return NULL to suppress
> the actual insertion attempt (ie, we'd move the "can't insert into
> view" test out of the rewriter and put it downstream of trigger firing
> in the executor). So far no one's figured out how to make that idea
> work for UPDATE/DELETE, but maybe you could argue that even if it
> only worked for INSERT it'd be a useful feature. It'd certainly solve
> the problem for COPY.

Disclaimer: At least some of that discussion was my fault.

I'd be happy to go in that direction, but arguing for
having BEFORE INSERT triggers work for
INSERT and not on UPDATE and DELETE seems tough. It's a
little like driving blind in one eye, you can see _partly_
where you're going but not all the way to the end of the road.
I'd be afraid it'd be a bad design decision that would
artificially constrain later work.

Recalling the discussion, plus perhaps a solution:

The problem with UPDATE and DELETE on BEFORE
triggers is coming up with an OLD row. IIRC it is
possible to take the SELECT associated with the view,
add to it the additional WHERE conditions specified
by the UPDATE and the DELETE, and run the resulting
query to come up with the OLD row(s).

The steps would be something like:

Take the raw parse tree of the UPDATE/DELETE WHERE and stick it
into the raw parse tree of a:
"SELECT * FROM viewname"

Send the result through the rule system and run it.

For each row returned do the UPDATE or DELETE, which
would involve calling the BEFORE INSERT trigger
with the row as OLD.

Check that the BEFORE INSERT returns NULL.

The big problem that came up revolves around what
to do if the view does not contain any/enough
primary keys. That would make the BEFORE INSERT
trigger pretty useless when it comes to updating
the underlying tables. I am willing to argue that
the solution to this need not be implemented right
away, along with the rest of the BEFORE INSERT
on views, because there's clear utility when the
views _do_ reveal primary keys. And, perhaps more
importantly, I think there's at least one clean way
to solve the problem.

The trick would be to allow the definition of
the view to "hide" particular columns
in the normal use of the view, but show them in the
OLD rows of the BEFORE INSERT trigger. This would
make primary keys or other data available to the
trigger, without having to reveal them to the
regular users of the view. A syntax like:

CREATE VIEW viewname ( [ BEFORE INSERT ] columnname, ... )
AS query

If "BEFORE INSERT" shows up before "columnname" then
the columnname is not visible in the view, but is visible
in the OLD row of the BEFORE INSERT trigger code.
I'd imagine this means keeping two different query trees
for the view, one with and one without the hidden columns.

Another trouble would be if the view had multiple
columns with the same name, as gotten from
the underlying tables. The solution there is simple;
don't allow BEFORE INSERT triggers on views unless
the columns of the view have unique names.

To keep things sane it'd probably also be a good
idea to have BEFORE INSERT triggers on views
be mutually exclusive with having INSERT/UPDATE/DELETE
rules on them. But I've not thought that through.

Karl <kop(at)meme(dot)com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein


From: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Karl O(dot) Pinc" <kop(at)meme(dot)com>, Robert Haas <Robert(dot)Haas(at)dyntek(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: COPY into a view; help w. design & patch
Date: 2007-05-21 16:23:57
Message-ID: 20070521162357.GD62346@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, May 19, 2007 at 01:41:47PM -0400, Tom Lane wrote:
> > I _could_ make tables that "correspond"
> > to the views and put BEFORE INSERT triggers on them and
> > have the triggers insert into the views (or the equalivent),
> > but then the users would have to use the views for most
> > things and the "corresponding tables" when doing a COPY
> > or using the application's data import function.
>
> There's been previous discussion of allowing BEFORE INSERT triggers
> on views, so long as the triggers always return NULL to suppress
> the actual insertion attempt (ie, we'd move the "can't insert into
> view" test out of the rewriter and put it downstream of trigger firing
> in the executor). So far no one's figured out how to make that idea
> work for UPDATE/DELETE, but maybe you could argue that even if it
> only worked for INSERT it'd be a useful feature. It'd certainly solve
> the problem for COPY.

What about adding COPY support to rules? ISTM if you want to copy into a
view you probably want to insert into it as well, so why not use the
same mechanism? Presumably a COPY rule would also be faster than a
trigger.
--
Jim Nasby decibel(at)decibel(dot)org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)


From: "Karl O(dot) Pinc" <kop(at)meme(dot)com>
To: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <Robert(dot)Haas(at)dyntek(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: COPY into a view; help w. design & patch
Date: 2007-05-21 17:02:29
Message-ID: 1179766949l.2547l.0l@mofo
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 05/21/2007 11:23:57 AM, Jim C. Nasby wrote:

> What about adding COPY support to rules? ISTM if you want to copy into
> a
> view you probably want to insert into it as well, so why not use the
> same mechanism? Presumably a COPY rule would also be faster than a
> trigger.

I'd say there's no difference between the rule you'd use
for COPYing and the rule you'd use for INSERTing,
which is why my patch produces an
INSERT statement and then proceeds to (attempt
to) execute the statement for every row of data
to be copied. If you don't have a rule that allows
INSERT into the view you get (the already existing)
error with a hint that tells you to make an INSERT
rule.

Karl <kop(at)meme(dot)com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein


From: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
To: "Karl O(dot) Pinc" <kop(at)meme(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <Robert(dot)Haas(at)dyntek(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: COPY into a view; help w. design & patch
Date: 2007-05-21 17:17:38
Message-ID: 20070521171738.GE62346@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, May 21, 2007 at 05:02:29PM +0000, Karl O. Pinc wrote:
>
> On 05/21/2007 11:23:57 AM, Jim C. Nasby wrote:
>
> >What about adding COPY support to rules? ISTM if you want to copy into
> >a
> >view you probably want to insert into it as well, so why not use the
> >same mechanism? Presumably a COPY rule would also be faster than a
> >trigger.
>
> I'd say there's no difference between the rule you'd use
> for COPYing and the rule you'd use for INSERTing,
> which is why my patch produces an
> INSERT statement and then proceeds to (attempt
> to) execute the statement for every row of data
> to be copied. If you don't have a rule that allows
> INSERT into the view you get (the already existing)
> error with a hint that tells you to make an INSERT
> rule.

As Tom mentioned, that's very non-transparent to users. You're also
assuming that converting a COPY to a string of INSERTS (which would then
get pushed through the rule system one-by-one) would be as efficient as
just copying into a table. I don't believe that's the case.

I haven't studied the rule code, but at least for the simple case of
redirecting a copy into a view to a single table (ie: a single statement
INSTEAD rule that has no where clause) the copy command should be able
to be changed by the rule so that it's just inserting into a different
table. The performance should then be the same as if you copied directly
into that table in the first place.

This doesn't mean that a row-by-row capability (or the ability to have
COPY generate insert statements) would be bad, but they are not the same
as a simple rewrite of a COPY command (ie: adding COPY support to rules).
--
Jim Nasby decibel(at)decibel(dot)org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)


From: "Karl O(dot) Pinc" <kop(at)meme(dot)com>
To: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <Robert(dot)Haas(at)dyntek(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: COPY into a view; help w. design & patch
Date: 2007-05-21 18:03:41
Message-ID: 1179770621l.2547l.1l@mofo
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 05/21/2007 12:17:38 PM, Jim C. Nasby wrote:
> On Mon, May 21, 2007 at 05:02:29PM +0000, Karl O. Pinc wrote:
> >
> > On 05/21/2007 11:23:57 AM, Jim C. Nasby wrote:
> >
> > >What about adding COPY support to rules? ISTM if you want to copy
> into
> > >a
> > >view you probably want to insert into it as well, so why not use
> the
> > >same mechanism? Presumably a COPY rule would also be faster than a
> > >trigger.
> >
> > I'd say there's no difference between the rule you'd use
> > for COPYing and the rule you'd use for INSERTing,
> > which is why my patch produces an
> > INSERT statement and then proceeds to (attempt
> > to) execute the statement for every row of data
> > to be copied. If you don't have a rule that allows
> > INSERT into the view you get (the already existing)
> > error with a hint that tells you to make an INSERT
> > rule.
>
> As Tom mentioned, that's very non-transparent to users.
...

I don't think we understand each other. (I sure don't
understand the non-transparency comment above. I thought
Tom said that in regards having to write a special/new
COPY syntax in order to insert into a view, rather than
just using a name of a view instead of a name of a table.)

When I say I write and execute an INSERT statement I mean
that the INSERT statement into the view is executed just as if the user
wrote it -- it is passed through the rule system and turns into
whatever INSERT or other statements the user has
associated with INSERTing into the view. The INSERT
statement must only be passed through the rule system once,
the resulting prepared statement is executed for every line
of COPY input. This is exactly what you propose when
you say COPY should go through the rule system, except
that I'm using the INSERT rule to do the COPY rather
than have a separate COPY rule. Or maybe not, see below.

You're also
> assuming that converting a COPY to a string of INSERTS (which would
> then
> get pushed through the rule system one-by-one) would be as efficient
> as
> just copying into a table. I don't believe that's the case.

I'm sure it's not. But (IMO) anybody who's really concerned about
efficency shouldn't be using a view anyhow. It's easy
enough to run the raw data through a awk script or something
and COPY into the underlying tables. Views are for making things
easier to do. It's not necessary for them to be as fast as possible
in all cases. Again, in my opinion.

> I haven't studied the rule code, but at least for the simple case of
> redirecting a copy into a view to a single table (ie: a single
> statement
> INSTEAD rule that has no where clause) the copy command should be able
> to be changed by the rule so that it's just inserting into a different
> table. The performance should then be the same as if you copied
> directly
> into that table in the first place.

The problem comes when the user writes rules that insert into
mutiple tables, or do other random things. At that point
you just want to "do what the user asked" when inserting
each row of data into the view because you really don't
know what the rules are going to expand into.

>
> This doesn't mean that a row-by-row capability (or the ability to have
> COPY generate insert statements) would be bad, but they are not the
> same
> as a simple rewrite of a COPY command (ie: adding COPY support to
> rules).

Are you talking about having a COPY statement rewrite into a bunch
of COPY statments, and then applying the input data to each copy
statement in turn? That sounds feasible. There would be a certain
disconnect between such a COPY rule and the rest of the rule
system. All the other sorts of rules can expand into any kind
of statement. You could, in theory, have an INSERT rule
that deletes a row from table B for every row inserted into
table A. (Probably in addition to inserting into table A.)
COPY rules couldn't work that way. At least not without
all sorts of implimentation complication. Maybe it doesn't
matter; you could argue that INSERT rules shouldn't do anything
but INSERT, ever. But that's not enforced now.

It'd also be a little wierd to have an INSERT rule that behaves
differently from a COPY rule, updating different tables or whatever.

Karl <kop(at)meme(dot)com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Karl O(dot) Pinc" <kop(at)meme(dot)com>
Cc: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>, Robert Haas <Robert(dot)Haas(at)dyntek(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: COPY into a view; help w. design & patch
Date: 2007-05-21 18:59:36
Message-ID: 15765.1179773976@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Karl O. Pinc" <kop(at)meme(dot)com> writes:
> When I say I write and execute an INSERT statement I mean
> that the INSERT statement into the view is executed just as if the user
> wrote it -- it is passed through the rule system and turns into
> whatever INSERT or other statements the user has
> associated with INSERTing into the view.

The problem with this line of argument is that it starts from the premise
that rule support for INSERTs is fine and dandy, and all we lack is that
COPY isn't paying attention to it. This premise is not in accord with
reality --- reality is that the rule system sucks for a number of
reasons, the main one being multiple-evaluation risks. I can't count
how many times I've told newbies to forget trying to use a rule and
instead use a trigger for whatever they were trying to accomplish.
Take a look also at the so-far-failed attempts to implement SQL-spec
updatable views on the basis of the current rule system.

I think that the wave of the future is probably to figure out a way to
provide trigger support for views. Or maybe we must throw out the
current rule system and start over. Neither plan makes it sound
attractive to make COPY depend on the current rule system.

regards, tom lane


From: "Karl O(dot) Pinc" <kop(at)meme(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>, Robert Haas <Robert(dot)Haas(at)dyntek(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: COPY into a view; help w. design & patch
Date: 2007-05-21 19:46:54
Message-ID: 1179776814l.2547l.2l@mofo
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 05/21/2007 01:59:36 PM, Tom Lane wrote:

> I think that the wave of the future is probably to figure out a way to
> provide trigger support for views.

I put forward a possible way to do trigger support
for views in a previous email and would appreciate comment.
(http://archives.postgresql.org/pgsql-hackers/2007-05/msg00906.php)

Really, I'd like to know if I should just give up and move
on or if I can do something to help advance Postgres towards
something that will solve my copy-into-a-view problems.

Karl <kop(at)meme(dot)com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Karl O(dot) Pinc" <kop(at)meme(dot)com>
Cc: Robert Haas <Robert(dot)Haas(at)dyntek(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: COPY into a view; help w. design & patch
Date: 2007-05-21 21:52:35
Message-ID: 29235.1179784355@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

"Karl O. Pinc" <kop(at)meme(dot)com> writes:
> On 05/19/2007 12:41:47 PM, Tom Lane wrote:
>> There's been previous discussion of allowing BEFORE INSERT triggers
>> on views, so long as the triggers always return NULL to suppress
>> the actual insertion attempt (ie, we'd move the "can't insert into
>> view" test out of the rewriter and put it downstream of trigger firing
>> in the executor). So far no one's figured out how to make that idea
>> work for UPDATE/DELETE, but maybe you could argue that even if it
>> only worked for INSERT it'd be a useful feature. It'd certainly solve
>> the problem for COPY.

> Disclaimer: At least some of that discussion was my fault.

> I'd be happy to go in that direction, but arguing for
> having BEFORE INSERT triggers work for
> INSERT and not on UPDATE and DELETE seems tough. It's a
> little like driving blind in one eye, you can see _partly_
> where you're going but not all the way to the end of the road.
> I'd be afraid it'd be a bad design decision that would
> artificially constrain later work.

Well, as I noted earlier, making triggers work for views seems like
it'd be more useful than adding more dependencies on the current
rule system.

> The problem with UPDATE and DELETE on BEFORE
> triggers is coming up with an OLD row.

No, that's the easy part, since as you note you can just compute the
view and extract the rows meeting the WHERE condition. And in the
UPDATE case you need to compute a NEW row with the updated values
substituted, but that seems just a small matter of programming as well.
The issues I see are:

* Existing triggers expect to see a ctid identifier for each updated
or deleted row. Is it OK to just say that you don't get that in a
trigger for a view?

* What about INSERT/UPDATE/DELETE RETURNING? The current definition of
triggers gives them no way to specify what is computed for RETURNING.

> The big problem that came up revolves around what
> to do if the view does not contain any/enough
> primary keys.

I don't think that's a problem we need to solve. If a user wants to
update his views it's up to him to define them in a way that gives
the trigger enough information to do its job. This is the SQL spec's
approach (no pkey -> view not updatable), and I don't think we need
to invent weird new concepts to do better.

regards, tom lane


From: "Karl O(dot) Pinc" <kop(at)meme(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <Robert(dot)Haas(at)dyntek(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: COPY into a view; help w. design & patch
Date: 2007-05-24 19:25:59
Message-ID: 1180034759l.22160l.0l@mofo
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 05/21/2007 04:52:35 PM, Tom Lane wrote:
> "Karl O. Pinc" <kop(at)meme(dot)com> writes:
> > On 05/19/2007 12:41:47 PM, Tom Lane wrote:
> >> There's been previous discussion of allowing BEFORE INSERT triggers
> >> on views, so long as the triggers always return NULL to suppress
> >> the actual insertion attempt (ie, we'd move the "can't insert into
> >> view" test out of the rewriter and put it downstream of trigger
> firing
> >> in the executor). So far no one's figured out how to make that
> idea
> >> work for UPDATE/DELETE, but maybe you could argue that even if it
> >> only worked for INSERT it'd be a useful feature. It'd certainly
> solve
> >> the problem for COPY.

> > The problem with UPDATE and DELETE on BEFORE
> > triggers is coming up with an OLD row.
>
> No, that's the easy part, since as you note you can just compute the
> view and extract the rows meeting the WHERE condition. And in the
> UPDATE case you need to compute a NEW row with the updated values
> substituted, but that seems just a small matter of programming as
> well.
> The issues I see are:
>
> * Existing triggers expect to see a ctid identifier for each updated
> or deleted row. Is it OK to just say that you don't get that in a
> trigger for a view?

In addition to computing the view and extracting the rows meeting
the WHERE condition, the resultant rows _could_ be put
(by what the parser generates) into a
temporary table (deleted when the command finishes). That
would result in ctid identifiers, but I'm unable to tell if this
is a good idea or not. (It's probably only a good idea if the
answer is "no, it's not ok if triggers get NULL ctid values",
but I'm out out of my depth vis the pg source code and
got tired and figured I'd post this.)

>
> * What about INSERT/UPDATE/DELETE RETURNING? The current definition
> of
> triggers gives them no way to specify what is computed for RETURNING.

If the BEFORE INSERT triggers actually returned rows, and the
INSERT/UPDATE/DELETE code in the executor (execMain.c)
tested for a view and if found did not do the operation
on the heap, then this problem goes away.
That would also get rid of the programming required to produce
a NEW row, because the trigger would produce it. It would also
allow AFTER triggers, for whatever good that would do, and
otherwise make triggers on views just like any other trigger.
(Excepting that it's up to the triggers to actually change
database data because the executor does not do it.)

The RETURNING would be computed in the executor just like always.

To keep the executor from having to do (much) testing, the
parser and executor _could_ add new CmdType-s of
CMD_INSERT_VIEW, CMD_UPDATE_VIEW, and CMD_DELETE_VIEW
to go with corresponding ExecInsertView(), ExecUpdateView(),
and ExecDeleteView() functions. These new functions would
be just like ExecInsert(), etc., but not actually modify the
heap. There'd be a little code duplication, which could
probably be gotten rid of by using macros. There'd be fussing
elsewhere that CMD_INSERT, CMD_UPDATE, and CMD_DELETE appear
to make sure that CMD_INSERT_VIEW, etc., are all recognized.

Karl <kop(at)meme(dot)com>
Free Software: "You don't pay back, you pay forward."
-- Robert A. Heinlein