Re: table partitioning pl/pgsql helpers

Lists: pgsql-generalpgsql-hackers
From: Enrico Sirola <enrico(dot)sirola(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: table partitioning pl/pgsql helpers
Date: 2007-05-30 16:55:29
Message-ID: 36FCCC20-B1D4-40F2-AD52-CE76239BF344@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

Hello,

I'm trying to write a trigger on insert which should insert the row
in another
table. The table on which to insert the row should be selected at
runtime and
it is not know in advance. For example, let's say we have a table with
two columns, a date and an integer. a row is inserted into table XXX and

CREATE TABLE XXX
(
refdate date;
x2 integer;
)

when the statement

insert into XXX VALUES ('2007-11-11', 1);

is executed, a trigger (or rule) should be fired to insert the row into
table XXX_20071111 (having the same schema). If the XXX_* tables are
created beforehand
this is not a problem because you can set up a rule for each date
performing
the needed insert (as documented in http://www.postgresql.org/docs/
8.1/interactive/ddl-partitioning.html)

The problem arises when you try to extend the trigger in order to
also dinamically
perform table creation is the XXX_20071111 doesn't exist:

I have, for example:

CREATE OR REPLACE FUNCTION add_child_table(parent_name varchar, d date)
RETURNS varchar AS $$
DECLARE
new_table_name varchar;
BEGIN
raise notice '%', 'creating table';
new_table_name := date2tblname(parent_name, d); -- converts
table name and date into child table name
execute 'CREATE TABLE ' || new_table_name || ' ( ) INHERITS
(' || parent_name || ')';
execute 'CREATE RULE '
|| new_table_name
|| '_insert AS ON INSERT TO '
|| parent_name
|| ' WHERE ( refdate = DATE ' /* refdate is the
field we use to partition */
|| '''' || d || ''''
|| ' ) DO INSTEAD INSERT INTO '
|| new_table_name
|| ' VALUES ( NEW.* )';
return new_table_name;
END;
$$ LANGUAGE plpgsql;

I can use the above to add a child table and the rule to implement
partitioning

CREATE OR REPLACE FUNCTION child_creation_trigger()
RETURNS "trigger" AS $$
BEGIN
IF ( child_exist(TG_RELNAME::text, NEW.refdate) = false ) THEN
raise notice '%', 'creating ' || TG_RELNAME::text || '
for ' || NEW.refdate::text;
perform add_child_table(TG_RELNAME::text, NEW.refdate);
--insert into ' || child_table_name || ' values
( NEW.* );
RETURN NEW;
ELSE
raise notice '%', 'NOT creating ' || TG_RELNAME::text
|| ' for ' || NEW.refdate::text;
RETURN NEW;
END IF;
END;
$$ LANGUAGE plpgsql;

The trigger function is hooked to the to-be-partitioned table

CREATE TRIGGER XXX_trigger BEFORE INSERT
ON XXX FOR EACH ROW
execute procedure child_creation_trigger ();

now, every time I insert a tuple into XXX, the trigger is fired and
checks if the needed table exists or
not. If not, it creates the table and rule and goes on. The problem
is that in this case the first
row is inserted into the XXX table, not in the (just created)
XXX_<refdate> . Then I tried to insert
the row myself from the trigger body (and return null in order to
skip the original insertion), but
I'm not able to do it (see the commented insert in the above IF
clause) because I can't properly
quote the target table name.
I shoud perform a

insert into child_table_name values (NEW.*);

obviously written like this the plpgsql complains at runtime because
child_table_name is not a table name.
If, on the other side, I dynamically create the query like in

execute 'insert into ' || child_table_name || ' values (NEW.*)';

it complains because NEW in the execution context is unknown.

This should be a rather common problem... Isn't it? Is there a
canonical way to
solve it? Maybe there's a trivial answer, but I have no plpgsql
programming
experience.

Thanks a lot in advance,

Enrico Sirola
enrico(dot)sirola(at)gmail(dot)com


From: Robert Treat <robert(at)omniti(dot)com>
To: pgsql-general(at)postgresql(dot)org
Cc: Enrico Sirola <enrico(dot)sirola(at)gmail(dot)com>
Subject: Re: table partitioning pl/pgsql helpers
Date: 2007-06-01 02:08:01
Message-ID: 200705312208.02184.robert@omniti.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

On Wednesday 30 May 2007 12:55, Enrico Sirola wrote:
> Hello,
>
> I'm trying to write a trigger on insert which should insert the row
> in another
> table. The table on which to insert the row should be selected at
> runtime and
> it is not know in advance. For example, let's say we have a table with
> two columns, a date and an integer. a row is inserted into table XXX and
>
> CREATE TABLE XXX
> (
> refdate date;
> x2 integer;
> )
>
> when the statement
>
> insert into XXX VALUES ('2007-11-11', 1);
>
> is executed, a trigger (or rule) should be fired to insert the row into
> table XXX_20071111 (having the same schema). If the XXX_* tables are
> created beforehand
> this is not a problem because you can set up a rule for each date
> performing
> the needed insert (as documented in http://www.postgresql.org/docs/
> 8.1/interactive/ddl-partitioning.html)
>
> The problem arises when you try to extend the trigger in order to
> also dinamically
> perform table creation is the XXX_20071111 doesn't exist:
>
> I have, for example:
>
>
> CREATE OR REPLACE FUNCTION add_child_table(parent_name varchar, d date)
> RETURNS varchar AS $$
> DECLARE
> new_table_name varchar;
> BEGIN
> raise notice '%', 'creating table';
> new_table_name := date2tblname(parent_name, d); -- converts
> table name and date into child table name
> execute 'CREATE TABLE ' || new_table_name || ' ( ) INHERITS
> (' || parent_name || ')';
> execute 'CREATE RULE '
>
> || new_table_name
> || '_insert AS ON INSERT TO '
> || parent_name
> || ' WHERE ( refdate = DATE ' /* refdate is the
>
> field we use to partition */
>
> || '''' || d || ''''
> || ' ) DO INSTEAD INSERT INTO '
> || new_table_name
> || ' VALUES ( NEW.* )';
>
> return new_table_name;
> END;
> $$ LANGUAGE plpgsql;
>
> I can use the above to add a child table and the rule to implement
> partitioning
>
> CREATE OR REPLACE FUNCTION child_creation_trigger()
> RETURNS "trigger" AS $$
> BEGIN
> IF ( child_exist(TG_RELNAME::text, NEW.refdate) = false ) THEN
> raise notice '%', 'creating ' || TG_RELNAME::text || '
> for ' || NEW.refdate::text;
> perform add_child_table(TG_RELNAME::text, NEW.refdate);
> --insert into ' || child_table_name || ' values
> ( NEW.* );
> RETURN NEW;
> ELSE
> raise notice '%', 'NOT creating ' || TG_RELNAME::text
>
> || ' for ' || NEW.refdate::text;
>
> RETURN NEW;
> END IF;
> END;
> $$ LANGUAGE plpgsql;
>
> The trigger function is hooked to the to-be-partitioned table
>
>
> CREATE TRIGGER XXX_trigger BEFORE INSERT
> ON XXX FOR EACH ROW
> execute procedure child_creation_trigger ();
>
>
> now, every time I insert a tuple into XXX, the trigger is fired and
> checks if the needed table exists or
> not. If not, it creates the table and rule and goes on. The problem
> is that in this case the first
> row is inserted into the XXX table, not in the (just created)
> XXX_<refdate> . Then I tried to insert
> the row myself from the trigger body (and return null in order to
> skip the original insertion), but
> I'm not able to do it (see the commented insert in the above IF
> clause) because I can't properly
> quote the target table name.
> I shoud perform a
>
> insert into child_table_name values (NEW.*);
>
> obviously written like this the plpgsql complains at runtime because
> child_table_name is not a table name.
> If, on the other side, I dynamically create the query like in
>
> execute 'insert into ' || child_table_name || ' values (NEW.*)';
>
> it complains because NEW in the execution context is unknown.
>
> This should be a rather common problem... Isn't it? Is there a
> canonical way to
> solve it? Maybe there's a trivial answer, but I have no plpgsql
> programming
> experience.
>

We I set these up for our clients, I typically seperate the partition creation
piece from the data insertion piece. (Mostly as partition creation,
especially with rules, is a table locking event, which is better done in a
non-critical path). If you really must do it all in one go, you'll have a
much better chance of accomplishing this using all triggers (and triggers are
better anyway)... i think you could do it with a combination of rules and a
trigger (on insert to parent, create new parition and insert into it and
delete from parent) but it would certainly need testing to make sure you dont
have multi-rule evaluation... course since your making a trigger anyway...

--
Robert Treat
Database Architect
http://www.omniti.com


From: Jim Nasby <decibel(at)decibel(dot)org>
To: Enrico Sirola <enrico(dot)sirola(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: [HACKERS] table partitioning pl/pgsql helpers
Date: 2007-06-01 22:53:56
Message-ID: F1465A3F-4136-4050-B486-1CE0FDE69429@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

Dropping -hackers; that list is for development of the database
engine itself.

The problem is that rules will happen before triggers, so what you're
trying to do will never work. Instead, just have the trigger insert
the data into the appropriate table.

On May 30, 2007, at 9:55 AM, Enrico Sirola wrote:

> Hello,
>
> I'm trying to write a trigger on insert which should insert the row
> in another
> table. The table on which to insert the row should be selected at
> runtime and
> it is not know in advance. For example, let's say we have a table with
> two columns, a date and an integer. a row is inserted into table
> XXX and
>
> CREATE TABLE XXX
> (
> refdate date;
> x2 integer;
> )
>
> when the statement
>
> insert into XXX VALUES ('2007-11-11', 1);
>
> is executed, a trigger (or rule) should be fired to insert the row
> into
> table XXX_20071111 (having the same schema). If the XXX_* tables
> are created beforehand
> this is not a problem because you can set up a rule for each date
> performing
> the needed insert (as documented in http://www.postgresql.org/docs/
> 8.1/interactive/ddl-partitioning.html)
>
> The problem arises when you try to extend the trigger in order to
> also dinamically
> perform table creation is the XXX_20071111 doesn't exist:
>
> I have, for example:
>
>
> CREATE OR REPLACE FUNCTION add_child_table(parent_name varchar, d
> date)
> RETURNS varchar AS $$
> DECLARE
> new_table_name varchar;
> BEGIN
> raise notice '%', 'creating table';
> new_table_name := date2tblname(parent_name, d); -- converts
> table name and date into child table name
> execute 'CREATE TABLE ' || new_table_name || ' ( ) INHERITS
> (' || parent_name || ')';
> execute 'CREATE RULE '
> || new_table_name
> || '_insert AS ON INSERT TO '
> || parent_name
> || ' WHERE ( refdate = DATE ' /* refdate is the
> field we use to partition */
> || '''' || d || ''''
> || ' ) DO INSTEAD INSERT INTO '
> || new_table_name
> || ' VALUES ( NEW.* )';
> return new_table_name;
> END;
> $$ LANGUAGE plpgsql;
>
> I can use the above to add a child table and the rule to implement
> partitioning
>
> CREATE OR REPLACE FUNCTION child_creation_trigger()
> RETURNS "trigger" AS $$
> BEGIN
> IF ( child_exist(TG_RELNAME::text, NEW.refdate) = false ) THEN
> raise notice '%', 'creating ' || TG_RELNAME::text ||
> ' for ' || NEW.refdate::text;
> perform add_child_table(TG_RELNAME::text, NEW.refdate);
> --insert into ' || child_table_name || ' values
> ( NEW.* );
> RETURN NEW;
> ELSE
> raise notice '%', 'NOT creating ' || TG_RELNAME::text
> || ' for ' || NEW.refdate::text;
> RETURN NEW;
> END IF;
> END;
> $$ LANGUAGE plpgsql;
>
> The trigger function is hooked to the to-be-partitioned table
>
>
> CREATE TRIGGER XXX_trigger BEFORE INSERT
> ON XXX FOR EACH ROW
> execute procedure child_creation_trigger ();
>
>
> now, every time I insert a tuple into XXX, the trigger is fired and
> checks if the needed table exists or
> not. If not, it creates the table and rule and goes on. The problem
> is that in this case the first
> row is inserted into the XXX table, not in the (just created)
> XXX_<refdate> . Then I tried to insert
> the row myself from the trigger body (and return null in order to
> skip the original insertion), but
> I'm not able to do it (see the commented insert in the above IF
> clause) because I can't properly
> quote the target table name.
> I shoud perform a
>
> insert into child_table_name values (NEW.*);
>
> obviously written like this the plpgsql complains at runtime
> because child_table_name is not a table name.
> If, on the other side, I dynamically create the query like in
>
> execute 'insert into ' || child_table_name || ' values (NEW.*)';
>
> it complains because NEW in the execution context is unknown.
>
> This should be a rather common problem... Isn't it? Is there a
> canonical way to
> solve it? Maybe there's a trivial answer, but I have no plpgsql
> programming
> experience.
>
> Thanks a lot in advance,
>
> Enrico Sirola
> enrico(dot)sirola(at)gmail(dot)com
>
>
>

--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)


From: Enrico Sirola <enrico(dot)sirola(at)gmail(dot)com>
To: Jim Nasby <decibel(at)decibel(dot)org>
Cc: "pgsql-general(at)postgresql(dot)org General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: [HACKERS] table partitioning pl/pgsql helpers
Date: 2007-06-04 08:53:36
Message-ID: 5AAEE84F-D900-4E63-8C23-1DE3126818D8@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers


Il giorno 02/giu/07, alle ore 00:53, Jim Nasby ha scritto:

> Dropping -hackers; that list is for development of the database
> engine itself.

ok, sorry
e.


From: Enrico Sirola <enrico(dot)sirola(at)gmail(dot)com>
To: Robert Treat <robert(at)omniti(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: table partitioning pl/pgsql helpers
Date: 2007-06-04 09:30:50
Message-ID: 703C285B-6BA9-4C4F-8A61-BABFEFDE5D5B@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

Hi Robert,

Il giorno 01/giu/07, alle ore 04:08, Robert Treat ha scritto:

[...]

> We I set these up for our clients, I typically seperate the
> partition creation
> piece from the data insertion piece. (Mostly as partition creation,
> especially with rules, is a table locking event, which is better
> done in a
> non-critical path). If you really must do it all in one go, you'll
> have a

I see, and now I agree with you it's better to decouple partition
creation
and data insertion.

> much better chance of accomplishing this using all triggers (and
> triggers are
> better anyway)... i think you could do it with a combination of
> rules and a
> trigger (on insert to parent, create new parition and insert into
> it and
> delete from parent) but it would certainly need testing to make
> sure you dont
> have multi-rule evaluation... course since your making a trigger
> anyway...

Even more importantly, I just discovered (trying and then reading
pgsql docs) that the rule
system is completely bypassed by the COPY FROM statement, so I think
I'll rewrite
everything using some sort of trigger-generating procedure because I
want this stuff
to work transparently (and we do lots of copy from).
Thanks for the advices,

Enrico Sirola
enrico(dot)sirola(at)gmail(dot)com