Re: a performence question

From: Filip Rembiałkowski <plk(dot)zuber(at)gmail(dot)com>
To: "Rafal Pietrak" <rafal(at)zorro(dot)isa-geek(dot)com>
Cc: PostgreSQL <pgsql-general(at)postgresql(dot)org>
Subject: Re: a performence question
Date: 2008-09-05 15:04:00
Message-ID: 92869e660809050803o5340c4c2se4d7e593c5533d63@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

2008/9/4 Rafal Pietrak <rafal(at)zorro(dot)isa-geek(dot)com>:
> Hi,
>
> Maybe someone on this list actually have already tried this:
>
> I'm planning to make a partitioned database. From Postgres documentation
> I can see, that there are basically two methods to route INSERTS into
> partitioned table:
> one. is a TRIGGER
> other. is a RULE
>
> My Table will have over 1000 partitions. Some not so big, but
> significant number of them will be of multimillion rows. Partitioning
> will be done using a single column, on equality.... meaning:
>
> CREATE TABLE mainlog (sel int, tm timestamp, info text,...);
> CREATE TABLE mainlog_p1 (CHECK (sel=1)) INHERITS (mainlog);
> CREATE TABLE mainlog_p2 (CHECK (sel=2)) INHERITS (mainlog);
> ...etc.
>
> If I route INSERT with a TRIGGER, the function would look like:
> CREATE .... TRIGGER...AS $$ DECLARE x RECORD; BEGIN
> SELECT id INTO x FROM current_route; NEW.sel := x.id;
> IF NEW.sel = 1 THEN INSERT INTO mainlog_p1 VALUES (NEW.*);
> ELSE IF NEW.sel = 2 THEN INSERT INTO mainlog_p2 VALUES (NEW.*);
> ....
> END IF;
> RETURN NULL;
> $$;
>
> If I route INSETS with a RULE, I'd have something like 1000 rules hooked
> up to MAINLOG, all looking like:
> CREATE RULE .... ON INSERT ... WHERE EXISTS(SELECT 1 FROM current_route
> WHERE id = 1) DO INSTEAD INSERT INTO mainlog_p1 VALUES SELECT
> x.id,new.tm... FROM (SELECT id FROM current_route) x;
> ... and similar RULES for cases "WHERE id = 2", etc.
>
> My question is, where should I expect better performance on those
> INSERTS).
>
> I would prefer a set of RULES (as I wouldn't like to rewrite TRIGGER
> function every time I add a partition ... a thousand lines function),
> but since they all must make a select query on CURRENT_ROUTE table, may
> be that will not be particularly effective? The TRIGGER function does a
> single query - may be it'll be faster? I was planning to generate some
> dummy data and run a simulation, but may be someone already has that
> experience? Or maybe the TRIGGER should look differently? Or the set of
> RULES?
>

I had a bit spare time so I tested this

see http://filip.rembialkowski.net/postgres-partitioning-performance-rules-vs-triggers/

seems that in your scenario trigger will be better.

but If I had to do this, and if performance was very important, I
would move "partition selection" logic out of the INSERT phase. the
application can know this before the actual insert. unless you want to
shift selections very often...

--
Filip Rembiałkowski

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Sam Mason 2008-09-05 15:10:47 Re: large inserts and fsync
Previous Message Aaron Burnett 2008-09-05 13:16:41 large inserts and fsync