Re: Performance on Bulk Insert to Partitioned Table

From: Evgeny Shishkin <itparanoia(at)gmail(dot)com>
To: Charles Gomes <charlesrg(at)outlook(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Ondrej Ivanič <ondrej(dot)ivanic(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Performance on Bulk Insert to Partitioned Table
Date: 2012-12-24 17:11:07
Message-ID: CA08C4B0-CD63-4C1C-B709-0AD888524CCD@gmail.com
Lists: pgsql-performance


On Dec 24, 2012, at 9:07 PM, Charles Gomes <charlesrg(at)outlook(dot)com> wrote:

> By the way, I've just rewritten the code to target the partitions individually, and I got an almost 4x improvement.
> Shouldn't processing the trigger be faster? I would understand if there were no CPU left, but there is plenty of CPU to spare.

I saw your 20% idle CPU and raised my eyebrows.
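
For context, "targeting the partitions individually" means the loader inserts into the child tables directly, so the parent's trigger never fires. A minimal sketch (the column list here is hypothetical; the actual quotes schema is not shown in the thread):

    -- Bypass the parent table and its trigger entirely:
    -- the loader picks the child table by date on the client side.
    INSERT INTO quotes_2012_09_10 (received_time, symbol, price)
    VALUES ('2012-09-10 09:30:00', 'ABC', 10.25);

Each row then skips the per-row plpgsql call, which is presumably where the 4x gain comes from.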

> It seems there will be no other way to speed this up unless the insert code is partition aware.
>
> ----------------------------------------
>> From: charlesrg(at)outlook(dot)com
>> To: jeff(dot)janes(at)gmail(dot)com
>> CC: ondrej(dot)ivanic(at)gmail(dot)com; pgsql-performance(at)postgresql(dot)org
>> Subject: Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
>> Date: Mon, 24 Dec 2012 10:51:12 -0500
>>
>> ________________________________
>>> Date: Sun, 23 Dec 2012 14:55:16 -0800
>>> Subject: Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
>>> From: jeff(dot)janes(at)gmail(dot)com
>>> To: charlesrg(at)outlook(dot)com
>>> CC: ondrej(dot)ivanic(at)gmail(dot)com; pgsql-performance(at)postgresql(dot)org
>>>
>>> On Thursday, December 20, 2012, Charles Gomes wrote:
>>> True, that's how I feel as well. I will look into translating the
>>> trigger to C if I can find good examples; that should speed things up.
>>>
>>> I think your performance bottleneck is almost certainly the dynamic
>>> SQL. Using C to generate that dynamic SQL isn't going to help much,
>>> because it is still the SQL engine that has to parse, plan, and execute
>>> it.
>>>
>>> Are the vast majority of your inserts done on any given day for records
>>> from that same day or the one before; or are they evenly spread over
>>> the preceding year? If the former, you could use static SQL in IF and
>>> ELSIF for those days, and fall back on the dynamic SQL for the
>>> exceptions in the ELSE block. Of course that means you have to update
>>> the trigger every day.
>>>
>>>
>>> Using rules would be totally bad as I'm partitioning daily and after
>>> one year having 365 lines of IF won't be fun to maintain.
>>>
>>> Maintaining 365 lines of IF is what Perl was invented for. That goes
>>> for triggers w/ static SQL as well as for rules.
>>>
>>> If you do the static SQL in a trigger and the dates of the records are
>>> evenly scattered over the preceding year, make sure your IFs are nested
>>> like a binary search, not a linear search. And if they are mostly for
>>> "today's" date, then make sure you search backwards.
>>>
>>> Cheers,
>>>
>>> Jeff
>>
>> Jeff, I've changed the code from dynamic to:
>>
>> CREATE OR REPLACE FUNCTION quotes_insert_trigger()
>> RETURNS trigger AS $$
>> DECLARE
>> r_date text;
>> BEGIN
>> r_date := to_char(new.received_time, 'YYYY_MM_DD');
>> -- Note: a date with no matching partition will raise CASE_NOT_FOUND.
>> case r_date
>> when '2012_09_10' then
>> insert into quotes_2012_09_10 values (NEW.*);
>> when '2012_09_11' then
>> insert into quotes_2012_09_11 values (NEW.*);
>> when '2012_09_12' then
>> insert into quotes_2012_09_12 values (NEW.*);
>> when '2012_09_13' then
>> insert into quotes_2012_09_13 values (NEW.*);
>> when '2012_09_14' then
>> insert into quotes_2012_09_14 values (NEW.*);
>> when '2012_09_15' then
>> insert into quotes_2012_09_15 values (NEW.*);
>> when '2012_09_16' then
>> insert into quotes_2012_09_16 values (NEW.*);
>> when '2012_09_17' then
>> insert into quotes_2012_09_17 values (NEW.*);
>> when '2012_09_18' then
>> insert into quotes_2012_09_18 values (NEW.*);
>> when '2012_09_19' then
>> insert into quotes_2012_09_19 values (NEW.*);
>> when '2012_09_20' then
>> insert into quotes_2012_09_20 values (NEW.*);
>> when '2012_09_21' then
>> insert into quotes_2012_09_21 values (NEW.*);
>> when '2012_09_22' then
>> insert into quotes_2012_09_22 values (NEW.*);
>> when '2012_09_23' then
>> insert into quotes_2012_09_23 values (NEW.*);
>> when '2012_09_24' then
>> insert into quotes_2012_09_24 values (NEW.*);
>> end case;
>> RETURN NULL;
>> END;
>> $$
>> LANGUAGE plpgsql;
>>
>>
>> However, I got no speed improvement.
>> I need to keep two weeks' worth of partitions at a time; that's why there are so many WHEN branches.
>> I wish Postgres could automate the partitioning process natively like other SQL databases do.
>>
>> Thank you guys for your help.
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
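
Jeff's binary-search suggestion could look something like the following sketch (hypothetical: it compares the timestamp directly instead of a formatted string, and only a few of the fourteen dates are shown; the inner branches would nest further in the same pattern):

    IF new.received_time >= DATE '2012-09-17' THEN
        IF new.received_time >= DATE '2012-09-21' THEN
            -- nest further here for 09_21 .. 09_24
            INSERT INTO quotes_2012_09_21 VALUES (NEW.*);
        ELSE
            -- nest further here for 09_17 .. 09_20
            INSERT INTO quotes_2012_09_17 VALUES (NEW.*);
        END IF;
    ELSE
        -- nest further here for 09_10 .. 09_16
        INSERT INTO quotes_2012_09_10 VALUES (NEW.*);
    END IF;
    RETURN NULL;

With two weeks of partitions, each insert is reached after roughly four comparisons instead of up to fourteen sequential WHEN tests.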
