Re: Performance on Bulk Insert to Partitioned Table

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Charles Gomes <charlesrg(at)outlook(dot)com>
Cc: Ondrej Ivanič <ondrej(dot)ivanic(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Performance on Bulk Insert to Partitioned Table
Date: 2012-12-27 04:03:33
Message-ID: CAMkU=1zJ7=DtRv8ikRH1Z8HcVwTKR+qw9yFF7z8mP__OL3D-9A@mail.gmail.com
Lists: pgsql-performance

On Monday, December 24, 2012, Charles Gomes wrote:

> >
> > I think your performance bottleneck is almost certainly the dynamic
> > SQL. Using C to generate that dynamic SQL isn't going to help much,
> > because it is still the SQL engine that has to parse, plan, and execute
> > it.
> >
>
> Jeff, I've changed the code from dynamic to:
>
> CREATE OR REPLACE FUNCTION quotes_insert_trigger()
> RETURNS trigger AS $$
> DECLARE
> r_date text;
> BEGIN
> r_date = to_char(new.received_time, 'YYYY_MM_DD');
> case r_date
> when '2012_09_10' then
> insert into quotes_2012_09_10 values (NEW.*) using new;
> return;
> ...

> However I've got no speed improvement.
> I need to keep two weeks worth of partitions at a time, that's why all the
> WHEN statements.
>

The 'using new' and return without argument are syntax errors.
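
For what it's worth, a corrected sketch (assuming a BEFORE INSERT trigger on the
parent table, and that returning NULL to suppress the insert into the parent is
the behavior you want) would look something like this:

    CREATE OR REPLACE FUNCTION quotes_insert_trigger()
    RETURNS trigger AS $$
    DECLARE
        r_date text;
    BEGIN
        r_date := to_char(NEW.received_time, 'YYYY_MM_DD');
        CASE r_date
            WHEN '2012_09_10' THEN
                INSERT INTO quotes_2012_09_10 VALUES (NEW.*);
            -- ... one WHEN branch per daily partition ...
            ELSE
                RAISE EXCEPTION 'no partition for received_time %', NEW.received_time;
        END CASE;
        RETURN NULL;  -- row already routed to a child; don't also insert into the parent
    END;
    $$ LANGUAGE plpgsql;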

When I build a model system with those fixed, I get about a 2-fold improvement
over the dynamic SQL performance. Even if your performance did not go up, did
your CPU usage go down? Perhaps you have multiple bottlenecks all sitting at
about the same place, so tackling any one of them at a time doesn't get you
anywhere.

How do both the dynamic SQL and the CASE versions scale with the number of
threads? I think you said you had something like 70 sessions but only 8 CPUs.
That will probably do bad things with contention, and I don't see how using more
connections than CPUs is going to help you here. If the CASE version starts out
faster with a single thread but then flat-lines while the EXECUTE version
catches up, that suggests a different avenue of investigation than if they are
always the same.

> Wish postgres could automate the partition process natively like the other
> sql db.
>

More automation would be nice (i.e. one operation to make both the check
constraints and the trigger, so they can't get out of sync), but would not
necessarily mean faster. I don't know which other db you mean. Last time I
looked at partitioning in mysql, it was only about breaking up the underlying
storage into separate files (without regard to the contents of the rows), so
that is the same as what postgres does automatically. And in Oracle, their
partitioning seemed about the same as postgres's as far as administrative
tedium was concerned. I'm not familiar with how the MS product handles it, and
maybe my experience with the other two is out of date.
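
To make that concrete, the inheritance-based setup that has to be kept in sync
by hand looks roughly like this (assuming the parent table is named quotes and
is partitioned by received_time):

    CREATE TABLE quotes_2012_09_10 (
        CHECK (received_time >= DATE '2012-09-10'
           AND received_time <  DATE '2012-09-11')
    ) INHERITS (quotes);

    -- plus a matching WHEN '2012_09_10' branch in quotes_insert_trigger(),
    -- which is the part that can silently drift out of sync with the CHECK.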

Cheers,

Jeff
