Re: Performance on Bulk Insert to Partitioned Table

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Charles Gomes <charlesrg(at)outlook(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Performance on Bulk Insert to Partitioned Table
Date: 2012-12-21 03:24:09
Message-ID: CAMkU=1x0PpAjCe-C7nG=pxDWeVs3=9wLLQtKdsheOxmaJzK_7Q@mail.gmail.com
Lists: pgsql-performance

On Thursday, December 20, 2012, Charles Gomes wrote:

> Jeff,
>
> The 8288 writes are fine, since the array has a BBU. You see about 4%
> utilization.
>

A BBU is great for latency, but it doesn't do much for throughput unless it
is doing write combining behind the scenes. Is it HDD or SSD behind the
BBU? Have you benchmarked it on randomly scattered 8k writes?

I've seen %util reports that were low while an strace showed obvious IO
freezes, so I don't know how much faith to put in a low %util.

>
> To target directly instead of doing :
> INSERT INTO TABLE VALUES ()
> I use:
> INSERT INTO TABLE_PARTITION_01 VALUES()
>

But how is it deciding which partition to use? Does it have to re-decide
for every row, or does each thread serve only one partition throughout its
life and so make the decision only once?

>
> By targeting it I see a huge performance increase.
>

But is that because targeting bypasses the overhead of the triggers, or
because you are loading the rows in an order that leads to more efficient
index maintenance?
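
For reference, the kind of per-row routing trigger I have in mind looks
roughly like this (just a sketch, with made-up table names and a made-up
date column as the partition key):

CREATE OR REPLACE FUNCTION route_insert() RETURNS trigger AS $$
BEGIN
    -- runs once per row; this IF chain plus the extra insert is the
    -- overhead that targeting the partition directly avoids
    IF NEW.created >= DATE '2012-12-01' AND NEW.created < DATE '2013-01-01' THEN
        INSERT INTO table_partition_01 VALUES (NEW.*);
    ELSE
        INSERT INTO table_partition_02 VALUES (NEW.*);
    END IF;
    RETURN NULL;  -- keep the row out of the parent table
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER route_insert_trg
    BEFORE INSERT ON parent_table
    FOR EACH ROW EXECUTE PROCEDURE route_insert();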

> I haven't tested using 1 billion rows in a single table. The issue is that
> in the future it will grow to more than 1 billion rows; it will get to about
> 4 billion rows, and that's when I believe partitioning would be a major
> improvement.
>

The way partitioning gives you performance improvements is by embracing
the partitioning: for example, target the load at a single partition with no
indexes, build the indexes afterwards, and then atomically attach it to the
parent table. If you want partitions but use triggers to hide the
partitioning from yourself, then I don't think you can expect much of a
speedup from partitioning.
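
Concretely, that would look something like this (a sketch only, using
inheritance-style partitioning and made-up table, column, and file names;
adjust the partition key and ranges to your schema):

-- build the new partition outside the partition set, with no indexes
CREATE TABLE staging_2012_12 (LIKE parent_table INCLUDING DEFAULTS);

-- bulk load; COPY is the cheapest way in
COPY staging_2012_12 FROM '/path/to/chunk.csv' WITH (FORMAT csv);

-- index once, after the data is in place
CREATE INDEX ON staging_2012_12 (created);

-- constrain it so constraint_exclusion can skip it, then attach it
ALTER TABLE staging_2012_12 ADD CONSTRAINT staging_2012_12_created_check
    CHECK (created >= DATE '2012-12-01' AND created < DATE '2013-01-01');
ALTER TABLE staging_2012_12 INHERIT parent_table;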

Anyway, the way I would approach it is to load into a single unpartitioned
table, and also into a dummy-partitioned table whose trigger looks like the
one you want to use for real but directs all rows to a single partition. If
the two loads take the same time, you know the trigger is not what is
limiting you.
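
As a sketch of that comparison (again, the names and columns are invented;
in the dummy version keep the same IF chain as your real trigger, just point
every branch at the one child):

-- plain, unpartitioned control table
CREATE TABLE plain_table (id bigint, created date, payload text);

-- dummy-partitioned version: identical structure, one child, routing trigger
CREATE TABLE dummy_parent (LIKE plain_table INCLUDING ALL);
CREATE TABLE dummy_child () INHERITS (dummy_parent);

CREATE OR REPLACE FUNCTION dummy_route() RETURNS trigger AS $$
BEGIN
    INSERT INTO dummy_child VALUES (NEW.*);
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER dummy_route_trg
    BEFORE INSERT ON dummy_parent
    FOR EACH ROW EXECUTE PROCEDURE dummy_route();

-- time the same bulk load into plain_table and into dummy_parent;
-- if the times match, the trigger is not what is limiting you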

Cheers,

Jeff
