Re: Further pg_upgrade analysis for many tables

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: Further pg_upgrade analysis for many tables
Date: 2012-11-09 23:06:41
Message-ID: 20121109230641.GC26605@momjian.us
Lists: pgsql-hackers

On Thu, Nov 8, 2012 at 08:59:21PM -0800, Jeff Janes wrote:
> On Thu, Nov 8, 2012 at 4:33 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > On Thu, Nov 8, 2012 at 03:46:09PM -0800, Jeff Janes wrote:
> >> On Wed, Nov 7, 2012 at 6:17 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> >> > As a followup to Magnus's report that pg_upgrade was slow for many
> >> > tables, I did some more testing with many tables, e.g.:
> >> >
> >> ...
> >> >
> >> > Any ideas? I am attaching my test script.
> >>
> >> Have you reviewed the thread at:
> >> http://archives.postgresql.org/pgsql-performance/2012-09/msg00003.php
> >> ?
> >>
> >> There is a known N^2 behavior when using pg_dump against pre-9.3 servers.
> >
> > I am actually now dumping git head/9.3, so I assume all the problems we
> > know about should be fixed.
>
> Are you sure the server you are dumping out of is head?

Well, I tested again with 9.2 dumping/loading 9.2 and the same for git
head, and got these results:

                pg_dump              restore
              9.2      git        9.2      git

        1     0.13     0.11       0.07     0.07
     1000     4.37     3.98       4.32     5.28
     2000    12.98    12.19      13.64    14.25
     4000    47.85    50.14      61.31    70.97
     8000   210.39   183.00     302.67   294.20
    16000   901.53   769.83    1399.25  1359.09

As you can see, there is very little difference between 9.2 and git
head, except maybe at the 16k level for pg_dump.

Is there some slowdown with a mismatched-version dump/reload? I am
attaching my test script.
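For reference, the core of a test script like this can be sketched as follows (a minimal illustration, not the attached script itself; the table name prefix and loop bound are placeholders):

    -- Create N tables, each with a SERIAL column; the SERIAL implicitly
    -- creates one sequence per table, which is what stresses pg_dump.
    DO $$
    BEGIN
      FOR i IN 1..1000 LOOP
        EXECUTE format('CREATE TABLE test_%s (id SERIAL, val integer)', i);
      END LOOP;
    END $$;

Timing `pg_dump --schema-only` against a database populated this way at increasing N is what produces the table above.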

> Using head's pg_dump, but 9.2.1 server, it takes me 179.11 seconds to
> dump 16,000 tables (schema only) like your example, and it is
> definitely quadratic.

Are you using a SERIAL column for the tables? I am, and Peter
Eisentraut reported that this causes a big slowdown.

> But using head's pg_dump to dump tables out of head's server, it only
> took 24.95 seconds, and the quadratic term is not yet important,
> things still look linear.

Again, using SERIAL?

> But even the 179.11 seconds is several times faster than your report
> of 757.8, so I'm not sure what is going on there. I don't think my
> laptop is particularly fast:
>
> Intel(R) Pentium(R) CPU B960 @ 2.20GHz

I am using server-grade hardware, Xeon E5620 2.4GHz:

http://momjian.us/main/blogs/pgblog/2012.html#January_20_2012

> Is the next value, increment, etc. for a sequence stored in a catalog,
> or are they stored in the 8kb file associated with each sequence? If

Each sequence is stored in its own 1-row 8k table:

test=> CREATE SEQUENCE seq;
CREATE SEQUENCE

test=> SELECT * FROM seq;
-[ RECORD 1 ]-+--------------------
sequence_name | seq
last_value    | 1
start_value   | 1
increment_by  | 1
max_value     | 9223372036854775807
min_value     | 1
cache_value   | 1
log_cnt       | 0
is_cycled     | f
is_called     | f
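Because each sequence is its own relation, you can also see them all in the catalog (illustrative query; relkind 'S' marks sequences):

    -- List every sequence relation in the database.
    SELECT relname, relkind
    FROM pg_class
    WHERE relkind = 'S';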

> they are stored in the file, then it is a shame that pg_dump goes to the
> effort of extracting that info if pg_upgrade is just going to
> overwrite it anyway.

Actually, pg_upgrade needs pg_dump to restore all those sequence values.
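Concretely, for each sequence the dump emits both the definition and a setval() call to carry the counter forward (a sketch of typical schema-dump output; exact formatting varies by version):

    CREATE SEQUENCE seq
        START WITH 1
        INCREMENT BY 1
        NO MINVALUE
        NO MAXVALUE
        CACHE 1;

    -- Restores last_value and is_called so the new cluster
    -- resumes numbering where the old one left off.
    SELECT pg_catalog.setval('seq', 1, false);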

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachment Content-Type Size
test_many_tables text/plain 1.1 KB
