Re: Further pg_upgrade analysis for many tables

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: Further pg_upgrade analysis for many tables
Date: 2012-11-10 17:15:58
Message-ID: 20121110171558.GB31383@momjian.us
Lists: pgsql-hackers

On Fri, Nov 9, 2012 at 04:23:40PM -0800, Jeff Janes wrote:
> On Fri, Nov 9, 2012 at 3:06 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > On Thu, Nov 8, 2012 at 08:59:21PM -0800, Jeff Janes wrote:
> >> On Thu, Nov 8, 2012 at 4:33 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> >> >
> >> > I am actually now dumping git head/9.3, so I assume all the problems we
> >> > know about should be fixed.
> >>
> >> Are sure the server you are dumping out of is head?
> >
> > Well, I tested again with 9.2 dumping/loading 9.2 and the same for git
> > head, and got these results:
> >
> >                pg_dump              restore
> >             9.2      git        9.2      git
> >
> >     1      0.13     0.11       0.07     0.07
> >  1000      4.37     3.98       4.32     5.28
> >  2000     12.98    12.19      13.64    14.25
> >  4000     47.85    50.14      61.31    70.97
> >  8000    210.39   183.00     302.67   294.20
> > 16000    901.53   769.83    1399.25  1359.09
>
> For pg_dump, there are 4 possible combinations, not just two: you can
> use 9.2's pg_dump to dump from a 9.2 server, use git's pg_dump to dump
> from a 9.2 server, use git's pg_dump to dump from a git server, or use
> 9.2's pg_dump to dump from a git server (although that last one isn't
> very relevant).

True, but I thought testing matching versions was a sufficient test.
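
For reference, the kind of setup behind these numbers can be sketched as follows; the table shape, names, and count here are illustrative assumptions, not the exact script used in the thread:

```shell
# Illustrative reconstruction of the benchmark: create many tables,
# each with a SERIAL column, then time a schema-only dump.
N=1000   # table count; the thread tests 1 through 16000
: > create_tables.sql
i=1
while [ "$i" -le "$N" ]; do
    echo "CREATE TABLE t$i (id SERIAL PRIMARY KEY, val TEXT);" >> create_tables.sql
    i=$((i + 1))
done
echo "generated $(wc -l < create_tables.sql | tr -d ' ') statements"

# Against a live server one would then load the schema and time the dump:
#   psql -q -f create_tables.sql testdb
#   time pg_dump --schema-only testdb > /dev/null
```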

> >> Using head's pg_dump, but 9.2.1 server, it takes me 179.11 seconds to
> >> dump 16,000 tables (schema only) like your example, and it is
> >> definitely quadratic.
> >
> > Are you using a SERIAL column for the tables? I am, and Peter
> > Eisentraut reported that was a big slowdown.
>
> Yes, I'm using the same table definition as your example.

OK.

> >> But using head's pg_dump to dump tables out of head's server, it only
> >> took 24.95 seconds, and the quadratic term is not yet important,
> >> things still look linear.
> >
> > Again, using SERIAL?
>
> Yep.

Odd that yours is so much faster.

> >> they are stored in the file, then it is a shame that pg_dump goes to the
> >> effort of extracting that info if pg_upgrade is just going to
> >> overwrite it anyway.
> >
> > Actually, pg_upgrade needs pg_dump to restore all those sequence values.
>
> I did an experiment where I had pg_dump just output dummy values
> rather than hitting the database. Once pg_upgrade moves the relation
> files over, the dummy values disappear and are set back to their
> originals. So I think that pg_upgrade depends on pg_dump only in a
> trivial way--they need to be there, but it doesn't matter what they
> are.

Oh, wow, I had not thought of that. Once we move the sequence files
into place from the old cluster, whatever was assigned to the sequence
counter by the pg_dump restore is thrown away. Good point.
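
Jeff's point can be illustrated with the shape of what a pg_dump restore script does for a sequence (the sequence name and value here are stand-ins, not real output from the thread):

```shell
# Illustrative fragment of a pg_dump restore script for one sequence;
# the name and the value passed to setval() are stand-ins.
cat <<'EOF' > dump_fragment.sql
CREATE SEQUENCE t1_id_seq;
SELECT pg_catalog.setval('t1_id_seq', 42, true);
EOF
# pg_upgrade restores this fragment into the new cluster, but then
# moves the old cluster's sequence relation files into place, so the
# restored value is overwritten by the original -- which is why a
# dummy value in the dump would work just as well.
grep -c 'setval' dump_fragment.sql
```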

I am hesitant to add a pg_dump-specific optimization for this unless we
decide that pg_dump uses sequences in some non-optimal way; otherwise
the proper fix is to improve general sequence creation performance.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +
