pg_dump incredibly slow dumping a single schema from a large db

From: Mike Roest <mike(dot)roest(at)replicon(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: pg_dump incredibly slow dumping a single schema from a large db
Date: 2012-03-30 15:51:58
Message-ID: CAE7ByhiroJbXUzAJMFE0mVhOOcyYLGONAj83dkrXFkG5-H-rZg@mail.gmail.com
Lists: pgsql-general pgsql-hackers

Hey Everyone,
I've got an interesting issue. We're running PostgreSQL 9.1.1, Linux x64,
CentOS 5.8:

aspdata=# select version();
version
---------------------------------------------------------------------------------------------------------------
PostgreSQL 9.1.1 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2
20080704 (Red Hat 4.1.2-52), 64-bit

We currently have 1 DB we use for multiple independent tenant schemas. The
database size is currently 56227005240 bytes as reported by pg_database_size.

There are 557 schemas, each with about 1300 objects (760 tables, 520 views).

We are using pg_dump to do backups of a single schema with a total size of
5480448 bytes (calculated with SELECT sum(pg_relation_size(schemaname ||
'.' || tablename))::bigint FROM pg_tables WHERE schemaname = 'miketest';)

pg_dump -f /dumps/test.backup -Fc -n miketest aspdata

This dump is currently taking around 8 minutes. While dumping, the pg_dump
process uses 100% of one core on the server (a 24-core machine). Running
pg_dump with -v, I found that the following stages take the majority of the
time:

reading user_defined tables (2 minutes and 20 seconds)
reading dependency data (5 minutes and 30 seconds)

The size of the schema doesn't really seem to affect this; these times are
almost identical for a 700 meg schema as well (obviously the data dump
portion takes longer with the bigger schema).

During the "reading user_defined tables" stage, the following query shows up
for 10-20 seconds, then the pg_dump connection sits idle for the rest of the
2 minutes:

SELECT c.tableoid, c.oid, c.relname, c.relacl, c.relkind, c.relnamespace,
  (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = c.relowner) AS rolname,
  c.relchecks, c.relhastriggers, c.relhasindex, c.relhasrules, c.relhasoids,
  c.relfrozenxid, tc.oid AS toid, tc.relfrozenxid AS tfrozenxid,
  c.relpersistence,
  CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype
       ELSE NULL END AS reloftype,
  d.refobjid AS owning_tab, d.refobjsubid AS owning_col,
  (SELECT spcname FROM pg_tablespace t
   WHERE t.oid = c.reltablespace) AS reltablespace,
  array_to_string(c.reloptions, ', ') AS reloptions,
  array_to_string(array(SELECT 'toast.' || x FROM unnest(tc.reloptions) x),
                  ', ') AS toast_reloptions
FROM pg_class c
LEFT JOIN pg_depend d ON (c.relkind = 'S' AND d.classid = c.tableoid AND
  d.objid = c.oid AND d.objsubid = 0 AND d.refclassid = c.tableoid AND
  d.deptype = 'a')
LEFT JOIN pg_class tc ON (c.reltoastrelid = tc.oid)
WHERE c.relkind IN ('r', 'S', 'v', 'c', 'f')
ORDER BY c.oid

During the "reading dependency data" stage, the following queries show up for
a few seconds, then the connection sits idle for the rest of the 5.5 minutes:

SELECT tableoid, oid, typname, typnamespace,
  (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = typowner) AS rolname,
  typinput::oid AS typinput, typoutput::oid AS typoutput, typelem, typrelid,
  CASE WHEN typrelid = 0 THEN ' '::"char"
       ELSE (SELECT relkind FROM pg_class WHERE oid = typrelid)
  END AS typrelkind,
  typtype, typisdefined,
  typname[0] = '_' AND typelem != 0 AND
    (SELECT typarray FROM pg_type te WHERE oid = pg_type.typelem) = oid
    AS isarray
FROM pg_type

SELECT classid, objid, refclassid, refobjid, deptype
FROM pg_depend
WHERE deptype != 'p' AND deptype != 'e'
ORDER BY 1, 2

Non-production servers with fewer schemas don't seem to have any issue and
perform the same dump in under 10 seconds on much lower-classed hardware.

Server Specs:

2 x Intel Xeon X5650
32 Gigs of Ram
DELL Perc H700 Controller
Data drive - 6XSAS2 15K in RAID10 FS: xfs
Log Drive - 2XSAS2 15K in RAID1 FS: xfs

There are 2 of these machines, one master and the other a slave via streaming
replication over a gigabit network.

Thanks


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mike Roest <mike(dot)roest(at)replicon(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg_dump incredibly slow dumping a single schema from a large db
Date: 2012-03-30 16:18:25
Message-ID: 21030.1333124305@sss.pgh.pa.us

Mike Roest <mike(dot)roest(at)replicon(dot)com> writes:
> This dump is currently taking around 8 minutes. While dumping the pg_dump
> process is using 100% of one core in the server (24 core machine). Doing a
> -v pg_dump I found that the following stages are taking the majority of the
> time

> reading user_defined tables (2 minutes and 20 seconds)
> reading dependency data (5 minutes and 30 seconds)

Can you get an execution profile with oprofile or gprof or similar tool?
It doesn't surprise me a lot that pg_dump might have some issues with
large numbers of objects, but guessing which inefficiencies are hurting
you is difficult without more info.

regards, tom lane


From: Mike Roest <mike(dot)roest(at)replicon(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg_dump incredibly slow dumping a single schema from a large db
Date: 2012-03-30 16:40:44
Message-ID: CAE7ByhiseyD+Qk+oKYr_S==-hV5yrNoVYUnqAkoKPa7GKvssgg@mail.gmail.com

For sure, I'll work on that now. One thing I noticed looking through the
pg_dump code based on the messages: it seems to be grabbing the full
dependency graph for the whole DB rather than limiting it by the schema (not
sure if limiting this would be possible).

This query returns 9843923 rows from the DB. So processing this seems like
it'll take quite a while.

I'll get a -pg build of pg_dump going here on a dev box so I can get you a
profile.

On Fri, Mar 30, 2012 at 10:18 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Mike Roest <mike(dot)roest(at)replicon(dot)com> writes:
> > This dump is currently taking around 8 minutes. While dumping the
> pg_dump
> > process is using 100% of one core in the server (24 core machine).
> Doing a
> > -v pg_dump I found that the following stages are taking the majority of
> the
> > time
>
> > reading user_defined tables (2 minutes and 20 seconds)
> > reading dependency data (5 minutes and 30 seconds)
>
> Can you get an execution profile with oprofile or gprof or similar tool?
> It doesn't surprise me a lot that pg_dump might have some issues with
> large numbers of objects, but guessing which inefficiencies are hurting
> you is difficult without more info.
>
> regards, tom lane
>


From: Mike Roest <mike(dot)roest(at)replicon(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg_dump incredibly slow dumping a single schema from a large db
Date: 2012-03-30 17:17:06
Message-ID: CAE7Byhja_94DJdsngQd=E6nGEG82Q2O1cvYL_q3+id3ss31a2g@mail.gmail.com

Ok I just realized that's probably not going to be much help :)

0.00 0.00 0.00 5 0.00 0.00 canonicalize_path
0.00 0.00 0.00 5 0.00 0.00 trim_trailing_separator
0.00 0.00 0.00 3 0.00 0.00 strlcpy
0.00 0.00 0.00 2 0.00 0.00 join_path_components
0.00 0.00 0.00 2 0.00 0.00 last_dir_separator
0.00 0.00 0.00 1 0.00 0.00 find_my_exec
0.00 0.00 0.00 1 0.00 0.00 first_dir_separator
0.00 0.00 0.00 1 0.00 0.00 get_etc_path
0.00 0.00 0.00 1 0.00 0.00 get_progname
0.00 0.00 0.00 1 0.00 0.00 help
0.00 0.00 0.00 1 0.00 0.00 make_relative_path
0.00 0.00 0.00 1 0.00 0.00 resolve_symlinks
0.00 0.00 0.00 1 0.00 0.00 set_pglocale_pgservice
0.00 0.00 0.00 1 0.00 0.00 trim_directory
0.00 0.00 0.00 1 0.00 0.00 validate_exec

That's the output of gprof pg_dump gmon.out. (I built the -pg build on my
dev box, then ran it on the server. Since that doesn't really seem to have
much data in it, I'm now running the actual dump from my dev box against the
server instead, to see if I get something more useful.)

On Fri, Mar 30, 2012 at 11:09 AM, Mike Roest <mike(dot)roest(at)replicon(dot)com> wrote:

> Here's the gmon.out from a -pg compiled 9.1.1 pg_dump.
>
> --Mike
>
>
> On Fri, Mar 30, 2012 at 10:40 AM, Mike Roest <mike(dot)roest(at)replicon(dot)com> wrote:
>
>> For sure I'll work on that now. One thing I noticed looking through the
>> pg_dump code based on the messages and the code one thing I noticed it
>> seems to be grabbing the full dependency graph for the whole db rather then
>> limiting it by the schema (not sure if limiting this would be possible)
>>
>> This query returns 9843923 rows from the DB. So processing this seems
>> like it'll take quite a while.
>>
>> I'll get a -pg build of pg_dump going here on a dev box so I can get you
>> a profile.
>>
>>
>> On Fri, Mar 30, 2012 at 10:18 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>>> Mike Roest <mike(dot)roest(at)replicon(dot)com> writes:
>>> > This dump is currently taking around 8 minutes. While dumping the
>>> pg_dump
>>> > process is using 100% of one core in the server (24 core machine).
>>> Doing a
>>> > -v pg_dump I found that the following stages are taking the majority
>>> of the
>>> > time
>>>
>>> > reading user_defined tables (2 minutes and 20 seconds)
>>> > reading dependency data (5 minutes and 30 seconds)
>>>
>>> Can you get an execution profile with oprofile or gprof or similar tool?
>>> It doesn't surprise me a lot that pg_dump might have some issues with
>>> large numbers of objects, but guessing which inefficiencies are hurting
>>> you is difficult without more info.
>>>
>>> regards, tom lane
>>>
>>
>>
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mike Roest <mike(dot)roest(at)replicon(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg_dump incredibly slow dumping a single schema from a large db
Date: 2012-03-30 17:30:26
Message-ID: 24744.1333128626@sss.pgh.pa.us

Mike Roest <mike(dot)roest(at)replicon(dot)com> writes:
> Ok I just realized that's probably not going to be much help :)

gmon.out would be of no value to anybody else anyway --- making sense of
it requires the exact executable you took the measurements with.

> 0.00 0.00 0.00 5 0.00 0.00 canonicalize_path
> 0.00 0.00 0.00 5 0.00 0.00
> trim_trailing_separator
> 0.00 0.00 0.00 3 0.00 0.00 strlcpy

Ugh. There are some platforms on which gprof is busted to various
degrees; you may have one. What platform is this exactly?

regards, tom lane


From: Mike Roest <mike(dot)roest(at)replicon(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg_dump incredibly slow dumping a single schema from a large db
Date: 2012-03-30 17:33:38
Message-ID: CAE7ByhjxXqs=CrQBYiKwA766wOw87j8RRbrLhBDYFpU8RCVp1g@mail.gmail.com

That was on the CentOS 5.8 x64 machine. The one I'm trying it from now is
Ubuntu 11.10 x64

On Fri, Mar 30, 2012 at 11:30 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Mike Roest <mike(dot)roest(at)replicon(dot)com> writes:
> > Ok I just realized that's probably not going to be much help :)
>
> gmon.out would be of no value to anybody else anyway --- making sense of
> it requires the exact executable you took the measurements with.
>
> > 0.00 0.00 0.00 5 0.00 0.00 canonicalize_path
> > 0.00 0.00 0.00 5 0.00 0.00
> > trim_trailing_separator
> > 0.00 0.00 0.00 3 0.00 0.00 strlcpy
>
> Ugh. There are some platforms on which gprof is busted to various
> degrees; you may have one. What platform is this exactly?
>
> regards, tom lane
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mike Roest <mike(dot)roest(at)replicon(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg_dump incredibly slow dumping a single schema from a large db
Date: 2012-03-30 17:49:24
Message-ID: 25164.1333129764@sss.pgh.pa.us

Mike Roest <mike(dot)roest(at)replicon(dot)com> writes:
> That was on the CentOS 5.8 x64 machine. The one I'm trying it from now is
> Ubuntu 11.10 x64

Hm. On current Red-Hat-derived systems I'd recommend oprofile, although
you need root privileges to use that. Not real sure what is available
on Ubuntu, but our crib sheet for oprofile says it doesn't currently
work there:
http://wiki.postgresql.org/wiki/Profiling_with_OProfile
That could be out-of-date info though. Here's some info about
another possibility:
http://wiki.postgresql.org/wiki/Profiling_with_perf

regards, tom lane


From: Mike Roest <mike(dot)roest(at)replicon(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg_dump incredibly slow dumping a single schema from a large db
Date: 2012-03-30 20:34:03
Message-ID: CAE7ByhgEPyoX+1rz-PjJn5GBWkTuex_C6y9po5d-7mpeS57UyQ@mail.gmail.com

>
> That could be out-of-date info though. Here's some info about
> another possibility:
> http://wiki.postgresql.org/wiki/Profiling_with_perf
>
>
There we go, perf worked on the VM.

The file is 6 megs so I've dropped it here.

That was doing perf for the length of the pg_dump command and then a perf
report -n
http://dl.dropbox.com/u/13153/output.txt


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mike Roest <mike(dot)roest(at)replicon(dot)com>
Cc: pgsql-general(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump incredibly slow dumping a single schema from a large db
Date: 2012-03-31 16:35:53
Message-ID: 21078.1333211753@sss.pgh.pa.us

Mike Roest <mike(dot)roest(at)replicon(dot)com> writes:
> The file is 6 megs so I've dropped it here.
> That was doing perf for the length of the pg_dump command and then a perf
> report -n
> http://dl.dropbox.com/u/13153/output.txt

Hmm ... that's a remarkably verbose output format, but the useful part
of this info seems to be just

# Events: 2M cpu-clock
#
# Overhead Samples Command Shared Object Symbol
# ........ .......... ............... ............................. .................................................
#
65.96% 1635392 pg_dump pg_dump [.] findLoop
|
--- findLoop
|
|--69.66%-- findDependencyLoops
| sortDumpableObjects
| main
| __libc_start_main
|
|--30.34%-- findLoop
| findDependencyLoops
| sortDumpableObjects
| main
| __libc_start_main
--0.00%-- [...]

10.16% 251955 pg_dump pg_dump [.] getTables
|
--- getTables
getSchemaData
main
__libc_start_main

The file also shows that findObjectByDumpId() accounts for only a
negligible percentage of runtime, which means that the recursion path in
findLoop() is being executed hardly at all. After staring at that for
a while, I realized that the O(N^2) part is the initial "is the
object in the workspace?" test. The actual dependency chains are
probably never very long, but as we run through the collection of
objects we gradually add all of them to the front of the workspace.
So this is dumb; we should manage the "is the object already processed"
component of that with an O(1) check, like a bool array or some such,
rather than an O(N) search loop.
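The bookkeeping change described above can be sketched like this (a minimal
illustration with hypothetical names, not pg_dump's actual data structures):
a bool array indexed by dump ID answers the membership question in constant
time, instead of scanning the ever-growing workspace array.

```c
#include <stdbool.h>

/* Illustrative sketch: dump IDs are small integers, so a flat bool
 * array indexed by ID replaces an O(N) "is it already in the
 * workspace?" scan with an O(1) lookup.  MAX_DUMP_ID is an assumed
 * upper bound for the sketch. */
#define MAX_DUMP_ID 100000

static bool processed[MAX_DUMP_ID + 1];

/* O(1) membership test, versus scanning the workspace array. */
bool already_processed(int dump_id)
{
    return processed[dump_id];
}

/* Record an object as it is added to the workspace. */
void mark_processed(int dump_id)
{
    processed[dump_id] = true;
}
```

Each of the N objects now pays O(1) for its membership check, so the whole
pass drops from O(N^2) to O(N).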

As for the getTables slowdown, the only part of that I can see that
looks to be both significant and entirely contained in getTables()
itself is the nested loop near the end that's trying to copy
the dumpable flags for owned sequences from their owning tables.
Do you have a whole lot of owned sequences? Maybe we could postpone
that work until we have the fast table lookup array constructed,
which should reduce this from O(M*N) to O(M*logN).
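That lookup could look roughly like this (an illustrative sketch assuming a
table array kept sorted by OID; the struct and function names are invented,
not pg_dump's actual code):

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;

/* Minimal stand-in for a per-table entry. */
typedef struct {
    Oid  oid;
    bool dump;   /* dumpable flag an owned sequence should inherit */
} TableInfo;

/* Binary search over an array sorted by OID: each owned sequence finds
 * its owning table in O(log N) instead of scanning all N tables, taking
 * the whole pass from O(M*N) down to O(M*log N). */
TableInfo *find_table_by_oid(TableInfo *tables, size_t n, Oid target)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tables[mid].oid < target)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (lo < n && tables[lo].oid == target) ? &tables[lo] : NULL;
}
```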

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mike Roest <mike(dot)roest(at)replicon(dot)com>
Cc: pgsql-general(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_dump incredibly slow dumping a single schema from a large db
Date: 2012-03-31 19:56:41
Message-ID: 25863.1333223801@sss.pgh.pa.us

I wrote:
> So this is dumb; we should manage the "is the object already processed"
> component of that with an O(1) check, like a bool array or some such,
> rather than an O(N) search loop.

> As for the getTables slowdown, the only part of that I can see that
> looks to be both significant and entirely contained in getTables()
> itself is the nested loop near the end that's trying to copy
> the dumpable flags for owned sequences from their owning tables.
> Do you have a whole lot of owned sequences? Maybe we could postpone
> that work until we have the fast table lookup array constructed,
> which should reduce this from O(M*N) to O(M*logN).

I've committed fixes for both these issues. If you are in a position to
test with 9.1 branch tip from git, it'd be nice to have confirmation
that these patches actually cure your problem. For both of them, the
issue seems to only show up in a subset of cases, which may explain why
we'd not identified the problem before.

regards, tom lane


From: Mike Roest <mike(dot)roest(at)replicon(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, pgsql-general(at)postgresql(dot)org
Subject: Re: pg_dump incredibly slow dumping a single schema from a large db
Date: 2012-03-31 20:34:26
Message-ID: CAE7ByhhDgMdiALoGwpu+UXunP=iM9wJHWynfuk8bBhYNYH2bPQ@mail.gmail.com

>
> I've committed fixes for both these issues. If you are in a position to
> test with 9.1 branch tip from git, it'd be nice to have confirmation
> that these patches actually cure your problem. For both of them, the
> issue seems to only show up in a subset of cases, which may explain why
> we'd not identified the problem before.
>
> regards, tom lane
>
Cool,
I've grabbed your changes and it seems to have completely cured the issue.

reading user-defined tables has gone down to 9 seconds from 2.5 minutes
reading dependency data has gone down to 20 seconds from 5.5 minutes.

Thanks so much, Tom, this is perfect. I'm just pulling another backup using
the stock 9.1.1 pg_dump to ensure the backups are equivalent.

Any idea when 9.1.4 with this change will be out, so we can pull the cluster
up?

With regards to your previous question about sequences: there are 61K in the
DB; it looks like our schema currently has about 115 sequences per tenant.

--Mike


From: Mike Roest <mike(dot)roest(at)replicon(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, pgsql-general(at)postgresql(dot)org
Subject: Re: pg_dump incredibly slow dumping a single schema from a large db
Date: 2012-03-31 20:54:23
Message-ID: CAE7Byhgv7jt8J3jrcN_t9QQYauPR=fBDnr4gmvJaF+_VjdD+cA@mail.gmail.com

>
> I'm just pulling another backup using the stock 9.1.1 pg_dump to ensure
> the backups are equivalent.
>
Schema & data are identical between the 2 backups. The new backup passes
all our tests for validating a tenant.

Thank you again for the quick response!

--Mike


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mike Roest <mike(dot)roest(at)replicon(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, pgsql-general(at)postgresql(dot)org
Subject: Re: [HACKERS] Re: pg_dump incredibly slow dumping a single schema from a large db
Date: 2012-03-31 21:44:05
Message-ID: 28037.1333230245@sss.pgh.pa.us

Mike Roest <mike(dot)roest(at)replicon(dot)com> writes:
> Any idea when 9.1.4 with this change will be out so we can pull the cluster
> up.

Well, we just did some releases last month, so unless somebody finds a
really nasty security or data-loss issue, I'd think it will be a couple
of months before the next set.

regards, tom lane


From: Mike Roest <mike(dot)roest(at)replicon(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, pgsql-general(at)postgresql(dot)org
Subject: Re: [HACKERS] Re: pg_dump incredibly slow dumping a single schema from a large db
Date: 2012-04-02 12:43:03
Message-ID: CAE7ByhjRy54MjyeGTzbVw-LoWw-msAszXo47n4AgH-RCDfJWDw@mail.gmail.com

:) Yah, that makes sense, no big deal. I'll probably just push this HEAD
build of pg_dump onto the production machines till it comes out.

Thanks again!

On Sat, Mar 31, 2012 at 3:44 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Mike Roest <mike(dot)roest(at)replicon(dot)com> writes:
> > Any idea when 9.1.4 with this change will be out so we can pull the
> cluster
> > up.
>
> Well, we just did some releases last month, so unless somebody finds a
> really nasty security or data-loss issue, I'd think it will be a couple
> of months before the next set.
>
> regards, tom lane
>