\d on database with a lot of tables is slow

Lists: pgsql-hackers
From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: \d on database with a lot of tables is slow
Date: 2005-09-24 23:59:24
Message-ID: 20050924235924.GR7630@pervasive.com

I have a client with a database that contains 4000 relations according
to vacuum verbose, and \d in psql is painfully slow. In particular...

-> Seq Scan on pg_class c (cost=0.00..2343.09 rows=6124 width=73) (actual time=0.325..22100.840 rows=16856 loops=1)
Filter: (((relkind = 'r'::"char") OR (relkind = 'v'::"char") OR (relkind = 'S'::"char") OR (relkind = ''::"char")) AND pg_table_is_visible(oid))

That's off my laptop, but they're seeing similar issues on an 8-way
Opteron as well...

I've messed around with adding indexes to a copy of pg_class to no
avail. Any ideas on how to improve the performance?

Also, not sure if this matters, but they're occasionally getting errors
like 'Did not find relation named "table that exists"' (where 'table
that exists' is the name of some table that is in the catalog) from \d.
Does anyone know what causes that?
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461


From: Rod Taylor <pg(at)rbt(dot)ca>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d on database with a lot of tables is slow
Date: 2005-09-25 00:20:47
Message-ID: 1127607647.22392.40.camel@home

On Sat, 2005-09-24 at 18:59 -0500, Jim C. Nasby wrote:
> I have a client with a database that contains 4000 relations according
> to vacuum verbose, and \d in psql is painfully slow. In particular...
>
> -> Seq Scan on pg_class c (cost=0.00..2343.09 rows=6124 width=73) (actual time=0.325..22100.840 rows=16856 loops=1)
> Filter: (((relkind = 'r'::"char") OR (relkind = 'v'::"char") OR (relkind = 'S'::"char") OR (relkind = ''::"char")) AND pg_table_is_visible(oid))
>
> That's off my laptop, but they're seeing similar issues on an 8-way
> Opteron as well...
>
> I've messed around with adding indexes to a copy of pg_class to no
> avail. Any ideas on how to improve the performance?

It is probably the visibility checks. Is \d fast if you include the
full name (schema.table)?

I brought this up a while ago and Tom has since rearranged some of the
psql queries to move the visibility check to come after the other where
clause segments.

It would be nice if the cost of the function could be added somehow --
even if it was just a low, medium, or high setting. This would allow the
planner to shuffle the WHERE clause execution order around in a
reasonable manner.
--


From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Rod Taylor <pg(at)rbt(dot)ca>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d on database with a lot of tables is slow
Date: 2005-09-25 00:38:18
Message-ID: 20050925003818.GV7630@pervasive.com

On Sat, Sep 24, 2005 at 08:20:47PM -0400, Rod Taylor wrote:
> On Sat, 2005-09-24 at 18:59 -0500, Jim C. Nasby wrote:
> > I have a client with a database that contains 4000 relations according
> > to vacuum verbose, and \d in psql is painfully slow. In particular...
> >
> > -> Seq Scan on pg_class c (cost=0.00..2343.09 rows=6124 width=73) (actual time=0.325..22100.840 rows=16856 loops=1)
> > Filter: (((relkind = 'r'::"char") OR (relkind = 'v'::"char") OR (relkind = 'S'::"char") OR (relkind = ''::"char")) AND pg_table_is_visible(oid))
> >
> > That's off my laptop, but they're seeing similar issues on an 8-way
> > Opteron as well...
> >
> > I've messed around with adding indexes to a copy of pg_class to no
> > avail. Any ideas on how to improve the performance?
>
> It is probably the visibility checks. Is a \d fast if you include the
> full name (schema.table)?
>
> I brought this up a while ago and Tom has since rearranged some of the
> psql queries to move the visibility check to come after the other where
> clause segments.
>
>
> It would be nice if the cost of the function could be added somehow --
> even if it was just a low, medium, or high setting. This would allow the
> planner to shuffle the WHERE clause execution order around in a
> reasonable manner.

\d tablename is fast, yes.

Maybe instead of re-arranging the query it would make more sense to roll
the visibility check into the query itself (probably using a new system
view).
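A minimal sketch of what such a view might look like (the view name and shape are hypothetical, not an actual catalog view; note this merely centralizes the pg_table_is_visible() call rather than eliminating it):

```sql
-- Hypothetical system view; name and column list are illustrative only.
-- It folds the relkind filter and the visibility check into one place,
-- though the per-row pg_table_is_visible() call would still execute.
CREATE VIEW pg_visible_relations AS
SELECT c.oid, n.nspname AS schemaname, c.relname, c.relkind
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('r', 'v', 'S', '')
  AND pg_catalog.pg_table_is_visible(c.oid);
```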
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461


From: Hannu Krosing <hannu(at)skype(dot)net>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d on database with a lot of tables is slow
Date: 2005-09-25 07:10:22
Message-ID: 1127632222.4865.56.camel@fuji.krosing.net

On L, 2005-09-24 at 18:59 -0500, Jim C. Nasby wrote:
> I have a client with a database that contains 4000 relations according
> to vacuum verbose, and \d in psql is painfully slow. In particular...
>
> -> Seq Scan on pg_class c (cost=0.00..2343.09 rows=6124 width=73) (actual time=0.325..22100.840 rows=16856 loops=1)
> Filter: (((relkind = 'r'::"char") OR (relkind = 'v'::"char") OR (relkind = 'S'::"char") OR (relkind = ''::"char")) AND pg_table_is_visible(oid))
>
> That's off my laptop, but they're seeing similar issues on an 8-way
> Opteron as well...

I expect the issue on the 8-way Opteron to be more about high load than a
slow scan. It seems that a database with lots of activity sometimes slows
down considerably. I suspect some locking issues, but I'm not sure this
is the case.

Also, if a lot of temp tables are used, then pg_class and pg_attribute
(at least) get bloated quickly and need vacuuming.
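If catalog bloat is the suspect, the system catalogs can be vacuumed directly (as a superuser), for example:

```sql
-- Reclaim dead catalog rows left behind by heavy temp-table churn.
VACUUM ANALYZE pg_catalog.pg_class;
VACUUM ANALYZE pg_catalog.pg_attribute;
```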

> I've messed around with adding indexes to a copy of pg_class to no
> avail. Any ideas on how to improve the performance?
>
> Also, not sure if this matters, but they're occasionally getting errors
> like 'Did not find relation named "table that exists"' (where 'table
> that exists' is the name of some table that is in the catalog) from \d.
> Does anyone know what causes that?

Mostly this happens with temp tables from other connections, which have
managed to disappear by the time their detailed info is requested, and
which would actually not show up in \d output due to visibility checks.

--
Hannu Krosing <hannu(at)skype(dot)net>


From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Hannu Krosing <hannu(at)skype(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d on database with a lot of tables is slow
Date: 2005-09-26 18:05:44
Message-ID: 20050926180544.GM30974@pervasive.com

On Sun, Sep 25, 2005 at 10:10:22AM +0300, Hannu Krosing wrote:
> On L, 2005-09-24 at 18:59 -0500, Jim C. Nasby wrote:
> > I have a client with a database that contains 4000 relations according
> > to vacuum verbose, and \d in psql is painfully slow. In particular...
> >
> > -> Seq Scan on pg_class c (cost=0.00..2343.09 rows=6124 width=73) (actual time=0.325..22100.840 rows=16856 loops=1)
> > Filter: (((relkind = 'r'::"char") OR (relkind = 'v'::"char") OR (relkind = 'S'::"char") OR (relkind = ''::"char")) AND pg_table_is_visible(oid))
> >
> > That's off my laptop, but they're seeing similar issues on an 8-way
> > Opteron as well...
>
> I expect the issue on the 8-way Opteron to be more about high load than a
> slow scan. It seems that a database with lots of activity sometimes slows
> down considerably. I suspect some locking issues, but I'm not sure this
> is the case.

No temp tables in use. The issue didn't appear to be load-dependent,
either.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461


From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Hannu Krosing <hannu(at)skype(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d on database with a lot of tables is slow
Date: 2005-09-26 18:17:42
Message-ID: 20050926181742.GP30974@pervasive.com

On Sun, Sep 25, 2005 at 10:10:22AM +0300, Hannu Krosing wrote:
> > Also, not sure if this matters, but they're occasionally getting errors
> > like 'Did not find relation named "table that exists"' (where 'table
> > that exists' is the name of some table that is in the catalog) from \d.
> > Does anyone know what causes that?
>
> Mostly this happens with temp tables from other connections, which have
> managed to disappear by the time their detailed info is requested, and
> which would actually not show up in \d output due to visibility checks.

They are seeing this when doing \d on regular tables that won't be getting
deleted. Maybe there's some locking that should be happening but isn't?
Or maybe \d should do its work in one statement instead of several?
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d on database with a lot of tables is slow
Date: 2005-09-26 19:00:44
Message-ID: 1744.1127761244@sss.pgh.pa.us

"Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
>>> I have a client with a database that contains 4000 relations according
>>> to vacuum verbose, and \d in psql is painfully slow. In particular...

It's presumably mostly in the pg_table_is_visible() calls. Not sure if
we can do much to speed those up, but: how many schemas in your search
path? What's the distribution of pg_class entries among the schemas?
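For reference, one way to pull that distribution straight from the catalogs (a sketch that counts every relation kind, not just tables):

```sql
-- Count pg_class entries per schema to see where the relations live.
SELECT n.nspname AS schema_name, count(*) AS relations
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
GROUP BY n.nspname
ORDER BY relations DESC;
```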

regards, tom lane


From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d on database with a lot of tables is slow
Date: 2005-09-27 22:46:26
Message-ID: 20050927224626.GC30974@pervasive.com

On Mon, Sep 26, 2005 at 03:00:44PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
> >>> I have a client with a database that contains 4000 relations according
> >>> to vacuum verbose, and \d in psql is painfully slow. In particular...
>
> It's presumably mostly in the pg_table_is_visible() calls. Not sure if
> we can do much to speed those up, but: how many schemas in your search
> path? What's the distribution of pg_class entries among the schemas?

db=# show search_path;
decibel, pg_sysviews, information_schema, rrs, public

db=# select schema_name, count(*) from pg_user_tables group by schema_name;
public | 764
ledger | 6
state | 2
_track_replica | 10
repair | 3
summarized | 586
orders | 512
snapshot | 1012
acl | 10

db=#

Also, do you have any idea on the 'Did not find relation named
"table-thats-there"' error? table-thats-there isn't a temp table, and I don't
believe they're even using temp tables, so I don't think that's the issue.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d on database with a lot of tables is slow
Date: 2005-09-27 22:53:13
Message-ID: 29525.1127861593@sss.pgh.pa.us

"Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
> On Mon, Sep 26, 2005 at 03:00:44PM -0400, Tom Lane wrote:
>> It's presumably mostly in the pg_table_is_visible() calls. Not sure if
>> we can do much to speed those up, but: how many schemas in your search
>> path? What's the distribution of pg_class entries among the schemas?

> db=# show search_path;
> decibel, pg_sysviews, information_schema, rrs, public

> db=# select schema_name, count(*) from pg_user_tables group by schema_name;
> public | 764
> ledger | 6
> state | 2
> _track_replica | 10
> repair | 3
> summarized | 586
> orders | 512
> snapshot | 1012
> acl | 10

Hmm, so lots and lots of tables that aren't visible at all ... that's
definitely the slowest case for pg_table_is_visible. I'll think about
whether we can improve it.

> Also, do you have any idea on the 'Did not find relation named
> "table-thats-there"' error? table-thats-there isn't a temp table, and I don't
> believe they're even using temp tables, so I don't think that's the issue.

Uh, what's the exact error message again? (If it's a backend message,
the verbose form would be useful.)

regards, tom lane


From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d on database with a lot of tables is slow
Date: 2005-09-28 00:16:28
Message-ID: 20050928001628.GE30974@pervasive.com

On Tue, Sep 27, 2005 at 06:53:13PM -0400, Tom Lane wrote:
> > Also, do you have any idea on the 'Did not find relation named
> > "table-thats-there"' error? table-thats-there isn't a temp table, and I don't
> > believe they're even using temp tables, so I don't think that's the issue.
>
> Uh, what's the exact error message again? (If it's a backend message,
> the verbose form would be useful.)

Sorry, remembered it wrong. It's 'Did not find any relation named',
which appears to be in bin/psql/describe.c. It does occur when trying to
do a \d on a specific table.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d on database with a lot of tables is slow
Date: 2005-09-28 15:35:31
Message-ID: 6776.1127921731@sss.pgh.pa.us

"Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
> Sorry, remembered it wrong. It's 'Did not find any relation named',
> which appears to be in bin/psql/describe.c. It does occur when trying to
> do a \d on a specific table.

Hmm, no immediate ideas. You haven't provided a lot of context about
this --- when it happens, is it repeatable? Are they giving an exact
table name or a pattern to \d? Is a schema name included in what they
give to \d? What PG version are they running exactly?

regards, tom lane


From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d on database with a lot of tables is slow
Date: 2005-09-30 22:08:19
Message-ID: 20050930220819.GE40138@pervasive.com

On Wed, Sep 28, 2005 at 11:35:31AM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
> > Sorry, remembered it wrong. It's 'Did not find any relation named',
> > which appears to be in bin/psql/describe.c. It does occur when trying to
> > do a \d on a specific table.
>
> Hmm, no immediate ideas. You haven't provided a lot of context about
> this --- when it happens, is it repeatable? Are they giving an exact
> table name or a pattern to \d? Is a schema name included in what they
> give to \d? What PG version are they running exactly?

Sorry, had the error message wrong:

ERROR: cache lookup failed for relation 1906465919

It is on an exact table name. When we retry the describe on a failure,
sometimes it works and sometimes it fails again. When it fails again
the relation number is different. Also, \d schema.table always returns
quickly and never errors. \d table is slow and produces the error fairly
often.

They're using 8.0.x; I'm pretty certain it's 8.0.3.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461


From: Jon Jensen <jon(at)endpoint(dot)com>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Hannu Krosing <hannu(at)skype(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d on database with a lot of tables is slow
Date: 2005-10-01 00:04:54
Message-ID: Pine.LNX.4.63.0509301751550.5391@ynfu.ovalna.fjrygre.arg

On Fri, 30 Sep 2005, Jim C. Nasby wrote:

> ERROR: cache lookup failed for relation 1906465919
>
> It is on an exact table name. When we retry the describe on a failure,
> sometimes it works and sometimes it fails again. When it fails again
> the relation number is different. Also, \d schema.table always returns
> quickly and never errors. \d table is slow and produces the error fairly
> often.
>
> They're using 8.0.x; I'm pretty certain it's 8.0.3.

We have the exact same problem on a 7.4.5 database. Some basic info on the
database: psql's \d returns 424 rows, and the on-disk size of the database
is about 11 GB. A standalone \dt throws the same occasional error.

I've suspected that it may be caused by the ongoing periodic (at least once
an hour) rebuilding of certain generated tables with TRUNCATE and then INSERT
INTO the_table SELECT ... inside a transaction. But I don't have any proof
of that; it's just the most obvious difference compared to our other
databases.

It does seem like I've encountered the error less often since increasing
the max_fsm_pages setting, and thus having more effective VACUUMs and less
pg_class bloat. OTOH, I also trained myself not to run \d there very often,
since it was so slow and failed so often, so that may be coincidence. :)

Jon

--
Jon Jensen
End Point Corporation
http://www.endpoint.com/
Software development with Interchange, Perl, PostgreSQL, Apache, Linux, ...


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d on database with a lot of tables is slow
Date: 2005-10-01 18:00:12
Message-ID: 9615.1128189612@sss.pgh.pa.us

I wrote:
> It's presumably mostly in the pg_table_is_visible() calls.

I did some profiling on a test case with 10000 tables, and noticed that
a big part of the problem is that the catalog caches become entirely
useless: almost every catcache lookup ends up going to the underlying
tables. This is because MAXCCTUPLES in catcache.c is fixed at 5000,
and that's not an adequate working set for this many tables. If you
are willing to throw memory at the problem, you could try increasing
MAXCCTUPLES (to say 50K or 100K) and see if that helps.

regards, tom lane


From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: \d on database with a lot of tables is slow
Date: 2005-10-04 14:17:33
Message-ID: 20051004141733.GF40138@pervasive.com

On Sat, Oct 01, 2005 at 02:00:12PM -0400, Tom Lane wrote:
> I wrote:
> > It's presumably mostly in the pg_table_is_visible() calls.
>
> I did some profiling on a test case with 10000 tables, and noticed that
> a big part of the problem is that the catalog caches become entirely
> useless: almost every catcache lookup ends up going to the underlying
> tables. This is because MAXCCTUPLES in catcache.c is fixed at 5000,
> and that's not an adequate working set for this many tables. If you
> are willing to throw memory at the problem, you could try increasing
> MAXCCTUPLES (to say 50K or 100K) and see if that helps.

Out of curiosity... does catcache cache all pg_* tables? Also, at what
point would it be good to up NCCBUCKETS?
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461