Re: About the performance of startup after dropping many tables

Lists: pgsql-hackers
From: Gan Jiadong <ganjd(at)huawei(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: liyuesen(at)huawei(dot)com, yaoyiyu(at)huawei(dot)com, liuxingyu(at)huawei(dot)com, tianwengang(at)huawei(dot)com
Subject: About the performance of startup after dropping many tables
Date: 2011-02-18 02:58:57
Message-ID: 008b01cbcf17$c9622140$5c2663c0$@com

Hello guys,

we have PG 8.3.13 in our system. When running performance cases, we find the
startup recovery cost about 3 minutes. It is too long in our system.

We diagnosed the problem by adding timestamps and found that almost all of
the 3 minutes was spent in the relation-dropping and buffer-invalidation loop
in xact_redo_commit.

Before the problem happened, we dropped 40000 tables and rebooted Linux, so
the outer loop runs 40000 times. We have 13000 shared buffer pages in PG.
However, DropRelFileNodeBuffers, which drops the shared buffers associated
with the specified relation, has to scan all the shared buffers for each
relation to check whether each buffer can be dropped, no matter how many
pages the relation has in shared buffers.

In all, we have 40000 * 13000 LWLock acquire/release pairs. Is this
necessary? How about building a hash table recording all the relfilenodes to
be dropped, and scanning the shared buffers once to check whether each
buffer's relfilenode is going to be dropped? If we can do this, the LWLock
traffic will be 13000 and we will have much better performance!

Would this work? And is there any risk in doing so?
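
For illustration only, here is a minimal C sketch of the idea above. It is
not the actual change (the patch itself is not shown here), and Buf,
drop_one_by_one and drop_batched are hypothetical, simplified stand-ins
rather than PostgreSQL structures or functions. A sorted array searched with
bsearch() stands in for the hash table, and the return value counts buffer
headers visited, assuming one lock acquire/release per visit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shared buffer header: only the owning
 * relation id and a validity flag (the real header holds much more). */
typedef struct { unsigned rel; bool valid; } Buf;

/* Behaviour as described in this mail: one full scan per dropped relation.
 * Returns the number of buffer headers visited. */
static size_t drop_one_by_one(Buf *bufs, size_t nbufs,
                              const unsigned *rels, size_t nrels)
{
    size_t visits = 0;

    assert(bufs != NULL || nbufs == 0);
    for (size_t r = 0; r < nrels; r++)
        for (size_t i = 0; i < nbufs; i++) {
            visits++;
            if (bufs[i].valid && bufs[i].rel == rels[r])
                bufs[i].valid = false;      /* invalidate the buffer */
        }
    return visits;
}

/* Three-way comparison of two relation ids, for qsort() and bsearch(). */
static int cmp_rel(const void *a, const void *b)
{
    unsigned x = *(const unsigned *) a, y = *(const unsigned *) b;
    return (x > y) - (x < y);
}

/* Proposed behaviour: collect all relations first, then scan once.
 * Note that rels is sorted in place; a real version would use a hash. */
static size_t drop_batched(Buf *bufs, size_t nbufs,
                           unsigned *rels, size_t nrels)
{
    size_t visits = 0;

    assert(bufs != NULL || nbufs == 0);
    qsort(rels, nrels, sizeof rels[0], cmp_rel);
    for (size_t i = 0; i < nbufs; i++) {
        visits++;
        if (bufs[i].valid &&
            bsearch(&bufs[i].rel, rels, nrels, sizeof rels[0], cmp_rel))
            bufs[i].valid = false;          /* invalidate the buffer */
    }
    return visits;
}
```

With 40000 relations and 13000 buffers, the first function visits
40000 * 13000 = 520,000,000 headers, while the second visits 13000.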

Thanks!

Best regards,

甘嘉栋(Gan Jiadong)

E-MAIL: ganjd(at)huawei(dot)com

Tel:+86-755-289720578

****************************************************************************
*****************************

This e-mail and its attachments contain confidential information from
HUAWEI, which is intended only for the person or entity whose address is
listed above. Any use of the information contained herein in any way
(including, but not limited to, total or partial disclosure, reproduction,
or dissemination) by persons other than the intended recipient(s) is
prohibited. If you receive this e-mail in error, please notify the sender by
phone or email immediately and delete it!

****************************************************************************
*****************************


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gan Jiadong <ganjd(at)huawei(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, liyuesen(at)huawei(dot)com, yaoyiyu(at)huawei(dot)com, liuxingyu(at)huawei(dot)com, tianwengang(at)huawei(dot)com
Subject: Re: About the performance of startup after dropping many tables
Date: 2011-02-18 03:37:24
Message-ID: 5095.1298000244@sss.pgh.pa.us
Lists: pgsql-hackers

Gan Jiadong <ganjd(at)huawei(dot)com> writes:
> we have PG 8.3.13 in our system. When running performance cases, we find the
> startup recovery cost about 3 minutes. It is too long in our system.

Maybe you should rethink the assumption that dropping 40000 tables is a
cheap operation. Why do you have that many in the first place, let
alone that many that you drop and recreate frequently? Almost
certainly, you need a better-conceived schema.

regards, tom lane


From: Gan Jiadong <ganjd(at)huawei(dot)com>
To: 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, liyuesen(at)huawei(dot)com, yaoyiyu(at)huawei(dot)com, liuxingyu(at)huawei(dot)com, tianwengang(at)huawei(dot)com
Subject: Re: About the performance of startup after dropping many tables
Date: 2011-02-18 06:42:02
Message-ID: 009101cbcf36$f3214aa0$d963dfe0$@com
Lists: pgsql-hackers

Hi,

Thanks for your reply.
Of course, we will think about whether dropping 40000 relations is
reasonable. In fact, this happens only in a very special scenario.
But when we analyzed this issue, we found that the PG code can be rewritten
to achieve better performance; in other words, the algorithm in this part is
not good enough.
For example, with the refactoring we have done, the startup time is reduced
from 3 minutes to 8 seconds. That is a great improvement, especially for
systems with a low TTR (time to recovery) requirement.

Is there any problem or risk in changing this part of the code as we suggested?
Thank you.


Best regards,

甘嘉栋(Gan Jiadong)
E-MAIL: ganjd(at)huawei(dot)com
Tel:+86-755-289720578

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: February 18, 2011 11:37
To: Gan Jiadong
Cc: pgsql-hackers(at)postgresql(dot)org; liyuesen(at)huawei(dot)com; yaoyiyu(at)huawei(dot)com;
liuxingyu(at)huawei(dot)com; tianwengang(at)huawei(dot)com
Subject: Re: [HACKERS] About the performance of startup after dropping many
tables

Gan Jiadong <ganjd(at)huawei(dot)com> writes:
> we have PG 8.3.13 in our system. When running performance cases, we find the
> startup recovery cost about 3 minutes. It is too long in our system.

Maybe you should rethink the assumption that dropping 40000 tables is a
cheap operation. Why do you have that many in the first place, let
alone that many that you drop and recreate frequently? Almost
certainly, you need a better-conceived schema.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Gan Jiadong <ganjd(at)huawei(dot)com>, pgsql-hackers(at)postgresql(dot)org, liyuesen(at)huawei(dot)com, yaoyiyu(at)huawei(dot)com, liuxingyu(at)huawei(dot)com, tianwengang(at)huawei(dot)com
Subject: Re: About the performance of startup after dropping many tables
Date: 2011-02-18 12:37:46
Message-ID: AANLkTi=UsFDcwmHU22pibTZ+rZ_=QQZWsPCFBDv9hqfD@mail.gmail.com
Lists: pgsql-hackers

On Thu, Feb 17, 2011 at 10:37 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Gan Jiadong <ganjd(at)huawei(dot)com> writes:
>> we have PG 8.3.13 in our system. When running performance cases, we find the
>> startup recovery cost about 3 minutes. It is too long in our system.
>
> Maybe you should rethink the assumption that dropping 40000 tables is a
> cheap operation.  Why do you have that many in the first place, let
> alone that many that you drop and recreate frequently?  Almost
> certainly, you need a better-conceived schema.

Possibly, but it's not necessarily a bad idea to improve performance
for people with crazy schemas.

What concerns me a little bit about the proposed scheme, though, is
that it's only going to work if all of those tables are dropped by a
single transaction. You still need one pass through all of
shared_buffers for every transaction that drops one or more relations.
Now, I'm not sure, maybe there's no help for that, but ever since
commit c2281ac87cf4828b6b828dc8585a10aeb3a176e0 it's been on my mind
that loops that iterate through the entire buffer cache are bad for
scalability.
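
To make that concrete, here is a rough cost model in C. It is only a sketch
under stated assumptions: the function names are hypothetical, each
transaction's commit record is assumed to be replayed with either one batched
pass or one pass per dropped relation, and cost is counted as buffer headers
visited.

```c
#include <assert.h>
#include <stddef.h>

/* Headers visited if each commit record is replayed with ONE batched pass
 * over the buffer cache; transactions that drop nothing need no pass. */
static size_t batched_replay_visits(const size_t *drops_per_xact,
                                    size_t nxacts, size_t nbuffers)
{
    size_t visits = 0;

    assert(drops_per_xact != NULL || nxacts == 0);
    for (size_t i = 0; i < nxacts; i++)
        if (drops_per_xact[i] > 0)
            visits += nbuffers;
    return visits;
}

/* Headers visited with one full pass per dropped relation. */
static size_t per_relation_replay_visits(const size_t *drops_per_xact,
                                         size_t nxacts, size_t nbuffers)
{
    size_t visits = 0;

    assert(drops_per_xact != NULL || nxacts == 0);
    for (size_t i = 0; i < nxacts; i++)
        visits += drops_per_xact[i] * nbuffers;
    return visits;
}
```

With one transaction dropping 40000 tables and 13000 buffers, the batched
model gives 13000 visits instead of 520,000,000. With one table dropped per
transaction, the two models give the same number and batching buys nothing.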

Conventional wisdom seems to be that performance tops out at, or just
before, 8GB, but it's already the case that that's quite a small
fraction of the memory on a large machine, and that's only going to
keep getting worse. Admittedly, the existing places where we loop
through the whole buffer cache are probably not the primary reason for
that limitation, but...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Gan Jiadong <ganjd(at)huawei(dot)com>, pgsql-hackers(at)postgresql(dot)org, liyuesen(at)huawei(dot)com, yaoyiyu(at)huawei(dot)com, liuxingyu(at)huawei(dot)com, tianwengang(at)huawei(dot)com
Subject: Re: About the performance of startup after dropping many tables
Date: 2011-02-18 14:55:04
Message-ID: 21029.1298040904@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Possibly, but it's not necessarily a bad idea to improve performance
> for people with crazy schemas.

It is if it introduces unmaintainable code. I see no way to collapse
multiple drop operations into one that's not going to be a Rube Goldberg
device. I'm especially unwilling to introduce such a thing into the
xlog replay code paths, where it's guaranteed to get little testing.

(BTW, it seems like a workaround for the OP is just to CHECKPOINT right
after dropping all those tables. Or even reconsider their shutdown
procedure.)

regards, tom lane


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Gan Jiadong <ganjd(at)huawei(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, liyuesen <liyuesen(at)huawei(dot)com>, yaoyiyu <yaoyiyu(at)huawei(dot)com>, liuxingyu <liuxingyu(at)huawei(dot)com>, tianwengang <tianwengang(at)huawei(dot)com>, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: About the performance of startup after dropping many tables
Date: 2011-09-07 19:20:10
Message-ID: 1315422909-sup-6411@alvh.no-ip.org
Lists: pgsql-hackers

Excerpts from Gan Jiadong's message of Fri Feb 18 03:42:02 -0300 2011:
> Hi,
>
> Thanks for your reply.
> Of course, we will think about whether 40000 relations dropping is
> reasonable. In fact, this happens in a very special scenario .
> But when we analyzed this issue, we found the PG code can be rewritten to
> achieve better performance. Or we can say the arithmetic of this part is not
> good enough.
> For example, by doing the refactoring as we done, the startup time can be
> reduced from 3 minutes to 8 seconds, It is quite a great improvement,
> especially for the systems with low TTR (time to recovery) requirement.
>
> There is any problem or risk to change this part of code as we suggested?

The only way to know would be to show the changes. If you were to
submit the patch, and assuming we agree on the design and
implementation, we could even consider including it (or, more likely,
some derivative of it).

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support