Re: Further pg_upgrade analysis for many tables

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: Further pg_upgrade analysis for many tables
Date: 2013-01-20 18:42:29
Message-ID: 20130120184229.GQ16126@tamriel.snowman.net
Lists: pgsql-hackers

* Jeff Janes (jeff(dot)janes(at)gmail(dot)com) wrote:
> By making the list over-flowable, we fix a demonstrated pathological
> workload (restore of huge schemas); we impose no detectable penalty to
> normal workloads; and we fail to improve, but also fail to make worse, a
> hypothetical pathological workload. All at the expense of a few bytes per
> backend.
[...]
> > Why does the list not grow as needed?
>
> It would increase the code complexity for no concretely-known benefit.
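For concreteness, here's roughly how I understand the scheme (a
standalone sketch with made-up identifiers, stub cleanup functions, and
an arbitrary cap; not code from the patch): a small fixed array plus an
overflow flag, falling back to the old full-hash scan when the flag is
set.

    #include <stdbool.h>
    #include <stdio.h>

    typedef unsigned int Oid;     /* stand-in for PostgreSQL's Oid type */

    #define MAX_EOXACT_LIST 32    /* illustrative cap, not the patch's */

    static Oid  eoxact_list[MAX_EOXACT_LIST];
    static int  eoxact_list_len = 0;
    static bool eoxact_list_overflowed = false;

    /* Stubs standing in for the real relcache cleanup operations. */
    static void cleanup_one_relation(Oid relid) { printf("cleanup %u\n", relid); }
    static void scan_entire_relcache(void) { printf("full RelationIdCache scan\n"); }

    /* Remember a relation needing end-of-xact cleanup; on overflow,
     * just set a flag rather than growing the list. */
    static void
    remember_for_cleanup(Oid relid)
    {
        if (eoxact_list_len < MAX_EOXACT_LIST)
            eoxact_list[eoxact_list_len++] = relid;
        else
            eoxact_list_overflowed = true;
    }

    /* At commit/abort: if the list overflowed, fall back to the old
     * behavior of scanning the whole hash; otherwise visit only the
     * listed relations. */
    static void
    cleanup_at_eoxact(void)
    {
        if (eoxact_list_overflowed)
            scan_entire_relcache();
        else
        {
            int i;

            for (i = 0; i < eoxact_list_len; i++)
                cleanup_one_relation(eoxact_list[i]);
        }
        eoxact_list_len = 0;
        eoxact_list_overflowed = false;
    }

    int
    main(void)
    {
        Oid relid;

        for (relid = 1; relid <= 40; relid++)   /* 40 > 32 forces overflow */
            remember_for_cleanup(relid);
        cleanup_at_eoxact();                    /* takes the full-scan fallback */
        return 0;
    }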

I'm curious whether this is going to help with rollbacks of
transactions which created lots of tables? We've certainly seen those
take much longer than we'd like, although I've generally attributed it
to doing all of the unlinking and truncating of files.

I also wonder about making this a linked list or something else which
can trivially grow as we go and then be walked later. That would also
keep its size small instead of a static/fixed allocation.

> 1) It would have to have some transactions that cause >10 or >100
> relations to need cleanup.

That doesn't seem hard.

> 2) It would have to have even more hundreds of relations
> in RelationIdCache which don't need cleanup (otherwise, if most
> of RelationIdCache needs cleanup, then iterating over that hash would
> be just as efficient as iterating over a list which contains most of
> that hash)

Good point.

> 3) The above described transaction would have to happen over and over
> again, because if it only happens once there is no point in worrying about
> a little inefficiency.

We regularly do builds that create lots of tables which are later
either committed or dropped (much of that is due to our hand-crafted
partitioning system).

Looking through the patch itself, it looks pretty clean to me.

Thanks,

Stephen
