Re: Bloated pg_shdepend_depender_index

From: adey <adey11(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, pgsql-admin(at)postgresql(dot)org, "Gregory Maxwell" <gmaxwell(at)gmail(dot)com>
Subject: Re: Bloated pg_shdepend_depender_index
Date: 2006-04-03 03:54:45
Message-ID: 1c66bda80604022054i6cf1d94ewb81d4e87b6012fc9@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Please could someone help me with my questions below?

On 3/25/06, adey <adey11(at)gmail(dot)com> wrote:
>
> Two questions in this regard please?
> 1) Is tuple theory not the root of this problem
> 2) Vacuum does much the same as a traditional database reorg, and online
> reorgs are a reality now
>
> 1) If I understand tuple theory correctly, copies of rows are created
> through normal Postgres processing, that expire after a period, leaving
> multiple copies of redundant data to be vacuumed dead, then vacuumed out (or
> both). Most databases have been built using one copy of a row with
> sophisticated locking control mechanism that Postgres has some of anyway,
> and the industry has developed methods and designs to exploit locking to
> best advantage. Even with tuples, locking is still evident in Postgres.
> OR
> 2) Can vacuum full not be redesigned to run online without locking tables
> and users, like a conventional online reorg, eg: work on 1 data page at a
> time instead of locking the whole table with a shorter period at the end to
> lock the table and "compress" the remaining populated data pages and release
> disk space back to the OS; or one data file at a time, and have vacuum full
> *per table* reduce / tidy up the wraparound value, thereby avoiding a full
> DB vacuum for longer periods. In this way vacuum can be performed regularly
> and be less intrusive. Nowadays 24x7 is more of a reality for systems and we
> can't afford to take systems down for many hours to perform regular
> maintenance.
>
> (It would be extremely helpful to DBA's with little OS experience or
> access to have more automation in PGAdmin, especially task scheduling and
> alerting, so SQL can be scheduled in PGAmin instead of crontab, which is
> usually a sysadmin function).
>
>
>
>
> On 3/25/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> > > Am Freitag, 24. März 2006 05:48 schrieb Tom Lane:
> > >> Well, the VACUUM FULL algorithm is incapable of shrinking indexes ---
> > >> the only way is REINDEX, or something else that reconstructs indexes
> > >> from scratch, such as CLUSTER. One of the things we need to look
> > into
> > >> is putting more smarts into VACUUM so that it automatically does
> > >> something reasonable when faced with extreme cases like these.
> >
> > > If the user is running VACUUM FULL, he has presumably determined that
> > the
> > > table is too bloated to be recovered in a graceful way, and quite
> > likely the
> > > indexes are going to be bloated similarly. So seemingly one might as
> > well
> > > launch a reindexing on the table after VACUUM FULL has done its thing.
> >
> > > Whether that should be automatic is another question but perhaps the
> > advice
> > > should be documented somewhere?
> >
> > Actually, I wonder whether VACUUM FULL shouldn't be thrown away and
> > replaced by something else entirely. That algorithm only really works
> > nicely when just a small percentage of the rows need to be moved to
> > re-compact the table --- if you're moving lots of rows, it makes the
> > index bloat situation *worse* not better because of the transient need
> > for index entries pointing to both copies of moved rows. Lazy VACUUM
> > has become the de-facto standard for situations where there's not a huge
> > amount of empty space, and so it's not clear where the sweet spot is for
> >
> > VACUUM FULL anymore. If you've got enough disk space, a rewrite (like
> > CLUSTER or ALTER TABLE) is going to blow the doors off VACUUM FULL,
> > let alone VACUUM FULL plus REINDEX. Not to mention that for
> > sufficiently huge tables, VACUUM FULL fails outright because it runs out
> >
> > of RAM.
> >
> > We need to fix CLUSTER to make it MVCC-safe (ie, not discard
> > recently-dead rows), and it'd be nice to have something like it that
> > didn't worry about ordering but just did a seqscan of the source table.
> > Then I'd be inclined to recommend that instead of VACUUM FULL for most
> > cases of severe bloat.
> >
> > Unfortunately this all breaks down for shared system catalogs and the
> > core (nailed-in) catalogs, because we can't change their relfilenodes
> > and so the crash-safe CLUSTER/REINDEX approach doesn't work. We still
> > need a new idea or two there.
> >
> > regards, tom lane
> >
> > ---------------------------(end of broadcast)---------------------------
> >
> > TIP 2: Don't 'kill -9' the postmaster
> >
>
>

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Brendan Duddridge 2006-04-03 04:15:44 Re: Setting up of PITR system.
Previous Message Tom Lane 2006-04-03 03:39:40 Re: auto vacuuming