Re: Minimally avoiding Transaction Wraparound in VLDBs

Lists: pgsql-hackers
From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-08-31 22:17:44
Message-ID: 1125526664.3956.62.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


For a while now, I've been seeking a method of reducing the impact of
VACUUM when run against a database where 80-95% of the data is
effectively read only and will not be deleted/updated again. This is the
situation in most Data Warehouses. When you get very large databases
(VLDB) the execution time of VACUUM becomes prohibitive.

I understand the need to run some form of VACUUM to avoid transaction id
wraparound, but I see that VACUUM does a lot of other things too.

A new command is proposed - VACUUM MINIMAL. The *sole* purpose of this
command is to do the absolute minimum required to avoid transaction id
wraparound. (Better names welcome....)

This does the same thing as VACUUM except it:

1. does not VACUUM any table younger than 4 billion XIDs old
RATIONALE: If a table was created less than 4 billion XIDs ago, it
clearly can't have any tuples in it with an XID more than 4 billion XIDs
old, so we don't need to VACUUM it to avoid XID wraparound.
(Current VACUUM will scan everything, even if a table was created only a
few transactions ago).

2. does not VACUUM indexes
RATIONALE: Indexes don't contain XIDs so don't *need* to be vacuumed.

By taking those two steps, VACUUM MINIMAL will execute fairly quickly
even on large Data Warehouses. Those steps are fairly easy to implement
without change to the basic structure of VACUUM.

This command only makes sense when you *know* that tables don't need
vacuuming. That is the case when:
- you're using autovacuum, since all tables have dead-tuples removed
whenever this is required - and so indexes will have been covered also
- you are using table partitioning and the data retention period of your
data is less than 4 billion transactions. At 100 requests/sec that is a
whole year of data - and if you are using COPY to load the data, then
that comes out at hundreds of billions of rows, or Terabytes of data.
(Which is also the rationale, since you really *don't* want to VACUUM a
Terabyte of tables with indexes on them, ever).

The limit is set at 4 billion because with this command we are trying to
avoid doing work as long as possible. This makes the command faster,
which in turn allows the command to be run more regularly, probably
daily. Of course, you would get a somewhat longer running command once
table XIDs have been frozen but this is for the user to understand and
avoid, if they have problems with that.

Thus, the user has a choice of two ways of avoiding XID wraparound:
- VACUUM
- VACUUM MINIMAL
Each with their specific strengths and weaknesses.

We've discussed in the past the idea that VACUUM can be speeded-up by
using a bitmap to track which blocks to clean. That's a good idea and I
see that as somewhat orthogonal to the reason for this proposal. To be
of use in the circumstances I'm trying to optimise for, the vacuum
bitmaps would need to be non-lossy, persistent and recoverable to be of
use for xid wraparound use (I know the clog code could be used for
that), as well as only utilised for tables bigger than a certain
threshold, say 128 heap blocks or more - to avoid having too many
bitmaps when there are 1000s of tables. They also still don't help much
with VACUUMing big indexes in a DW context, since VACUUM still makes two
passes of each index even when there are no dead rows to remove from the
the index. That could be tuned somewhat, for which I also have a design
but why bother tuning VACUUM when you can just skip it?

Comments?

Best Regards, Simon Riggs


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-08-31 23:24:35
Message-ID: 5588.1125530675@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> A new command is proposed - VACUUM MINIMAL. The *sole* purpose of this
> command is to do the absolute minimum required to avoid transaction id
> wraparound. (Better names welcome....)

I do not see the point. If you only need to run it every billion
transactions, saving a few cycles seems a bit pointless.

> This does the same thing as VACUUM except it:

> 1. does not VACUUM any table younger than 4 billion XIDs old

So? Your large tables are likely to be long-lived, so this isn't
actually going to save a thing in a DB that's been around long enough
to have an impending wrap problem.

> 2. does not VACUUM indexes
> RATIONALE: Indexes don't contain XIDs so don't *need* to be vacuumed.

Nonstarter. If you remove any tuples, you *must* vacuum indexes to
remove the corresponding index entries. If you don't remove any tuples,
you don't scan the indexes anyway IIRC.

> By taking those two steps, VACUUM MINIMAL will execute fairly quickly
> even on large Data Warehouses.

I don't think this'll save a thing.

> This command only makes sense when you *know* that tables don't need
> vacuuming. That is the case when:
> - you're using autovacuum, since all tables have dead-tuples removed
> whenever this is required - and so indexes will have been covered also

If you're using autovacuum then the problem is already taken care of.
It will be taken care of better in 8.2, if we add per-table tracking
of XID wraparound risk, but it's handled now. The only way that this
proposal makes any sense is if you are trying not to vacuum at all, ever.

> - you are using table partitioning and the data retention period of your
> data is less than 4 billion transactions.

Again, per-table tracking of wraparound horizon would take care of this
case, more elegantly and more safely.

> The limit is set at 4 billion because with this command we are trying to
> avoid doing work as long as possible.

You do realize that 2 billion is already the wrap horizon, and you can't
wait that long if you're doing this on a routine basis rather than
immediately-when-needed?

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-09-01 00:57:02
Message-ID: 1125536222.3956.123.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2005-08-31 at 19:24 -0400, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > A new command is proposed - VACUUM MINIMAL. The *sole* purpose of this
> > command is to do the absolute minimum required to avoid transaction id
> > wraparound. (Better names welcome....)
>
> I do not see the point. If you only need to run it every billion
> transactions, saving a few cycles seems a bit pointless.

It is not simply a few cycles, it could be days of grinding I/O. The
worse it is, the more it gets put off, which makes it worse etc..

...it kindof prevents embedded systems from working happily.

> > This does the same thing as VACUUM except it:
>
> > 1. does not VACUUM any table younger than 4 billion XIDs old
>
> So? Your large tables are likely to be long-lived, so this isn't
> actually going to save a thing in a DB that's been around long enough
> to have an impending wrap problem.

My point was, with table partitioning that the largest tables would
*not* be the longest lived. Any very large table has to be truncated
down to a manageable size at some point. Partitioning completely avoids
the need to DELETE or TRUNCATE, since you use CREATE TABLE to introduce
new partitions and DROP TABLE to remove them.

> > 2. does not VACUUM indexes
> > RATIONALE: Indexes don't contain XIDs so don't *need* to be vacuumed.
>
> Nonstarter. If you remove any tuples, you *must* vacuum indexes to
> remove the corresponding index entries.

Yes, but if no tuples have been deleted or updated, there will be no
tuples removed.

> If you don't remove any tuples,
> you don't scan the indexes anyway IIRC.

No. Even if you remove *zero* tuples, an index is still scanned twice.
Once to not delete the rows and once to not delete the pages.

That isn't as mad as it sounds, since a VACUUM needs to mark pages as
HALF_DEAD on one VACUUM, then come back for a DELETED marker later
because of MVCC. But that behaviour isn't required at all in the
scenario I described, so I was seeking a way to get around that.

We could optimise that, as I said, but I was trying to avoid the problem
entirely. But, I'm easy, if you like the sound of that better I have the
details for that worked out also.

> > By taking those two steps, VACUUM MINIMAL will execute fairly quickly
> > even on large Data Warehouses.
>
> I don't think this'll save a thing.

I understand that the scenarios I describe seem highly atypical, but
there are a number people already running this kind of system. I've
worked with a couple and I've seen a half-dozen others on list, and
there are more coming.

> > This command only makes sense when you *know* that tables don't need
> > vacuuming. That is the case when:
> > - you're using autovacuum, since all tables have dead-tuples removed
> > whenever this is required - and so indexes will have been covered also
>
> If you're using autovacuum then the problem is already taken care of.

autovacuum will respond only to UPDATEs and DELETEs. In the scenario I
outline, these will *never* occur on the largest tables. A VACUUM would
still eventually be required to freeze long lived tuples and this would
not be performed by autovacuum.

> It will be taken care of better in 8.2, if we add per-table tracking
> of XID wraparound risk, but it's handled now.

Thats a good idea, I must have missed the discussion on that.

But what I am suggesting is for a certain class of table, which just
happens to be very large, we defer a VACUUM for as long as possible. The
hope is, and I believe that this could be very likely, that the table
would cease to exist before the table became eligible for VACUUM.

The current viewpoint is that "most" rows never live for 1 billion rows,
so never get frozen. Thats completely valid in something like a Stock
table, but its completely wrong in a TransactionHistory table.

For a historical data table I want to be able to defer the need for a
full table VACUUM for as long as possible, and when/should it ever
occur, I want to avoid scanning the indexes pointlessly.

> The only way that this
> proposal makes any sense is if you are trying not to vacuum at all, ever.

Yes, that is exactly what I'm trying to do, for the largest tables only.

I never want to VACUUM them because I know they don't need it to reclaim
rows and I have a *good* expectation that the table will be dropped, one
day.

For more "normal" tables, I'm happy to VACUUM them and don't want to
alter that behaviour at all.

> You do realize that 2 billion is already the wrap horizon, and you can't
> wait that long if you're doing this on a routine basis rather than
> immediately-when-needed?

OK, 4 billion was just a mistake...I just meant avoid VACUUM as long as
possible. I changed the number immediately before posting, to emphasise
the deferral aspect of this proposal.

Best Regards, Simon Riggs


From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-09-01 01:48:04
Message-ID: 20050901014804.GA18701@surnet.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Sep 01, 2005 at 01:57:02AM +0100, Simon Riggs wrote:

> > If you're using autovacuum then the problem is already taken care of.
>
> autovacuum will respond only to UPDATEs and DELETEs. In the scenario I
> outline, these will *never* occur on the largest tables. A VACUUM would
> still eventually be required to freeze long lived tuples and this would
> not be performed by autovacuum.

Hum, I don't understand -- if you don't want to vacuum the table, why
run vacuum at all? You can (as of 8.1) disable autovacuum for specific
tables. The exception is that you are forced to run a database-wide
VACUUM once in a while (every billion-and-so), but this will hopefully
disappear in 8.2 too, leaving you effectively with the option of never
vacuuming a table.

--
Alvaro Herrera -- Valdivia, Chile Architect, www.EnterpriseDB.com
You liked Linux a lot when he was just the gawky kid from down the block
mowing your lawn or shoveling the snow. But now that he wants to date
your daughter, you're not so sure he measures up. (Larry Greenemeier)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-09-01 02:21:36
Message-ID: 7684.1125541296@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On Wed, 2005-08-31 at 19:24 -0400, Tom Lane wrote:
>> If you don't remove any tuples,
>> you don't scan the indexes anyway IIRC.

> No. Even if you remove *zero* tuples, an index is still scanned twice.
> Once to not delete the rows and once to not delete the pages.

Yeah? Well, that could probably be improved with a less intrusive fix,
that is, one that does it automatically instead of involving the user.

I really really do not like proposals to introduce still another kind
of VACUUM. We have too many already; any casual glance through the
archives will show that most PG users don't have a grip on when to use
VACUUM FULL vs VACUUM. Throwing in some more types will make that
problem exponentially worse.

> autovacuum will respond only to UPDATEs and DELETEs. In the scenario I
> outline, these will *never* occur on the largest tables. A VACUUM would
> still eventually be required to freeze long lived tuples and this would
> not be performed by autovacuum.

Yes, it will, at least as of 8.1.

regards, tom lane


From: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-09-01 02:40:51
Message-ID: 43166A33.5@familyhealth.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> I really really do not like proposals to introduce still another kind
> of VACUUM. We have too many already; any casual glance through the
> archives will show that most PG users don't have a grip on when to use
> VACUUM FULL vs VACUUM. Throwing in some more types will make that
> problem exponentially worse.

Yes, but if they're all under the control of autovacuum, then users
don't have to worry...

Chris


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-09-01 02:55:41
Message-ID: 7926.1125543341@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au> writes:
>> I really really do not like proposals to introduce still another kind
>> of VACUUM. We have too many already; any casual glance through the
>> archives will show that most PG users don't have a grip on when to use
>> VACUUM FULL vs VACUUM. Throwing in some more types will make that
>> problem exponentially worse.

> Yes, but if they're all under the control of autovacuum, then users
> don't have to worry...

Well, if the proposal comes packaged with an algorithm by which
autovacuum will use it, that's a different story. What's sticking in
my craw about this proposal is really that it's assuming detailed manual
management of vacuuming, which is exactly the thing we've been sweating
to get rid of.

BTW ... the original Berkeley papers on Postgres make frequent reference
to a "vacuum daemon", which seems to be essentially what we're trying to
build with autovacuum. Does anyone know if the Berkeley implementation
ever actually had auto vacuuming, or was that all handwaving? If it did
exist, why was it removed?

regards, tom lane


From: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-09-01 03:19:45
Message-ID: Pine.LNX.4.58.0509011316530.6411@linuxworld.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 31 Aug 2005, Tom Lane wrote:

> Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au> writes:
> >> I really really do not like proposals to introduce still another kind
> >> of VACUUM. We have too many already; any casual glance through the
> >> archives will show that most PG users don't have a grip on when to use
> >> VACUUM FULL vs VACUUM. Throwing in some more types will make that
> >> problem exponentially worse.
>
> > Yes, but if they're all under the control of autovacuum, then users
> > don't have to worry...
>
> Well, if the proposal comes packaged with an algorithm by which
> autovacuum will use it, that's a different story. What's sticking in
> my craw about this proposal is really that it's assuming detailed manual
> management of vacuuming, which is exactly the thing we've been sweating
> to get rid of.
>
> BTW ... the original Berkeley papers on Postgres make frequent reference
> to a "vacuum daemon", which seems to be essentially what we're trying to
> build with autovacuum. Does anyone know if the Berkeley implementation
> ever actually had auto vacuuming, or was that all handwaving? If it did
> exist, why was it removed?

Well, from my reading of some of the early papers, VACUUM was kind of
different to what it is now. The idea was that expired data would be moved
out the heap and stored else where. A timetravel mechanism could be used
to see previous versions of the row.

It makes sense that they would manage this with a daemon, but I never saw
one. Mind, I wasn't looking for one.

Thanks,

Gavin


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-09-01 04:16:49
Message-ID: 200508312116.49401.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Gavin, Tom,

> Well, from my reading of some of the early papers, VACUUM was kind of
> different to what it is now. The idea was that expired data would be moved
> out the heap and stored else where. A timetravel mechanism could be used
> to see previous versions of the row.

And from talking to a couple of Stonebraker's former students at conferences,
this mechanism was never build satisfactorily; it always existed on paper, at
least at UCB. Illustra might have finished it. Elein around?

--
Josh Berkus
Aglio Database Solutions
San Francisco


From: Andrew - Supernews <andrew+nonews(at)supernews(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-09-01 04:21:58
Message-ID: slrndhd0f6.1vfu.andrew+nonews@trinity.supernews.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2005-09-01, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> On Thu, Sep 01, 2005 at 01:57:02AM +0100, Simon Riggs wrote:
>
>> > If you're using autovacuum then the problem is already taken care of.
>>
>> autovacuum will respond only to UPDATEs and DELETEs. In the scenario I
>> outline, these will *never* occur on the largest tables. A VACUUM would
>> still eventually be required to freeze long lived tuples and this would
>> not be performed by autovacuum.
>
> Hum, I don't understand -- if you don't want to vacuum the table, why
> run vacuum at all? You can (as of 8.1) disable autovacuum for specific
> tables. The exception is that you are forced to run a database-wide
> VACUUM once in a while (every billion-and-so), but this will hopefully
> disappear in 8.2 too,

Wishful thinking, or do you have a concrete plan to achieve it?

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-09-01 04:22:19
Message-ID: 200508312122.19664.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom,

> If you're using autovacuum then the problem is already taken care of.
> It will be taken care of better in 8.2, if we add per-table tracking
> of XID wraparound risk, but it's handled now.  The only way that this
> proposal makes any sense is if you are trying not to vacuum at all, ever.

Hmmm ... the potential problem which Simon is mentioning is very real ... in
large DWs, there may be tables/partitions which are never, ever vacuumed.
Ever.

For example, at one client's site they load their data via ETL jobs that
insert about 25 million rows a day and update about 100,000. Given that
updates are < 5%, there is no reason from a data efficiency perspective to
ever vacuum. So that's the plan ... the main fact table will never, ever
be vacuumed. (in that particular case, since the ETL uses large transaction
batches, XID wraparound won't happen for about 20 years. But with a
different data load model, it could be a serious problem).

So, will per-table XID tracking allow us to avoid *ever* vacuuming some
tables? If not, what could?

--
Josh Berkus
Aglio Database Solutions
San Francisco


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-09-01 09:29:56
Message-ID: 1125566996.3956.150.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2005-08-31 at 22:21 -0400, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > On Wed, 2005-08-31 at 19:24 -0400, Tom Lane wrote:
> >> If you don't remove any tuples,
> >> you don't scan the indexes anyway IIRC.
>
> > No. Even if you remove *zero* tuples, an index is still scanned twice.
> > Once to not delete the rows and once to not delete the pages.
>
> Yeah? Well, that could probably be improved with a less intrusive fix,
> that is, one that does it automatically instead of involving the user.
>
> I really really do not like proposals to introduce still another kind
> of VACUUM. We have too many already; any casual glance through the
> archives will show that most PG users don't have a grip on when to use
> VACUUM FULL vs VACUUM. Throwing in some more types will make that
> problem exponentially worse.

I'll post my proposal for changing that, so we can see the two
alternatives. I'm easy either way at the moment.

Best Regards, Simon Riggs


From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: andrew(at)supernews(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-09-01 13:31:13
Message-ID: 20050901133112.GB28356@surnet.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Sep 01, 2005 at 04:21:58AM -0000, Andrew - Supernews wrote:
> On 2005-09-01, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> > On Thu, Sep 01, 2005 at 01:57:02AM +0100, Simon Riggs wrote:
> >
> >> > If you're using autovacuum then the problem is already taken care of.
> >>
> >> autovacuum will respond only to UPDATEs and DELETEs. In the scenario I
> >> outline, these will *never* occur on the largest tables. A VACUUM would
> >> still eventually be required to freeze long lived tuples and this would
> >> not be performed by autovacuum.
> >
> > Hum, I don't understand -- if you don't want to vacuum the table, why
> > run vacuum at all? You can (as of 8.1) disable autovacuum for specific
> > tables. The exception is that you are forced to run a database-wide
> > VACUUM once in a while (every billion-and-so), but this will hopefully
> > disappear in 8.2 too,
>
> Wishful thinking, or do you have a concrete plan to achieve it?

We talked about it during the autovacuum discussions just before feature
freeze. There is a vague plan which I intend to study eventually.

--
Alvaro Herrera -- Valdivia, Chile Architect, www.EnterpriseDB.com
"On the other flipper, one wrong move and we're Fatal Exceptions"
(T.U.X.: Term Unit X - http://www.thelinuxreview.com/TUX/)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-09-01 13:35:35
Message-ID: 12257.1125581735@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> So, will per-table XID tracking allow us to avoid *ever* vacuuming some
> tables?

If your definition of "ever" is less than a billion transactions, sure.
(As Simon points out, with time-partitioned data sets that could often
be arranged, so it's not a completely silly response.)

> If not, what could?

The only possibility for more-than-a-billion is widening XIDs to 64
bits, which would come with its own set of performance penalties.

regards, tom lane


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-09-01 16:09:59
Message-ID: 1125590999.3956.215.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 2005-09-01 at 10:29 +0100, Simon Riggs wrote:
> On Wed, 2005-08-31 at 22:21 -0400, Tom Lane wrote:
> > Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > > On Wed, 2005-08-31 at 19:24 -0400, Tom Lane wrote:
> > >> If you don't remove any tuples,
> > >> you don't scan the indexes anyway IIRC.
> >
> > > No. Even if you remove *zero* tuples, an index is still scanned twice.
> > > Once to not delete the rows and once to not delete the pages.
> >
> > Yeah? Well, that could probably be improved with a less intrusive fix,
> > that is, one that does it automatically instead of involving the user.
> >
> > I really really do not like proposals to introduce still another kind
> > of VACUUM. We have too many already; any casual glance through the
> > archives will show that most PG users don't have a grip on when to use
> > VACUUM FULL vs VACUUM. Throwing in some more types will make that
> > problem exponentially worse.
>
> I'll post my proposal for changing that, so we can see the two
> alternatives. I'm easy either way at the moment.

Currently, VACUUM will always scan each index twice, even if there are
no tuples to remove from the index. Once in index_bulk_delete() and once
in index_vacuum_cleanup() (at least for the nbtree and rtree AMs).

My first proposal is to add an extra parameter onto the
index_bulk_delete() call - ntuples. If ntuples == 0 then btbulkdelete()
will avoid scanning and return immediately. If a scan occurs, then we
keep track of how many tuples have been marked deleted and stop the scan
when we have reached this number. This approach means that two use cases
will be optimised:
1) where the VACUUM does not remove any tuples
2) where all the deletions are on the LHS of the index, as would be the
case in a frequently updated table like Shipments where an index had
either a monotonically increasing key or a time based key.

Optimising index_vacuum_cleanup() is more complex. btvacuumcleanup()
marks pages deleted OR adds pages already deleted onto the freelist.
Non-FULL VACUUM never does both at the same time for MVCC reasons, so it
takes two VACUUMs to actually move a page back onto the freelist. So
performing a scan of the index during index_vacuum_cleanup() does
actually have a purpose when there are no tuples deleted during the
vacuum because it might be the subsequent VACUUM coming back later to
freelist the deleted pages. (This is interesting, because autovacuum
knows nothing of the deleted pages and may not trigger a second vacuum,
even though one would be beneficial). index_vacuum_cleanup() knows how
many rows are to be removed because it is already part of the info
structure passed between index_bulk_delete() and index_vacuum_cleanup().

There are two options:
1) avoid the scan when there are no dead tuples, and ignore the
possibility that a VACUUM might be doing a follow-up scan to remove
previously deleted tuples.
2) close the loophole by recording whether a previous VACUUM had set any
pages as deleted, or not. If the index was "clean", then we would skip
the scan.

1) seems the best option since 2) is a lot of work for less gain.

If all of that seems OK, I'll code a patch.

Best Regards, Simon Riggs


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-09-01 18:03:14
Message-ID: 4809.1125597794@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> My first proposal is to add an extra parameter onto the
> index_bulk_delete() call - ntuples. If ntuples == 0 then btbulkdelete()
> will avoid scanning and return immediately. If a scan occurs, then we
> keep track of how many tuples have been marked deleted and stop the scan
> when we have reached this number.

This seems reasonable. I think you overstate the value of an early
stop, but it can't hurt given the minimal cost of making the check.

> 1) avoid the scan when there are no dead tuples, and ignore the
> possibility that a VACUUM might be doing a follow-up scan to remove
> previously deleted tuples.

I think this is a pretty bad idea.

Your analysis assumes that the two scans are equally expensive,
which is about as far from true as is possible. In the first place,
the first scan is in index order while the second is in physical order
(hence can benefit from readahead). In the second place, the first
scan iterates over each item on each page, where the second does not.
So what you're proposing is to install an unavoidable disk space bloat
mechanism in order to save only a small part of the cost of VACUUM.

(What I'd really like to see is a way to do the first scan in physical
order, but we'd need a different locking approach AFAICS --- see the
notes in nbtree/README.)

Keeping track of whether there's any work pending for the second pass
doesn't seem too unreasonable to me. It was left undone in the original
coding on the KISS principle, but it could certainly be done. I'd still
like to see some demonstration first that this part of index vacuuming
is a significant cost compared to the heap and first-pass index costs.

regards, tom lane


From: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-09-22 05:52:21
Message-ID: Pine.LNX.4.58.0509221550570.24326@linuxworld.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 31 Aug 2005, Tom Lane wrote:

> BTW ... the original Berkeley papers on Postgres make frequent reference
> to a "vacuum daemon", which seems to be essentially what we're trying to
> build with autovacuum. Does anyone know if the Berkeley implementation
> ever actually had auto vacuuming, or was that all handwaving? If it did
> exist, why was it removed?

Well, I was just poking around the executor and noticed this in
ExecDelete():

/*
* Note: Normally one would think that we have to delete index tuples
* associated with the heap tuple now..
*
* ... but in POSTGRES, we have no need to do this because the vacuum
* daemon automatically opens an index scan and deletes index tuples
* when it finds deleted heap tuples. -cim 9/27/89
*/

So, it seems they must have actually written the vacuum daemon.

Gavin


From: David Fetter <david(at)fetter(dot)org>
To: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Cimarron Taylor <cimarron(at)christopherrobin(dot)kolo(dot)net>
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-09-22 06:55:41
Message-ID: 20050922065541.GC20409@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Sep 22, 2005 at 03:52:21PM +1000, Gavin Sherry wrote:
> On Wed, 31 Aug 2005, Tom Lane wrote:
>
> > BTW ... the original Berkeley papers on Postgres make frequent reference
> > to a "vacuum daemon", which seems to be essentially what we're trying to
> > build with autovacuum. Does anyone know if the Berkeley implementation
> > ever actually had auto vacuuming, or was that all handwaving? If it did
> > exist, why was it removed?
>
> Well, I was just poking around the executor and noticed this in
> ExecDelete():
>
> /*
> * Note: Normally one would think that we have to delete index tuples
> * associated with the heap tuple now..
> *
> * ... but in POSTGRES, we have no need to do this because the vacuum
> * daemon automatically opens an index scan and deletes index tuples
> * when it finds deleted heap tuples. -cim 9/27/89
> */

I have an idea who this might be :)

Cheers,
D
--
David Fetter david(at)fetter(dot)org http://fetter.org/
phone: +1 510 893 6100 mobile: +1 415 235 3778

Remember to vote!


From: Michael Glaesemann <grzm(at)myrealbox(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Cimarron Taylor <cimarron(at)christopherrobin(dot)kolo(dot)net>
Subject: Re: Minimally avoiding Transaction Wraparound in VLDBs
Date: 2005-09-22 07:15:26
Message-ID: 1D4141D0-F12E-4234-AFBA-21635BE27017@myrealbox.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Sep 22, 2005, at 3:55 PM, David Fetter wrote:

> On Thu, Sep 22, 2005 at 03:52:21PM +1000, Gavin Sherry wrote:
>
>> On Wed, 31 Aug 2005, Tom Lane wrote:
>>
>> Well, I was just poking around the executor and noticed this in
>> ExecDelete():
>>
>> /*
>> * Note: Normally one would think that we have to delete index
>> tuples
>> * associated with the heap tuple now..
>> *
>> * ... but in POSTGRES, we have no need to do this because the
>> vacuum
>> * daemon automatically opens an index scan and deletes index
>> tuples
>> * when it finds deleted heap tuples. -cim 9/27/89
>> */
>>
>
> I have an idea who this might be :)

giof

http://www.taylors.org/cim/resume/cimarron.html

Michael Glaesemann
grzm myrealbox com