Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

Lists: pgsql-hackerspgsql-performance
From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Why is vacuum_freeze_min_age 100m?
Date: 2009-08-11 21:14:12
Message-ID: 4A81DF24.5000102@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

All,

I've just been tweaking some autovac settings for a large database, and
came to wonder: why does vacuum_max_freeze_age default to such a high
number? What's the logic behind that?

AFAIK, you want max_freeze_age to be the largest possible interval of
XIDs where an existing transaction might still be in scope, but no
larger. Yes?

If that's the case, I'd assert that users who do actually go through
100M XIDs within a transaction window are probably doing some
hand-tuning. And we could lower the default for most users
considerably, such as to 1 million.

Have I missed something?

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Why is vacuum_freeze_min_age 100m?
Date: 2009-08-11 21:23:59
Message-ID: 4A81E16F.3090507@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On 8/11/09 2:14 PM, Josh Berkus wrote:
> All,
>
> I've just been tweaking some autovac settings for a large database, and
> came to wonder: why does vacuum_max_freeze_age default to such a high
> number? What's the logic behind that?
>
> AFAIK, you want max_freeze_age to be the largest possible interval of
> XIDs where an existing transaction might still be in scope, but no
> larger. Yes?
>
> If that's the case, I'd assert that users who do actually go through
> 100M XIDs within a transaction window are probably doing some
> hand-tuning. And we could lower the default for most users
> considerably, such as to 1 million.

(replying to myself) actually, we don't want to set FrozenXID until the
row is not likely to be modified again. However, for most small-scale
installations (ones where the user has not done any tuning) that's still
likely to be less than 100m transactions.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Why is vacuum_freeze_min_age 100m?
Date: 2009-08-11 21:45:55
Message-ID: 603c8f070908111445o6a0fbdrb392a649528aaa73@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Tue, Aug 11, 2009 at 5:23 PM, Josh Berkus<josh(at)agliodbs(dot)com> wrote:
> On 8/11/09 2:14 PM, Josh Berkus wrote:
>> All,
>>
>> I've just been tweaking some autovac settings for a large database, and
>> came to wonder: why does vacuum_max_freeze_age default to such a high
>> number?  What's the logic behind that?
>>
>> AFAIK, you want max_freeze_age to be the largest possible interval of
>> XIDs where an existing transaction might still be in scope, but no
>> larger.  Yes?
>>
>> If that's the case, I'd assert that users who do actually go through
>> 100M XIDs within a transaction window are probably doing some
>> hand-tuning.  And we could lower the default for most users
>> considerably, such as to 1 million.
>
> (replying to myself) actually, we don't want to set FrozenXID until the
> row is not likely to be modified again.  However, for most small-scale
> installations (ones where the user has not done any tuning) that's still
> likely to be less than 100m transactions.

I don't think that's the name of the parameter, since a Google search
gives zero hits. There are so many fiddly parameters for this thing
that I don't want to speculate about which one you meant.

...Robert


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Why is vacuum_freeze_min_age 100m?
Date: 2009-08-11 22:06:54
Message-ID: 4A81EB7E.4050003@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance


> I don't think that's the name of the parameter, since a Google search
> gives zero hits. There are so many fiddly parameters for this thing
> that I don't want to speculate about which one you meant.

Sorry, subject line had it correct.

http://www.postgresql.org/docs/8.4/static/runtime-config-client.html#GUC-VACUUM-FREEZE-MIN-AGE

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Why is vacuum_freeze_min_age 100m?
Date: 2009-08-12 00:54:44
Message-ID: 1849.1250038484@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> I've just been tweaking some autovac settings for a large database, and
> came to wonder: why does vacuum_max_freeze_age default to such a high
> number? What's the logic behind that?

(1) not destroying potentially useful forensic evidence too soon;
(2) there's not really much to be gained by reducing it.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Why is vacuum_freeze_min_age 100m?
Date: 2009-08-12 01:11:37
Message-ID: 603c8f070908111811y31a9f7c8v5e14a1f8ecaff2a9@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Tue, Aug 11, 2009 at 6:06 PM, Josh Berkus<josh(at)agliodbs(dot)com> wrote:
>
>> I don't think that's the name of the parameter, since a Google search
>> gives zero hits.  There are so many fiddly parameters for this thing
>> that I don't want to speculate about which one you meant.
>
> Sorry, subject line had it correct.
>
> http://www.postgresql.org/docs/8.4/static/runtime-config-client.html#GUC-VACUUM-FREEZE-MIN-AGE

Ah. Yeah, I agree with Tom: how would it help to make this smaller?
It seems like that could possibly increase I/O, if the old data is
changing at all, but even if it doesn't it I don't see that it saves
you anything to freeze it sooner. Generally freezing is unnecessary
pain: if we had 128-bit transaction IDs, I'm guessing that we wouldn't
care about freezing or wraparound at all. (Of course that would
create other problems, which is why we don't, but the point is
freezing is at best a necessary evil.)

...Robert


From: Torsten Zühlsdorff <foo(at)meisterderspiele(dot)de>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Why is vacuum_freeze_min_age 100m?
Date: 2009-08-12 06:48:45
Message-ID: h5tp40$5u4$1@news.eternal-september.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Tom Lane schrieb:
> Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> I've just been tweaking some autovac settings for a large database, and
>> came to wonder: why does vacuum_max_freeze_age default to such a high
>> number? What's the logic behind that?
>
> (1) not destroying potentially useful forensic evidence too soon;
> (2) there's not really much to be gained by reducing it.

If there is not really much to gain by changing the value, why do not
remove the parameter?

Greetings from germany,
Torsten


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Josh Berkus" <josh(at)agliodbs(dot)com>,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Why is vacuum_freeze_min_age 100m?
Date: 2009-08-12 18:17:05
Message-ID: 4A82C0D10200002500029977@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> (2) there's not really much to be gained by reducing it.

That depends. The backup techniques I recently posted, using hard
links and rsync, saved us the expense of another ten or twenty TB of
mirrored SAN archival storage space, and expensive WAN bandwidth
upgrades. In piloting this we found that we were sending our
insert-only data over the wire twice -- once after it was inserted and
once after it aged sufficiently to be frozen. Aggressive freezing
effectively cut our bandwidth and storage needs for backup down almost
by half. (Especially after we made sure we left enough time for the
VACUUM FREEZE to complete before starting that night's backup
process.)

Not that most people have the same issue, but there are at least
*some* situations where there is something significant to be gained by
aggressive freezing. Not that this is an argument for changing the
*default*, of course; if someone is going to venture into these backup
techniques, they'd better have the technical savvy to deal with
tweaking their freeze strategy.

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Why is vacuum_freeze_min_age 100m?
Date: 2009-08-12 21:22:11
Message-ID: 7454.1250112131@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> (2) there's not really much to be gained by reducing it.

> That depends. The backup techniques I recently posted, using hard
> links and rsync, saved us the expense of another ten or twenty TB of
> mirrored SAN archival storage space, and expensive WAN bandwidth
> upgrades. In piloting this we found that we were sending our
> insert-only data over the wire twice -- once after it was inserted and
> once after it aged sufficiently to be frozen. Aggressive freezing
> effectively cut our bandwidth and storage needs for backup down almost
> by half. (Especially after we made sure we left enough time for the
> VACUUM FREEZE to complete before starting that night's backup
> process.)

Hmmm ... if you're using VACUUM FREEZE, its behavior is unaffected by
this GUC anyway --- that option makes it use a freeze age of zero.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Why is vacuum_freeze_min_age 100m?
Date: 2009-08-12 21:33:44
Message-ID: 4A82EEE802000025000299AE@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Hmmm ... if you're using VACUUM FREEZE, its behavior is unaffected
> by this GUC anyway --- that option makes it use a freeze age of
> zero.

Yeah, I know, but feel like I'm being a bit naughty in using VACUUM
FREEZE -- the documentation says:

| Selects aggressive "freezing" of tuples. Specifying FREEZE is
| equivalent to performing VACUUM with the vacuum_freeze_min_age
| parameter set to zero. The FREEZE option is deprecated and will be
| removed in a future release; set the parameter instead.

So I figure that since it is deprecated, at some point I'll be setting
the vacuum_freeze_min_age option rather than leaving it at the default
and using VACUUM FREEZE in the nightly maintenance run.

-Kevin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Why is vacuum_freeze_min_age 100m?
Date: 2009-08-12 21:57:34
Message-ID: 10661.1250114254@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Yeah, I know, but feel like I'm being a bit naughty in using VACUUM
> FREEZE -- the documentation says:

> | Selects aggressive "freezing" of tuples. Specifying FREEZE is
> | equivalent to performing VACUUM with the vacuum_freeze_min_age
> | parameter set to zero. The FREEZE option is deprecated and will be
> | removed in a future release; set the parameter instead.

> So I figure that since it is deprecated, at some point I'll be setting
> the vacuum_freeze_min_age option rather than leaving it at the default
> and using VACUUM FREEZE in the nightly maintenance run.

I might be mistaken, but I think the reason we're planning to remove the
option is mainly so we can get rid of FREEZE as a semi-reserved keyword.
The GUC isn't going anywhere.

Anyway, the bottom line is what you said: fooling with this setting
seems like something that's only needed by advanced users.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Why is vacuum_freeze_min_age 100m?
Date: 2009-08-12 23:49:00
Message-ID: 603c8f070908121649u2524a42dj750c3719ea897b3f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Wed, Aug 12, 2009 at 5:57 PM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
>> Yeah, I know, but feel like I'm being a bit naughty in using VACUUM
>> FREEZE -- the documentation says:
>
>> | Selects aggressive "freezing" of tuples. Specifying FREEZE is
>> | equivalent to performing VACUUM with the vacuum_freeze_min_age
>> | parameter set to zero. The FREEZE option is deprecated and will be
>> | removed in a future release; set the parameter instead.
>
>> So I figure that since it is deprecated, at some point I'll be setting
>> the vacuum_freeze_min_age option rather than leaving it at the default
>> and using VACUUM FREEZE in the nightly maintenance run.
>
> I might be mistaken, but I think the reason we're planning to remove the
> option is mainly so we can get rid of FREEZE as a semi-reserved keyword.
> The GUC isn't going anywhere.
>
> Anyway, the bottom line is what you said: fooling with this setting
> seems like something that's only needed by advanced users.

Someone had the idea a while back of pre-freezing inserted tuples in
the WAL-bypass case.

It seems like in theory you could have a background process that would
iterate through dirty shared buffers and freeze tuples
opportunistically before they are written back to disk, but I'm not
sure that it would really be worth it.

...Robert


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Why is vacuum_freeze_min_age 100m?
Date: 2009-08-13 14:32:40
Message-ID: 4A83DDB802000025000299ED@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Someone had the idea a while back of pre-freezing inserted tuples in
> the WAL-bypass case.

I'm sure I'm not the one who thought up the idea and first posted
about it, but I'm certainly an advocate for it.

> It seems like in theory you could have a background process that
> would iterate through dirty shared buffers and freeze tuples
> opportunistically before they are written back to disk, but I'm not
> sure that it would really be worth it.

We have routinely been doing a database-level VACUUM FREEZE after a
pg_dump | psql copy of a database, because:

(1) Otherwise, users experience abysmal performance running routine
queries as every tuple scanned has its hint bits set during simple
SELECT statements. The massive disk write levels during SELECTs was
very confusing at first, and if you search the archives, I'm sure
you'll find that I'm not the only one who's been confused by it.

(2) Otherwise, there looms a point where every tuple restored, which
is not subsequently updated or deleted, will need to be frozen by
autovacuum -- all at the same time. Unless you're paying
extraordinary attention to the issue, you won't know when it is
coming, but the day will come. Probably in the middle of some
time-critical process which is doing a lot of work.

(3) We want to get this done before starting the WAL archiving, to
prevent having massive quantities of WAL to transmit across the WAN.

(4) With our improved backup processes we have another reason -- our
PITR base backup space requirements and WAN bandwidth usage will be
higher if we don't start from a frozen state and stay frozen.

So really, we'd be pretty silly *not* to make sure that all tuples are
frozen and have hint bits set after a pg_dump | psql copy. It would
speed the process somewhat if the tuples could be written in that
state to start with.

-Kevin


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Why is vacuum_freeze_min_age 100m?
Date: 2009-08-13 21:15:19
Message-ID: 4A848267.30207@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Robert,

> Ah. Yeah, I agree with Tom: how would it help to make this smaller?
> It seems like that could possibly increase I/O, if the old data is
> changing at all, but even if it doesn't it I don't see that it saves
> you anything to freeze it sooner.

Before 8.4, it actually does on tables which are purely cumulative
(WORM). Within a short time, say, 10,000 transactions, the rows to be
frozen are still in the cache. By 100m transactions, they are in an
archive partition which will need to be dragged from disk. So if I know
they won't be altered, then freezing them sooner would be better.

However, I can easily manage this through the autovacuum settings. I
just wanted confirmation of what I was thinking.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-13 21:33:00
Message-ID: 1250199180.24981.35.camel@monkey-cat.sm.truviso.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

[ moving to -hackers ]

If this topic has been discussed previously, please point me to the
earlier threads.

Why aren't we more opportunistic about freezing tuples? For instance, if
we already have a dirty buffer in cache, we should be more aggressive
about freezing those tuples than freezing tuples on disk.

I looked at the code, and it looks like if we freeze one tuple on the
page during VACUUM, we mark it dirty. Wouldn't that be a good
opportunity to freeze all the other tuples on the page that we can?

Or, perhaps when the bgwriter is flushing dirty buffers, it can look for
opportunities to set hint bits or freeze tuples.

Regards,
Jeff Davis


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-13 21:58:04
Message-ID: 20090813215804.GV5909@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Jeff Davis wrote:

> Why aren't we more opportunistic about freezing tuples? For instance, if
> we already have a dirty buffer in cache, we should be more aggressive
> about freezing those tuples than freezing tuples on disk.

The most widely cited reason is that you lose forensics data. Although
they are increasingly rare, there are still situations in which the heap
tuple machinery messes up and the xmin/xmax/etc fields of the tuple are
the best/only way to find out what happened and thus fix the bug. If
you freeze early, there's just no way to know.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Jeff Davis" <pgsql(at)j-davis(dot)com>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>,<pgsql-hackers(at)postgresql(dot)org>, <pgsql-performance(at)postgresql(dot)org>
Subject: [PERFORM] Re: freezing tuples ( was: Why is vacuum_freeze_min_age100m? )
Date: 2009-08-13 22:17:28
Message-ID: 4A844AA80200002500029A7F@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:
> Jeff Davis wrote:
>
>> Why aren't we more opportunistic about freezing tuples? For
>> instance, if we already have a dirty buffer in cache, we should be
>> more aggressive about freezing those tuples than freezing tuples on
>> disk.
>
> The most widely cited reason is that you lose forensics data.
> Although they are increasingly rare, there are still situations in
> which the heap tuple machinery messes up and the xmin/xmax/etc
> fields of the tuple are the best/only way to find out what happened
> and thus fix the bug. If you freeze early, there's just no way to
> know.

Although I find it hard to believe that this is compelling argument in
the case where an entire table or database is loaded in a single
database transaction.

In the more general case, I'm not sure why this argument applies here
but not to cassert and other diagnostic options. It wouldn't surprise
me to find workloads where writing data three times (once for the
data, once for hint bits, and once to freeze the tid) affects
performance more than cassert.

-Kevin


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-13 22:20:43
Message-ID: 1250202043.24981.62.camel@monkey-cat.sm.truviso.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Thu, 2009-08-13 at 17:58 -0400, Alvaro Herrera wrote:
> The most widely cited reason is that you lose forensics data. Although
> they are increasingly rare, there are still situations in which the heap
> tuple machinery messes up and the xmin/xmax/etc fields of the tuple are
> the best/only way to find out what happened and thus fix the bug. If
> you freeze early, there's just no way to know.

As it stands, it looks like it's not just one extra write for each
buffer, but potentially many (theoretically, as many as there are tuples
on a page). I suppose the reasoning is that tuples on the same page have
approximately the same xmin, and are likely to be frozen at the same
time. But it seems entirely reasonable that the xmins on one page span
several VACUUM runs, and that seems more likely with the FSM. That means
that a few tuples on the page are older than 100M and get frozen, and
the rest are only about 95M transactions old, so we have to come back
and freeze them again, later.

Let's say that we had a range like 50-100M, where if it's older than
100M, we freeze it, and if it's older than 50M we freeze it only if it's
on a dirty page. We would still have forensic evidence, but we could
make a range such that we avoid writing multiple times.

And people who don't care about forensic evidence can set it to 0-100M.

Regards,
Jeff Davis


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] Re: freezing tuples ( was: Why is vacuum_freeze_min_age100m? )
Date: 2009-08-13 22:24:21
Message-ID: 1250202261.24981.66.camel@monkey-cat.sm.truviso.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Thu, 2009-08-13 at 17:17 -0500, Kevin Grittner wrote:
> It wouldn't surprise
> me to find workloads where writing data three times (once for the
> data, once for hint bits, and once to freeze the tid)

I'm not sure that we're limited to 3 times, here. I could be missing
something, but if you have tuples with different xmins on the same page,
some might be older than 100M, which you freeze, and then you will have
to come back later to freeze the rest. As far as I can tell, the maximum
number of writes is the number of tuples that fit on the page.

Regards,
Jeff Davis


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-13 22:25:16
Message-ID: 603c8f070908131525o54e95dbbna868c370485c9381@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Thu, Aug 13, 2009 at 5:33 PM, Jeff Davis<pgsql(at)j-davis(dot)com> wrote:
> Or, perhaps when the bgwriter is flushing dirty buffers, it can look for
> opportunities to set hint bits or freeze tuples.

One of the tricky things here is that the time you are mostly likely
to want to do this is when you are loading a lot of data. But in that
case shared buffers are likely to be written back to disk before
transaction commit, so it'll be too early to do anything.

...Robert


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-performance(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PERFORM] Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-13 22:35:32
Message-ID: 4A849534.4030506@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance


>> Why aren't we more opportunistic about freezing tuples? For instance, if
>> we already have a dirty buffer in cache, we should be more aggressive
>> about freezing those tuples than freezing tuples on disk.
>
> The most widely cited reason is that you lose forensics data. Although
> they are increasingly rare, there are still situations in which the heap
> tuple machinery messes up and the xmin/xmax/etc fields of the tuple are
> the best/only way to find out what happened and thus fix the bug. If
> you freeze early, there's just no way to know.

That argument doesn't apply. If the page is in memory and is being
written anyway, and some of the rows are past vacuum_freeze_min_age,
then why not freeze them rather than waiting for a vacuum process to
read them off disk and rewrite them?

We're not talking about freezing every tuple as soon as it's out of
scope. Just the ones which are more that 100m (or whatever the setting
is) old. I seriously doubt that anyone is doing useful forensics using
xids which are 100m old.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-13 22:46:10
Message-ID: 29103.1250203570@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> Let's say that we had a range like 50-100M, where if it's older than
> 100M, we freeze it, and if it's older than 50M we freeze it only if it's
> on a dirty page. We would still have forensic evidence, but we could
> make a range such that we avoid writing multiple times.

Yeah, making the limit "slushy" would doubtless save some writes, with
not a lot of downside.

> And people who don't care about forensic evidence can set it to 0-100M.

Everybody *thinks* they don't care about forensic evidence. Until they
need it.

regards, tom lane


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-13 23:01:08
Message-ID: 1250204468.24981.94.camel@monkey-cat.sm.truviso.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Thu, 2009-08-13 at 18:46 -0400, Tom Lane wrote:
> Yeah, making the limit "slushy" would doubtless save some writes, with
> not a lot of downside.

OK, then should we make this a TODO? I'll make an attempt at this.

> > And people who don't care about forensic evidence can set it to 0-100M.
>
> Everybody *thinks* they don't care about forensic evidence. Until they
> need it.

We already allow setting vacuum_freeze_min_age to zero, so I don't see a
solution here other than documentation.

Regards,
Jeff Davis


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-13 23:05:57
Message-ID: 29393.1250204757@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> On Thu, 2009-08-13 at 18:46 -0400, Tom Lane wrote:
>> Everybody *thinks* they don't care about forensic evidence. Until they
>> need it.

> We already allow setting vacuum_freeze_min_age to zero, so I don't see a
> solution here other than documentation.

Yeah, we allow it. I just don't want to encourage it ... and definitely
not make it default.

What are you envisioning exactly? If vacuum finds any reason to dirty
a page (or it's already dirty), then freeze everything on the page that's
got age > some lower threshold?

regards, tom lane


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-13 23:07:39
Message-ID: 4A849CBB.2060201@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Jeff, Tom,

>> Let's say that we had a range like 50-100M, where if it's older than
>> 100M, we freeze it, and if it's older than 50M we freeze it only if it's
>> on a dirty page. We would still have forensic evidence, but we could
>> make a range such that we avoid writing multiple times.
>
> Yeah, making the limit "slushy" would doubtless save some writes, with
> not a lot of downside.

This would mean two settings: vacuum_freeze_min_age and
vacuum_freeze_dirty_age. And we'd need to add those to the the
autovacuum settings for each table as well. While we could just make
one setting 1/2 of the other, that prevents me from saying:

"freeze this table agressively if it's in memory, but wait a long time
to vaccuum if it's on disk"

I can completely imagine a table which has a vacuum_freeze_dirty_age of
10000 and a vacuum_freeze_min_age of 1m.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-13 23:16:37
Message-ID: 4A849ED5.8040305@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance


> What are you envisioning exactly? If vacuum finds any reason to dirty
> a page (or it's already dirty), then freeze everything on the page that's
> got age > some lower threshold?

I was envisioning, if the page is already dirty and in memory *for any
reason*, the freeze rows at below some threshold.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-13 23:20:23
Message-ID: 1250205623.24981.128.camel@monkey-cat.sm.truviso.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Thu, 2009-08-13 at 19:05 -0400, Tom Lane wrote:
> What are you envisioning exactly? If vacuum finds any reason to dirty
> a page (or it's already dirty), then freeze everything on the page that's
> got age > some lower threshold?

Yes. There are two ways to do the threshold:
1. Constant fraction of vacuum_freeze_min_age
2. Extra GUC

I lean toward #1, because it avoids an extra GUC*, and it avoids the
awkwardness when the "lower" setting is higher than the "higher"
setting.

However, #2 might be nice for people who want to live on the edge or
experiment with new values. But I suspect most of the advantage would be
had just by saying that we opportunistically freeze tuples older than
50% of vacuum_freeze_min_age.

Regards,
Jeff Davis

*: As an aside, these GUCs already have incredibly confusing names, and
an extra variable would increase the confusion. For instance, they seem
to use "min" and "max" interchangeably.


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-13 23:21:09
Message-ID: 1250205669.24981.129.camel@monkey-cat.sm.truviso.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Thu, 2009-08-13 at 18:25 -0400, Robert Haas wrote:
> On Thu, Aug 13, 2009 at 5:33 PM, Jeff Davis<pgsql(at)j-davis(dot)com> wrote:
> > Or, perhaps when the bgwriter is flushing dirty buffers, it can look for
> > opportunities to set hint bits or freeze tuples.
>
> One of the tricky things here is that the time you are mostly likely
> to want to do this is when you are loading a lot of data. But in that
> case shared buffers are likely to be written back to disk before
> transaction commit, so it'll be too early to do anything.

I think it would be useful in other cases, like avoiding repeated
freezing of different tuples on the same page.

Regards,
Jeff Davis


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-13 23:21:47
Message-ID: 407d949e0908131621x273d8930lc086e4084570c19d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Fri, Aug 14, 2009 at 12:07 AM, Josh Berkus<josh(at)agliodbs(dot)com> wrote:
> "freeze this table agressively if it's in memory, but wait a long time
> to vaccuum if it's on disk"

Waitasec, "in memory"?

There are two projects here:

1) Make vacuum when it's freezing tuples freeze every tuple > lesser
age if it finds any tuples which are > max_age (or I suppose if the
page is already dirty due to vacuum or something else). Vacuum still
has to read in all the pages before it finds out that they don't need
freezing so it doesn't mean distinguishing "in memory" from "needs to
be read in".

2) Have something like bgwriter check if the page is dirty and vacuum
and freeze things based on the lesser threshold. This would
effectively only be vacuuming things that are "in memory"

However the latter is a more complex and frought project. We looked at
this a while back in EDB and we found that the benefits were less than
we expected and the complexities more than we expected. I would
recommend sticking with (1) for now and only looking at (2) if we have
a more detailed plan and solid testable use cases.

--
greg
http://mit.edu/~gsstark/resume.pdf


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-13 23:21:55
Message-ID: 29685.1250205715@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Josh Berkus <josh(at)agliodbs(dot)com> writes:
>> What are you envisioning exactly? If vacuum finds any reason to dirty
>> a page (or it's already dirty), then freeze everything on the page that's
>> got age > some lower threshold?

> I was envisioning, if the page is already dirty and in memory *for any
> reason*, the freeze rows at below some threshold.

I believe we've had this discussion before. I do *NOT* want freezing
operations pushed into any random page access, and in particular will
do my best to veto any attempt to put them into the bgwriter. Freezing
requires accessing the clog and emitting a WAL record, and neither is
appropriate for low-level code like bgwriter. The deadlock potential
alone is sufficient reason why not.

regards, tom lane


From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-13 23:31:15
Message-ID: 407d949e0908131631j3a96f1dbl7c467b6faa790b44@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Fri, Aug 14, 2009 at 12:21 AM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I was envisioning, if the page is already dirty and in memory *for any
>> reason*, the freeze rows at below some threshold.
>
> I believe we've had this discussion before.  I do *NOT* want freezing
> operations pushed into any random page access, and in particular will
> do my best to veto any attempt to put them into the bgwriter.

It's possible Josh accidentally waved this red flag and really meant
just to make it conditional on whether the page is dirty rather than
on whether vacuum dirtied it.

However he did give me a thought....

With the visibility map vacuum currently only covers pages that are
known to have in-doubt tuples. That's why we have the anti-wraparound
vacuums. However it could also check if the pages its skipping are in
memory and process them if they are even if they don't have in-doubt
tuples.

Or it could first go through ram and process any pages that are in
cache before going to the visibility map and starting from page 0,
which would hopefully avoid having to read them in later when we get
to them and find they've been flushed out.

I'm just brainstorming here. I'm not sure if either of these are
actually worth the complexity and danger of finding new bottlenecks in
special case optimization codepaths.

--
greg
http://mit.edu/~gsstark/resume.pdf


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Why is vacuum_freeze_min_age 100m?
Date: 2009-08-14 03:11:39
Message-ID: 603c8f070908132011q402e708em3653c88dc203241@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Thu, Aug 13, 2009 at 5:15 PM, Josh Berkus<josh(at)agliodbs(dot)com> wrote:
> Robert,
>
>> Ah.  Yeah, I agree with Tom: how would it help to make this smaller?
>> It seems like that could possibly increase I/O, if the old data is
>> changing at all, but even if it doesn't it I don't see that it saves
>> you anything to freeze it sooner.
>
> Before 8.4, it actually does on tables which are purely cumulative
> (WORM).  Within a short time, say, 10,000 transactions, the rows to be
> frozen are still in the cache.  By 100m transactions, they are in an
> archive partition which will need to be dragged from disk.  So if I know
> they won't be altered, then freezing them sooner would be better.
>
> However, I can easily manage this through the autovacuum settings.  I
> just wanted confirmation of what I was thinking.

Interesting. Thanks for the explanation.

...Robert


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-14 18:37:24
Message-ID: 12037.1250275044@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> Yes. There are two ways to do the threshold:
> 1. Constant fraction of vacuum_freeze_min_age
> 2. Extra GUC

> I lean toward #1, because it avoids an extra GUC*, and it avoids the
> awkwardness when the "lower" setting is higher than the "higher"
> setting.

I tend to agree with Josh that you do need to offer two knobs. But
expressing the second knob as a fraction (with range 0 to 1) might be
better than an independent "min" parameter. As you say, that'd be
useful to prevent people from setting them inconsistently.

> *: As an aside, these GUCs already have incredibly confusing names, and
> an extra variable would increase the confusion. For instance, they seem
> to use "min" and "max" interchangeably.

Some of them are in fact max's, I believe. They are complicated :-(.
It might be worth somebody taking two steps back and seeing if we need
quite so many knobs. I think we got here partly by not wanting to
predetermine vacuuming strategies, but it doesn't help to offer
flexibility if people can't figure out how to use it.

regards, tom lane


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-14 20:57:07
Message-ID: 1250283427.24981.160.camel@monkey-cat.sm.truviso.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Fri, 2009-08-14 at 14:37 -0400, Tom Lane wrote:
> I tend to agree with Josh that you do need to offer two knobs. But
> expressing the second knob as a fraction (with range 0 to 1) might be
> better than an independent "min" parameter. As you say, that'd be
> useful to prevent people from setting them inconsistently.

Ok. Any ideas for a name?

Josh suggests "vacuum_freeze_dirty_age" (or perhaps he was using at as a
placeholder). I don't particularly like that name, but I can't think of
anything better without renaming vacuum_freeze_min_age.

> > *: As an aside, these GUCs already have incredibly confusing names, and
> > an extra variable would increase the confusion. For instance, they seem
> > to use "min" and "max" interchangeably.
>
> Some of them are in fact max's, I believe.

Looking at the definitions of vacuum_freeze_min_age and
autovacuum_freeze_max_age there seems to be almost no distinction
between "min" and "max" in those two names. I've complained about this
before:

http://archives.postgresql.org/pgsql-hackers/2008-12/msg01731.php

I think both are essentially thresholds, so giving them two names with
opposite meaning is misleading.

Regards,
Jeff Davis


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-15 23:02:03
Message-ID: 1250377323.9960.2.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On fre, 2009-08-14 at 13:57 -0700, Jeff Davis wrote:
> Looking at the definitions of vacuum_freeze_min_age and
> autovacuum_freeze_max_age there seems to be almost no distinction
> between "min" and "max" in those two names.

For min, the action happens at or above the min values. For max, the
action happens at or below the max value.

With those two particular parameters, the freezing happens exactly
between the min and the max value.


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-15 23:55:41
Message-ID: 1250380541.23986.60.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On Sun, 2009-08-16 at 02:02 +0300, Peter Eisentraut wrote:
> For min, the action happens at or above the min values. For max, the
> action happens at or below the max value.

>From the docs, 23.1.4:

"autovacuum is invoked on any table that might contain XIDs older than
the age specified by the configuration parameter
autovacuum_freeze_max_age"

I interpret that to mean that the forced autovacuum run happens above
the value. You could reasonably call it the "minimum age of relfrozenxid
that will cause autovacuum to forcibly run a vacuum".

Similarly, you could call vacuum_freeze_min_age "the maximum age a tuple
can be before a vacuum will freeze it".

I'm not trying to be argumentative, I'm just trying to show that it can
be confusing if you interpret it the wrong way. The first time I saw
those configuration names, I was confused, and ever since, I have to
think about it: "is that variable called min or max?".

My general feeling is that both of these are thresholds. The only real
maximum happens near wraparound.

> With those two particular parameters, the freezing happens exactly
> between the min and the max value.

Thanks, that's a helpful way to remember it.

It may be a little obsolete because now the freezing will normally
happen between vacuum_freeze_min_age and vacuum_freeze_table_age; but at
least I should be able to remember which of the other parameters is
"min" and which one is "max".

Regards,
Jeff Davis


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )
Date: 2009-08-16 12:14:16
Message-ID: 1250424856.26280.0.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

On lör, 2009-08-15 at 16:55 -0700, Jeff Davis wrote:
> Similarly, you could call vacuum_freeze_min_age "the maximum age a
> tuple
> can be before a vacuum will freeze it".

Heh, you could also call max_connections the "minimum number of
connections before the server will refuse new connection attempts".

It's not easy ... ;-)


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Jeff Davis" <pgsql(at)j-davis(dot)com>,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: freezing tuples ( was: Why is vacuum_freeze_min_age100m? )
Date: 2009-08-17 14:38:59
Message-ID: 4A8925330200002500029B24@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-performance

Jeff Davis <pgsql(at)j-davis(dot)com> wrote:

> There are two ways to do the threshold:
> 1. Constant fraction of vacuum_freeze_min_age
> 2. Extra GUC

I appreciate that there may be room to improve this while protecting
the forensic values; but there are already strategies for managing the
day-to-day performance issues as long as you have adequate backup to
not need to rely on old XID information for recovery. What we don't
have covered is loading a database from pg_dump without having to
rewrite all pages at least once afterward -- and likely two more
times, with most maintenance strategies.

I seem to remember that someone took a shot at making a special case
of WAL-bypassed inserts, but there was a problem or two that were hard
to overcome. Does anyone recall the details? Was that about
pre-setting the hint bits for a successful commit (based on the fact
that the entire table will be empty if rolled back and no data will be
visible to any other transaction until commit), or was it about
setting the frozen XID in the inserted tuples (based on the fact that
this is no less useful for forensic purposes than having all rows set
to any other value)?

Should we have a TODO item for this special case, or is it "not
wanted" or viewed as having intractable problems?

-Kevin