Re: [PATCH] Unremovable tuple monitoring

From: Yeb Havinga <yebhavinga(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Royce Ausburn <royce(dot)ml(at)inomial(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Kevin Grittner <kevin(dot)grittner(at)wicourts(dot)gov>
Subject: Re: [PATCH] Unremovable tuple monitoring
Date: 2011-11-16 08:31:43
Message-ID: 4EC374EF.6050000@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2011-11-15 22:04, Tom Lane wrote:
> Robert Haas<robertmhaas(at)gmail(dot)com> writes:
>> Oh. I was thinking "dead" meant "no longer visible to anyone". But
>> it sounds what we call "unremovable" here is what we elsewhere call
>> "recently dead".
> Would have to look at the code to be sure, but I think that
> "nonremovable" is meant to count both live tuples and
> dead-but-still-visible-to-somebody tuples.
>
> The question that I think needs to be asked is why it would be useful
> to track this using the pgstats mechanisms. By definition, the
> difference between this and the live-tuple count is going to be
> extremely unstable --- I don't say small, necessarily, but short-lived.
> So it's debatable whether it's worth memorializing the count obtained
> by the last VACUUM at all. And doing it through pgstats is an expensive
> thing. We've already had push-back about the size of the stats table
> on large (lots-o-tables) databases. Adding another counter will impose
> a performance overhead on everybody, whether they care about this number
> or not.
>
> What's more, to the extent that I can think of use-cases for knowing
> this number, I think I would want a historical trace of it --- that is,
> not only the last VACUUM's result but those of previous VACUUM cycles.
> So pgstats seems like it's both expensive and useless for the purpose.
>

Before reviewing this patch I didn't even know these kind of dead rows
could exist. Now I know it, I expect that if I wanted to know the
current number, I would start looking at table statistics: pg_stat* or
perhaps contrib/pgstattuple.

Looking at how that looks with transaction a the old version:

t=# begin TRANSACTION ISOLATION LEVEL repeatable read;
BEGIN
t=# select * from t;
i | b
----+---
10 | 2
(1 row)

in transaction b the new version:
t=# select * from t;
i | b
----+---
10 | 4
(1 row)

after a vacuum of t:

stat_user_table counts:
n_tup_ins | 1
n_tup_upd | 6
n_tup_del | 0
n_tup_hot_upd | 6
n_live_tup | 2
n_dead_tup | 0
n_unremovable_tup | 1

t=# select * from pgstattuple('t');
-[ RECORD 1 ]------+------
table_len | 8192
tuple_count | 1
tuple_len | 32
tuple_percent | 0.39
dead_tuple_count | 1
dead_tuple_len | 32
dead_tuple_percent | 0.39
free_space | 8080
free_percent | 98.63

Apparently pg_stat* counts the recently_dead tuple under n_live_tup,
else 2 is a wrong number, where pgstattuple counts recently_dead under
dead_tuple_count. This could be a source of confusion. If there is any
serious work considered here, IMHO at least the numbers of the two
different sources of tuple counters should match in terminology and
actual values. Maybe also if pgstattuple were to include the distinction
unremovable dead tuples vs dead tuples, a log line by vacuum
encountering unremovable dead tuples could refer to pgstattuple for
statistics.

regards,
Yeb Havinga

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-11-16 13:00:47 Re: Core Extensions relocation
Previous Message Simon Riggs 2011-11-16 08:18:42 Re: Group Commit