Quick Links

Re: [GENERAL] ANALYZE getting dead tuple count hopelessly wrong

Lists:	pgsql-generalpgsql-hackers

From:	Stuart Brooks <stuartb(at)cat(dot)co(dot)za>
To:	pgsql-general(at)postgresql(dot)org
Subject:	ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-03-31 08:03:23
Message-ID:	47F09ACB.5020506@cat.co.za
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

I have a table with about 15 million rows which is constantly having
tuples added to the head and deleted in blocks from the tail to maintain
the size. The dead tuple count in pg_stat_user_tables tracks the deleted
rows fairly accurately until an auto-ANALYZE is done in the background
at which point the value it calculates is wrong by a factor of 2-3 times
(calculated value is 30-50% of the correct value), which completely
throws the auto-VACUUMing. An example is that the auto-VACUUM only ran
when there were 12 million (real) dead rows! Any ideas?

Thanks
Stuart

PS. Running 8.3.1 on NetBSD 3.

From:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To:	"Stuart Brooks" <stuartb(at)cat(dot)co(dot)za>
Cc:	pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-03-31 12:34:26
Message-ID:	2e78013d0803310534j4462b1bepd4bd37f287e16bee@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

On Mon, Mar 31, 2008 at 1:33 PM, Stuart Brooks <stuartb(at)cat(dot)co(dot)za> wrote:
> I have a table with about 15 million rows which is constantly having
> tuples added to the head and deleted in blocks from the tail to maintain
> the size. The dead tuple count in pg_stat_user_tables tracks the deleted
> rows fairly accurately until an auto-ANALYZE is done in the background
> at which point the value it calculates is wrong by a factor of 2-3 times
> (calculated value is 30-50% of the correct value)

(copying -hackers)

Seems like the redirected-dead line pointers are playing spoil-sport here.
In this particular example, the deleted tuples may get truncated to
redirected-dead line pointers. Analyze would report them as empty
slots and not as dead tuples. So in the worst case, if all the deleted
tuples are already truncated to redirected-dead line pointers, analyze
may report "zero" dead tuple count.

This is a slightly tricky situation because in normal case we might want
to delay autovacuum to let subsequent UPDATEs in the page to reuse
the space released by the deleted tuples. But in this particular example,
delaying autovacuum is not a good thing because the relation would
just keep growing.

I think we should check for redirected-dead line pointers in analyze.c
and report them as dead tuples. The other longer term alternative
could be to track redirected-dead line pointers and give them some
weightage while deciding on autovacuum. We can also update the
FSM information of a page when its pruned/defragged so that the page
can also be used for subsequent INSERTs or non-HOT UPDATEs in
other pages. This might be easier said than done.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc:	"Stuart Brooks" <stuartb(at)cat(dot)co(dot)za>, pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-03-31 15:32:43
Message-ID:	28334.1206977563@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> Seems like the redirected-dead line pointers are playing spoil-sport here.
> In this particular example, the deleted tuples may get truncated to
> redirected-dead line pointers. Analyze would report them as empty
> slots and not as dead tuples. So in the worst case, if all the deleted
> tuples are already truncated to redirected-dead line pointers, analyze
> may report "zero" dead tuple count.

[ Please see if you can stop using the "redirected dead" terminology ]

Yeah, I think I agree. The page pruning code is set up so that changing
a line pointer to DEAD state doesn't change the count of dead tuples in
the table, so we are counting unreclaimed DEAD pointers as still being
dead tuples requiring VACUUM. ANALYZE should surely not affect that.

It looks like there's no trivial way to get ANALYZE to do things that
way, though. heap_release_fetch() doesn't distinguish a DEAD line
pointer from an unused or redirected one. But in the current
implementation of ANALYZE there's really no benefit to using
heap_release_fetch anyway --- it always examines all line pointers
on each selected page, so we might as well rewrite it to use a simple
loop more like vacuum uses.

I notice that this'd leave heap_release_fetch completely unused...
at least in HEAD I'd be tempted to get rid of it and restore heap_fetch
to its former simplicity.

regards, tom lane

From:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Stuart Brooks" <stuartb(at)cat(dot)co(dot)za>, pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-03-31 18:34:25
Message-ID:	2e78013d0803311134v44604be5l20e97ee7d3461ec4@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

On Mon, Mar 31, 2008 at 9:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>
> [ Please see if you can stop using the "redirected dead" terminology ]
>
>

Apologies, will keep that in mind. Seems like a hang-over from the past :-)

> Yeah, I think I agree. The page pruning code is set up so that changing
> a line pointer to DEAD state doesn't change the count of dead tuples in
> the table, so we are counting unreclaimed DEAD pointers as still being
> dead tuples requiring VACUUM. ANALYZE should surely not affect that.
>
> It looks like there's no trivial way to get ANALYZE to do things that
> way, though. heap_release_fetch() doesn't distinguish a DEAD line
> pointer from an unused or redirected one. But in the current
> implementation of ANALYZE there's really no benefit to using
> heap_release_fetch anyway --- it always examines all line pointers
> on each selected page, so we might as well rewrite it to use a simple
> loop more like vacuum uses.
>

I agree. I would write a patch on these lines, unless you are already on to it.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc:	"Stuart Brooks" <stuartb(at)cat(dot)co(dot)za>, pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-03-31 19:52:46
Message-ID:	19331.1206993166@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> On Mon, Mar 31, 2008 at 9:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> It looks like there's no trivial way to get ANALYZE to do things that
>> way, though. heap_release_fetch() doesn't distinguish a DEAD line
>> pointer from an unused or redirected one. But in the current
>> implementation of ANALYZE there's really no benefit to using
>> heap_release_fetch anyway --- it always examines all line pointers
>> on each selected page, so we might as well rewrite it to use a simple
>> loop more like vacuum uses.

> I agree. I would write a patch on these lines, unless you are already on to it.

Please do --- I have a lot of other stuff on my plate.

regards, tom lane

From:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Stuart Brooks" <stuartb(at)cat(dot)co(dot)za>, pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-01 07:37:20
Message-ID:	2e78013d0804010037x2a7e16d0wbbc94e9e50ecf03e@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

On Tue, Apr 1, 2008 at 1:22 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>
> Please do --- I have a lot of other stuff on my plate.
>

Please see the attached patch. One change I made is to hold the SHARE lock
on the page while ANALYZE is reading tuples from it. I thought it would
be a right thing to do instead of repeatedly acquiring/releasing the lock.

Another thing I noticed while working on this is VACUUM probably reports the
number of dead tuples incorrectly. We don't count the DEAD line pointers as
"tups_vacuumed" which is fine if the line pointer was marked DEAD in the
immediately preceding heap_page_prune(). In that case the DEAD line pointer
is counted in "ndeleted" count returned by heap_page_prune(). But it fails to
count already DEAD line pointers.

For example

postgres=# CREATE TABLE test (a int, b char(500));
CREATE TABLE
postgres=# INSERT INTO test VALUES (generate_series(1,15),'foo');
INSERT 0 15
postgres=# DELETE FROM test;
DELETE 15
postgres=# select count(*) from test;
count
-------
0
(1 row)

postgres=# VACUUM VERBOSE test;
INFO: vacuuming "public.test"
INFO: "test": removed 0 row versions in 1 pages
INFO: "test": found 0 removable, 0 nonremovable row versions in 1 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
1 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "test": truncated 1 to 0 pages
DETAIL: CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

So VACUUM reports "zero" dead row versions which may seem
counter-intuitive especially in the autovac log message (as
someone may wonder why autovac got triggered on the table)

I am thinking we can make heap_page_prune() to only return
number of HOT tuples pruned and then explicitly count the DEAD
line pointers in tups_vacuumed.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

Attachment	Content-Type	Size
Analyze-fix.patch.gz	application/x-gzip	2.5 KB

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stuart Brooks <stuartb(at)cat(dot)co(dot)za>, pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-01 08:04:21
Message-ID:	1207037061.4238.8.camel@ebony.site
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

On Tue, 2008-04-01 at 13:07 +0530, Pavan Deolasee wrote:

> Please see the attached patch. One change I made is to hold the SHARE lock
> on the page while ANALYZE is reading tuples from it. I thought it would
> be a right thing to do instead of repeatedly acquiring/releasing the lock.

ANALYZE is a secondary task and so we shouldn't be speeding it up at the
possible expense of other primary tasks. So I think holding locks for
longer than minimum is not good in this case and I see no reason to make
the change described.

We can speed up ANALYZE by using the background reader to preread the
blocks, assuming bgreader will happen one day.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com

PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk

From:	Stuart Brooks <stuartb(at)cat(dot)co(dot)za>
To:	pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-01 13:00:31
Message-ID:	47F231EF.6040006@cat.co.za
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

>> Please do --- I have a lot of other stuff on my plate.
>>
>>
>
> Please see the attached patch. One change I made is to hold the SHARE lock
> on the page while ANALYZE is reading tuples from it. I thought it would
> be a right thing to do instead of repeatedly acquiring/releasing the lock.
>
I have applied the patch and have started my test again, it takes a
little while to fill up so I should have the results sometime tomorrow.

Thanks for the quick response.
Stuart

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Stuart Brooks <stuartb(at)cat(dot)co(dot)za>, pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-01 14:22:03
Message-ID:	1678.1207059723@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On Tue, 2008-04-01 at 13:07 +0530, Pavan Deolasee wrote:
>> Please see the attached patch. One change I made is to hold the SHARE lock
>> on the page while ANALYZE is reading tuples from it. I thought it would
>> be a right thing to do instead of repeatedly acquiring/releasing the lock.

> ANALYZE is a secondary task and so we shouldn't be speeding it up at the
> possible expense of other primary tasks. So I think holding locks for
> longer than minimum is not good in this case and I see no reason to make
> the change described.

I think Pavan's change is probably good. In the first place, it's only
a shared buffer lock and besides ANALYZE isn't going to be holding it
long (all it's doing at this point is counting tuples and copying some
of them into memory). In the second place, repeated lock release and
re-grab threatens cache line contention and a context swap storm if
there is anyone else trying to access the page. In the third, whether
there's contention or not the extra acquire/release work will cost CPU
cycles. In the fourth, if we actually believed this was a problem we'd
need to redesign VACUUM too, as it does the same thing.

I haven't read the patch yet, but I'm inclined to go with the design
Pavan suggests.

regards, tom lane

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Stuart Brooks <stuartb(at)cat(dot)co(dot)za>, pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-01 14:46:15
Message-ID:	1207061175.4238.32.camel@ebony.site
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

On Tue, 2008-04-01 at 10:22 -0400, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > On Tue, 2008-04-01 at 13:07 +0530, Pavan Deolasee wrote:
> >> Please see the attached patch. One change I made is to hold the SHARE lock
> >> on the page while ANALYZE is reading tuples from it. I thought it would
> >> be a right thing to do instead of repeatedly acquiring/releasing the lock.
>
> > ANALYZE is a secondary task and so we shouldn't be speeding it up at the
> > possible expense of other primary tasks. So I think holding locks for
> > longer than minimum is not good in this case and I see no reason to make
> > the change described.
>
> I think Pavan's change is probably good. In the first place, it's only
> a shared buffer lock and besides ANALYZE isn't going to be holding it
> long (all it's doing at this point is counting tuples and copying some
> of them into memory). In the second place, repeated lock release and
> re-grab threatens cache line contention and a context swap storm if
> there is anyone else trying to access the page. In the third, whether
> there's contention or not the extra acquire/release work will cost CPU
> cycles. In the fourth, if we actually believed this was a problem we'd
> need to redesign VACUUM too, as it does the same thing.

VACUUM waits until nobody else has the buffer pinned, so lock contention
is much less of a consideration there. Plus it rearranges the block,
which is hard to do one tuple at a time even if we wanted to.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com

PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Stuart Brooks <stuartb(at)cat(dot)co(dot)za>, pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-01 15:00:15
Message-ID:	2320.1207062015@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On Tue, 2008-04-01 at 10:22 -0400, Tom Lane wrote:
>> In the fourth, if we actually believed this was a problem we'd
>> need to redesign VACUUM too, as it does the same thing.

> VACUUM waits until nobody else has the buffer pinned, so lock contention
> is much less of a consideration there. Plus it rearranges the block,
> which is hard to do one tuple at a time even if we wanted to.

That's the second scan. The first scan acts exactly like Pavan is
proposing for ANALYZE.

regards, tom lane

From:	Gregory Stark <stark(at)enterprisedb(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, "Stuart Brooks" <stuartb(at)cat(dot)co(dot)za>, <pgsql-general(at)postgresql(dot)org>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-01 19:18:18
Message-ID:	87lk3x5qpx.fsf@oxford.xeocode.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
>> On Tue, 2008-04-01 at 13:07 +0530, Pavan Deolasee wrote:
>>> Please see the attached patch. One change I made is to hold the SHARE lock
>>> on the page while ANALYZE is reading tuples from it. I thought it would
>>> be a right thing to do instead of repeatedly acquiring/releasing the lock.
>
>> ANALYZE is a secondary task and so we shouldn't be speeding it up at the
>> possible expense of other primary tasks. So I think holding locks for
>> longer than minimum is not good in this case and I see no reason to make
>> the change described.
>
> I think Pavan's change is probably good. In the first place, it's only
> a shared buffer lock and besides ANALYZE isn't going to be holding it
> long (all it's doing at this point is counting tuples and copying some
> of them into memory). In the second place, repeated lock release and
> re-grab threatens cache line contention and a context swap storm if
> there is anyone else trying to access the page. In the third, whether
> there's contention or not the extra acquire/release work will cost CPU
> cycles. In the fourth, if we actually believed this was a problem we'd
> need to redesign VACUUM too, as it does the same thing.

I'm not sure all those arguments are valid (at least the first two seem
contradictory). However I'm skeptical about Simon's premise. It's not clear
any changes to ANALYZE here are at the expense of other proceses. Any cycles
saved in ANALYZE are available for those other processes after all...

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [GENERAL] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-02 20:29:31
Message-ID:	13905.1207168171@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> Please see the attached patch. One change I made is to hold the SHARE lock
> on the page while ANALYZE is reading tuples from it. I thought it would
> be a right thing to do instead of repeatedly acquiring/releasing the lock.

Bruce pointed out to me off-list that this patch is closely related to
the patch I proposed awhile back for ANALYZE *overcounting* dead tuples.
That one is on the current commit-fest list because we held it over
after this discussion:
http://archives.postgresql.org/pgsql-hackers/2007-11/msg00771.php

On reflection it seems to me that we allowed ourselves to get distracted
by schemes for reducing the error attributable to the uncertain state
of in-doubt tuples. That's still something interesting to think about,
but we forgot the fact that there's a serious problem in 8.3 and the
patches we have would clearly make it better. What I propose therefore
is combining this patch with my older one so that ANALYZE counts
according to the following rules:

REDIRECT line pointer: ignore
DEAD line pointer: count as dead
HEAPTUPLE_LIVE tuple: count as live, include in statistics pool
HEAPTUPLE_DEAD: count as dead
HEAPTUPLE_RECENTLY_DEAD: count as dead
HEAPTUPLE_INSERT_IN_PROGRESS: ignore
HEAPTUPLE_DELETE_IN_PROGRESS: count as live

We might want to adjust these rules later after more thought, but in any
case ANALYZE has to be fixed to be able to distinguish these cases in
the first place. This is better than what we have and is reasonable to
back-patch. Trying to reduce the race conditions for in-doubt tuples
should go on the TODO list.

regards, tom lane

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [GENERAL] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-02 23:04:09
Message-ID:	17953.1207177449@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

I wrote:
> ... What I propose therefore
> is combining this patch with my older one so that ANALYZE counts
> according to the following rules:

> REDIRECT line pointer: ignore
> DEAD line pointer: count as dead
> HEAPTUPLE_LIVE tuple: count as live, include in statistics pool
> HEAPTUPLE_DEAD: count as dead
> HEAPTUPLE_RECENTLY_DEAD: count as dead
> HEAPTUPLE_INSERT_IN_PROGRESS: ignore
> HEAPTUPLE_DELETE_IN_PROGRESS: count as live

While working on this I realized that there's a special case ANALYZE has
to face that is not faced by VACUUM: it might see tuples inserted or
deleted by its own transaction. For example consider

begin;
... load lots of data into mytable ...
analyze mytable;
... issue complex queries against mytable ...
commit;

This is not an uncommon scenario, particularly with respect to temporary
tables. ANALYZE's historical behavior of sampling everything that's
good according to SnapshotNow does the right thing here, but ignoring
INSERT_IN_PROGRESS tuples would not.

The right way seems to be to treat our own insertions as live during
ANALYZE, but then subtract off our own pending insertions from the
live-tuples count sent to the stats collector. pgstat_report_analyze()
can handle the latter part by groveling through the backend's pending
statistics data.

Comments?

regards, tom lane

From:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [GENERAL] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-03 16:12:48
Message-ID:	2e78013d0804030912j3f6ed6bcxf5a1048630d79981@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

On Thu, Apr 3, 2008 at 4:34 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>
> The right way seems to be to treat our own insertions as live during
> ANALYZE, but then subtract off our own pending insertions from the
> live-tuples count sent to the stats collector. pgstat_report_analyze()
> can handle the latter part by groveling through the backend's pending
> statistics data.
>

Seems like a right approach to me. I assume we shall do the same for
DELETE_IN_PROGRESS tuples.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc:	"Stuart Brooks" <stuartb(at)cat(dot)co(dot)za>, pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-03 16:32:28
Message-ID:	2718.1207240348@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

I've applied a modified/extended form of this patch for 8.3.2.

regards, tom lane

From:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Stuart Brooks" <stuartb(at)cat(dot)co(dot)za>, pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-03 16:51:24
Message-ID:	2e78013d0804030951m54c866b9y29ffcf06469318b3@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

On Thu, Apr 3, 2008 at 10:02 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>
> I've applied a modified/extended form of this patch for 8.3.2.
>

Thanks. I had another concern about VACUUM not reporting DEAD line
pointers (please see up thread). Any comments on that ?

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc:	"Stuart Brooks" <stuartb(at)cat(dot)co(dot)za>, pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-03 17:09:58
Message-ID:	4236.1207242598@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> Thanks. I had another concern about VACUUM not reporting DEAD line
> pointers (please see up thread). Any comments on that ?

If you want to work on that, go ahead, but I wanted it separate because
I didn't think it merited back-patching. It's strictly cosmetic in
terms of being about what VACUUM VERBOSE prints, no?

regards, tom lane

From:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Stuart Brooks" <stuartb(at)cat(dot)co(dot)za>, pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-04 04:31:31
Message-ID:	2e78013d0804032131y5cb9de8bt43babeef0438d1ed@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

On Thu, Apr 3, 2008 at 10:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
>
> > Thanks. I had another concern about VACUUM not reporting DEAD line
> > pointers (please see up thread). Any comments on that ?
>
> If you want to work on that, go ahead

Ok. I would do that.

> but I wanted it separate because
> I didn't think it merited back-patching. It's strictly cosmetic in
> terms of being about what VACUUM VERBOSE prints, no?
>

Umm.. Whatever we decide on the fix, I think we should backpatch it to
8.3 because I am worried that someone way get completely confused with
the current vacuum report, especially if the autovac is triggered just
because of
heap full of DEAD line pointers. The num of dead rows reported may
awfully be low in that case.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc:	"Stuart Brooks" <stuartb(at)cat(dot)co(dot)za>, pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-04 05:40:35
Message-ID:	4355.1207287635@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> On Thu, Apr 3, 2008 at 10:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I didn't think it merited back-patching. It's strictly cosmetic in
>> terms of being about what VACUUM VERBOSE prints, no?

> Umm.. Whatever we decide on the fix, I think we should backpatch it to
> 8.3 because I am worried that someone way get completely confused with
> the current vacuum report,

"Somebody might misread an optional report" doesn't seem to me to be on
the same risk level as "we might destabilize a stable release". The
policy of this project is that we only put nontrivial bug fixes into
back branches, and I don't think this item qualifies ...

regards, tom lane

From:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Stuart Brooks" <stuartb(at)cat(dot)co(dot)za>, pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-04 05:48:59
Message-ID:	2e78013d0804032248j6b0d6683sea192d7260eb0ff2@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

On Fri, Apr 4, 2008 at 11:10 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> The
> policy of this project is that we only put nontrivial bug fixes into
> back branches, and I don't think this item qualifies ...
>

Got it. I will submit a patch for HEAD.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

From:	Stuart Brooks <stuartb(at)cat(dot)co(dot)za>
To:	pgsql-general(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc:	pavan(dot)deolasee(at)gmail(dot)com
Subject:	Re: [HACKERS] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-07 08:04:51
Message-ID:	47F9D5A3.1070203@cat.co.za
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-general pgsql-hackers

Pavan Deolasee wrote:
> On Fri, Apr 4, 2008 at 11:10 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>
>> The
>> policy of this project is that we only put nontrivial bug fixes into
>> back branches, and I don't think this item qualifies ...
>>
>>
>
> Got it. I will submit a patch for HEAD.
>
> Thanks,
As I mentioned earlier, I patched 8.3.1 with Pavan's patch and have been
running tests. After a few days I have got postgres to lock up - not
sure if it is related. Below is a ps from my system (NetBSD 3).

TEST> ps -ax | grep post
1952 ? IW<s 13:52.24 postgres: writer process
2113 ? S<s 0:03.04 postgres: logger process
2157 ? S<s 0:03.12 postgres: autovacuum launcher process
2199 ? I<s 0:00.04 postgres: metauser metadb [local] SELECT
2472 ? DW<s 814:23.50 postgres: metauser metadb localhost(65524)
COMMIT
2661 ? DW<s 0:11.27 postgres: metauser metadb localhost(65525) idle
2680 ? S<s 1:18.75 postgres: stats collector process
3156 ? S<s 0:45.12 postgres: wal writer process
24362 ? IW<s 0:00.00 postgres: autovacuum worker process
25024 ? IW<s 0:00.00 postgres: autovacuum worker process
25134 ? IW<s 0:00.00 postgres: autovacuum worker process
3289 ttyp5 I< 0:01.96 /usr/local/pgsql/bin/postgres -D ../data/metadb

and I was disconnected in my client app with the following message:

[WARN] PSQL:exec - failed in command <SELECT
relname,n_tup_ins,n_live_tup,n_dead_tup,pg_total_relation_size('s8_0000.'
|| relname)*10/(1024*1024),last_autovacuum FROM pg_stat_user_tables
WHERE schemaname='s8_0000' ORDER BY n_tup_ins DESC>
[WARN] error = 'server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.'
[WARN] ConnectionNB: PQconsumeInput failed with error 'server closed the
connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.'

The server is still running but I can't access it. A top yields:

load averages: 0.23, 0.23, 0.21 09:53:58
110 processes: 109 sleeping, 1 on processor

Memory: 513M Act, 256M Inact, 1336K Wired, 75M Exec, 557M File, 2776K Free
Swap: 600M Total, 600M Free

PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
463 root 2 0 6132K 14M select 0:06 0.05% 0.05% kdeinit
2472 postgres -22 -2 4580K 4K mclpl 814:23 0.00% 0.00% <postgres>
2631 root -22 0 644K 4K mclpl 606:25 0.00% 0.00% <test_writer
1622 root 2 0 8456K 14M select 19:05 0.00% 0.00% kdeinit
1952 postgres 2 -2 3544K 4K netlck 13:52 0.00% 0.00% <postgres>
233 root 2 0 24M 31M select 4:47 0.00% 0.00% XFree86
451 root 2 0 3544K 15M select 4:45 0.00% 0.00% kdeinit
16 root 18 0 0K 182M syncer 3:51 0.00% 0.00% [ioflush]
17 root -18 0 0K 182M aiodoned 1:46 0.00% 0.00% [aiodoned]
15 root -18 0 0K 182M pgdaemon 1:30 0.00% 0.00% [pagedaemon]
1301 root -22 0 4092K 4K mclpl 1:23 0.00% 0.00% <kdeinit>
2680 postgres 2 -2 3560K 1588K poll 1:18 0.00% 0.00% postgres
1493 root 2 0 3488K 17M select 1:09 0.00% 0.00% korgac
461 root 2 0 3748K 16M select 0:57 0.00% 0.00% kdeinit
3156 postgres 2 -2 3448K 1792K select 0:45 0.00% 0.00% postgres
1174 root 2 0 2608K 2928K select 0:40 0.00% 0.00% profiler
1428 root 2 0 3376K 13M select 0:26 0.00% 0.00% kdeinit
2661 postgres -22 -2 4896K 4K mclpl 0:11 0.00% 0.00% <postgres>

I'm not convinced this is a postgresql bug (state=mclpl concerns me),
but it's the first time I've seen it. I suppose it could be:
http://www.netbsd.org/cgi-bin/query-pr-single.pl?number=35224.

Anything I can do which might help isolating the problem?

Regards
Stuart