Re: 8.3.0 Core with concurrent vacuum fulls

Lists: pgsql-hackers
From: "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-04 14:02:27
Message-ID: af1bce590803040602t5a0bb3e5wf2da023134984497@mail.gmail.com

This morning I had a postgres 8.3 install core while multiple
vacuum fulls were taking place. I saved the core file, would anyone be
interested in dissecting it? I've otherwise had no issues with this machine
or pgsql install.
Gavin


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-04 16:35:44
Message-ID: 20080304163544.GI4755@alvh.no-ip.org

Gavin M. Roy wrote:
> This morning I had a postgres 8.3 install core while multiple
> vacuum fulls were taking place. I saved the core file, would anyone be
> interested in dissecting it? I've otherwise had no issues with this machine
> or pgsql install.

Of course. Please post the backtrace.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


From: "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>
To: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-05 02:01:23
Message-ID: af1bce590803041801s25b61a10mc6b724a528026a61@mail.gmail.com

[gmr(at)db04 backup]$ cat /etc/redhat-release
CentOS release 4.4 (Final)

BINDIR = /usr/local/pgsql/bin
DOCDIR = /usr/local/pgsql/doc
INCLUDEDIR = /usr/local/pgsql/include
PKGINCLUDEDIR = /usr/local/pgsql/include
INCLUDEDIR-SERVER = /usr/local/pgsql/include/server
LIBDIR = /usr/local/pgsql/lib
PKGLIBDIR = /usr/local/pgsql/lib
LOCALEDIR =
MANDIR = /usr/local/pgsql/man
SHAREDIR = /usr/local/pgsql/share
SYSCONFDIR = /usr/local/pgsql/etc
PGXS = /usr/local/pgsql/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--with-ldap' '--with-perl' '--enable-integer-datetimes'
CC = gcc
CPPFLAGS = -D_GNU_SOURCE
CFLAGS = -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Winline
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing
CFLAGS_SL = -fpic
LDFLAGS = -Wl,-rpath,'/usr/local/pgsql/lib'
LDFLAGS_SL =
LIBS = -lpgport -lz -lreadline -ltermcap -lcrypt -ldl -lm
VERSION = PostgreSQL 8.3.0

(gdb) where
#0 0x0000003fe362e21d in raise () from /lib64/tls/libc.so.6
#1 0x0000003fe362fa1e in abort () from /lib64/tls/libc.so.6
#2 0x000000000063a2e3 in errfinish ()
#3 0x00000000005974c4 in DeadLockReport ()
#4 0x000000000059381f in LockAcquire ()
#5 0x0000000000592357 in LockRelationOid ()
#6 0x0000000000457255 in relation_open ()
#7 0x00000000004574c3 in heap_open ()
#8 0x000000000062cf41 in CatalogCacheInitializeCache ()
#9 0x000000000062dfad in PrepareToInvalidateCacheTuple ()
#10 0x000000000062e8c5 in CacheInvalidateHeapTuple ()
#11 0x000000000045c803 in heap_page_prune ()
#12 0x00000000005086cd in vacuum_rel ()
#13 0x00000000005096bb in vacuum ()
#14 0x00000000005a163b in PortalRunUtility ()
#15 0x00000000005a1714 in PortalRunMulti ()
#16 0x00000000005a1d30 in PortalRun ()
#17 0x000000000059f4b6 in PostgresMain ()
#18 0x00000000005760c0 in ServerLoop ()
#19 0x0000000000577770 in PostmasterMain ()
#20 0x000000000052fd3e in main ()

On Tue, Mar 4, 2008 at 11:35 AM, Alvaro Herrera <alvherre(at)commandprompt(dot)com>
wrote:

> Gavin M. Roy wrote:
> > This morning I had a postgres 8.3 install core while multiple
> > vacuum fulls were taking place. I saved the core file, would anyone be
> > interested in dissecting it? I've otherwise had no issues with this
> > machine or pgsql install.
>
> Of course. Please post the backtrace.
>
> --
> Alvaro Herrera
> http://www.CommandPrompt.com/
> The PostgreSQL Company - Command Prompt, Inc.
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-05 02:56:00
Message-ID: 4545.1204685760@sss.pgh.pa.us

"Gavin M. Roy" <gmr(at)myyearbook(dot)com> writes:
> (gdb) where
> #0 0x0000003fe362e21d in raise () from /lib64/tls/libc.so.6
> #1 0x0000003fe362fa1e in abort () from /lib64/tls/libc.so.6
> #2 0x000000000063a2e3 in errfinish ()
> #3 0x00000000005974c4 in DeadLockReport ()
> #4 0x000000000059381f in LockAcquire ()
> #5 0x0000000000592357 in LockRelationOid ()
> #6 0x0000000000457255 in relation_open ()
> #7 0x00000000004574c3 in heap_open ()
> #8 0x000000000062cf41 in CatalogCacheInitializeCache ()
> #9 0x000000000062dfad in PrepareToInvalidateCacheTuple ()
> #10 0x000000000062e8c5 in CacheInvalidateHeapTuple ()
> #11 0x000000000045c803 in heap_page_prune ()
> #12 0x00000000005086cd in vacuum_rel ()
> #13 0x00000000005096bb in vacuum ()
> #14 0x00000000005a163b in PortalRunUtility ()
> #15 0x00000000005a1714 in PortalRunMulti ()
> #16 0x00000000005a1d30 in PortalRun ()
> #17 0x000000000059f4b6 in PostgresMain ()
> #18 0x00000000005760c0 in ServerLoop ()
> #19 0x0000000000577770 in PostmasterMain ()
> #20 0x000000000052fd3e in main ()

So what did DeadLockReport put in the server log before panic'ing?

I'm wondering exactly why CatalogCacheInitializeCache is being called
here --- seems like that should have been done long before we got to
VACUUM. But it would be useful to know just what deadlock it saw.

regards, tom lane


From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-05 10:11:53
Message-ID: 2e78013d0803050211t73fae3f2j1da5760e55c0173b@mail.gmail.com

On Wed, Mar 5, 2008 at 8:26 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Gavin M. Roy" <gmr(at)myyearbook(dot)com> writes:
> > (gdb) where
> > #0 0x0000003fe362e21d in raise () from /lib64/tls/libc.so.6
> > #1 0x0000003fe362fa1e in abort () from /lib64/tls/libc.so.6
> > #2 0x000000000063a2e3 in errfinish ()
> > #3 0x00000000005974c4 in DeadLockReport ()
> > #4 0x000000000059381f in LockAcquire ()
> > #5 0x0000000000592357 in LockRelationOid ()
> > #6 0x0000000000457255 in relation_open ()
> > #7 0x00000000004574c3 in heap_open ()
> > #8 0x000000000062cf41 in CatalogCacheInitializeCache ()
> > #9 0x000000000062dfad in PrepareToInvalidateCacheTuple ()
> > #10 0x000000000062e8c5 in CacheInvalidateHeapTuple ()
> > #11 0x000000000045c803 in heap_page_prune ()
> > #12 0x00000000005086cd in vacuum_rel ()
> > #13 0x00000000005096bb in vacuum ()
> > #14 0x00000000005a163b in PortalRunUtility ()
> > #15 0x00000000005a1714 in PortalRunMulti ()
> > #16 0x00000000005a1d30 in PortalRun ()
> > #17 0x000000000059f4b6 in PostgresMain ()
> > #18 0x00000000005760c0 in ServerLoop ()
> > #19 0x0000000000577770 in PostmasterMain ()
> > #20 0x000000000052fd3e in main ()
>
> So what did DeadLockReport put in the server log before panic'ing?
>
> I'm wondering exactly why CatalogCacheInitializeCache is being called
> here --- seems like that should have been done long before we got to
> VACUUM. But it would be useful to know just what deadlock it saw.
>

Seems like it's possible that the second phase of catalog cache
initialization is not done when VACUUM FULL is called, especially if
VACUUM FULL is the first command/query in the backend.

InitCatalogCachePhase2() is called only if a new cache file needs to be
written.
    /*
     * Lastly, write out a new relcache cache file if one is needed.
     */
    if (needNewCacheFile)
    {
        /*
         * Force all the catcaches to finish initializing and thereby open the
         * catalogs and indexes they use.  This will preload the relcache with
         * entries for all the most important system catalogs and indexes, so
         * that the init file will be most useful for future backends.
         */
        InitCatalogCachePhase2();

        /* now write the file */
        write_relcache_init_file();
    }

We haven't yet seen the deadlock message, but here is my theory of a
possible deadlock scenario:

Two backends try to VACUUM FULL two different catalog tables. Each
acquires an exclusive lock on its respective catalog relation. Each then
tries to initialize its own catalog cache, but to do that it needs an
AccessShareLock on the other backend's table, leading to a deadlock.
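In effect, the deadlock checker just has to find a cycle in the wait-for
graph. A rough sketch, in Python rather than the backend's C (the process
numbers are illustrative, matching the report later in the thread):

```python
def find_deadlock_cycle(waits_for):
    """Return a cycle in the wait-for graph {waiter: holder}, or None."""
    for start in waits_for:
        seen = []
        node = start
        while node in waits_for:
            if node in seen:
                return seen[seen.index(node):]  # the cycle itself
            seen.append(node)
            node = waits_for[node]
    return None

# Backend 6742 (VACUUM FULL pg_type) holds pg_type's lock and wants
# pg_statistic; backend 6740 (VACUUM FULL pg_statistic) holds pg_statistic
# and wants pg_type.
waits_for = {6742: 6740, 6740: 6742}
print(find_deadlock_cycle(waits_for))  # -> [6742, 6740]
```

Either backend's lock request completes the cycle, so whichever runs the
deadlock check first reports it.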

Why not just unconditionally finish phase 2 as part of InitPostgres?
I understand that we may end up initializing caches that we don't need
in that session, but there might be other places where this deadlock is
waiting to happen.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-05 10:46:07
Message-ID: 2e78013d0803050246j7994259es5fa37dfc2fbf9343@mail.gmail.com

On Wed, Mar 5, 2008 at 3:41 PM, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> wrote:
>
>
>
> Two backends try to vacuum full two different catalog tables. Each acquires an
> exclusive lock on the respective catalog relation. Then each try to
> initialize its
> own catalog cache. But to do that they need AccessShareLock on each other's
> table leading to a deadlock.
>

Well, I could reproduce the above-mentioned deadlock scenario with two
system relations, so that validates the theory.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc: "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-05 14:16:52
Message-ID: 23355.1204726612@sss.pgh.pa.us

"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> Why not just unconditionally finish the phase 2 as part of InitPostgres ?

You're jumping to a patch before we even understand what's happening.
In particular, if that's the problem, why has this not been seen before?
The fact that it's going through heap_page_prune doesn't seem very
relevant --- VACUUM FULL has certainly always had to invoke
CacheInvalidateHeapTuple someplace or other. So I still want to see
the deadlock report ... we at least need to know which tables are
involved in the deadlock.

A separate line of thought is whether it's a good idea that
heap_page_prune calls the inval code inside a critical section.
That's what's turning an ordinary deadlock failure into a PANIC.
Even without the possibility of having to do cache initialization,
that's a lot of code to be invoking, and it has obvious failure
modes (eg, out of memory for the new inval list item).

regards, tom lane


From: "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-05 15:09:44
Message-ID: af1bce590803050709j5d597f3ck81dadc598e590a03@mail.gmail.com

2008-03-04 05:45:47 EST [6698]: [1-1] LOG: process 6698 still waiting for
AccessShareLock on relation 1247 of database 16385 after 1001.519 ms
2008-03-04 05:45:47 EST [6698]: [2-1] STATEMENT: VACUUM FULL
autograph.autograph_creators
2008-03-04 05:46:28 EST [6730]: [1-1] LOG: process 6730 still waiting for
AccessShareLock on relation 1247 of database 16385 after 1000.887 ms
2008-03-04 05:46:28 EST [6730]: [2-1] STATEMENT: VACUUM FULL
lunchmoney.totals
2008-03-04 05:47:55 EST [3809]: [18-1] LOG: server process (PID 6742) was
terminated by signal 6: Aborted
2008-03-04 05:47:55 EST [3809]: [19-1] LOG: terminating any other active
server processes
2008-03-04 05:47:55 EST [6741]: [12-1] WARNING: terminating connection
because of crash of another server process

On Tue, Mar 4, 2008 at 9:56 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> "Gavin M. Roy" <gmr(at)myyearbook(dot)com> writes:
> > (gdb) where
> > #0 0x0000003fe362e21d in raise () from /lib64/tls/libc.so.6
> > #1 0x0000003fe362fa1e in abort () from /lib64/tls/libc.so.6
> > #2 0x000000000063a2e3 in errfinish ()
> > #3 0x00000000005974c4 in DeadLockReport ()
> > #4 0x000000000059381f in LockAcquire ()
> > #5 0x0000000000592357 in LockRelationOid ()
> > #6 0x0000000000457255 in relation_open ()
> > #7 0x00000000004574c3 in heap_open ()
> > #8 0x000000000062cf41 in CatalogCacheInitializeCache ()
> > #9 0x000000000062dfad in PrepareToInvalidateCacheTuple ()
> > #10 0x000000000062e8c5 in CacheInvalidateHeapTuple ()
> > #11 0x000000000045c803 in heap_page_prune ()
> > #12 0x00000000005086cd in vacuum_rel ()
> > #13 0x00000000005096bb in vacuum ()
> > #14 0x00000000005a163b in PortalRunUtility ()
> > #15 0x00000000005a1714 in PortalRunMulti ()
> > #16 0x00000000005a1d30 in PortalRun ()
> > #17 0x000000000059f4b6 in PostgresMain ()
> > #18 0x00000000005760c0 in ServerLoop ()
> > #19 0x0000000000577770 in PostmasterMain ()
> > #20 0x000000000052fd3e in main ()
>
> So what did DeadLockReport put in the server log before panic'ing?
>
> I'm wondering exactly why CatalogCacheInitializeCache is being called
> here --- seems like that should have been done long before we got to
> VACUUM. But it would be useful to know just what deadlock it saw.
>
> regards, tom lane
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-05 15:13:37
Message-ID: 25084.1204730017@sss.pgh.pa.us

I wrote:
> In particular, if that's the problem, why has this not been seen before?
> The fact that it's going through heap_page_prune doesn't seem very
> relevant --- VACUUM FULL has certainly always had to invoke
> CacheInvalidateHeapTuple someplace or other. So I still want to see
> the deadlock report ... we at least need to know which tables are
> involved in the deadlock.

Actually, maybe it *has* been seen before. Gavin, are you in the habit
of running concurrent VACUUM FULLs on system catalogs, and if so have
you noted that they occasionally get deadlock failures?

> A separate line of thought is whether it's a good idea that
> heap_page_prune calls the inval code inside a critical section.
> That's what's turning an ordinary deadlock failure into a PANIC.
> Even without the possibility of having to do cache initialization,
> that's a lot of code to be invoking, and it has obvious failure
> modes (eg, out of memory for the new inval list item).

The more I think about this the more I don't like it. The critical
section in heap_page_prune is *way* too big. Aside from the inval
call, there are HeapTupleSatisfiesVacuum() calls, which could have
failures during attempted clog references.

The reason the critical section is so large is that we're manipulating
the contents of a shared buffer, and we don't want a failure to leave a
partially-modified page in the buffer. We could fix that if we were to
memcpy the page into local storage and do all the pruning work there.
Then the critical section would only surround copying the page back to
the buffer and writing the WAL record. Copying the page is a tad
annoying but heap_page_prune is an expensive operation anyway, and
I think we really are at too much risk of PANIC the way it's being done
now. Has anyone got a better idea?
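A rough sketch of the copy-out idea (illustrative Python, not the actual
C implementation; the page model and function names are invented here):
all fallible work happens on a private copy, and only the copy-back
remains "critical".

```python
import copy

def prune_local_copy(page, is_dead):
    """Do all the fallible pruning work on a private copy of the page."""
    local = copy.deepcopy(page)
    removed = [i for i, tup in enumerate(local) if is_dead(tup)]
    for i in removed:
        local[i] = None          # mark line pointer unused
    return local, removed

def apply_pruning(shared_buffer, local, removed):
    """The (now tiny) critical section: copy back and 'write WAL'."""
    shared_buffer[:] = local     # copy the pruned image into the buffer
    return ("PRUNE", removed)    # stand-in for the WAL record

page = ["live", "dead", "live", "dead"]
local, removed = prune_local_copy(page, lambda t: t == "dead")
wal = apply_pruning(page, local, removed)
print(page)   # -> ['live', None, 'live', None]
print(wal)    # -> ('PRUNE', [1, 3])
```

If prune_local_copy throws (a failed visibility check, out of memory for
an inval item), the shared buffer is untouched and the error stays an
ordinary ERROR rather than a PANIC.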

regards, tom lane


From: "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-05 15:27:32
Message-ID: af1bce590803050727y13991498i445cd8a9c75f8be1@mail.gmail.com

On Wed, Mar 5, 2008 at 10:13 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I wrote:
> > In particular, if that's the problem, why has this not been seen before?
> > The fact that it's going through heap_page_prune doesn't seem very
> > relevant --- VACUUM FULL has certainly always had to invoke
> > CacheInvalidateHeapTuple someplace or other. So I still want to see
> > the deadlock report ... we at least need to know which tables are
> > involved in the deadlock.
>
> Actually, maybe it *has* been seen before. Gavin, are you in the habit
> of running concurrent VACUUM FULLs on system catalogs, and if so have
> you noted that they occasionally get deadlock failures?

Generally no, I've never noticed deadlocks before, but I'll go back and
look at some of the other machines.

> > A separate line of thought is whether it's a good idea that
> > heap_page_prune calls the inval code inside a critical section.
> > That's what's turning an ordinary deadlock failure into a PANIC.
> > Even without the possibility of having to do cache initialization,
> > that's a lot of code to be invoking, and it has obvious failure
> > modes (eg, out of memory for the new inval list item).
>
> The more I think about this the more I don't like it. The critical
> section in heap_page_prune is *way* too big. Aside from the inval
> call, there are HeapTupleSatisfiesVacuum() calls, which could have
> failures during attempted clog references.
>
> The reason the critical section is so large is that we're manipulating
> the contents of a shared buffer, and we don't want a failure to leave a
> partially-modified page in the buffer. We could fix that if we were to
> memcpy the page into local storage and do all the pruning work there.
> Then the critical section would only surround copying the page back to
> the buffer and writing the WAL record. Copying the page is a tad
> annoying but heap_page_prune is an expensive operation anyway, and
> I think we really are at too much risk of PANIC the way it's being done
> now. Has anyone got a better idea?
>
> regards, tom lane
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-05 15:31:05
Message-ID: 25742.1204731065@sss.pgh.pa.us

"Gavin M. Roy" <gmr(at)myyearbook(dot)com> writes:
> 2008-03-04 05:45:47 EST [6698]: [1-1] LOG: process 6698 still waiting for
> AccessShareLock on relation 1247 of database 16385 after 1001.519 ms
> 2008-03-04 05:45:47 EST [6698]: [2-1] STATEMENT: VACUUM FULL
> autograph.autograph_creators
> 2008-03-04 05:46:28 EST [6730]: [1-1] LOG: process 6730 still waiting for
> AccessShareLock on relation 1247 of database 16385 after 1000.887 ms
> 2008-03-04 05:46:28 EST [6730]: [2-1] STATEMENT: VACUUM FULL
> lunchmoney.totals
> 2008-03-04 05:47:55 EST [3809]: [18-1] LOG: server process (PID 6742) was
> terminated by signal 6: Aborted
> 2008-03-04 05:47:55 EST [3809]: [19-1] LOG: terminating any other active
> server processes
> 2008-03-04 05:47:55 EST [6741]: [12-1] WARNING: terminating connection
> because of crash of another server process

How annoying ... the PANIC message doesn't seem to have reached the log.
elog.c is careful to fflush(stderr) before abort(), so that isn't
supposed to happen. But it looks like you are using syslog for logging
(correct?) so maybe this is a problem with the syslog implementation
you're using. What's the platform exactly?

I wonder if it'd be reasonable to put a closelog() call just before
the abort() ...

regards, tom lane


From: "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-05 15:39:37
Message-ID: af1bce590803050739u48aa37d5k9c9c0381886dd305@mail.gmail.com

On Wed, Mar 5, 2008 at 10:31 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> "Gavin M. Roy" <gmr(at)myyearbook(dot)com> writes:
> > 2008-03-04 05:45:47 EST [6698]: [1-1] LOG: process 6698 still waiting for
> > AccessShareLock on relation 1247 of database 16385 after 1001.519 ms
> > 2008-03-04 05:45:47 EST [6698]: [2-1] STATEMENT: VACUUM FULL
> > autograph.autograph_creators
> > 2008-03-04 05:46:28 EST [6730]: [1-1] LOG: process 6730 still waiting for
> > AccessShareLock on relation 1247 of database 16385 after 1000.887 ms
> > 2008-03-04 05:46:28 EST [6730]: [2-1] STATEMENT: VACUUM FULL
> > lunchmoney.totals
> > 2008-03-04 05:47:55 EST [3809]: [18-1] LOG: server process (PID 6742) was
> > terminated by signal 6: Aborted
> > 2008-03-04 05:47:55 EST [3809]: [19-1] LOG: terminating any other active
> > server processes
> > 2008-03-04 05:47:55 EST [6741]: [12-1] WARNING: terminating connection
> > because of crash of another server process
>
> How annoying ... the PANIC message doesn't seem to have reached the log.
> elog.c is careful to fflush(stderr) before abort(), so that isn't
> supposed to happen. But it looks like you are using syslog for logging
> (correct?) so maybe this is a problem with the syslog implementation
> you're using. What's the platform exactly?
>
> I wonder if it'd be reasonable to put a closelog() call just before
> the abort() ...
>
> regards, tom lane

I'm using stderr, emulated to look like syslog for pgfouine.

log_destination = 'stderr'
logging_collector = on                   # Enable capturing of stderr and csvlog
log_directory = '/var/log/postgres/'     # Directory where log files are written
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'  # Log file name pattern.


From: "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-05 15:40:39
Message-ID: af1bce590803050740m2785c48bk91ed45b76bc78b02@mail.gmail.com

On Wed, Mar 5, 2008 at 10:31 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> "Gavin M. Roy" <gmr(at)myyearbook(dot)com> writes:
> > 2008-03-04 05:45:47 EST [6698]: [1-1] LOG: process 6698 still waiting for
> > AccessShareLock on relation 1247 of database 16385 after 1001.519 ms
> > 2008-03-04 05:45:47 EST [6698]: [2-1] STATEMENT: VACUUM FULL
> > autograph.autograph_creators
> > 2008-03-04 05:46:28 EST [6730]: [1-1] LOG: process 6730 still waiting for
> > AccessShareLock on relation 1247 of database 16385 after 1000.887 ms
> > 2008-03-04 05:46:28 EST [6730]: [2-1] STATEMENT: VACUUM FULL
> > lunchmoney.totals
> > 2008-03-04 05:47:55 EST [3809]: [18-1] LOG: server process (PID 6742) was
> > terminated by signal 6: Aborted
> > 2008-03-04 05:47:55 EST [3809]: [19-1] LOG: terminating any other active
> > server processes
> > 2008-03-04 05:47:55 EST [6741]: [12-1] WARNING: terminating connection
> > because of crash of another server process
>
> How annoying ... the PANIC message doesn't seem to have reached the log.
> elog.c is careful to fflush(stderr) before abort(), so that isn't
> supposed to happen. But it looks like you are using syslog for logging
> (correct?) so maybe this is a problem with the syslog implementation
> you're using. What's the platform exactly?
>
> I wonder if it'd be reasonable to put a closelog() call just before
> the abort() ...
>
> regards, tom lane
>

The panic may have made it if this is what you were looking for:

2008-03-04 05:45:20 EST [6742]: [7-1] PANIC: deadlock detected
2008-03-04 05:58:33 EST [8751]: [3-1] PANIC: deadlock detected

(it cored twice before I lowered the concurrency of vacuums)


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-05 15:44:15
Message-ID: 47CEBFCF.5080108@enterprisedb.com

Tom Lane wrote:
> The reason the critical section is so large is that we're manipulating
> the contents of a shared buffer, and we don't want a failure to leave a
> partially-modified page in the buffer. We could fix that if we were to
> memcpy the page into local storage and do all the pruning work there.
> Then the critical section would only surround copying the page back to
> the buffer and writing the WAL record. Copying the page is a tad
> annoying but heap_page_prune is an expensive operation anyway, and
> I think we really are at too much risk of PANIC the way it's being done
> now. Has anyone got a better idea?

We could do the pruning in two phases: first figure out what to do
without modifying anything, outside the critical section, and then
actually do it, inside the critical section.

Looking at heap_page_prune, we already collect information about what we
did in the redirected/nowdead/nowunused arrays for WAL logging purposes.
We could use that, but we would also have to teach heap_prune_chain not
to step into tuples that we've already decided to remove.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-05 15:45:58
Message-ID: 26344.1204731958@sss.pgh.pa.us

"Gavin M. Roy" <gmr(at)myyearbook(dot)com> writes:
> The panic may have made it if this is what you were looking for:

> 2008-03-04 05:45:20 EST [6742]: [7-1] PANIC: deadlock detected
> 2008-03-04 05:58:33 EST [8751]: [3-1] PANIC: deadlock detected

That's what I expected to find, but where's the DETAIL for these?

regards, tom lane


From: "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-05 15:53:23
Message-ID: af1bce590803050753x38d98025ob14a625d27d562c3@mail.gmail.com

2008-03-04 05:45:20 EST [6742]: [7-1] PANIC: deadlock detected
2008-03-04 05:45:20 EST [6742]: [8-1] DETAIL: Process 6742 waits for
AccessShareLock on relation 2619 of database 16385; blocked by process 6740.
Process 6740 waits for AccessShareLock on relation 1247 of database 16385;
blocked by process 6742.
2008-03-04 05:45:20 EST [6742]: [9-1] STATEMENT: VACUUM FULL
pg_catalog.pg_type

Sorry, been juggling too many things this morning!

On Wed, Mar 5, 2008 at 10:45 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> "Gavin M. Roy" <gmr(at)myyearbook(dot)com> writes:
> > The panic may have made it if this is what you were looking for:
>
> > 2008-03-04 05:45:20 EST [6742]: [7-1] PANIC: deadlock detected
> > 2008-03-04 05:58:33 EST [8751]: [3-1] PANIC: deadlock detected
>
> That's what I expected to find, but where's the DETAIL for these?
>
> regards, tom lane
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-05 15:59:42
Message-ID: 26892.1204732782@sss.pgh.pa.us

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> Tom Lane wrote:
>> I think we really are at too much risk of PANIC the way it's being done
>> now. Has anyone got a better idea?

> We could do the pruning in two phases: first figure out what to do
> without modifying anything, outside the critical section, and then
> actually do it, inside the critical section.

> Looking at heap_page_prune, we already collect information of what we
> did in the redirected/nowdead/nowunused arrays for WAL logging purposes.

That's a thought, but ...

> We could use that, but we would also have to teach heap_prune_chain to
> not step into tuples that we've already decided to remove.

... seems like this would require searching the aforementioned arrays
for each tuple examined, which could turn into an O(N^2) problem.
If there are many removable tuples it could easily end up slower than
copying.

[ thinks some more... ] I guess we could use a flag array dimensioned
MaxHeapTuplesPerPage to mark already-processed tuples, so that you
wouldn't need to search the existing arrays but just index into the flag
array with the tuple's offsetnumber.
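As a rough illustration (Python rather than the backend's C, and the
constant is a stand-in for MaxHeapTuplesPerPage), the flag array turns
each "already processed?" check into an O(1) index:

```python
# Illustrative only: constant-time "already processed?" checks via a flag
# array indexed by offset number, instead of scanning the
# redirected/nowdead/nowunused arrays for every tuple examined.
MAX_HEAP_TUPLES_PER_PAGE = 291   # roughly what an 8K page allows

def prune_with_flags(offsets_to_process):
    processed = [False] * (MAX_HEAP_TUPLES_PER_PAGE + 1)
    handled = []
    for off in offsets_to_process:
        if processed[off]:       # O(1) lookup, no array search
            continue
        processed[off] = True
        handled.append(off)
    return handled

print(prune_with_flags([3, 7, 3, 12, 7]))  # -> [3, 7, 12]
```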

I wonder if the logic could be restructured to avoid this by taking
advantage of it being a two-pass process, instead of fighting it?
But that'd probably be a bigger change than we'd want to risk
back-patching.

Since I'm the one complaining about the PANIC risk, I guess I should
do the legwork here.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-05 16:16:15
Message-ID: 27483.1204733775@sss.pgh.pa.us

"Gavin M. Roy" <gmr(at)myyearbook(dot)com> writes:
> 2008-03-04 05:45:20 EST [6742]: [7-1] PANIC: deadlock detected
> 2008-03-04 05:45:20 EST [6742]: [8-1] DETAIL: Process 6742 waits for
> AccessShareLock on relation 2619 of database 16385; blocked by process 6740.
> Process 6740 waits for AccessShareLock on relation 1247 of database 16385;
> blocked by process 6742.
> 2008-03-04 05:45:20 EST [6742]: [9-1] STATEMENT: VACUUM FULL
> pg_catalog.pg_type

Hmm ...

regression=# select 2619::regclass, 1247::regclass;
regclass | regclass
--------------+----------
pg_statistic | pg_type
(1 row)

regression=#

So presumably 6740 is doing a vac full on pg_statistic. There isn't
really any need for these to conflict in cache initialization.
It strikes me that we could fix most of the deadlock risk if we just
switched the order of the two tests in the loop in
PrepareToInvalidateCacheTuple; that is, don't force initialization
of a catcache unless it's one we need to access right now. Then a
VAC FULL is only going to be interested in initializing caches for
the catalog it's vacuuming, which should be safe enough.

We still need to deal with the excessive PANIC risk, but I think we
have a plan for that now.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>
Cc: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-05 17:07:18
Message-ID: 315.1204736838@sss.pgh.pa.us

"Gavin M. Roy" <gmr(at)myyearbook(dot)com> writes:
> On Wed, Mar 5, 2008 at 10:13 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Actually, maybe it *has* been seen before. Gavin, are you in the habit
>> of running concurrent VACUUM FULLs on system catalogs, and if so have
>> you noted that they occasionally get deadlock failures?

> Generally no, I've never noticed deadlocks before, but I'll go back and
> look at some of the other machines.

After digging a bit deeper, it seems that pre-8.2 releases wouldn't have
been at risk for a deadlock here anyway, because
CatalogCacheInitializeCache didn't lock the system catalog it was
initializing a cache for. (That had *other* risks, but not this one.)
So possibly the lack of prior reports is just because not too many
people are in the habit of using concurrent VACUUM FULLs with late-model
Postgres. I can reproduce the deadlock (though not the ensuing PANIC)
in 8.2, so it's definitely not heap_page_prune's fault.

regards, tom lane


From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-06 09:07:01
Message-ID: 2e78013d0803060107s2bb4ee40j6a971f61285077b6@mail.gmail.com

On Wed, Mar 5, 2008 at 9:29 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>
> [ thinks some more... ] I guess we could use a flag array dimensioned
> MaxHeapTuplesPerPage to mark already-processed tuples, so that you
> wouldn't need to search the existing arrays but just index into the flag
> array with the tuple's offsetnumber.
>

We can actually combine this and the page copying ideas. Instead of copying
the entire page, we can just copy the line pointers array and work on the copy.
ISTM that the only place where we change the tuple contents is when we
collapse the redirected line pointers and that we can do at the end, on the
original page.

The last step which we run inside a critical section would then be just like
invoking heap_xlog_clean with the information collected in the first pass.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-06 18:00:25
Message-ID: 18945.1204826425@sss.pgh.pa.us

"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> On Wed, Mar 5, 2008 at 9:29 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> [ thinks some more... ] I guess we could use a flag array dimensioned
>> MaxHeapTuplesPerPage to mark already-processed tuples, so that you
>> wouldn't need to search the existing arrays but just index into the flag
>> array with the tuple's offsetnumber.

> We can actually combine this and the page copying ideas. Instead of copying
> the entire page, we can just copy the line pointers array and work on the copy.

I think that just makes things more complex and fragile. I like
Heikki's idea, in part because it makes the normal path and the WAL
recovery path guaranteed to work alike. I'll attach my work-in-progress
patch for this --- it doesn't do anything about the invalidation
semantics problem but it does fix the critical-section-too-big problem.

regards, tom lane

Attachment: heap_prune_refactor.patch.gz (application/octet-stream, 6.5 KB)

From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-06 18:53:57
Message-ID: 47D03DC5.8010301@enterprisedb.com

Tom Lane wrote:
> "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
>> On Wed, Mar 5, 2008 at 9:29 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> [ thinks some more... ] I guess we could use a flag array dimensioned
>>> MaxHeapTuplesPerPage to mark already-processed tuples, so that you
>>> wouldn't need to search the existing arrays but just index into the flag
>>> array with the tuple's offsetnumber.
>
>> We can actually combine this and the page copying ideas. Instead of copying
>> the entire page, we can just copy the line pointers array and work on the copy.
>
> I think that just makes things more complex and fragile. I like
> Heikki's idea, in part because it makes the normal path and the WAL
> recovery path guaranteed to work alike. I'll attach my work-in-progress
> patch for this --- it doesn't do anything about the invalidation
> semantics problem but it does fix the critical-section-too-big problem.

FWIW, the patch looks fine to me by inspection; I didn't test it.

I'm glad we got away with a single "marked" array. I was afraid we would
need to consult the unused/redirected/dead arrays separately.

Do you have a plan for the invalidation problem? I think we could just
not remove the redirection line pointers in catalog tables.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-06 19:10:40
Message-ID: 20053.1204830640@sss.pgh.pa.us

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> I'm glad we got away with a single "marked" array. I was afraid we would
> need to consult the unused/redirected/dead arrays separately.

Yeah, I was worried about that too. The fundamental reason why it's
okay seems to be that redirects can only be the heads of HOT chains.
Therefore, a newly marked tuple cannot be a member of any chain we'll
need to scan later, no matter whether it is marked as newly redirected,
dead, or unused. And so we can just ignore marked tuples in all cases.

I set up the code to mark the redirection target too, but that is just a
minor optimization AFAICS --- if we reach the redirection target later
in the outer scan loop, we'd decide not to process it anyway.

> Do you have a plan for the invalidation problem? I think we could just
> not remove the redirection line pointers in catalog tables.

Still thinking about it, but I agree that I'd rather make a small
modification to VACUUM FULL than try to redesign cache invalidation.

The trick with not removing redirect pointers would be to ensure we
don't remove the redirection target. While the redirection target was
presumably not DEAD when heap_page_prune looked at it, it seems possible
that it is DEAD by the time vacuum.c looks. Another risk factor is
trying to move the redirection target down to a lower page.

regards, tom lane


From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-07 09:22:56
Message-ID: 2e78013d0803070122v5ee2a2c6l610c01699ca0c0cb@mail.gmail.com

On Thu, Mar 6, 2008 at 11:30 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>
> I think that just makes things more complex and fragile. I like
> Heikki's idea, in part because it makes the normal path and the WAL
> recovery path guaranteed to work alike. I'll attach my work-in-progress
> patch for this --- it doesn't do anything about the invalidation
> semantics problem but it does fix the critical-section-too-big problem.
>

The WIP patch looks good to me. I haven't yet tested it (will wait for the
final version). The following pointer arithmetic caught my eye though.

! nunused = (end - nowunused);

Shouldn't we typecast them to (char *) first?

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
Cc: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, "Gavin M(dot) Roy" <gmr(at)myyearbook(dot)com>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 8.3.0 Core with concurrent vacuum fulls
Date: 2008-03-07 13:07:24
Message-ID: 11440.1204895244@sss.pgh.pa.us

"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> writes:
> The WIP patch looks good to me. I haven't yet tested it (will wait for the
> final version). The following pointer arithmetic caught my eye though.
> ! nunused = (end - nowunused);
> Shouldn't we typecast them to (char *) first ?

No ... we want the number of OffsetNumbers, not the number of bytes.

regards, tom lane