VACUUM/t_ctid bug (was Re: GiST concurrency commited)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: VACUUM/t_ctid bug (was Re: GiST concurrency commited)
Date: 2005-08-20 06:20:09
Message-ID: 20570.1124518809@sss.pgh.pa.us
Lists: pgsql-hackers

Awhile back, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
> And there is one more problem: approximately once per 2-4 million
> statements, I got traps:
> TRAP: FailedAssertion("!((*curpage)->offsets_used == num_tuples)", File:
> "vacuum.c", Line: 2766)
> LOG: server process (PID 15847) was terminated by signal 6
> Sorry, but I couldn't debug this trap and my knowledge of this piece of code
> is very limited. Postgres didn't create a core file. I don't believe this
> problem is related to my GiST framework, because it is about heap pages. I
> suspect the trap occurs during a concurrent vacuum, but I am not sure.

> PS
> My concurrency testing scripts:
> http://www.sigaev.ru/gist/
> concur.pl - generator of SQL statements
> concur.sh - simple wrapper around concur.pl which reinits the db and creates the db and table.

I have committed changes that I believe fix this problem:
http://archives.postgresql.org/pgsql-committers/2005-08/msg00213.php
But it needs more testing. Would you update to CVS tip and see if you
still see the failure?

Also, if anyone else has some vacuum + concurrent update test cases,
any testing you can do in CVS tip would be useful. This patch is big
and ugly enough that back-patching it into all the supported back
branches is a pretty scary prospect. I don't think we have a lot of
choice --- it is a data-loss risk --- but we need to beat the heck
out of the CVS-tip version before we start pushing it into the release
branches.

My current intention is to leave it just in CVS tip for the next few
days, and not to start developing back-branch versions until after
we've made the first 8.1 beta release. The back-ports are going to
be painful (the code involved has changed often enough that I fear
each branch will need a custom-tailored patch) ... so I really don't
want to start without some confidence that the CVS-tip patch is right.

In other words ... if you can test this ... HELP!!!

regards, tom lane
