Re: Postgres 9.0 crash on win7

Lists: pgsql-bugs
From: Andrea Peri 2007 <aperi2007(at)gmail(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Cc: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
Subject: Re: Postgres 9.0 crash on win7
Date: 2010-10-03 20:41:21
Message-ID: 4CA8EA71.5020506@gmail.com

Hi,

I have an update on the PG 9.0 crash.

It seems that PG9 also crashes on 32-bit Windows.

However, while on 64-bit Win7 it always crashes on the first try, on
32-bit Win7 it crashes on the first or second try after a restart,
as reported here by the PostGIS team:

http://postgis.refractions.net/pipermail/postgis-users/2010-October/027843.html

Regards,

Andrea.


From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Andrea Peri 2007 <aperi2007(at)gmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: Postgres 9.0 crash on win7
Date: 2010-10-04 01:06:01
Message-ID: 4CA92879.1060902@postnewspapers.com.au

On 4/10/2010 4:41 AM, Andrea Peri 2007 wrote:
> Hi,
>
> I have an update on the PG 9.0 crash.
>
> It seems that PG9 also crashes on 32-bit Windows.

Yes, it will. I've just been able to reproduce it here with your script,
on 32-bit win7. I should be able to report where it's crashing shortly.

I gave you bad advice on the backtrace, I'm afraid. A backtrace of a
32-bit process on 64-bit Windows, at least using the 64-bit debugging
tools, appears to be pretty useless. Having never tried that particular
setup before, I didn't realize that. I've added a note to that effect
to the instructions. Sorry about that.

I've been able to reproduce the crash (thanks for the test case!) and
obtain the crash information here on my 32-bit windows 7 install, so
there's no need for you to do anything else so far.

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/


From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Andrea Peri 2007 <aperi2007(at)gmail(dot)com>
Subject: Re: Postgres 9.0 crash on win7
Date: 2010-10-04 11:09:31
Message-ID: 4CA9B5EB.4040706@postnewspapers.com.au

Just an update on this issue:

> I've been able to reproduce the crash (thanks for the test case!) and
> obtain the crash information here on my 32-bit windows 7 install, so
> there's no need for you to do anything else so far.

I still can't get a usable backtrace. The autovacuum workers/launcher
split makes it *really* hard to catch an autovacuum worker in action.
The post-mortem debugger won't trigger for service processes, so I can't
trap it that way, and I can't pre-attach a debugger to it.

OTOH, it's now pretty clearly autovacuum that's dying, as Tom Lane
suggested it probably would be.

debug5 logging shows:

> 2010-10-04 18:18:54 WST 3692 DEBUG: InitPostgres
> 2010-10-04 18:18:54 WST 3692 DEBUG: my backend id is 3
> 2010-10-04 18:18:54 WST 3692 DEBUG: StartTransaction
> 2010-10-04 18:18:54 WST 3692 DEBUG: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
> 2010-10-04 18:18:54 WST 3692 DEBUG: mapped win32 error code 2 to 2
> 2010-10-04 18:18:54 WST 3692 DEBUG: CommitTransaction
> 2010-10-04 18:18:54 WST 3692 DEBUG: name: unnamed; blockState: STARTED; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
> 2010-10-04 18:18:54 WST 3692 DEBUG: autovacuum: processing database "test"
> 2010-10-04 18:18:54 WST 3692 DEBUG: StartTransaction
> 2010-10-04 18:18:54 WST 3692 DEBUG: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
> 2010-10-04 18:18:54 WST 3692 DEBUG: pg_statistic: vac: 0 (threshold 118), anl: 0 (threshold 84)

... followed by lots more startup messages, a series of transactions, then:

> 2010-10-04 18:18:55 WST 3692 DEBUG: name: unnamed; blockState: STARTED; state: INPROGR, xid/subid/cid: 159661/1/0 (used), nestlvl: 1, children:
> 2010-10-04 18:18:55 WST 3692 DEBUG: StartTransaction
> 2010-10-04 18:18:55 WST 3692 DEBUG: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
> 2010-10-04 18:18:55 WST 3692 DEBUG: poslist: vac: 0 (threshold 50), anl: 1804 (threshold 50)
> 2010-10-04 18:18:55 WST 3692 DEBUG: autovac_balance_cost(pid=3692 db=98315, rel=98390, cost_limit=200, cost_delay=20)
> 2010-10-04 18:18:55 WST 3692 DEBUG: CommitTransaction
> 2010-10-04 18:18:55 WST 3692 DEBUG: name: unnamed; blockState: STARTED; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
> 2010-10-04 18:18:55 WST 3692 DEBUG: StartTransaction
> 2010-10-04 18:18:55 WST 3692 DEBUG: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
> 2010-10-04 18:18:55 WST 3692 DEBUG: analyzing "public.poslist"
> 2010-10-04 18:18:55 WST 2408 DEBUG: server process (PID 3692) was terminated by exception 0xC0000005
> 2010-10-04 18:18:55 WST 2408 LOG: server process (PID 3692) was terminated by exception 0xC0000005

Autovacuum usually dies after:

analyzing "public.suolo" (three times)

but I've also seen it die after:

analyzing "public.poslist"

as shown above.
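
To check which table autovacuum was on when it died across a larger log,
something like the following could help. It is only a rough sketch: it
assumes the log line layout shown in the excerpt above (timestamp,
timezone, PID, level, message), and the function name and regexes are
made up for illustration, not part of PostgreSQL.

```python
import re

# Assumed line layout, taken from the debug5 excerpt above:
#   2010-10-04 18:18:55 WST 3692 DEBUG: analyzing "public.poslist"
# An optional leading "> " (mail quoting) is tolerated.
LINE_RE = re.compile(
    r'^(?:>\s*)?\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \S+ (?P<pid>\d+) '
    r'(?P<level>[A-Z0-9]+):\s+(?P<msg>.*)$'
)
ANALYZE_RE = re.compile(r'^analyzing "(?P<rel>[^"]+)"')
CRASH_RE = re.compile(
    r'^server process \(PID (?P<pid>\d+)\) was terminated by exception '
    r'(?P<code>0x[0-9A-Fa-f]+)'
)


def last_analyze_before_crash(lines):
    """Return a list of (pid, relation, exception_code) per crashed process.

    relation is the last table that PID logged as "analyzing", or None
    if it never logged one.
    """
    last_rel = {}   # pid -> last relation that pid started analyzing
    seen = set()    # crash is logged at both DEBUG and LOG; report it once
    crashes = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        msg = m.group("msg")
        a = ANALYZE_RE.match(msg)
        if a:
            last_rel[m.group("pid")] = a.group("rel")
            continue
        c = CRASH_RE.match(msg)
        # Simplification: assumes a PID is not reused within one log file.
        if c and c.group("pid") not in seen:
            pid = c.group("pid")
            seen.add(pid)
            crashes.append((pid, last_rel.get(pid), c.group("code")))
    return crashes
```

Called as last_analyze_before_crash(open(path)) with the path of the
server log, it returns one tuple per crashed process. For reference,
0xC0000005 is the Windows access violation status code.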

I'm really struggling to get a debugger attached to the *@#$@#$@$* thing
though. Ideas?

*punting to PostGIS folks for a look*

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/