Re: BUG #7562: could not read block 0 in file "base/16385/16585": read only 0 of 8192 bytes

Lists: pgsql-bugs
From: mayank(dot)mittal(dot)1982(at)hotmail(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #7562: could not read block 0 in file "base/16385/16585": read only 0 of 8192 bytes
Date: 2012-09-20 16:15:11
Message-ID: E1TEjPD-0002Yq-7L@wrigleys.postgresql.org

The following bug has been logged on the website:

Bug reference: 7562
Logged by: Mayank Mittal
Email address: mayank(dot)mittal(dot)1982(at)hotmail(dot)com
PostgreSQL version: 9.1.5
Operating system: Debian Linux 6.0
Description:

We are using a 2-node setup of PostgreSQL 9.1.5 in which one node is the master
and the other is a slave kept in sync with the master via streaming replication.
The design is such that if the master node fails, the slave node takes over
the master role. I'm controlling this behaviour using Corosync and Heartbeat.
Our application requires heavy database updates. Upon fail-over
I've noticed that database indexes get corrupted.
I'm not sure why this is happening. I looked at the release notes of 9.1.3
and found that a similar issue was already fixed there, but we are still facing it.


From: Mayank Mittal <mayank(dot)mittal(dot)1982(at)outlook(dot)com>
To: "mayank(dot)mittal(dot)1982(at)hotmail(dot)com" <mayank(dot)mittal(dot)1982(at)hotmail(dot)com>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #7562: could not read block 0 in file "base/16385/16585": read only 0 of 8192 bytes
Date: 2012-09-20 16:44:42
Message-ID: COL002-W116226DCA4E62088445BF8CC69A0@phx.gbl

Here is a snapshot of installed postgresql packages:
mayank(at)server:~$ dpkg -l | grep postgres
ii  postgresql-9.1            9.1.5-1~bpo60+1  object-relational SQL database, version 9.1 server
ii  postgresql-client-9.1     9.1.5-1~bpo60+1  front-end programs for PostgreSQL 9.1
ii  postgresql-client-common  130~bpo60+1      manager for multiple PostgreSQL client versions
ii  postgresql-common         130~bpo60+1      PostgreSQL database-cluster manager
ii  postgresql-contrib        9.1+130~bpo60+2  additional facilities for PostgreSQL (supported version)
ii  postgresql-contrib-9.1    9.1.5-1~bpo60+1  additional facilities for PostgreSQL

Regards,
Mayank Mittal



From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: mayank(dot)mittal(dot)1982(at)hotmail(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #7562: could not read block 0 in file "base/16385/16585": read only 0 of 8192 bytes
Date: 2012-09-20 17:15:17
Message-ID: 25722.1348161317@sss.pgh.pa.us

mayank(dot)mittal(dot)1982(at)hotmail(dot)com writes:
> The following bug has been logged on the website:
> Bug reference: 7562
> Logged by: Mayank Mittal
> Email address: mayank(dot)mittal(dot)1982(at)hotmail(dot)com
> PostgreSQL version: 9.1.5
> Operating system: Debian Linux 6.0
> Description:

> We are using 2 node set-up of PostgreSQL 9.1.5 in which one is master and
> other is slave which is in sync of master with streaming replication.
> The design is in such a way that in case of master node failure the slave
> node has to take master role. I'm controlling this behaviour using Corosync
> and Heartbeat.
> My application is requirement needs heavy database updates. Upon fail-over
> I've noticed that database indexes got corrupted.

Hmm. There is a fix for a slave-side-index-corruption problem in 9.1.6,
which is due to be announced Monday. I am not certain whether this is
the same thing though; that bug is low-probability as far as we can
tell (it would only happen if the master had been in the middle of an
index page split or page deletion at the instant of failover). Anyway
the first thing to find out is whether 9.1.6 fixes it.

regards, tom lane


From: Mayank Mittal <mayank(dot)mittal(dot)1982(at)hotmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #7562: could not read block 0 in file "base/16385/16585": read only 0 of 8192 bytes
Date: 2012-09-20 17:25:26
Message-ID: COL002-W47D620EFF93624337E6482D59A0@phx.gbl

Hello Tom,

Thanks for the information. The problem, though, is that this is occurring quite frequently in my case.

Regards,
Mayank Mittal



From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, mayank(dot)mittal(dot)1982(at)hotmail(dot)com
Subject: Re: BUG #7562: could not read block 0 in file "base/16385/16585": read only 0 of 8192 bytes
Date: 2012-09-20 21:31:35
Message-ID: 201209202331.35967.andres@2ndquadrant.com

On Thursday, September 20, 2012 07:15:17 PM Tom Lane wrote:
> mayank(dot)mittal(dot)1982(at)hotmail(dot)com writes:
> > The following bug has been logged on the website:
> > Bug reference: 7562
> > Logged by: Mayank Mittal
> > Email address: mayank(dot)mittal(dot)1982(at)hotmail(dot)com
> > PostgreSQL version: 9.1.5
> > Operating system: Debian Linux 6.0
> > Description:
> >
> > We are using 2 node set-up of PostgreSQL 9.1.5 in which one is master and
> > other is slave which is in sync of master with streaming replication.
> > The design is in such a way that in case of master node failure the slave
> > node has to take master role. I'm controlling this behaviour using
> > Corosync and Heartbeat.
> > My application is requirement needs heavy database updates. Upon
> > fail-over I've noticed that database indexes got corrupted.

What kind of indexes are you using? Hash indexes, by any chance?

Since, as you say downthread, the failures are frequent, could you provide a bit
more detail about your setup (including configuration, initial setup, etc.) and
the logs from both machines?

> Hmm. There is a fix for a slave-side-index-corruption problem in 9.1.6,
> which is due to be announced Monday. I am not certain whether this is
> the same thing though; that bug is low-probability as far as we can
> tell (it would only happen if the master had been in the middle of an
> index page split or page deletion at the instant of failover). Anyway
> the first thing to find out is whether 9.1.6 fixes it.

I think the likelihood of that bug causing the index file to be zero bytes
- at least that's what I read from $subject - is really, really small:

The index would need to be created (setting a proper BM_PERMANENT flag on the
meta page), evicted from the buffer cache and thus written to the filesystem,
and the root page would need to split, causing the meta page to be rewritten
(this time without a proper BM_PERMANENT), all in very quick succession,
followed by an OS/HW failure losing the data already in the OS cache.
So, unless I am missing something, I don't see how that can happen.

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
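
Andres's question about index types can be answered from the catalog. The sketch below is only an illustration, not something posted in the thread: it assumes the index definitions have already been fetched (for example with `SELECT indexdef FROM pg_indexes;`) and picks out the access method named after `USING`. The sample definitions and the helper names `index_method` and `hash_indexes` are hypothetical.

```python
import re

# Matches the access method in definitions shaped like pg_indexes.indexdef,
# e.g. "CREATE INDEX idx ON t USING hash (col)".
_USING = re.compile(r"\bUSING\s+(\w+)", re.IGNORECASE)


def index_method(indexdef):
    """Return the lower-cased access method named in an index definition, or None."""
    match = _USING.search(indexdef)
    return match.group(1).lower() if match else None


def hash_indexes(indexdefs):
    """Return only the definitions whose access method is 'hash'."""
    return [d for d in indexdefs if index_method(d) == "hash"]


# Hypothetical definitions for illustration; real ones would come from
# SELECT indexdef FROM pg_indexes;
SAMPLE = [
    "CREATE UNIQUE INDEX orders_pkey ON orders USING btree (id)",
    "CREATE INDEX orders_token_idx ON orders USING hash (token)",
]

if __name__ == "__main__":
    for definition in hash_indexes(SAMPLE):
        print(definition)
```

An empty result from `hash_indexes` would suggest that hash indexes are not involved.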


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org, mayank(dot)mittal(dot)1982(at)hotmail(dot)com
Subject: Re: BUG #7562: could not read block 0 in file "base/16385/16585": read only 0 of 8192 bytes
Date: 2012-09-20 21:38:52
Message-ID: 15918.1348177132@sss.pgh.pa.us

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On Thursday, September 20, 2012 07:15:17 PM Tom Lane wrote:
>> Hmm. There is a fix for a slave-side-index-corruption problem in 9.1.6,
>> which is due to be announced Monday. I am not certain whether this is
>> the same thing though; that bug is low-probability as far as we can
>> tell (it would only happen if the master had been in the middle of an
>> index page split or page deletion at the instant of failover). Anyway
>> the first thing to find out is whether 9.1.6 fixes it.

> I think the likelihood of that bug causing the the index file to be zero bytes
> - at least thats what I read from $subject - is really, really small:

Sure, but what about the heap? The case I was speculating about was
that the heap had been truncated, but because of the corruption problem,
the index still had heap pointers in it. We don't know what file 16585
is supposed to be.

Your point about hash indexes is definitely worth asking though...
that would square with the reported symptoms.

regards, tom lane
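
Tom notes that nobody yet knows what file 16585 is. A path of the form `base/<database OID>/<relfilenode>` can be mapped back to a relation through the catalogs, so one possible first step is sketched below. The function names are hypothetical, and the generated queries are only a suggestion; the second one has to be run while connected to the database the first one identifies.

```python
import re

# Pattern for the error text quoted in the subject line of this thread.
_READ_ERROR = re.compile(
    r'could not read block (\d+) in file "base/(\d+)/(\d+)": '
    r"read only (\d+) of (\d+) bytes"
)


def parse_read_error(message):
    """Split a 'could not read block' error into its numeric parts, or return None."""
    match = _READ_ERROR.search(message)
    if match is None:
        return None
    block, db_oid, filenode, got, wanted = (int(g) for g in match.groups())
    return {
        "block": block,
        "database_oid": db_oid,
        "relfilenode": filenode,
        "bytes_read": got,
        "bytes_expected": wanted,
    }


def lookup_queries(info):
    """Build catalog queries that identify the database and the relation."""
    return [
        "SELECT datname FROM pg_database WHERE oid = %d;" % info["database_oid"],
        # relkind distinguishes tables ('r'), indexes ('i') and TOAST tables ('t').
        "SELECT relname, relkind FROM pg_class WHERE relfilenode = %d;"
        % info["relfilenode"],
    ]


if __name__ == "__main__":
    subject = (
        'could not read block 0 in file "base/16385/16585": '
        "read only 0 of 8192 bytes"
    )
    for query in lookup_queries(parse_read_error(subject)):
        print(query)
```

The `relkind` column in the second query would show whether the file belongs to a heap or to an index, which is the distinction Tom and Andres are debating.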


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org, mayank(dot)mittal(dot)1982(at)hotmail(dot)com
Subject: Re: BUG #7562: could not read block 0 in file "base/16385/16585": read only 0 of 8192 bytes
Date: 2012-09-20 22:10:35
Message-ID: 201209210010.35352.andres@2ndquadrant.com

On Thursday, September 20, 2012 11:38:52 PM Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On Thursday, September 20, 2012 07:15:17 PM Tom Lane wrote:
> >> Hmm. There is a fix for a slave-side-index-corruption problem in 9.1.6,
> >> which is due to be announced Monday. I am not certain whether this is
> >> the same thing though; that bug is low-probability as far as we can
> >> tell (it would only happen if the master had been in the middle of an
> >> index page split or page deletion at the instant of failover). Anyway
> >> the first thing to find out is whether 9.1.6 fixes it.
> >
> > I think the likelihood of that bug causing the index file to be zero
> > bytes - at least that's what I read from $subject - is really, really small:
>
> Sure, but what about the heap? The case I was speculating about was
> that the heap had been truncated, but because of the corruption problem,
> the index still had heap pointers in it. We don't know what file 16585
> is supposed to be.

Hm. Interesting thought.

*think*

Wouldn't the truncation have created a completely new index relation?

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org, mayank(dot)mittal(dot)1982(at)hotmail(dot)com
Subject: Re: BUG #7562: could not read block 0 in file "base/16385/16585": read only 0 of 8192 bytes
Date: 2012-09-20 22:18:12
Message-ID: 16858.1348179492@sss.pgh.pa.us

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On Thursday, September 20, 2012 11:38:52 PM Tom Lane wrote:
>> Sure, but what about the heap? The case I was speculating about was
>> that the heap had been truncated, but because of the corruption problem,
>> the index still had heap pointers in it. We don't know what file 16585
>> is supposed to be.

> Wouldn't the truncation have created a completely new index relation?

If it were an actual TRUNCATE, yeah. But it could be a case of VACUUM
truncating a now-empty table to zero blocks.

But nothing like this would explain the OP's report that corruption is
completely reproducible for him. So I like your theory about hash index
use better. We really oughta get some WAL support in there.

regards, tom lane


From: Mayank Mittal <mayank(dot)mittal(dot)1982(at)hotmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #7562: could not read block 0 in file "base/16385/16585": read only 0 of 8192 bytes
Date: 2012-09-21 07:01:00
Message-ID: COL002-W355839495E63B8ABC2750BD5990@phx.gbl

Hello Andres,

I didn't specify an index type explicitly. I'm relying on the default one, which is B-Tree.

Here is the basic configuration of my system.

Operating System: Debian Linux 6.0
Type: 64-bit
File system type: ext4
RAM: 4G

Also, I didn't understand where to find the BM_PERMANENT flag setting.

Here are the steps for the initial setup.

1. Server 1 is running in master mode.
2. When server 2 comes up, our Resource Agent initiates pg_dump on the master node and copies the dump to the data folder of the slave node.
3. Once the copy is complete, we create the recovery.conf file on the slave node and start PostgreSQL.
4. In case of master failure, the RA creates a trigger file on the slave to promote it to master.

I'm using the following command to take the dump of the master:
pg_basebackup -U postgres -h <master_node_ip> -P -x -D <backup_location>

Regards,
Mayank Mittal
Barco Electronics System Ltd.
Mob. +91 9873437922
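
The setup steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the poster's Resource Agent: the `pg_basebackup` flags are the ones quoted in the message, while the recovery.conf contents were never posted, so the three settings shown (`standby_mode`, `primary_conninfo`, `trigger_file`) are simply the usual ones for a 9.1 streaming standby. The host address, paths, and function names are hypothetical.

```python
def basebackup_command(master_host, backup_dir, user="postgres"):
    """Build the pg_basebackup argument list quoted in the message above."""
    return [
        "pg_basebackup",
        "-U", user,
        "-h", master_host,
        "-P",            # show progress
        "-x",            # include the WAL needed to make the backup consistent
        "-D", backup_dir,
    ]


def recovery_conf(master_host, trigger_file, user="postgres", port=5432):
    """Render an assumed recovery.conf for a 9.1 streaming-replication standby."""
    lines = [
        "standby_mode = 'on'",
        "primary_conninfo = 'host=%s port=%d user=%s'" % (master_host, port, user),
        # Creating this file on the standby promotes it (step 4 above).
        "trigger_file = '%s'" % trigger_file,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(" ".join(basebackup_command("192.0.2.10", "/var/lib/postgresql/9.1/main")))
    print(recovery_conf("192.0.2.10", "/tmp/pg_failover_trigger"), end="")
```

Comparing the real recovery.conf and the Resource Agent's promotion step against a sketch like this is one way to supply the setup details Andres asked for.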


Attachment Content-Type Size
postgresql.conf text/plain 18.8 KB

From: Bernd Helmle <mailings(at)oopsware(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org, mayank(dot)mittal(dot)1982(at)hotmail(dot)com
Subject: Re: BUG #7562: could not read block 0 in file "base/16385/16585": read only 0 of 8192 bytes
Date: 2012-09-21 08:18:39
Message-ID: D069D2420F31415F7402A495@apophis.credativ.lan

--On 20. September 2012 18:18:12 -0400 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> If it were an actual TRUNCATE, yeah. But it could be a case of VACUUM
> truncating a now-empty table to zero blocks.
>
> But nothing like this would explain the OP's report that corruption is
> completely reproducible for him. So I like your theory about hash index
> use better. We really oughta get some WAL support in there.

We had a similar issue at a customer site. The server was shut down for
updating it from 9.1.4 to 9.1.5, after starting it again the log was
immediately cluttered with

ERROR: could not read block 251 in file "base/6447890/7843708": read only
0 of 8192 bytes

The index was a primary key on table with mostly INSERTS (only a few
hundred DELETEs, autovacuum didn't even bother to vacuum it yet and no
manual VACUUM). According to the customer, no DDL action takes place on
this specific table. The kernel didn't show any errors.

--
Thanks

Bernd


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Bernd Helmle <mailings(at)oopsware(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)postgresql(dot)org, mayank(dot)mittal(dot)1982(at)hotmail(dot)com
Subject: Re: BUG #7562: could not read block 0 in file "base/16385/16585": read only 0 of 8192 bytes
Date: 2012-09-21 08:25:50
Message-ID: 201209211025.51119.andres@2ndquadrant.com

On Friday, September 21, 2012 10:18:39 AM Bernd Helmle wrote:
> --On 20. September 2012 18:18:12 -0400 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > If it were an actual TRUNCATE, yeah. But it could be a case of VACUUM
> > truncating a now-empty table to zero blocks.
> >
> > But nothing like this would explain the OP's report that corruption is
> > completely reproducible for him. So I like your theory about hash index
> > use better. We really oughta get some WAL support in there.
>
> We had a similar issue at a customer site. The server was shut down for
> updating it from 9.1.4 to 9.1.5, after starting it again the log was
> immediately cluttered with
How was it shut down? -m fast or -m immediate?

> ERROR: could not read block 251 in file "base/6447890/7843708": read only
> 0 of 8192 bytes
So, not block 0. How many blocks does the new index contain?

Mayank:
Do you always see the error in block 0?

> The index was a primary key on table with mostly INSERTS (only a few
> hundred DELETEs, autovacuum didn't even bother to vacuum it yet and no
> manual VACUUM). According to the customer, no DDL action takes place on
> this specific table. The kernel didn't show any errors.
OK, this is getting weird. Bernd confirmed on IRC some minutes ago that the
table is older than the last checkpoint...

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Mayank Mittal <mayank(dot)mittal(dot)1982(at)hotmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>, Bernd Helmle <mailings(at)oopsware(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #7562: could not read block 0 in file "base/16385/16585": read only 0 of 8192 bytes
Date: 2012-09-21 08:42:44
Message-ID: COL002-W8969D3ED20672EDE4550E9D5990@phx.gbl

No. Most of the time I've seen it in block 0, but 2-3 times it was in other blocks as well.

Regards,
Mayank Mittal
Barco Electronics System Ltd.
Mob. +91 9873437922



From: Bernd Helmle <mailings(at)oopsware(dot)de>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)postgresql(dot)org, mayank(dot)mittal(dot)1982(at)hotmail(dot)com
Subject: Re: BUG #7562: could not read block 0 in file "base/16385/16585": read only 0 of 8192 bytes
Date: 2012-09-21 09:34:49
Message-ID: 873D76786B6DA97F0EBD16ED@apophis.credativ.lan

--On 21. September 2012 10:25:50 +0200 Andres Freund
<andres(at)2ndquadrant(dot)com> wrote:

>> We had a similar issue at a customer site. The server was shut down for
>> updating it from 9.1.4 to 9.1.5, after starting it again the log was
>> immediately cluttered with
> How was it shutdown? -m fast or -m immediate?
>

-m fast

>> ERROR: could not read block 251 in file "base/6447890/7843708": read
>> only 0 of 8192 bytes
> So, not block 0. How many blocks does the new index contain?

255 blocks according to its current size.

--
Thanks

Bernd
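
The numbers in this exchange relate through simple block arithmetic: with 8192-byte blocks, block N starts at byte offset N * 8192, and "read only 0 of 8192 bytes" means the file ended at or before that offset when the read was attempted. The sketch below just works through that arithmetic for the figures Bernd reports; the function names are hypothetical, and it assumes the relation fits in a single segment file.

```python
BLOCK_SIZE = 8192  # bytes per block, as in the error message ("of 8192 bytes")


def block_count(file_size_bytes, block_size=BLOCK_SIZE):
    """Number of complete blocks in a file of the given size."""
    return file_size_bytes // block_size


def block_offset(block_number, block_size=BLOCK_SIZE):
    """Byte offset at which the given block starts."""
    return block_number * block_size


def block_is_present(block_number, file_size_bytes, block_size=BLOCK_SIZE):
    """True if the whole block lies inside a file of the given size."""
    return block_offset(block_number, block_size) + block_size <= file_size_bytes


if __name__ == "__main__":
    size_now = 255 * BLOCK_SIZE  # "255 blocks according to its current size"
    print(block_count(size_now))            # 255
    print(block_offset(251))                # 2056192
    print(block_is_present(251, size_now))  # True
```

Block 251 lies inside a 255-block file, so the error would be consistent with the file having been shorter, at most 251 blocks, at the time of the failed read.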


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Mayank Mittal <mayank(dot)mittal(dot)1982(at)hotmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #7562: could not read block 0 in file "base/16385/16585": read only 0 of 8192 bytes
Date: 2012-09-21 11:43:00
Message-ID: 201209211343.00408.andres@2ndquadrant.com

On Friday, September 21, 2012 01:37:38 PM Mayank Mittal wrote:
> As discussed with Andres on IRC, I tried to reproduce the issue with some
> debug logging enabled. In order to reproduce it, I fixed my already broken
> system (index corrupted) by running REINDEX DATABASE <database_name>. Once
> done, I performed the failover, and now I'm getting the following error:
> [org.postgresql.util.PSQLException: ERROR: missing chunk number 0 for toast value 33972 in pg_toast_16582]

Unfortunately I don't think it's really a valid approach to start from an
already corrupted database when doing this :( There might already be lingering
corruption causing the problem.

Have you seen the missing chunk error before? Did you reproduce the issue from
a corrupted database as well before?

Greetings,

Andres

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
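
The TOAST error quoted above names the TOAST relation `pg_toast_16582`. As a hedged sketch, the snippet below pulls the numbers out of that message and builds a catalog query for the table that owns the TOAST relation. It uses `pg_class.reltoastrelid` for the lookup instead of relying on the numeric suffix alone; the function names are hypothetical and nothing here was run against the poster's system.

```python
import re

# Pattern for the TOAST error text quoted in the message above.
_TOAST_ERROR = re.compile(
    r"missing chunk number (\d+) for toast value (\d+) in (pg_toast_(\d+))"
)


def parse_toast_error(message):
    """Extract chunk number, TOAST value OID and TOAST relation name, or return None."""
    match = _TOAST_ERROR.search(message)
    if match is None:
        return None
    return {
        "chunk": int(match.group(1)),
        "toast_value": int(match.group(2)),
        "toast_relation": match.group(3),
        "name_suffix": int(match.group(4)),
    }


def owner_query(info):
    """Build a query that finds the table whose TOAST relation is the one named."""
    return (
        "SELECT relname FROM pg_class "
        "WHERE reltoastrelid = 'pg_toast.%s'::regclass;" % info["toast_relation"]
    )


if __name__ == "__main__":
    error = (
        "ERROR: missing chunk number 0 for toast value 33972 in pg_toast_16582"
    )
    print(owner_query(parse_toast_error(error)))
```

Knowing the owning table would help in comparing this error with the earlier index errors, though, as Andres points out, results from an already corrupted database are hard to interpret.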


From: Mayank Mittal <mayank(dot)mittal(dot)1982(at)hotmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #7562: could not read block 0 in file "base/16385/16585": read only 0 of 8192 bytes
Date: 2012-09-21 11:48:42
Message-ID: COL002-W741D308FF6303B9489EA9CD5990@phx.gbl

No, this is the first time I've seen this issue. In the past I have also reindexed the tables, and that worked well.
BTW, I'm now resetting the database to start fresh.

Regards,
Mayank Mittal
Barco Electronics System Ltd.
Mob. +91 9873437922
