Re: Multicolumn index corruption on 8.4 beta 2

Lists: pgsql-hackers
From: Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-09 00:32:46
Message-ID: 4A2DADAE.9000206@je-eigen-domein.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I pgdump'ed a 8.3.7 database and loaded the dump to a different server
running PostgreSQL 8.4 beta 2 (compiled from source) under Opensolaris.

One of the tables has about 6 million records, and a Btree index that
spans 3 columns.

I am having the problem that some queries are unable to find rows when
using the index.
When I force a sequential scan, by doing "set enable_indexscan=false;
set enable_bitmapscan=false;", the same queries work fine.

In addition, while running "vacuum full analyze" I got the following
error a couple times:

==
ERROR: failed to re-find parent key in index "pgb_idx" for deletion
target page 25470
===

Doing "reindex" or dropping and creating the index, makes the error go
away for a while.
However it does not solve the problem of the missing rows, making me
believe the index Postgresql generates is still corrupt.

According to memtest the memory of the server is fine, and according to
"zpool status" there are no disk or ZFS checksum errors.

Any idea how to solve or debug this issue?

Yours sincerely,

Floris Bos


From: Richard Huxton <dev(at)archonet(dot)com>
To: Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-09 08:43:25
Message-ID: 4A2E20AD.80601@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Floris Bos / Maxnet wrote:
> I am having the problem that some queries are unable to find rows when
> using the index.
> When I force a sequential scan, by doing "set enable_indexscan=false;
> set enable_bitmapscan=false;", the same queries work fine.

Not a hacker myself, but I can tell you that the first question you'll
be asked is "can you produce a test case"? If you can generate the
problem from a test table+generated data that will let people figure out
the problem for you.

If not, details of the table schema will be needed, and is there any
pattern to the missed rows? Also - compile settings, character set and
locale details might be relevant too.

--
Richard Huxton
Archonet Ltd


From: Greg Stark <greg(dot)stark(at)enterprisedb(dot)com>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-09 08:54:56
Message-ID: 4DC8DD1D-A337-4BCF-94BC-6AB9C9EDAFD5@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

And can you post an explain plan for the incorrect scan? In particular
is it using a bitmap index scan or a regular index scan? Or does it
happen with either?

--
Greg

On 9 Jun 2009, at 09:43, Richard Huxton <dev(at)archonet(dot)com> wrote:

> Floris Bos / Maxnet wrote:
>> I am having the problem that some queries are unable to find rows
>> when using the index.
>> When I force a sequential scan, by doing "set
>> enable_indexscan=false; set enable_bitmapscan=false;", the same
>> queries work fine.
>
> Not a hacker myself, but I can tell you that the first question
> you'll be asked is "can you produce a test case"? If you can
> generate the problem from a test table+generated data that will let
> people figure out the problem for you.
>
> If not, details of the table schema will be needed, and is there any
> pattern to the missed rows? Also - compile settings, character set
> and locale details might be relevant too.
>
> --
> Richard Huxton
> Archonet Ltd
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-09 11:40:28
Message-ID: 4A2E4A2C.3090207@je-eigen-domein.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Richard Huxton wrote:
> Not a hacker myself, but I can tell you that the first question you'll
> be asked is "can you produce a test case"? If you can generate the
> problem from a test table+generated data that will let people figure out
> the problem for you.

Unfortunately, I have not been able to produce a test case (yet) on a
small data set.
While the data in the database is public information, the whole database
is about 100 GB, and therefore kinda hard to share.

> If not, details of the table schema will be needed, and is there any
> pattern to the missed rows? Also - compile settings, character set and
> locale details might be relevant too.

==
Compile settings
==

No fancy settings.
- Clean Opensolaris 2009.06 installation
- Installed gcc and gmake packages.
- Downloaded source and did a ./configure --disable-readline ; gmake ;
gmake install

==
Postgresql settings
==

The following settings differ from the defaults:

--
shared_buffers=3500MB
maintenance_work_mem = 128MB
fsync = off
synchronous_commit = off
checkpoint_segments = 25
--

The locale used when creating the database is SQL_ASCII

==
Hardware
==

Tyan barebone
2x Opteron 2376 quadcore
32 GB reg ecc memory
1x Intel X25-E 32 GB SSD for OS and pg_xlog directory
2x Intel X25-E 64 GB SSD (ZFS striping) for the database

==
Table layout
==

--
Table "public.posts_index"
Column | Type |
Modifiers
------------+------------------------+-----------------------------------------------------------
cid | integer | not null default
nextval('posts_index_cid
_seq'::regclass)
groupid | integer | not null
startdate | integer | not null
poster | character varying(64) | not null
basefile | character varying(64) | not null
subject | character varying(255) | not null
size | real |
nfo | boolean |
c | boolean |
parts | integer |
totalparts | integer |
imdb | integer |
ng1 | boolean | default false
g2 | integer | default 0
g3 | integer | default 0
data | bytea |
Indexes:
"posts_index5_pkey" PRIMARY KEY, btree (cid) CLUSTER
"gr_idx" btree (groupid, (- cid))
"pgb_idx" btree (poster, groupid, basefile)
--

Only noticed problems with the pgb_idx index so far.

The problem only occurs on a subset of the rows, at a time.
After adding/updating rows and doing a reindex, the rows that were
missing before sometimes suddenly do work, but then different ones do not.

> And can you post an explain plan for the incorrect scan? In particular is it using a bitmap index scan or a regular index scan? Or does it happen with either?

Happens with both.

Index scan:

===
=> explain SELECT count(*) FROM posts_index WHERE
poster='Yenc(at)power-post(dot)org (Yenc-PP-A&A)' AND groupid=300 AND
basefile='NIB8124849';

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=11.25..11.26 rows=1 width=0)
-> Index Scan using pgb_idx on posts_index (cost=0.00..11.25
rows=1 width=0)
Index Cond: (((poster)::text = 'Yenc(at)power-post(dot)org
(Yenc-PP-A&A)'::text) AND (groupid = 300) AND ((basefile)::text =
'NIB8124849'::text))

=> SELECT count(*) FROM posts_index WHERE poster='Yenc(at)power-post(dot)org
(Yenc-PP-A&A)' AND groupid=300 AND basefile='NIB8124849';
count
-------
0
===

When I disable index scan, it uses bitmap without luck:

==
=> set enable_indexscan=false;
SET
=> explain SELECT count(*) FROM posts_index WHERE
poster='Yenc(at)power-post(dot)org (Yenc-PP-A&A)' AND groupid=300 AND
basefile='NIB8124849';

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=11.26..11.27 rows=1 width=0)
-> Bitmap Heap Scan on posts_index (cost=7.24..11.26 rows=1 width=0)
Recheck Cond: (((poster)::text = 'Yenc(at)power-post(dot)org
(Yenc-PP-A&A)'::text) AND (groupid = 300) AND ((basefile)::text =
'NIB8124849'::text))
-> Bitmap Index Scan on pgb_idx (cost=0.00..7.24 rows=1 width=0)
Index Cond: (((poster)::text = 'Yenc(at)power-post(dot)org
(Yenc-PP-A&A)'::text) AND (groupid = 300) AND ((basefile)::text =
'NIB8124849'::text))

=> SELECT count(*) FROM posts_index WHERE poster='Yenc(at)power-post(dot)org
(Yenc-PP-A&A)' AND groupid=300 AND basefile='NIB8124849';
count
-------
0
==

Sequential scan does find the row:

==
=> set enable_indexscan=false;
SET
=> set enable_bitmapscan=false;
SET
=> explain SELECT count(*) FROM posts_index WHERE
poster='Yenc(at)power-post(dot)org (Yenc-PP-A&A)' AND groupid=300 AND
basefile='NIB8124849';

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=288153.28..288153.29 rows=1 width=0)
-> Seq Scan on posts_index (cost=0.00..288153.28 rows=1 width=0)
Filter: (((poster)::text = 'Yenc(at)power-post(dot)org
(Yenc-PP-A&A)'::text) AND (groupid = 300) AND ((basefile)::text =
'NIB8124849'::text))
(3 rows)

=> SELECT count(*) FROM posts_index WHERE poster='Yenc(at)power-post(dot)org
(Yenc-PP-A&A)' AND groupid=300 AND basefile='NIB8124849';
count
-------
1
==

Yours sincerely,

Floris Bos


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-09 14:13:47
Message-ID: 17692.1244556827@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl> writes:
> Richard Huxton wrote:
>> Not a hacker myself, but I can tell you that the first question you'll
>> be asked is "can you produce a test case"? If you can generate the
>> problem from a test table+generated data that will let people figure out
>> the problem for you.

> Unfortunately, I have not been able to produce a test case (yet) on a
> small data set.
> While the data in the database is public information, the whole database
> is about 100 GB, and therefore kinda hard to share.

Seems like we'd only need a dump of the one problem table, not the
entire database.

regards, tom lane


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-09 16:07:35
Message-ID: 1244563655.15799.324.camel@ebony.2ndQuadrant
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Tue, 2009-06-09 at 13:40 +0200, Floris Bos / Maxnet wrote:

> fsync = off

That's a bad plan if you care about your database.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support


From: Florian Weimer <fweimer(at)bfk(dot)de>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-09 16:12:01
Message-ID: 82zlche69a.fsf@mid.bfk.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Simon Riggs:

> On Tue, 2009-06-09 at 13:40 +0200, Floris Bos / Maxnet wrote:
>
>> fsync = off
>
> That's a bad plan if you care about your database.

It shouldn't introduce this type of corruption, though.

--
Florian Weimer <fweimer(at)bfk(dot)de>
BFK edv-consulting GmbH http://www.bfk.de/
Kriegsstraße 100 tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99


From: Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-09 16:29:09
Message-ID: 4A2E8DD5.4050004@je-eigen-domein.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Simon Riggs wrote:
> On Tue, 2009-06-09 at 13:40 +0200, Floris Bos / Maxnet wrote:
>> fsync = off
>
> That's a bad plan if you care about your database.

I am aware of the risk of dataloss in case of power failure, etc.

However fsync=on is simply too slow for my purpose, and it concerns data
that can be regenerated from its source.

The website this setup is for has been running various previous versions
of PostgreSQL with fsync=off since 2005.
So I still expect it to work.

Yours sincerely,

Floris Bos


From: Greg Stark <greg(dot)stark(at)enterprisedb(dot)com>
To: Florian Weimer <fweimer(at)bfk(dot)de>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-09 16:42:52
Message-ID: C22FE0E4-8BC7-4E05-8279-8C19006DC5B6@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Well sure it could -- once. It wouldn't be reproducible in a freshly
rebuilt index unless he's crashing his machine every time.

--
Greg

On 9 Jun 2009, at 17:12, Florian Weimer <fweimer(at)bfk(dot)de> wrote:

> * Simon Riggs:
>
>> On Tue, 2009-06-09 at 13:40 +0200, Floris Bos / Maxnet wrote:
>>
>>> fsync = off
>>
>> That's a bad plan if you care about your database.
>
> It shouldn't introduce this type of corruption, though.
>
> --
> Florian Weimer <fweimer(at)bfk(dot)de>
> BFK edv-consulting GmbH http://www.bfk.de/
> Kriegsstraße 100 tel: +49-721-96201-1
> D-76133 Karlsruhe fax: +49-721-96201-99
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-09 17:14:33
Message-ID: 4A2E9879.2000503@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Floris,

> The website this setup is for has been running various previous versions
> of PostgreSQL with fsync=off since 2005.
> So I still expect it to work.

You've been lucky, that's all.

Our documentation has been clear, back to version 7.0, that turning
fsync=off carries the risk that you will have to recreate your entire
database in the event of unexpected shutdown. That's not new.

So, the operative question is: was 8.4 shut down with -immediate or
otherwise unexpectedly? If so, then we don't have a bug. If 8.4 was
never shut down, then we have some strange behavior which bears looking
into. And you've found a wierd corner case, which is what we count on
our users for.

Thanks for testing.

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


From: Florian Weimer <fweimer(at)bfk(dot)de>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-09 17:19:10
Message-ID: 82ski9e35d.fsf@mid.bfk.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Josh Berkus:

> Our documentation has been clear, back to version 7.0, that turning
> fsync=off carries the risk that you will have to recreate your entire
> database in the event of unexpected shutdown. That's not new.

The documentation does not say this. Instead, there's the following
rather explicit explanation that only OS crashes matter:

| (Crashes of the database software itself are not a risk factor
| here. Only an operating-system-level crash creates a risk of
| corruption.)

If it really matters how PostgreSQL is shut down in "fsync = off" mode
(while the operating system keeps running), the documentation is
seriously wrong here.

--
Florian Weimer <fweimer(at)bfk(dot)de>
BFK edv-consulting GmbH http://www.bfk.de/
Kriegsstraße 100 tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Florian Weimer <fweimer(at)bfk(dot)de>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-09 17:37:35
Message-ID: 20090609173734.GF5938@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Florian Weimer wrote:
> * Josh Berkus:
>
> > Our documentation has been clear, back to version 7.0, that turning
> > fsync=off carries the risk that you will have to recreate your entire
> > database in the event of unexpected shutdown. That's not new.
>
> The documentation does not say this. Instead, there's the following
> rather explicit explanation that only OS crashes matter:
>
> | (Crashes of the database software itself are not a risk factor
> | here. Only an operating-system-level crash creates a risk of
> | corruption.)
>
> If it really matters how PostgreSQL is shut down in "fsync = off" mode
> (while the operating system keeps running), the documentation is
> seriously wrong here.

Yeah, AFAICT the writes are handed off to the operating system (just not
synced), so if it flushes its caches sanely at all there shouldn't be a
problem.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Florian Weimer" <fweimer(at)bfk(dot)de>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>
Cc: "Josh Berkus" <josh(at)agliodbs(dot)com>, "Floris Bos / Maxnet" <bos(at)je-eigen-domein(dot)nl>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-09 17:52:24
Message-ID: 4A2E5B080200002500027776@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:

> Yeah, AFAICT the writes are handed off to the operating system (just
> not synced), so if it flushes its caches sanely at all there
> shouldn't be a problem.

I would certainly *hope* that's the case. We sometimes use fsync=off
for conversions, where we plan to just start over if the conversion
crashes, and set it to on when the conversion is done. It would be
disturbing to discover that fsync=off also means "don't bother to
write dirty buffers to the OS before shutdown."

-Kevin


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Florian Weimer <fweimer(at)bfk(dot)de>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-09 17:57:49
Message-ID: 4A2EA29D.800@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro, Kevin,

>> Yeah, AFAICT the writes are handed off to the operating system (just
>> not synced), so if it flushes its caches sanely at all there
>> shouldn't be a problem.
>
> I would certainly *hope* that's the case. We sometimes use fsync=off
> for conversions, where we plan to just start over if the conversion
> crashes, and set it to on when the conversion is done. It would be
> disturbing to discover that fsync=off also means "don't bother to
> write dirty buffers to the OS before shutdown."

It doesn't. But what I don't trust, and the *first* place I'd look for
problems, is whether the OS flushes *all* dirty buffers to disk in the
event the application gets killed.

That's why I want more information on Floris' case. Was 8.4 killed or
shut down with -m immediate? Or the os rebooted with 8.4 running?

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Florian Weimer <fweimer(at)bfk(dot)de>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-09 18:50:59
Message-ID: 25575.1244573459@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> It doesn't. But what I don't trust, and the *first* place I'd look for
> problems, is whether the OS flushes *all* dirty buffers to disk in the
> event the application gets killed.

Why wouldn't you trust it? The sort of thing you seem to be thinking
about would require tracking which process(es) wrote each dirty buffer
and then going back and dropping selected dirty buffers when a process
exits abnormally. I can hardly imagine any OS wishing to do that.

regards, tom lane


From: Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-09 19:14:56
Message-ID: 4A2EB4B0.7090700@je-eigen-domein.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Josh Berkus wrote:
> It doesn't. But what I don't trust, and the *first* place I'd look for
> problems, is whether the OS flushes *all* dirty buffers to disk in the
> event the application gets killed.
>
> That's why I want more information on Floris' case. Was 8.4 killed or
> shut down with -m immediate? Or the os rebooted with 8.4 running?

The only reboots I have done on that server were with the "reboot"
system command, which should send a SIGTERM to all processes first
including PostgreSQL, before pulling the plug.

I do recall that during the execution of "vacuum full" the psql client
program once did report that it lost connection with the server, but was
able to reconnect. Maybe the server processes handling the connection
died then, but I am not sure of that, and it only happened once.

Anyway, the problem also occurs when there is no reboot or unexpected
event between the reindex and the query.

After a REINDEX it is able to find the row it was missing first, but
then other rows become missing.

All in the same psql session:

===
usenet=> SELECT count(*) FROM posts_index WHERE
poster='Yenc(at)power-post(dot)org (Yenc-PP-A&A)' AND groupid=300 AND
basefile='NIB8124849';
count
-------
0
(1 row)

usenet=> reindex index pgb_idx;
REINDEX
usenet=> SELECT count(*) FROM posts_index WHERE
poster='Yenc(at)power-post(dot)org (Yenc-PP-A&A)' AND groupid=300 AND
basefile='NIB8124849';
count
-------
1
(1 row)

usenet=> SELECT count(*) FROM posts_index WHERE
poster='Yenc(at)power-post(dot)org (Yenc-PP-A&A)' AND basefile='frx-fffe' AND
groupid=757;
count
-------
0
(1 row)

usenet=> set enable_indexscan=false;
SET
usenet=> set enable_bitmapscan=false;
SET
usenet=> SELECT count(*) FROM posts_index WHERE
poster='Yenc(at)power-post(dot)org (Yenc-PP-A&A)' AND basefile='frx-fffe' AND
groupid=757;
count
-------
1
(1 row)

===

Yours sincerely,

Floris Bos


From: Andy Colson <andy(at)squeakycode(dot)net>
To: Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-10 13:51:00
Message-ID: 4A2FBA44.6050807@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Floris Bos / Maxnet wrote:
> The following settings differ from the defaults:
>
> --
> shared_buffers=3500MB
> maintenance_work_mem = 128MB
> fsync = off
> synchronous_commit = off
> checkpoint_segments = 25
> --
>
>
> ==
> Table layout
> ==
>
> --
> Table "public.posts_index"
> Column | Type | Modifiers
> ------------+------------------------+-----------------------------------------------------------
>
> cid | integer | not null default
> nextval('posts_index_cid
> _seq'::regclass)
> groupid | integer | not null
> startdate | integer | not null
> poster | character varying(64) | not null
> basefile | character varying(64) | not null
> subject | character varying(255) | not null
> size | real |
> nfo | boolean |
> c | boolean |
> parts | integer |
> totalparts | integer |
> imdb | integer |
> ng1 | boolean | default false
> g2 | integer | default 0
> g3 | integer | default 0
> data | bytea |
> Indexes:
> "posts_index5_pkey" PRIMARY KEY, btree (cid) CLUSTER
> "gr_idx" btree (groupid, (- cid))
> "pgb_idx" btree (poster, groupid, basefile)
> --
>
> Only noticed problems with the pgb_idx index so far.

I have been trying to reproduce the problem but no success so far. I
made myself a table that matches yours, then I wrote a little perl
script to fill it with random data. (The script also writes out a text
file I can use to re-query things).

I fill the db, then add the indexes. Then I test lookup every record I
added, and find them all.

So, a few questions:

1) did you dump/restore into 8.4beta1 first and then upgrade the
program? Or did you dump/restore into 8.4beta2?

2) did you use any of the concurrent restore options?

3) do you do any updates or deletes to the table after you restore it?

4) do you do any other operations on the table (vacuum, cluster, etc..)?

5) got any triggers or stored procs?

6) To the -hackers: I write the records and then refind them in the
exact same order, would it be a better test to search for records in a
more random order? would it make a difference? Would searching for
some but not all make a difference?

-Andy


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Andy Colson <andy(at)squeakycode(dot)net>
Cc: Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>, pgsql-hackers(at)postgresql(dot)org, Zdenek Kotala <zdenek(dot)kotala(at)gmail(dot)com>, "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-10 18:01:28
Message-ID: 4A2FF4F8.1000208@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andy,

> 6) To the -hackers: I write the records and then refind them in the
> exact same order, would it be a better test to search for records in a
> more random order? would it make a difference? Would searching for some
> but not all make a difference?

Are you on OpenSolaris? Can you give your script to Zdenek & Jignesh to
test in their enviroments?

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


From: Andy Colson <andy(at)squeakycode(dot)net>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Zdenek Kotala <zdenek(dot)kotala(at)gmail(dot)com>, "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-10 18:20:50
Message-ID: 4A2FF982.7010400@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Josh Berkus wrote:
> Andy,
>
>> 6) To the -hackers: I write the records and then refind them in the
>> exact same order, would it be a better test to search for records in a
>> more random order? would it make a difference? Would searching for some
>> but not all make a difference?
>
> Are you on OpenSolaris? Can you give your script to Zdenek & Jignesh to
> test in their enviroments?
>

I am not, and yes I can.

Hopefully I can attach a .tar.bz2

The bigtest.sh is the one to run (it runs all the parts). You'll need
to edit fill.pl and test.pl and set the dbname and maybe give a
username/password.

In the fill.pl there is a $max variable that's used to set the number of
records to insert. (its set to 10 million right now)

Oh, this .tar wont create a subdirectory

-Andy

Attachment Content-Type Size
test.tar.bz2 application/octet-stream 2.0 KB

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Andy Colson <andy(at)squeakycode(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Zdenek Kotala <zdenek(dot)kotala(at)gmail(dot)com>, "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-10 18:26:23
Message-ID: 4A2FFACF.9060105@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Floris,

One more question set: what version of OpenSolaris, and what filesystem
are you using? Does the OS have any non-default tuning settings? How
did you install or compile PostgreSQL?

--
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Multicolumn index corruption on 8.4 beta 2
Date: 2009-06-10 19:42:44
Message-ID: 11966.1244662964@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl> writes:
> Hi,
> Tom Lane wrote:
>> Floris Bos / Maxnet <bos(at)je-eigen-domein(dot)nl> writes:
>>> postgres(at)db:/data$ /opt/postgres/8.4-beta/bin/64/initdb -E SQL_ASCII -X
>>> /data/pg_xlog /data/db
>>> The database cluster will be initialized with locale en_US.UTF-8.
>>
>> Oooh, that doesn't look real good. You're going to be using strcoll()
>> comparisons that assume the data is in UTF8, but the database is not
>> enforcing valid UTF8 encoding. I have not checked the dump to see if
>> it's all valid data, but this could be the root of the issue.
>>
>> If you want to use SQL_ASCII because the data isn't uniformly encoded,
>> it'd be better to use C locale.

> Darn.
> Looks like you are right!
> Works a lot better with "--locale=C"

> My 8.3 PostgreSQL installation ran under FreeBSD, and there the locale
> is C by default:
> So I was not used to have to add a "--locale=C" option.
> Under Opensolaris it's indeed UTF-8 by default.

Yeah, this is kind of unfortunate. I'm not sure there is much we could
do about it, unless we want to insist that C locale be used if the
database encoding is SQL_ASCII. That cure seems worse than the disease
though. We have locked down encoding/locale combinations pretty
strictly for 8.4, but SQL_ASCII is generally thought to be a "let the
user beware" setting.

regards, tom lane