Re: cvs head initdb hangs on unixware

Lists: pgsql-hackers
From: ohp(at)pyrenet(dot)fr
To: pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: cvs head initdb hangs on unixware
Date: 2008-12-02 15:23:26
Message-ID: Pine.UW2.4.63.0812021609240.12549@sun.pyrenet

Hi all,

CVS HEAD configured without --enable-debug hangs in initdb while running
make check.

warthog doesn't exhibit it because it's configured with debug.

When it hangs, postmaster takes 100% CPU doing nothing. initdb waits for
it while creating the template db.

According to truss, the last useful thing postmaster does is write 8K of
zeroes to disk.

If someone needs access to a UnixWare machine, let me know.

regards,

--
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
15, Chemin des Monges +33-5-61-50-97-01 (Fax)
31190 AUTERIVE +33-6-07-63-80-64 (GSM)
FRANCE Email: ohp(at)pyrenet(dot)fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: ohp(at)pyrenet(dot)fr
Cc: pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-02 16:22:25
Message-ID: 493560C1.7050402@sun.com
Lists: pgsql-hackers

Could you generate a core and send a stacktrace?

kill -ABRT <pid> should do that.

Zdenek

ohp(at)pyrenet(dot)fr wrote:
> Hi all,
>
> CVS HEAD configured without --enable-debug hangs in initdb while running
> make check.
>
> warthog doesn't exhibit it because it's configured with debug.
>
> When it hangs, postmaster takes 100% CPU doing nothing. initdb waits for
> it while creating the template db.
>
> According to truss, the last useful thing postmaster does is write 8K of
> zeroes to disk.
>
> If someone needs access to a UnixWare machine, let me know.
>
> regards,
>


From: ohp(at)pyrenet(dot)fr
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-02 16:55:35
Message-ID: Pine.UW2.4.63.0812021752330.24447@sun.pyrenet
Lists: pgsql-hackers

On Tue, 2 Dec 2008, Zdenek Kotala wrote:

> Could you generate a core and send a stacktrace?
>
> kill -ABRT <pid> should do that.
>
> Zdenek

Hmm. No point doing that: the build is not debug-enabled, so I'm afraid a
stack trace won't show us anything useful.

--
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
15, Chemin des Monges +33-5-61-50-97-01 (Fax)
31190 AUTERIVE +33-6-07-63-80-64 (GSM)
FRANCE Email: ohp(at)pyrenet(dot)fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)


From: ohp(at)pyrenet(dot)fr
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-02 17:32:49
Message-ID: Pine.UW2.4.63.0812021828130.1560@sun.pyrenet
Lists: pgsql-hackers

On Tue, 2 Dec 2008, Zdenek Kotala wrote:

> Could you generate a core and send a stacktrace?
>
> kill -ABRT <pid> should do that.
>
> Zdenek

Zdenek,

On second thought, I tried it and got this:
Stack trace for p1, program postmaster
*[0] fsm_rebuild_page( presumed: 0xbd9731a0, 0, 0xbd9731a0) [0x81e6a97]
[1] fsm_search_avail( presumed: 0x2, 0x6, 0x1) [0x81e68d9]
[2] fsm_set_and_search(0x84b2250, 0, 0, 0x2e, 0x5, 0x6, 0x2e, 0x8047416,
    0xb4) [0x81e6385]
[3] RecordAndGetPageWithFreeSpace(0x84b2250, 0x2e, 0xa0, 0xb4) [0x81e5a00]
[4] RelationGetBufferForTuple( presumed: 0x84b2250, 0xb4, 0) [0x8099b59]
[5] heap_insert(0x84b2250, 0x853a338, 0, 0, 0) [0x8097042]
[6] simple_heap_insert( presumed: 0x84b2250, 0x853a338, 0x853a310)
    [0x8097297]
[7] InsertOneTuple( presumed: 0xb80, 0x84057b0, 0x8452fb8) [0x80cb210]
[8] boot_yyparse( presumed: 0xffffffff, 0x3, 0x8047ab8) [0x80c822b]
[9] BootstrapModeMain( presumed: 0x66, 0x8454600, 0x4) [0x80ca233]
[10] AuxiliaryProcessMain(0x4, 0x8047ab4) [0x80cab3b]
[11] main(0x4, 0x8047ab4, 0x8047ac8) [0x8177dce]
[12] _start() [0x807ff96]

Seems interesting!

We've already had problems with the UnixWare optimizer; I hope this one is
fixable!

regards

--
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
15, Chemin des Monges +33-5-61-50-97-01 (Fax)
31190 AUTERIVE +33-6-07-63-80-64 (GSM)
FRANCE Email: ohp(at)pyrenet(dot)fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: ohp(at)pyrenet(dot)fr
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-02 18:47:19
Message-ID: 493582B7.1000508@enterprisedb.com
Lists: pgsql-hackers

ohp(at)pyrenet(dot)fr wrote:
> Stack trace for p1, program postmaster
> *[0] fsm_rebuild_page( presumed: 0xbd9731a0, 0, 0xbd9731a0) [0x81e6a97]
> [1] fsm_search_avail( presumed: 0x2, 0x6, 0x1) [0x81e68d9]
> [2] fsm_set_and_search(0x84b2250, 0, 0, 0x2e, 0x5, 0x6, 0x2e,
>     0x8047416, 0xb4) [0x81e6385]
> [3] RecordAndGetPageWithFreeSpace(0x84b2250, 0x2e, 0xa0, 0xb4) [0x81e5a00]
> [4] RelationGetBufferForTuple( presumed: 0x84b2250, 0xb4, 0) [0x8099b59]
> [5] heap_insert(0x84b2250, 0x853a338, 0, 0, 0) [0x8097042]
> [6] simple_heap_insert( presumed: 0x84b2250, 0x853a338, 0x853a310)
>     [0x8097297]
> [7] InsertOneTuple( presumed: 0xb80, 0x84057b0, 0x8452fb8) [0x80cb210]
> [8] boot_yyparse( presumed: 0xffffffff, 0x3, 0x8047ab8) [0x80c822b]
> [9] BootstrapModeMain( presumed: 0x66, 0x8454600, 0x4) [0x80ca233]
> [10] AuxiliaryProcessMain(0x4, 0x8047ab4) [0x80cab3b]
> [11] main(0x4, 0x8047ab4, 0x8047ac8) [0x8177dce]
> [12] _start() [0x807ff96]
>
> Seems interesting!
>
> We've already had problems with the UnixWare optimizer; I hope this one is
> fixable!

Looking at fsm_rebuild_page, I wonder if the compiler is treating "int"
as an unsigned integer? That would cause an infinite loop.
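
To make the hypothesis concrete: a countdown loop whose exit test is
"nodeno >= 0" only terminates if the counter is signed. The following is a
minimal sketch, not the PostgreSQL source; the function names and the "cap"
safety limit are hypothetical and exist only so that the demo itself cannot
hang.

```c
#include <assert.h>

/*
 * Count how often the body of a countdown loop runs with a signed counter.
 * 'cap' is a demo-only safety limit (hypothetical, not part of PostgreSQL).
 */
int
countdown_signed(int start, int cap)
{
    int iterations = 0;
    int nodeno;

    assert(start >= 0 && cap > 0);
    for (nodeno = start; nodeno >= 0; nodeno--)
    {
        if (++iterations >= cap)
            break;              /* not reached when start + 1 < cap */
    }
    return iterations;
}

/*
 * Same loop with an unsigned counter: "nodeno >= 0" is always true, and
 * decrementing 0 wraps around to the largest unsigned value, so only the
 * cap ever stops the loop.
 */
int
countdown_unsigned(unsigned int start, int cap)
{
    int          iterations = 0;
    unsigned int nodeno;

    assert(cap > 0);
    for (nodeno = start; nodeno >= 0; nodeno--)
    {
        if (++iterations >= cap)
            break;              /* the only way out */
    }
    return iterations;
}
```

With a start value of 4 and a generous cap, the signed variant runs five
times (4, 3, 2, 1, 0) and stops, while the unsigned variant runs until the
cap is hit.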

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: ohp(at)pyrenet(dot)fr
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-03 13:13:01
Message-ID: Pine.UW2.4.63.0812031407440.12249@sun.pyrenet
Lists: pgsql-hackers

On Tue, 2 Dec 2008, Heikki Linnakangas wrote:

> Looking at fsm_rebuild_page, I wonder if the compiler is treating "int" as an
> unsigned integer? That would cause an infinite loop.
>
>
No, a simple printf of nodeno shows it going from 4096 all the way down
to 0, then starting back at 4096...

I wonder if the leftchild/rightchild definitions have something to do with
it...
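
For readers following along, here is a minimal sketch of what a rebuild pass
over an array-backed binary tree looks like. It is not copied from the
PostgreSQL tree: the macro names mirror the ones mentioned above, but the
definitions are the textbook array-heap formulas and are an assumption, and
the toy sizes (7 inner nodes, 8 leaves) and the name toy_rebuild are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Textbook array-heap index arithmetic (assumed, see the note above). */
#define leftchild(x)   (2 * (x) + 1)
#define rightchild(x)  (2 * (x) + 2)

/* Toy sizes: 7 non-leaf nodes followed by 8 leaves (hypothetical). */
#define NON_LEAF_NODES 7
#define TOTAL_NODES    15

/*
 * Recompute every non-leaf node as the maximum of its children, walking
 * from the last non-leaf node back to the root. Returns true if any node
 * had to be changed.
 */
bool
toy_rebuild(uint8_t nodes[TOTAL_NODES])
{
    bool changed = false;
    int  nodeno;                /* signed on purpose: the loop ends at -1 */

    assert(nodes != NULL);
    for (nodeno = NON_LEAF_NODES - 1; nodeno >= 0; nodeno--)
    {
        int     lchild = leftchild(nodeno);
        int     rchild = rightchild(nodeno);
        uint8_t newvalue = 0;

        if (lchild < TOTAL_NODES)
            newvalue = nodes[lchild];
        if (rchild < TOTAL_NODES && nodes[rchild] > newvalue)
            newvalue = nodes[rchild];

        if (nodes[nodeno] != newvalue)
        {
            nodes[nodeno] = newvalue;
            changed = true;
        }
    }
    return changed;
}
```

In this sketch a single pass counts down to 0 exactly once, and a second
pass over the same array changes nothing. A counter that reaches 0 and then
starts again at the top is therefore consistent with the function being
called repeatedly rather than with the loop itself failing to terminate.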

--
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
15, Chemin des Monges +33-5-61-50-97-01 (Fax)
31190 AUTERIVE +33-6-07-63-80-64 (GSM)
FRANCE Email: ohp(at)pyrenet(dot)fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: ohp(at)pyrenet(dot)fr
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-03 18:13:59
Message-ID: 4936CC67.5090206@dunslane.net
Lists: pgsql-hackers

ohp(at)pyrenet(dot)fr wrote:
>>
>> Looking at fsm_rebuild_page, I wonder if the compiler is treating
>> "int" as an unsigned integer? That would cause an infinite loop.
>>
>>
> No, a simple printf of nodeno shows it going from 4096 all the way
> down to 0, then starting back at 4096...
>
> I wonder if the leftchild/rightchild definitions have something to do
> with it...

With probably no relevance at all, I notice that this routine is
declared extern, although it is only referenced in its own file
apparently. Don't we have a tool that checks that?

cheers

andrew


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: ohp(at)pyrenet(dot)fr
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-03 18:29:01
Message-ID: 4936CFED.3040107@enterprisedb.com
Lists: pgsql-hackers

ohp(at)pyrenet(dot)fr wrote:
> On Tue, 2 Dec 2008, Heikki Linnakangas wrote:
>
>> Looking at fsm_rebuild_page, I wonder if the compiler is treating
>> "int" as an unsigned integer? That would cause an infinite loop.
>>
> No, a simple printf of nodeno shows it starting at 4096 all the way
> down to 0, starting back at 4096...

Hmm, it's probably looping in fsm_search_avail then. In a fresh cluster,
there shouldn't be any broken FSM pages that need rebuilding.

I'd like to see what the FSM page in question looks like. Could you try
to run initdb with "-d -n" options? I bet you'll get an infinite number
of lines like:

DEBUG: fixing corrupt FSM block 1, relation 123/456/789

Could you zip up the FSM file of that relation (a file called e.g.
"789_fsm") and send it over? Or the whole data directory; it shouldn't
be that big.
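
The reason an endless stream of "fixing corrupt" lines is plausible can be
sketched as a search that rebuilds the tree and retries whenever a parent
promises more free space than its children hold. This is a simplified
illustration under assumptions, not the actual fsm_search_avail: the names
toy_search and toy_fix, the toy tree size and the MAX_RESTARTS guard are all
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NON_LEAF_NODES 7        /* toy tree: 7 inner nodes ...        */
#define TOTAL_NODES    15       /* ... followed by 8 leaves           */
#define MAX_RESTARTS   3        /* demo-only guard, hypothetical      */

/* Recompute every inner node as the maximum of its two children. */
static void
toy_fix(uint8_t nodes[TOTAL_NODES])
{
    int nodeno;

    for (nodeno = NON_LEAF_NODES - 1; nodeno >= 0; nodeno--)
    {
        uint8_t l = nodes[2 * nodeno + 1];
        uint8_t r = nodes[2 * nodeno + 2];

        nodes[nodeno] = (l > r) ? l : r;
    }
}

/*
 * Return the 0-based leaf slot holding a value >= minvalue, -1 if the page
 * has none, or -2 if the demo guard gave up. *restarts reports how many
 * times the tree had to be fixed.
 */
int
toy_search(uint8_t nodes[TOTAL_NODES], uint8_t minvalue, int *restarts)
{
    assert(nodes != NULL && restarts != NULL);
    *restarts = 0;

    for (;;)
    {
        int nodeno = 0;

        /* Quick exit: the root holds the maximum of the whole page. */
        if (nodes[0] < minvalue)
            return -1;

        /* Descend towards a leaf, preferring the left child. */
        while (nodeno < NON_LEAF_NODES)
        {
            int l = 2 * nodeno + 1;
            int r = l + 1;

            if (nodes[l] >= minvalue)
                nodeno = l;
            else if (nodes[r] >= minvalue)
                nodeno = r;
            else
                break;          /* parent promised more than children hold */
        }

        if (nodeno >= NON_LEAF_NODES)
            return nodeno - NON_LEAF_NODES;

        /* Inconsistent tree: fix it and start over from the root. */
        *restarts += 1;
        if (*restarts > MAX_RESTARTS)
            return -2;
        toy_fix(nodes);
    }
}
```

Without a guard like MAX_RESTARTS, such a loop spins forever if the search
still sees an inconsistent tree after every fix, for instance when the
descent is miscompiled, which would match both the 100% CPU usage and the
repeated DEBUG output.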

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: ohp(at)pyrenet(dot)fr, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-03 22:46:15
Message-ID: 200812032246.mB3MkFY02180@momjian.us
Lists: pgsql-hackers

Andrew Dunstan wrote:
>
>
> ohp(at)pyrenet(dot)fr wrote:
> >>
> >> Looking at fsm_rebuild_page, I wonder if the compiler is treating
> >> "int" as an unsigned integer? That would cause an infinite loop.
> >>
> >>
> > No, a simple printf of nodeno shows it going from 4096 all the way
> > down to 0, then starting back at 4096...
> >
> > I wonder if the leftchild/rightchild definitions have something to do
> > with it...
>
> With probably no relevance at all, I notice that this routine is
> declared extern, although it is only referenced in its own file
> apparently. Don't we have a tool that checks that?

Sure, src/tools/find_static.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: ohp(at)pyrenet(dot)fr
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-04 10:57:52
Message-ID: Pine.UW2.4.63.0812041150480.26968@sun.pyrenet
Lists: pgsql-hackers

On Wed, 3 Dec 2008, Heikki Linnakangas wrote:

> ohp(at)pyrenet(dot)fr wrote:
>> On Tue, 2 Dec 2008, Heikki Linnakangas wrote:
>>
>>> Looking at fsm_rebuild_page, I wonder if the compiler is treating "int" as
>>> an unsigned integer? That would cause an infinite loop.
>>>
>> No, a simple printf of nodeno shows it starting at 4096 all the way down
>> to 0, starting back at 4096...
>
> Hmm, it's probably looping in fsm_search_avail then. In a fresh cluster,
> there shouldn't be any broken FSM pages that need rebuilding.
You're right!
>
> I'd like to see what the FSM page in question looks like. Could you try to
> run initdb with "-d -n" options? I bet you'll get an infinite number of lines
> like:
>
> DEBUG: fixing corrupt FSM block 1, relation 123/456/789
>
Right again!
DEBUG: fixing corrupt FSM block 2, relation 1663/1/1255

> Could you zip up the FSM file of that relation (a file called e.g
> "789_fsm"), and send it over? Or the whole data directory, it shouldn't be
> that big.
>
You get both.
BTW, this is an optimizer problem, not anything wrong with the code, but
I'd hate to have a -g compiled postmaster in prod :)
>

best regards,
--
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
15, Chemin des Monges +33-5-61-50-97-01 (Fax)
31190 AUTERIVE +33-6-07-63-80-64 (GSM)
FRANCE Email: ohp(at)pyrenet(dot)fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)

Attachment   Content-Type               Size
1255_fsm     application/octet-stream   24.0 KB
db.tgz       application/octet-stream   725 bytes

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: ohp(at)pyrenet(dot)fr
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-04 11:19:15
Message-ID: 4937BCB3.8060302@enterprisedb.com
Lists: pgsql-hackers

ohp(at)pyrenet(dot)fr wrote:
> On Wed, 3 Dec 2008, Heikki Linnakangas wrote:
>> Could you zip up the FSM file of that relation (a file called e.g
>> "789_fsm"), and send it over? Or the whole data directory, it
>> shouldn't be that big.
>>
> you get both.

Thanks. Hmm, the FSM pages are full of zeros, as I would expect for a
just-created relation. fsm_search_avail should've returned quickly at
the top of the function in that case. Can you put an extra printf or
something at the top of the function to print all the arguments, and
the value of fsmpage->fp_nodes[0]?
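
A tracing helper along these lines would be enough. This is only a sketch:
toy_trace_search and its parameters are hypothetical stand-ins and do not
reflect the real argument list of fsm_search_avail. It formats one line per
call and reports whether the quick exit at the top (root value below the
requested minimum) would be taken.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format one trace line for a search call into 'out'.
 * Returns 1 if the quick exit would be taken (root < minvalue), else 0.
 */
int
toy_trace_search(char *out, size_t outlen, unsigned minvalue, unsigned root)
{
    assert(out != NULL && outlen > 0);
    snprintf(out, outlen, "search: minvalue=%u fp_nodes[0]=%u",
             minvalue, root);
    return root < minvalue;
}
```

On an all-zero page the root is 0, so any non-zero minimum takes the quick
exit; the interesting call in a trace is the first one where that does not
happen.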

> BTW, this is an optimizer problem, not anything wrong with the code, but
> I'd hate to have a -g compiled postmaster in prod :)

Yes, so it seems, although I wouldn't be surprised if it turns out to be
a bug in the new FSM code, either.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: ohp(at)pyrenet(dot)fr
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-04 13:17:06
Message-ID: Pine.UW2.4.63.0812041412590.7861@sun.pyrenet
Lists: pgsql-hackers

On Thu, 4 Dec 2008, Heikki Linnakangas wrote:

> ohp(at)pyrenet(dot)fr wrote:
>> On Wed, 3 Dec 2008, Heikki Linnakangas wrote:
>>> Could you zip up the FSM file of that relation (a file called e.g
>>> "789_fsm"), and send it over? Or the whole data directory, it shouldn't be
>>> that big.
>>>
>> you get both.
>
> Thanks. Hmm, the FSM pages are full of zeros, as I would expect for a
> just-created relation. fsm_search_avail should've returned quickly at the top
> of the function in that case. Can you put a extra printf or something at the
> top of the function, to print all the arguments? And the value of
> fsmpage->fp_nodes[0].
>
>> BTW, this is an optimizer problem, not anything wrong with the code, but
>> I'd hate to have a -g compiled postmaster in prod :)
>
> Yes, so it seems, although I wouldn't be surprised if it turns out to be a
> bug in the new FSM code either..
As you can see in the attached initdb.log, it seems fsm_search_avail is called
repeatedly and the args are sort of looping...

--
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
15, Chemin des Monges +33-5-61-50-97-01 (Fax)
31190 AUTERIVE +33-6-07-63-80-64 (GSM)
FRANCE Email: ohp(at)pyrenet(dot)fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)

Attachment   Content-Type   Size
initdb.log   text/plain     10.8 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ohp(at)pyrenet(dot)fr
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-08 02:57:21
Message-ID: 5692.1228705041@sss.pgh.pa.us
Lists: pgsql-hackers

ohp(at)pyrenet(dot)fr writes:
> As you can see in attached initdb.log, it seems fsm_search_avail is called
> repeatedly and args are sort of looping...

That's expected, since the system is inserting a lot of tuples
successively. What it looks like to me is that the failing call is the
first one where the initial test *doesn't* result in falling out
immediately. So the probability is that there's something wrong with
the code that descends the tree.

Note that the all-zeroes pages in your dump are uninformative because
none of the real FSM data has been written to disk yet. We can see
from this trace that the code is dealing with not-all-zero pages.

regards, tom lane


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: ohp(at)pyrenet(dot)fr, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-08 07:17:52
Message-ID: 493CCA20.4090408@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> ohp(at)pyrenet(dot)fr writes:
>> As you can see in attached initdb.log, it seems fsm_search_avail is called
>> repeatedly and args are sort of looping...
>
> That's expected, since the system is inserting a lot of tuples
> successively.

Right. I suspect it wasn't in the infinite loop yet. Try to run it for
*much* longer (it'll probably take much longer than usual because it's
printing all the debug stuff), until it gets stuck looping over the same
pages in the same relation.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: ohp(at)pyrenet(dot)fr
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-08 15:20:00
Message-ID: Pine.UW2.4.63.0812081552520.17458@sun.pyrenet
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dear all,
On Mon, 8 Dec 2008, Heikki Linnakangas wrote:

> Date: Mon, 08 Dec 2008 09:17:52 +0200
> From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
> To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> Cc: ohp(at)pyrenet(dot)fr, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>,
> pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
> Subject: Re: [HACKERS] cvs head initdb hangs on unixware
>
> Tom Lane wrote:
>> ohp(at)pyrenet(dot)fr writes:
>>> As you can see in attached initdb.log, it seems fsm_search_avail is called
>>> repeatedly and args are sort of looping...
>>
>> That's expected, since the system is inserting a lot of tuples
>> successively.
>
> Right. I suspect it wasn't in the infinite loop yet. Try to run it for *much*
> longer (it'll probably take much longer than usual because it's printing all
> the debug stuff), until it gets stuck looping over the same pages in the same
> relation.
>
The infinite loop occurs in fsm_search_avail when it is called for the 32nd
time.

It loops between the restart: label and the goto restart.

The long (95M) initdb.log can be found at
ftp://ftp.pyrenet.fr/private/initdb.log
>

regards,

--
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
15, Chemin des Monges +33-5-61-50-97-01 (Fax)
31190 AUTERIVE +33-6-07-63-80-64 (GSM)
FRANCE Email: ohp(at)pyrenet(dot)fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ohp(at)pyrenet(dot)fr
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-08 18:15:28
Message-ID: 28494.1228760128@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

ohp(at)pyrenet(dot)fr writes:
> the infinite loop occurs in fsm_search_avail when called for the 32nd
> time.

... which is the first time that the initial test doesn't make it fall
out immediately.

Would you add a couple more printouts, along the line of

    nodeno = target;
    while (nodeno > 0)
    {
+       fprintf(stderr, "ascend at node %d value %d\n",
+               nodeno, fsmpage->fp_nodes[nodeno]);

        if (fsmpage->fp_nodes[nodeno] >= minvalue)
            break;

        /*
         * Move to the right, wrapping around on same level if necessary,
         * then climb up.
         */
        nodeno = parentof(rightneighbor(nodeno));
    }

    /*
     * We're now at a node with enough free space, somewhere in the middle of
     * the tree. Descend to the bottom, following a path with enough free
     * space, preferring to move left if there's a choice.
     */
    while (nodeno < NonLeafNodesPerPage)
    {
        int     leftnodeno = leftchild(nodeno);
        int     rightnodeno = leftnodeno + 1;
        bool    leftok = (leftnodeno < NodesPerPage) &&
                         (fsmpage->fp_nodes[leftnodeno] >= minvalue);
        bool    rightok = (rightnodeno < NodesPerPage) &&
                          (fsmpage->fp_nodes[rightnodeno] >= minvalue);

+       fprintf(stderr, "descend at node %d value %d, leftnode %d value %d, rightnode %d value %d\n",
+               nodeno, fsmpage->fp_nodes[nodeno],
+               leftnodeno, fsmpage->fp_nodes[leftnodeno],
+               rightnodeno, fsmpage->fp_nodes[rightnodeno]);

        if (leftok)
            nodeno = leftnodeno;
        else if (rightok)
            nodeno = rightnodeno;
        else

(I'm assuming we can print possibly-off-the-end array elements without dumping
core; which is bogus in general but I expect we can get away with it
for this purpose.)

Also, we don't really need 94MB of log to convince us it's an
infinite loop ;-)

regards, tom lane
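
For readers without the source at hand, the ascend loop quoted above can be exercised on its own. What follows is only a minimal sketch, assuming a complete binary tree stored in an array with the root at index 0. The parentof and rightneighbor definitions and the ascend wrapper are illustrative assumptions, not code quoted in this thread.

```c
#include <assert.h>

/* Assumed layout: complete binary tree in an array, root at index 0. */
#define parentof(x) (((x) - 1) / 2)

/*
 * Step to the right neighbor on the same level; from the last node of a
 * level, wrap around to the first node of that same level.
 */
static int
rightneighbor(int x)
{
    x++;
    /* If x + 1 is a power of two, we spilled onto the next level down. */
    if (((x + 1) & x) == 0)
        x = parentof(x);
    return x;
}

/*
 * Ascend phase only: starting at node "target", climb until a node whose
 * value is at least "minvalue" is found, or the root is reached.
 */
static int
ascend(const unsigned char *nodes, int target, unsigned char minvalue)
{
    int nodeno = target;

    assert(target >= 0);
    while (nodeno > 0)
    {
        if (nodes[nodeno] >= minvalue)
            break;
        /* Move right (wrapping on the same level), then climb up. */
        nodeno = parentof(rightneighbor(nodeno));
    }
    return nodeno;
}
```

With a 7-node tree holding the values 5, 3, 5, 1, 3, 0, 5, starting at leaf 3 with minvalue 2 stops at node 1, while minvalue 4 climbs all the way to the root.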


From: ohp(at)pyrenet(dot)fr
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-09 13:23:16
Message-ID: Pine.UW2.4.63.0812091414520.29794@sun.pyrenet
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Tom,
On Mon, 8 Dec 2008, Tom Lane wrote:

> Date: Mon, 08 Dec 2008 13:15:28 -0500
> From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> To: ohp(at)pyrenet(dot)fr
> Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>,
> Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>,
> pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
> Subject: Re: [HACKERS] cvs head initdb hangs on unixware
>
> ohp(at)pyrenet(dot)fr writes:
>> the infinite loop occurs in fsm_search_avail when called for the 32nd
>> time.
>
> ... which is the first time that the initial test doesn't make it fall
> out immediately.
>
> Would you add a couple more printouts, along the line of
>
>
>     nodeno = target;
>     while (nodeno > 0)
>     {
> +       fprintf(stderr, "ascend at node %d value %d\n",
> +               nodeno, fsmpage->fp_nodes[nodeno]);
>
>         if (fsmpage->fp_nodes[nodeno] >= minvalue)
>             break;
>
>         /*
>          * Move to the right, wrapping around on same level if necessary,
>          * then climb up.
>          */
>         nodeno = parentof(rightneighbor(nodeno));
>     }
>
>     /*
>      * We're now at a node with enough free space, somewhere in the middle of
>      * the tree. Descend to the bottom, following a path with enough free
>      * space, preferring to move left if there's a choice.
>      */
>     while (nodeno < NonLeafNodesPerPage)
>     {
>         int     leftnodeno = leftchild(nodeno);
>         int     rightnodeno = leftnodeno + 1;
>         bool    leftok = (leftnodeno < NodesPerPage) &&
>                          (fsmpage->fp_nodes[leftnodeno] >= minvalue);
>         bool    rightok = (rightnodeno < NodesPerPage) &&
>                           (fsmpage->fp_nodes[rightnodeno] >= minvalue);
>
> +       fprintf(stderr, "descend at node %d value %d, leftnode %d value %d, rightnode %d value %d\n",
> +               nodeno, fsmpage->fp_nodes[nodeno],
> +               leftnodeno, fsmpage->fp_nodes[leftnodeno],
> +               rightnodeno, fsmpage->fp_nodes[rightnodeno]);
>
>         if (leftok)
>             nodeno = leftnodeno;
>         else if (rightok)
>             nodeno = rightnodeno;
>         else
>
> (I'm assuming we can print possibly-off-the-end array elements without dumping
> core; which is bogus in general but I expect we can get away with it
> for this purpose.)
>
> Also, we don't really need 94MB of log to convince us it's an
> infinite loop ;-)
oops, sorry
>
> regards, tom lane
>
I first misread your mail and added only the first fprintf; while I was
uploading a 400M initdb.log, I went back to add the second one.

Guess what! With the fprintf .. descending node... in place, everything
goes well. The optimizer definitely does something weird around the
definition/assignment of leftok/rightok.

--
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
15, Chemin des Monges +33-5-61-50-97-01 (Fax)
31190 AUTERIVE +33-6-07-63-80-64 (GSM)
FRANCE Email: ohp(at)pyrenet(dot)fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: ohp(at)pyrenet(dot)fr
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-09 13:45:42
Message-ID: 493E7686.6020001@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

ohp(at)pyrenet(dot)fr napsal(a):

>>
> I first misread your mail and added only the first fprintf; while I
> was uploading a 400M initdb.log, I went back to add the second one.
>
> Guess what! With the fprintf .. descending node... in place, everything
> goes well. The optimizer definitely does something weird around the
> definition/assignment of leftok/rightok.
>

Could you generate the assembler code of the fsmSearch function, with and
without optimization? Without the extra printf, of course :-). It should show
the difference.

Zdenek


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ohp(at)pyrenet(dot)fr
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-09 14:23:06
Message-ID: 14084.1228832586@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

ohp(at)pyrenet(dot)fr writes:
> Guess what! With the fprintf .. descending node... in place, everything
> goes well. The optimizer definitely does something weird around the
> definition/assignment of leftok/rightok.

Hmm, so the problem is in that second loop. The trick is to pick some
reasonably non-ugly code change that makes the problem go away.

The first thing I'd try is to get rid of the overly cute optimization

int rightnodeno = leftnodeno + 1;

and make it just read

int rightnodeno = rightchild(nodeno);

If that doesn't work, we might try refactoring the code enough to get
rid of the goto, but that looks a little bit tedious.

regards, tom lane


From: ohp(at)pyrenet(dot)fr
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-09 16:47:47
Message-ID: Pine.UW2.4.63.0812091744140.29358@sun.pyrenet
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 9 Dec 2008, Tom Lane wrote:

> Date: Tue, 09 Dec 2008 09:23:06 -0500
> From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> To: ohp(at)pyrenet(dot)fr
> Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>,
> Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>,
> pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
> Subject: Re: [HACKERS] cvs head initdb hangs on unixware
>
> ohp(at)pyrenet(dot)fr writes:
>> Guess what! With the fprintf .. descending node... in place, everything
>> goes well. The optimizer definitely does something weird around the
>> definition/assignment of leftok/rightok.
>
> Hmm, so the problem is in that second loop. The trick is to pick some
> reasonably non-ugly code change that makes the problem go away.
>
> The first thing I'd try is to get rid of the overly cute optimization
>
> int rightnodeno = leftnodeno + 1;
>
> and make it just read
>
> int rightnodeno = rightchild(nodeno);
>
> If that doesn't work, we might try refactoring the code enough to get
> rid of the goto, but that looks a little bit tedious.
>
> regards, tom lane
>
I tried that, and also moving the leftok/rightok declarations outside the loop
and refactoring the code that assigns leftok and rightok. Nothing worked!

Regards,
--
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
15, Chemin des Monges +33-5-61-50-97-01 (Fax)
31190 AUTERIVE +33-6-07-63-80-64 (GSM)
FRANCE Email: ohp(at)pyrenet(dot)fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)


From: Kenneth Marshall <ktm(at)rice(dot)edu>
To: ohp(at)pyrenet(dot)fr
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-09 16:52:56
Message-ID: 20081209165256.GB26318@it.is.rice.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Would it be reasonable to turn off optimization for this file?

Ken

On Tue, Dec 09, 2008 at 05:47:47PM +0100, ohp(at)pyrenet(dot)fr wrote:
> On Tue, 9 Dec 2008, Tom Lane wrote:
>
>> Date: Tue, 09 Dec 2008 09:23:06 -0500
>> From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
>> To: ohp(at)pyrenet(dot)fr
>> Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>,
>> Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>,
>> pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
>> Subject: Re: [HACKERS] cvs head initdb hangs on unixware
>>
>> ohp(at)pyrenet(dot)fr writes:
>>> Guess what! With the fprintf .. descending node... in place, everything
>>> goes well. The optimizer definitely does something weird around the
>>> definition/assignment of leftok/rightok.
>>
>> Hmm, so the problem is in that second loop. The trick is to pick some
>> reasonably non-ugly code change that makes the problem go away.
>>
>> The first thing I'd try is to get rid of the overly cute optimization
>>
>> int rightnodeno = leftnodeno + 1;
>>
>> and make it just read
>>
>> int rightnodeno = rightchild(nodeno);
>>
>> If that doesn't work, we might try refactoring the code enough to get
>> rid of the goto, but that looks a little bit tedious.
>>
>> regards, tom lane
>>
> I tried that, and also moving the leftok/rightok declarations outside the loop
> and refactoring the code that assigns leftok and rightok. Nothing worked!
>
> Regards,
> --
> Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
> 15, Chemin des Monges +33-5-61-50-97-01 (Fax)
> 31190 AUTERIVE +33-6-07-63-80-64 (GSM)
> FRANCE Email: ohp(at)pyrenet(dot)fr
> ------------------------------------------------------------------------------
> Make your life a dream, make your dream a reality. (St Exupery)
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ohp(at)pyrenet(dot)fr
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-09 17:03:00
Message-ID: 16881.1228842180@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

ohp(at)pyrenet(dot)fr writes:
> On Tue, 9 Dec 2008, Tom Lane wrote:
>> Hmm, so the problem is in that second loop. The trick is to pick some
>> reasonably non-ugly code change that makes the problem go away.

> I tried that, and also moving the leftok/rightok declarations outside the loop
> and refactoring the code that assigns leftok and rightok. Nothing worked!

I was afraid of that. We'd need to look at the assembly code to be sure
(can you provide it?), but what I bet is happening is that the compiler
is looking at the leftnodeno/rightnodeno computations and thinking it can
optimize those by a strength-reduction method, failing to notice that
the loop isn't a simple scan on nodeno.

Now in that regard the logic isn't very much different from a binary
search, which we have lots of and those have always worked. So I'm
back to the theory that the goto inside the inner loop is probably
contributing to the confusion somehow.

regards, tom lane
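
To make the strength-reduction hypothesis concrete: in a loop whose index advances by a constant step, a compiler may replace the multiplication in 2 * i + 1 with a running addition. The sketch below uses hypothetical function names that are not part of PostgreSQL. It shows that transformation done by hand on a linear scan, next to a tree descent where the same rewrite would be invalid because the index is reassigned to its own child on every iteration.

```c
#include <assert.h>

#define leftchild(x) (2 * (x) + 1)

/* Linear scan: i advances by exactly 1, so 2*i+1 advances by exactly 2. */
static int
sum_left_children_plain(const int *a, int n)
{
    int sum = 0;
    int i;

    for (i = 0; leftchild(i) < n; i++)
        sum += a[leftchild(i)];
    return sum;
}

/* Same loop after hand-applied strength reduction: multiply becomes add. */
static int
sum_left_children_reduced(const int *a, int n)
{
    int sum = 0;
    int child;

    for (child = 1; child < n; child += 2)
        sum += a[child];
    return sum;
}

/*
 * Tree descent: nodeno jumps to its own left child each iteration, so the
 * child index follows 1, 3, 7, 15, ... rather than a constant step.
 * Treating it like the linear scan above would visit the wrong nodes.
 */
static int
leftmost_leaf(int nnodes)
{
    int nodeno = 0;

    assert(nnodes > 0);
    while (leftchild(nodeno) < nnodes)
        nodeno = leftchild(nodeno);
    return nodeno;
}
```

The two sum functions agree for any array, which is what makes the rewrite legal there; nothing comparable holds for the descent.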


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ohp(at)pyrenet(dot)fr
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-09 18:24:21
Message-ID: 18738.1228847061@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

ohp(at)pyrenet(dot)fr writes:
> FWIW, I have attached the 2 generated .s. Someone with knowledge of asm
> may want to have a look..

Hmm. It looks to me like the compiler is getting confused by the
interaction between nodeno, leftnodeno, and rightnodeno. Try this
patch to see if it gets around it. (This is a tad better anyway
since it avoids examining the right child if not needed.)

regards, tom lane

Attachment Content-Type Size
unknown_filename text/plain 1.2 KB
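
The attached patch is not reproduced in this archive. As a rough illustration of the idea described (test the left child first and look at the right child only if the left one fails), the descend loop might be restructured along the following lines. This is a sketch under assumptions: the 7-node tree constants, the descend name, and the -1 return for an inconsistent tree are all illustrative, whereas the code discussed in this thread uses goto restart at that point.

```c
#include <assert.h>

#define NonLeafNodes 3              /* illustrative: 7-node tree */
#define NodesTotal   7
#define leftchild(x) (2 * (x) + 1)

/*
 * Descend from "nodeno" to a leaf whose value is at least "minvalue",
 * preferring the left child. Uses a single child index instead of
 * separate leftnodeno/rightnodeno variables.
 */
static int
descend(const unsigned char *nodes, int nodeno, unsigned char minvalue)
{
    assert(nodeno >= 0 && nodeno < NodesTotal);
    while (nodeno < NonLeafNodes)
    {
        int childnodeno = leftchild(nodeno);

        if (childnodeno < NodesTotal && nodes[childnodeno] >= minvalue)
        {
            nodeno = childnodeno;
            continue;
        }
        childnodeno++;              /* now points at the right child */
        if (childnodeno < NodesTotal && nodes[childnodeno] >= minvalue)
            nodeno = childnodeno;
        else
            return -1;              /* parent promised space no child has */
    }
    return nodeno;
}
```

Only one child index is live at a time, so there is less for an optimizer to confuse, and the right child is never read when the left one suffices.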

From: ohp(at)pyrenet(dot)fr
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-10 10:17:56
Message-ID: Pine.UW2.4.63.0812101113280.7144@sun.pyrenet
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Dear Tom,
On Tue, 9 Dec 2008, Tom Lane wrote:

> Date: Tue, 09 Dec 2008 13:24:21 -0500
> From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> To: ohp(at)pyrenet(dot)fr
> Cc: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>,
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>,
> pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
> Subject: Re: [HACKERS] cvs head initdb hangs on unixware
>
> ohp(at)pyrenet(dot)fr writes:
>> FWIW, I have attached the 2 generated .s. Someone with knowledge of asm
>> may want to have a look..
>
> Hmm. It looks to me like the compiler is getting confused by the
> interaction between nodeno, leftnodeno, and rightnodeno. Try this
> patch to see if it gets around it. (This is a tad better anyway
> since it avoids examining the right child if not needed.)
>
> regards, tom lane
>
>
Brilliant!
You made my day, can't wait for this patch to be committed.
Thanks!!!

PS: I wish I had 10% of your knowledge/genius!
--
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
15, Chemin des Monges +33-5-61-50-97-01 (Fax)
31190 AUTERIVE +33-6-07-63-80-64 (GSM)
FRANCE Email: ohp(at)pyrenet(dot)fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: ohp(at)pyrenet(dot)fr
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-10 11:00:31
Message-ID: 493FA14F.9070207@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

ohp(at)pyrenet(dot)fr wrote:
> On Tue, 9 Dec 2008, Tom Lane wrote:
>> Hmm. It looks to me like the compiler is getting confused by the
>> interaction between nodeno, leftnodeno, and rightnodeno. Try this
>> patch to see if it gets around it. (This is a tad better anyway
>> since it avoids examining the right child if not needed.)
>>
> Brilliant!
> You made my day, can't wait for this patch to be committed.

I find it pretty scary to work around compiler bugs like this. Who knows
what other code it miscompiles. Can you reduce fsm_search_avail into a
small stand-alone test program, and file a bug report with the compiler
vendor?

BTW, why does this work on warthog buildfarm member? Different compiler
version?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
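
A stand-alone reproducer of the kind suggested above could look roughly like the following. It is only a sketch: the 7-node tree, the search_avail name, and the iteration cap are assumptions made for illustration, and the loop body follows the leftok/rightok form quoted earlier in this thread rather than the real fsm_search_avail. If the miscompilation reproduces in this reduced form, an affected build would be expected to hit the iteration cap and return -2 instead of spinning forever.

```c
#include <assert.h>
#include <stdbool.h>

#define NodesPerPage        7
#define NonLeafNodesPerPage 3
#define leftchild(x)        (2 * (x) + 1)
#define MAX_STEPS           64  /* far more than a 3-level tree needs */

/*
 * Returns the index of a leaf with value >= minvalue, -1 if the root says
 * there is none, or -2 if the descent does not terminate or the tree is
 * inconsistent.
 */
static int
search_avail(const unsigned char *fp_nodes, unsigned char minvalue)
{
    int nodeno = 0;
    int steps = 0;

    /* Initial test: the root holds the maximum of the whole tree. */
    if (fp_nodes[0] < minvalue)
        return -1;

    while (nodeno < NonLeafNodesPerPage)
    {
        int  leftnodeno = leftchild(nodeno);
        int  rightnodeno = leftnodeno + 1;
        bool leftok = (leftnodeno < NodesPerPage) &&
                      (fp_nodes[leftnodeno] >= minvalue);
        bool rightok = (rightnodeno < NodesPerPage) &&
                       (fp_nodes[rightnodeno] >= minvalue);

        if (++steps > MAX_STEPS)
            return -2;          /* would have looped forever */
        if (leftok)
            nodeno = leftnodeno;
        else if (rightok)
            nodeno = rightnodeno;
        else
            return -2;          /* inconsistent tree */
    }
    assert(fp_nodes[nodeno] >= minvalue);
    return nodeno;
}
```

Building this once with and once without optimization and comparing the return values for a handful of trees would give the vendor something small to look at.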


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: ohp(at)pyrenet(dot)fr, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-10 11:35:17
Message-ID: 200812101135.mBABZHR04802@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas wrote:
> ohp(at)pyrenet(dot)fr wrote:
> > On Tue, 9 Dec 2008, Tom Lane wrote:
> >> Hmm. It looks to me like the compiler is getting confused by the
> >> interaction between nodeno, leftnodeno, and rightnodeno. Try this
> >> patch to see if it gets around it. (This is a tad better anyway
> >> since it avoids examining the right child if not needed.)
> >>
> > Brilliant!
> > You made my day, can't wait for this patch to be committed.
>
> I find it pretty scary to work around compiler bugs like this. Who knows
> what other code it miscompiles. Can you reduce fsm_search_avail into a
> small stand-alone test program, and file a bug report with the compiler
> vendor?
>
> BTW, why does this work on warthog buildfarm member? Different compiler
> version?

I assume this is the SCO compiler; I gave up on the SCO compiler in the
1990's, and I suggest we do the same.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: ohp(at)pyrenet(dot)fr, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-10 11:38:21
Message-ID: 493FAA2D.1060505@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas wrote:
> I find it pretty scary to work around compiler bugs like this. Who knows
> what other code it miscompiles. Can you reduce fsm_search_avail into a
> small stand-alone test program, and file a bug report with the compiler
> vendor?
>
> BTW, why does this work on warthog buildfarm member? Different compiler
> version?

The archives are full of compiler bugs specifically in the SCO compilers
appearing and disappearing in various versions. We usually don't try to
work around it; instead we make a note to avoid certain compiler
versions. Filing upstream bugs usually also works.


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, ohp(at)pyrenet(dot)fr, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-10 11:41:18
Message-ID: 200812101141.mBABfIp05716@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut wrote:
> Heikki Linnakangas wrote:
> > I find it pretty scary to work around compiler bugs like this. Who knows
> > what other code it miscompiles. Can you reduce fsm_search_avail into a
> > small stand-alone test program, and file a bug report with the compiler
> > vendor?
> >
> > BTW, why does this work on warthog buildfarm member? Different compiler
> > version?
>
> The archives are full of compiler bugs specifically in the SCO compilers
> appearing and disappearing in various versions. We usually don't try to
> work around it; instead we make a note to avoid certain compiler
> versions. Filing upstream bugs usually also works.

The SCO compiler is so bad and so prone to breakage that I question
whether it is even worth filing upstream bug reports.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: ohp(at)pyrenet(dot)fr
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-10 14:03:17
Message-ID: Pine.UW2.4.63.0812101453170.7144@sun.pyrenet
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 10 Dec 2008, Heikki Linnakangas wrote:

> Date: Wed, 10 Dec 2008 13:00:31 +0200
> From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
> To: ohp(at)pyrenet(dot)fr
> Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>,
> pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
> Subject: Re: [HACKERS] cvs head initdb hangs on unixware
>
> ohp(at)pyrenet(dot)fr wrote:
>> On Tue, 9 Dec 2008, Tom Lane wrote:
>>> Hmm. It looks to me like the compiler is getting confused by the
>>> interaction between nodeno, leftnodeno, and rightnodeno. Try this
>>> patch to see if it gets around it. (This is a tad better anyway
>>> since it avoids examining the right child if not needed.)
>>>
>> Brilliant!
>> You made my day, can't wait for this patch to be committed.
>
> I find it pretty scary to work around compiler bugs like this. Who knows what
> other code it miscompiles. Can you reduce fsm_search_avail into a small
> stand-alone test program, and file a bug report with the compiler vendor?
FWIW, the compiler doesn't miscompile anything in PostgreSQL; as a heavy
user/hoster, I'd know!

Let's not start a flame war here; the SCO compiler is as good or as bad as
any other.

Never saw a problem with gcc, hp-ux, darwin or M$?
>
> BTW, why does this work on warthog buildfarm member? Different compiler
> version?
>
it's configured with --enable-debug.
Maybe run_build.pl should run twice, once with --enable-debug and once
without.
>

--
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
15, Chemin des Monges +33-5-61-50-97-01 (Fax)
31190 AUTERIVE +33-6-07-63-80-64 (GSM)
FRANCE Email: ohp(at)pyrenet(dot)fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: ohp(at)pyrenet(dot)fr
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-10 14:20:00
Message-ID: 493FD010.3060509@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

ohp(at)pyrenet(dot)fr wrote:
> On Wed, 10 Dec 2008, Heikki Linnakangas wrote:
>> I find it pretty scary to work around compiler bugs like this. Who
>> knows what other code it miscompiles. Can you reduce fsm_search_avail
>> into a small stand-alone test program, and file a bug report with the
>> compiler vendor?
> FWIW, the compiler doesn't miscompile anything in PostgreSQL; as a
> heavy user/hoster, I'd know!
>
> Let's not start a flame war here; the SCO compiler is as good or as bad as
> any other.
>
> Never saw a problem with gcc, hp-ux, darwin or M$?

Sure, that's not what I was saying. My point is, when there's a bug in
one version of a compiler, we shouldn't try to adapt PostgreSQL to that
bug. Instead, we should narrow down the bug, get it fixed in the
compiler, and tell users to use the most recent version of the compiler
where the bug has been fixed.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ohp(at)pyrenet(dot)fr
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-10 17:17:18
Message-ID: 3474.1228929438@sss.pgh.pa.us
Lists: pgsql-hackers

ohp(at)pyrenet(dot)fr writes:
> On Wed, 10 Dec 2008, Heikki Linnakangas wrote:
>> BTW, why does this work on warthog buildfarm member? Different compiler
>> version?
>>
> it's configured with --enable-debug.
> Maybe run_build.pl should run twice, once with --enable-debug and once
> without.

No, the standard way to deal with such issues is to set up two buildfarm
members. This would be a 100% waste of cycles for gcc-based members
anyway, since gcc generates the same code with or without -g. However,
for compilers where it makes a difference, it might well be worth having
an additional member to test the optimized build.

regards, tom lane


From: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: ohp(at)pyrenet(dot)fr, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-10 17:27:05
Message-ID: 493FFBE9.70403@sun.com
Lists: pgsql-hackers

Tom Lane napsal(a):
> ohp(at)pyrenet(dot)fr writes:
>> On Wed, 10 Dec 2008, Heikki Linnakangas wrote:
>>> BTW, why does this work on warthog buildfarm member? Different compiler
>>> version?
>>>
>> it's configured with --enable-debug.
>> Maybe run_build.pl should run twice, once with --enable-debug and once
>> without.
>
> No, the standard way to deal with such issues is to set up two buildfarm
> members. This would be a 100% waste of cycles for gcc-based members
> anyway, since gcc generates the same code with or without -g. However,
> for compilers where it makes a difference, it might well be worth having
> an additional member to test the optimized build.

I think the current infrastructure is not good for that. For example, I would
like to compile postgres on one machine with three different compilers, each in
32-bit or 64-bit mode. Should I have 6 animals? I think a better idea is to have
one animal and several test sets. An animal defines the HW+OS version, and a
test set specifies the PG version, configure switches, compiler and so on.

these are my two cents

Zdenek


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: ohp(at)pyrenet(dot)fr, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-10 17:29:36
Message-ID: 3673.1228930176@sss.pgh.pa.us
Lists: pgsql-hackers

Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> ohp(at)pyrenet(dot)fr wrote:
>> Never saw a problem with gcc, hp-ux, darwin or M$?

> Sure, that's not what I was saying. My point is, when there's a bug in
> one version of a compiler, we shouldn't try to adapt PostgreSQL to that
> bug. Instead, we should narrow down the bug, get it fixed in the
> compiler, and tell users to use the most recent version of the compiler
> where the bug has been fixed.

We should certainly file a bug report against the compiler. However,
ISTM a workaround is a good idea too if it's not too ugly (which this
one isn't). If a bug exists in one compiler there might be similar
bugs in other compilers.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: ohp(at)pyrenet(dot)fr, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-10 17:36:38
Message-ID: 3867.1228930598@sss.pgh.pa.us
Lists: pgsql-hackers

Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
> Tom Lane napsal(a):
>> No, the standard way to deal with such issues is to set up two buildfarm
>> members.

> I think the current infrastructure is not good for that. For example, I would
> like to compile postgres on one machine with three different compilers, each
> in 32-bit or 64-bit mode. Should I have 6 animals?

Yes.

> I think a better idea is to have one animal and
> several test sets.

That simply complicates everything --- the reporting infrastructure,
identifying which case failed, etc --- without actually improving
anything.

regards, tom lane


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, ohp(at)pyrenet(dot)fr, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-10 19:39:17
Message-ID: 20081210193917.GC5422@svana.org
Lists: pgsql-hackers

On Wed, Dec 10, 2008 at 06:27:05PM +0100, Zdenek Kotala wrote:
> I think the current infrastructure is not good for that. For example, I
> would like to compile postgres on one machine with three different
> compilers, each in 32-bit or 64-bit mode. Should I have 6 animals? I think a
> better idea is to have one animal and several test sets. An animal defines
> the HW+OS version, and a test set specifies the PG version, configure
> switches, compiler and so on.

Well, you could name them animal-1, animal-2, animal-3, etc... Once the
list reaches 100 entries we can think about alternatives...

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Zdenek Kotala <Zdenek(dot)Kotala(at)sun(dot)com>, ohp(at)pyrenet(dot)fr, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-10 21:08:04
Message-ID: 200812102308.05828.peter_e@gmx.net
Lists: pgsql-hackers

On Wednesday 10 December 2008 19:36:38 Tom Lane wrote:
> Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> writes:
> > Tom Lane napsal(a):
> >> No, the standard way to deal with such issues is to set up two buildfarm
> >> members.
> >
> > I think the current infrastructure is not good for that. For example, I
> > would like to compile postgres on one machine with three different
> > compilers, each in 32-bit or 64-bit mode. Should I have 6 animals?
>
> Yes.

I have to say, I have concerns similar to Zdenek's. Setting up a load of
different animals for every altered configuration makes it difficult to tell
which configurations are actually related.

I have been thinking about test coverage recently, analyzing bugs and so on.
To get more confidence beyond a random (not even truly random) subset of
platforms and options, we should really be building with a lot more
combinations of

- compilers
- compiler options
- configure options
- run time options
(- more tests of other code areas, but that is a different problem)

Note, for example, that downstream binary packages are almost never built with
default or near-default compiler options, and of course production
installations are hopefully never run with the default run-time
configuration. Essentially, we are not really testing what the users are
running.

To cover reality better, I can easily imagine that a single platform (say,
CPU, OS, bitness, and compiler) should do at least fifty different test runs
in different combinations. There, we'd also have resource problems, but some
people have machines that can do that (and want to do that). How can we
accommodate that today?

A coincidental trouble with this is that I find the animal names to be
increasingly difficult to process and remember. They are basically just line
noise to me at this point. Other non-biologists might feel the same. And we
might eventually run out of reasonable names.

> That simply complicates everything --- the reporting infrastructure,
> identifying which case failed, etc --- without actually improving
> anything.

I don't think it has to be that complicated. We could probably augment the
naming scheme like "animal/foo" or "animal/12" or something like that.


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, ohp(at)pyrenet(dot)fr, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-10 21:40:47
Message-ID: 4940375F.1070803@dunslane.net
Lists: pgsql-hackers

Zdenek Kotala wrote:
> Tom Lane napsal(a):
>> ohp(at)pyrenet(dot)fr writes:
>>> On Wed, 10 Dec 2008, Heikki Linnakangas wrote:
>>>> BTW, why does this work on warthog buildfarm member? Different
>>>> compiler version?
>>>>
>>> it's configured with --enable-debug.
>>> Maybe run_build.pl should run twice, once with --enable-debug and once
>>> without.
>>
>> No, the standard way to deal with such issues is to set up two buildfarm
>> members. This would be a 100% waste of cycles for gcc-based members
>> anyway, since gcc generates the same code with or without -g. However,
>> for compilers where it makes a difference, it might well be worth having
>> an additional member to test the optimized build.
>
> I think the current infrastructure is not good for that. For example, I
> would like to compile postgres on one machine with three different
> compilers, each in 32-bit or 64-bit mode. Should I have 6 animals? I think
> a better idea is to have one animal and several test sets. An animal
> defines the HW+OS version, and a test set specifies the PG version,
> configure switches, compiler and so on.
>
>

Well, you're asking for a significant redesign for which I at least
don't have time. What is so hard about having six animals on one
machine? A number of people have such setups, including me.

cheers

andrew


From: Aidan Van Dyk <aidan(at)highrise(dot)ca>
To: Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, ohp(at)pyrenet(dot)fr, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-10 21:54:47
Message-ID: 20081210215447.GV26596@yugib.highrise.ca
Lists: pgsql-hackers

* Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM> [081210 12:29]:

>> No, the standard way to deal with such issues is to set up two buildfarm
>> members. This would be a 100% waste of cycles for gcc-based members
>> anyway, since gcc generates the same code with or without -g. However,
>> for compilers where it makes a difference, it might well be worth having
>> an additional member to test the optimized build.

> I think the current infrastructure is not good for that. For example, I
> would like to compile postgres on one machine with three different
> compilers, each in 32-bit or 64-bit mode. Should I have 6 animals? I think a
> better idea is to have one animal and several test sets. An animal defines
> the HW+OS version, and a test set specifies the PG version, configure
> switches, compiler and so on.

Sure, and in my neck of the woods there are cows, calves, heifers,
bulls and steers, but they are all cattle... And when talking about cows,
Jerseys and Guernseys have high MF and lower production, Ayrshires have
high production and lower MF, and Holsteins are in between.

Should I call them "cow with high MF" and "cow with high production", or
just say Jersey or Ayrshire?

Wherever you (the generic you, not the specific you) draw the line, what
you call it is still arbitrary... But where that line is drawn is
currently defined in the buildfarm code...

Not that it can't be changed, but I think there are much better things to
worry about ;-)

a.

--
Aidan Van Dyk Create like a god,
aidan(at)highrise(dot)ca command like a king,
http://www.highrise.ca/ work like a slave.


From: ohp(at)pyrenet(dot)fr
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: cvs head initdb hangs on unixware
Date: 2008-12-14 16:43:20
Message-ID: Pine.UW2.4.63.0812141735440.4273@sun.pyrenet
Lists: pgsql-hackers

Tom,
On Wed, 10 Dec 2008, Tom Lane wrote:

> Date: Wed, 10 Dec 2008 12:17:18 -0500
> From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> To: ohp(at)pyrenet(dot)fr
> Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>,
> Zdenek Kotala <Zdenek(dot)Kotala(at)Sun(dot)COM>,
> pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
> Subject: Re: [HACKERS] cvs head initdb hangs on unixware
>
> ohp(at)pyrenet(dot)fr writes:
>> On Wed, 10 Dec 2008, Heikki Linnakangas wrote:
>>> BTW, why does this work on warthog buildfarm member? Different compiler
>>> version?
>>>
>> it's configured with --enable-debug.
>> Maybe run_build.pl should run twice, once with --enable-debug and once
>> without.
>
> No, the standard way to deal with such issues is to set up two buildfarm
> members. This would be a 100% waste of cycles for gcc-based members
> anyway, since gcc generates the same code with or without -g. However,
> for compilers where it makes a difference, it might well be worth having
> an additional member to test the optimized build.
>
> regards, tom lane
>
I understand your concern. Maybe a --flip-debug option, which gcc owners
would not use, could help get both tests into one run.

In the meantime, while preparing my home unixware server to become
another animal, I came upon a new optimizer bug in ecpg.

So as not to pollute this closed thread, I will start a new one.

--
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
15, Chemin des Monges +33-5-61-50-97-01 (Fax)
31190 AUTERIVE +33-6-07-63-80-64 (GSM)
FRANCE Email: ohp(at)pyrenet(dot)fr
------------------------------------------------------------------------------
Make your life a dream, make your dream a reality. (St Exupery)