Re: odd output in initdb

Lists: pgsql-hackers
From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: odd output in initdb
Date: 2010-01-29 20:00:09
Message-ID: 4B633E49.2010308@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


I saw some odd pgstat output during an initdb on Windows today:

The files belonging to this database system will be owned by user
"pgrunner".
This user must also own the server process.

The database cluster will be initialized with locale C.
The default database encoding has accordingly been set to SQL_ASCII.
The default text search configuration will be set to "english".

creating directory data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 32MB
creating configuration files ... ok
creating template1 database in data/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... WARNING: pgstat wait timeout
WARNING: pgstat wait timeout
ok
creating system views ... ok
loading system objects' descriptions ... ok
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
loading PL/pgSQL server-side language ... ok
vacuuming database template1 ... WARNING: pgstat wait timeout
WARNING: pgstat wait timeout
ok
copying template1 to template0 ... WARNING: pgstat wait timeout
ok
copying template1 to postgres ... ok

Success. You can now start the database server using:

"C:\msys\1.0\home\pgrunner\bf\root\HEAD\inst\bin\postgres" -D "data"
or
"C:\msys\1.0\home\pgrunner\bf\root\HEAD\inst\bin\pg_ctl" -D
"data" -l logfile start

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the -A option the
next time you run initdb.

A little searching on the buildfarm shows this started around Jan 10th:

pgbfprod=# select min(snapshot) from build_status_log where sysname
= 'red_bat' and log_stage = 'initdb.log' and log_text ~ 'pgstat';
min
---------------------
2010-01-10 23:30:01
(1 row)

I can't see an obvious culprit in the commit logs right off. Does anyone
have an idea what's going on?

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in initdb
Date: 2010-01-29 22:28:55
Message-ID: 4B636127.3060609@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan wrote:
>
>
> initializing dependencies ... WARNING: pgstat wait timeout
> WARNING: pgstat wait timeout
> ok
>
> vacuuming database template1 ... WARNING: pgstat wait timeout
> WARNING: pgstat wait timeout
> ok
> copying template1 to template0 ... WARNING: pgstat wait timeout
>
>
>
> I can't see an obvious culprit in the commit logs right off.

Actually, on close inspection it looks to me like this commit: "Create
typedef pgsocket for storing socket descriptors."
<http://git.postgresql.org/gitweb?p=postgresql.git;a=commit;h=ea1a4463e9de9662b7c9e13374ec8c7b92ff2f19>
could well be the culprit.

cheers

andrew


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in initdb
Date: 2010-01-31 16:33:44
Message-ID: 9837222c1001310833i13a83800o87670491ad7ae8c2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jan 29, 2010 at 23:28, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
>
> Andrew Dunstan wrote:
>>
>>
>>   initializing dependencies ... WARNING:  pgstat wait timeout
>>   WARNING:  pgstat wait timeout
>>   ok
>>     vacuuming database template1 ... WARNING:  pgstat wait timeout
>>   WARNING:  pgstat wait timeout
>>   ok
>>   copying template1 to template0 ... WARNING:  pgstat wait timeout
>>
>>
>> I can't see an obvious culprit in the commit logs right off.
>
> Actually, on close inspection it looks to me like this commit: "Create
> typedef pgsocket for storing socket descriptors."
> <http://git.postgresql.org/gitweb?p=postgresql.git;a=commit;h=ea1a4463e9de9662b7c9e13374ec8c7b92ff2f19>
> could well be the culprit.

I'm not claiming it's not, but what exactly points to that? Does the
problem go away if you move to a version before that?

Because I'm 99% sure I saw it well before that commit, and we've had
reports on it from 8.4 as well, I'm not so sure... But it may be that
that commit made something more likely to happen...

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in initdb
Date: 2010-01-31 16:56:11
Message-ID: 26640.1264956971@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Magnus Hagander <magnus(at)hagander(dot)net> writes:
> On Fri, Jan 29, 2010 at 23:28, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>>> initializing dependencies ... WARNING: pgstat wait timeout

> I'm not claiming it's not, but what exactly points to that? Does the
> problem go away if you move to a version before that?

> Because I'm 99% sure I saw it well before that commit, and we've had
> reports on it from 8.4 as well, I'm not so sure... But it may be that
> that commit made something more likely to happen...

I notice pgstat_send is still using "if (pgStatSock < 0)" to detect
PGINVALID_SOCKET ...

regards, tom lane


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in initdb
Date: 2010-01-31 17:24:39
Message-ID: 9837222c1001310924p680b363fn8daf8490cc048dbf@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Jan 31, 2010 at 17:56, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Magnus Hagander <magnus(at)hagander(dot)net> writes:
>> On Fri, Jan 29, 2010 at 23:28, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>>>>   initializing dependencies ... WARNING:  pgstat wait timeout
>
>> I'm not claiming it's not, but what exactly points to that? Does the
>> problem go away if you move to a version before that?
>
>> Because I'm 99% sure I saw it well before that commit, and we've had
>> reports on it from 8.4 as well, I'm not so sure... But it may be that
>> that commit made something more likely to happen...
>
> I notice pgstat_send is still using "if (pgStatSock < 0)" to detect
> PGINVALID_SOCKET ...

That's certainly so, but that shouldn't have any effect on this -
since on that platform it's defined to -1 anyway. But I'll go ahead
and fix it :-)

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in initdb
Date: 2010-01-31 17:33:13
Message-ID: 8682.1264959193@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Magnus Hagander <magnus(at)hagander(dot)net> writes:
> On Sun, Jan 31, 2010 at 17:56, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I notice pgstat_send is still using "if (pgStatSock < 0)" to detect
>> PGINVALID_SOCKET ...

> That's certainly so, but that shouldn't have any effect on this -
> since on that platform it's defined to -1 anyway. But I'll go ahead
> and fix it :-)

I was more worried about the other way around: could a *valid* handle
be < 0? Although it's still not clear why it'd only recently have
started being a problem.

regards, tom lane


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in initdb
Date: 2010-01-31 17:38:12
Message-ID: 9837222c1001310938j1bbd986fnfa342b21c2028c60@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Jan 31, 2010 at 18:33, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Magnus Hagander <magnus(at)hagander(dot)net> writes:
>> On Sun, Jan 31, 2010 at 17:56, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> I notice pgstat_send is still using "if (pgStatSock < 0)" to detect
>>> PGINVALID_SOCKET ...
>
>> That's certainly so, but that shouldn't have any effect on this -
>> since on that platform it's defined to -1 anyway. But I'll go ahead
>> and fix it :-)
>
> I was more worried about the other way around: could a *valid* handle
> be < 0?  Although it's still not clear why it'd only recently have
> started being a problem.

I don't think it can.

And no, I can't figure out why that should've changed recently because
of this, since we've checked against that before..

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in initdb
Date: 2010-02-01 01:13:32
Message-ID: 4B662ABC.8010407@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Magnus Hagander wrote:
>>
>> Actually, on close inspection it looks to me like this commit: "Create
>> typedef pgsocket for storing socket descriptors."
>> <http://git.postgresql.org/gitweb?p=postgresql.git;a=commit;h=ea1a4463e9de9662b7c9e13374ec8c7b92ff2f19>
>> could well be the culprit.
>>
>
>
> I'm not claiming it's not, but what exactly points to that? Does the
> problem go away if you move to a version before that?
>
> Because I'm 99% sure I saw it well before that commit, and we've had
> reports on it from 8.4 as well, I'm not so sure... But it may be that
> that commit made something more likely to happen...
>

The buildfarm logs say otherwise. This was first seen in Jan 10, at
least on my Windows animals:

pgbfprod=# select sysname, min(snapshot) from build_status_log where
sysname ~ 'bat' and log_stage ~ 'initdb' and log_text ~ 'pgstat'
group by sysname;
sysname | min
----------+---------------------
dawn_bat | 2010-01-10 17:30:02
red_bat | 2010-01-10 23:30:01
(2 rows)

cheers

andrew


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in initdb
Date: 2010-02-01 09:07:21
Message-ID: 9837222c1002010107o4e79f282x67a777771f9b2cf@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2010/2/1 Andrew Dunstan <andrew(at)dunslane(dot)net>:
>
> Magnus Hagander wrote:
>>>
>>> Actually, on close inspection it looks to me like this commit: "Create
>>> typedef pgsocket for storing socket descriptors."
>>> <http://git.postgresql.org/gitweb?p=postgresql.git;a=commit;h=ea1a4463e9de9662b7c9e13374ec8c7b92ff2f19>
>>> could well be the culprit.
>>>
>>
>>
>> I'm not claiming it's not, but what exactly points to that? Does the
>> problem go away if you move to a version before that?
>>
>> Because I'm 99% sure I saw it well before that commit, and we've had
>> reports on it from 8.4 as well, I'm not so sure... But it may be that
>> that commit made something more likely to happen...
>>
>
>
> The buildfarm logs say otherwise. This was first seen in Jan 10, at least on my Windows animals:
>
>   pgbfprod=# select sysname, min(snapshot) from build_status_log where
>   sysname ~ 'bat' and log_stage ~ 'initdb' and log_text ~ 'pgstat'
>   group by sysname;
>    sysname  |         min           ----------+---------------------
>    dawn_bat | 2010-01-10 17:30:02
>    red_bat  | 2010-01-10 23:30:01
>   (2 rows)

Well, that's clearly information I didn't have access to ;)

We *have* seen it before. But I guess that change either made it more
likely to happen, or it's a different issue with the same message.

But this one may well have been the -1 vs PGINVALID_SOCKET checks. It
didn't show up in my intial tests, but it also didn't show up entirely
consistently before :S The latest run of mastodon is also not showing
it.

Did those two members produce consistent issues? Can you give them a
couple of kicks to get enough buidls out of them to figure out if this
solved the problem?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: odd output in initdb
Date: 2010-02-01 12:54:48
Message-ID: 4B66CF18.6080203@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Magnus Hagander wrote:
> Did those two members produce consistent issues? Can you give them a
> couple of kicks to get enough buidls out of them to figure out if this
> solved the problem?
>
>

Not very easily. I am currently travelling. It was happening fairly
consistently on red_bat and seems to have stopped now.

cheers

andrew