Re: [HACKERS] posmaster failed under high load

Lists: pgsql-hackers
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] posmaster failed under high load
Date: 1999-05-06 00:25:24
Message-ID: 7398.925950324@sss.pgh.pa.us

I wrote:
> Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> writes:
>> It's interesting that the process with pid 701 migrates from
>> (postmaster) to postgres in normal ps output!

> Yes, that's pretty strong evidence in favor of my theory (that these
> processes are just new backends that haven't received a command yet).

Nope, that theory is all wet --- the backend definitely does
PS_SET_STATUS("idle") before it waits for a query. Something is
*really* peculiar here, since your backtrace shows that the backend
has reached the point of waiting for client input. It is not possible
to get there without having done PS_SET_STATUS. So why does the process
still show up as "(postmaster)" in ps? Something is flaky about your
system's support of ps status setting, I think.

regards, tom lane


From: Taral <taral(at)taral(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] posmaster failed under high load
Date: 1999-05-06 01:50:17
Message-ID: Pine.LNX.4.10.9905052049500.1871-100000@dragon.taral.net

On Wed, 5 May 1999, Tom Lane wrote:

> Nope, that theory is all wet --- the backend definitely does
> PS_SET_STATUS("idle") before it waits for a query. Something is
> *really* peculiar here, since your backtrace shows that the backend
> has reached the point of waiting for client input. It is not possible
> to get there without having done PS_SET_STATUS. So why does the process
> still show up as "(postmaster)" in ps? Something is flaky about your
> system's support of ps status setting, I think.

You never altered the task_struct, and so it's still 'postmaster' there.
Note the W... the process is paged out, so the argv is not available!

Taral


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Taral <taral(at)taral(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] posmaster failed under high load
Date: 1999-05-06 03:53:16
Message-ID: Pine.GSO.3.96.SK.990506074917.26882H-100000@ra

On Wed, 5 May 1999, Taral wrote:

> Date: Wed, 5 May 1999 20:50:17 -0500 (CDT)
> From: Taral <taral(at)taral(dot)net>
> To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, hackers(at)postgreSQL(dot)org
> Subject: Re: [HACKERS] posmaster failed under high load
>
> On Wed, 5 May 1999, Tom Lane wrote:
>
> > Nope, that theory is all wet --- the backend definitely does
> > PS_SET_STATUS("idle") before it waits for a query. Something is
> > *really* peculiar here, since your backtrace shows that the backend
> > has reached the point of waiting for client input. It is not possible
> > to get there without having done PS_SET_STATUS. So why does the process
> > still show up as "(postmaster)" in ps? Something is flaky about your
> > system's support of ps status setting, I think.
>
> You never altered the task_struct, and so it's still 'postmaster' there.
> Note the W... the process is paged out, so the argv is not available!

The system was under very high load; at peak the load was about 69
(actually, it could have been higher, I just wasn't able to enter a
command :-). The client (http_load from http://www.acme.com) verifies a
checksum for every connection, so a command was definitely issued and the
backend returned a result.

Oleg


_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: Taral <taral(at)taral(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] posmaster failed under high load
Date: 1999-05-06 05:20:13
Message-ID: 199905060520.BAA14883@candle.pha.pa.us

> On Wed, 5 May 1999, Tom Lane wrote:
>
> > Nope, that theory is all wet --- the backend definitely does
> > PS_SET_STATUS("idle") before it waits for a query. Something is
> > *really* peculiar here, since your backtrace shows that the backend
> > has reached the point of waiting for client input. It is not possible
> > to get there without having done PS_SET_STATUS. So why does the process
> > still show up as "(postmaster)" in ps? Something is flaky about your
> > system's support of ps status setting, I think.
>
> You never altered the task_struct, and so it's still 'postmaster' there.
> Note the W... the process is paged out, so the argv is not available!

Yes, I remember now. To do ps-args you need to read the process address
space. If it is paged out, ps does not bring in the pages just to read
the args. This is probably as expected. If someone wants to add a
linux-specific fix for this, I guess you could, though I am not sure it
is worth it.
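
As a rough illustration of the lookup described above, here is a minimal C
sketch. It assumes a current Linux /proc layout in which /proc/<pid>/cmdline
exposes the process's argument area and /proc/<pid>/comm exposes the name
held in the kernel's task structure (the comm file did not exist in
1999-era kernels). The helper names read_proc_field and ps_style_name are
hypothetical and are not taken from ps or PostgreSQL. The sketch prefers
argv[0] and falls back to the kernel name in parentheses when the arguments
are empty or cannot be read.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read /proc/<pid>/<field> into out and NUL-terminate it.
 * Returns the number of bytes read, or -1 if the file cannot be opened.
 */
long read_proc_field(pid_t pid, const char *field, char *out, size_t cap)
{
    char path[64];
    FILE *fp;
    size_t n;

    if (cap == 0)
        return -1;
    snprintf(path, sizeof path, "/proc/%ld/%s", (long) pid, field);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    n = fread(out, 1, cap - 1, fp);
    fclose(fp);
    out[n] = '\0';
    return (long) n;
}

/*
 * Hypothetical ps-like name lookup (an assumption, not real ps code):
 * use argv[0] from cmdline when it is readable and non-empty,
 * otherwise fall back to the kernel's name wrapped in parentheses.
 * Returns 0 on success, -1 if neither source is available.
 */
int ps_style_name(pid_t pid, char *out, size_t cap)
{
    char buf[256];
    long n = read_proc_field(pid, "cmdline", buf, sizeof buf);

    if (n > 0 && buf[0] != '\0') {
        /* cmdline is NUL-separated, so buf read as a C string is argv[0] */
        snprintf(out, cap, "%s", buf);
        return 0;
    }

    n = read_proc_field(pid, "comm", buf, sizeof buf);
    if (n <= 0)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';     /* comm ends with a newline */
    snprintf(out, cap, "(%s)", buf);
    return 0;
}
```

Calling ps_style_name(getpid(), buf, sizeof buf) fills buf with one of the
two forms. The parenthesized fallback corresponds to the "(postmaster)"
entries discussed in this thread.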

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
Cc: Taral <taral(at)taral(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] posmaster failed under high load
Date: 1999-05-06 05:59:14
Message-ID: Pine.GSO.3.96.SK.990506095542.26882I-100000@ra

On Thu, 6 May 1999, Bruce Momjian wrote:

> Date: Thu, 6 May 1999 01:20:13 -0400 (EDT)
> From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
> To: Taral <taral(at)taral(dot)net>
> Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>,
> hackers(at)postgreSQL(dot)org
> Subject: Re: [HACKERS] posmaster failed under high load
>
> > On Wed, 5 May 1999, Tom Lane wrote:
> >
> > > Nope, that theory is all wet --- the backend definitely does
> > > PS_SET_STATUS("idle") before it waits for a query. Something is
> > > *really* peculiar here, since your backtrace shows that the backend
> > > has reached the point of waiting for client input. It is not possible
> > > to get there without having done PS_SET_STATUS. So why does the process
> > > still show up as "(postmaster)" in ps? Something is flaky about your
> > > system's support of ps status setting, I think.
> >
> > You never altered the task_struct, and so it's still 'postmaster' there.
> > Note the W... the process is paged out, so the argv is not available!
>
> Yes, I remember now. To do ps-args you need to read the process address
> space. If it is paged out, ps does not bring in the pages just to read
> the args. This is probably as expected. If someone wants to add a
> linux-specific fix for this, I guess you could, though I am not sure it
> is worth it.
>

How do you explain that the process with PID 701, which is shown in ps
output as (postmaster), after some time looks like a usual postgres process?

Oleg


_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


From: Taral <taral(at)taral(dot)net>
To: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
Cc: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] posmaster failed under high load
Date: 1999-05-06 06:05:11
Message-ID: Pine.LNX.4.10.9905060104250.2906-100000@dragon.taral.net

On Thu, 6 May 1999, Oleg Bartunov wrote:

> How do you explain that the process with PID 701, which is shown in ps
> output as (postmaster), after some time looks like a usual postgres process?

Because 'postmaster' is written in the kernel task_struct, whereas the
task's argv[] says 'postgres'.

The only way to get around this is to do an execv(), at which point the
kernel will recopy argv[0].
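
To make the distinction concrete, here is a minimal C sketch of an in-place
title change. It assumes the status is set by overwriting the original
argv[0] storage, a common approach on Linux. The function name
overwrite_argv0 is hypothetical, and this is not PostgreSQL's actual
PS_SET_STATUS code. It changes only the bytes that ps reads from the
process's own memory; the name the kernel copied at exec time is left
untouched.

```c
#include <stdio.h>
#include <string.h>

/*
 * Overwrite argv[0] in place with title, never writing past the
 * length of the original string. Leftover bytes are cleared so no
 * stale characters remain visible. Returns the number of characters
 * of title that were stored.
 *
 * Hypothetical helper for illustration only: it does not touch the
 * process name kept by the kernel, which is why a process retitled
 * this way can still appear under its old name.
 */
size_t overwrite_argv0(char **argv, const char *title)
{
    size_t room = strlen(argv[0]);   /* space available in the old string */
    size_t len = strlen(title);

    if (len > room)
        len = room;                  /* truncate rather than overrun */
    memcpy(argv[0], title, len);
    memset(argv[0] + len, '\0', room - len);
    return len;
}
```

A program would call overwrite_argv0(argv, "postgres") early in main(). The
truncation to the original length is a deliberately conservative choice for
this sketch, since writing beyond argv[0] would run into the following
arguments.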

Taral


From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: Taral <taral(at)taral(dot)net>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] posmaster failed under high load
Date: 1999-05-06 06:36:58
Message-ID: 199905060636.CAA15866@candle.pha.pa.us

> On Thu, 6 May 1999, Oleg Bartunov wrote:
>
> > How do you explain that the process with PID 701, which is shown in ps
> > output as (postmaster), after some time looks like a usual postgres process?
>
> Because 'postmaster' is written in the kernel task_struct, whereas the
> task's argv[] says 'postgres'.
>
> The only way to get around this is to do an execv(), at which point the
> kernel will recopy argv[0].

We used to do execv(), but stopped doing it for performance reasons.

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026