From: "Ioannis Anagnostopoulos" <ioannis(at)anatec(dot)com>
To: <pgsql-novice(at)postgresql(dot)org>
Subject: Linux server connection process consumes all memory
Date: 2011-12-03 22:45:11
Message-ID: B56C2BFC6806BE4A9C86ADDDD269933A6E8A09@appserver.anatec.local
Lists: pgsql-novice

Hi all,

This is my first post here, and to play it safe I chose the novice list. My problem is as follows:

I have a simple SELECT command that, for various reasons, I need to execute around 22 million times. It is in fact a prepared command from a .NET environment. I have used both ADO.NET Postgres options (the free Npgsql and the commercial driver from Devart) and the behaviour is the same. Once I establish the connection with the corresponding connection object, a new process is created on the server, as expected. As I iterate, executing the SELECT command, the memory footprint of that process grows bigger and bigger over time. Sooner or later the server either collapses or becomes unusable, as physical memory is exhausted and the swap is heavily used. This is just a simple SELECT, with a data reader that I make sure to close after each iteration. I can find no reason why the connection's process on the server gets so out of control. Is it caching data? If so, I don't want it to. Do I need to close the connection every so often to free up the memory?

Any help will be much appreciated
Kind Regards
Yiannis


From: Alan Hodgson <ahodgson(at)simkin(dot)ca>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-03 23:58:38
Message-ID: 201112031558.38312.ahodgson@simkin.ca
Lists: pgsql-novice

On Saturday, December 03, 2011 02:45:11 PM Ioannis Anagnostopoulos wrote:
> As I iterate, executing the SELECT command, the memory footprint of that
> process grows bigger and bigger over time. Sooner or later the server either
> collapses or becomes unusable, as physical memory

It's not clear from this what process is continuing to grow. Is it the
PostgreSQL server process? Or your client process?


From: "Ioannis Anagnostopoulos" <ioannis(at)anatec(dot)com>
To: "Alan Hodgson" <ahodgson(at)simkin(dot)ca>, <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-04 01:43:30
Message-ID: B56C2BFC6806BE4A9C86ADDDD269933A6E8A0B@appserver.anatec.local
Lists: pgsql-novice


Hi, thank you for the answer. I am referring to the postgres process
on the Linux server that corresponds to the connection I have established.
If you run top you will see quite a few postgres processes; one of them
belongs to the connection established by the client application. It is this
process that grows, consuming memory.

Kind regards
Yiannis



From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Ioannis Anagnostopoulos" <ioannis(at)anatec(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-04 04:45:27
Message-ID: 11644.1322973927@sss.pgh.pa.us
Lists: pgsql-novice

"Ioannis Anagnostopoulos" <ioannis(at)anatec(dot)com> writes:
> This is my first post here, and to play it safe I chose the novice list. My problem is as follows:

> I have a simple SELECT command that, for various reasons, I need to execute around 22 million times. It is in fact a prepared command from a .NET environment. I have used both ADO.NET Postgres options (the free Npgsql and the commercial driver from Devart) and the behaviour is the same. Once I establish the connection with the corresponding connection object, a new process is created on the server, as expected. As I iterate, executing the SELECT command, the memory footprint of that process grows bigger and bigger over time. Sooner or later the server either collapses or becomes unusable, as physical memory is exhausted and the swap is heavily used. This is just a simple SELECT, with a data reader that I make sure to close after each iteration. I can find no reason why the connection's process on the server gets so out of control. Is it caching data? If so, I don't want it to. Do I need to close the connection every so often to free up the memory?

This problem is not resolvable with the amount of information you've
provided. Your description makes it sound like you've run into a
server-internal memory leak. While those are certainly possible, we
don't find one very often, which is why Alan was skeptical about whether
the memory growth was in the server process or not. It seems at least
as likely that you've misdiagnosed what's happening. Also, even if
there is a memory leak, a well-configured system will normally fail with
an "out of memory" error before it gets into the kind of distress that
you're describing.

So at this point I'm wondering which PG version you're running, what
non-default configuration settings you've selected, and how much RAM the
box has got. It might also be useful to see the specific query you're
running and the output that EXPLAIN gives for it.

regards, tom lane


From: Alan Hodgson <ahodgson(at)simkin(dot)ca>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-04 06:17:46
Message-ID: 201112032217.46928.ahodgson@simkin.ca
Lists: pgsql-novice

On Saturday, December 03, 2011 05:43:30 PM Ioannis Anagnostopoulos wrote:
> Hi, thank you for the answer. I am referring to the postgres process
> on the Linux server that corresponds to the connection I have established.
> If you run top you will see quite a few postgres processes; one of them
> belongs to the connection established by the client application. It is this
> process that grows, consuming memory.
>

Tom has posted a request for some follow-up information that might help track
this down.

One thought, though: is it possible you are creating a new prepared statement
for each query instead of re-using one, and then not releasing them? I did a
quick test, and creating millions of prepared statements will chew up a lot of
memory pretty quickly in the backend.
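As a sketch of that quick test (statement names here are hypothetical; feed the generated SQL into a single psql session to watch that one backend grow):

```shell
# Each PREPARE that is never DEALLOCATEd lives in the backend's session
# memory until the connection closes, so generating many distinct ones
# makes that one backend process grow.
n=5    # raise this to millions to reproduce the growth described above
for i in $(seq 1 "$n"); do
  printf 'PREPARE stmt_%d (bigint) AS SELECT $1;\n' "$i"
done
# Feed into one session, e.g.:  ... | psql yourdb
# Re-using ONE statement instead means: PREPARE once, EXECUTE in a loop.
```
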


From: "Ioannis Anagnostopoulos" <ioannis(at)anatec(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-novice(at)postgresql(dot)org>, <ahodgson(at)simkin(dot)ca>
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-04 09:38:28
Message-ID: B56C2BFC6806BE4A9C86ADDDD269933A6E8A0C@appserver.anatec.local
Lists: pgsql-novice

Hello all, and thanks again for the effort. Here are the details:
this is a dual-core machine with 8 GB of RAM running Ubuntu Server.
It runs Postgres 9.0 with its PostGIS extension. It is true that
Linux's out-of-memory (OOM) killer was killing the connection before,
as the connection's process was indeed consuming the memory. To fight this
we followed the usual advice and reduced shared_buffers to 25% of the
machine's physical memory (i.e. 2 GB), while we also set effective_cache_size to 6 GB.
After this we no longer run out of memory, but of course we still have the issue
described before.
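In postgresql.conf, the settings just described amount to (a sketch using the values stated above, not the attached configuration itself):

```ini
shared_buffers = 2GB           # 25% of the 8 GB of physical RAM
effective_cache_size = 6GB     # planner hint only; allocates no memory
```
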

Some background on this database: it is a "read-only" database which contains millions of
rows of shipping positions. The main database that receives the data
updates/adds about 15,000 rows per minute, with a processing time of 1,000 lines per 1.16 sec.
The "read-only" database we are working on holds monthly data, so you can get
an idea of how many rows its major tables contain.

I am not sure if I can, but I will try to attach my server's configuration for
you to take a look. I am also attaching the VB.NET code that is executed. In case
you are wondering about the code: you will have the same issue whether or not
the command is prepared. In fact the only way I have found to resolve the issue
is to close and open the connection for every call/instance of this class. But then,
as I have to query 22 million rows, my speed degrades (for example, this code
returns 1 row every 0.001 sec, while if I close and open the connection to get rid of the
memory issue it returns 1 row every 0.01 sec).

Thank you in advance
Yiannis


Attachment Content-Type Size
postgresql.zip application/zip 5.8 KB
QueryClass.zip application/zip 986 bytes

From: "Ioannis Anagnostopoulos" <ioannis(at)anatec(dot)com>
To: "Ioannis Anagnostopoulos" <ioannis(at)anatec(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-novice(at)postgresql(dot)org>, <ahodgson(at)simkin(dot)ca>
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-04 10:44:41
Message-ID: B56C2BFC6806BE4A9C86ADDDD269933A6E8A0F@appserver.anatec.local
Lists: pgsql-novice

Hello again, here is an extract of my top output:

top - 10:38:57 up 95 days, 1:34, 1 user, load average: 0.00, 0.01, 0.05
Tasks: 87 total, 2 running, 85 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.1%us, 0.0%sy, 0.0%ni, 99.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8070900k total, 2797004k used, 5273896k free, 46532k buffers
Swap: 7811068k total, 5336k used, 7805732k free, 2477508k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
18059 postgres 20 0 2186m 102m 99m S 0 1.3 0:03.19 postgres
17028 postgres 20 0 2180m 61m 60m S 0 0.8 0:08.06 postgres
11631 postgres 20 0 2181m 13m 13m S 0 0.2 0:15.20 postgres

PID 18059 is the connection's process. It is already at 1.3% of memory and its
RES is already 102m. If I let it run for the rest of the day, it will go up
to 96% of Mem, free physical memory will drop to 50 MB (from the 5 GB available at this stage),
and then it will start swapping, eventually exhausting even the swap.

Kind Regards
Yiannis



From: "Ioannis Anagnostopoulos" <ioannis(at)anatec(dot)com>
To: "Alan Hodgson" <ahodgson(at)simkin(dot)ca>, <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-04 11:08:35
Message-ID: B56C2BFC6806BE4A9C86ADDDD269933A6E8A10@appserver.anatec.local
Lists: pgsql-novice

Hello Alan,

From the code I posted I am pretty sure that my prepared statement is
a const/shared member, thus the same across all instances of the class. However,
as I said, you may try it with a non-prepared command; the result will be exactly the same.

What I tend to believe is that this is some kind of memory that is
held ON the server, via the connection, by the client application. That is why, when the client
exits, all memory on the server returns to normal. In the Java world I saw something similar when
the connection accumulated too many "server notices" that were not consumed by the client. These notices
had to be cleared manually in order to reduce the memory usage. To this day I have not seen anything similar
in the .NET world (in both flavours of ADO.NET driver, i.e. commercial and Npgsql).

Kind Regards
Yiannis



From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Ioannis Anagnostopoulos" <ioannis(at)anatec(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org, ahodgson(at)simkin(dot)ca
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-04 16:30:04
Message-ID: 23418.1323016204@sss.pgh.pa.us
Lists: pgsql-novice

"Ioannis Anagnostopoulos" <ioannis(at)anatec(dot)com> writes:
> I am not sure if I can but I will try to attach my server's configuration for
> you to take a look. I am also attaching the vb.net code that is
> executed.

I don't know much of anything about VB, but I am filled with suspicion
that something in the VB infrastructure you're using is not doing what
you think. In particular, I'm betting that repeated executions of this
code are in fact generating new prepared statements without deallocating
old ones. You could investigate that theory by setting log_statement =
all in the server configuration and then watching the server log to see
exactly what SQL commands are actually getting sent.
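Concretely, that suggestion is a one-line change in postgresql.conf, followed by a configuration reload (e.g. pg_ctl reload):

```ini
log_statement = 'all'    # log every SQL statement the server receives
```
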

regards, tom lane


From: "Ioannis Anagnostopoulos" <ioannis(at)anatec(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-novice(at)postgresql(dot)org>, <ahodgson(at)simkin(dot)ca>
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-04 17:02:48
Message-ID: B56C2BFC6806BE4A9C86ADDDD269933A6E8A12@appserver.anatec.local
Lists: pgsql-novice

Hi Tom,

I am pretty sure that, at least in my example, there is only ONE
prepared statement, created once, when the static variable is NULL/Nothing
during the first iteration. Thereafter the same prepared statement is
executed over and over again. As I said, it may well be a bug in the ADO.NET
driver. In any case I will set log_statement = all and report back with my
findings.

Kind Regards
Yiannis



From: Ioannis Anagnostopoulos <ioannis(at)anatec(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-novice(at)postgresql(dot)org, ahodgson(at)simkin(dot)ca
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-05 10:41:36
Message-ID: 4EDC9FE0.8030506@anatec.com
Lists: pgsql-novice

Below are the forum discussions I have opened with Devart (the commercial
ADO.NET driver) and with Npgsql (the open-source one):

http://www.devart.com/forums/viewtopic.php?p=76192#76192
http://pgfoundry.org/forum/forum.php?thread_id=10481&forum_id=519

As I say there, I managed to reduce my queries from 22 million to
55,000. The server in this scenario is not running out of memory,
but it is still apparent that memory consumption is high (8.9%!!). So if
20 people try to run the same query we are going to be back
to square one. Once more, let me assure you that there is ONLY one
prepared statement, created at the beginning and executed
over and over again. My question to the Postgres people is whether there is any
bug that shows up as a memory leak after a great many executions
of the same statement. My next test will be to convert my .NET code to
PL/pgSQL and execute it on the server without involving any client.

The top output below shows the "updated" query running. This time the load is at
8.9% right at the end of the run, nowhere near the 96%, but please
keep in mind that this happened after I reduced the iterations to 55,000.

top - 10:35:23 up 96 days, 1:30, 1 user, load average: 0.00, 0.01, 0.05
Tasks: 87 total, 1 running, 86 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.5%us, 0.0%sy, 0.0%ni, 98.5%id, 0.0%wa, 0.0%hi, 0.0%si,
0.0%st
Mem: 8070900k total, 8017768k used, 53132k free, 56800k buffers
Swap: 7811068k total, 4336k used, 7806732k free, 7671980k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
22181 postgres 20 0 2187m 701m 697m S 6 8.9 1:29.33 postgres

Kind Regards
Yiannis



From: Alan Hodgson <ahodgson(at)simkin(dot)ca>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-05 16:23:55
Message-ID: 201112050823.55438.ahodgson@simkin.ca
Lists: pgsql-novice

On Monday, December 05, 2011 02:41:36 AM Ioannis Anagnostopoulos wrote:
> The top output below shows the "updated" query running. This time the load is at
> 8.9% right at the end of the run, nowhere near the 96%, but please
> keep in mind that this happened after I reduced the iterations to 55,000.
>

Reduce it to 3 or 4. Turn on statement logging per Tom's request. Empty the
logs. Restart PostgreSQL. Do one run. Let's see everything PostgreSQL is
actually getting from the client.

Also ... please don't top-post.



From: "Lukas" <Lukas(at)fmf(dot)vtu(dot)lt>
To: pgsql-novice(at)postgresql(dot)org
Subject: Setting bigger OS cache on Windows server
Date: 2011-12-05 20:24:17
Message-ID: b0fefce790f945b4b234c38911dce399.squirrel@fmf.vtu.lt
Lists: pgsql-novice

Hello,

I found the following note in the PostgreSQL wiki
(http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server ):
Note that on Windows (and on PostgreSQL versions before 8.1), large values
for shared_buffers aren't as effective, and you may find better results
keeping it relatively low and using the OS cache more instead.

Can anyone tell me how the OS cache should be increased on a Windows server?

thx
Lukas


From: "ktm(at)rice(dot)edu" <ktm(at)rice(dot)edu>
To: Lukas <Lukas(at)fmf(dot)vtu(dot)lt>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Setting bigger OS cache on Windows server
Date: 2011-12-05 20:32:35
Message-ID: 20111205203235.GF28286@staff-mud-56-27.rice.edu
Lists: pgsql-novice

On Mon, Dec 05, 2011 at 10:24:17PM +0200, Lukas wrote:
> Hello,
>
> I found such note in wiki of Postgresql
> (http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server ):
> Note that on Windows (and on PostgreSQL versions before 8.1), large values
> for shared_buffers aren't as effective, and you may find better results
> keeping it relatively low and using the OS cache more instead.
>
> Can anyone tell how OS cache should be increased on windows server?
>
>
> thx
> Lukas
>

I think that, as on Linux, it is not a tunable setting. The OS will use
as much memory as it can while still meeting the memory resource needs
of the applications running on the machine. The upshot is -- buy more
memory. The take-away from the doc quoted above is to drop the amount
of shared_buffers for PostgreSQL, since you can control that.

Cheers,
Ken


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Ioannis Anagnostopoulos <ioannis(at)anatec(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-novice(at)postgresql(dot)org, ahodgson(at)simkin(dot)ca
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-05 21:13:47
Message-ID: CAHyXU0wM_u5yrWVc=Lak6U1WxKJvBR0uZLEFO=djtiaU=htpjQ@mail.gmail.com
Lists: pgsql-novice

On Mon, Dec 5, 2011 at 4:41 AM, Ioannis Anagnostopoulos
<ioannis(at)anatec(dot)com> wrote:
> The top output below shows the "updated" query running. This time the load is at
> 8.9% right at the end of the run, nowhere near the 96%, but please
> keep in mind that this happened after I reduced the iterations to 55,000.
>
> top - 10:35:23 up 96 days,  1:30,  1 user,  load average: 0.00, 0.01, 0.05
> Tasks:  87 total,   1 running,  86 sleeping,   0 stopped,   0 zombie
> Cpu(s):  1.5%us,  0.0%sy,  0.0%ni, 98.5%id,  0.0%wa,  0.0%hi,  0.0%si,
>  0.0%st
> Mem:   8070900k total,  8017768k used,    53132k free,    56800k buffers
> Swap:  7811068k total,     4336k used,  7806732k free,  7671980k cached
>
>
>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 22181 postgres  20   0 2187m 701m 697m S    6  8.9   1:29.33 postgres

You are misinterpreting what top is telling you (it's a pretty common
mistake). As you can see, most of the memory postgres is using is
counted under SHR, i.e. shared memory that is shared between all backend
processes.
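To put numbers on that using the top line quoted above (a rough sketch; the `m` suffixes on RES and SHR mean MiB, and private memory is approximated as RES minus SHR):

```shell
# Approximate a backend's private memory as RES - SHR.
# Line copied from the top output in this thread; awk's numeric coercion
# drops the trailing 'm' on each field before subtracting.
line='22181 postgres  20   0 2187m 701m 697m S    6  8.9   1:29.33 postgres'
echo "$line" | awk '{printf "private ~ %d MiB of %d MiB resident\n", $6 - $7, $6}'
# prints: private ~ 4 MiB of 701 MiB resident
```

So only about 4 MiB of that backend's 701 MiB resident size is memory of its own; the rest is the shared_buffers segment it has touched.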

What have you set shared_buffers to? My money is on you having
oversubscribed your box.

merlin


From: "Lukas" <Lukas(at)fmf(dot)vtu(dot)lt>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: Setting bigger OS cache on Windows server
Date: 2011-12-05 22:57:55
Message-ID: ea3dbecec1be1d14f53427b3d17d1fee.squirrel@fmf.vtu.lt
Lists: pgsql-novice

Hello,

well, according to the document it is not like that:
"It's likely you will have to increase the amount of memory your operating
system allows you to allocate at once..."
I understand this as meaning I need to set that value somewhere in the OS..?
My problem is that our database is big, the server has 4 GB of RAM, and
about 70% of that RAM is free... At the same time I have some 20-30 busy
connections to Postgres, and it works slowly... To me this seems like a tuning
problem..?

Lukas

> On Mon, Dec 05, 2011 at 10:24:17PM +0200, Lukas wrote:
>> Hello,
>>
>> I found such note in wiki of Postgresql
>> (http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server ):
>> Note that on Windows (and on PostgreSQL versions before 8.1), large
>> values
>> for shared_buffers aren't as effective, and you may find better results
>> keeping it relatively low and using the OS cache more instead.
>>
>> Can anyone tell how OS cache should be increased on windows server?
>>
>>
>> thx
>> Lukas
>>
>
> I think that like Linux it is not a tunable setting. The OS will use
> as much memory as it can while still meeting the memory resource needs
> of the applications running on the machine. The upshot is -- buy more
> memory. The take-away from the above doc quote is to drop the amount
> of shared_buffers for PostgreSQL, since you can control that.
>
> Cheers,
> Ken
>
> --
> Sent via pgsql-novice mailing list (pgsql-novice(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-novice
>


From: Ioannis Anagnostopoulos <ioannis(at)anatec(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-novice(at)postgresql(dot)org, ahodgson(at)simkin(dot)ca
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-06 09:14:23
Message-ID: 4EDDDCEF.5080903@anatec.com
Lists: pgsql-novice

On 05/12/2011 21:13, Merlin Moncure wrote:
>
> On Mon, Dec 5, 2011 at 4:41 AM, Ioannis Anagnostopoulos
> <ioannis(at)anatec(dot)com> wrote:
> > Below is the forum discussion I have opened with Devart's ado.net
> driver and
> > the NPGSQL (opensource) driver.
> >
> > http://www.devart.com/forums/viewtopic.php?p=76192#76192
> > http://pgfoundry.org/forum/forum.php?thread_id=10481&forum_id=519
> <http://pgfoundry.org/forum/forum.php?thread_id=10481&forum_id=519>
> >
> > As I say there, I managed to reduce my queries to 55.000 from 22
> million.
> > The server in this scenario is not running out of memory
> > but it is still apparent that memory consumption is high (8.9%!!).
> So if 20
> > people try to run the same query we are going to be back
> > in square 1. For one more time let me assure that there is ONLY one
> prepared
> > statement that is created at the beginning and is executed
> > over and over again. My question to postgres people is IF there is
> any bug
> > that becomes apparent as a memory leak after a lot of executions
> > of the same statement. My next test will be to convert my .net code to
> > PGSql and execute it on the server without involving any client.
> >
> > The top below show the "updated" query running. This time the load is at
> > 8.9% right at the end of the run. Nowhere near the 96% but please
> > keep in mind that this happened when I reduced to iterations to 55.000.
> >
> > top - 10:35:23 up 96 days, 1:30, 1 user, load average: 0.00,
> 0.01, 0.05
> > Tasks: 87 total, 1 running, 86 sleeping, 0 stopped, 0 zombie
> > Cpu(s): 1.5%us, 0.0%sy, 0.0%ni, 98.5%id, 0.0%wa, 0.0%hi, 0.0%si,
> > 0.0%st
> > Mem: 8070900k total, 8017768k used, 53132k free, 56800k buffers
> > Swap: 7811068k total, 4336k used, 7806732k free, 7671980k cached
> >
> >
> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> > 22181 postgres 20 0 2187m 701m 697m S 6 8.9 1:29.33 postgres
>
> You are misinterpreting what top is telling you (it's a pretty common
> mistake). Most of the memory postgres is using as you can see is
> inside SHR, or is shared memory that is shared between all backend
> processes.
>
Hello Merlin, thank you for the answer. Well, the example that you quote
is from a sunshine scenario where I had reduced the iterations to
something more down to earth and the process concluded. However, if you
follow the other examples I listed in an earlier post you will see cases
like this:

top - 11:46:05 up 85 days,  2:41,  1 user,  load average: 0.67, 1.03, 1.19
Tasks:  89 total,   1 running,  88 sleeping,   0 stopped,   0 zombie
Cpu(s): 10.3%us,  0.7%sy,  0.0%ni, 69.8%id, 19.2%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8070900k total,  8017232k used,    53668k free,     1508k buffers
Swap:  7811068k total,  4283720k used,  3527348k free,  2088528k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
15965 postgres  20   0 12.9g 6.4g 1.4g S   11 83.4  13:59.15 postgres
15961 postgres  20   0 4285m 1.8g 1.8g D   31 23.2   9:35.03 postgres
15827 postgres  20   0 4276m  52m  51m S    0  0.7   0:00.50 postgres
15830 postgres  20   0 4278m  25m  24m S    0  0.3   0:01.27 postgres
15959 postgres  20   0 4288m 4376 4288 S    0  0.1   0:00.25 postgres
15832 postgres  20   0 4279m 1388  888 S    0  0.0   0:00.19 postgres

This is an extract from top and below is an extract of iostat:

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda             158.00      1802.00      1602.00       3604       3204
sdb              70.50       590.00        40.00       1180         80
sdc              16.00       528.00         0.00       1056          0

Please note that process 15965 is my connection process. In that example
we have:

VIRT = 12.9G
RES  = 6.4G
SHR  = 1.4G

And swap, of course, is almost exhausted. Maybe you are right, but in the
light of this example can you provide any further explanation?

> What have you set shared_buffers to? My money is on you having
> oversubscribed your box.
>
> merlin
>
shared_buffers = 2GB, and according to
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
I am spot on, since my machine has 8GB of RAM.
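For the record, the sizing guideline behind that claim is easy to check (a sketch of the rule of thumb only; 25% of RAM is a starting point, not a hard rule):

```python
# Common starting-point guideline from the tuning wiki: on a dedicated
# server, set shared_buffers to roughly 25% of physical RAM.

ram_gb = 8                        # this server's RAM
shared_buffers_gb = ram_gb * 0.25
print(shared_buffers_gb)          # 2.0 -> matches the 2GB setting
```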

Thank you
Yiannis


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Ioannis Anagnostopoulos <ioannis(at)anatec(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-novice(at)postgresql(dot)org, ahodgson(at)simkin(dot)ca
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-06 17:02:02
Message-ID: CAHyXU0z+xjcW-7nDj4hpZEo2pX7MQajeZ_NYVjiMpF4RoQK5TA@mail.gmail.com
Lists: pgsql-novice

On Tue, Dec 6, 2011 at 3:14 AM, Ioannis Anagnostopoulos
<ioannis(at)anatec(dot)com> wrote:
> On 05/12/2011 21:13, Merlin Moncure wrote:
>
> On Mon, Dec 5, 2011 at 4:41 AM, Ioannis Anagnostopoulos
> <ioannis(at)anatec(dot)com> wrote:
>> Below is the forum discussion I have opened with Devart's ado.net driver
>> and
>> the NPGSQL (opensource) driver.
>>
>> http://www.devart.com/forums/viewtopic.php?p=76192#76192
>> http://pgfoundry.org/forum/forum.php?thread_id=10481&forum_id=519
>>
>> As I say there, I managed to reduce my queries to 55.000 from 22 million.
>> The server in this scenario is not running out of memory
>> but it is still apparent that memory consumption is high (8.9%!!). So if
>> 20
>> people try to run the same query we are going to be back
>> in square 1. For one more time let me assure that there is ONLY one
>> prepared
>> statement that is created at the beginning and is executed
>> over and over again. My question to postgres people is IF there is any bug
>> that becomes apparent as a memory leak after a lot of executions
>> of the same statement.  My next test will be to convert my .net code to
>> PGSql and execute it on the server without involving any client.
>>
>> The top below show the "updated" query running. This time the load is at
>> 8.9% right at the end of the run. Nowhere near the 96% but please
>> keep in mind that this happened when I reduced to iterations to 55.000.
>>
>> top - 10:35:23 up 96 days,  1:30,  1 user,  load average: 0.00, 0.01, 0.05
>> Tasks:  87 total,   1 running,  86 sleeping,   0 stopped,   0 zombie
>> Cpu(s):  1.5%us,  0.0%sy,  0.0%ni, 98.5%id,  0.0%wa,  0.0%hi,  0.0%si,
>>  0.0%st
>> Mem:   8070900k total,  8017768k used,    53132k free,    56800k buffers
>> Swap:  7811068k total,     4336k used,  7806732k free,  7671980k cached
>>
>>
>>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> 22181 postgres  20   0 2187m 701m 697m S    6  8.9   1:29.33 postgres
>
> You are misinterpreting what top is telling you (it's a pretty common
> mistake).  Most of the memory postgres is using as you can see is
> inside SHR, or is shared memory that is shared between all backend
> processes.
>
> Hello Merlin, thank you for the answer. Well the example that you quote is
> from
> a sunshine scenario were I had reduced the iteration to something more down
> to earth
> and the process concluded. However if you followed the other examples I
> listed in earlier
> post you will see cases like this:
>
> top - 11:46:05 up 85 days, 2:41, 1 user, load average: 0.67, 1.03, 1.19
> Tasks: 89 total, 1 running, 88 sleeping, 0 stopped, 0 zombie
> Cpu(s): 10.3%us, 0.7%sy, 0.0%ni, 69.8%id, 19.2%wa, 0.0%hi, 0.0%si, 0.0%st
> Mem: 8070900k total, 8017232k used, 53668k free, 1508k buffers
> Swap: 7811068k total, 4283720k used, 3527348k free, 2088528k cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 15965 postgres 20 0 12.9g 6.4g 1.4g S 11 83.4 13:59.15 postgres
> 15961 postgres 20 0 4285m 1.8g 1.8g D 31 23.2 9:35.03 postgres
> 15827 postgres 20 0 4276m 52m 51m S 0 0.7 0:00.50 postgres
> 15830 postgres 20 0 4278m 25m 24m S 0 0.3 0:01.27 postgres
> 15959 postgres 20 0 4288m 4376 4288 S 0 0.1 0:00.25 postgres
> 15832 postgres 20 0 4279m 1388 888 S 0 0.0 0:00.19 postgres
>
> This is an extract from top and below is an extract of iostat:
>
> Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
> sda 158.00 1802.00 1602.00 3604 3204
> sdb 70.50 590.00 40.00 1180 80
> sdc 16.00 528.00 0.00 1056 0
>
> Please note that process 15965 is my connection process. In that example we
> have
> VIRT = 12.9G
> RES = 6.4G
> SHR=1.4G

Interesting. Well, in your 'sunshine scenario', postgres non-shared
memory usage is basically zero -- growth of memory consumption in SHR
does not in any way suggest runaway growth -- so memory consumption is
not in fact 8.9% but < 1%, which is a completely normal memory
profile. This also tends to rule out incremental memory consumption
issues, which are a general class of problem when you are dealing with
huge numbers of system objects, like prepared statements, tables,
schemas, etc (but not records).

This means that something changed in your bigger job that caused
memory consumption to go completely out of whack... 5 gigs of resident
memory is definitely not normal. Given the above, my first suspicion
is work_mem -- what's it set to? Also, are you using any third-party
server-side modules (like PostGIS)? C functions? pl/java? A more
unlikely but possible scenario is an absolutely pathological query
plan that has gone totally haywire.

What we would need to know is exactly what is happening at the precise
point in time your resident memory starts running away from SHR.
Helpful information might include:

*) if your program is continually processing queries or is stuck at a
single query (pg_stat_activity system view is your friend here)

*) if we luck out and a single query is doing it, we'd need to see the
query and the plan (explain the query).

*) your setting for work_mem and maintenance_work_mem

*) anything else that's interesting...large numbers of tables/views/etc?

*) You may want to consider changing your VM overcommit settings
and/or reducing swap in order to get your server to more aggressively
return OOM to postgres memory allocations. The specific error returned
to postgres for an OOM would of course be very helpful.

*) Using the binary protocol? custom C types? A forged/invalid datum
size can cause postgres to grab a lot of memory (data corruption can
also do this) -- but it would take more than one consecutive one to do
it.

merlin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Ioannis Anagnostopoulos <ioannis(at)anatec(dot)com>, pgsql-novice(at)postgresql(dot)org, ahodgson(at)simkin(dot)ca
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-06 17:10:46
Message-ID: 22700.1323191446@sss.pgh.pa.us
Lists: pgsql-novice

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> *) You may want to consider changing your vm over commit settings
> and/or reducing swap in order to get your server to more aggressively
> return OOM to postgres memory allocation. The specific error returned
> to postgres for an OOM of course would be very helpful.

Yeah. I would try starting the postmaster under smaller ulimit settings
so that the kernel gives it ENOMEM before you start getting swapped.
When that happens, the backend will dump a memory map into the
postmaster log that would be very useful for seeing what is actually
happening here.
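The ulimit idea can be sketched from any process (illustrative cap, not a recommended postmaster setting; for the postmaster you would set e.g. `ulimit -v` in its startup environment):

```python
# Cap this process's address space so an oversized allocation fails
# with MemoryError (ENOMEM) instead of driving the machine into swap.

import resource

CAP = 512 * 1024 * 1024  # 512 MB, arbitrary for the demo

soft, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (CAP, hard))

try:
    hog = bytearray(1024 * 1024 * 1024)  # ask for 1 GB
except MemoryError:
    # A postgres backend in this situation dumps a memory-context map
    # to the postmaster log, showing where the memory actually went.
    print("allocation refused before swapping")
```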

regards, tom lane


From: Ioannis Anagnostopoulos <ioannis(at)anatec(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, pgsql-novice(at)postgresql(dot)org, ahodgson(at)simkin(dot)ca
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-07 09:24:38
Message-ID: 4EDF30D6.20705@anatec.com
Lists: pgsql-novice

On 06/12/2011 17:10, Tom Lane wrote:
> Merlin Moncure<mmoncure(at)gmail(dot)com> writes:
>> *) You may want to consider changing your vm over commit settings
>> and/or reducing swap in order to get your server to more aggressively
>> return OOM to postgres memory allocation. The specific error returned
>> to postgres for an OOM of course would be very helpful.
> Yeah. I would try starting the postmaster under smaller ulimit settings
> so that the kernel gives it ENOMEM before you start getting swapped.
> When that happens, the backend will dump a memory map into the
> postmaster log that would be very useful for seeing what is actually
> happening here.
>
> regards, tom lane
>
Hello all,

I think I have solved the problem. Many thanks for the support and the
time you spent. The solution/bug/problem is as follows:

1. There was one connection that, as I described, was used IN A LOOP 22
million times. This connection was assigned PID x (on the Linux server).
2. Nested within this LOOP there was another connection that had been
left over from past code, and the Linux server assigned it PID y.
3. PID y was of course also called 22 million times (since it was in the
loop). However, it had a nasty bug: it was constantly creating
prepared commands! (Oops, my mistake.) So PID y was creating 22 million
prepared commands!
4. As I had no clue that PID y existed at all, when monitoring top on
the server I was presented with the misbehaving PID y but was under the
impression that it was PID x. In fact PID x was further down the list,
happily doing its own job.
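The mechanism in step 3 can be sketched with a mock session cache (names made up, not a real driver API): every prepare() under a fresh name adds an object the server keeps for the life of the connection, so preparing inside the loop accumulates millions of statements while preparing once does not:

```python
# Mock of a backend's per-session prepared-statement cache.

class MockSession:
    def __init__(self):
        self.prepared = {}              # name -> statement text

    def prepare(self, name, sql):
        self.prepared[name] = sql       # kept until the session closes

    def execute(self, name):
        return self.prepared[name]      # reuses the cached statement

# Buggy pattern: a freshly named prepared statement per iteration.
buggy = MockSession()
for i in range(1000):
    buggy.prepare(f"stmt_{i}", "SELECT ...")

# Correct pattern: prepare once, execute many times.
ok = MockSession()
ok.prepare("stmt", "SELECT ...")
for _ in range(1000):
    ok.execute("stmt")

print(len(buggy.prepared), len(ok.prepared))  # 1000 1
```

Scale the buggy loop up to 22 million and you get exactly the kind of RES growth seen on PID y.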

So the healthy PID x had a top signature as follows (please note the
difference between RES and SHR, as well as the magnitudes being in MB, as
Merlin suggested):
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
30475 postgres  20   0 2187m 746m 741m S   31  9.5   0:41.48 postgres

While the unhealthy PID y had this top signature (please note that VIRT
is at 12.9g and RES at 6.4g, against only 1.4g of SHR, with the
magnitudes now in GB!):
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
15965 postgres  20   0 12.9g 6.4g 1.4g S   11 83.4  13:59.15 postgres

As I said, I had no clue about the existence of PID y, and since it was
coming top of the top list I had wrongly assumed it was PID x. It gets
more complicated: the test code I sent you, which should have been
working fine as it had no nested buggy loop, was mainly run from home
over the DSL line, so I never let it conclude its 22 million iterations
(it would still be running!). Instead I was monitoring top, and since
the memory was going UP I wrongly assumed that I had the same issue (if
I had let it run for 2-3 hours I would have noticed what Merlin
suggested about the RES/SHR ratio). So it was a misdiagnosis after all :)

I hope this explains everything.
Kind Regards and sorry for the misunderstanding.
Yiannis


From: Pandu Poluan <pandu(at)poluan(dot)info>
To: Ioannis Anagnostopoulos <ioannis(at)anatec(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, ahodgson(at)simkin(dot)ca, pgsql-novice(at)postgresql(dot)org, Merlin Moncure <mmoncure(at)gmail(dot)com>
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-07 11:23:56
Message-ID: CAA2qdGUQx8ZpaHbCwre0yMEmOX-OnA4BVm86yEQO2OT7O37AMA@mail.gmail.com
Lists: pgsql-novice

On Dec 7, 2011 4:26 PM, "Ioannis Anagnostopoulos" <ioannis(at)anatec(dot)com>
wrote:
>
> On 06/12/2011 17:10, Tom Lane wrote:
>>
>> Merlin Moncure<mmoncure(at)gmail(dot)com> writes:
>>>
>>> *) You may want to consider changing your vm over commit settings
>>> and/or reducing swap in order to get your server to more aggressively
>>> return OOM to postgres memory allocation. The specific error returned
>>> to postgres for an OOM of course would be very helpful.
>>
>> Yeah. I would try starting the postmaster under smaller ulimit settings
>> so that the kernel gives it ENOMEM before you start getting swapped.
>> When that happens, the backend will dump a memory map into the
>> postmaster log that would be very useful for seeing what is actually
>> happening here.
>>
>> regards, tom lane
>>
> Hello all,
>
> I think I have solved the problem. Many thanks for the support and the
time you spend. The solution/bug/problem is as follows:
>
> 1. There was one connection that as I described was used IN A LOOP
22million times. This connection was assigned a PID x (on the linux server)
> 2. Nested within this LOOP there was another connection that had been
forgotten from past code and the linux server was assigning to it a PID y
> 3. PID y was of course called also 22million times (since it was in the
loop). However it had a nasty bug and it was creating constantly prepared
commands! (opps my mistake). So PID y was creating 22million prepared
commands!
> 4. As I had no clue that that there was at all PID y, monitoring the TOP
on the server I was presented with the misbehaving PID y but I was of the
impression that it was PID x. In fact PID x was below in the list happy
doing its own job.
>
> So the healthy PID X had a top signature as follows (please note the
difference between RES and SHR as well as the magnitude in Mb as Merlin
suggested):
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 30475 postgres 20 0 2187m 746m 741m S 31 9.5 0:41.48 postgres
>
> While the unhealthy PID Y had a TOP signature (please note that RES
memory is at 12.9g! and SHR 1.4g as well as the magnitude in Gb!):
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 15965 postgres 20 0 12.9g 6.4g 1.4g S 11 83.4 13:59.15 postgres
>
> As I said I had no clue about the existence of PID Y and since it was
coming top at the TOP list I had wrongfully assumed that it was the PID X.
It gets more complicated by the fact that the test code I sent you, which
should have been working fine as it had no nested buggy loop, was mainly
running from home over the DSL line thus I never let it conclude its
22million iterations (it would have been still running!) instead I was
monitoring the TOP and since the memory was going UP I was wrongfully
assuming that I had the same issue (if I had let it run for 2 -3 hours I
would have noticed what Merlin suggested about RES/SHR ratio). So it was a
misdiagnosis after all :)
>
> I hope this explains everything.
> Kind Regards and sorry for the misunderstanding.

All's well that ends well.

May I ask how you finally found the culprit? Through profiling or code
review?

That said, I can't help admiring that PostgreSQL survives 22 million
prepared commands without committing suicide. Yes, it's severely
impacted, but it still survives.

Rgds,


From: Ioannis Anagnostopoulos <ioannis(at)anatec(dot)com>
To: Pandu Poluan <pandu(at)poluan(dot)info>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, ahodgson(at)simkin(dot)ca, pgsql-novice(at)postgresql(dot)org, Merlin Moncure <mmoncure(at)gmail(dot)com>
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-07 12:14:21
Message-ID: 4EDF589D.7060003@anatec.com
Lists: pgsql-novice

On 07/12/2011 11:23, Pandu Poluan wrote:
>
>
> On Dec 7, 2011 4:26 PM, "Ioannis Anagnostopoulos" <ioannis(at)anatec(dot)com
> <mailto:ioannis(at)anatec(dot)com>> wrote:
> >
> > On 06/12/2011 17:10, Tom Lane wrote:
> >>
> >> Merlin Moncure<mmoncure(at)gmail(dot)com <mailto:mmoncure(at)gmail(dot)com>> writes:
> >>>
> >>> *) You may want to consider changing your vm over commit settings
> >>> and/or reducing swap in order to get your server to more aggressively
> >>> return OOM to postgres memory allocation. The specific error returned
> >>> to postgres for an OOM of course would be very helpful.
> >>
> >> Yeah. I would try starting the postmaster under smaller ulimit
> settings
> >> so that the kernel gives it ENOMEM before you start getting swapped.
> >> When that happens, the backend will dump a memory map into the
> >> postmaster log that would be very useful for seeing what is actually
> >> happening here.
> >>
> >> regards, tom lane
> >>
> > Hello all,
> >
> > I think I have solved the problem. Many thanks for the support and
> the time you spend. The solution/bug/problem is as follows:
> >
> > 1. There was one connection that as I described was used IN A LOOP
> 22million times. This connection was assigned a PID x (on the linux
> server)
> > 2. Nested within this LOOP there was another connection that had
> been forgotten from past code and the linux server was assigning to it
> a PID y
> > 3. PID y was of course called also 22million times (since it was in
> the loop). However it had a nasty bug and it was creating constantly
> prepared commands! (opps my mistake). So PID y was creating 22million
> prepared commands!
> > 4. As I had no clue that that there was at all PID y, monitoring the
> TOP on the server I was presented with the misbehaving PID y but I was
> of the impression that it was PID x. In fact PID x was below in the
> list happy doing its own job.
> >
> > So the healthy PID X had a top signature as follows (please note the
> difference between RES and SHR as well as the magnitude in Mb as
> Merlin suggested):
> >
> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> > 30475 postgres 20 0 2187m 746m 741m S 31 9.5 0:41.48 postgres
> >
> > While the unhealthy PID Y had a TOP signature (please note that RES
> memory is at 12.9g! and SHR 1.4g as well as the magnitude in Gb!):
> >
> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> > 15965 postgres 20 0 12.9g 6.4g 1.4g S 11 83.4 13:59.15 postgres
> >
> > As I said I had no clue about the existence of PID Y and since it
> was coming top at the TOP list I had wrongfully assumed that it was
> the PID X. It gets more complicated by the fact that the test code I
> sent you, which should have been working fine as it had no nested
> buggy loop, was mainly running from home over the DSL line thus I
> never let it conclude its 22million iterations (it would have been
> still running!) instead I was monitoring the TOP and since the memory
> was going UP I was wrongfully assuming that I had the same issue (if I
> had let it run for 2 -3 hours I would have noticed what Merlin
> suggested about RES/SHR ratio). So it was a misdiagnosis after all :)
> >
> > I hope this explains everything.
> > Kind Regards and sorry for the misunderstanding.
>
> All's well that ends well.
>
> May I ask how you finally found the culprit? Through profiling or code
> review?
>
> That said, I can't help admiring that PostgreSQL still survives 22
> million prepared commands without committing suicide. Yes, it's
> severely impacted, but still survives.
>
> Rgds,
>

When I compared the "demo program" (the one I thought was not working,
because I was not letting it run for long enough) with the actual code
base, I realised that the actual code base was running much slower than
the demo, even though to my mind they were doing the same things. It
then took me a few "step into"s to find, somewhere deep, the nasty
.prepare() which was placed in a function that was once declared static
and was now being called for every object created (22 million of them).
In fact it is a little bit more complicated, as I was in the middle of a
major refactoring, reducing the 22 million iterations to 55,000 by
grouping and creating a few prepared statements instead of executing a
command 22 million times. So what I am still not sure about is what will
happen if I execute 22 million non-prepared statements: will it crash,
or will it survive now that I have removed the buggy nested "prepare"s?

Regards


From: Alan Hodgson <ahodgson(at)simkin(dot)ca>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-07 20:26:13
Message-ID: 201112071226.13291.ahodgson@simkin.ca
Lists: pgsql-novice

On Wednesday, December 07, 2011 04:14:21 AM Ioannis Anagnostopoulos wrote:
>. So what I am not sure
> still is what will happen if I execute 22million times non-prepared
> statements, will it crash or will it survive since I removed the buggy
> "prepared" that were nested.

PostgreSQL? It'll happily serve up queries for years if you aren't stacking up
prepared query objects in your session :p


From: "Jean-Yves F(dot) Barbier" <12ukwn(at)gmail(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: Linux server connection process consumes all memory
Date: 2011-12-07 20:40:23
Message-ID: 20111207214023.4cd06deb@anubis.defcon1
Lists: pgsql-novice

On Wed, 07 Dec 2011 12:14:21 +0000
Ioannis Anagnostopoulos <ioannis(at)anatec(dot)com> wrote:

> So what I am not sure
> still is what will happen if I execute 22million times non-prepared
> statements, will it crash or will it survive since I removed the buggy
> "prepared" that were nested.

It would take "some" time but it wouldn't crash at all; just avoid
doing this kind of exercise daily on an SSD, it wouldn't like it.

Think of Pg as the same thing as Oracle, except it doesn't try to
hijack your assets each time you have a question to ask about it ;)

--
If you are a police dog, where's your badge?
-- Question James Thurber used to drive his German Shepherd crazy.