Postgres DB crashing

From: bhanu udaya <udayabhanu1984(at)hotmail(dot)com>
To: Kevin Grittner <kgrittn(at)mail(dot)com>, Adrian Klaver <adrian(dot)klaver(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>, "pgadmin-support(at)postgresql(dot)org" <pgadmin-support(at)postgresql(dot)org>
Cc: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Chris Travers <chris(dot)travers(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Postgres DB crashing
Date: 2013-06-18 17:31:31
Message-ID: COL127-W252BAA66380F06037FA842D38C0@phx.gbl
Lists: pgadmin-support pgsql-general

Hello,
Greetings.
My PostgreSQL (9.2) server is crashing during load tests. It currently crashes when 800 to 1000 threads run simultaneously against a schema with 10 million records. I am not sure whether we need to tweak more PostgreSQL parameters. PostgreSQL is configured as below on a machine with 7 GB of RAM and an Intel Xeon E5507 CPU at 2.27 GHz. Is it a PostgreSQL limitation to support only 800 threads, or is some other configuration required? Please look at the log below for the errors. Please reply.




Parameter                       Value
max_connections                 5000
shared_buffers                  2024 MB
synchronous_commit              off
wal_buffers                     100 MB
wal_writer_delay                1000ms
checkpoint_segments             512
checkpoint_timeout              5 min
checkpoint_completion_target    0.5
checkpoint_warning              30s
work_mem                        1G
effective_cache_size            5 GB
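
To put the settings above next to the 7 GB of RAM, here is a rough Python sketch of the arithmetic. It assumes, pessimistically, that every session runs one sort or hash that uses its full per-operation work memory (work_mem in postgresql.conf) at the same moment. In practice that memory is only used when an operation needs it. The function name worst_case_mb and the ops_per_session parameter are made up for illustration, and the numbers are an upper bound, not a measurement or a diagnosis of the PANIC in the log.

```python
# Rough upper bound on memory demand from the settings in this message.
# All sizes are in MB. Assumption: each session runs `ops_per_session`
# work_mem-sized operations at once, which is deliberately pessimistic.

MB = 1
GB = 1024 * MB

settings_mb = {
    "shared_buffers": 2024 * MB,  # value as given in the message
    "wal_buffers": 100 * MB,
    "work_mem": 1 * GB,           # per sort/hash operation, per session
}
ram_mb = 7 * GB


def worst_case_mb(sessions, ops_per_session=1):
    """Shared memory plus private work memory for `sessions` busy sessions."""
    shared = settings_mb["shared_buffers"] + settings_mb["wal_buffers"]
    private = sessions * ops_per_session * settings_mb["work_mem"]
    return shared + private


if __name__ == "__main__":
    for sessions in (800, 1000, 5000):
        need = worst_case_mb(sessions)
        print(f"{sessions:>5} sessions: up to {need / GB:,.0f} GB "
              f"against {ram_mb / GB:.0f} GB of RAM")
```

Under these assumptions, even the 800-thread case could ask for far more memory than the machine has. That makes the connection count and the per-operation work memory two settings worth examining together.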




2013-06-11 15:11:17 GMT [26201]: [1-1]ERROR: canceling autovacuum task

2013-06-11 15:11:17 GMT [26201]: [2-1]CONTEXT: automatic vacuum of table "newrelic.tenant1.customer"

2013-06-11 15:11:17 GMT [25242]: [1-1]LOG: sending cancel to blocking autovacuum PID 26201

2013-06-11 15:11:17 GMT [25242]: [2-1]DETAIL: Process 25242 waits for ExclusiveLock on extension of relation 679054 of database 666546.

2013-06-11 15:11:17 GMT [25242]: [3-1]STATEMENT: UPDATE tenant1.customer SET lastmodifieddate = $1 WHERE id IN ( select random_range((select min(id) from tenant1.customer ), (select max(id) from tenant1.customer )) as id ) AND softdeleteflag IS NOT TRUE

2013-06-11 15:11:17 GMT [25242]: [4-1]WARNING: could not send signal to process 26201: No such process

2013-06-11 15:22:29 GMT [22229]: [11-1]WARNING: worker took too long to start; canceled

2013-06-11 15:24:10 GMT [26511]: [1-1]WARNING: autovacuum worker started without a worker entry

2013-06-11 16:03:33 GMT [23092]: [1-1]LOG: could not receive data from client: Connection timed out

2013-06-11 16:06:05 GMT [23222]: [5-1]LOG: could not receive data from client: Connection timed out

2013-06-11 16:07:06 GMT [26869]: [1-1]FATAL: canceling authentication due to timeout

2013-06-11 16:23:16 GMT [25128]: [1-1]LOG: could not receive data from client: Connection timed out

2013-06-11 16:23:20 GMT [25128]: [2-1]LOG: unexpected EOF on client connection with an open transaction

2013-06-11 16:30:56 GMT [23695]: [1-1]LOG: could not receive data from client: Connection timed out

2013-06-11 16:43:55 GMT [24618]: [1-1]LOG: could not receive data from client: Connection timed out

2013-06-11 16:44:29 GMT [25204]: [1-1]LOG: could not receive data from client: Connection timed out

2013-06-11 16:54:14 GMT [22226]: [1-1]PANIC: stuck spinlock (0x2aaab54279d4) detected at bufmgr.c:1239

2013-06-11 16:54:14 GMT [32521]: [8-1]LOG: checkpointer process (PID 22226) was terminated by signal 6: Aborted

2013-06-11 16:54:14 GMT [32521]: [9-1]LOG: terminating any other active server processes

2013-06-11 16:54:14 GMT [26931]: [1-1]WARNING: terminating connection because of crash of another server process

2013-06-11 16:54:14 GMT [26931]: [2-1]DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

2013-06-11 16:54:14 GMT [26931]: [3-1]HINT: In a moment you should be able to reconnect to the database and repeat your command.

2013-06-11 16:54:14 GMT [26401]: [1-1]WARNING: terminating connection because of crash of another server process

2013-06-11 16:54:14 GMT [26401]: [2-1]DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

2013-06-11 16:55:08 GMT [27579]: [1-1]FATAL: the database system is in recovery mode

2013-06-11 16:55:08 GMT [24041]: [1-1]WARNING: terminating connection because of crash of another server process

2013-06-11 16:55:08 GMT [24041]: [2-1]DETAIL: The postmaster has commanded this server process to roll back the current
