performance sol10 zone (fup)

Lists: pgsql-performance
From: "Heiko L(dot)" <heikol(at)fh-lausitz(dot)de>
To: pgsql-performance(at)postgresql(dot)org
Subject: performance sol10 zone (fup)
Date: 2010-08-04 07:16:09
Message-ID: 46496.192.168.5.110.1280906169.squirrel@webmail.fh-lausitz.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hallo,

Im running pg-8,pgpoolII on sol10-zone.

After update sol10u7, queries on coltype timestamp are very slow.
System: sparc, 2GB RAM

This DB is a greylist-DB to fight spam.
500 connections should be easy.
But 16 connection consum 10sec/query.
On another system (sparc) only 1 sec.i

s. details

howto diag?

regards heiko

- details

$ /opt/csw/postgresql/bin/postmaster -V
postgres (PostgreSQL) 8.3.1

- numDS: 200k-800k
- conn: 50-500
- max age: 2d

- testbed
tab=blacklist
filter="^ .*[0-9]|elaps"
cmd="/usr/bin/time psql -t -p 5432 -h 192.168.5.126 -U smtpuser smtp"
sql="select count(*) from $tab";
sql1="select count(*) from $tab where create_time > abstime(int4(timenow()) -30
00)";
sql2="select count(*) from $tab where create_time < abstime(int4(timenow()) -30
00)";
echo "$sql" | $cmd 2>&1 | egrep "$filter";
echo "$sql1" | $cmd 2>&1 | egrep "$filter";
echo "$sql2" | $cmd 2>&1 | egrep "$filter";

- result:
152603
0.01user 0.01system 0:00.29elapsed 10%CPU (0avgtext+0avgdata 0maxresident)k
0
0.02user 0.01system 0:00.05elapsed 58%CPU (0avgtext+0avgdata 0maxresident)k
152603
0.00user 0.02system 0:01.95elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k

- select without where: 150000 / 0.3 s
- select with where: 150000 / 2 s
- time is depended on recv DS (linear)

I simulate any parallel queries:
$ ./sqltestcon.sh -v 1 -i 20 -min 1 -max 64 -sql "$sql1; $sql2" -cmd "$cmd"
3081844 sum: 1 (5) * 13 = 13 -> 0 R/s, e=0 0:01.36elapsed 3%CPU
3081901 sum: 2 (7) * 10 = 20 -> 1 R/s, e=0 0:01.53elapsed 3%CPU
3081922 sum: 4 (11) * 8 = 32 -> 1 R/s, e=0 0:02.40elapsed 0%CPU
3081947 sum: 8 (19) * 5 = 40 -> 2 R/s, e=0 0:04.69elapsed 0%CPU
3082005 sum: 16 (36) * 2 = 32 -> 1 R/s, e=0 0:09.06elapsed 0%CPU
3082043 sum: 32 (67) * 2 = 64 -> 3 R/s, e=0 0:17.83elapsed 0%CPU
3082119 sum: 64 (130) * 1 = 64 -> 3 R/s, e=0 0:34.19elapsed 0%CPU

-> 16 connections: only 2 passes -> aprox. 10s /query

$ connstat -cols "_sy_load 5432 5433 _ps_pool _ps_postmaster" 60
## time load 5432 5433 pgpool postmaster
08:18:34 026 0 1 2 553 6
08:19:36 471 1 10 553 14
08:20:40 1804 2 35 553 42
08:21:51 5877 63 128 553 132

-> 128 connections: Load=58!

- following test are used:
- reindex
- drop, create index
- drop, create database
- zfs recordsize: 128,8 -> no differ
- zfs load: 10% ops, 10% read
- shmmax: (0x80000000)
- shared_buffers
- work_mem 64,256,1024
- wal_buffers
- effective_cache_size
- #log_disconnections = off
- autovacuum = on , off

What can i do?
Howto interpret explain output?

$ echo "EXPLAIN ANALYSE $sql2" | $cmd 2>&1
Aggregate (cost=5236.53..5236.54 rows=1 width=0) (actual time=2018.985..2018.9
86 rows=1 loops=1)
-> Seq Scan on blacklist (cost=0.00..4855.06 rows=152588 width=0) (actual t
ime=0.329..1883.275 rows=152603 loops=1)
Filter: (create_time < ((((timenow())::integer - 3000))::abstime)::time
stamp with time zone)
Total runtime: 2019.371 ms

0.00user 0.02system 0:02.07elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+2353minor)pagefaults 0swaps

------------------------------------------------------------------------
- test 5 frische DB

tab=testbl
f=/tmp/test.sql
cat > $f <<EOF
create table $tab
(
relay_ip inet,
create_time timestamp default now() NOT NULL
);

create index ${tab}_relay_ip_idx on ${tab}(relay_ip);
create index ${tab}_create_time_idx on ${tab}(create_time);
EOF
cat $f | $cmd

------------------------------------------------------------------------
- memstat [10]

$ echo "::memstat"|mdb -k
Page Summary Pages MB %Tot
------------ ---------------- ---------------- ----
Kernel 189712 1482 74%
Anon 25308 197 10%
Exec and libs 1991 15 1%
Page cache 1849 14 1%
Free (cachelist) 2949 23 1%
Free (freelist) 35060 273 14%

Total 256869 2006
Physical 255327 1994

- shm (org)
$ echo "shminfo_shmmax/E" | mdb -k
shminfo_shmmax:
shminfo_shmmax: 8388608


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "Heiko L(dot)" <heikol(at)fh-lausitz(dot)de>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: performance sol10 zone (fup)
Date: 2010-08-12 03:16:54
Message-ID: AANLkTikWPuctJOyqCpcrttsSUoqefC5NHHPEZH6qXLGK@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Wed, Aug 4, 2010 at 3:16 AM, Heiko L. <heikol(at)fh-lausitz(dot)de> wrote:
> $ echo "EXPLAIN ANALYSE $sql2" | $cmd 2>&1
>  Aggregate  (cost=5236.53..5236.54 rows=1 width=0) (actual time=2018.985..2018.9
> 86 rows=1 loops=1)
>   ->  Seq Scan on blacklist  (cost=0.00..4855.06 rows=152588 width=0) (actual t
> ime=0.329..1883.275 rows=152603 loops=1)
>         Filter: (create_time < ((((timenow())::integer - 3000))::abstime)::time
> stamp with time zone)
>  Total runtime: 2019.371 ms

Maybe you need an index on create_time.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: "Heiko L(dot)" <heikol(at)fh-lausitz(dot)de>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: performance sol10 zone (fup)
Date: 2010-08-12 04:12:08
Message-ID: AANLkTik7W3qcttmDwrgujXsn9eraVoMGL2ADXHB994X2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Wed, Aug 4, 2010 at 1:16 AM, Heiko L. <heikol(at)fh-lausitz(dot)de> wrote:
> Hallo,
>
> Im running pg-8,pgpoolII on sol10-zone.

I noticed late you mention 8.3.1. Two points, you're missing > 1 year
of updates, bug fixes, security patches etc. Assuming this version
was fast before, we'll assume it's not the cause of this problem,
however, you're asking for trouble with a version that old. There are
bugs that might not bite you today, but may well in the future.
Please upgrade to 8.3.11.

> After update sol10u7, queries on coltype timestamp are very slow.
> System: sparc, 2GB RAM

Is it possible you had an index that was working that now isn't? Are
the queries you included the real ones or approximations?

It looks like you have a bunch of seq scans happening. If they're all
happening on the same table or small set of them, then a lot of
queries should be able to access them in any order together in 8.3

Are sequential scans normal for this query when it runs fast?

What does vmstat 10 and / or iostat -xd 10 have to say while this is running?

--
To understand recursion, one must first understand recursion.