Re: postgres overall performance seems to degrade when large SELECT are requested

Lists: pgsql-performance
From: Valentin Bogdanov <valiouk(at)yahoo(dot)co(dot)uk>
To: pgsql-performance(at)postgresql(dot)org
Subject: postgres overall performance seems to degrade when large SELECT are requested
Date: 2008-05-14 10:00:39
Message-ID: 942836.27599.qm@web25805.mail.ukl.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

HI,

I have an application that maintains 150 open connections to a Postgres DB server. The application works fine without a problem for the most time.

The problem seem to arise when a SELECT that returns a lot of rows is executed or the SELECT is run on a large object. These selects are run from time to time by a separate process whose purpose is to generate reports from the db data.

The problem is that when the SELECTs are run the main application starts running out of available connections which means that postgres is not returning the query results fast enough. What I find a little bit starnge is that the report engine's SELECTs operate on a different set of tables than the ones the main application is using. Also the db box is hardly breaking a sweat, CPU and memory utilization are ridiculously low and IOwaits are typically less than 10%.

Has anyone experienced this? Are there any settings I can change to improve throughput? Any help will be greatly appreciated.

Thanks,
val

__________________________________________________________
Sent from Yahoo! Mail.
A Smarter Email http://uk.docs.yahoo.com/nowyoucan.html


From: PFC <lists(at)peufeu(dot)com>
To: valiouk(at)yahoo(dot)co(dot)uk, pgsql-performance(at)postgresql(dot)org
Subject: Re: postgres overall performance seems to degrade when large SELECT are requested
Date: 2008-05-14 11:10:34
Message-ID: op.ua45bwqjcigqcu@apollo13.peufeu.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance


> The problem seem to arise when a SELECT that returns a lot of rows is

Does the SELECT return a lot of rows, or does it scan a lot of rows ?
(for instance, if you use aggregates, it might scan lots of data but only
return few rows).

> The problem is that when the SELECTs are run the main application starts
> running out of available connections which means that postgres is not
> returning the query results fast enough. What I find a little bit
> starnge is that the report engine's SELECTs operate on a different set
> of tables than the ones the main application is using. Also the db box
> is hardly breaking a sweat, CPU and memory utilization are ridiculously
> low and IOwaits are typically less than 10%.

Is it swapping ? (vmstat -> si/so)
Is it locking ? (probably not from what you say)
Is the network connection between the client and DB server saturated ?
(easy with 100 Mbps connections, SELECT with a large result set will
happily blast your LAN)
Is the reporting tool running on the same machine as the DB client and
killing it ? (swapping, etc)

If it's a saturated network, solutions are :
- install Gb ethernet
- run the report on the database server (no bandwidth problems...)
- rewrite the reporting tool to use SQL aggregates to transfer less data
over the network
- or use a cursor to fetch your results in chunks, and wait a little
between chunks

> Has anyone experienced this?

Yeah on benchmarks sometimes the LAN gave up before Postgres broke a
sweat... Gb ethernet solved that...

> Are there any settings I can change to improve throughput? Any help
> will be greatly appreciated.

iptraf will tell you all about your network traffic
vmstat will tell you if your server or client is io-cpu-swap bound
you'd need to post output from those...

>
>
> Thanks,
> val
>
>
> __________________________________________________________
> Sent from Yahoo! Mail.
> A Smarter Email http://uk.docs.yahoo.com/nowyoucan.html
>