query overhead

From: Andy Halsall <halsall_andy(at)hotmail(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: query overhead
Date: 2012-07-11 11:46:23
Message-ID: BLU123-W2825EA0DA3346EA0D91091F5D10@phx.gbl
Lists: pgsql-performance


Version.....
PostgreSQL 9.1.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.5.2, 64-bit

Server.....
Server: RX800 S2 (8 x Xeon 7040 3GHz dual-core processors), 32GB memory
O/S: SLES11 SP1 64-bit

Scenario.....
Legacy application with a bespoke but very efficient interface to its
persistent data. We're looking to replace the application and use
PostgreSQL to hold the data. Performance measures on the legacy
application on the same server show that it can perform a particular
read operation in ~215 microseconds (averaged), which includes
processing the request and getting the result out.

Question......
I've written an IMMUTABLE stored procedure that takes no parameters and
returns a fixed value, to try to determine the round-trip overhead of a
query to PostgreSQL. The call to the SP is made using libpq. We're all
local and using UNIX domain sockets.
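
For reference, a minimal PL/pgSQL no-op of this shape (illustrative;
not necessarily the exact body used) would be:

CREATE FUNCTION sp_select_no_op() RETURNS integer AS $$
BEGIN
    RETURN 1;  -- fixed value; assumed body, for illustration only
END;
$$ LANGUAGE plpgsql IMMUTABLE;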

Client measures suggest ~150-200 microseconds to call the SP and get
the answer back.

A ping to loopback returns in ~20 microseconds (I assume domain sockets
are equivalent).

An strace of the server process confirms, I think, that the time spent
at the server is ~150-200 microseconds. For example:
11:17:50.109936 recvfrom(6, "P\0\0\0'\0SELECT * FROM sp_select_no"..., 8192, 0, NULL, NULL) = 77 <0.000018>
11:17:50.110098 sendto(6, "1\0\0\0\0042\0\0\0\4T\0\0\0(\0\1sp_select_no_op"..., 86, 0, NULL, 0) = 86 <0.000034>

So it looks like a local no-op overhead of at least 150 microseconds,
which would leave us struggling.
Could someone please let me know if this is usual and, if so, where the
time's spent?
Short of getting a faster server, is there anything I can do to
influence this?

Thanks,
Andy


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andy Halsall <halsall_andy(at)hotmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: query overhead
Date: 2012-07-13 16:15:07
Message-ID: 3893.1342196107@sss.pgh.pa.us
Lists: pgsql-performance

Andy Halsall <halsall_andy(at)hotmail(dot)com> writes:
> I've written an Immutable stored procedure that takes no parameters and returns a fixed value to try and determine the round trip overhead of a query to PostgreSQL. Call to sp is made using libpq. We're all local and using UNIX domain sockets.

> Client measures are suggesting ~150-200 microseconds to call sp and get the answer back

That doesn't sound out of line for what you're doing, which appears to
include parsing/planning a SELECT command. Some of that overhead could
probably be avoided by using a prepared statement instead of a plain
query. Or you could try using the "fast path" API (see libpq's PQfn)
to invoke the function directly without any SQL query involved.
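
To illustrate the prepared-statement route (a sketch, reusing the
sp_select_no_op function from upthread), at the SQL level:

PREPARE noop AS SELECT * FROM sp_select_no_op();  -- parse/plan once
EXECUTE noop;  -- bind/execute only, on each subsequent call

In libpq the equivalent is PQprepare()/PQexecPrepared() rather than
SQL-level PREPARE/EXECUTE.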

Really, however, the way to make things fly is to get rid of the round
trip overhead in the first place by migrating more of your application
logic into the stored procedure. I realize that that might require
pretty significant rewrites, but if you can't tolerate per-query
overheads in the 100+ usec range, that's where you're going to end up.

If you don't like any of those answers, maybe Postgres isn't the
solution for you. You might consider an embeddable database such
as SQLite.

regards, tom lane


From: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
To: Andy Halsall <halsall_andy(at)hotmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: query overhead
Date: 2012-07-14 03:28:26
Message-ID: 5000E75A.6030108@ringerc.id.au
Lists: pgsql-performance

On 07/11/2012 07:46 PM, Andy Halsall wrote:
>
> I've written an Immutable stored procedure that takes no parameters
> and returns a fixed value to try and determine the round trip overhead
> of a query to PostgreSQL. Call to sp is made using libpq. We're all
> local and using UNIX domain sockets.
>
PL/PgSQL or SQL stored proc? There's a definite calling overhead for
PL/PgSQL compared to plain SQL functions. SQL functions in turn cost
more than a direct statement.

These costs aren't big. They're massively outweighed by any kind of disk
access or any non-trivial query. They start to add up if you have a lot
of procs that wrap a simple "SELECT * FROM x WHERE x.id = $1" though.
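
For instance, that wrapper can be written both ways (a sketch; assumes
a table x with an id column, as in the query above):

-- PL/pgSQL: every call goes through the PL interpreter
CREATE FUNCTION get_x_plpgsql(p_id integer) RETURNS SETOF x AS $$
BEGIN
    RETURN QUERY SELECT * FROM x WHERE x.id = p_id;
END;
$$ LANGUAGE plpgsql;

-- SQL: cheaper to call, and a simple body like this can sometimes
-- be inlined into the calling query entirely
CREATE FUNCTION get_x_sql(p_id integer) RETURNS SETOF x AS $$
    SELECT * FROM x WHERE x.id = $1;
$$ LANGUAGE sql;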

> Client measures are suggesting ~150-200 microseconds to call sp and
> get the answer back
0.15 to 0.2 milliseconds?

That's ... um ... fast. Presumably that's during a loop where your no-op
is run repeatedly without connection setup costs, etc.

>
> ping to loopback returns in ~20 microseconds (I assume domain sockets
> are equivalent).
UNIX domain sockets are typically at least as fast as TCP over
loopback, and have somewhat lower overhead.

> So it looks like a local no-op overhead of at least 150 microseconds
> which would leave us struggling.
> Could someone please let me know if this is usual and if so where the
> time's spent?
> Short of getting a faster server, is there anything I can do to
> influence this?

I'm not sure how much a faster server would help with single-query
response time. It'll help with responses to many parallel queries, but
may not dramatically speed up a single query, especially a tiny,
lightweight one.

The Xeon 7040:
http://ark.intel.com/products/27226/Intel-Xeon-Processor-7040-(4M-Cache-3_00-GHz-667-MHz-FSB)
is not the newest beast out there, but it's not exactly slow.

Honestly, PostgreSQL's focus is on performance with bigger units of
work, not on being able to return a response to a tiny query in
microseconds. If you are converting an application that has microsecond
response time requirements and hammers its database with millions of
tiny queries, PostgreSQL is probably not going to be your best bet.

If you're able to adapt the app to use set-oriented access patterns
instead of looping, e.g. instead of (pseudocode):

customer_ids = [ ... array from somewhere ... ]
for customer_id in customer_ids:
    c = SELECT c FROM customer c WHERE c.customer_id = :customer_id
    if c.some_var:
        UPDATE customer SET some_other_var = 't'
            WHERE customer_id = :customer_id

you can do:

UPDATE customer SET some_other_var = [expression] WHERE [expression]

then you'll get much better results from Pg.
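
Concretely, the loop above collapses to a single statement (a sketch;
assumes the ids arrive as an array parameter and some_var is boolean):

UPDATE customer
   SET some_other_var = 't'
 WHERE customer_id = ANY (:customer_ids)
   AND some_var;

That's one round trip and one plan, however many customers are
involved.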

--
Craig Ringer


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Andy Halsall <halsall_andy(at)hotmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: query overhead
Date: 2012-08-10 18:19:56
Message-ID: CAMkU=1wrXUA=dDPMavKWAm_30Uoed=-Cq44-GwPs+35FJ=UcJw@mail.gmail.com
Lists: pgsql-performance

On Wed, Jul 11, 2012 at 4:46 AM, Andy Halsall <halsall_andy(at)hotmail(dot)com> wrote:
> Version.....
> PostgreSQL 9.1.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.5.2,
> 64-bit
>
> Server.....
> Server: RX800 S2 (8 x Xeon 7040 3GHz dual-core processors, 32GB memory
> O/S: SLES11 SP1 64-bit

I don't really know how to compare these, but I've got:

Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz

>
> Scenario.....
> Legacy application with bespoke but very efficient interface to its
> persistent data. We're looking to replace the application and use
> PostgreSQL to hold the data. Performance measures on the legacy application
> on the same server shows that it can perform a particular read operation in
> ~215 microseconds (averaged) which includes processing the request and
> getting the result out.
>
> Question......
> I've written an Immutable stored procedure that takes no parameters and
> returns a fixed value to try and determine the round trip overhead of a
> query to PostgreSQL. Call to sp is made using libpq. We're all local and
> using UNIX domain sockets.
>
> Client measures are suggesting ~150-200 microseconds to call sp and get the
> answer back

Using the PL/pgSQL function you provided downthread:

cat dummy2.sql
select sp_select_no_op();

pgbench -f dummy2.sql -T300
tps = 18703.309132 (excluding connections establishing)

So that comes out to 53.5 microseconds/call (1 / 18,703 tps).

If I use a prepared statement:

pgbench -M prepared -f dummy2.sql -T300
tps = 30289.908144 (excluding connections establishing)

or 33 microseconds/call.

So unless your server is a lot slower than mine, I think your client
may be the bottleneck. What is your client program? What does "top"
show as the relative CPU usage of your client program vs the "postgres
... [local]" program to which it is connected?

Cheers,

Jeff