Re: How to handle missing pgsql.connect_timeout

From: Marco Colombo <pgsql(at)esiway(dot)net>
To: Christopher Murtagh <christopher(dot)murtagh(at)mcgill(dot)ca>
Cc: Janning Vygen <vygen(at)planwerk6(dot)de>, pgsql-php(at)postgresql(dot)org
Subject: Re: How to handle missing pgsql.connect_timeout
Date: 2005-02-08 11:25:34
Message-ID: Pine.LNX.4.61.0502081224510.5203@Megathlon.ESI
Lists: pgsql-php

On Wed, 2 Feb 2005, Christopher Murtagh wrote:

> On Wed, 2005-02-02 at 09:49 +0100, Janning Vygen wrote:
>> It's not broken anymore! And it speeds up your website.
>
> Well it might not be broken anymore, but I haven't seen any evidence of
> a speed increase (quite the opposite in fact). What we did instead was
> write a very small abstraction layer so that any page request would only
> use 1 database connection. We're running on dual 2.8 GHz Xeons for our
> web server and db machine and taking on over 500,000 hits per day. Some
> of our pages are very data intensive (http://www.mcgill.ca/calendar/ for
> example).
>
>> At the moment it waits "forever" (until max_execution_time i guess) and you
>> have no chance to show an error message like: DB is overloaded, please try
>> again.
>
> maybe pg_pconnect() isn't all it is cracked up to be? Anyone else here
> have good experiences with it?
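
As a side note on the timeout itself: even with the pgsql.connect_timeout
ini setting missing, you can put connect_timeout in the connection string,
since pg_connect() (and pg_pconnect()) hand that string to libpq, which
understands the parameter (on a reasonably recent libpq, at least). A
minimal sketch (host, database and user names are made up):

    <?php
    // Give up after 5 seconds instead of hanging until max_execution_time.
    // Leave out host= to use the local Unix socket instead of TCP.
    $db = @pg_connect('host=dbhost dbname=mydb user=www connect_timeout=5');

    if ($db === false) {
        // Connection failed or timed out: show a friendly message.
        die('DB is overloaded, please try again later.');
    }
    ?>

As for whether pconnect is worth it: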

You have to evaluate your setup carefully. I think there's little hope of
planning this in advance; you'll have to monitor your system at runtime.

I've been using persistent connections a lot. First, think of what kind
of problem you're trying to solve: reducing the connection overhead, i.e.
the cost of the one connection made to the DB per page. Now consider the
following:

- PostgreSQL forks a backend process for each connection: this is almost
universally considered a huge overhead, but on modern operating systems
that's no longer really the case.

- network latency: if the PG server runs on a different host, it may be
an issue. If it runs on the same host, and is accessed via Unix sockets,
latency is negligible.

- in order to give a reused persistent connection connect semantics, PHP
has to 'reset' the session, which is an overhead in itself. Consider the
same apache backend serving two different PG users! The connection may be
the same, but all local and remote state has to be cleared.

- if your pages use multiple db connections, you'll get many PG backends
per apache backend.

- each PG backend uses up some resources (RAM mostly): there are a number
of 'knobs' you can turn to control that, but of course, unless your RAM
is infinite, you may have to _reduce_ them in order to increase the
number of concurrent backends.
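
Just to give an idea of the knobs I mean, a sketch of the usual
postgresql.conf suspects (values are invented, and in 7.x work_mem is
still called sort_mem; you have to tune them against your own RAM):

    # postgresql.conf (illustrative values only)
    max_connections = 200    # one slot per backend you expect to stay around
    shared_buffers = 10000   # number of 8kB buffers, shared by all backends
    work_mem = 4096          # kB per sort/hash, *per backend*: the more
                             # backends you allow, the lower this must go

The point is that the per-backend settings multiply by the number of
persistent backends you keep around.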

To put it very simply (maybe too simply), it's a CPU vs RAM tradeoff.
Using persistent connections turns a small (very OS- and setup-dependent)
CPU overhead into a not-so-small RAM overhead on the PG server side: with
pconnect every apache child on every frontend keeps its own idle backend
alive, one per distinct connection string, so the count adds up quickly.

I wasn't able to find a definitive answer. I have many concurrent accesses
(say 200-300), with simple (read and write) queries, but on a large data
set. I have two apache frontends and one PG server (same LAN). I wasn't
able to choose between pconnect and connect. At times, some PHP programmer
kills the DB, and I have to teach him not to perform 300 different queries
_per page_ with a PHP for loop, and to use table joins instead.
I can tell you that when that happens, you'll forget about pconnect vs
connect and thank the PG developers for writing such a robust application
(I've seen load averages up to 100 and PG was still doing its job, even
if very slowly). :-)
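
The pattern, in case it sounds familiar, is more or less the sketch below
(table and column names are invented); the single join query does the work
of all 300 round trips:

    <?php
    // What kills the DB: one query per row, hundreds of round trips per page.
    $res = pg_query($db, 'SELECT id FROM orders WHERE customer_id = 42');
    while ($row = pg_fetch_assoc($res)) {
        $items = pg_query($db,
            'SELECT * FROM order_items WHERE order_id = ' . (int)$row['id']);
        // ... fetch and render $items ...
    }

    // What it should be: a single query with a join.
    $res = pg_query($db,
        'SELECT o.id, i.* FROM orders o
           JOIN order_items i ON i.order_id = o.id
          WHERE o.customer_id = 42');
    ?>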

.TM.
--
____/ ____/ /
/ / / Marco Colombo
___/ ___ / / Technical Manager
/ / / ESI s.r.l.
_____/ _____/ _/ Colombo(at)ESI(dot)it
