Re: Frequently updated tables

Lists: pgsql-hackers
From: pgsql(at)mohawksoft(dot)com
To: pgsql-hackers(at)postgresql(dot)org
Subject: Frequently updated tables
Date: 2004-06-08 19:24:42
Message-ID: 15231.64.119.142.34.1086722682.squirrel@mail.mohawksoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I've been down several roads about how to handle data that has to change
on a very frequent and rapid manner.

Think about summary tables, WEB session tables, etc. As great as MVCC is
for the vast majority of uses. The overhead of updates and deletes can
kill a project that needs to constantly update tables.

In most cases, the answer is just not to use PostgreSQL for that, but then
you are stuck with Mysql or something worse.

Would having a special class of table that is marked for frequent updates,
which causes the system to lock and update in place, make sense? Is that
even possible?

There has to be a way of making PostgreSQL able to handle this class of
problem. Anyone have a reasonable idea?


From: Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>
To: pgsql(at)mohawksoft(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequently updated tables
Date: 2004-06-08 22:15:40
Message-ID: 40C63A8C.8040707@coretech.co.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

pgsql(at)mohawksoft(dot)com wrote:

>I've been down several roads about how to handle data that has to change
>on a very frequent and rapid manner.
>
>Think about summary tables, WEB session tables, etc. As great as MVCC is
>for the vast majority of uses. The overhead of updates and deletes can
>kill a project that needs to constantly update tables.
>
>
>
>

Are you saying that MVCC has *by design* a higher overhead for updates
and deletes? or are you referring to the gradual loss of performance as
a consequence of many dead tuples?

I am guessing you mean the latter, but best to be sure :-)

regards

Mark


From: pgsql(at)mohawksoft(dot)com
To: "Mark Kirkwood" <markir(at)coretech(dot)co(dot)nz>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequently updated tables
Date: 2004-06-08 23:16:45
Message-ID: 19034.24.91.171.78.1086736605.squirrel@mail.mohawksoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>
>
> pgsql(at)mohawksoft(dot)com wrote:
>
>>I've been down several roads about how to handle data that has to change
>>on a very frequent and rapid manner.
>>
>>Think about summary tables, WEB session tables, etc. As great as MVCC is
>>for the vast majority of uses. The overhead of updates and deletes can
>>kill a project that needs to constantly update tables.
>>
>>
>>
>>
>
> Are you saying that MVCC has *by design* a higher overhead for updates
> and deletes? or are you referring to the gradual loss of performance as
> a consequence of many dead tuples?
>
> I am guessing you mean the latter, but best to be sure :-)
The best phrasing would be "the accumulating overhead of deletes and
updates."

Yes.


From: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
To: pgsql(at)mohawksoft(dot)com
Cc: Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequently updated tables
Date: 2004-06-08 23:32:51
Message-ID: 20040608233251.GX75987@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jun 08, 2004 at 07:16:45PM -0400, pgsql(at)mohawksoft(dot)com wrote:
> >
> >
> > pgsql(at)mohawksoft(dot)com wrote:
> >
> >>I've been down several roads about how to handle data that has to change
> >>on a very frequent and rapid manner.
> >>
> >>Think about summary tables, WEB session tables, etc. As great as MVCC is
> >>for the vast majority of uses. The overhead of updates and deletes can
> >>kill a project that needs to constantly update tables.
> >>
> >>
> >>
> >>
> >
> > Are you saying that MVCC has *by design* a higher overhead for updates
> > and deletes? or are you referring to the gradual loss of performance as
> > a consequence of many dead tuples?
> >
> > I am guessing you mean the latter, but best to be sure :-)
> The best phrasing would be "the accumulating overhead of deletes and
> updates."
>
> Yes.

Doesn't pg_autovacuum largely take care of this issue?
--
Jim C. Nasby, Database Consultant jim(at)nasby(dot)net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"


From: Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>
To: pgsql(at)mohawksoft(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequently updated tables
Date: 2004-06-09 00:54:48
Message-ID: 40C65FD8.2030104@coretech.co.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


pgsql(at)mohawksoft(dot)com wrote:

>The best phrasing would be "the accumulating overhead of deletes and
>updates."
>
>Yes.
>
>

Are you using 7.3?

I am asking because in 7.3 high update / delete tables could suffer
(index and toast) bloat that was untamable via (lazy) VACUUM and FSM.
I believe this is fixed in 7.4, so it should be possible to achieve on
disk size control of tables / indexes by configuring FSM and (lazy)
VACUUM. Did you find this not to be the case?

regards

Mark


From: pgsql(at)mohawksoft(dot)com
To: "Mark Kirkwood" <markir(at)coretech(dot)co(dot)nz>
Cc: pgsql(at)mohawksoft(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequently updated tables
Date: 2004-06-09 12:21:28
Message-ID: 17403.24.91.171.78.1086783688.squirrel@mail.mohawksoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>
> pgsql(at)mohawksoft(dot)com wrote:
>
>>The best phrasing would be "the accumulating overhead of deletes and
>>updates."
>>
>>Yes.
>>
>>
>
> Are you using 7.3?
>
> I am asking because in 7.3 high update / delete tables could suffer
> (index and toast) bloat that was untamable via (lazy) VACUUM and FSM.
> I believe this is fixed in 7.4, so it should be possible to achieve on
> disk size control of tables / indexes by configuring FSM and (lazy)
> VACUUM. Did you find this not to be the case?
>
Interesting, the company is usng 7.3.4. One single row summary table got
up to 2 million dead rows. A select from that single row took a quarter of
a second. A regular vacuum did not fix it, only a vacuum full did.
However, when the test was re-run with constant vacuums, it did not get
out of hand.

My concern is performance, and yes, for inserts PostgreSQL is great. For
data that is constantly being updated, PostgreSQL is a bit weak. Think
about a table with a few million rows that needs to be updated a few
thousand times a minute.

I love PG, I've been using it since version 6x, and it has gotten
fantastic over the years, and in many cases, I would choose it over
Oracle, but for systems that need frequent updates, I have a lot of
concerns.


From: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
To: pgsql(at)mohawksoft(dot)com
Cc: Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequently updated tables
Date: 2004-06-09 14:49:20
Message-ID: 40C72370.8050603@familyhealth.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> I love PG, I've been using it since version 6x, and it has gotten
> fantastic over the years, and in many cases, I would choose it over
> Oracle, but for systems that need frequent updates, I have a lot of
> concerns.

...that's the price you pay for concurrency man...

Chris


From: pgsql(at)mohawksoft(dot)com
To: "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>
Cc: pgsql(at)mohawksoft(dot)com, "Mark Kirkwood" <markir(at)coretech(dot)co(dot)nz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequently updated tables
Date: 2004-06-09 15:18:36
Message-ID: 32561.64.119.142.34.1086794316.squirrel@mail.mohawksoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

>> I love PG, I've been using it since version 6x, and it has gotten
>> fantastic over the years, and in many cases, I would choose it over
>> Oracle, but for systems that need frequent updates, I have a lot of
>> concerns.
>
> ...that's the price you pay for concurrency man...

I think that's a cop-out. Other databases can handle this case fine and
they have MVCC.

Are we not "open source" "free software" proponents? Isn't one of our
motivations that we can do it better?

This *is* a problem with PostgreSQL, and it *is* a concern for a
reasonable number of potential deployments.


From: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
To: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
Cc: pgsql(at)mohawksoft(dot)com, Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequently updated tables
Date: 2004-06-09 17:05:43
Message-ID: 20040609170542.GA7582@dcc.uchile.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jun 09, 2004 at 10:49:20PM +0800, Christopher Kings-Lynne wrote:
> >I love PG, I've been using it since version 6x, and it has gotten
> >fantastic over the years, and in many cases, I would choose it over
> >Oracle, but for systems that need frequent updates, I have a lot of
> >concerns.
>
> ...that's the price you pay for concurrency man...

Also he said that the problem was solved with enough lazy VACUUM
scheduling. I don't understand why he doesn't want to use that
solution.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"People get annoyed when you try to debug them." (Larry Wall)


From: pgsql(at)mohawksoft(dot)com
To: "Alvaro Herrera" <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>, pgsql(at)mohawksoft(dot)com, "Mark Kirkwood" <markir(at)coretech(dot)co(dot)nz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequently updated tables
Date: 2004-06-09 17:41:27
Message-ID: 56883.64.119.142.34.1086802887.squirrel@mail.mohawksoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> On Wed, Jun 09, 2004 at 10:49:20PM +0800, Christopher Kings-Lynne wrote:
>> >I love PG, I've been using it since version 6x, and it has gotten
>> >fantastic over the years, and in many cases, I would choose it over
>> >Oracle, but for systems that need frequent updates, I have a lot of
>> >concerns.
>>
>> ...that's the price you pay for concurrency man...
>
> Also he said that the problem was solved with enough lazy VACUUM
> scheduling. I don't understand why he doesn't want to use that
> solution.
>

Sigh, because vacuums take away from performance. Imagine a table that has
to be updated on the order of a few thousand times a minute. Think about
the drop in performance during the vacuum.

On a one row table, vacuum is not so bad, but try some benchmarks on a
table with a goodly number of rows.


From: Bruno Wolff III <bruno(at)wolff(dot)to>
To: pgsql(at)mohawksoft(dot)com
Cc: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>, Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequently updated tables
Date: 2004-06-09 18:12:30
Message-ID: 20040609181230.GA19306@wolff.to
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jun 09, 2004 at 13:41:27 -0400,
pgsql(at)mohawksoft(dot)com wrote:
>
> Sigh, because vacuums take away from performance. Imagine a table that has
> to be updated on the order of a few thousand times a minute. Think about
> the drop in performance during the vacuum.
>
> On a one row table, vacuum is not so bad, but try some benchmarks on a
> table with a goodly number of rows.

But you only need to rapidly vacuum the one table that is keeping your
totals record. This isn't going to be a big hit in performance relative
to the updates that are going on. You don't need to vacuum the tables
you are doing the inserts or updates to at that same rate.


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: pgsql(at)mohawksoft(dot)com
Cc: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>, Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequently updated tables
Date: 2004-06-09 18:14:51
Message-ID: 40C7539B.3030807@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


>
> Sigh, because vacuums take away from performance.

This is a known issue that has been pretty much resolved for 7.5. Vacuum
in 7.5 does not take even close to as much IO resources.

Imagine a table that has
> to be updated on the order of a few thousand times a minute. Think about
> the drop in performance during the vacuum.
>
> On a one row table, vacuum is not so bad, but try some benchmarks on a
> table with a goodly number of rows.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
> joining column's datatypes do not match

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd(at)commandprompt(dot)com - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL

Attachment Content-Type Size
jd.vcf text/x-vcard 640 bytes

From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, pgsql(at)mohawksoft(dot)com, Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequently updated tables
Date: 2004-06-09 18:16:17
Message-ID: 40C753F1.5040305@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> Also he said that the problem was solved with enough lazy VACUUM
> scheduling. I don't understand why he doesn't want to use that
> solution.

Because even lazy VACUUM is horrendous to performance but as I said in
a further post this has been pretty much fixed by (Jan I believe) in 7.5.

Sincerely,

Joshua D. Drake

>

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd(at)commandprompt(dot)com - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL

Attachment Content-Type Size
jd.vcf text/x-vcard 640 bytes

From: "Scott Marlowe" <smarlowe(at)qwest(dot)net>
To: pgsql(at)mohawksoft(dot)com
Cc: "Alvaro Herrera" <alvherre(at)dcc(dot)uchile(dot)cl>, "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>, "Mark Kirkwood" <markir(at)coretech(dot)co(dot)nz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequently updated tables
Date: 2004-06-09 18:26:27
Message-ID: 1086805587.23855.34.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2004-06-09 at 11:41, pgsql(at)mohawksoft(dot)com wrote:
> > On Wed, Jun 09, 2004 at 10:49:20PM +0800, Christopher Kings-Lynne wrote:
> >> >I love PG, I've been using it since version 6x, and it has gotten
> >> >fantastic over the years, and in many cases, I would choose it over
> >> >Oracle, but for systems that need frequent updates, I have a lot of
> >> >concerns.
> >>
> >> ...that's the price you pay for concurrency man...
> >
> > Also he said that the problem was solved with enough lazy VACUUM
> > scheduling. I don't understand why he doesn't want to use that
> > solution.
> >
>
> Sigh, because vacuums take away from performance. Imagine a table that has
> to be updated on the order of a few thousand times a minute. Think about
> the drop in performance during the vacuum.
>
> On a one row table, vacuum is not so bad, but try some benchmarks on a
> table with a goodly number of rows.

Several points:

All databases pay to clean up the mess they've made, so to speak. In
PostgreSQL you get to choose when, instead of always paying the price at
the end of transaction.

Lazy vacuum does not impact performance nearly as much as the old full
vacuum. With the sleep / delay patch that's been passed around on
hackers its impact is virtually zero on the rest of the database

Properly setup fsm settings, pg_autovacuum deamon, and an installation
of the sleep / delay patch mentioned aobve makes this a non-issue.


From: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
To: pgsql(at)mohawksoft(dot)com
Cc: Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequently updated tables
Date: 2004-06-09 18:36:24
Message-ID: 20040609183623.GB7582@dcc.uchile.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Jun 09, 2004 at 01:41:27PM -0400, pgsql(at)mohawksoft(dot)com wrote:
> > On Wed, Jun 09, 2004 at 10:49:20PM +0800, Christopher Kings-Lynne wrote:

> > Also he said that the problem was solved with enough lazy VACUUM
> > scheduling. I don't understand why he doesn't want to use that
> > solution.
>
> Sigh, because vacuums take away from performance. Imagine a table that has
> to be updated on the order of a few thousand times a minute. Think about
> the drop in performance during the vacuum.
>
> On a one row table, vacuum is not so bad, but try some benchmarks on a
> table with a goodly number of rows.

Hmm, this can be a problem if VACUUM pollutes the shared buffer pool.
So what about a new buffer replacement policy that takes this into
account and is not fooled by VACUUM? This is already implemented in
7.5. Also, how about a background writer process that writes dirty
buffers so that backends don't have to wait for IO to complete when a
dirty buffer has to be written? This is also in current CVS.

Have you tried and measured how the current CVS code performs? Jan
Wieck reported a lot of performance improvement some time ago while he
was developing this. The code has changed since and I have not seen any
measurement.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
Oh, oh, las chicas galacianas, lo harán por las perlas,
¡Y las de Arrakis por el agua! Pero si buscas damas
Que se consuman como llamas, ¡Prueba una hija de Caladan! (Gurney Halleck)


From: pgsql(at)mohawksoft(dot)com
To: "Bruno Wolff III" <bruno(at)wolff(dot)to>
Cc: "Alvaro Herrera" <alvherre(at)dcc(dot)uchile(dot)cl>, "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>, "Mark Kirkwood" <markir(at)coretech(dot)co(dot)nz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequently updated tables
Date: 2004-06-09 20:01:33
Message-ID: 36381.64.119.142.34.1086811293.squirrel@mail.mohawksoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> On Wed, Jun 09, 2004 at 13:41:27 -0400,
> pgsql(at)mohawksoft(dot)com wrote:
>>
>> Sigh, because vacuums take away from performance. Imagine a table that
>> has
>> to be updated on the order of a few thousand times a minute. Think about
>> the drop in performance during the vacuum.
>>
>> On a one row table, vacuum is not so bad, but try some benchmarks on a
>> table with a goodly number of rows.
>
> But you only need to rapidly vacuum the one table that is keeping your
> totals record. This isn't going to be a big hit in performance relative
> to the updates that are going on. You don't need to vacuum the tables
> you are doing the inserts or updates to at that same rate.
>

I have been talking about two types of problems which are both based on
PostgreSQL's behavior with frequently updated tables.

Summary table: In the single row table system, you have to vacuum very
requently, and this affects performance.

Frequently updated tables: think about the session table for a website.
Each new user gets a new session row. Everytime they refresh or act in the
site, the row is updated. When they leave or their session times out, the
row is deleted. I wrote a RAM only session manager for PHP because
PostgreSQL couldn't handle the volume. (2000 hits a second)

If you have an active site, with hundreds or thousands of hits a second,
vacuuming the table constantly is not practical.

I don't think anyone who has seriously looked at these issues has
concluded that PostgreSQL works fine in these cases. The question is what,
if anything, can be done? The frequent update issue really affects
PostgreSQL's acceptance in web applications, and one which MySQL seems to
do a better job.

IMHO, this issue, a two stage commit based replication system, and a real
and usable setup/configuration system are all that stands between
PostgreSQL and the serious enterprise deployment.


From: Mark Kirkwood <markir(at)coretech(dot)co(dot)nz>
To: pgsql(at)mohawksoft(dot)com
Cc: Bruno Wolff III <bruno(at)wolff(dot)to>, Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>, Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequently updated tables
Date: 2004-06-09 22:55:44
Message-ID: 40C79570.6040309@coretech.co.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

pgsql(at)mohawksoft(dot)com wrote:

>
>I have been talking about two types of problems which are both based on
>PostgreSQL's behavior with frequently updated tables.
>
>Summary table: In the single row table system, you have to vacuum very
>requently, and this affects performance.
>
>Frequently updated tables: think about the session table for a website.
>Each new user gets a new session row. Everytime they refresh or act in the
>site, the row is updated. When they leave or their session times out, the
>row is deleted. I wrote a RAM only session manager for PHP because
>PostgreSQL couldn't handle the volume. (2000 hits a second)
>
>
>
It would be interesting to see if the vacuum delay patch, fsm tuning +
vacuum scheduling could have changed this situation. Clearly there is an
issue here (hence a patch...), but ISTM that just as significant is the
fact that it is difficult to know how to configure the various bits and
pieces, and also difficult to know if it has been done optimally.

>If you have an active site, with hundreds or thousands of hits a second,
>vacuuming the table constantly is not practical.
>
>I don't think anyone who has seriously looked at these issues has
>concluded that PostgreSQL works fine in these cases. The question is what,
>if anything, can be done? The frequent update issue really affects
>PostgreSQL's acceptance in web applications, and one which MySQL seems to
>do a better job.
>
>
>
>
As an aside, I have had similar issues with DB2 and high update tables -
lock escalations (locklist tuning needed). It is not just
non-overwriting storage managers that need the magic tuning wand :-)

regards

Mark


From: pgsql(at)mohawksoft(dot)com
To: "Mark Kirkwood" <markir(at)coretech(dot)co(dot)nz>
Cc: pgsql(at)mohawksoft(dot)com, "Bruno Wolff III" <bruno(at)wolff(dot)to>, "Alvaro Herrera" <alvherre(at)dcc(dot)uchile(dot)cl>, "Christopher Kings-Lynne" <chriskl(at)familyhealth(dot)com(dot)au>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Frequently updated tables
Date: 2004-06-09 23:44:35
Message-ID: 17376.24.91.171.78.1086824675.squirrel@mail.mohawksoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> pgsql(at)mohawksoft(dot)com wrote:
>
>>
>>I have been talking about two types of problems which are both based on
>>PostgreSQL's behavior with frequently updated tables.
>>
>>Summary table: In the single row table system, you have to vacuum very
>>requently, and this affects performance.
>>
>>Frequently updated tables: think about the session table for a website.
>>Each new user gets a new session row. Everytime they refresh or act in
>> the
>>site, the row is updated. When they leave or their session times out, the
>>row is deleted. I wrote a RAM only session manager for PHP because
>>PostgreSQL couldn't handle the volume. (2000 hits a second)
>>
>>
>>
> It would be interesting to see if the vacuum delay patch, fsm tuning +
> vacuum scheduling could have changed this situation. Clearly there is an
> issue here (hence a patch...), but ISTM that just as significant is the
> fact that it is difficult to know how to configure the various bits and
> pieces, and also difficult to know if it has been done optimally.
>
>>If you have an active site, with hundreds or thousands of hits a second,
>>vacuuming the table constantly is not practical.
>>
>>I don't think anyone who has seriously looked at these issues has
>>concluded that PostgreSQL works fine in these cases. The question is
>> what,
>>if anything, can be done? The frequent update issue really affects
>>PostgreSQL's acceptance in web applications, and one which MySQL seems to
>>do a better job.
>>
>>
>>
>>
> As an aside, I have had similar issues with DB2 and high update tables -
> lock escalations (locklist tuning needed). It is not just
> non-overwriting storage managers that need the magic tuning wand :-)

Funny, I've used DB2 for a few projects, but never for a web session
system. This is an interesting data point thanks.