Re: Frequent Update Project: Design Overview

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Csaba Nagy" <nagy(at)ecircle-ag(dot)com>, "Hannu Krosing" <hannu(at)skype(dot)net>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "postgres hackers" <pgsql-hackers(at)postgresql(dot)org>, "Zeugswetter Andreas ADI SD" <ZeugswetterA(at)spardat(dot)at>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>
Subject: Re: Frequent Update Project: Design Overview
Date: 2006-11-17 15:25:57
Message-ID: 455D8025.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>>> On Fri, Nov 17, 2006 at 5:30 AM, in message
<1163763016(dot)2941(dot)92(dot)camel(at)localhost(dot)localdomain>, Hannu Krosing
<hannu(at)skype(dot)net> wrote:
> Ühel kenal päeval, E, 2006-11-13 kell 13:42, kirjutas Csaba Nagy:
>> [snip]
>> > IMHO *most* UPDATEs occur on non-indexed fields. [snip]
>> >
>> > If my assumption is badly wrong on that then perhaps HOT would not be
>> > useful after all. If we find that the majority of UPDATEs meet the HOT
>> > pre-conditions, then I would continue to advocate it.
>>
>> Just to confirm that the scenario is valid: our application has almost
>> all it's updates affecting only non-indexed columns. There are a few
>> exceptions, but the vast majority is non-indexed, and that holds to the
>> execution frequency too, not just for the count of tables/queries.
>
> One interesting case which should also be considered is conditional
> indexes:
>
> create index on payments(payment_id) where status = 'waiting';
>
> here the payment_id is not changed when processing the payment, but when
> status is changed to 'processed' it still should be removed from the
> index.
>
> How would this interact with HOT ?

I would say that at least 80% of our updates (probably higher) do not modify indexed columns. We have a few very small tables (under 100 rows) which have high update rates (often exceeding 100 updates per second) which are not against indexed columns. These quickly degraded our performance until we set pretty aggressive autovacuum parameters (20% + 1 row every 10 seconds) and added a daily cluster to our maintenance crontab runs.

At the other extreme, we have a table which tracks the last modification timestamp of each court case, indexed by timestamp, to support our SOAP subscribers who want to stay up-to-date on all active court cases. Updates in this table are both high volume and always involve an indexed column.

Like Hannu, we do use conditional indexes with high updates on columns in the WHERE clause, although these columns are not part of the index sequence. For example, we have a receivables table which contains a balance due. For audit trail purposes these rows remain for many years after the balance hits zero, but they're not something you want to look at when someone is standing at the counter with their checkbook. We index by name where the balance is non-zero. The balance is updated frequently, with most eventually hitting zero. (The reason for the frequent updates is that the receivable is maintained by triggers from the supporting assessment detail, so a receivable will be initially added with a zero balance and may immediately be updated dozens of times as the assessment detail is added.) Infrequently, the balance may hit zero and subsequently become non-zero again.

I hope this is helpful.

-Kevin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jonah H. Harris 2006-11-17 15:26:41 Re: ALTER TABLE RENAME column
Previous Message Mario Weilguni 2006-11-17 14:58:02 Re: ALTER TABLE RENAME column