Re: vacuum, performance, and MVCC

From: "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
To: "Mark Woodward" <pgsql(at)mohawksoft(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Csaba Nagy" <nagy(at)ecircle-ag(dot)com>, "Hannu Krosing" <hannu(at)skype(dot)net>, "Christopher Browne" <cbbrowne(at)acm(dot)org>, "postgres hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: vacuum, performance, and MVCC
Date: 2006-06-23 18:28:26
Message-ID: 36e682920606231128j76c1655do413c5edfbba522db@mail.gmail.com
Lists: pgsql-hackers

On 6/23/06, Mark Woodward <pgsql(at)mohawksoft(dot)com> wrote:
> I, for one, see a particularly nasty unscalable behavior in the
> implementation of MVCC with regards to updates.

I think this is a fairly common view. The overhead required to
perform an UPDATE in PostgreSQL is pretty heavy. Actually, it's not
really specific to PostgreSQL's implementation; it applies to anything
that employs basic multi-version timestamp ordering (MVTO) style MVCC.
Basically, MVTO-style systems require additional work to be done in an
UPDATE so that queries can find the most current row more quickly.

> This is a very pessimistic behavior

Yes, and that's basically the point of MVTO in general. The nice
thing about MVTO-style MVCC is that it isn't super complicated. No
big UNDO strategy is needed because the old versions are always there
and just have to satisfy a snapshot.
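
A minimal sketch of that idea, under loud assumptions: this is a toy
model, not PostgreSQL code. The class names are hypothetical, the
xmin/xmax fields only loosely mirror the creating and superseding
transaction IDs, every transaction is treated as committed, and
transaction IDs are assumed to order the same way as commits.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RowVersion:
    value: str
    xmin: int                   # transaction that created this version
    xmax: Optional[int] = None  # transaction that superseded it, if any


class VersionedRow:
    """Toy append-only row: an UPDATE never overwrites, it adds a version."""

    def __init__(self, value: str, txid: int) -> None:
        self.versions: List[RowVersion] = [RowVersion(value, txid)]

    def update(self, value: str, txid: int) -> None:
        # Mark the current version as superseded, then append the new one.
        self.versions[-1].xmax = txid
        self.versions.append(RowVersion(value, txid))

    def read(self, snapshot_txid: int) -> Optional[str]:
        # A version satisfies the snapshot if it was created at or before
        # the snapshot and not yet superseded as of the snapshot.
        for v in self.versions:
            created = v.xmin <= snapshot_txid
            still_live = v.xmax is None or v.xmax > snapshot_txid
            if created and still_live:
                return v.value
        return None


row = VersionedRow("a", txid=1)
row.update("b", txid=5)
old_view = row.read(snapshot_txid=3)   # sees the version created by txid 1
new_view = row.read(snapshot_txid=5)   # sees the version created by txid 5
```

No undo step appears anywhere: the old version simply stays in the list
until something (vacuum, in the real system) removes it.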

> I still think an in-place indirection to the current row could fix the
> problem and speed up the database, there are some sticky situations that
> need to be considered, but it shouldn't break much.

I agree, but I should make clear that moving to an in-place update
isn't a quick fix; it will require a good amount of design and planning.

What I find in these discussions is that we always talk about
overcomplicating vacuum in order to fix the poor behavior in MVCC. Fixing
autovacuum does not eliminate the overhead required to add index
entries and everything associated with performing an UPDATE... it's
just cleaning up the mess after the fact. As I see it, fixing the
root problem by moving to update-in-place may add a little more
complication to the core, but will eliminate a lot of the headaches we
have in overhead, performance, and manageability.
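
To make the bookkeeping concrete, here is a rough counting sketch. It
is an illustration under assumptions, not a measurement and not a
design from this thread: the function and field names are hypothetical,
the append-style model charges one new index entry per index for every
UPDATE as described above, and the in-place model assumes that the
updated column is not indexed and that old values go to some undo
record, which is one possible design among several.

```python
from dataclasses import dataclass


@dataclass
class UpdateCost:
    new_heap_versions: int = 0
    new_index_entries: int = 0
    dead_versions_to_vacuum: int = 0
    undo_records: int = 0


def append_style(n_updates: int, n_indexes: int) -> UpdateCost:
    """Each UPDATE writes a new row version plus one entry per index."""
    cost = UpdateCost()
    for _ in range(n_updates):
        cost.new_heap_versions += 1
        cost.new_index_entries += n_indexes
        cost.dead_versions_to_vacuum += 1  # the superseded version
    return cost


def in_place_style(n_updates: int, n_indexes: int) -> UpdateCost:
    """Hypothetical in-place UPDATE of a non-indexed column.

    n_indexes is accepted only for symmetry; under the stated assumption
    the indexes are not touched.
    """
    cost = UpdateCost()
    for _ in range(n_updates):
        cost.undo_records += 1  # old value kept for older snapshots
    return cost


append_cost = append_style(n_updates=1000, n_indexes=3)
in_place_cost = in_place_style(n_updates=1000, n_indexes=3)
```

In this toy model the append-style path leaves 1000 dead versions and
3000 extra index entries to clean up later, while the in-place path
trades them for 1000 undo records, which is where the extra core
complexity would live.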

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation | fax: 732.331.1301
33 Wood Ave S, 2nd Floor | jharris(at)enterprisedb(dot)com
Iselin, New Jersey 08830 | http://www.enterprisedb.com/
