pessimal trivial-update performance

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: pessimal trivial-update performance
Date: 2010-07-04 04:05:45
Message-ID: AANLkTilng6sINyEhC46OIOCCGCcjxlb3nuifrggWGAyz@mail.gmail.com
Lists: pgsql-hackers

Consider updating a PL/pgsql variable repeatedly, and then consider
updating a single-column, single-row table repeatedly, thus:

CREATE OR REPLACE FUNCTION update_var() RETURNS void AS $$
DECLARE
        x int := 0;
BEGIN
        FOR i IN 1..100000 LOOP
                x := x + 1;
        END LOOP;
END
$$ LANGUAGE plpgsql;

CREATE TABLE tab (x integer);

CREATE OR REPLACE FUNCTION update_tab() RETURNS void AS $$
BEGIN
        INSERT INTO tab VALUES (0);
        FOR i IN 1..100000 LOOP
                UPDATE tab SET x = x + 1;
        END LOOP;
END
$$ LANGUAGE plpgsql;

On my Fedora 12 VM, the first of these takes 33-36 ms, and the second
takes 114-121 s. While you'd expect updating a table to be slower
than updating a variable, a factor of 3000x seems rather excessive to
me. Profiling reveals that 80%+ of the time is spent testing tuple
visibility, which apparently needs to be done an average of over 7000
times per loop iteration. Full gprof results are attached, bzip'd so
as to avoid hitting the attachment size limit for this list.
Highlights below:

  %   cumulative   self               self    total
 time   seconds   seconds     calls  s/call  s/call  name
27.00     42.60    42.60  1410265409   0.00    0.00  TransactionIdIsCurrentTransactionId
23.51     79.69    37.09   705082704   0.00    0.00  HeapTupleSatisfiesMVCC
19.65    110.69    31.00   705182704   0.00    0.00  HeapTupleHeaderGetCmin
13.04    131.26    20.57   704982704   0.00    0.00  HeapTupleHeaderGetCmax
 8.09    144.02    12.76    22173923   0.00    0.01  heapgetpage
 1.09    145.74     1.72                              XidInMVCCSnapshot

heapgettup_pagemode is called 200,000 times exactly; it makes
22,173,923 calls to heapgetpage (or approximately 110 per call), which
makes 705,082,704 calls to HeapTupleSatisfiesMVCC (or approximately
32 per heapgetpage call). Is there anything we can do about this?
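
For reference, a minimal psql sketch of the test run (the timings are
the ones quoted above and will of course vary):

\timing on
SELECT update_var();   -- ~35 ms: pure PL/pgSQL variable updates
SELECT update_tab();   -- ~2 minutes: each UPDATE's seqscan wades through
                       -- all the dead row versions accumulated so far
SELECT pg_relation_size('tab') / 8192 AS pages;  -- the bloat is visible here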

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

Attachment Content-Type Size
gprof.tab.out.bz2 application/x-bzip2 54.4 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pessimal trivial-update performance
Date: 2010-07-04 04:11:19
Message-ID: 26947.1278216679@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> CREATE OR REPLACE FUNCTION update_tab() RETURNS void AS $$
> BEGIN
> INSERT INTO tab VALUES (0);
> FOR i IN 1..100000 LOOP
> UPDATE tab SET x = x + 1;
> END LOOP;
> END
> $$ LANGUAGE plpgsql;

I believe that none of the dead row versions can be vacuumed during this
test. So yes, it sucks, but is it representative of real-world cases?

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pessimal trivial-update performance
Date: 2010-07-04 04:17:32
Message-ID: AANLkTikO81Wqgxe2ULNwrbGHRfVqVcuRUQ61cFX3vjlo@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jul 4, 2010 at 12:11 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> CREATE OR REPLACE FUNCTION update_tab() RETURNS void AS $$
>> BEGIN
>>       INSERT INTO tab VALUES (0);
>>       FOR i IN 1..100000 LOOP
>>               UPDATE tab SET x = x + 1;
>>       END LOOP;
>> END
>> $$ LANGUAGE plpgsql;
>
> I believe that none of the dead row versions can be vacuumed during this
> test.

Yep, you seem to be right. The table grows to 802 pages. But why is
it that we can't vacuum them as we go along?
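
For what it's worth, the growth is easy to confirm from psql (a sketch;
exact numbers will vary):

SELECT pg_relation_size('tab') / 8192 AS pages;  -- ~802 after update_tab()
VACUUM VERBOSE tab;  -- the ~100,000 dead row versions only become
                     -- removable once the function's transaction has ended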

> So yes, it sucks, but is it representative of real-world cases?

Hard to say, but I think it probably is to some degree. I stumbled on
it more-or-less by accident, but it wouldn't surprise me to find out
that there are people doing such things in real applications. It's
not uncommon to want to store an updateable counter somewhere.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


From: Rainer Pruy <Rainer(dot)Pruy(at)Acrys(dot)COM>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pessimal trivial-update performance
Date: 2010-07-04 07:40:06
Message-ID: 4C303AD6.5090405@acrys.com
Lists: pgsql-hackers

On 04.07.2010 06:11, Tom Lane wrote:
> ... but is it representative of real-world cases?
>
> regards, tom lane
>

Hi Tom,
we run an application in production that suffered from a similar effect.
We did not have 1,000,000 updates per row, but 10-100 updates per row on about 1-10 million rows of a table.
In the end we managed to increase performance by a factor of more than two
by adding support to the application for tracking updates internally and only "flushing" changes to the database
at the (final) application commit.
This came at a cost, since queries on the table now have to be reconciled with the internally held data
(which is not yet reflected in the database). Even so, it is more efficient than performing the updates
directly on the database. (For example, an update using the primary key of the table (about 50 million
rows total) could take over 3 seconds(!), while initially the very same update completed in well under 1 ms.)

So I think this qualifies as a real-world example of that case.

Regards,
Rainer

--
Rainer Pruy
Managing Director

Acrys Consult GmbH & Co. KG
Theodor-Heuss-Str. 53-63, 61118 Bad Vilbel, Germany
Phone: +49-6101-98760-0 Fax: +49-6101-98760-50
Web: http://www.acrys.com - Email: office(at)acrys(dot)com
Registered: Frankfurt am Main, HRA 31151
General partner: Acrys Verwaltungs GmbH
Theodor-Heuss-Str. 53-63, D-61118 Bad Vilbel
Registered: Frankfurt am Main, HRB 57625


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pessimal trivial-update performance
Date: 2010-07-04 13:48:38
Message-ID: 3271.1278251318@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Sun, Jul 4, 2010 at 12:11 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I believe that none of the dead row versions can be vacuumed during this
>> test.

> Yep, you seem to be right. The table grows to 802 pages. But why is
> it that we can't vacuum them as we go along?

Sure. What you'd need is for HeapTupleSatisfiesVacuum to observe that
(a) the tuple's xmin and xmax are equal,
(b) they're equal to my own transaction's XID,
(c) none of the live snapshots in my backend can see cmin but not cmax,
(d) cmax < currentCommandId, ensuring that every future snapshot will
see cmax too (not quite convinced this is certain to hold).

Now that we have a centralized list of all live snapshots, it's at least
possible in principle to do (c).
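
As a concrete illustration of why (c) is needed, sketched in SQL: an
open cursor in the same transaction can hold a snapshot that still has
to see the "dead" version:

BEGIN;
INSERT INTO tab VALUES (0);
DECLARE c CURSOR FOR SELECT x FROM tab;  -- snapshot taken here sees x = 0
UPDATE tab SET x = 1;   -- old version now has xmin = xmax = our own XID
FETCH c;                -- must still return 0, so that version isn't vacuumable
COMMIT;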

(I'm ignoring the possibility that the xmin and xmax are from different
subtransactions of my own XID --- that seems to complicate matters
greatly in order to handle even-more-cornerish cases.)

Of course, you'd also need to get to HeapTupleSatisfiesVacuum in the
first place. The complained-of case lacks any VACUUM call. Maybe a HOT
cleanup would happen at the right time but I'm not sure. If it doesn't,
adding one would represent a significant expenditure that would usually
not be repaid.

Another issue here is that since xmin is certainly within the GlobalXmin
horizon, it would be essential to preserve the update chain ctid links,
ie, make the tuple's update predecessor point to its successor. That
seems workable for the case of cleaning out an intermediate entry in a
HOT chain, but not otherwise.

Details left as an exercise for the student.

regards, tom lane


From: Jesper Krogh <jesper(at)krogh(dot)cc>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pessimal trivial-update performance
Date: 2010-07-05 09:56:19
Message-ID: 4C31AC43.80409@krogh.cc
Lists: pgsql-hackers

On 2010-07-04 06:11, Tom Lane wrote:
> Robert Haas<robertmhaas(at)gmail(dot)com> writes:
>
>> CREATE OR REPLACE FUNCTION update_tab() RETURNS void AS $$
>> BEGIN
>> INSERT INTO tab VALUES (0);
>> FOR i IN 1..100000 LOOP
>> UPDATE tab SET x = x + 1;
>> END LOOP;
>> END
>> $$ LANGUAGE plpgsql;
>>
> I believe that none of the dead row versions can be vacuumed during this
> test. So yes, it sucks, but is it representative of real-world cases?
>
>
The problem can generally be described as "tuples seeing multiple
updates in the same transaction"?

I think that every time PostgreSQL is used with an ORM, a certain
amount of multiple updates is taking place. I have actually been
reworking the client side to get around multiple updates, since they
popped up in one of my profiling runs. Although the time I optimized
away ended up being both "roundtrip time" + "update time", having the
database do half of it transparently might have been sufficient to
move the bigger problem elsewhere...

To sum up: yes, I think indeed it is a real-world case.

Jesper

--
Jesper


From: "Pierre C" <lists(at)peufeu(dot)com>
To: "Jesper Krogh" <jesper(at)krogh(dot)cc>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pessimal trivial-update performance
Date: 2010-07-05 10:11:38
Message-ID: op.vfc7xobzeorkce@apollo13
Lists: pgsql-hackers


> The problem can generally be described as "tuples seeing multiple
> updates in the same transaction"?
>
> I think that every time PostgreSQL is used with an ORM, a certain
> amount of multiple updates is taking place. I have actually been
> reworking the client side to get around multiple updates, since they
> popped up in one of my profiling runs. Although the time I optimized
> away ended up being both "roundtrip time" + "update time", having the
> database do half of it transparently might have been sufficient to
> move the bigger problem elsewhere...
>
> To sum up: yes, I think indeed it is a real-world case.
>
> Jesper

On the Python side, Elixir and SQLAlchemy have an excellent way of
handling this: basically, when you start a transaction, all changes are
accumulated in a "session" object and only flushed to the database on
session commit (which is also generally the transaction commit). This has
multiple advantages; for instance, it can issue multi-row statements,
updates are only done once, you save a lot of round trips, etc. Of course,
most of the time it is not compatible with database triggers, so if there
are triggers the ORM needs to be told about them.
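
In SQL terms the deferred flush boils down to something like this (a
sketch of the net effect, not of any particular ORM's output):

-- what a naive client (or the test case above) sends:
UPDATE tab SET x = x + 1;    -- issued 100,000 times, one round trip each
-- what a session that accumulates changes sends at flush time:
UPDATE tab SET x = 100000;   -- one statement, one round trip, one dead tuple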


From: Jesper Krogh <jesper(at)krogh(dot)cc>
To: Pierre C <lists(at)peufeu(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pessimal trivial-update performance
Date: 2010-07-05 10:26:23
Message-ID: 4C31B34F.4020506@krogh.cc
Lists: pgsql-hackers

On 2010-07-05 12:11, Pierre C wrote:
>
> > The problem can generally be described as "tuples seeing multiple
> > updates in the same transaction"?
> >
> > I think that every time PostgreSQL is used with an ORM, a certain
> > amount of multiple updates is taking place. I have actually been
> > reworking the client side to get around multiple updates, since
> > they popped up in one of my profiling runs. Although the time I
> > optimized away ended up being both "roundtrip time" + "update time",
> > having the database do half of it transparently might have been
> > sufficient to move the bigger problem elsewhere...
> >
> > To sum up: yes, I think indeed it is a real-world case.
> >
> > Jesper
>
> On the Python side, Elixir and SQLAlchemy have an excellent way of
> handling this: basically, when you start a transaction, all changes
> are accumulated in a "session" object and only flushed to the
> database on session commit (which is also generally the transaction
> commit). This has multiple advantages; for instance, it can issue
> multi-row statements, updates are only done once, you save a lot of
> round trips, etc. Of course, most of the time it is not compatible
> with database triggers, so if there are triggers the ORM needs to be
> told about them.

How about unique constraints, foreign key violations, and checks? Would
you also postpone those errors to commit time? And what about transactions
with lots of data?
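
(For what it's worth, SQL itself can postpone some of these checks:
foreign keys, and from 9.0 unique constraints, can be declared
deferrable. A sketch:

CREATE TABLE parent (id integer PRIMARY KEY);
CREATE TABLE child (
    parent_id integer REFERENCES parent (id) DEFERRABLE INITIALLY DEFERRED
);
BEGIN;
INSERT INTO child VALUES (1);    -- no error yet: the FK check is deferred
INSERT INTO parent VALUES (1);
COMMIT;                          -- the deferred check runs here and passes

CHECK and NOT NULL constraints, though, cannot be deferred.)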

It doesn't really seem like a net benefit to me, but I can see applications
where it will easily fit.

Jesper


From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: "Pierre C" <lists(at)peufeu(dot)com>, "Jesper Krogh" <jesper(at)krogh(dot)cc>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Haas" <robertmhaas(at)gmail(dot)com>
Subject: Re: pessimal trivial-update performance
Date: 2010-07-05 10:28:38
Message-ID: 201007051228.39200.andres@anarazel.de
Lists: pgsql-hackers

On Monday 05 July 2010 12:11:38 Pierre C wrote:
> > The problem can generally be described as "tuples seeing multiple
> > updates in the same transaction"?
> >
> > I think that every time PostgreSQL is used with an ORM, a certain
> > amount of multiple updates is taking place. I have actually been
> > reworking the client side to get around multiple updates, since they
> > popped up in one of my profiling runs. Although the time I optimized
> > away ended up being both "roundtrip time" + "update time", having the
> > database do half of it transparently might have been sufficient to
> > move the bigger problem elsewhere...
> >
> > To sum up: yes, I think indeed it is a real-world case.
> >
> > Jesper
>
> On the Python side, Elixir and SQLAlchemy have an excellent way of
> handling this: basically, when you start a transaction, all changes are
> accumulated in a "session" object and only flushed to the database on
> session commit (which is also generally the transaction commit). This has
> multiple advantages; for instance, it can issue multi-row statements,
> updates are only done once, you save a lot of round trips, etc. Of course,
> most of the time it is not compatible with database triggers, so if there
> are triggers the ORM needs to be told about them.
It's also not concurrency-safe in many cases.

Andres


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pessimal trivial-update performance
Date: 2010-07-05 21:57:37
Message-ID: AANLkTiku2ByDtPaxRIRu--k8n9bxn7J6P9gl9tXOp4BT@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jul 4, 2010 at 9:48 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Sure.  What you'd need is for HeapTupleSatisfiesVacuum to observe that
> (a) the tuple's xmin and xmax are equal,
> (b) they're equal to my own transaction's XID,
> (c) none of the live snapshots in my backend can see cmin but not cmax,
> (d) cmax < currentCommandId, ensuring that every future snapshot will
>    see cmax too (not quite convinced this is certain to hold).
[...]
> Of course, you'd also need to get to HeapTupleSatisfiesVacuum in the
> first place.  The complained-of case lacks any VACUUM call.  Maybe a HOT
> cleanup would happen at the right time but I'm not sure.  If it doesn't,
> adding one would represent a significant expenditure that would usually
> not be repaid.

It looks like a HOT cleanup happens when pd_prune_xid falls behind
OldestXmin. Normally, we set pd_prune_xid to the xmax of the deleted
tuple, but we could perhaps fudge that here to get the desired
behavior; maybe just set it to FrozenXID. Where it gets sticky is
that the proposed rules for HeapTupleSatisfiesVacuum() give different
answers depending on who does the vacuuming, so if backend A sets a
hint say, hey, there's vacuumable stuff on this page, and then backend
B tries to prune it, nothing will happen. What would be nicer is if
there were a way for the updater to mark the item pointer or tuple in
some way that would make it look vacuumable to everyone, but without
breaking the HOT chain.
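
One way to actually watch what pruning does to the page is contrib's
pageinspect (a sketch; the column selection is just what seems relevant
here):

SELECT lp, lp_flags, t_xmin, t_xmax, t_ctid
FROM heap_page_items(get_raw_page('tab', 0));
-- lp_flags exposes the redirected/dead/unused line pointers that HOT
-- pruning leaves behind, and t_ctid shows the update chain links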

> Another issue here is that since xmin is certainly within the GlobalXmin
> horizon, it would be essential to preserve the update chain ctid links,
> ie, make the tuple's update predecessor point to its successor.  That
> seems workable for the case of cleaning out an intermediate entry in a
> HOT chain, but not otherwise.

Yeah, that's a shame. HOT is huge, but it would be great if we had a
way to do partial vacuuming even when the indexed columns are updated.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company