Proposal for UPDATE: do not insert new tuple on heap if update does not change data

From: Gasper Zejn <zelo(dot)zejn(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Proposal for UPDATE: do not insert new tuple on heap if update does not change data
Date: 2016-01-20 09:55:31
Message-ID: CAMxXOOEZ4p0J6nke76XSjs1RfPR0mcb0Ddh-vbmAknjSPq-LTw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I was wondering if PostgreSQL adds new tuple if data is not changed
when using UPDATE. It turns out it does add them and I think it might
be beneficial not to add a new tuple in this case, since it causes a
great deal of maintenance: updating indexes, vacuuming table and
index, also heap fragmentation.

How to check:

CREATE TABLE foo (pk serial primary key, val text);
-- Starting point: two rows.
INSERT INTO foo VALUES (1, 'first');
INSERT INTO foo VALUES (2, 'second');
CHECKPOINT;

-- Updating row with same value.
UPDATE foo SET val = 'second' where pk = 2;
CHECKPOINT;

-- "Upsert" is the same.
INSERT INTO foo VALUES (2, 'second') ON CONFLICT (pk) DO UPDATE SET
val = 'second';
CHECKPOINT;

If after any checkpoint you look at page data, you can see multiple
versions of same row with "second".

Unfortunately, I don't believe I can come up with a patch on my own,
but will happily offer any further help with testing and ideas.

Attached is a script with minimal test case.

Kind regards,
Gasper Zejn

Attachment Content-Type Size
pg-duplicate-tuple.sh application/x-sh 1.2 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2016-01-20 10:13:26 Re: checkpointer continuous flushing
Previous Message Etsuro Fujita 2016-01-20 09:50:36 Re: Odd behavior in foreign table modification (Was: Re: Optimization for updating foreign tables in Postgres FDW)