How to check whether the row was modified by this transaction before?

Lists: pgsql-hackers
From: Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Subject: How to check whether the row was modified by this transaction before?
Date: 2012-12-06 08:58:22
Message-ID: 50C05E2E.4080602@dc.baikal.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

In a BEFORE UPDATE trigger I need to know whether the row was previously
modified by this transaction. Is it safe to use xmin and txid_current()
for this purpose (xmin is 32-bit txid type but txid_current() returns
64-bit bigint).

CREATE FUNCTION test_trigger()
RETURNS TRIGGER AS $$
BEGIN
IF OLD.xmin = txid_current() THEN
-- Do something.
ELSE
-- Do something else.
END IF;
END;
$$ LANGUAGE plpgsql;


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to check whether the row was modified by this transaction before?
Date: 2012-12-06 17:53:33
Message-ID: CA+Tgmob3KLxzksRjWFOvWVQP1L1yX5xqHUj8N+SBfzZTW60VHw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Dec 6, 2012 at 3:58 AM, Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru> wrote:
> In a BEFORE UPDATE trigger I need to know whether the row was previously
> modified by this transaction. Is it safe to use xmin and txid_current() for
> this purpose (xmin is 32-bit txid type but txid_current() returns 64-bit
> bigint).
>
> CREATE FUNCTION test_trigger()
> RETURNS TRIGGER AS $$
> BEGIN
> IF OLD.xmin = txid_current() THEN
> -- Do something.
> ELSE
> -- Do something else.
> END IF;
> END;
> $$ LANGUAGE plpgsql;

txid_current() will return a different value from xmin after the XID
space has wrapped around at least once; also, you might need to
consider subtransactions.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to check whether the row was modified by this transaction before?
Date: 2012-12-06 17:53:46
Message-ID: 6067.1354816426@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru> writes:
> In a BEFORE UPDATE trigger I need to know whether the row was previously
> modified by this transaction. Is it safe to use xmin and txid_current()
> for this purpose (xmin is 32-bit txid type but txid_current() returns
> 64-bit bigint).

> IF OLD.xmin = txid_current() THEN

Comparing to txid_current() mod 2^32 would probably work, but note this
will not think that subtransactions or parent transactions are "this
transaction", so any use of savepoints or plpgsql exception blocks is
likely to cause headaches. Why do you think you need to know this?

regards, tom lane


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to check whether the row was modified by this transaction before?
Date: 2012-12-06 18:08:51
Message-ID: 20121206180851.GB4299@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru> writes:
> > In a BEFORE UPDATE trigger I need to know whether the row was previously
> > modified by this transaction. Is it safe to use xmin and txid_current()
> > for this purpose (xmin is 32-bit txid type but txid_current() returns
> > 64-bit bigint).
>
> > IF OLD.xmin = txid_current() THEN
>
> Comparing to txid_current() mod 2^32 would probably work,

I think we should be setting the initial epoch to something other than
zero. That way, some quick testing would have revealed this problem
immediately.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to check whether the row was modified by this transaction before?
Date: 2012-12-06 18:27:12
Message-ID: 6702.1354818432@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> I think we should be setting the initial epoch to something other than
> zero. That way, some quick testing would have revealed this problem
> immediately.

Yeah, having initdb start the epoch at 1 doesn't seem unreasonable.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to check whether the row was modified by this transaction before?
Date: 2012-12-06 18:39:37
Message-ID: 20121206183937.GB20926@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2012-12-06 15:08:51 -0300, Alvaro Herrera wrote:
> Tom Lane wrote:
> > Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru> writes:
> > > In a BEFORE UPDATE trigger I need to know whether the row was previously
> > > modified by this transaction. Is it safe to use xmin and txid_current()
> > > for this purpose (xmin is 32-bit txid type but txid_current() returns
> > > 64-bit bigint).

I wonder if we shouldn't have a function txid_is_current(xid); I could
have used that previously to avoid queuing multiple external
cache-invalidations if something gets repeatedly updated in the same
transaction. And I seem to remember some people here asking this
question before on the lists.

> >
> > > IF OLD.xmin = txid_current() THEN
> >
> > Comparing to txid_current() mod 2^32 would probably work,
>
> I think we should be setting the initial epoch to something other than
> zero. That way, some quick testing would have revealed this problem
> immediately.

+1, currently the difference of xid vs bigint is hard to spot.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to check whether the row was modified by this transaction before?
Date: 2012-12-06 18:59:32
Message-ID: 7386.1354820372@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2012-12-06 15:08:51 -0300, Alvaro Herrera wrote:
>> Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru> writes:
>>> In a BEFORE UPDATE trigger I need to know whether the row was previously
>>> modified by this transaction. Is it safe to use xmin and txid_current()
>>> for this purpose (xmin is 32-bit txid type but txid_current() returns
>>> 64-bit bigint).

> I wonder if we shouldn't have a function txid_is_current(xid);

Yeah, I was wondering that too, and wanted to know if the OP had a
use-case that was mainstream enough to justify adding such a function.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to check whether the row was modified by this transaction before?
Date: 2012-12-06 19:08:50
Message-ID: 20121206190850.GE20926@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2012-12-06 13:59:32 -0500, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On 2012-12-06 15:08:51 -0300, Alvaro Herrera wrote:
> >> Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru> writes:
> >>> In a BEFORE UPDATE trigger I need to know whether the row was previously
> >>> modified by this transaction. Is it safe to use xmin and txid_current()
> >>> for this purpose (xmin is 32-bit txid type but txid_current() returns
> >>> 64-bit bigint).
>
> > I wonder if we shouldn't have a function txid_is_current(xid);
>
> Yeah, I was wondering that too, and wanted to know if the OP had a
> use-case that was mainstream enough to justify adding such a function.

I think architectures with an invalidation-queue to external caches (be
it web-proxies or something lower-level) are quite popular. And with the
new NOTIFY or pgq relatively simple. Ad to those its sensible not to
post a single primary key more than once.

Magnus had talks about specifically that on various conferences if that
counts as anything ;)

Mainstreamy enough?

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: How to check whether the row was modified by this transaction before?
Date: 2012-12-06 21:26:50
Message-ID: 9921.1354829210@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2012-12-06 13:59:32 -0500, Tom Lane wrote:
>> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
>>> On 2012-12-06 15:08:51 -0300, Alvaro Herrera wrote:
>>> Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru> writes:
>>>> In a BEFORE UPDATE trigger I need to know whether the row was previously
>>>> modified by this transaction. Is it safe to use xmin and txid_current()
>>>> for this purpose (xmin is 32-bit txid type but txid_current() returns
>>>> 64-bit bigint).

>>> I wonder if we shouldn't have a function txid_is_current(xid);

> I think architectures with an invalidation-queue to external caches (be
> it web-proxies or something lower-level) are quite popular. And with the
> new NOTIFY or pgq relatively simple. Ad to those its sensible not to
> post a single primary key more than once.

It strikes me that the notion of "this row was previously modified by
the current transaction" is squishier than it might look, and we'd do
well to clarify it before we consider exporting anything. I think there
are three ways you might define such a function:

1. xmin is exactly equal to current (sub)transaction's XID.

2. xmin is this (sub)transaction's XID, or the XID of any subcommitted
subtransaction of it.

3. xmin is this (sub)transaction's XID, or the XID of any subcommitted
subtransaction, or the XID of any open parent transaction or
subcommitted subtransaction thereof.

If I've got my head screwed on straight, test #2 gives you the semantics
that "the previous row update cannot commit unless the action you are
about to take (with the current XID) commits". Test #3 gives you the
semantics "the action you are about to take cannot commit unless the
previous row update does". And test #1 doesn't have much to recommend
it except simplicity; while it might appear to have the semantics "the
previous row update will commit if and only if the action you are about
to take commits", it's actually narrower than that, because the same
could be said for already-subcommitted subtransactions.

In a cache invalidation context you probably want test #2, but
TransactionIdIsCurrentTransactionId() presently performs test #3.

(We discussed this point in connection with commit 7b90469b, and I've
been intending since then to take a closer look at all the callers of
TransactionIdIsCurrentTransactionId to see if these semantics are in
fact what they need. We might have some bugs associated with confusion
on this.)

I'm not sure which of these semantics we might wish to expose to users.

regards, tom lane


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Re: How to check whether the row was modified by this transaction before?
Date: 2012-12-06 22:47:05
Message-ID: 20121206224705.GA6515@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2012-12-06 16:26:50 -0500, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On 2012-12-06 13:59:32 -0500, Tom Lane wrote:
> >> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> >>> On 2012-12-06 15:08:51 -0300, Alvaro Herrera wrote:
> >>> Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru> writes:
> >>>> In a BEFORE UPDATE trigger I need to know whether the row was previously
> >>>> modified by this transaction. Is it safe to use xmin and txid_current()
> >>>> for this purpose (xmin is 32-bit txid type but txid_current() returns
> >>>> 64-bit bigint).
>
> >>> I wonder if we shouldn't have a function txid_is_current(xid);
>
> > I think architectures with an invalidation-queue to external caches (be
> > it web-proxies or something lower-level) are quite popular. And with the
> > new NOTIFY or pgq relatively simple. Ad to those its sensible not to
> > post a single primary key more than once.
>
> It strikes me that the notion of "this row was previously modified by
> the current transaction" is squishier than it might look, and we'd do
> well to clarify it before we consider exporting anything.

You're right.

I am also wondering if we can assume that for all interesting purposes
enough context is available to determine if the row was just inserted or
updated or whether we need to make a test for HEAP_UPDATED available in
some form.

> I think there
> are three ways you might define such a function:
>
> 1. xmin is exactly equal to current (sub)transaction's XID.
>
> 2. xmin is this (sub)transaction's XID, or the XID of any subcommitted
> subtransaction of it.
>
> 3. xmin is this (sub)transaction's XID, or the XID of any subcommitted
> subtransaction, or the XID of any open parent transaction or
> subcommitted subtransaction thereof.
>
> If I've got my head screwed on straight, test #2 gives you the semantics
> that "the previous row update cannot commit unless the action you are
> about to take (with the current XID) commits". Test #3 gives you the
> semantics "the action you are about to take cannot commit unless the
> previous row update does". And test #1 doesn't have much to recommend
> it except simplicity; while it might appear to have the semantics "the
> previous row update will commit if and only if the action you are about
> to take commits", it's actually narrower than that, because the same
> could be said for already-subcommitted subtransactions.
>
> In a cache invalidation context you probably want test #2, but
> TransactionIdIsCurrentTransactionId() presently performs test #3.

I agree that 1) isn't all that interesting but after that it really gets
a bit harder...

2) would be useful to do stuff like avoiding to queue any invalidations
if the table is newly truncated, but that seems like a fragile thing to
do from userspace. A table rewrite seems to be something fundamentally
different from a truncation here because that would need more detail
that txid_is_current(xid) would give you.
Maybe something like 'pg_relation_is_new()' would be interesting, but
that seems like a different thing, and I am not sure the use-case for
that is clear enough.
From a userlevel perspective 3) seems to be enough if all you want to
test whether the relation is new, but again, you couldn't test for that
sensibly because there's no access to HEAP_UPDATED.

For which cases do you think 2) is interesting wrt. cache invalidations?

I think 3) might be more interesting for the (for me) common case where
you just want to avoid queuing duplicate invalidations. Because all
youre interested in that case is: "Can I be sure that if I commit
another invalidation has already been queued for this tuple.". And 3)
seems to gives you that.

> (We discussed this point in connection with commit 7b90469b, and I've
> been intending since then to take a closer look at all the callers of
> TransactionIdIsCurrentTransactionId to see if these semantics are in
> fact what they need. We might have some bugs associated with confusion
> on this.)

Its an easy error to make I think, so I wouldn't be too surprised
somebody made it before me. On the other hand, it seems to me that to be
dangerous you need a part of the system thats not obeying transactional
semantics like the indexes which still contain enum oids that aren't in
the catalogs anymore. So I hope there aren't too many places where
people could have made that mistake.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: How to check whether the row was modified by this transaction before?
Date: 2012-12-07 02:38:13
Message-ID: 50C15695.5080002@dc.baikal.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/07/2012 02:53 AM, Tom Lane wrote:
> Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru> writes:
>> In a BEFORE UPDATE trigger I need to know whether the row was previously
>> modified by this transaction. Is it safe to use xmin and txid_current()
>> for this purpose (xmin is 32-bit txid type but txid_current() returns
>> 64-bit bigint).
>> IF OLD.xmin = txid_current() THEN
> Comparing to txid_current() mod 2^32 would probably work, but note this
> will not think that subtransactions or parent transactions are "this
> transaction", so any use of savepoints or plpgsql exception blocks is
> likely to cause headaches. Why do you think you need to know this?
>
> regards, tom lane
>
The use case is quite simple. I'm trying to rewrite our internal system
versioning extension (SQL feature T180) in more abstract way. Any
temporal versioned table uses its associated history table to store
updated and deleted data rows. For this purpose the extension adds AFTER
UPDATE/DELETE triggers to the table that insert OLD row in the history
table for updated and deleted rows. But if there are multiple changes to
a row in the same transaction the trigger should generate a history row
only for the first change.

On 12/07/2012 06:26 AM, Tom Lane wrote:
> It strikes me that the notion of "this row was previously modified by
> the current transaction" is squishier than it might look, and we'd do
> well to clarify it before we consider exporting anything. I think there
> are three ways you might define such a function:
>
> 1. xmin is exactly equal to current (sub)transaction's XID.
>
> 2. xmin is this (sub)transaction's XID, or the XID of any subcommitted
> subtransaction of it.
>
> 3. xmin is this (sub)transaction's XID, or the XID of any subcommitted
> subtransaction, or the XID of any open parent transaction or
> subcommitted subtransaction thereof.
If I understand you correctly, what I'm looking for is described by the
3rd case and I may use TransactionIdIsCurrentTransactionId() for this
purpose?