Re: Using xmin to identify last modified rows

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Richard Broersma <richard(dot)broersma(at)gmail(dot)com>
Cc: Stéphane A(dot) Schildknecht <stephane(dot)schildknecht(at)postgresqlfr(dot)org>, Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Using xmin to identify last modified rows
Date: 2009-02-25 15:36:15
Message-ID: 87skm2ttow.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Richard Broersma <richard(dot)broersma(at)gmail(dot)com> writes:

> On Wed, Feb 25, 2009 at 5:21 AM, Gregory Stark <stark(at)enterprisedb(dot)com> wrote:
>
>> I'm not sure using xmin is such a great idea really. It's handy for ad-hoc
>> queries but there are all kinds of cases where it might not give you the
>> results you expect.
>
>
> Its been a while since the following emails were written. Has the
> treatment of xmin changed since then, or is using a timestamp a better
> practice?
>
> http://archives.postgresql.org/pgsql-novice/2007-02/msg00079.php
> http://archives.postgresql.org/pgsql-hackers/2004-02/msg00654.php

Well those emails aren't discussing evaluating when records were updated or
deciding which were updated more recently than others. All they say is that in
Postgres if ctid and xmin both match then you're looking at the same version
of the same record. For a gui table editor or ODBC driver that's an important
thing to know.

If what you want to do is find records which have been updated for something
like a "Recently updated pages" it's unlikely that the desired behaviour will
exactly match how Postgres works. You're better off deciding the policy you
want and writing code to implement that.

Some examples of how xmin might not do what you expect: The order in which
transactions *start* will determine the ordering, not the order in which they
commit. If you look at records you've updated in the same transaction it's
even possible to see records which come from the "future". If any records have
frozen you lose any record of what order they were created.

Another example is that it's impossible to ignore "trivial" updates -- any
update will update xmin no matter how trivial, even if no columns are updated.

Worse, in the future there may be changes to database internals which change
when xmin is updated which won't match your desired policy. For example if we
decide to replace VACUUM FULL with something which does no-op updates instead
of moving tuples then you'll find records spontaneously appearing to have been
recently updated.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Andreas Wenk 2009-02-25 15:53:22 cursor question
Previous Message Albe Laurenz 2009-02-25 15:31:21 Re: Using xmin to identify last modified rows