Re: Using xmin to identify last modified rows

Lists: pgsql-general
From: "Stéphane A(dot) Schildknecht" <stephane(dot)schildknecht(at)postgresqlfr(dot)org>
To: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Using xmin to identify last modified rows
Date: 2009-02-25 12:00:40
Message-ID: 49A532E8.5090803@postgresqlfr.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

Trying to identify last modified (updated or inserted) rows in a table, I
thought I could use xmin.

I tried is to get some lines sorted by xmin.

When doing it on a slonified database, I had no problem getting these lines.

But, trying the same query on a non slonified DB, I got an error, as there is
no ordering operator for xid.

I think that in the slon case, the query uses the implicit cast xid->xxid, and
then the operator to sort xxid.

What would be the best way to get last modified rows?

What I tried :
db=# select id_table, date_table, code_table from tb_table order by xmin desc
limit 10;
ERROR: could not identify an ordering operator for type xid
ASTUCE : Use an explicit ordering operator or modify the query.

Thanks in advance.

Best regards,
- --
Stéphane Schildknecht
PostgreSQLFr - http://www.postgresql.fr
Dalibo - http://www.dalibo.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFJpTLoA+REPKWGI0ERAiCaAKCOSFQp/RtWFaLScwXLpqXQJKGzLgCgsNUn
jXCUCSBBXVP7WEIn/M0Pklc=
=PN5v
-----END PGP SIGNATURE-----


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Stéphane A(dot) Schildknecht <stephane(dot)schildknecht(at)postgresqlfr(dot)org>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Using xmin to identify last modified rows
Date: 2009-02-25 13:21:12
Message-ID: 871vtmveif.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

"Stéphane A. Schildknecht" <stephane(dot)schildknecht(at)postgresqlfr(dot)org> writes:

> But, trying the same query on a non slonified DB, I got an error, as there is
> no ordering operator for xid.
>
> I think that in the slon case, the query uses the implicit cast xid->xxid, and
> then the operator to sort xxid.

You could order by age(xmin) instead

> What would be the best way to get last modified rows?

I'm not sure using xmin is such a great idea really. It's handy for ad-hoc
queries but there are all kinds of cases where it might not give you the
results you expect.

You probably want to put a timestamp column on your tables and manage the date
you put in their according to a policy you control.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!


From: Richard Broersma <richard(dot)broersma(at)gmail(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Stéphane A(dot) Schildknecht <stephane(dot)schildknecht(at)postgresqlfr(dot)org>, Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Using xmin to identify last modified rows
Date: 2009-02-25 13:40:49
Message-ID: 396486430902250540g58da7123i19dcd7b620e21e6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

On Wed, Feb 25, 2009 at 5:21 AM, Gregory Stark <stark(at)enterprisedb(dot)com> wrote:

> I'm not sure using xmin is such a great idea really. It's handy for ad-hoc
> queries but there are all kinds of cases where it might not give you the
> results you expect.

Its been a while since the following emails were written. Has the
treatment of xmin changed since then, or is using a timestamp a better
practice?

http://archives.postgresql.org/pgsql-novice/2007-02/msg00079.php
http://archives.postgresql.org/pgsql-hackers/2004-02/msg00654.php

--
Regards,
Richard Broersma Jr.

Visit the Los Angeles PostgreSQL Users Group (LAPUG)
http://pugs.postgresql.org/lapug


From: "Stéphane A(dot) Schildknecht" <stephane(dot)schildknecht(at)postgresqlfr(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Using xmin to identify last modified rows
Date: 2009-02-25 15:07:22
Message-ID: 49A55EAA.8040001@postgresqlfr.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Gregory Stark a écrit :
> "Stéphane A. Schildknecht" <stephane(dot)schildknecht(at)postgresqlfr(dot)org> writes:
(...)
>
> You could order by age(xmin) instead
>
>> What would be the best way to get last modified rows?
>
> I'm not sure using xmin is such a great idea really. It's handy for ad-hoc
> queries but there are all kinds of cases where it might not give you the
> results you expect.
>
> You probably want to put a timestamp column on your tables and manage the date
> you put in their according to a policy you control.
>
>

Gregory,

Thanks for the answer.

A timestamp would surely be a better idea. BTW, I don't have hand on the schema
yet and was just looking for a quick way to get some last modified rows.

Regards,
- --
Stéphane Schildknecht
PostgreSQLFr - http://www.postgresql.fr
Dalibo - http://www.dalibo.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFJpV6qA+REPKWGI0ERAhq/AJwNt845SDujYmFhe4aTqI30QBBC9gCg4vcH
edlSZti3KDtozJ82Od0nErQ=
=z1lm
-----END PGP SIGNATURE-----


From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Stéphane A(dot) Schildknecht" *EXTERN* <stephane(dot)schildknecht(at)postgresqlfr(dot)org>, "Postgres General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Using xmin to identify last modified rows
Date: 2009-02-25 15:31:21
Message-ID: D960CB61B694CF459DCFB4B0128514C2031AB58B@exadv11.host.magwien.gv.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Stéphane A. Schildknecht wrote:
> Trying to identify last modified (updated or inserted) rows in a table, I
> thought I could use xmin.
>
> I tried is to get some lines sorted by xmin.
>
> When doing it on a slonified database, I had no problem getting these lines.
>
> But, trying the same query on a non slonified DB, I got an error, as there is
> no ordering operator for xid.
>
> I think that in the slon case, the query uses the implicit cast xid->xxid, and
> then the operator to sort xxid.
>
> What would be the best way to get last modified rows?
>
> What I tried :
> db=# select id_table, date_table, code_table from tb_table order by xmin desc limit 10;
> ERROR: could not identify an ordering operator for type xid
> ASTUCE : Use an explicit ordering operator or modify the query.

What about:

test=> SELECT xmin, * FROM mausi ORDER BY xmin::text::bigint;
xmin | id | val
-------+----+----------------
14005 | 1 | test\test/test
14040 | 3 | mamma
(2 rows)

Yours,
Laurenz Albe


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Richard Broersma <richard(dot)broersma(at)gmail(dot)com>
Cc: Stéphane A(dot) Schildknecht <stephane(dot)schildknecht(at)postgresqlfr(dot)org>, Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Using xmin to identify last modified rows
Date: 2009-02-25 15:36:15
Message-ID: 87skm2ttow.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Richard Broersma <richard(dot)broersma(at)gmail(dot)com> writes:

> On Wed, Feb 25, 2009 at 5:21 AM, Gregory Stark <stark(at)enterprisedb(dot)com> wrote:
>
>> I'm not sure using xmin is such a great idea really. It's handy for ad-hoc
>> queries but there are all kinds of cases where it might not give you the
>> results you expect.
>
>
> Its been a while since the following emails were written. Has the
> treatment of xmin changed since then, or is using a timestamp a better
> practice?
>
> http://archives.postgresql.org/pgsql-novice/2007-02/msg00079.php
> http://archives.postgresql.org/pgsql-hackers/2004-02/msg00654.php

Well those emails aren't discussing evaluating when records were updated or
deciding which were updated more recently than others. All they say is that in
Postgres if ctid and xmin both match then you're looking at the same version
of the same record. For a gui table editor or ODBC driver that's an important
thing to know.

If what you want to do is find records which have been updated for something
like a "Recently updated pages" it's unlikely that the desired behaviour will
exactly match how Postgres works. You're better off deciding the policy you
want and writing code to implement that.

Some examples of how xmin might not do what you expect: The order in which
transactions *start* will determine the ordering, not the order in which they
commit. If you look at records you've updated in the same transaction it's
even possible to see records which come from the "future". If any records have
frozen you lose any record of what order they were created.

Another example is that it's impossible to ignore "trivial" updates -- any
update will update xmin no matter how trivial, even if no columns are updated.

Worse, in the future there may be changes to database internals which change
when xmin is updated which won't match your desired policy. For example if we
decide to replace VACUUM FULL with something which does no-op updates instead
of moving tuples then you'll find records spontaneously appearing to have been
recently updated.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Richard Broersma <richard(dot)broersma(at)gmail(dot)com>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, Stéphane A(dot) Schildknecht <stephane(dot)schildknecht(at)postgresqlfr(dot)org>, Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Using xmin to identify last modified rows
Date: 2009-02-25 16:26:33
Message-ID: 26861.1235579193@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general

Richard Broersma <richard(dot)broersma(at)gmail(dot)com> writes:
> On Wed, Feb 25, 2009 at 5:21 AM, Gregory Stark <stark(at)enterprisedb(dot)com> wrote:
>> I'm not sure using xmin is such a great idea really. It's handy for ad-hoc
>> queries but there are all kinds of cases where it might not give you the
>> results you expect.

> Its been a while since the following emails were written. Has the
> treatment of xmin changed since then, or is using a timestamp a better
> practice?

> http://archives.postgresql.org/pgsql-novice/2007-02/msg00079.php
> http://archives.postgresql.org/pgsql-hackers/2004-02/msg00654.php

Those statements are all still true, but notice that nowhere do they
suggest doing anything except simple equality comparisons on XIDs.
The OP was looking for ordering, which is a lot trickier, especially
if you might be dealing with old (frozen) tuples.

regards, tom lane