Re: preserving forensic information when we freeze

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Jim Nasby <jim(at)nasby(dot)net>
Subject: Re: preserving forensic information when we freeze
Date: 2014-01-02 19:44:34
Message-ID: 14138.1388691874@sss.pgh.pa.us
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2014-01-02 12:46:34 -0500, Tom Lane wrote:
>> For real
>> forensics work, you need to be able to see all tuples, which makes me
>> think that something akin to pgstattuple is the right API; that is "return
>> a set of the header info for all tuples on such-and-such pages of this
>> relation". That should dodge any performance problem, because the
>> heap_open overhead could be amortized across lots of tuples, and it also
>> sidesteps all problems with adding new system columns.

> The biggest problem with such an API is that it's painful to use - I've
> used pageinspect a fair bit, and not being able to easily get the
> content of the rows you're looking at makes it really far less useful in
> many scenarios. That could partially be improved by a neater API

Surely. Why couldn't you join against the table on ctid?

> And I really don't see any page-at-a-time access that's going to be
> convenient.

As I commented to Robert, the page-at-a-time behavior of pageinspect
is not an API detail we'd want to copy for this. I envision something
like

    select hdr.*, foo.*
      from tuple_header_details('foo'::regclass) as hdr
           left join foo on hdr.ctid = foo.ctid;

On a large table you might want a version that restricts its scan
to pages M through N, but that's just optimization. More useful
would be to improve the planner's intelligence about joins on ctid ...
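
Note that tuple_header_details() is a hypothetical function; nothing by
that name exists. As a rough approximation using what the pageinspect
extension already provides, a query along the following lines walks a range
of pages and joins the raw line pointers back to the visible rows on ctid.
This is only a sketch under stated assumptions: pageinspect is installed,
the table is named foo, the caller is a superuser, and the server supports
LATERAL (9.3 or later). The page bounds 0 and 9 are placeholders for M and N.

```sql
-- Sketch only: approximates the hypothetical tuple_header_details('foo')
-- with pageinspect. Table name foo and the page bounds are assumptions.
CREATE EXTENSION IF NOT EXISTS pageinspect;

SELECT blk.n AS page, hdr.lp, hdr.lp_flags,
       hdr.t_xmin, hdr.t_xmax, hdr.t_infomask, hdr.t_infomask2,
       foo.*
FROM generate_series(
         0,            -- M: first page to inspect (placeholder)
         least(9,      -- N: last page to inspect (placeholder)
               pg_relation_size('foo')
                   / current_setting('block_size')::int - 1)
     ) AS blk(n)
-- One row per line pointer on each page in the range.
CROSS JOIN LATERAL heap_page_items(get_raw_page('foo', blk.n::int)) AS hdr
-- Build the tid from (page, line pointer) and match it to visible rows.
LEFT JOIN foo ON foo.ctid = format('(%s,%s)', blk.n, hdr.lp)::tid;
```

Line pointers whose tuples are not visible to the current snapshot (dead,
unused, or redirected items) come back with the foo columns NULL, which is
what the left join is for. The raw page reads and the scan of foo are not
taken atomically, so concurrent updates can make the two sides disagree.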

>>> [ removing system columns from pg_attribute ]
>> I think this will inevitably break a lot of code, not all of it ours,
>> so I'm not in favor of pursuing that direction.

> Are you thinking of client or extension code? From what I've looked at I
> don't think it's all that likely to break much of either.

It will break anything that assumes that every column is represented in
pg_attribute. I think if you think this assumption is easily removed,
you've not looked hard enough.

> It would make pg_attribute a fair bit smaller, especially on systems
> with lots of narrow relations.

I'd like to do that too, but I think getting rid of xmin/xmax/cmin/cmax
would be enough to get most of the benefit, and we could do that without
any inconsistency if we stopped exposing those as system columns.

regards, tom lane
