From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Stephen Frost <sfrost(at)snowman(dot)net>, Jim Nasby <jim(at)nasby(dot)net> |
Subject: | Re: preserving forensic information when we freeze |
Date: | 2014-01-02 19:44:34 |
Message-ID: | 14138.1388691874@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> On 2014-01-02 12:46:34 -0500, Tom Lane wrote:
>> For real
>> forensics work, you need to be able to see all tuples, which makes me
>> think that something akin to pgstattuple is the right API; that is "return
>> a set of the header info for all tuples on such-and-such pages of this
>> relation". That should dodge any performance problem, because the
>> heap_open overhead could be amortized across lots of tuples, and it also
>> sidesteps all problems with adding new system columns.
> The biggest problem with such an API is that it's painful to use - I've
> used pageinspect a fair bit, and not being able to easily get the
> content of the rows you're looking at makes it really far less useful in
> many scenarios. That could partially be improved by a neater API
Surely. Why couldn't you join against the table on ctid?
> And I really don't see any page-at-a-time access that's going to be
> convenient.
As I commented to Robert, the page-at-a-time behavior of pageinspect
is not an API detail we'd want to copy for this. I envision something
like
    select hdr.*, foo.*
      from tuple_header_details('foo'::regclass) as hdr
           left join foo on hdr.ctid = foo.ctid;
On a large table you might want a version that restricts its scan
to pages M through N, but that's just optimization. More useful
would be to improve the planner's intelligence about joins on ctid ...
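For comparison, something roughly similar can be approximated with the existing pageinspect functions, page-at-a-time behavior and all. This is only a sketch under stated assumptions: pageinspect is installed, the caller is a superuser, a table named foo exists, and pages 0 through 9 of it exist (get_raw_page raises an error for a block past the end of the relation). The tuple_header_details function above is hypothetical and is not used here.

```sql
-- Sketch only: assumes CREATE EXTENSION pageinspect has been run,
-- that a table named foo exists, and that it has at least 10 pages.
SELECT hdr.lp,
       hdr.t_xmin,
       hdr.t_xmax,
       hdr.t_infomask,
       hdr.t_ctid,
       foo.*
FROM generate_series(0, 9) AS blk(blkno)            -- pages M..N, here 0..9
CROSS JOIN LATERAL
     heap_page_items(get_raw_page('foo', blk.blkno)) AS hdr
LEFT JOIN foo
       -- rebuild the tuple's ctid from its block number and line pointer
       ON foo.ctid = format('(%s,%s)', blk.blkno, hdr.lp)::tid;
```

The left join keeps header rows for tuples that are not visible to the current snapshot, which come back with NULLs in the foo columns; unused line pointers show NULL header fields as well.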
>>> [ removing system columns from pg_attribute ]
>> I think this will inevitably break a lot of code, not all of it ours,
>> so I'm not in favor of pursuing that direction.
> Are you thinking of client or extension code? From what I've looked at I
> don't think it's all that likely to break much of either.
It will break anything that assumes that every column is represented in
pg_attribute. I think if you think this assumption is easily removed,
you've not looked hard enough.
> It would make pg_attribute a fair bit smaller, especially on systems
> with lots of narrow relations.
I'd like to do that too, but I think getting rid of xmin/xmax/cmin/cmax
would be enough to get most of the benefit, and we could do that without
any inconsistency if we stopped exposing those as system columns.
regards, tom lane