Re: Review: Patch to compute Max LSN of Data Pages

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Robert Haas'" <robertmhaas(at)gmail(dot)com>, "'Andres Freund'" <andres(at)2ndquadrant(dot)com>
Cc: "'Josh Berkus'" <josh(at)agliodbs(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Review: Patch to compute Max LSN of Data Pages
Date: 2013-07-03 12:44:11
Message-ID: 003101ce77eb$0507cbe0$0f1763a0$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wednesday, July 03, 2013 1:26 AM Robert Haas wrote:
> On Tue, Jun 25, 2013 at 1:42 PM, Andres Freund <andres(at)2ndquadrant(dot)com>
> wrote:
> > I think the usecase for this utility isn't big enough to be included
> in
> > postgres since it really can only help in a very limited
> > circumstances. And I think it's too likely to be misused for stuff
> it's
> > not useable for (e.g. remastering).
> >
> > The only scenario I see is that somebody deleted/corrupted
> > pg_controldata. In that scenario the tool is supposed to be used to
> find
> > the biggest lsn used so far so the user then can use pg_resetxlog to
> set
> > that as the wal starting point.
> > But that can be way much easier solved by just setting the LSN to
> > something very, very high. The database cannot be used for anything
> > reliable afterwards anyway.
>
> I guess this is true, but I think I'm mildly in favor of including
> this anyway. I think I would have used it once or twice, if it had
> been there - maybe not even to feed into pg_resetxlog, but just to
> check for future LSNs. We don't have anything like a suite of
> diagnostic tools in bin or contrib today, for use by database
> professionals in fixing things that fall strictly in the realm of
> "don't try this at home", and I think we should have such a thing.
> Unfortunately this covers about 0.1% of the space I'd like to see
> covered, which might be a reason to reject it or at least insist that
> it be enhanced first.
>
> At any rate, I don't think this is anywhere near committable as it
> stands today. Some random review comments:
>
> remove_parent_refernces() is spelled wrong.

It was corrected in last posted version.
http://www.postgresql.org/message-id/6C0B27F7206C9E4CA54AE035729E9C383BEB928
8(at)szxeml509-mbx

> Why does this patch need all of this fancy path-handling logic -
> specifically, remove_parent_refernces() and make_absolute_path()?
> Surely its needs are not that different from pg_ctl or pg_resetxlog or
> pg_controldata. If they all need it and it's duplicated, we should
> fix that. Otherwise, why the special logic here?
>
> I don't think we need getLinkPath() either. The server has no trouble
> finding its files by just using a pathname that follows the symlink.
> Why do we need anything different here? The whole point of symlinks
> is that you can traverse them transparently, without knowing where
> they point.

It is to handle negative scenario's like if there is any recursion in path.
However if you feel this is not important, it can be removed.

> Duplicating PageHeaderIsValid doesn't seem acceptable. Moreover,
> since the point of this is to be able to use it on a damaged cluster,
> why is that logic even desirable? I think we'd be better off assuming
> the pages to be valid.
>
> The calling convention for this utility seems quite baroque. There's
> no real need for all of this -p/-P stuff. I think the syntax should
> just be:
>
> pg_computemaxlsn file_or_directory...
>
> For each argument, we determine whether it's a file or a directory.
> If it's a file, we assume it's a PostgreSQL data file and scan it. If
> it's a directory, we check whether it looks like a data directory. If
> it does, we recurse through the whole tree structure and find the data
> files, and process them. If it doesn't look like a data directory, we
> scan each plain file in that directory whose name looks like a
> PostgreSQL data file name. With this approach, there's no need to
> limit ourselves to a single input argument and no need to specify what
> kind of argument it is; the tool just figures it out.
>
> I think it would be a good idea to have a mode that prints out the max
> LSN found in *each data file* scanned, and then prints out the overall
> max found at the end. In fact, I think that should perhaps be the
> default mode, with -q, --quiet to disable it.

Printing too many LSN for each file might fill user's screen and he might be
needing only overall LSN.
Should we keep -p --printall as option to print all LSN's and keep default
as overall max LSN?

With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2013-07-03 12:59:15 Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])
Previous Message Simon Riggs 2013-07-03 12:39:53 Re: Move unused buffers to freelist