Re: Review: Patch to compute Max LSN of Data Pages

From: Amit kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Muhammad Usama'" <m(dot)usama(at)gmail(dot)com>
Cc: "'Robert Haas'" <robertmhaas(at)gmail(dot)com>, "'Fujii Masao'" <masao(dot)fujii(at)gmail(dot)com>, "'Alvaro Herrera'" <alvherre(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Review: Patch to compute Max LSN of Data Pages
Date: 2012-11-27 12:52:15
Message-ID: 6C0B27F7206C9E4CA54AE035729E9C383BE96305@szxeml509-mbs
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Friday, November 23, 2012 5:38 AM Muhammad Usama wrote:

> Observations and Comments
> -------------------------------------------
> - If no option is given to pg_computemaxlsn utility then we get a wrong
> error message
> ./pg_computemaxlsn ../data
> pg_computemaxlsn: file or directory "(null)" exists
> In my opinion we should consider the -P (Print max LSN from whole data
> directory) as default in case no option is specified, or otherwise throw a
> more descriptive error.

Fixed. By considering default option as Whole data directory.

> - Options -p and -P should be mutually exclusive

Fixed. Both options will not be allowed at the same time.

- Long options missing for -p and -P

Fixed. introduced long options as --path & --data-directory.

> - Help string for -P should be something like "print max LSN from whole
> data directory" instead of "print max LSN from whole database"

Fixed, by changing the message as suggested.

> - Help menu is silent about -V or --version option

Fixed.

> - I think when finding the max LSN of single file the utility should
> consider all relation segments.

IMHO, as this utility is not aware of relation or any other logical entity and deals with only file or directory,

it is okay to find only for that file.

> - There is no check for extra arguments
> e.g
> ./pg_computemaxlsn -P . ../data/ ../data/base/ ../data2/

Fixed.
- Extra options gives error.
- Multiple -p options also gives error.
Eg:
./pg_computemaxlsn -p base/12286/12200 -p base/12286/12201 data

> - For -p {file | dir} option the utility expects the file path relative to
> the specified data directory path which makes thing little confusing
> for example
> ./pg_computemaxlsn -p data/base/12286/12200 data
> pg_computemaxlsn: file or directory "data/base/12286/12200" does not exists
> although data/base/12286/12200 is valid file path but we gets file does not
> exists error
As the path of file is relative to data directory, that's why in error it prints the path as "data/base/12286/12200".
Do you have any suggestion what should be done for this?

> - Utility should print the max LSN number in hexadecimal notation and LSN
> format (logid/segno) for consistency with PG,

Fixed

> and may also print the file name which contains that LSN.

It might be confusing to users for the cases where it has to follow link (pg_tblspc) and then print some absolute path which is not related to relative path given by user.
Also I am not sure if it will be of any use to user.

> - When finding max LSN from directory, the utility does not considers the
> sub directories, which I think could be useful in some cases, like if we
> want to find the max LSN in tablespace directory. With the current
> implementation we would require to run the utility multiple times, once for
> each database.

It will consider sub-directories as well.

> Code Review
> ---------------------

> Code generally looks good, I have few small points.

> - In main() function variable maxLSN is defined as uint64,
> Although XLogRecPtr and uint64 are the same thing, but I think we should
> use XLogRecPtr instead.
Changed as per suggestion.

> - Just a small thing although will hardly improve anything but still for
> sake of saving few cycles, I think you should use XLByteLT ( < ) instead
> of XLByteLE ( <= )
> to find if the previous max LSN is lesser then current.

Changed as per suggestion.

> - '\n' is missing from one the printf in the code :-)
> fprintf(stderr, _("%s: file or directory \"%s\" does not exists"),
> progname, LsnSearchPath);

Changed as per suggestion.

> - While finding the max LSN from single file, you are not verifying that
> the file actually exists inside the data directory path provided. and can
> end up looking
>at wrong data directory for checking if the server is running.
> For example:
> bin/pg_computemaxlsn -p $PWD/data/base/12286/12200
> $PWD/otherinstallation/data
> Maximum LSN found is: 0, WAL segment file name (fileid,seg):
> 0000000000000000

> - Code is not verifying, if the provided data directory is the valid data
> directory, This could make pg_computemaxlsn to compute max LSN from the
> running server file.
> For example:
> bin/pg_computemaxlsn -p $PWD/data/base/12286/12200 NONDATADIR/
> Maximum LSN found is: 0, WAL segment file name (fileid,seg):
>0000000000000000

Fixed, by allowing only relative path for file or directory and check if that is relative and below data directory with function path_is_relative_and_below_cwd().

> Questions
> -----------------

> - I am wondering why are you not following the link inside the "pg_tblspc"
> directory to get to the table space location instead of building the
> directory path. I am thinking if you follow the link instead, The utility
> can be used across the different versions of PG.
Changed as per suggestion.

Apart from above, updated the documentation.
With Regards,
Amit Kapila.

Attachment Content-Type Size
pg_computemaxlsn.v2.patch application/octet-stream 20.0 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2012-11-27 13:45:05 Re: [WIP] pg_ping utility
Previous Message Pavan Deolasee 2012-11-27 11:23:21 Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update