Review: Patch to compute Max LSN of Data Pages

From: Muhammad Usama <m(dot)usama(at)gmail(dot)com>
To: Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Review: Patch to compute Max LSN of Data Pages
Date: 2012-11-23 00:08:16
Message-ID: CAEJvTzXArhYnmi8RBEbMOU9M9U+zVMs5aoSN19Tm_9SZ3PW_Qg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Amit

I have reviewed and tested the patch, Following are my observations and
comments.

Basic stuff
-----------------
- Patch applies OK
- Compiles cleanly with no warnings
- All regression tests pass.

Observations and Comments
-------------------------------------------

- If no option is given to pg_computemaxlsn utility then we get a wrong
error message
./pg_computemaxlsn ../data
pg_computemaxlsn: file or directory "(null)" exists
In my opinion we should consider the -P (Print max LSN from whole data
directory) as default in case no option is specified, or otherwise throw a
more descriptive error.

- Options -p and -P should be mutually exclusive

- Long options missing for -p and -P

- Help string for -P should be something like "print max LSN from whole
data directory" instead of "print max LSN from whole database"

- Help menu is silent about -V or --version option

- I think when finding the max LSN of single file the utility should
consider all relation segments.

- There is no check for extra arguments
e.g
./pg_computemaxlsn -P . ../data/ ../data/base/ ../data2/

- For -p {file | dir} option the utility expects the file path relative to
the specified data directory path which makes thing little confusing
for example
./pg_computemaxlsn -p data/base/12286/12200 data
pg_computemaxlsn: file or directory "data/base/12286/12200" does not exists
although data/base/12286/12200 is valid file path but we gets file does not
exists error

- Utility should print the max LSN number in hexadecimal notation and LSN
format (logid/segno) for consistency with PG, and may also print the file
name which contains that LSN.

- When finding max LSN from directory, the utility does not considers the
sub directories, which I think could be useful in some cases, like if we
want to find the max LSN in tablespace directory. With the current
implementation we would require to run the utility multiple times, once for
each database.

Code Review
---------------------

Code generally looks good, I have few small points.

- In main() function variable maxLSN is defined as uint64,
Although XLogRecPtr and uint64 are the same thing, but I think we should
use XLogRecPtr instead.

- Just a small thing although will hardly improve anything but still for
sake of saving few cycles, I think you should use XLByteLT ( < ) instead
of XLByteLE ( <= )
to find if the previous max LSN is lesser then current.

- '\n' is missing from one the printf in the code :-)
fprintf(stderr, _("%s: file or directory \"%s\" does not exists"),
progname, LsnSearchPath);

- While finding the max LSN from single file, you are not verifying that
the file actually exists inside the data directory path provided. and can
end up looking
at wrong data directory for checking if the server is running.
For example:
bin/pg_computemaxlsn -p $PWD/data/base/12286/12200
$PWD/otherinstallation/data
Maximum LSN found is: 0, WAL segment file name (fileid,seg):
0000000000000000

- Code is not verifying, if the provided data directory is the valid data
directory, This could make pg_computemaxlsn to compute max LSN from the
running server file.
For example:
bin/pg_computemaxlsn -p $PWD/data/base/12286/12200 NONDATADIR/
Maximum LSN found is: 0, WAL segment file name (fileid,seg):
0000000000000000

Questions
-----------------

- I am wondering why are you not following the link inside the "pg_tblspc"
directory to get to the table space location instead of building the
directory path. I am thinking if you follow the link instead, The utility
can be used across the different versions of PG.

Regards,
Muhammad Usama

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2012-11-23 05:30:19 Re: PQconninfo function for libpq
Previous Message Fujii Masao 2012-11-22 19:14:38 Re: the number of pending entries in GIN index with FASTUPDATE=on