pg_controldata gobbledygook

Lists: pgsql-hackers
From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: pg_controldata gobbledygook
Date: 2013-04-26 03:07:02
Message-ID: 1366945622.8928.16.camel@vanquo.pezone.net

I'm not sure who is supposed to be able to read this sort of stuff:

Latest checkpoint's NextXID: 0/7575
Latest checkpoint's NextOID: 49152
Latest checkpoint's NextMultiXactId: 7
Latest checkpoint's NextMultiOffset: 13
Latest checkpoint's oldestXID: 1265
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 0
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1

Note that these symbols don't even correspond to the actual symbols used
in the source code in some cases.

The comments in the pg_control.h header file use much more pleasant
terms, which when put to use would lead to output similar to this:

Latest checkpoint's next free transaction ID: 0/7575
Latest checkpoint's next free OID: 49152
Latest checkpoint's next free MultiXactId: 7
Latest checkpoint's next free MultiXact offset: 13
Latest checkpoint's cluster-wide minimum datfrozenxid: 1265
Latest checkpoint's database with cluster-wide minimum datfrozenxid: 1
Latest checkpoint's oldest transaction ID still running: 0
Latest checkpoint's cluster-wide minimum datminmxid: 1
Latest checkpoint's database with cluster-wide minimum datminmxid: 1

One could even rearrange the layout a little bit like this:

Control data as of latest checkpoint:
next free transaction ID: 0/7575
next free OID: 49152
etc.

Comments?


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_controldata gobbledygook
Date: 2013-04-26 03:19:14
Message-ID: 29207.1366946354@sss.pgh.pa.us
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> The comments in the pg_control.h header file use much more pleasant
> terms, which when put to use would lead to output similar to this:

> Latest checkpoint's next free transaction ID: 0/7575
> Latest checkpoint's next free OID: 49152
> Latest checkpoint's next free MultiXactId: 7
> Latest checkpoint's next free MultiXact offset: 13
> Latest checkpoint's cluster-wide minimum datfrozenxid: 1265
> Latest checkpoint's database with cluster-wide minimum datfrozenxid: 1
> Latest checkpoint's oldest transaction ID still running: 0
> Latest checkpoint's cluster-wide minimum datminmxid: 1
> Latest checkpoint's database with cluster-wide minimum datminmxid: 1

> One could even rearrange the layout a little bit like this:

> Control data as of latest checkpoint:
> next free transaction ID: 0/7575
> next free OID: 49152
> etc.

> Comments?

I think I've heard of scripts grepping the output of pg_controldata for
this that or the other. Any rewording of the labels would break that.
While I'm not opposed to improving the labels, I would vote against your
second, abbreviated scheme because it would make things ambiguous for
simple grep-based scripts.

regards, tom lane
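
To illustrate the ambiguity concern: in the grouped layout, a label such as "next free OID" only has meaning together with the header above it, so a line-oriented filter is no longer enough and a script has to carry state. A minimal Python sketch, assuming (hypothetically) that a header is any line that ends in a colon with no value after it. The function name parse_grouped is made up for this example, and the sample text is the layout proposed earlier in the thread, not real pg_controldata output:

```python
def parse_grouped(text):
    """Parse 'header:' / 'label: value' lines into fully qualified keys.

    Assumption: a line whose text after the first colon is empty is a
    section header; pg_controldata does not define such a format.
    """
    result = {}
    header = ""
    for line in text.splitlines():
        label, sep, value = line.strip().partition(":")
        if not sep:
            continue  # not a 'label: value' line (blank line, "etc.", ...)
        value = value.strip()
        if not value:
            header = label  # remember the section for the lines that follow
            continue
        key = f"{header} / {label}" if header else label
        result[key] = value
    return result


sample = """\
Control data as of latest checkpoint:
next free transaction ID: 0/7575
next free OID: 49152
"""

parsed = parse_grouped(sample)
print(parsed["Control data as of latest checkpoint / next free OID"])
```

With the current one-line labels, by contrast, each line is self-describing and no header tracking is needed.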


From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_controldata gobbledygook
Date: 2013-04-26 03:22:48
Message-ID: CAM3SWZTHv0i_KN3fP2f1u8Nw6G8xwyth-wWLAysS1GaBcY8niQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Apr 25, 2013 at 8:07 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> Comments?

+1 from me.

I don't think that these particular changes would break WAL-E,
Heroku's continuous archiving tool, which has a class called
PgControlDataParser. However, it's possible to imagine someone being
affected in a similar way. So I'd be sure to document it clearly, and
to perhaps preserve the old label names to avoid breaking scripts.

--
Peter Geoghegan


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_controldata gobbledygook
Date: 2013-04-26 04:21:58
Message-ID: 20130426042158.GW2169@eldon.alvh.no-ip.org
Lists: pgsql-hackers

Tom Lane wrote:

> I think I've heard of scripts grepping the output of pg_controldata for
> this that or the other. Any rewording of the labels would break that.
> While I'm not opposed to improving the labels, I would vote against your
> second, abbreviated scheme because it would make things ambiguous for
> simple grep-based scripts.

We could provide two alternative outputs, one for human consumption with
the proposed format and something else that uses, say, shell assignment
syntax. (I did propose this years ago and I might have an unfinished
patch still lingering about somewhere.)

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
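
As a rough idea of what "shell assignment syntax" output could look like, here is a minimal Python sketch that renders label/value pairs as assignments a script could source. Everything about the naming is an assumption: the thread does not say how variable names would be derived, so the helper to_shell_assignments below simply upper-cases each label and replaces runs of non-alphanumeric characters with underscores.

```python
import re
import shlex


def to_shell_assignments(pairs):
    """Render (label, value) pairs as shell variable assignments.

    Assumption: variable names are derived by upper-casing the label and
    replacing runs of non-alphanumerics with '_'; no naming scheme was
    specified in the thread.
    """
    lines = []
    for label, value in pairs:
        name = re.sub(r"[^A-Za-z0-9]+", "_", label).strip("_").upper()
        # shlex.quote protects values containing spaces or shell metacharacters.
        lines.append(f"{name}={shlex.quote(value)}")
    return "\n".join(lines)


pairs = [
    ("Latest checkpoint's NextXID", "0/7575"),
    ("Latest checkpoint's NextOID", "49152"),
]
print(to_shell_assignments(pairs))
```

Quoting the values matters here because other pg_controldata fields (timestamps, for example) contain spaces.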


From: Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_controldata gobbledygook
Date: 2013-04-26 04:25:40
Message-ID: CAFcNs+qQ7AgLgAyziOHnwRwd68VDn7YQwaGKsRV1GrK7Mu=V9w@mail.gmail.com
Lists: pgsql-hackers

On Fri, Apr 26, 2013 at 12:22 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:

> On Thu, Apr 25, 2013 at 8:07 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> > Comments?
>
> +1 from me.
>
> I don't think that these particular changes would break WAL-E,
> Heroku's continuous archiving tool, which has a class called
> PgControlDataParser. However, it's possible to imagine someone being
> affected in a similar way. So I'd be sure to document it clearly, and
> to perhaps preserve the old label names to avoid breaking scripts.
>
>
Why don't we add options to pg_controldata so that it can output the info
in several other formats, such as JSON, YAML, XML, or another one?

Best regards,

--
Fabrízio de Royes Mello
PostgreSQL Consulting/Coaching
>> IT blog: http://fabriziomello.blogspot.com
>> LinkedIn profile: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
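
For the JSON case, a minimal Python sketch of the idea, written as a post-processing step over the existing "label: value" text rather than as a pg_controldata option (no such option is described in this thread). The function name controldata_to_json is hypothetical, and all values are kept as strings because the thread does not define any typing:

```python
import json


def controldata_to_json(text):
    """Convert 'label: value' lines to a JSON object string."""
    data = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons.
        label, sep, value = line.partition(":")
        if sep:
            data[label.strip()] = value.strip()
    return json.dumps(data, indent=2)


sample = (
    "Latest checkpoint's NextOID: 49152\n"
    "Latest checkpoint's NextMultiXactId: 7\n"
)
print(controldata_to_json(sample))
```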


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_controldata gobbledygook
Date: 2013-04-26 04:34:54
Message-ID: 672.1366950894@sss.pgh.pa.us
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> Tom Lane wrote:
>> I think I've heard of scripts grepping the output of pg_controldata for
>> this that or the other. Any rewording of the labels would break that.
>> While I'm not opposed to improving the labels, I would vote against your
>> second, abbreviated scheme because it would make things ambiguous for
>> simple grep-based scripts.

> We could provide two alternative outputs, one for human consumption with
> the proposed format and something else that uses, say, shell assignment
> syntax. (I did propose this years ago and I might have an unfinished
> patch still lingering about somewhere.)

And a script would use that how? "pg_controldata --machine-friendly"
would fail outright on older versions. I think it's okay to ask script
writers to write
pg_controldata | grep -e 'old label|new label'
but not okay to ask them to deal with anything as complicated as trying
a switch to see if it works or not.

regards, tom lane
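
The "accept either label" approach can be sketched as follows. One caveat worth checking: in POSIX basic regular expressions `|` is an ordinary character, so the pipeline above would probably need `grep -E` (or two separate `-e` patterns) to act as an alternation. The Python sketch below sidesteps that by comparing each line's label against a tuple of accepted labels. find_value is a hypothetical helper, and the new label is only the wording proposed in this thread, not something pg_controldata prints:

```python
def find_value(output, labels):
    """Return the value of the first line whose label is in `labels`, else None."""
    for line in output.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip() in labels:
            return value.strip()
    return None


OLD_LABEL = "Latest checkpoint's NextOID"
NEW_LABEL = "Latest checkpoint's next free OID"  # proposed wording only

old_style = (
    "Latest checkpoint's NextXID: 0/7575\n"
    "Latest checkpoint's NextOID: 49152\n"
)
new_style = "Latest checkpoint's next free OID: 49152\n"

# The same call works against either generation of output.
for text in (old_style, new_style):
    print(find_value(text, (OLD_LABEL, NEW_LABEL)))
```

Matching the whole label, rather than a substring, also avoids accidentally picking up a different field that happens to contain the same words.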


From: Daniel Farina <daniel(at)heroku(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_controldata gobbledygook
Date: 2013-04-26 06:53:24
Message-ID: CAAZKuFa5ougGqfy+z6SpMB+ppCy3Oxq1xP1X4ekEgVvj4Zt2cQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Apr 25, 2013 at 9:34 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>> Tom Lane wrote:
>>> I think I've heard of scripts grepping the output of pg_controldata for
>>> this that or the other. Any rewording of the labels would break that.
>>> While I'm not opposed to improving the labels, I would vote against your
>>> second, abbreviated scheme because it would make things ambiguous for
>>> simple grep-based scripts.
>
>> We could provide two alternative outputs, one for human consumption with
>> the proposed format and something else that uses, say, shell assignment
>> syntax. (I did propose this years ago and I might have an unfinished
>> patch still lingering about somewhere.)
>
> And a script would use that how? "pg_controldata --machine-friendly"
> would fail outright on older versions. I think it's okay to ask script
> writers to write
> pg_controldata | grep -e 'old label|new label'
> but not okay to ask them to deal with anything as complicated as trying
> a switch to see if it works or not.

From what I'm reading, it seems like the main benefit of the changes
is to make things easier for humans to skim over. Automated programs
that care about precise meanings of each field are awkwardly but
otherwise well-served by the precise output as rendered right now.

What about doing something similar but different from the
--machine-readable proposal, such as adding an option for the
*human*-readable variant that is guaranteed to mercilessly change as
human-readers/-hackers see fit on whim? It's a bit of a kludge that
this is not the default, but would prevent having to serve two quite
different masters with the same output.

Although I'm not seriously proposing explicitly "-h" (as seen in some
GNU programs in rendering byte sizes and the like...yet could be
confused for 'help'), something like that may serve as prior art.


From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: Daniel Farina <daniel(at)heroku(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_controldata gobbledygook
Date: 2013-04-26 09:00:26
Message-ID: 517A422A.8070303@archidevsys.co.nz
Lists: pgsql-hackers

On 26/04/13 18:53, Daniel Farina wrote:
> On Thu, Apr 25, 2013 at 9:34 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
>>> Tom Lane wrote:
>>>> I think I've heard of scripts grepping the output of pg_controldata for
>>>> this that or the other. Any rewording of the labels would break that.
>>>> While I'm not opposed to improving the labels, I would vote against your
>>>> second, abbreviated scheme because it would make things ambiguous for
>>>> simple grep-based scripts.
>>> We could provide two alternative outputs, one for human consumption with
>>> the proposed format and something else that uses, say, shell assignment
>>> syntax. (I did propose this years ago and I might have an unfinished
>>> patch still lingering about somewhere.)
>> And a script would use that how? "pg_controldata --machine-friendly"
>> would fail outright on older versions. I think it's okay to ask script
>> writers to write
>> pg_controldata | grep -e 'old label|new label'
>> but not okay to ask them to deal with anything as complicated as trying
>> a switch to see if it works or not.
> From what I'm reading, it seems like the main benefit of the changes
> is to make things easier for humans to skim over. Automated programs
> that care about precise meanings of each field are awkwardly but
> otherwise well-served by the precise output as rendered right now.
>
> What about doing something similar but different from the
> --machine-readable proposal, such as adding an option for the
> *human*-readable variant that is guaranteed to mercilessly change as
> human-readers/-hackers see fit on whim? It's a bit of a kludge that
> this is not the default, but would prevent having to serve two quite
> different masters with the same output.
>
> Although I'm not seriously proposing explicitly "-h" (as seen in some
> GNU programs in rendering byte sizes and the like...yet could be
> confused for 'help'), something like that may serve as prior art.
>
>
I think the current way should remain the default, as Daniel suggests
- but a '--human-readable' (or suitable abbreviation) flag could be added.

Compare the 'ls' command for listing directory details on Linux, which
offers such a flag:

(Below, *Y* = 1024 * 1024 * 1024 * 1024 * 1024 * 1024 * 1024 * 1024
bytes = 2^80 bytes.)

man ls
[...]
       -h, --human-readable
              with -l, print sizes in human readable format (e.g., 1K 234M 2G)
[...]
       SIZE may be (or may be an integer optionally followed by) one of
       following: KB 1000, K 1024, MB 1000*1000, M 1024*1024, and so on
       for G, T, P, E, Z, Y.
[...]

Cheers,
Gavin


From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_controldata gobbledygook
Date: 2013-04-26 09:08:10
Message-ID: 20130426090810.GA5892@awork2.anarazel.de
Lists: pgsql-hackers

On 2013-04-25 23:07:02 -0400, Peter Eisentraut wrote:
> I'm not sure who is supposed to be able to read this sort of stuff:
>
> Latest checkpoint's NextXID: 0/7575
> Latest checkpoint's NextOID: 49152
> Latest checkpoint's NextMultiXactId: 7
> Latest checkpoint's NextMultiOffset: 13
> Latest checkpoint's oldestXID: 1265
> Latest checkpoint's oldestXID's DB: 1
> Latest checkpoint's oldestActiveXID: 0
> Latest checkpoint's oldestMultiXid: 1
> Latest checkpoint's oldestMulti's DB: 1
>
> Note that these symbols don't even correspond to the actual symbols used
> in the source code in some cases.
>
> The comments in the pg_control.h header file use much more pleasant
> terms, which when put to use would lead to output similar to this:
>
> Latest checkpoint's next free transaction ID: 0/7575
> Latest checkpoint's next free OID: 49152
> Latest checkpoint's next free MultiXactId: 7
> Latest checkpoint's next free MultiXact offset: 13
> Latest checkpoint's cluster-wide minimum datfrozenxid: 1265
> Latest checkpoint's database with cluster-wide minimum datfrozenxid: 1
> Latest checkpoint's oldest transaction ID still running: 0
> Latest checkpoint's cluster-wide minimum datminmxid: 1
> Latest checkpoint's database with cluster-wide minimum datminmxid: 1
>
> One could even rearrange the layout a little bit like this:
>
> Control data as of latest checkpoint:
> next free transaction ID: 0/7575
> next free OID: 49152

I have to admit I don't see the point. None of those values is particularly
interesting to anybody without implementation level knowledge and those
will likely deal with them just fine. And I find the version with the
shorter names far quicker to read.
The clarity win here doesn't seem to be worth the price of potentially
breaking some tools.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Bernd Helmle <mailings(at)oopsware(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_controldata gobbledygook
Date: 2013-04-26 11:28:07
Message-ID: 02794891993DAA78C8AC6C6E@apophis.credativ.lan
Lists: pgsql-hackers

--On 25. April 2013 23:19:14 -0400 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I think I've heard of scripts grepping the output of pg_controldata for
> this that or the other. Any rewording of the labels would break that.
> While I'm not opposed to improving the labels, I would vote against your
> second, abbreviated scheme because it would make things ambiguous for
> simple grep-based scripts.

I had exactly this kind of discussion just a few days ago with a customer,
who wants to use the output in their scripts and was a little worried about
the compatibility between major versions.

I don't think we explicitly guarantee any output format compatibility for
corresponding symbols across major versions, but given that
pg_controldata seems to have a broad use case here, maybe we should
document somewhere whether people are encouraged or discouraged to rely on
it?

--
Thanks

Bernd


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_controldata gobbledygook
Date: 2013-04-26 12:51:23
Message-ID: CA+TgmoY6btCfYghkLNBDFswQDRdj7s6jk-B6+1W_L9khMC5X4g@mail.gmail.com
Lists: pgsql-hackers

On Fri, Apr 26, 2013 at 5:08 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> I have to admit I don't see the point. None of those values is particularly
> interesting to anybody without implementation level knowledge and those
> will likely deal with them just fine. And I find the version with the
> shorter names far quicker to read.
> The clarity win here doesn't seem to be worth the price of potentially
> breaking some tools.

+1.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_controldata gobbledygook
Date: 2013-04-26 16:31:16
Message-ID: CAMkU=1z2KaBiWws6MEadnhv62V7+xteFkRNJm0MPmNuJwPMnXA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Apr 26, 2013 at 2:08 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:

> On 2013-04-25 23:07:02 -0400, Peter Eisentraut wrote:
> > I'm not sure who is supposed to be able to read this sort of stuff:
> >
> > Latest checkpoint's NextXID: 0/7575
> > Latest checkpoint's NextOID: 49152
> > Latest checkpoint's NextMultiXactId: 7
> > Latest checkpoint's NextMultiOffset: 13
> > Latest checkpoint's oldestXID: 1265
> > Latest checkpoint's oldestXID's DB: 1
> > Latest checkpoint's oldestActiveXID: 0
> > Latest checkpoint's oldestMultiXid: 1
> > Latest checkpoint's oldestMulti's DB: 1
> >
> > Note that these symbols don't even correspond to the actual symbols used
> > in the source code in some cases.
> >
> > The comments in the pg_control.h header file use much more pleasant
> > terms, which when put to use would lead to output similar to this:
> >
> > Latest checkpoint's next free transaction ID: 0/7575
> > Latest checkpoint's next free OID: 49152
> > Latest checkpoint's next free MultiXactId: 7
> > Latest checkpoint's next free MultiXact offset: 13
> > Latest checkpoint's cluster-wide minimum datfrozenxid: 1265
> > Latest checkpoint's database with cluster-wide minimum datfrozenxid: 1
> > Latest checkpoint's oldest transaction ID still running: 0
> > Latest checkpoint's cluster-wide minimum datminmxid: 1
> > Latest checkpoint's database with cluster-wide minimum datminmxid: 1
> >
> > One could even rearrange the layout a little bit like this:
> >
> > Control data as of latest checkpoint:
> > next free transaction ID: 0/7575
> > next free OID: 49152
>
> I have to admit I don't see the point. None of those values is particularly
> interesting to anybody without implementation level knowledge and those
> will likely deal with them just fine. And I find the version with the
> shorter names far quicker to read.
>

I agree. For the ones I didn't know the meaning of, I still don't know the
meaning of them based on the long form, either. While a tutorial on what
these things mean might be useful, the output of pg_controldata probably
isn't the right place for it.

Cheers,

Jeff


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_controldata gobbledygook
Date: 2013-05-02 14:31:27
Message-ID: 20130502143127.GC23555@momjian.us
Lists: pgsql-hackers

On Fri, Apr 26, 2013 at 08:51:23AM -0400, Robert Haas wrote:
> On Fri, Apr 26, 2013 at 5:08 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > I have to admit I don't see the point. None of those values is particularly
> > interesting to anybody without implementation level knowledge and those
> > will likely deal with them just fine. And I find the version with the
> > shorter names far quicker to read.
> > The clarity win here doesn't seem to be worth the price of potentially
> > breaking some tools.
>
> +1.

FYI, pg_upgrade would certainly have to be updated to handle this
change.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +