Re: TRUE/FALSE vs true/false

Lists: pgsql-hackers
From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: pgsql-hackers(at)postgresql(dot)org
Subject: TRUE/FALSE vs true/false
Date: 2011-08-04 10:08:21
Message-ID: 4E3A6F95.3030104@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I looked at b4fbe392f8ff6ff1a66b488eb7197eef9e1770a4
and I noticed that it's using TRUE, FALSE, true and false
inconsistently:

@@ -248,6 +249,7 @@ CreateSharedInvalidationState(void)
shmInvalBuffer->procState[i].nextMsgNum = 0; /* meaningless */
shmInvalBuffer->procState[i].resetState = false;
shmInvalBuffer->procState[i].signaled = false;
+ shmInvalBuffer->procState[i].hasMessages = false;
shmInvalBuffer->procState[i].nextLXID = InvalidLocalTransactionId;
}
}

@@ -316,6 +316,7 @@ SharedInvalBackendInit(bool sendOnly)
stateP->nextMsgNum = segP->maxMsgNum;
stateP->resetState = false;
stateP->signaled = false;
+ stateP->hasMessages = false;
stateP->sendOnly = sendOnly;

LWLockRelease(SInvalWriteLock);

@@ -459,6 +461,19 @@ SIInsertDataEntries(const SharedInvalidationMessage *data, int n)
SpinLockRelease(&vsegP->msgnumLock);
}

+ /*
+ * Now that the maxMsgNum change is globally visible, we give
+ * everyone a swift kick to make sure they read the newly added
+ * messages. Releasing SInvalWriteLock will enforce a full memory
+ * barrier, so these (unlocked) changes will be committed to memory
+ * before we exit the function.
+ */
+ for (i = 0; i < segP->lastBackend; i++)
+ {
+ ProcState *stateP = &segP->procState[i];
+ stateP->hasMessages = TRUE;
+ }
+
LWLockRelease(SInvalWriteLock);
}
}

@@ -499,11 +514,36 @@ SIGetDataEntries(SharedInvalidationMessage *data, int datasize)
...
+ * Note that, if we don't end up reading all of the messages, we had
+ * better be certain to reset this flag before exiting!
+ */
+ stateP->hasMessages = FALSE;
+

@@ -544,10 +584,16 @@ SIGetDataEntries(SharedInvalidationMessage *data, int datasize)
...
if (stateP->nextMsgNum >= max)
stateP->signaled = false;
+ else
+ stateP->hasMessages = TRUE;

Also, grepping for checking for or assigning bool values reveal that
"true" and "false" are used far more than "TRUE" and "FALSE":

[zozo(at)localhost backend]$ find . -name "*.c" | xargs grep -w true | grep -v 'true"' | grep
= | wc -l
2446
[zozo(at)localhost backend]$ find . -name "*.c" | xargs grep -w false | grep -v 'false"' |
grep = | wc -l
2745
[zozo(at)localhost backend]$ find . -name "*.c" | xargs grep -w TRUE | grep -v 'TRUE"' | grep
= | wc -l
119
[zozo(at)localhost backend]$ find . -name "*.c" | xargs grep -w FALSE | grep -v 'FALSE"' |
grep = | wc -l
140

Shouldn't these get fixed to be consistent?

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TRUE/FALSE vs true/false
Date: 2011-08-04 12:32:47
Message-ID: CA+TgmoZrr++L5_1J4uXKfDEv_YBBkiWbyaGCNT+cxTU6Q32czQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2011/8/4 Boszormenyi Zoltan <zb(at)cybertec(dot)at>:
> Shouldn't these get fixed to be consistent?

I believe I already did. See commit 85b436f7b1f06a6ffa8d2f29b03d6e440de18784.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TRUE/FALSE vs true/false
Date: 2011-08-04 12:44:35
Message-ID: 4E3A9433.7020602@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2011-08-04 14:32 keltezéssel, Robert Haas írta:
> 2011/8/4 Boszormenyi Zoltan <zb(at)cybertec(dot)at>:
>> Shouldn't these get fixed to be consistent?
> I believe I already did. See commit 85b436f7b1f06a6ffa8d2f29b03d6e440de18784.

I meant a mass "sed -e 's/TRUE/true/g' -e 's/FALSE/false/g'" run
so all the ~200 occurrences of both "TRUE" and "FALSE" get
converted so the whole source tree is consistent.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TRUE/FALSE vs true/false
Date: 2011-08-04 12:57:23
Message-ID: CA+Tgmoba-ZN2SQuWV8sy3_tYzKsUio=94+EufSnamj45DunJ8A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Aug 4, 2011 at 8:44 AM, Boszormenyi Zoltan <zb(at)cybertec(dot)at> wrote:
> 2011-08-04 14:32 keltezéssel, Robert Haas írta:
>> 2011/8/4 Boszormenyi Zoltan <zb(at)cybertec(dot)at>:
>>> Shouldn't these get fixed to be consistent?
>> I believe I already did.  See commit 85b436f7b1f06a6ffa8d2f29b03d6e440de18784.
>
> I meant a mass "sed -e 's/TRUE/true/g'  -e 's/FALSE/false/g'" run
> so all the ~200 occurrences of both "TRUE" and "FALSE" get
> converted so the whole source tree is consistent.

Oh, I see. Well, I don't care either way, so I'll let others weigh
in. The way it is doesn't bother me, but fixing it doesn't bother me
either.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TRUE/FALSE vs true/false
Date: 2011-08-04 13:17:23
Message-ID: CAEYLb_WYvjnQUr3PF0E=GX0dg9Q8reLgx4OtOkG5ANhktXOTmg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 4 August 2011 13:57, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Oh, I see.  Well, I don't care either way, so I'll let others weigh
> in.  The way it is doesn't bother me, but fixing it doesn't bother me
> either.

Idiomatic win32 code uses BOOL and TRUE/FALSE. They are macros defined
somewhere or other.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: TRUE/FALSE vs true/false
Date: 2011-08-04 18:00:11
Message-ID: 1312480811.24208.9.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On tor, 2011-08-04 at 14:44 +0200, Boszormenyi Zoltan wrote:
> 2011-08-04 14:32 keltezéssel, Robert Haas írta:
> > 2011/8/4 Boszormenyi Zoltan <zb(at)cybertec(dot)at>:
> >> Shouldn't these get fixed to be consistent?
> > I believe I already did. See commit 85b436f7b1f06a6ffa8d2f29b03d6e440de18784.
>
> I meant a mass "sed -e 's/TRUE/true/g' -e 's/FALSE/false/g'" run
> so all the ~200 occurrences of both "TRUE" and "FALSE" get
> converted so the whole source tree is consistent.

I would be in favor of that.


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Boszormenyi Zoltan <zb(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TRUE/FALSE vs true/false
Date: 2012-08-14 19:24:53
Message-ID: 20120814192453.GC11913@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Aug 4, 2011 at 09:00:11PM +0300, Peter Eisentraut wrote:
> On tor, 2011-08-04 at 14:44 +0200, Boszormenyi Zoltan wrote:
> > 2011-08-04 14:32 keltezéssel, Robert Haas írta:
> > > 2011/8/4 Boszormenyi Zoltan <zb(at)cybertec(dot)at>:
> > >> Shouldn't these get fixed to be consistent?
> > > I believe I already did. See commit 85b436f7b1f06a6ffa8d2f29b03d6e440de18784.
> >
> > I meant a mass "sed -e 's/TRUE/true/g' -e 's/FALSE/false/g'" run
> > so all the ~200 occurrences of both "TRUE" and "FALSE" get
> > converted so the whole source tree is consistent.
>
> I would be in favor of that.

I have implemented this with the patch at:

http://momjian.us/expire/true_diff.txt

I did not modify the Win32 code which traditionally uses uppercase for
TRUE/FALSE.

It would be applied only to HEAD.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TRUE/FALSE vs true/false
Date: 2012-08-14 21:34:02
Message-ID: 13725.1344980042@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> On Thu, Aug 4, 2011 at 09:00:11PM +0300, Peter Eisentraut wrote:
>> On tor, 2011-08-04 at 14:44 +0200, Boszormenyi Zoltan wrote:
>>> I meant a mass "sed -e 's/TRUE/true/g' -e 's/FALSE/false/g'" run
>>> so all the ~200 occurrences of both "TRUE" and "FALSE" get
>>> converted so the whole source tree is consistent.

>> I would be in favor of that.

> I have implemented this with the patch at:
> http://momjian.us/expire/true_diff.txt

Does this really do anything for us that will justify the extra
back-patching pain it will cause? I don't see that it's improving
code readability any.

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TRUE/FALSE vs true/false
Date: 2012-08-14 21:36:35
Message-ID: 20120814213635.GB15578@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Aug 14, 2012 at 05:34:02PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > On Thu, Aug 4, 2011 at 09:00:11PM +0300, Peter Eisentraut wrote:
> >> On tor, 2011-08-04 at 14:44 +0200, Boszormenyi Zoltan wrote:
> >>> I meant a mass "sed -e 's/TRUE/true/g' -e 's/FALSE/false/g'" run
> >>> so all the ~200 occurrences of both "TRUE" and "FALSE" get
> >>> converted so the whole source tree is consistent.
>
> >> I would be in favor of that.
>
> > I have implemented this with the patch at:
> > http://momjian.us/expire/true_diff.txt
>
> Does this really do anything for us that will justify the extra
> back-patching pain it will cause? I don't see that it's improving
> code readability any.

I think it is more of a consistency issue. There were multiple people
who wanted this change. Of course, some of those people don't backport
stuff.

Other comments?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TRUE/FALSE vs true/false
Date: 2012-08-15 02:57:08
Message-ID: 1344999428.17599.6.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2012-08-14 at 17:36 -0400, Bruce Momjian wrote:
> On Tue, Aug 14, 2012 at 05:34:02PM -0400, Tom Lane wrote:
> > Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > > On Thu, Aug 4, 2011 at 09:00:11PM +0300, Peter Eisentraut wrote:
> > >> On tor, 2011-08-04 at 14:44 +0200, Boszormenyi Zoltan wrote:
> > >>> I meant a mass "sed -e 's/TRUE/true/g' -e 's/FALSE/false/g'" run
> > >>> so all the ~200 occurrences of both "TRUE" and "FALSE" get
> > >>> converted so the whole source tree is consistent.
> >
> > >> I would be in favor of that.
> >
> > > I have implemented this with the patch at:
> > > http://momjian.us/expire/true_diff.txt
> >
> > Does this really do anything for us that will justify the extra
> > back-patching pain it will cause? I don't see that it's improving
> > code readability any.
>
> I think it is more of a consistency issue. There were multiple people
> who wanted this change. Of course, some of those people don't backport
> stuff.

I guess you could argue this particular case without end, but I think we
should be open to these kinds of changes. They will make future code
easier to deal with and confuse new developers less (when to use which,
do they mean different things, etc.).

If back-patching really becomes a problem, we might want to look a
little deeper into git options. I think the Linux kernel people do
these kinds of cleanups more often, so there is probably some better
support for it.


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TRUE/FALSE vs true/false
Date: 2012-08-16 19:15:12
Message-ID: 20120816191512.GA6286@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Aug 14, 2012 at 10:57:08PM -0400, Peter Eisentraut wrote:
> On Tue, 2012-08-14 at 17:36 -0400, Bruce Momjian wrote:
> > On Tue, Aug 14, 2012 at 05:34:02PM -0400, Tom Lane wrote:
> > > Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > > > On Thu, Aug 4, 2011 at 09:00:11PM +0300, Peter Eisentraut wrote:
> > > >> On tor, 2011-08-04 at 14:44 +0200, Boszormenyi Zoltan wrote:
> > > >>> I meant a mass "sed -e 's/TRUE/true/g' -e 's/FALSE/false/g'" run
> > > >>> so all the ~200 occurrences of both "TRUE" and "FALSE" get
> > > >>> converted so the whole source tree is consistent.
> > >
> > > >> I would be in favor of that.
> > >
> > > > I have implemented this with the patch at:
> > > > http://momjian.us/expire/true_diff.txt
> > >
> > > Does this really do anything for us that will justify the extra
> > > back-patching pain it will cause? I don't see that it's improving
> > > code readability any.
> >
> > I think it is more of a consistency issue. There were multiple people
> > who wanted this change. Of course, some of those people don't backport
> > stuff.
>
> I guess you could argue this particular case without end, but I think we
> should be open to these kinds of changes. They will make future code
> easier to deal with and confuse new developers less (when to use which,
> do they mean different things, etc.).
>
> If back-patching really becomes a problem, we might want to look a
> little deeper into git options. I think the Linux kernel people do
> these kinds of cleanups more often, so there is probably some better
> support for it.

So what do we want to do with this? I am a little concerned that we are
sacrificing code clarity for backpatching ease, but I don't do as much
backpatching as Tom.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Bruce Momjian" <bruce(at)momjian(dot)us>
Cc: "Boszormenyi Zoltan" <zb(at)cybertec(dot)at>, <pgsql-hackers(at)postgresql(dot)org>,"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: TRUE/FALSE vs true/false
Date: 2012-08-16 19:21:12
Message-ID: 502D01D802000025000497C9@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> wrote:

> So what do we want to do with this? I am a little concerned that
> we are sacrificing code clarity for backpatching ease, but I don't
> do as much backpatching as Tom.

Well, if you back-patched this change, it would eliminate the issue
for Tom, wouldn't it? Not sure if that's sane; just a thought.

-Kevin


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: TRUE/FALSE vs true/false
Date: 2012-08-16 19:32:16
Message-ID: 20120816193216.GB6286@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Aug 16, 2012 at 02:21:12PM -0500, Kevin Grittner wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>
> > So what do we want to do with this? I am a little concerned that
> > we are sacrificing code clarity for backpatching ease, but I don't
> > do as much backpatching as Tom.
>
> Well, if you back-patched this change, it would eliminate the issue
> for Tom, wouldn't it? Not sure if that's sane; just a thought.

I would be worried about some instability in backpatching. I was
looking for an 'ignore-case' mode to patch, but I don't see it.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: TRUE/FALSE vs true/false
Date: 2012-08-20 19:09:08
Message-ID: CA+TgmoYCivSv6HfzTuYAxuuGeoChNwtjfywaCRpAN8ya_xrnYg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Aug 16, 2012 at 3:32 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> On Thu, Aug 16, 2012 at 02:21:12PM -0500, Kevin Grittner wrote:
>> Bruce Momjian <bruce(at)momjian(dot)us> wrote:
>>
>> > So what do we want to do with this? I am a little concerned that
>> > we are sacrificing code clarity for backpatching ease, but I don't
>> > do as much backpatching as Tom.
>>
>> Well, if you back-patched this change, it would eliminate the issue
>> for Tom, wouldn't it? Not sure if that's sane; just a thought.
>
> I would be worried about some instability in backpatching. I was
> looking for an 'ignore-case' mode to patch, but I don't see it.

I have difficult believing that a change of this type, if implemented
judiciously, is really going to create that much difficulty in
back-patching. I don't do as much back-patching as Tom either (no one
does), but most of the patches I do back-patch can be cherry-picked
all the way back without a problem. Some require adjustment, but even
then this kind of thing is pretty trivial to handle, as it's pretty
obvious what happened when you look through it. The really nasty
problems tend to come from places where the code has been rearranged,
rather than simple A-for-B substitutions.

I think the thing we need to look at is what percentage of our code
churn is coming from stuff like this, versus what percentage of it is
coming from other factors. If we change 250,000 lines of code per
release cycle and of that this kind of thing accounts for 5,000 lines
of deltas, then IMHO it's not really material. If it accounts for
50,000 lines of deltas out of the same base, that's probably more than
can really be justified by the benefit we're going to get out of it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: TRUE/FALSE vs true/false
Date: 2012-08-23 14:13:22
Message-ID: 20120823141322.GA19565@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Aug 20, 2012 at 03:09:08PM -0400, Robert Haas wrote:
> I have difficult believing that a change of this type, if implemented
> judiciously, is really going to create that much difficulty in
> back-patching. I don't do as much back-patching as Tom either (no one
> does), but most of the patches I do back-patch can be cherry-picked
> all the way back without a problem. Some require adjustment, but even
> then this kind of thing is pretty trivial to handle, as it's pretty
> obvious what happened when you look through it. The really nasty
> problems tend to come from places where the code has been rearranged,
> rather than simple A-for-B substitutions.
>
> I think the thing we need to look at is what percentage of our code
> churn is coming from stuff like this, versus what percentage of it is
> coming from other factors. If we change 250,000 lines of code per
> release cycle and of that this kind of thing accounts for 5,000 lines
> of deltas, then IMHO it's not really material. If it accounts for
> 50,000 lines of deltas out of the same base, that's probably more than
> can really be justified by the benefit we're going to get out of it.

The true/false capitalization patch changes 1.2k lines.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TRUE/FALSE vs true/false
Date: 2012-08-23 15:01:05
Message-ID: 12232.1345734065@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> On Mon, Aug 20, 2012 at 03:09:08PM -0400, Robert Haas wrote:
>> I think the thing we need to look at is what percentage of our code
>> churn is coming from stuff like this, versus what percentage of it is
>> coming from other factors. If we change 250,000 lines of code per
>> release cycle and of that this kind of thing accounts for 5,000 lines
>> of deltas, then IMHO it's not really material. If it accounts for
>> 50,000 lines of deltas out of the same base, that's probably more than
>> can really be justified by the benefit we're going to get out of it.

> The true/false capitalization patch changes 1.2k lines.

I did a quick look at git diff --stat between recent branches:

$ git diff --shortstat REL9_0_9 REL9_1_5
3186 files changed, 314847 insertions(+), 210452 deletions(-)
$ git diff --shortstat REL9_1_5 REL9_2_BETA4
2037 files changed, 290919 insertions(+), 189487 deletions(-)

However, when you look at things a bit closer, these numbers are
misleading because they include the .po files, which seem to have huge
inter-branch churn --- well in excess of 100000 lines changed per
release, at least in git's simpleminded view. Excluding those, as well
as src/test/isolation/expected/prepared-transactions.out which added
34843 lines all by itself, I get
173080 insertions, 70300 deletions for 9.0.9 -> 9.1.5
130706 insertions, 55714 deletions for 9.1.5 -> 9.2beta4.
So it looks like we touch order-of-magnitude of 100K lines per release;
which still seems astonishingly high, but then this includes docs and
regression tests not just code. If I restrict the stat to *.[chyl]
files it's about half that:

$ git diff --numstat REL9_0_9 REL9_1_5 | grep '\.[chyl]$' | awk '{a += $1; b += $2}
END{print a,b}'
90234 33902
$ git diff --numstat REL9_1_5 REL9_2_BETA4 | grep '\.[chyl]$' | awk '{a += $1; b += $2}
END{print a,b}'
90200 42218

So a patch of 1K lines would by itself represent about 2% of the typical
inter-branch delta. Maybe that's below our threshold of pain, or maybe
it isn't. I'd be happier about it if there were a more compelling
argument for it, but to me it looks like extremely trivial neatnik-ism.

regards, tom lane


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <kevin(dot)grittner(at)wicourts(dot)gov>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: size of .po changesets
Date: 2012-08-23 15:21:35
Message-ID: 1345735134-sup-2821@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Excerpts from Tom Lane's message of jue ago 23 11:01:05 -0400 2012:

>
> $ git diff --shortstat REL9_0_9 REL9_1_5
> 3186 files changed, 314847 insertions(+), 210452 deletions(-)
> $ git diff --shortstat REL9_1_5 REL9_2_BETA4
> 2037 files changed, 290919 insertions(+), 189487 deletions(-)
>
> However, when you look at things a bit closer, these numbers are
> misleading because they include the .po files, which seem to have huge
> inter-branch churn --- well in excess of 100000 lines changed per
> release, at least in git's simpleminded view.

Yeah, IMHO .po files are handled pretty badly by SCMs. I wonder if we
could reduce the amount of git churn caused by those files by simply
removing all comment lines from these files as they are exported from
pgtranslation into postgresql.git? Since they are not "source" for
postgresql.git anyway, the other one being the canonical repository,
there doesn't seem to be any point to those lines ... or am I mistaken?

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Roger Leigh <rleigh(at)codelibre(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: size of .po changesets
Date: 2012-08-23 16:08:42
Message-ID: 20120823160842.GB20427@codelibre.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Aug 23, 2012 at 11:21:35AM -0400, Alvaro Herrera wrote:
> Excerpts from Tom Lane's message of jue ago 23 11:01:05 -0400 2012:
>
> >
> > $ git diff --shortstat REL9_0_9 REL9_1_5
> > 3186 files changed, 314847 insertions(+), 210452 deletions(-)
> > $ git diff --shortstat REL9_1_5 REL9_2_BETA4
> > 2037 files changed, 290919 insertions(+), 189487 deletions(-)
> >
> > However, when you look at things a bit closer, these numbers are
> > misleading because they include the .po files, which seem to have huge
> > inter-branch churn --- well in excess of 100000 lines changed per
> > release, at least in git's simpleminded view.
>
> Yeah, IMHO .po files are handled pretty badly by SCMs. I wonder if we
> could reduce the amount of git churn caused by those files by simply
> removing all comment lines from these files as they are exported from
> pgtranslation into postgresql.git? Since they are not "source" for
> postgresql.git anyway, the other one being the canonical repository,
> there doesn't seem to be any point to those lines ... or am I mistaken?

Have you considered adding "--no-location" to XGETTEXT_OPTIONS in
po/Makevars? This stops the massive churn through line renumbering
changes.

Regards,
Roger

--
.''`. Roger Leigh
: :' : Debian GNU/Linux http://people.debian.org/~rleigh/
`. `' schroot and sbuild http://alioth.debian.org/projects/buildd-tools
`- GPG Public Key F33D 281D 470A B443 6756 147C 07B3 C8BC 4083 E800


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Roger Leigh <rleigh(at)codelibre(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: size of .po changesets
Date: 2012-08-23 17:33:46
Message-ID: 548.1345743226@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Roger Leigh <rleigh(at)codelibre(dot)net> writes:
> On Thu, Aug 23, 2012 at 11:21:35AM -0400, Alvaro Herrera wrote:
>> Yeah, IMHO .po files are handled pretty badly by SCMs. I wonder if we
>> could reduce the amount of git churn caused by those files by simply
>> removing all comment lines from these files as they are exported from
>> pgtranslation into postgresql.git?

> Have you considered adding "--no-location" to XGETTEXT_OPTIONS in
> po/Makevars? This stops the massive churn through line renumbering
> changes.

I think the line numbers are actually useful to the translators --- at
least, the theory behind having them is to make it easy to look at the
message in context. Alvaro's point is that the copies of the .po files
in our SCM are pretty much write-only data, and could be stripped
relative to what the translators work with.

regards, tom lane


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Roger Leigh <rleigh(at)codelibre(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: size of .po changesets
Date: 2012-08-23 19:00:12
Message-ID: 1345748343-sup-2772@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Excerpts from Tom Lane's message of jue ago 23 13:33:46 -0400 2012:
> Roger Leigh <rleigh(at)codelibre(dot)net> writes:
> > On Thu, Aug 23, 2012 at 11:21:35AM -0400, Alvaro Herrera wrote:
> >> Yeah, IMHO .po files are handled pretty badly by SCMs. I wonder if we
> >> could reduce the amount of git churn caused by those files by simply
> >> removing all comment lines from these files as they are exported from
> >> pgtranslation into postgresql.git?
>
> > Have you considered adding "--no-location" to XGETTEXT_OPTIONS in
> > po/Makevars? This stops the massive churn through line renumbering
> > changes.
>
> I think the line numbers are actually useful to the translators --- at
> least, the theory behind having them is to make it easy to look at the
> message in context.

Yeah, and I, for one, do use them quite a bit to look up the code
context when the message is unclear.

> Alvaro's point is that the copies of the .po files
> in our SCM are pretty much write-only data, and could be stripped
> relative to what the translators work with.

Right.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <kevin(dot)grittner(at)wicourts(dot)gov>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: size of .po changesets
Date: 2012-08-24 03:36:45
Message-ID: 1345779405.9270.5.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 2012-08-23 at 11:21 -0400, Alvaro Herrera wrote:
> Yeah, IMHO .po files are handled pretty badly by SCMs.

By SCMs that store diffs internally, perhaps, but Git doesn't, so I
don't think it matters much for storage whether .po files diff well.

> I wonder if we
> could reduce the amount of git churn caused by those files by simply
> removing all comment lines from these files as they are exported from
> pgtranslation into postgresql.git? Since they are not "source" for
> postgresql.git anyway, the other one being the canonical repository,
> there doesn't seem to be any point to those lines ... or am I mistaken?

I don't see this being worth the trouble. It would just make it more
difficult to track where the files are coming from. There could also be
problems with downstream distributors if we are not shipping files in
source form.


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TRUE/FALSE vs true/false
Date: 2012-08-25 16:26:30
Message-ID: CA+TgmoazC912BS_W9=6COMOd9jJxtaeER-vcPUY_26hEy9sYMg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Aug 23, 2012 at 11:01 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>> On Mon, Aug 20, 2012 at 03:09:08PM -0400, Robert Haas wrote:
>>> I think the thing we need to look at is what percentage of our code
>>> churn is coming from stuff like this, versus what percentage of it is
>>> coming from other factors. If we change 250,000 lines of code per
>>> release cycle and of that this kind of thing accounts for 5,000 lines
>>> of deltas, then IMHO it's not really material. If it accounts for
>>> 50,000 lines of deltas out of the same base, that's probably more than
>>> can really be justified by the benefit we're going to get out of it.
>
>> The true/false capitalization patch changes 1.2k lines.
>
> I did a quick look at git diff --stat between recent branches:
>
> $ git diff --shortstat REL9_0_9 REL9_1_5
> 3186 files changed, 314847 insertions(+), 210452 deletions(-)
> $ git diff --shortstat REL9_1_5 REL9_2_BETA4
> 2037 files changed, 290919 insertions(+), 189487 deletions(-)
>
> However, when you look at things a bit closer, these numbers are
> misleading because they include the .po files, which seem to have huge
> inter-branch churn --- well in excess of 100000 lines changed per
> release, at least in git's simpleminded view. Excluding those, as well
> as src/test/isolation/expected/prepared-transactions.out which added
> 34843 lines all by itself, I get
> 173080 insertions, 70300 deletions for 9.0.9 -> 9.1.5
> 130706 insertions, 55714 deletions for 9.1.5 -> 9.2beta4.
> So it looks like we touch order-of-magnitude of 100K lines per release;
> which still seems astonishingly high, but then this includes docs and
> regression tests not just code. If I restrict the stat to *.[chyl]
> files it's about half that:
>
> $ git diff --numstat REL9_0_9 REL9_1_5 | grep '\.[chyl]$' | awk '{a += $1; b += $2}
> END{print a,b}'
> 90234 33902
> $ git diff --numstat REL9_1_5 REL9_2_BETA4 | grep '\.[chyl]$' | awk '{a += $1; b += $2}
> END{print a,b}'
> 90200 42218
>
> So a patch of 1K lines would by itself represent about 2% of the typical
> inter-branch delta. Maybe that's below our threshold of pain, or maybe
> it isn't. I'd be happier about it if there were a more compelling
> argument for it, but to me it looks like extremely trivial neatnik-ism.

I wouldn't mind a bit if we devoted 2% of our inter-branch deltas to
this sort of thing, but I've got to admit that 2% for one patch seems
a little on the high side to me. It might not be a bad idea to
establish one formulation or the other as the one to be used in all
new code, though, to avoid making the problem worse.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: TRUE/FALSE vs true/false
Date: 2012-08-29 14:19:48
Message-ID: 20120829141948.GE26103@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Aug 25, 2012 at 12:26:30PM -0400, Robert Haas wrote:
> > So a patch of 1K lines would by itself represent about 2% of the typical
> > inter-branch delta. Maybe that's below our threshold of pain, or maybe
> > it isn't. I'd be happier about it if there were a more compelling
> > argument for it, but to me it looks like extremely trivial neatnik-ism.
>
> I wouldn't mind a bit if we devoted 2% of our inter-branch deltas to
> this sort of thing, but I've got to admit that 2% for one patch seems
> a little on the high side to me. It might not be a bad idea to
> establish one formulation or the other as the one to be used in all
> new code, though, to avoid making the problem worse.

Patch withdrawn. If we ever do a major code churn, it might be good to
revisit this cleanup.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +