Re: Size vs size_t

Lists: pgsql-hackers
From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Size vs size_t
Date: 2017-03-16 20:40:17
Message-ID: CAEepm=1eA0vsgA7-2oigKzqg10YeXoPWiS-fCuQRDLwwmgMXag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Noticing that the assembled hackers don't seem to agree on $SUBJECT in
new patches, I decided to plot counts of lines matching \<Size\> and
\<size_t\> over time. After a very long run in the lead, size_t has
recently been left in the dust by Size.

--
Thomas Munro
http://www.enterprisedb.com

Attachment Content-Type Size
image/png 29.0 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Size vs size_t
Date: 2017-03-16 20:59:29
Message-ID: CA+TgmoZ7F-TeqqstK46+Q2ymgFHB1qx6N6BxRcLRSAw0ZbCjOA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Mar 16, 2017 at 4:40 PM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> Noticing that the assembled hackers don't seem to agree on $SUBJECT in
> new patches, I decided to plot counts of lines matching \<Size\> and
> \<size_t\> over time. After a very long run in the lead, size_t has
> recently been left in the dust by Size.

I guess I assumed that we wouldn't have defined PG-specific types if
we wanted to just use the OS-supplied ones.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Size vs size_t
Date: 2017-03-16 21:01:34
Message-ID: 20170316210134.4si3zeuizmev7j73@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2017-03-16 16:59:29 -0400, Robert Haas wrote:
> On Thu, Mar 16, 2017 at 4:40 PM, Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> > Noticing that the assembled hackers don't seem to agree on $SUBJECT in
> > new patches, I decided to plot counts of lines matching \<Size\> and
> > \<size_t\> over time. After a very long run in the lead, size_t has
> > recently been left in the dust by Size.
>
> I guess I assumed that we wouldn't have defined PG-specific types if
> we wanted to just use the OS-supplied ones.

I think, in this case, defining Size in the first place was a bad call
on behalf of the project. It gains us absolutely nothing, but makes it
harder to read for people that don't know PG all that well. I think we
should slowly phase out Size usage, at least in new code.

Greetings,

Andres Freund


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Size vs size_t
Date: 2017-03-16 21:10:09
Message-ID: CA+TgmoZ+_Qz3um7-5G1h6XBXAEGbGGj8_LePKc2QqVgcqkP9-Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Mar 16, 2017 at 5:01 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2017-03-16 16:59:29 -0400, Robert Haas wrote:
>> On Thu, Mar 16, 2017 at 4:40 PM, Thomas Munro
>> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>> > Noticing that the assembled hackers don't seem to agree on $SUBJECT in
>> > new patches, I decided to plot counts of lines matching \<Size\> and
>> > \<size_t\> over time. After a very long run in the lead, size_t has
>> > recently been left in the dust by Size.
>>
>> I guess I assumed that we wouldn't have defined PG-specific types if
>> we wanted to just use the OS-supplied ones.
>
> I think, in this case, defining Size in the first place was a bad call
> on behalf of the project. It gains us absolutely nothing, but makes it
> harder to read for people that don't know PG all that well. I think we
> should slowly phase out Size usage, at least in new code.

Well, I don't think we want to end up with a mix of Size and size_t in
related code. That buys nobody anything. I'm fine with replacing
Size with size_t if they are always equivalent, but there's no sense
in having a jumble of styles.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Size vs size_t
Date: 2017-03-16 21:24:17
Message-ID: 25076.1489699457@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Mar 16, 2017 at 5:01 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> On 2017-03-16 16:59:29 -0400, Robert Haas wrote:
>>> I guess I assumed that we wouldn't have defined PG-specific types if
>>> we wanted to just use the OS-supplied ones.

>> I think, in this case, defining Size in the first place was a bad call
>> on behalf of the project.

The short answer to that is that "Size" predates the universal acceptance
of size_t. If we were making these decisions today, or anytime since the
early 2000s, we'd surely have just gone with size_t. But it wasn't a
realistic option in the 90s.

> Well, I don't think we want to end up with a mix of Size and size_t in
> related code. That buys nobody anything. I'm fine with replacing
> Size with size_t if they are always equivalent, but there's no sense
> in having a jumble of styles.

I'm not in a hurry to do "s/Size/size_t/g" because I'm afraid it'll create
a lot of merge pain for back-patching, while not actually buying anything
much concretely. I think this falls under the same policy we use for many
other stylistic details, ie make new code look like the code right around
it. But I'm fine with entirely-new files standardizing on size_t.

regards, tom lane


From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Size vs size_t
Date: 2017-03-16 21:39:55
Message-ID: 20170316213955.2hu2fadka5vywdid@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2017-03-16 17:24:17 -0400, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On Thu, Mar 16, 2017 at 5:01 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> >> On 2017-03-16 16:59:29 -0400, Robert Haas wrote:
> >>> I guess I assumed that we wouldn't have defined PG-specific types if
> >>> we wanted to just use the OS-supplied ones.
>
> >> I think, in this case, defining Size in the first place was a bad call
> >> on behalf of the project.
>
> The short answer to that is that "Size" predates the universal acceptance
> of size_t. If we were making these decisions today, or anytime since the
> early 2000s, we'd surely have just gone with size_t. But it wasn't a
> realistic option in the 90s.

Just out of curiosity I checked when we switched to backing Size with
size_t:
1998 - 0ad5d2a3a886e72b429ea2b84bfcb36c0680f84d

> > Well, I don't think we want to end up with a mix of Size and size_t in
> > related code. That buys nobody anything. I'm fine with replacing
> > Size with size_t if they are always equivalent, but there's no sense
> > in having a jumble of styles.
>
> I'm not in a hurry to do "s/Size/size_t/g" because I'm afraid it'll create
> a lot of merge pain for back-patching, while not actually buying anything
> much concretely. I think this falls under the same policy we use for many
> other stylistic details, ie make new code look like the code right around
> it. But I'm fine with entirely-new files standardizing on size_t.

That seems like sane policy. I'm a bit doubtful that the pain would be
all that bad, but I'm also not wild about trying.

Greetings,

Andres Freund


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Size vs size_t
Date: 2017-03-16 22:00:11
Message-ID: 26314.1489701611@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2017-03-16 17:24:17 -0400, Tom Lane wrote:
>> The short answer to that is that "Size" predates the universal acceptance
>> of size_t. If we were making these decisions today, or anytime since the
>> early 2000s, we'd surely have just gone with size_t. But it wasn't a
>> realistic option in the 90s.

> Just out of curiosity I checked when we switched to backing Size with
> size_t:
> 1998 - 0ad5d2a3a886e72b429ea2b84bfcb36c0680f84d

Yeah. We inherited the previous definition (as "unsigned int") from
Berkeley. I wasn't involved then, of course, but I follow their reasoning
perfectly because I remember fighting the same type of portability battles
with libjpeg in the early 90s. "size_t" was invented by the ANSI C
committee (hence, 1989 or 1990) and had only very haphazard penetration
until the late 90s. If you wanted to write portable code you couldn't
depend on it.

regards, tom lane


From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Size vs size_t
Date: 2017-03-16 22:12:54
Message-ID: CAEepm=0y67tAMmVkpfMu0LR0_OunE9uOkJJZSVbqLW6WuLj-uw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Mar 17, 2017 at 10:39 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2017-03-16 17:24:17 -0400, Tom Lane wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> > On Thu, Mar 16, 2017 at 5:01 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> >> On 2017-03-16 16:59:29 -0400, Robert Haas wrote:
>> > Well, I don't think we want to end up with a mix of Size and size_t in
>> > related code. That buys nobody anything. I'm fine with replacing
>> > Size with size_t if they are always equivalent, but there's no sense
>> > in having a jumble of styles.
>>
>> I'm not in a hurry to do "s/Size/size_t/g" because I'm afraid it'll create
>> a lot of merge pain for back-patching, while not actually buying anything
>> much concretely. I think this falls under the same policy we use for many
>> other stylistic details, ie make new code look like the code right around
>> it. But I'm fine with entirely-new files standardizing on size_t.
>
> That seems like sane policy. I'm a bit doubtful that the pain would be
> all that bad, but I'm also not wild about trying.

Naive replacement in new files (present in master but not in 9.6) with
the attached script, followed by a couple of manual corrections where
Size was really an English word in a comment, gets the attached diff.

src/backend/access/hash/hash_xlog.c | 26 ++--
src/backend/replication/logical/launcher.c | 4 +-
src/backend/utils/misc/backend_random.c | 4 +-
src/backend/utils/mmgr/dsa.c | 94 ++++++-------
src/backend/utils/mmgr/freepage.c | 202 ++++++++++++++--------------
src/backend/utils/mmgr/slab.c | 34 ++---
src/include/lib/simplehash.h | 6 +-
src/include/replication/logicallauncher.h | 2 +-
src/include/utils/backend_random.h | 2 +-
src/include/utils/dsa.h | 10 +-
src/include/utils/freepage.h | 24 ++--
src/include/utils/relptr.h | 4 +-
12 files changed, 206 insertions(+), 206 deletions(-)

That might be just about enough for size_t to catch up...

--
Thomas Munro
http://www.enterprisedb.com

Attachment Content-Type Size
Size-to-size_t.sh application/x-sh 159 bytes
Size-to-size_t.patch application/octet-stream 46.6 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Size vs size_t
Date: 2017-03-16 22:20:54
Message-ID: 27349.1489702854@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> writes:
> Naive replacement in new files (present in master but not in 9.6) with
> the attached script, followed by a couple of manual corrections where
> Size was really an English word in a comment, gets the attached diff.

In the case of mmgr/slab.c, a lot of those uses of Size probably
correspond to instantiations of the MemoryContext APIs; so blindly
changing them to "size_t" seems like a bit of a type violation
(and might indeed draw warnings from pickier compilers). Don't
know if any of the other things you've identified here have similar
entanglements.

regards, tom lane


From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Size vs size_t
Date: 2017-03-20 10:14:11
Message-ID: D8C0749C-54B4-4503-AAF8-53059C3E624F@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> On 16 Mar 2017, at 23:20, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> writes:
>> Naive replacement in new files (present in master but not in 9.6) with
>> the attached script, followed by a couple of manual corrections where
>> Size was really an English word in a comment, gets the attached diff.
>
> In the case of mmgr/slab.c, a lot of those uses of Size probably
> correspond to instantiations of the MemoryContext APIs; so blindly
> changing them to "size_t" seems like a bit of a type violation
> (and might indeed draw warnings from pickier compilers). Don't
> know if any of the other things you've identified here have similar
> entanglements.

While it might not be an issue that hits many developers, Size is also defined
on macOS in the MacTypes.h header so using CoreFoundation when hacking on macOS
port code will cause typedef redefinition errors.

Not really a case for or against, but another datapoint.

cheers ./daniel