Re: WIP patch for Oid formatting in printf/elog strings

Lists: pgsql-hackers
From: Mark Dilger <hornschnorter(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: WIP patch for Oid formatting in printf/elog strings
Date: 2014-12-11 17:31:26
Message-ID: CAE-h2TrbiH2e_NQJGBMzQ9Eg24i6pkSsa4E2YKWqjWz2PNru2A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I found a few places in the code where a variable of
type Oid is printed using "%d" rather than "%u" and
changed them in the attached patch.

In src/backend/replication/logical/reorderbuffer.c,
circa line 2494, chunk_seq is of type Oid, but
ent->last_chunk_seq is of type int32, leading me
to question if perhaps the use of %d for chunk_seq
is correct, but the use of Oid for the type of chunk_seq
is in error. If neither is in error, perhaps someone
can provide a short code comment explaining the
logic of the signed/unsigned discrepancy.

Thanks,

Mark Dilger

Attachment Content-Type Size
oidformat.diff text/plain 2.8 KB

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Mark Dilger <hornschnorter(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP patch for Oid formatting in printf/elog strings
Date: 2014-12-11 17:39:48
Message-ID: 20141211173948.GL1768@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Dilger wrote:
> I found a few places in the code where a variable of
> type Oid is printed using "%d" rather than "%u" and
> changed them in the attached patch.
>
> In src/backend/replication/logical/reorderbuffer.c,
> circa line 2494, chunk_seq is of type Oid, but
> ent->last_chunk_seq is of type int32, leading me
> to question if perhaps the use of %d for chunk_seq
> is correct, but the use of Oid for the type of chunk_seq
> is in error. If neither is in error, perhaps someone
> can provide a short code comment explaining the
> logic of the signed/unsigned discrepancy.

tuptoaster defines chunk_seq as signed int32; ReorderBufferToastAppendChunk
is wrong in declaring it Oid, and so this part of your patch is bogus.
The others seem correct.

> diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
> index 6e75398..41a4896 100644
> --- a/src/backend/replication/logical/reorderbuffer.c
> +++ b/src/backend/replication/logical/reorderbuffer.c
> @@ -2487,11 +2487,11 @@ ReorderBufferToastAppendChunk(ReorderBuffer *rb, ReorderBufferTXN *txn,
> dlist_init(&ent->chunks);
>
> if (chunk_seq != 0)
> - elog(ERROR, "got sequence entry %d for toast chunk %u instead of seq 0",
> + elog(ERROR, "got sequence entry %u for toast chunk %u instead of seq 0",
> chunk_seq, chunk_id);
> }
> else if (found && chunk_seq != ent->last_chunk_seq + 1)
> - elog(ERROR, "got sequence entry %d for toast chunk %u instead of seq %d",
> + elog(ERROR, "got sequence entry %u for toast chunk %u instead of seq %d",
> chunk_seq, chunk_id, ent->last_chunk_seq + 1);
>
> chunk = DatumGetPointer(fastgetattr(&newtup->tuple, 3, desc, &isnull));

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Mark Dilger <hornschnorter(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP patch for Oid formatting in printf/elog strings
Date: 2014-12-11 18:27:34
Message-ID: CAE-h2Tqb2UDr052x7COnvtMzo_ujeV66x9HLwpwx=xBSh-vsvA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thank you Alvaro,

The attached patch changes the type of chunk_seq to int32,
rather than changing the %d formatting. The other changes
are the same as in the previous patch.

Mark Dilger

On Thu, Dec 11, 2014 at 9:39 AM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
wrote:

> Mark Dilger wrote:
> > I found a few places in the code where a variable of
> > type Oid is printed using "%d" rather than "%u" and
> > changed them in the attached patch.
> >
> > In src/backend/replication/logical/reorderbuffer.c,
> > circa line 2494, chunk_seq is of type Oid, but
> > ent->last_chunk_seq is of type int32, leading me
> > to question if perhaps the use of %d for chunk_seq
> > is correct, but the use of Oid for the type of chunk_seq
> > is in error. If neither is in error, perhaps someone
> > can provide a short code comment explaining the
> > logic of the signed/unsigned discrepancy.
>
> tuptoaster defines chunk_seq as signed int32; ReorderBufferToastAppendChunk
> is wrong in declaring it Oid, and so this part of your patch is bogus.
> The others seem correct.
>
> > diff --git a/src/backend/replication/logical/reorderbuffer.c
> b/src/backend/replication/logical/reorderbuffer.c
> > index 6e75398..41a4896 100644
> > --- a/src/backend/replication/logical/reorderbuffer.c
> > +++ b/src/backend/replication/logical/reorderbuffer.c
> > @@ -2487,11 +2487,11 @@ ReorderBufferToastAppendChunk(ReorderBuffer *rb,
> ReorderBufferTXN *txn,
> > dlist_init(&ent->chunks);
> >
> > if (chunk_seq != 0)
> > - elog(ERROR, "got sequence entry %d for toast chunk
> %u instead of seq 0",
> > + elog(ERROR, "got sequence entry %u for toast chunk
> %u instead of seq 0",
> > chunk_seq, chunk_id);
> > }
> > else if (found && chunk_seq != ent->last_chunk_seq + 1)
> > - elog(ERROR, "got sequence entry %d for toast chunk %u
> instead of seq %d",
> > + elog(ERROR, "got sequence entry %u for toast chunk %u
> instead of seq %d",
> > chunk_seq, chunk_id, ent->last_chunk_seq + 1);
> >
> > chunk = DatumGetPointer(fastgetattr(&newtup->tuple, 3, desc,
> &isnull));
>
> --
> Álvaro Herrera http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

Attachment Content-Type Size
oidformat.diff text/plain 2.5 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Dilger <hornschnorter(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP patch for Oid formatting in printf/elog strings
Date: 2014-12-11 20:12:23
Message-ID: 6593.1418328743@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Dilger <hornschnorter(at)gmail(dot)com> writes:
> The attached patch changes the type of chunk_seq to int32,
> rather than changing the %d formatting. The other changes
> are the same as in the previous patch.

BTW, how are you finding these? If it's some automated tool, what?
(If you're finding them by hand, I'm in awe of your scan rate.)

regards, tom lane


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Mark Dilger <hornschnorter(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP patch for Oid formatting in printf/elog strings
Date: 2014-12-11 20:46:03
Message-ID: 20141211204603.GQ1768@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane wrote:
> Mark Dilger <hornschnorter(at)gmail(dot)com> writes:
> > The attached patch changes the type of chunk_seq to int32,
> > rather than changing the %d formatting. The other changes
> > are the same as in the previous patch.
>
> BTW, how are you finding these? If it's some automated tool, what?
> (If you're finding them by hand, I'm in awe of your scan rate.)

BTW if it's an automated tool, it might also improve by examining the
DatumGetFoo() macros. In the reorderbuffer.c code just fixed you can
see that chunk_seq is obtained by DatumGetInt32(), which was wrong to
assign to a value of type Oid.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Mark Dilger <hornschnorter(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP patch for Oid formatting in printf/elog strings
Date: 2014-12-11 21:25:00
Message-ID: CAE-h2Tp4wdu7DFR_skvi4Q3TBph94pu4m-Z8K=mxR8EBTY_rXw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Dec 11, 2014 at 12:12 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Mark Dilger <hornschnorter(at)gmail(dot)com> writes:
> > The attached patch changes the type of chunk_seq to int32,
> > rather than changing the %d formatting. The other changes
> > are the same as in the previous patch.
>
> BTW, how are you finding these? If it's some automated tool, what?
> (If you're finding them by hand, I'm in awe of your scan rate.)
>
> regards, tom lane
>

I've been advised to hoodwink you and say that I have found them all
by hand. Actually, I am changing the typedef in src/include/c.h and then
recompiling, and watching for compiler warnings telling me that I am
passing a uint64 to a %d. Of course, I get lots of warnings about
passing a uint64 to a %u, but I can filter through those easily enough.

Also, the macros in outfuncs.c tend to complain in a similar way, and
I can check that output, too.

mark


From: Mark Dilger <hornschnorter(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP patch for Oid formatting in printf/elog strings
Date: 2014-12-11 21:40:54
Message-ID: CAE-h2Topay9sbTt86OrEzmkApoTbuaSwsCO_9e7FW_6tEFRU2g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I spoke too soon. Actually, the Oid definition is in
src/include/postgres_ext.h,
which I'm sure you all know. But I'm changing other typedefs too, not
just for Oid.

I've been going through a copy of the code and replacing int32 and uint32
with the appropriate type such as Oid, TransactionId, and such, and have
created my own typedef for TypeModType, in order to help chase through
the code and figure out what functions handle what kind of data. By
defining TypeModType to int64, for example, I get lots of compiler errors
when passing a TypeModType * into a function that takes int32 *, and that
lets me know that the callers of that function think of it as a typemod
argument,
even if it does not have any clear documentation to that effect.

The exercise is a bit tedious, as I have to change lots of strings like
"the value %d is out of bounds" to something like
"the value " TYPEMODTYPE_FORMAT " is out of bounds", but the clarity
I get from it helps enough that it is useful to me.

I hope to eventually be able to simply pass switches to configure like
--64bit-oids and have the whole system work with Oid defined as 64-bits.
Likewise for varlena headers.

I ported the whole of postgres to 64-bit Oids and 64-bit varlena headers
once before, and it worked and passed all the regression tests and such,
but I did it by replacing lots of instances of int32 with instances of
int64,
so it didn't help clarify the meaning of anything. This time, I'm hoping
that
I can keep in sync with all the commits so that I can build a 32-bit or a
64-bit postgres at a moments notice.

Mark Dilger

On Thu, Dec 11, 2014 at 1:25 PM, Mark Dilger <hornschnorter(at)gmail(dot)com>
wrote:

>
>
> On Thu, Dec 11, 2014 at 12:12 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> Mark Dilger <hornschnorter(at)gmail(dot)com> writes:
>> > The attached patch changes the type of chunk_seq to int32,
>> > rather than changing the %d formatting. The other changes
>> > are the same as in the previous patch.
>>
>> BTW, how are you finding these? If it's some automated tool, what?
>> (If you're finding them by hand, I'm in awe of your scan rate.)
>>
>> regards, tom lane
>>
>
> I've been advised to hoodwink you and say that I have found them all
> by hand. Actually, I am changing the typedef in src/include/c.h and then
> recompiling, and watching for compiler warnings telling me that I am
> passing a uint64 to a %d. Of course, I get lots of warnings about
> passing a uint64 to a %u, but I can filter through those easily enough.
>
> Also, the macros in outfuncs.c tend to complain in a similar way, and
> I can check that output, too.
>
> mark
>


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Mark Dilger <hornschnorter(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP patch for Oid formatting in printf/elog strings
Date: 2014-12-11 21:40:57
Message-ID: 20141211214057.GI19832@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Dec 11, 2014 at 01:25:00PM -0800, Mark Dilger wrote:
>
>
> On Thu, Dec 11, 2014 at 12:12 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Mark Dilger <hornschnorter(at)gmail(dot)com> writes:
> > The attached patch changes the type of chunk_seq to int32,
> > rather than changing the %d formatting.  The other changes
> > are the same as in the previous patch.
>
> BTW, how are you finding these?  If it's some automated tool, what?
> (If you're finding them by hand, I'm in awe of your scan rate.)
>
>                         regards, tom lane
>
>
> I've been advised to hoodwink you and say that I have found them all
> by hand.  Actually, I am changing the typedef in src/include/c.h and then
> recompiling, and watching for compiler warnings telling me that I am
> passing a uint64 to a %d.  Of course, I get lots of warnings about
> passing a uint64 to a %u, but I can filter through those easily enough.

Cool idea!

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Mark Dilger <hornschnorter(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP patch for Oid formatting in printf/elog strings
Date: 2014-12-18 13:23:16
Message-ID: 20141218132316.GJ1768@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Mark Dilger wrote:

> I've been going through a copy of the code and replacing int32 and uint32
> with the appropriate type such as Oid, TransactionId, and such, and have
> created my own typedef for TypeModType, in order to help chase through
> the code and figure out what functions handle what kind of data. By
> defining TypeModType to int64, for example, I get lots of compiler errors
> when passing a TypeModType * into a function that takes int32 *, and that
> lets me know that the callers of that function think of it as a typemod
> argument,
> even if it does not have any clear documentation to that effect.
>
> The exercise is a bit tedious, as I have to change lots of strings like
> "the value %d is out of bounds" to something like
> "the value " TYPEMODTYPE_FORMAT " is out of bounds", but the clarity
> I get from it helps enough that it is useful to me.

If it weren't for the format string issue, which makes translations
impossible, I'd say let's go ahead with these ideas. For all the
drawbacks that a 64-bit TransactionId has, I'm sure there's people who
would prefer that to the constant vacuuming the 32-bit system causes.

But the int32/TypModType thing looks like we could carry without causing
any trouble at all. Not the format string part of it, though. How
large is a patch that changes typmod from int32 to TypModType?

I can see value in having 64-bit OIDs, for databases with large number
of toasted items. Not in the default build, mind you.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Mark Dilger <hornschnorter(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP patch for Oid formatting in printf/elog strings
Date: 2014-12-18 17:16:56
Message-ID: 40099B07-63DD-46BA-BD38-55C473C33252@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> On Dec 18, 2014, at 5:23 AM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>
> Mark Dilger wrote:
>
>> I've been going through a copy of the code and replacing int32 and uint32
>> with the appropriate type such as Oid, TransactionId, and such, and have
>> created my own typedef for TypeModType, in order to help chase through
>> the code and figure out what functions handle what kind of data. By
>> defining TypeModType to int64, for example, I get lots of compiler errors
>> when passing a TypeModType * into a function that takes int32 *, and that
>> lets me know that the callers of that function think of it as a typemod
>> argument,
>> even if it does not have any clear documentation to that effect.
>>
>> The exercise is a bit tedious, as I have to change lots of strings like
>> "the value %d is out of bounds" to something like
>> "the value " TYPEMODTYPE_FORMAT " is out of bounds", but the clarity
>> I get from it helps enough that it is useful to me.
>
> If it weren't for the format string issue, which makes translations
> impossible, I'd say let's go ahead with these ideas. For all the
> drawbacks that a 64-bit TransactionId has, I'm sure there's people who
> would prefer that to the constant vacuuming the 32-bit system causes.

I have created #defines from configure which can be used in the code:

#if USE_BIGVARLENA

“the value %ld is out of bounds”

#else

“the value %d is out of bounds”

#endif

(Note the %ld vs %d)

I have not used these defines specifically to switch between string constants in my
working copy, but it could be done, with the obvious reduction in brevity.

> But the int32/TypModType thing looks like we could carry without causing
> any trouble at all. Not the format string part of it, though. How
> large is a patch that changes typmod from int32 to TypModType?
>
> I can see value in having 64-bit OIDs, for databases with large number
> of toasted items. Not in the default build, mind you.
>
> --
> Álvaro Herrera http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

I am attaching a patch that includes my current changes. This compiles
for me, at least on mac, and runs the regression tests (see below), so long as
I do not use the new configure options —enable-bigoid or —enable-bigvarlena.
I do use those from time to time, to get the code configured that way, but
only so I can see what breaks and work to fix it.

—enable-bigxid compiles and runs the regression tests with only one failure,
at that seems to just be a difference in the query plan that it is choosing.

—enable-bigcid is probably a complete waste of time as a feature that anybody
would want, but it helps me find places in the code where uint32 needs to be
changed to CommandId, and it passes all regression tests, at least on my mac.

Mark Dilger

Attachment Content-Type Size
patch.diff.gz application/x-gzip 126.4 KB