Re: WIP: guc enums

Lists: pgsql-patches
From: Magnus Hagander <magnus(at)hagander(dot)net>
To: pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: WIP: guc enums
Date: 2008-03-04 13:25:06
Message-ID: 20080304132506.GH17216@svr2.hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Attached is my first work at implementing GUC enums. It's not done yet,
obviously, but I did hit a problem that I need to ask about. So I'll just
send what I have now for comments.

The patch only converts a couple of the potential enum variables to the new
type, mainly as a proof of concept. But already I hit the problem twice -
the variable that holds the value of the guc enum is a C enum itself, which
gives a compiler warning when I pass a pointer to it for
config_enum.variable. (in this case, Log_error_verbosity and log_statement
are enums and have the problem, but client_min_messages, log_min_messages
and log_min_error_statement are already int and don't have it)

On my platform (linux x86) it works fine when I just cast this to (int *),
but I'm unsure if that's going to be safe on other platforms. I had some
indication that it's probably not?

And if not, the only way I know to do it is to change the C level enums to
be an int and use #define:s instead of what's there now. If that's
required, is that an acceptable change in order to implement this? If not,
any better ideas on how to do it?

And finally, please let me know if I seem to be off on a completely wrong
track with this :-)

//Magnus

Attachment Content-Type Size
guc_enum.patch text/plain 23.7 KB

From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Magnus Hagander" <magnus(at)hagander(dot)net>
Cc: "pgsql-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: WIP: guc enums
Date: 2008-03-04 21:35:27
Message-ID: 47CDC09F.8000503@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Magnus Hagander wrote:
> The patch only converts a couple of the potential enum variables to the new
> type, mainly as a proof of concept. But already I hit the problem twice -
> the variable that holds the value of the guc enum is a C enum itself, which
> gives a compiler warning when I pass a pointer to it for
> config_enum.variable. (in this case, Log_error_verbosity and log_statement
> are enums and have the problem, but client_min_messages, log_min_messages
> and log_min_error_statement are already int and don't have it)
>
> On my platform (linux x86) it works fine when I just cast this to (int *),
> but I'm unsure if that's going to be safe on other platforms. I had some
> indication that it's probably not?

No, I don't think that's safe. Some googleing (*) suggests that the
compiler is free to choose any integer type for an enum. If you do
"*((int *)enumptr) = x", and the compiler chose a 16-bit type for the
enum, you overwrite some unrelated piece of memory.

> And if not, the only way I know to do it is to change the C level enums to
> be an int and use #define:s instead of what's there now. If that's
> required, is that an acceptable change in order to implement this? If not,
> any better ideas on how to do it?

Yuck :-(.

We could keep using the assignment hooks. But they could be a lot
simpler than they are nowadays, if the string -> int conversion was done
by the GUC machinery:

static const char *
assign_client_min_messages(int newval, bool doit, GucSource source)
{
client_min_messages = newval;
}

Another idea would be cajole the compiler to choose a type of the same
length as "int", by adding a dummy enum value to the enum, like:

enum client_min_messages {
DEBUG,
INFO,
...,
DUMMY = INT_MAX
}

Though I guess it might in theory choose something even wider, and the
"*((int *)enumptr) = x" would fail to set all the bits of the enum variable.

BTW, shouldn't be using malloc in config_enum_get_options...

(*): http://david.tribble.com/text/cdiffs.htm#C99-enum-type

and what I believe to be the current C99 standard, see "6.7.2.2
Enumeration specifiers":

http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, "pgsql-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: WIP: guc enums
Date: 2008-03-05 05:48:03
Message-ID: 8857.1204696083@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> Magnus Hagander wrote:
>> On my platform (linux x86) it works fine when I just cast this to (int *),
>> but I'm unsure if that's going to be safe on other platforms. I had some
>> indication that it's probably not?

> No, I don't think that's safe. Some googleing (*) suggests that the
> compiler is free to choose any integer type for an enum.

Yeah, it's absolutely not safe.

What I'd suggest is declaring the actual variable as int. You can still
use an enum typedef to declare the values, and just avert your eyes
when you have to cast the enum to int or vice versa. (This is legal per
C spec, so you won't introduce any portability issues when you do it.)

regards, tom lane


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, "pgsql-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: WIP: guc enums
Date: 2008-03-05 09:42:28
Message-ID: 47CE6B04.6020708@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Tom Lane wrote:
> "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
>> Magnus Hagander wrote:
>>> On my platform (linux x86) it works fine when I just cast this to (int *),
>>> but I'm unsure if that's going to be safe on other platforms. I had some
>>> indication that it's probably not?
>
>> No, I don't think that's safe. Some googleing (*) suggests that the
>> compiler is free to choose any integer type for an enum.
>
> Yeah, it's absolutely not safe.
>
> What I'd suggest is declaring the actual variable as int. You can still
> use an enum typedef to declare the values, and just avert your eyes
> when you have to cast the enum to int or vice versa. (This is legal per
> C spec, so you won't introduce any portability issues when you do it.)

That's pretty much the same as int variable and #defined constants. You
lose compiler checks, like assigning from one enum type to another, and
the "enumeration value ‘FOOBAR’ not handled in switch" warning.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: WIP: guc enums
Date: 2008-03-05 09:53:13
Message-ID: 20080305095313.GE5559@svr2.hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Tue, Mar 04, 2008 at 09:35:27PM +0000, Heikki Linnakangas wrote:
>
> We could keep using the assignment hooks. But they could be a lot
> simpler than they are nowadays, if the string -> int conversion was done
> by the GUC machinery:
>
> static const char *
> assign_client_min_messages(int newval, bool doit, GucSource source)
> {
> client_min_messages = newval;
> }

That would work. We'd need to keep the dummy variable as well, and point to
that one (since we need a place for guc to store it).

Toms suggested method obviously works as well, takes less code, but does
have those other drawbacks you mentioned (compiler checks off). I can try
either way depending on what people prefer :-)

> Another idea would be cajole the compiler to choose a type of the same
> length as "int", by adding a dummy enum value to the enum, like:
>
> enum client_min_messages {
> DEBUG,
> INFO,
> ...,
> DUMMY = INT_MAX
> }
>
> Though I guess it might in theory choose something even wider, and the
> "*((int *)enumptr) = x" would fail to set all the bits of the enum variable.

This seems like a bad choice to me.

> BTW, shouldn't be using malloc in config_enum_get_options...

Ah, that's indeed wrong. Originally I used it only for the hint message,
which is explicitly free()d later... But now it needs a palloc, yes. Fixed.

//Magnus


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, "pgsql-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: WIP: guc enums
Date: 2008-03-05 13:18:19
Message-ID: 21019.1204723099@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> Tom Lane wrote:
>> What I'd suggest is declaring the actual variable as int. You can still
>> use an enum typedef to declare the values, and just avert your eyes
>> when you have to cast the enum to int or vice versa. (This is legal per
>> C spec, so you won't introduce any portability issues when you do it.)

> That's pretty much the same as int variable and #defined constants. You
> lose compiler checks, like assigning from one enum type to another, and
> the "enumeration value FOOBAR not handled in switch" warning.

Well, you can at least get the latter if you cast explicitly:

switch ((MyEnum) myvariable) ...

We do this in several places already where the underlying variable isn't
declared as the enum for one reason or another. Also, local variables
can be declared as the enum type to get a little more safety.

In any case, the alternative being suggested of keeping the variables as
strings throws away *every* possible code-level advantage of having an
enum variable classification.

regards, tom lane


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, "pgsql-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: WIP: guc enums
Date: 2008-03-05 13:33:09
Message-ID: 47CEA115.40705@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Tom Lane wrote:
> "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
>> Tom Lane wrote:
>>> What I'd suggest is declaring the actual variable as int. You can still
>>> use an enum typedef to declare the values, and just avert your eyes
>>> when you have to cast the enum to int or vice versa. (This is legal per
>>> C spec, so you won't introduce any portability issues when you do it.)
>
>> That's pretty much the same as int variable and #defined constants. You
>> lose compiler checks, like assigning from one enum type to another, and
>> the "enumeration value ‘FOOBAR’ not handled in switch" warning.
>
> Well, you can at least get the latter if you cast explicitly:
>
> switch ((MyEnum) myvariable) ...
>
> We do this in several places already where the underlying variable isn't
> declared as the enum for one reason or another. Also, local variables
> can be declared as the enum type to get a little more safety.

True, I guess that's not too bad then. Just have to remember to do that.

Regarding the places where we already do that, I could find just three:
src/backend/utils/adt/lockfuncs.c: switch ((LockTagType)
lock->tag.locktag_type)
src/backend/storage/lmgr/lmgr.c: switch ((LockTagType) tag->locktag_type)
src/backend/regex/regc_locale.c: switch ((enum classes) index)

The first and 2nd are really the same, and it seems legitimate. At quick
glance, I couldn't figure out why "index" is an int-variable, and not an
enum, but that code comes from tcl.

> In any case, the alternative being suggested of keeping the variables as
> strings throws away *every* possible code-level advantage of having an
> enum variable classification.

Oh no, I didn't suggest keeping the variables as strings, that's
madness. I suggested keeping the variables as enums, and defining
"setter" functions for them, similar to the assign hooks we have now,
but the setter function wouldn't have to do anything else than assign an
int to the enum variable. The setter function would be just a
replacement for "*((int *)variable) = X".

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Magnus Hagander" <magnus(at)hagander(dot)net>, "pgsql-patches" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: WIP: guc enums
Date: 2008-03-05 13:40:56
Message-ID: 21962.1204724456@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> Oh no, I didn't suggest keeping the variables as strings, that's
> madness. I suggested keeping the variables as enums, and defining
> "setter" functions for them, similar to the assign hooks we have now,
> but the setter function wouldn't have to do anything else than assign an
> int to the enum variable. The setter function would be just a
> replacement for "*((int *)variable) = X".

Oh, I misunderstood. That would work, though you'd *also* need a fetch
function. Having to have two extra hook functions for every variable
seems like a lot of notational overhead for not much gain. (In my
experience C compilers are pretty darn lax about enums anyway, and so
there's not that much "strong typing" benefit to be gained from
declaring the variables as enums rather than int.)

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: WIP: guc enums
Date: 2008-03-05 15:03:22
Message-ID: 47CEB63A.3060606@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

Tom Lane wrote:
> Having to have two extra hook functions for every variable
> seems like a lot of notational overhead for not much gain.
>

+1

cheers

andrew


From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: WIP: guc enums
Date: 2008-03-05 15:20:27
Message-ID: 20080305152027.GS5559@svr2.hagander.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-patches

On Wed, Mar 05, 2008 at 08:18:19AM -0500, Tom Lane wrote:
> "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> > Tom Lane wrote:
> >> What I'd suggest is declaring the actual variable as int. You can still
> >> use an enum typedef to declare the values, and just avert your eyes
> >> when you have to cast the enum to int or vice versa. (This is legal per
> >> C spec, so you won't introduce any portability issues when you do it.)
>
> > That's pretty much the same as int variable and #defined constants. You
> > lose compiler checks, like assigning from one enum type to another, and
> > the "enumeration value FOOBAR not handled in switch" warning.
>
> Well, you can at least get the latter if you cast explicitly:
>
> switch ((MyEnum) myvariable) ...
>
> We do this in several places already where the underlying variable isn't
> declared as the enum for one reason or another. Also, local variables
> can be declared as the enum type to get a little more safety.

Looking for the two variables I've converted so far, there appears to be
zero places to make this change. They are only used in == and <= tests,
never in switches. Or should I add casts for those as well?

(I'm sure there can be other variables that are used in enums when I get
further down that list)

If we're good with that method, I'll proceed using that one. Any other
things people noticed with the patch that I should take into consideration
when I start my cleanup pass over the code?

//Magnus