Re: Questions about parsing boolean and casting to anyelement

Lists: pgsql-hackers
From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Questions about parsing boolean and casting to anyelement
Date: 2009-02-16 04:04:38
Message-ID: 20090216121254.F19D.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

The pg_autovacuum system catalog will be deprecated in 8.4,
but my customers use them to control autovacuum to emulate
maintenance window. So, I'm trying to re-implement the catalog
using a VIEW and RULEs in 8.4.

The attached is a WIP script, but I have some questions around it:
(XXX: I don't mean to propose the script in the core.)

- Postgres interprets 'on' as true and 'off' as false in configuration
parameters, but they are not accepted in sql-boolean.
Is it a design? or should we add a parser for 'on' and 'off' ?
I'd like to allow 'on' and 'off' in sql-boolean, too.

- The input strings are stored as-is in pg_class.reloptions.
So, mixed values could be shown in reloptions. For example
autovacuum_enabled=0/1/on/off/true/false .
Should we canonicalize them? However, I think the current behavior
is not so bad because it can preserve user inputs.

- Are there any limitations in casting to anyelement?
I got an error when I define the 3rd argument of array_find()
as anyelement:
ERROR: UNION types text and integer cannot be matched
Even if I use casts, it seems to be ignored.

CREATE FUNCTION array_find(text[], text, anyelement)
RETURNS anyelement AS
$$
SELECT substring(i from E'\\W*=(.*)')::anyelement
FROM unnest($1) AS t(i) WHERE i LIKE $2 || '=%'
UNION ALL SELECT $3 LIMIT 1
$$
LANGUAGE sql IMMUTABLE STRICT;

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment Content-Type Size
pg_autovacuum-8.4.sql application/octet-stream 1.6 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Questions about parsing boolean and casting to anyelement
Date: 2009-02-16 15:48:44
Message-ID: 17184.1234799324@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> - Are there any limitations in casting to anyelement?

It's a no-op ... probably we shouldn't even let you do it, if the
lack of an error leaves room for such misinterpretation as this.
anyelement and friends are placeholders for use in function
declarations, not real types that it makes sense to cast to.

> CREATE FUNCTION array_find(text[], text, anyelement)
> RETURNS anyelement AS
> $$
> SELECT substring(i from E'\\W*=(.*)')::anyelement
> FROM unnest($1) AS t(i) WHERE i LIKE $2 || '=%'
> UNION ALL SELECT $3 LIMIT 1
> $$
> LANGUAGE sql IMMUTABLE STRICT;

The substring() necessarily produces type text, so I dunno why you
think $3 needs to be anything but text.

regards, tom lane


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Questions about parsing boolean and casting to anyelement
Date: 2009-02-17 00:53:04
Message-ID: 20090217092434.F173.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> > - Are there any limitations in casting to anyelement?
>
> It's a no-op ... probably we shouldn't even let you do it, if the
> lack of an error leaves room for such misinterpretation as this.
> anyelement and friends are placeholders for use in function
> declarations, not real types that it makes sense to cast to.

I hope anyelement could be used in cast because casts are supported by
almost programming languages where template or generics are available.
Moreover, we can cast to anyelement if we use C functions; using oid of
anyelement in runtime and querying an associated cast function from
system catalog.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Questions about parsing boolean and casting to anyelement
Date: 2009-02-17 01:03:33
Message-ID: 23727.1234832613@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
>>> - Are there any limitations in casting to anyelement?
>>
>> It's a no-op ... probably we shouldn't even let you do it, if the
>> lack of an error leaves room for such misinterpretation as this.
>> anyelement and friends are placeholders for use in function
>> declarations, not real types that it makes sense to cast to.

> I hope anyelement could be used in cast because casts are supported by
> almost programming languages where template or generics are available.

I think what you're suggesting is that inside a polymorphic function,
anyelement would somehow be a macro for the type that the function's
current anyelement parameter(s) have. It's an interesting idea but
it's just fantasy at the moment; I don't even have an idea of how we
might implement that.

In the meantime I'm more convinced than ever that we should throw an
error for attempting such a cast. If people are imagining that it will
do something like that, we need to disillusion them.

regards, tom lane


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Allow on/off as input texts for boolean.
Date: 2009-02-17 06:17:45
Message-ID: 20090217150947.F185.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> wrote:
> - Postgres interprets 'on' as true and 'off' as false in configuration
> parameters, but they are not accepted in sql-boolean.
> Is it a design? or should we add a parser for 'on' and 'off' ?
> I'd like to allow 'on' and 'off' in sql-boolean, too.

Here is a patch to allow 'on' and 'off' as input texts for boolean.
Duplicated boolean parsers in parse_bool() and boolin() are merged
into a new parse_bool_with_len().

I think the change is useful when we treat reloptions in programs.
Since human-readable texts are not suitable for programs, we would need
conversions from text to boolean. Then the shared parser works well.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment Content-Type Size
boolin_accepts_onoff.patch application/octet-stream 6.5 KB

From: Sam Mason <sam(at)samason(dot)me(dot)uk>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Questions about parsing boolean and casting to anyelement
Date: 2009-02-17 11:59:15
Message-ID: 20090217115915.GF32672@frubble.xen.chris-lamb.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Feb 16, 2009 at 08:03:33PM -0500, Tom Lane wrote:
> ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
> > I hope anyelement could be used in cast because casts are supported by
> > almost programming languages where template or generics are available.

Programming languages with "generics" (aka, parametric polymorphism in
literature) should mean that you need *less* casts because the type
system is expressive enough that you don't need to "escape" through a
cast.

> I think what you're suggesting is that inside a polymorphic function,
> anyelement would somehow be a macro for the type that the function's
> current anyelement parameter(s) have. It's an interesting idea but
> it's just fantasy at the moment; I don't even have an idea of how we
> might implement that.

A couple of solutions would immediately present themselves; making
functions first class objects and introducing something called "type
classes" (please note these bear little resemblance to "classes" in
object orientated programming).

If functions were first class objects; you could pass in the "input"
function (i.e. boolin, or numeric_in) to the "array_find" function
directly call it in place of the "magic" cast syntax (magic because it
has to figure out the type of the LHS, whereas if it was a function with
known type then it wouldn't need to infer the source type).

Type classes[1][2] are a general mechanism for making the "magic" above
tractable and sound. The cast above would be exactly analogous to the
"read" function in Haskell, and is used very regularly in most code.

> In the meantime I'm more convinced than ever that we should throw an
> error for attempting such a cast. If people are imagining that it will
> do something like that, we need to disillusion them.

Yes, sounds sensible at the moment.

--
Sam http://samason.me.uk/

[1] http://portal.acm.org/citation.cfm?id=75277.75283
is the original paper
[2] http://portal.acm.org/citation.cfm?id=141536
extends them to have multiple type parameters, not for PG but nice
to know it's been done before and isn't new ground


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Questions about parsing boolean and casting to anyelement
Date: 2009-02-17 15:40:36
Message-ID: 3030.1234885236@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
>> I hope anyelement could be used in cast because casts are supported by
>> almost programming languages where template or generics are available.

> I think what you're suggesting is that inside a polymorphic function,
> anyelement would somehow be a macro for the type that the function's
> current anyelement parameter(s) have. It's an interesting idea but
> it's just fantasy at the moment; I don't even have an idea of how we
> might implement that.

After thinking about it for awhile, I don't like the notation anyway
--- it's not immediately obvious that a cast to anyelement should mean
something like that. What seems more sensible to me is to introduce
a function to get the type of an expression, so that you could write
something like

cast(expression as typeof(expression))

This special function would act like C's sizeof and similar constructs
in that its argument would never be evaluated, only inspected at parse
time to determine its type. (There are already precedents for this in
SQL; see the IS OF construct.) So the original requirement would be
met with something like "expression::typeof($1)".

A small disadvantage of this approach is that it's notationally a bit
uglier for anyelement/anyarray pairs. For example, consider a function
"foo(anyelement) returns anyarray". To get at the element type you just
say typeof($1), but if you have to name the array type you need a hack
like typeof(array[$1]). In the other direction (name the element type
of a parameter array) something like typeof($1[1]) would work.

The countervailing advantage is that this solves a lot of problems that
overloading anyelement wouldn't ever solve, since you can get at the
type of any expression not just a bare parameter.

Also I think it'd be relatively easy to stick into the parser; it
wouldn't require introduction of any new parse-time context information.

Anyway, none of this is material for 8.4, just a possible TODO item.

regards, tom lane


From: Brendan Jurd <direvus(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Questions about parsing boolean and casting to anyelement
Date: 2009-02-17 19:36:52
Message-ID: 37ed240d0902171136m6808f684rc787d85e42649556@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Feb 18, 2009 at 2:40 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> After thinking about it for awhile, I don't like the notation anyway
> --- it's not immediately obvious that a cast to anyelement should mean
> something like that. What seems more sensible to me is to introduce
> a function to get the type of an expression, so that you could write
> something like

We already have such a function, pg_typeof(). I submitted a patch for
it in the November commitfest, and you committed it. [1]

Or is that not the sort of function you were thinking of?

Cheers,
BJ

[1] http://git.postgresql.org/?p=postgresql.git;a=commit;h=1a850edf036a1c7dbb9f4fcfeae1e5f2c68cf049


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Brendan Jurd <direvus(at)gmail(dot)com>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Questions about parsing boolean and casting to anyelement
Date: 2009-02-17 19:50:07
Message-ID: 6993.1234900207@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Brendan Jurd <direvus(at)gmail(dot)com> writes:
> On Wed, Feb 18, 2009 at 2:40 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> After thinking about it for awhile, I don't like the notation anyway
>> --- it's not immediately obvious that a cast to anyelement should mean
>> something like that. What seems more sensible to me is to introduce
>> a function to get the type of an expression, so that you could write
>> something like

> We already have such a function, pg_typeof().

No, pg_typeof is a more-or-less ordinary function that delivers an OID
at runtime. What we need here is something that will work as a CAST
target, ie, it has to be treated as a type name at parse time.

regards, tom lane


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Questions about parsing boolean and casting to anyelement
Date: 2009-02-17 21:10:46
Message-ID: 162867790902171310y2e4c5bc7if00e22c6014968ce@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2009/2/17 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> I wrote:
>> ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:
>>> I hope anyelement could be used in cast because casts are supported by
>>> almost programming languages where template or generics are available.
>
>> I think what you're suggesting is that inside a polymorphic function,
>> anyelement would somehow be a macro for the type that the function's
>> current anyelement parameter(s) have. It's an interesting idea but
>> it's just fantasy at the moment; I don't even have an idea of how we
>> might implement that.
>
> After thinking about it for awhile, I don't like the notation anyway
> --- it's not immediately obvious that a cast to anyelement should mean
> something like that. What seems more sensible to me is to introduce
> a function to get the type of an expression, so that you could write
> something like
>
> cast(expression as typeof(expression))
>
> This special function would act like C's sizeof and similar constructs
> in that its argument would never be evaluated, only inspected at parse
> time to determine its type. (There are already precedents for this in
> SQL; see the IS OF construct.) So the original requirement would be
> met with something like "expression::typeof($1)".
>
> A small disadvantage of this approach is that it's notationally a bit
> uglier for anyelement/anyarray pairs. For example, consider a function
> "foo(anyelement) returns anyarray". To get at the element type you just
> say typeof($1), but if you have to name the array type you need a hack
> like typeof(array[$1]). In the other direction (name the element type
> of a parameter array) something like typeof($1[1]) would work.
>
> The countervailing advantage is that this solves a lot of problems that
> overloading anyelement wouldn't ever solve, since you can get at the
> type of any expression not just a bare parameter.
>
> Also I think it'd be relatively easy to stick into the parser; it
> wouldn't require introduction of any new parse-time context information.
>
> Anyway, none of this is material for 8.4, just a possible TODO item.

it's look like good idea

regards
Pavel Stehule

>
> regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Questions about parsing boolean and casting to anyelement
Date: 2009-02-17 23:39:26
Message-ID: 24487.1234913966@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> In the meantime I'm more convinced than ever that we should throw an
> error for attempting such a cast. If people are imagining that it will
> do something like that, we need to disillusion them.

BTW, I wrote up what I thought was a trivial patch to make this happen,
and promptly got a regression test failure:

CREATE TABLE enumtest_child (parent rainbow REFERENCES enumtest_parent);
INSERT INTO enumtest_parent VALUES ('red');
INSERT INTO enumtest_child VALUES ('red');
+ ERROR: casting to a polymorphic type such as anyenum is meaningless
+ LINE 1: ... FROM ONLY "public"."enumtest_parent" x WHERE "id"::pg_catal...
+ ^
+ QUERY: SELECT 1 FROM ONLY "public"."enumtest_parent" x WHERE "id"::pg_catalog.anyenum OPERATOR(pg_catalog.=) $1::pg_catalog.anyenum FOR SHARE OF x
INSERT INTO enumtest_child VALUES ('blue'); -- fail

What is happening is that the code to generate RI check queries is
blindly casting to the declared input type of the operator it's
selected, which here is "anyenum = anyenum". We could easily prevent
it from doing that for polymorphic input types; but since I tripped over
this case almost immediately, I'm wondering what other cases might be
out there that would get broken by throwing this error.

Seeing that this type of confusion hasn't come up before, I think it
might be better to leave things alone here.

regards, tom lane


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Allow on/off as input texts for boolean.
Date: 2009-02-19 13:18:56
Message-ID: 499D5C40.7010406@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

ITAGAKI Takahiro wrote:
> Here is a patch to allow 'on' and 'off' as input texts for boolean.
> Duplicated boolean parsers in parse_bool() and boolin() are merged
> into a new parse_bool_with_len().

Regarding your FIXME comment, I think parse_bool* should be in bool.c
and declared in builtins.h, which guc.c already includes.
(Conceptually, the valid format of a bool should be drived by the
boolean type, not the GUC system, I think.)


From: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Allow on/off as input texts for boolean.
Date: 2009-02-20 08:48:44
Message-ID: 20090220174642.9FD7.52131E4D@oss.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:

> ITAGAKI Takahiro wrote:
> > Here is a patch to allow 'on' and 'off' as input texts for boolean.
>
> Regarding your FIXME comment, I think parse_bool* should be in bool.c
> and declared in builtins.h, which guc.c already includes.
> (Conceptually, the valid format of a bool should be drived by the
> boolean type, not the GUC system, I think.)

Here is an updated patch to move parse_bool* into bool.c.
I also added tests of on/off values to the regression test.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

Attachment Content-Type Size
boolin_accepts_onoff.2.patch application/octet-stream 9.1 KB

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Allow on/off as input texts for boolean.
Date: 2009-03-09 14:35:05
Message-ID: 49B52919.2040108@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

ITAGAKI Takahiro wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>
>> ITAGAKI Takahiro wrote:
>>> Here is a patch to allow 'on' and 'off' as input texts for boolean.
>> Regarding your FIXME comment, I think parse_bool* should be in bool.c
>> and declared in builtins.h, which guc.c already includes.
>> (Conceptually, the valid format of a bool should be drived by the
>> boolean type, not the GUC system, I think.)
>
> Here is an updated patch to move parse_bool* into bool.c.
> I also added tests of on/off values to the regression test.

applied