Re: [BUGS] Failure to coerce unknown type to specific type

Lists: pgsql-bugspgsql-hackers
From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Failure to coerce unknown type to specific type
Date: 2015-04-09 01:21:44
Message-ID: CAMp0ubdECGUJWpQKyUM36PxQWoKe0VfTbHGHgn+w2H_5q1Nmow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Original report and patch by Karl Schnaitter.

create table a(u) as select '1';
create table b(i int);
insert into b select u from a;
ERROR: failed to find conversion function from unknown to integer

SELECT a=b FROM (SELECT ''::text, ' ') x(a,b);
ERROR: failed to find conversion function from unknown to text

The strange thing about these cases are that can_coerce_type returns
true, but coerce_type fails.

This can be fixed by a small change (attached) to find_coercion_pathway to add:

else if (sourceTypeId == UNKNOWNOID)
result = COERCION_PATH_COERCEVIAIO;

I don't see any big problem with doing this, because we've already
done the type inference; it's just a question of whether we succeed or
fail when we try to apply it. I don't see any reason to fail a cast
from unknown to something else.

However, this still doesn't fix a more complex case like:

create table tt(x text primary key);
create table ut(x unknown, foreign key(x) references tt);
insert into tt values('');
insert into ut values('');
update ut set x = x;

Regards,
Jeff Davis

Attachment Content-Type Size
unknown_coerce.patch text/x-diff 2.6 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-04-09 01:31:52
Message-ID: 1430.1428543112@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> Original report and patch by Karl Schnaitter.
> create table a(u) as select '1';

We should, in fact, fail that to begin with. Unknown-type columns are
a spectactularly horrid idea.

> This can be fixed by a small change (attached) to find_coercion_pathway to add:

> else if (sourceTypeId == UNKNOWNOID)
> result = COERCION_PATH_COERCEVIAIO;

This is not a good idea, I think. I definitely don't accept any reasoning
that starts from the premise that UNKNOWN is a first-class type.

regards, tom lane


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-04-09 06:18:20
Message-ID: 1428560300.2845.45.camel@jeff-desktop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Wed, 2015-04-08 at 21:31 -0400, Tom Lane wrote:
> Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> > Original report and patch by Karl Schnaitter.
> > create table a(u) as select '1';
>
> We should, in fact, fail that to begin with. Unknown-type columns are
> a spectactularly horrid idea.

That example was just for illustration. My other example didn't require
creating a table at all:

SELECT a=b FROM (SELECT ''::text, ' ') x(a,b);

it's fine with me if we want that to fail, but I don't think we're
failing in the right place, or with the right error message.

I'm not clear on what rules we're applying such that the above query
should fail, but:

SELECT ''::text=' ';

should succeed. Unknown literals are OK, but unknown column references
are not? If that's the rule, can we catch that earlier, and throw an
error like 'column reference "b" has unknown type'?

Regards,
Jeff Davis


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUGS] Failure to coerce unknown type to specific type
Date: 2015-04-23 01:40:33
Message-ID: CAMp0ubdVJAC+yO5XqU=M08WyXX19jnk9r2Faeo5W-SVDWAKzZA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Moving thread to -hackers.

On Wed, Apr 8, 2015 at 11:18 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> That example was just for illustration. My other example didn't require
> creating a table at all:
>
> SELECT a=b FROM (SELECT ''::text, ' ') x(a,b);
>
> it's fine with me if we want that to fail, but I don't think we're
> failing in the right place, or with the right error message.
>
> I'm not clear on what rules we're applying such that the above query
> should fail, but:
>
> SELECT ''::text=' ';
>
> should succeed. Unknown literals are OK, but unknown column references
> are not? If that's the rule, can we catch that earlier, and throw an
> error like 'column reference "b" has unknown type'?

Is the behavior of unknown literals vs. unknown column references
documented anywhere? I tried looking here:
http://www.postgresql.org/docs/devel/static/typeconv.html, but it
doesn't seem to make the distinction between how unknown literals vs.
unknown column references are handled.

My understanding until now has been that unknown types are a
placeholder while still inferring the types. But that leaves a few
questions:

1. Why do we care whether the unknown is a literal or a column reference?
2. Unknown column references seem basically useless, because we will
almost always encounter the "failure to find conversion" error, so why
do we allow them?
3. If unknowns are just a placeholder, why do we return them to the
client? What do we expect the client to do with it?

Regards,
Jeff Davis


From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUGS] Failure to coerce unknown type to specific type
Date: 2015-04-23 03:35:04
Message-ID: CAKFQuwYHOyFnKxmOeiBw4pxgn9oLJidaQeo48gKnAsUXucnnuQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

My apologies if much of this is already assumed knowledge by most
-hackers...I'm trying to learn from observation instead of, largely,
reading code in a foreign language.

On Wed, Apr 22, 2015 at 6:40 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:

> Moving thread to -hackers.
>
> On Wed, Apr 8, 2015 at 11:18 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> > That example was just for illustration. My other example didn't require
> > creating a table at all:
> >
> >
> ​​
> ​​
> SELECT a=b FROM (SELECT ''::text, ' ') x(a,b);
> >
> > it's fine with me if we want that to fail, but I don't think we're
> > failing in the right place, or with the right error message.
> >
> > I'm not clear on what rules we're applying such that the above query
> > should fail, but:
> >
> >
> ​​
> SELECT ''::text=' ';

>
> > should succeed. Unknown literals are OK, but unknown column references
> > are not? If that's the rule, can we catch that earlier, and throw an
> > error like 'column reference "b" has unknown type'?
>

But the fact that column "b" has the data type "unknown" is only a warning
- not an error.

This seems to be a case of the common problem (or, at least recently
mentioned) where type conversion only deals with data and not context.

http://www.postgresql.org/message-id/CADx9qBmVPQvSH3+2cH4cwwPmphW1mE18e=WUmLFUC-QZ-t7Q6Q@mail.gmail.com

Additional hinting regarding the column containing the offending data would
be welcomed by the community - but I suspect it is a non-trivial endeavor.

> Is the behavior of unknown literals vs. unknown column references
> documented anywhere? I tried looking here:
> http://www.postgresql.org/docs/devel/static/typeconv.html, but it
> doesn't seem to make the distinction between how unknown literals vs.
> unknown column references are handled.
>
> My understanding until now has been that unknown types are a
> placeholder while still inferring the types. But that leaves a few
> questions:
>
> 1. Why do we care whether the unknown is a literal or a column reference?
>

Apparently the difference is in when non-implicit casts can be used for
coercion - or, rather, when input functions can be used instead of casting
functions.

in ​SELECT ' '::text = 'a' the explicit cast between the implicit unknown
and text is used while going through the subquery forces the planner to
locate an implicit cast between the explicit unknown and text.

​The following fails no matter what you try because no casts exist from
unknown to integer:

​​SELECT a::int=b FROM (SELECT '1', 1) x(a,b);

but this too works - which is why the implicit cast concept above fails
(I'm leaving it since the thought process may help in understanding):

SELECT 1 = '1';

From which I infer that an unknown literal is allowed to be fed directly
into a type's input function to facilitate a direct coercion.

Writing this makes me wish for more precise terminology...is there
something already established here? "untyped" versus "unknown" makes sense
to me. untyped literals only exist within the confines of a single node
and can be passed through a type's input function to make them take on a
type. If the untyped reference passes through the node without having been
assigned an explicit type it is assigned the unknown type.

2. Unknown column references seem basically useless, because we will
> almost always encounter the "failure to find conversion" error, so why
> do we allow them?
>

At this point...backward compatibility?

I do get a warning in psql (9.3.6) from your original -bugs example

create table a(u) as select '1';

WARNING: "column "u" has type "unknown"​
DETAIL: Proceeding with relation creation anyway.

Related question: was there ever a time when the above failed instead of
just supplying a warning?

My git-fu is not super strong but the above warning was last edited by Tom
Lane back in 2003 (d8528630) though it was just a refactor - the warning
was already present. I suppose after 12 years the "why" doesn't matter so
much...

create table b(i int);
insert into b select u from a;
ERROR: failed to find conversion function from unknown to integer

Text appears to have a cast defined:

SELECT u::text FROM a;

> 3. If unknowns are just a placeholder, why do we return them to the
> client? What do we expect the client to do with it?

​We do?​ I suspect that it is effectively treated as if it were text by
client libraries.

​My gut reaction is if you feel strongly enough to add some additional
documentation or warnings/hints/details related to this topic they probably
would get put in; but disallowing "unknown" as first-class type is likely
to fail to pass a cost-benefit evaluation.

Distinguishing between "untyped" literals and "unknown type" literals seems
promising concept to aid in understanding the difference in the face of not
being able (or wanting) to actually change the behavior.

David J.


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUGS] Failure to coerce unknown type to specific type
Date: 2015-04-23 06:26:43
Message-ID: 1429770403.4604.22.camel@jeff-desktop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Wed, 2015-04-22 at 20:35 -0700, David G. Johnston wrote:

> But the fact that column "b" has the data type "unknown" is only a
> warning - not an error.
>
I get an error:

postgres=# SELECT ' '::text = 'a';
?column?
----------
f
(1 row)

postgres=# SELECT a=b FROM (SELECT ''::text, ' ') x(a,b);
ERROR: failed to find conversion function from unknown to text

So that means the column reference "b" is treated differently than the
literal. Here I don't mean a reference to an actual column of a real
table, just an identifier ("b") that parses as a columnref.

Creating the table gives you a warning (not an error), but I think that
was a poor example for me to choose, and not important to my point.
>
> This seems to be a case of the common problem (or, at least recently
> mentioned) where type conversion only deals with data and not context.
>
>
> http://www.postgresql.org/message-id/CADx9qBmVPQvSH3
> +2cH4cwwPmphW1mE18e=WUmLFUC-QZ-t7Q6Q(at)mail(dot)gmail(dot)com
>
>
I think that is a different problem. That's a runtime type conversion
error (execution time), and I'm talking about something happening at
parse analysis time.

>
> but this too works - which is why the implicit cast concept above
> fails (I'm leaving it since the thought process may help in
> understanding):
>
>
> SELECT 1 = '1';
>
>
> From which I infer that an unknown literal is allowed to be fed
> directly into a type's input function to facilitate a direct coercion.

Yes, I believe that's what's happening. When we use an unknown literal,
it's acting more like a value constructor and will pass it to the type
input function. When it's a columnref, even if unknown, it tries to cast
it and fails.

But that is very confusing. In the example at the top of this email, it
seems like the second query should be equivalent to the first, or even
that postgres should be able to rewrite the second into the first. But
the second query fails where the first succeeds.

> At this point...backward compatibility?

Backwards compatibility of what queries? I guess the ones that return
unknowns to the client or create tables with unknown columns?

> create table a(u) as select '1';
>
>
> WARNING: "column "u" has type "unknown"​
> DETAIL: Proceeding with relation creation anyway.
>
>
> Related question: was there ever a time when the above failed instead
> of just supplying a warning?

Not that I recall.

> ​My gut reaction is if you feel strongly enough to add some additional
> documentation or warnings/hints/details related to this topic they
> probably would get put in; but disallowing "unknown" as first-class
> type is likely to fail to pass a cost-benefit evaluation.

I'm not proposing that we eliminate unknown. I just think columnrefs and
literals should behave consistently. If we really don't want unknown
columnrefs, it seems like we could at least throw a better error.

If we were starting from scratch, I'd also not return unknown to the
client, but we have to worry about the backwards compatibility.

> Distinguishing between "untyped" literals and "unknown type" literals
> seems promising concept to aid in understanding the difference in the
> face of not being able (or wanting) to actually change the behavior.

Not sure I understand that proposal, can you elaborate?

Regards,
Jeff Davis


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql(at)j-davis(dot)com
Cc: david(dot)g(dot)johnston(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUGS] Failure to coerce unknown type to specific type
Date: 2015-04-23 08:07:10
Message-ID: 20150423.170710.83534644.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Hello, I think this is a bug.

The core of this problem is that coerce_type() fails for Var of
type UNKNOWNOID.

The comment for the function says that,

> * The caller should already have determined that the coercion is possible;
> * see can_coerce_type.

But can_coerce_type() should say it's possible to convert from
unknown to any type as it doesn't see the target node type. I
think this as an inconsistency between can_coerce_type and
coerce_type. So making this consistent would be right way.

Concerning only this issue, putting on-the-fly conversion for
unkown nonconstant as attached patch worked for me. I'm not so
confident on this, though..

regards,

At Wed, 22 Apr 2015 23:26:43 -0700, Jeff Davis <pgsql(at)j-davis(dot)com> wrote in <1429770403(dot)4604(dot)22(dot)camel(at)jeff-desktop>
> On Wed, 2015-04-22 at 20:35 -0700, David G. Johnston wrote:
>
> > But the fact that column "b" has the data type "unknown" is only a
> > warning - not an error.
> >
> I get an error:
>
> postgres=# SELECT ' '::text = 'a';
> ?column?
> ----------
> f
> (1 row)
>
> postgres=# SELECT a=b FROM (SELECT ''::text, ' ') x(a,b);
> ERROR: failed to find conversion function from unknown to text
>
> So that means the column reference "b" is treated differently than the
> literal. Here I don't mean a reference to an actual column of a real
> table, just an identifier ("b") that parses as a columnref.
>
> Creating the table gives you a warning (not an error), but I think that
> was a poor example for me to choose, and not important to my point.
> >
> > This seems to be a case of the common problem (or, at least recently
> > mentioned) where type conversion only deals with data and not context.
> >
> >
> > http://www.postgresql.org/message-id/CADx9qBmVPQvSH3
> > +2cH4cwwPmphW1mE18e=WUmLFUC-QZ-t7Q6Q(at)mail(dot)gmail(dot)com
> >
> >
> I think that is a different problem. That's a runtime type conversion
> error (execution time), and I'm talking about something happening at
> parse analysis time.
>
> >
> > but this too works - which is why the implicit cast concept above
> > fails (I'm leaving it since the thought process may help in
> > understanding):
> >
> >
> > SELECT 1 = '1';
> >
> >
> > From which I infer that an unknown literal is allowed to be fed
> > directly into a type's input function to facilitate a direct coercion.
>
> Yes, I believe that's what's happening. When we use an unknown literal,
> it's acting more like a value constructor and will pass it to the type
> input function. When it's a columnref, even if unknown, it tries to cast
> it and fails.
>
> But that is very confusing. In the example at the top of this email, it
> seems like the second query should be equivalent to the first, or even
> that postgres should be able to rewrite the second into the first. But
> the second query fails where the first succeeds.
>
>
> > At this point...backward compatibility?
>
> Backwards compatibility of what queries? I guess the ones that return
> unknowns to the client or create tables with unknown columns?
>
> > create table a(u) as select '1';
> >
> >
> > WARNING: "column "u" has type "unknown"​
> > DETAIL: Proceeding with relation creation anyway.
> >
> >
> > Related question: was there ever a time when the above failed instead
> > of just supplying a warning?
>
> Not that I recall.
>
>
>
> > ​My gut reaction is if you feel strongly enough to add some additional
> > documentation or warnings/hints/details related to this topic they
> > probably would get put in; but disallowing "unknown" as first-class
> > type is likely to fail to pass a cost-benefit evaluation.
>
> I'm not proposing that we eliminate unknown. I just think columnrefs and
> literals should behave consistently. If we really don't want unknown
> columnrefs, it seems like we could at least throw a better error.
>
> If we were starting from scratch, I'd also not return unknown to the
> client, but we have to worry about the backwards compatibility.
>
> > Distinguishing between "untyped" literals and "unknown type" literals
> > seems promising concept to aid in understanding the difference in the
> > face of not being able (or wanting) to actually change the behavior.
>
> Not sure I understand that proposal, can you elaborate?

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
var_coerce.patch text/x-patch 1.3 KB

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql(at)j-davis(dot)com
Cc: david(dot)g(dot)johnston(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUGS] Failure to coerce unknown type to specific type
Date: 2015-04-23 08:18:08
Message-ID: 20150423.171808.150348710.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Sorry, the patch had obvious bug..

-+ Int32GetDatum(inputTypeMod),
++ Int32GetDatum(targetTypeMod),

regards,

> Hello, I think this is a bug.
>
> The core of this problem is that coerce_type() fails for Var of
> type UNKNOWNOID.
>
> The comment for the function says that,
>
> > * The caller should already have determined that the coercion is possible;
> > * see can_coerce_type.
>
> But can_coerce_type() should say it's possible to convert from
> unknown to any type as it doesn't see the target node type. I
> think this as an inconsistency between can_coerce_type and
> coerce_type. So making this consistent would be right way.
>
> Concerning only this issue, putting on-the-fly conversion for
> unkown nonconstant as attached patch worked for me. I'm not so
> confident on this, though..
>
> regards,
>
> At Wed, 22 Apr 2015 23:26:43 -0700, Jeff Davis <pgsql(at)j-davis(dot)com> wrote in <1429770403(dot)4604(dot)22(dot)camel(at)jeff-desktop>
> > On Wed, 2015-04-22 at 20:35 -0700, David G. Johnston wrote:
> >
> > > But the fact that column "b" has the data type "unknown" is only a
> > > warning - not an error.
> > >
> > I get an error:
> >
> > postgres=# SELECT ' '::text = 'a';
> > ?column?
> > ----------
> > f
> > (1 row)
> >
> > postgres=# SELECT a=b FROM (SELECT ''::text, ' ') x(a,b);
> > ERROR: failed to find conversion function from unknown to text

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
var_coerce-2.patch text/x-patch 1.3 KB

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql(at)j-davis(dot)com
Cc: david(dot)g(dot)johnston(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUGS] Failure to coerce unknown type to specific type
Date: 2015-04-23 08:31:44
Message-ID: 20150423.173144.97709612.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Now I found a comment at just where I patched,

> * XXX if the typinput function is not immutable, we really ought to
> * postpone evaluation of the function call until runtime. But there
> * is no way to represent a typinput function call as an expression
> * tree, because C-string values are not Datums. (XXX This *is*
> * possible as of 7.3, do we want to do it?)

Is it OK to *now* we can do this?

regards,

> Hello, I think this is a bug.
>
> The core of this problem is that coerce_type() fails for Var of
> type UNKNOWNOID.
>
> The comment for the function says that,
>
> > * The caller should already have determined that the coercion is possible;
> > * see can_coerce_type.
>
> But can_coerce_type() should say it's possible to convert from
> unknown to any type as it doesn't see the target node type. I
> think this as an inconsistency between can_coerce_type and
> coerce_type. So making this consistent would be right way.
>
> Concerning only this issue, putting on-the-fly conversion for
> unkown nonconstant as attached patch worked for me. I'm not so
> confident on this, though..
>
> regards,
>
> At Wed, 22 Apr 2015 23:26:43 -0700, Jeff Davis <pgsql(at)j-davis(dot)com> wrote in <1429770403(dot)4604(dot)22(dot)camel(at)jeff-desktop>
> > On Wed, 2015-04-22 at 20:35 -0700, David G. Johnston wrote:
> >
> > > But the fact that column "b" has the data type "unknown" is only a
> > > warning - not an error.
> > >
> > I get an error:
> >
> > postgres=# SELECT ' '::text = 'a';
> > ?column?
> > ----------
> > f
> > (1 row)
> >
> > postgres=# SELECT a=b FROM (SELECT ''::text, ' ') x(a,b);
> > ERROR: failed to find conversion function from unknown to text

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql(at)j-davis(dot)com
Cc: david(dot)g(dot)johnston(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUGS] Failure to coerce unknown type to specific type
Date: 2015-04-23 08:35:35
Message-ID: 20150423.173535.202061455.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Very sorry for the trash..

===
Now I found a comment at just where I patched,

> * XXX if the typinput function is not immutable, we really ought to
> * postpone evaluation of the function call until runtime. But there
> * is no way to represent a typinput function call as an expression
> * tree, because C-string values are not Datums. (XXX This *is*
> * possible as of 7.3, do we want to do it?)

- Is it OK to *now* we can do this?
+ Is it OK to regard that we can do this *now*?

regards,

> Hello, I think this is a bug.
>
> The core of this problem is that coerce_type() fails for Var of
> type UNKNOWNOID.
>
> The comment for the function says that,
>
> > * The caller should already have determined that the coercion is possible;
> > * see can_coerce_type.
>
> But can_coerce_type() should say it's possible to convert from
> unknown to any type as it doesn't see the target node type. I
> think this as an inconsistency between can_coerce_type and
> coerce_type. So making this consistent would be right way.
>
> Concerning only this issue, putting on-the-fly conversion for
> unkown nonconstant as attached patch worked for me. I'm not so
> confident on this, though..
>
> regards,
>
> At Wed, 22 Apr 2015 23:26:43 -0700, Jeff Davis <pgsql(at)j-davis(dot)com> wrote in <1429770403(dot)4604(dot)22(dot)camel(at)jeff-desktop>
> > On Wed, 2015-04-22 at 20:35 -0700, David G. Johnston wrote:
> >
> > > But the fact that column "b" has the data type "unknown" is only a
> > > warning - not an error.
> > >
> > I get an error:
> >
> > postgres=# SELECT ' '::text = 'a';
> > ?column?
> > ----------
> > f
> > (1 row)
> >
> > postgres=# SELECT a=b FROM (SELECT ''::text, ' ') x(a,b);
> > ERROR: failed to find conversion function from unknown to text


From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUGS] Failure to coerce unknown type to specific type
Date: 2015-04-23 08:49:38
Message-ID: CAKFQuwY_byuc0L9eLB11Y=GnvY0CUNYFmSYbgvWfYQtzkT6QSQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Wednesday, April 22, 2015, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:

> On Wed, 2015-04-22 at 20:35 -0700, David G. Johnston wrote:
>
> > But the fact that column "b" has the data type "unknown" is only a
> > warning - not an error.
> >
> I get an error:
>
> postgres=# SELECT ' '::text = 'a';
> ?column?
> ----------
> f
> (1 row)
>
> postgres=# SELECT a=b FROM (SELECT ''::text, ' ') x(a,b);
> ERROR: failed to find conversion function from unknown to text
>
> So that means the column reference "b" is treated differently than the
> literal. Here I don't mean a reference to an actual column of a real
> table, just an identifier ("b") that parses as a columnref.
>
> Creating the table gives you a warning (not an error), but I think that
> was a poor example for me to choose, and not important to my point.

I get the point but the warning stems from converting from untyped to
unknown. Then you get the error looking for an implicit cast from
unknown. The error you had stated referred to the first situation, the
conversion of untyped to unknown.

>
> > This seems to be a case of the common problem (or, at least recently
> > mentioned) where type conversion only deals with data and not context.
> >
> >
> > http://www.postgresql.org/message-id/CADx9qBmVPQvSH3
> > +2cH4cwwPmphW1mE18e=WUmLFUC-QZ-t7Q6Q(at)mail(dot)gmail(dot)com <javascript:;>
> >
> >
> I think that is a different problem. That's a runtime type conversion
> error (execution time), and I'm talking about something happening at
> parse analysis time.

Again, referring here to why your proposed error seems unlikely in face of
similar errors not currently providing sufficient context either. I don't
know enough to posit why this is the case.

> >
> > but this too works - which is why the implicit cast concept above
> > fails (I'm leaving it since the thought process may help in
> > understanding):
> >
> >
> > SELECT 1 = '1';
> >
> >
> > From which I infer that an unknown literal is allowed to be fed
> > directly into a type's input function to facilitate a direct coercion.
>
> Yes, I believe that's what's happening. When we use an unknown literal,
> it's acting more like a value constructor and will pass it to the type
> input function. When it's a columnref, even if unknown, it tries to cast
> it and fails.
>
> But that is very confusing. In the example at the top of this email, it
> seems like the second query should be equivalent to the first, or even
> that postgres should be able to rewrite the second into the first. But
> the second query fails where the first succeeds.

Agreed (sorta, I can understand your PoV) - but it is consistently
confusing...and quite obvious when you've changed from one to the other.

Is there something concrete to dig teeth into here?

>
> > At this point...backward compatibility?
>
> Backwards compatibility of what queries? I guess the ones that return
> unknowns to the client or create tables with unknown columns?

Yes, disallowing unknown and requiring everything to be untyped or error.

> > create table a(u) as select '1';
> >
> >
> > WARNING: "column "u" has type "unknown"​
> > DETAIL: Proceeding with relation creation anyway.
> >
> >
> > Related question: was there ever a time when the above failed instead
> > of just supplying a warning?
>
> Not that I recall.
>
>
>
> > ​My gut reaction is if you feel strongly enough to add some additional
> > documentation or warnings/hints/details related to this topic they
> > probably would get put in; but disallowing "unknown" as first-class
> > type is likely to fail to pass a cost-benefit evaluation.
>
> I'm not proposing that we eliminate unknown. I just think columnrefs and
> literals should behave consistently. If we really don't want unknown
> columnrefs, it seems like we could at least throw a better error.

We do allow unknown column refs. We don't allow you to do much with them
though - given the lack of casts, implicit and otherwise. The error that
result from that situation are where the complaint lies. Since we cannot
disallow unknown column refs the question is can the resultant errors be
improved. I really don't see value in expending effort solely trying to
improve this limited situation. If the same effort also improves a wider
swath of the code base then great.

The only other option is to allow unknowns to be implicitly cast to text
and then fed into the input type just like an untyped literal would. But
those are not the same thing - no matter how similar your two mock queries
make them seem - and extrapolation from those two alone doesn't seem
justified. And his is crux of where your similarity falls apart. If you
can justify the above behavior then maybe...

>
> If we were starting from scratch, I'd also not return unknown to the
> client, but we have to worry about the backwards compatibility.
>
> > Distinguishing between "untyped" literals and "unknown type" literals
> > seems promising concept to aid in understanding the difference in the
> > face of not being able (or wanting) to actually change the behavior.
>
> Not sure I understand that proposal, can you elaborate?
>

Purely documentation explaining and naming the two different behaviors you
are seeing.

Reading and writing all this I'm convinced you have gotten the idea in your
mind an expectation of equivalency and consistency where there really is
little or none from an overall design perspective. And none insofar as
would merit trying to force the two example queries you provide to behave
identically. There are a number of things about SQL that one either simply
lives with or goes through mind contortions to understand the, possibly
imperfect, reasoning behind. This is one of those things: and while it's
been fun to do those contortions in the end I am only a little bit better
off than when I simply accepted the fact the unknown and untyped were
similar but different (even if I hadn't considered giving them different
names).

Literals and column references are different. If you put a literal into a
column you lose the ability to then treat it as a literal.

David J.


From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUGS] Failure to coerce unknown type to specific type
Date: 2015-04-23 09:22:27
Message-ID: CAKFQuwZAmCHjOnpc758+74Fc8dR4JwU1+V5LLr8uV3UOwzJt6g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Thu, Apr 23, 2015 at 1:49 AM, David G. Johnston <
david(dot)g(dot)johnston(at)gmail(dot)com> wrote:

> On Wednesday, April 22, 2015, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>
>> On Wed, 2015-04-22 at 20:35 -0700, David G. Johnston wrote:
>>
>
>> > ​My gut reaction is if you feel strongly enough to add some additional
>> > documentation or warnings/hints/details related to this topic they
>> > probably would get put in; but disallowing "unknown" as first-class
>> > type is likely to fail to pass a cost-benefit evaluation.
>>
>> I'm not proposing that we eliminate unknown. I just think columnrefs and
>> literals should behave consistently. If we really don't want unknown
>> columnrefs, it seems like we could at least throw a better error.
>
>
> We do allow unknown column refs. We don't allow you to do much with them
> though - given the lack of casts, implicit and otherwise. The error that
> result from that situation are where the complaint lies. Since we cannot
> disallow unknown column refs the question is can the resultant errors be
> improved. I really don't see value in expending effort solely trying to
> improve this limited situation. If the same effort also improves a wider
> swath of the code base then great.
>
>
​Slightly tangential but a pair of recent threads

http://www.postgresql.org/message-id/55244654.7020405@BlueTreble.com
http://www.postgresql.org/message-id/flat/CAKFQuwazck37J7fA4pZOz8m9JsKMQQoPAftxRY0cA_n4R2xfbQ(at)mail(dot)gmail(dot)com#CAKFQuwazck37J7fA4pZOz8m9JsKMQQoPAftxRY0cA_n4R2xfbQ@mail.gmail.com

where I pondered about polymorphic functions got me thinking about the
following function definition:

​create function poly(inarg anyelement, testarg unknown) returns anyelement
AS $$
BEGIN
SELECT inarg;
END;
$$
LANGUAGE plpgsql
;

Which indeed works...

Jim's "variant" type largely mirrors the behavior desired in this thread.

The lack of casting fails to unknown makes it an unsuitable alternative for
"variant" though maybe if indeed we allow type coercion to work it would
become viable. But the same concerns apply as well.

David J.


From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUGS] Failure to coerce unknown type to specific type
Date: 2015-04-23 09:26:16
Message-ID: CAKFQuwYZ2p2o1xgVgJF-6cQoWW1QmYr1oRUTUbU1AJ1XwdxSjQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Thu, Apr 23, 2015 at 1:35 AM, Kyotaro HORIGUCHI <
horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> Very sorry for the trash..
>
> ===
> Now I found a comment at just where I patched,
>
> > * XXX if the typinput function is not immutable, we really ought to
> > * postpone evaluation of the function call until runtime. But there
> > * is no way to represent a typinput function call as an expression
> > * tree, because C-string values are not Datums. (XXX This *is*
> > * possible as of 7.3, do we want to do it?)
>
> - Is it OK to *now* we can do this?
> + Is it OK to regard that we can do this *now*?
>
>
In this patch or a different one? Does this comment have anything to do
with the concern of this thread?

David J.


From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUGS] Failure to coerce unknown type to specific type
Date: 2015-04-23 09:29:52
Message-ID: CAKFQuwa6LsueyCBfDTi0DrdX16c0Eh5NO67nJp5K=NcDAYyBOQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Thu, Apr 23, 2015 at 1:07 AM, Kyotaro HORIGUCHI <
horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> Hello, I think this is a bug.
>
> The core of this problem is that coerce_type() fails for Var of
> type UNKNOWNOID.
>
> The comment for the function says that,
>
> > * The caller should already have determined that the coercion is
> possible;
> > * see can_coerce_type.
>
> But can_coerce_type() should say it's possible to convert from
> unknown to any type as it doesn't see the target node type. I
> think this as an inconsistency between can_coerce_type and
> coerce_type. So making this consistent would be right way.
>
>
​You have two pieces of contradictory knowledge - how are you picking which
one to "fix"?​

Concerning only this issue, putting on-the-fly conversion for
> unkown nonconstant as attached patch worked for me. I'm not so
> confident on this, though..
>

​Confident about what aspect - the safety of the patch itself or whether
the conversion is even a good idea?​

David J.​


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: david(dot)g(dot)johnston(at)gmail(dot)com
Cc: pgsql(at)j-davis(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUGS] Failure to coerce unknown type to specific type
Date: 2015-04-23 10:07:49
Message-ID: 20150423.190749.87287279.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Hello,

> > Very sorry for the trash..
> >
> > ===
> > Now I found a comment at just where I patched,
> >
> > > * XXX if the typinput function is not immutable, we really ought to
> > > * postpone evaluation of the function call until runtime. But there
> > > * is no way to represent a typinput function call as an expression
> > > * tree, because C-string values are not Datums. (XXX This *is*
> > > * possible as of 7.3, do we want to do it?)
> >
> > - Is it OK to *now* we can do this?
> > + Is it OK to regard that we can do this *now*?
> >
> >
> In this patch or a different one? Does this comment have anything to do
> with the concern of this thread?

The comment cieted above is in the PostgreSQL source file. The
reason why implicit cast (coercion) don't work for subquery's
results of unkown type, but works for constants is in the comment
cited above.

For "select ''::text = ' '", the internal intermediate
representation of the expression looks something like this,

Equal(text_const(''), unknown_const(' '))

and the second parameter can be immediately converted into
text_const(' ') during expression transformation in parse phase.

On the other hand, the problematic query is represented as
follows,

Equal(text_const(a), unknown_const(b))
for select ''::text as a, ' ' as b

But as described in the comment, PostgreSQL knows what to do but
it couldn't be implemented at the time of.. 7.3? But it seems to
be doable now.

By the way, the following query is similar to the problematic one
but doesn't fail.

SELECT a = b FROM (SELECT ''::text, ' ' UNION ALL SELECT '':text,
' ') AS x(a, b);

This is because parsing of UNION immediately converts constants
of unknown type in the UNION's both arms to text so the top level
select won't be bothered by this problem. But the problematic
query doesn't have appropriate timing to do that until the
function I patched.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: david(dot)g(dot)johnston(at)gmail(dot)com
Cc: pgsql(at)j-davis(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUGS] Failure to coerce unknown type to specific type
Date: 2015-04-23 10:20:43
Message-ID: 20150423.192043.05311305.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Hello,

> > Hello, I think this is a bug.
> >
> > The core of this problem is that coerce_type() fails for Var of
> > type UNKNOWNOID.
> >
> > The comment for the function says that,
> >
> > > * The caller should already have determined that the coercion is
> > possible;
> > > * see can_coerce_type.
> >
> > But can_coerce_type() should say it's possible to convert from
> > unknown to any type as it doesn't see the target node type. I
> > think this as an inconsistency between can_coerce_type and
> > coerce_type. So making this consistent would be right way.
> >
> >
> You have two pieces of contradictory knowledge - how are you picking which
> one to "fix"?

What do you think are contradicting? The patch fixes the problem
that unkown nonconstats cannot get proper conversion and the
query fails.

| SELECT a=b FROM (SELECT ''::text, ' ') x(a,b);
| ERROR: failed to find conversion function from unknown to text

> > Concerning only this issue, putting on-the-fly conversion for
> > unkown nonconstant as attached patch worked for me. I'm not so
> > confident on this, though..
> >
>
> Confident about what aspect - the safety of the patch itself or whether
> the conversion is even a good idea?​

Mainly the former, and how far it covers the other similar but
unkown troubles, generality? in other words.

The fix would not so significant but no processing cost or
complexity added, and would reduce astonishment if it works
safely.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUGS] Failure to coerce unknown type to specific type
Date: 2015-04-23 17:22:19
Message-ID: CAMp0ubehp_j16rhLLuNhut6iVkdaGZyJ2F3u0sRk5wtvCwMCvg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Thu, Apr 23, 2015 at 1:49 AM, David G. Johnston
<david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
> Reading and writing all this I'm convinced you have gotten the idea in your
> mind an expectation of equivalency and consistency where there really is
> little or none from an overall design perspective. And none insofar as
> would merit trying to force the two example queries you provide to behave
> identically. There are a number of things about SQL that one either simply
> lives with or goes through mind contortions to understand the, possibly
> imperfect, reasoning behind. This is one of those things: and while it's
> been fun to do those contortions in the end I am only a little bit better
> off than when I simply accepted the fact the unknown and untyped were
> similar but different (even if I hadn't considered giving them different
> names).

You make some good points, but I disagree for two reasons:

1. This is a postgres issue, not a general SQL issue, so we can't
simply say "SQL is weird" (like we can with a lot of the strange NULL
semantics).
2. Postgres has determined that the unknown column reference "b" in
the query "SELECT a=b FROM (SELECT ''::text, ' ') x(a,b)" can and
should be coerced to text. So postgres has already made that decision.
It's just that when it tries to coerce it, it fails. Even if you
believe that literals and column references are not equivalent, I
don't see any justification at all for this behavior -- postgres
should not decide to coerce it in the first place if it's going to
fail.

Regards,
Jeff Davis


From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUGS] Failure to coerce unknown type to specific type
Date: 2015-04-23 17:39:53
Message-ID: CAKFQuwYJ0XBPo4PSnbERNLb9z9=2OvhVqF5cF+CvC_7+RM_kiQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Thursday, April 23, 2015, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:

> On Thu, Apr 23, 2015 at 1:49 AM, David G. Johnston
> <david(dot)g(dot)johnston(at)gmail(dot)com <javascript:;>> wrote:
> > Reading and writing all this I'm convinced you have gotten the idea in
> your
> > mind an expectation of equivalency and consistency where there really is
> > little or none from an overall design perspective. And none insofar as
> > would merit trying to force the two example queries you provide to behave
> > identically. There are a number of things about SQL that one either
> simply
> > lives with or goes through mind contortions to understand the, possibly
> > imperfect, reasoning behind. This is one of those things: and while it's
> > been fun to do those contortions in the end I am only a little bit better
> > off than when I simply accepted the fact the unknown and untyped were
> > similar but different (even if I hadn't considered giving them different
> > names).
>
> You make some good points, but I disagree for two reasons:
>
> 1. This is a postgres issue, not a general SQL issue, so we can't
> simply say "SQL is weird" (like we can with a lot of the strange NULL
> semantics).

But it is ok to admit that we are weird when we are. Though yes, we are
being inefficient here even with the current behavior taken as desired.

2. Postgres has determined that the unknown column reference "b" in
> the query "SELECT a=b FROM (SELECT ''::text, ' ') x(a,b)" can and
> should be coerced to text.

So the error should be "operator does not exist: text = unknown"...instead
it tries and fails to cast the unknown to text.

Or allow for the coercion to proceed.

There would be fewer obvious errors, and simple cases that fail would begin
to work sensibly, but it still feels like implicit casting from text which
was recently largely removed from the system for cause. Albeit to the
disgruntlement of some and accusations of being draconian by others.

> So postgres has already made that decision.
> It's just that when it tries to coerce it, it fails. Even if you
> believe that literals and column references are not equivalent, I
> don't see any justification at all for this behavior -- postgres
> should not decide to coerce it in the first place if it's going to
> fail.
>
>
This is true - but obviously one solution is to not attempt it in the first
place.

David J.


From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: <pgsql(at)j-davis(dot)com>, <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [BUGS] Failure to coerce unknown type to specific type
Date: 2015-04-23 19:49:29
Message-ID: 55394CC9.5050703@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On 4/23/15 5:07 AM, Kyotaro HORIGUCHI wrote:
> This is because parsing of UNION immediately converts constants
> of unknown type in the UNION's both arms to text so the top level
> select won't be bothered by this problem. But the problematic
> query doesn't have appropriate timing to do that until the
> function I patched.

FWIW, I think that's more accidental than anything.

I'm no expert in our casting and type handling code but I spent a lot of
time stuck in it while working on the variant type, and it seems very
scattered. There's stuff in the actual casting code, there's some stuff
in other parts of parse/plan, there's stuff in individual types (array
and record at least).

Some stuff is handled by casting; some stuff is handled by mangling the
parse tree.

Something else I noticed is we're not consistent with handling typmod
either. I don't remember the exact example I found, but there's cases
involving casting of constants where we ignore it (I don't think it was
as simple as SELECT 1::int::variant(...), but it was something like that).

I don't know how much of this is just historical and how much is
intentional, but it'd be nice if we could consolidate it more.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: Jim(dot)Nasby(at)BlueTreble(dot)com
Cc: david(dot)g(dot)johnston(at)gmail(dot)com, pgsql(at)j-davis(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUGS] Failure to coerce unknown type to specific type
Date: 2015-04-24 03:54:48
Message-ID: 20150424.125448.105731176.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Hello,

At Thu, 23 Apr 2015 14:49:29 -0500, Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> wrote in <55394CC9(dot)5050703(at)BlueTreble(dot)com>
> On 4/23/15 5:07 AM, Kyotaro HORIGUCHI wrote:
> > This is because parsing of UNION immediately converts constants
> > of unknown type in the UNION's both arms to text so the top level
> > select won't be bothered by this problem. But the problematic
> > query doesn't have appropriate timing to do that until the
> > function I patched.
>
> FWIW, I think that's more accidental than anything.

I guess so. It looks not intentional about this behavior at
all.

> I'm no expert in our casting and type handling code but I spent a lot
> of time stuck in it while working on the variant type, and it seems
> very scattered. There's stuff in the actual casting code, there's some
> stuff in other parts of parse/plan, there's stuff in individual types
> (array and record at least).
>
> Some stuff is handled by casting; some stuff is handled by mangling
> the parse tree.

That's what makes me unconfident. But if coercion is always made
by coerce_type and coercion is properly considered at all places
needs it, and this coercion steps is appropriate, we will see
nothing bad. I hope.

> Something else I noticed is we're not consistent with handling typmod
> either. I don't remember the exact example I found, but there's cases
> involving casting of constants where we ignore it (I don't think it
> was as simple as SELECT 1::int::variant(...), but it was something
> like that).

Mmm.. It's a serious bug if explicit casts are ignored. If some
cast procedures does wrong, it should be fixed.

> I don't know how much of this is just historical and how much is
> intentional, but it'd be nice if we could consolidate it more.

Yeah, but it seems tough to do it throughly.

--
Kyotaro Horiguchi
NTT Open Source Software Center


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-04-29 21:50:15
Message-ID: CA+Tgmoav4gswNkA39SYdb3SdSBfrcHmU08=nXLNbxMZFg7UYsw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Wed, Apr 8, 2015 at 9:31 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Jeff Davis <pgsql(at)j-davis(dot)com> writes:
>> Original report and patch by Karl Schnaitter.
>> create table a(u) as select '1';
>
> We should, in fact, fail that to begin with. Unknown-type columns are
> a spectactularly horrid idea.

I agree. So why don't we throw an error when someone tries to create one?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-04-29 22:37:51
Message-ID: 35651.1430347071@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Wed, Apr 8, 2015 at 9:31 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Jeff Davis <pgsql(at)j-davis(dot)com> writes:
>>> Original report and patch by Karl Schnaitter.
>>> create table a(u) as select '1';

>> We should, in fact, fail that to begin with. Unknown-type columns are
>> a spectactularly horrid idea.

> I agree. So why don't we throw an error when someone tries to create one?

We should IMO, but there's been push-back about backwards compatibility
when this has been proposed in the past. But I'd rather break backwards
compatibility to the extent of saying "you can't do that" than to try to
make unknown a full-fledged type, which is what this patch wants to do.

I have some recollection that we'd also put it off pending resolution of
debates about how to handle unknown-type literals in UNIONs and similar
contexts. But poking at such examples right now, the behavior seems
generally reasonable: it seems like we resolve "unknown" as text when
forced to make a decision, but otherwise put it off as long as possible.
So that consideration may be obsolete.

An alternative design that is also worthy of consideration is to force
the created view or table column to be type text rather than unknown.
But that would be more complex to implement, and it's not obviously
superior to saying "hey bud, you need to be more specific as to what
column type you intend here".

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-04-30 13:05:41
Message-ID: CA+TgmobymzvoNZLwk-m_b3XTL8JQowyjLJyM8JkCyjhj5CrcvA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Wed, Apr 29, 2015 at 6:37 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> We should, in fact, fail that to begin with. Unknown-type columns are
>>> a spectactularly horrid idea.
>
>> I agree. So why don't we throw an error when someone tries to create one?
>
> We should IMO, but there's been push-back about backwards compatibility
> when this has been proposed in the past.

I do think that there are probably people who have got cruft lying
around in their database where they've accidentally created views or
tables with unknown-type columns. They are unlikely to be actually
used, but they may exist. If we want to ease the
backward-compatibility pain, maybe we could map unknown -> text, so
that the upgrade still works but changes the column type under the
hood. Or we can just break it. But we should do something.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-04-30 16:06:45
Message-ID: 1439794807.1364461.1430410005618.JavaMail.yahoo@mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I have some recollection that we'd also put it off pending
> resolution of debates about how to handle unknown-type literals
> in UNIONs and similar contexts. But poking at such examples
> right now, the behavior seems generally reasonable: it seems like
> we resolve "unknown" as text when forced to make a decision, but
> otherwise put it off as long as possible. So that consideration
> may be obsolete.

I recall two constructs that we had in production that caused some
pain moving to PostgreSQL.

Here's one:

test=# create table x (d date);
CREATE TABLE
test=# insert into x values (null);
INSERT 0 1
test=# insert into x values (coalesce(null, null));
ERROR: column "d" is of type date but expression is of type text
LINE 1: insert into x values (coalesce(null, null));
^
HINT: You will need to rewrite or cast the expression.

I know these worked in Sybase ASE, SAP DB, MySQL MaxdDB, IBM OS/2
EE's port of DB2, and early versions of MS SQL Server. I have
confirmed (using SQL Fiddle) that it works in Oracle 11g R2, MySQL
5.5 and 5.6, and SQLite (SQL.js). Interestingly, MS SQL Server
2014 now throws this error:

At least one of the arguments to COALESCE must be an expression that is not the NULL constant.

Here the other:

test=# select null as ts union all select null union all select now();
ERROR: UNION types text and timestamp with time zone cannot be matched
LINE 1: ...ect null as ts union all select null union all select now();
^
test=# create table n (id int not null);
CREATE TABLE
test=# insert into n values (1);
INSERT 0 1
test=# select null as ts from n
test-# union all
test-# select null from n
test-# union all
test-# select 1 from n;
ERROR: UNION types text and integer cannot be matched
LINE 5: select 1 from n;
^

This runs in *all* of the above environments.

I don't know of any other database product which chokes on the above.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-01 05:50:50
Message-ID: 1430459450.2499.17.camel@jeff-desktop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Wed, 2015-04-29 at 15:37 -0700, Tom Lane wrote:
> We should IMO, but there's been push-back about backwards compatibility
> when this has been proposed in the past. But I'd rather break backwards
> compatibility to the extent of saying "you can't do that" than to try to
> make unknown a full-fledged type, which is what this patch wants to do.

My intention was to make can_coerce_type and coerce_type consistent --
right now, coerce_type can fail after can_coerce_type returns true.

(I also wanted to improve the composability of subqueries, but I got
enough resistance that I'm setting that argument aside.)

It really has nothing to do with creating real tables with real columns
of type unknown.

=> select u=t from (select 'x' as u, 'y'::text as t) s;
ERROR: failed to find conversion function from unknown to text

That's an elog (not an ereport, just plain elog) for what is a
high-level query compilation error of a fairly sane-looking query, which
seems wrong.

The code comment above the error says that the caller blew it, but as
far as I can tell there's nothing the caller can do about it. That seems
wrong, too.

Regards,
Jeff Davis


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-01 12:42:45
Message-ID: CA+TgmoaUNFTvUnmV3Fj+kK-TyjVN_S0zNSD_hMBYP8kMCo1d3g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Fri, May 1, 2015 at 1:50 AM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Wed, 2015-04-29 at 15:37 -0700, Tom Lane wrote:
>> We should IMO, but there's been push-back about backwards compatibility
>> when this has been proposed in the past. But I'd rather break backwards
>> compatibility to the extent of saying "you can't do that" than to try to
>> make unknown a full-fledged type, which is what this patch wants to do.
>
> My intention was to make can_coerce_type and coerce_type consistent --
> right now, coerce_type can fail after can_coerce_type returns true.
>
> (I also wanted to improve the composability of subqueries, but I got
> enough resistance that I'm setting that argument aside.)
>
> It really has nothing to do with creating real tables with real columns
> of type unknown.
>
> => select u=t from (select 'x' as u, 'y'::text as t) s;
> ERROR: failed to find conversion function from unknown to text
>
> That's an elog (not an ereport, just plain elog) for what is a
> high-level query compilation error of a fairly sane-looking query, which
> seems wrong.
>
> The code comment above the error says that the caller blew it, but as
> far as I can tell there's nothing the caller can do about it. That seems
> wrong, too.

I agree. I'm not sure what the right fix is, but that query should
probably work, and if it must fail, it certainly shouldn't elog().

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-01 17:08:54
Message-ID: 37798.1430500134@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, May 1, 2015 at 1:50 AM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
>> => select u=t from (select 'x' as u, 'y'::text as t) s;
>> ERROR: failed to find conversion function from unknown to text
>>
>> That's an elog (not an ereport, just plain elog) for what is a
>> high-level query compilation error of a fairly sane-looking query, which
>> seems wrong.

> I agree. I'm not sure what the right fix is, but that query should
> probably work, and if it must fail, it certainly shouldn't elog().

What really ought to happen here, IMO, is that the output columns of the
sub-select ought to get resolved to non-unknown types while we are doing
parse analysis of the sub-select.

I believe the core reason why we haven't done this ages ago is that
currently, the "right thing" will happen if you do this:

insert into t (a, b) select x, 'foo' from s;

namely that after the sub-select is parsed, the INSERT will reach down and
coerce the unknown literal to the datatype of b. If we force the output
of the sub-select to text earlier, that will stop working (or at least, it
will only work in cases where there's an assignment cast from text to b's
type). That's a horrid kluge, but if memory serves it's required to
handle some case the SQL spec expects to work.

It's conceivable that we could preserve the desirable aspects of
INSERT/SELECT while eliminating "unknown" outputs from sub-selects,
if INSERT were to pass down some context saying "here are the target
column types I want", and then the resolution rule in SELECT target list
processing would be "if an output column is UNKNOWN, coerce it to the
target type provided by context if any, otherwise coerce to text".
Not sure how much overhead this would add ... probably not very much,
since INSERT will have to identify the target column types sooner or
later.

regards, tom lane


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-02 21:55:56
Message-ID: 1430603756.2499.44.camel@jeff-desktop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Fri, 2015-05-01 at 10:08 -0700, Tom Lane wrote:
> What really ought to happen here, IMO, is that the output columns of the
> sub-select ought to get resolved to non-unknown types while we are doing
> parse analysis of the sub-select.

So, what would happen for something like:

select u+i from (select '1' as u, '2'::int as i) s;

?

I can see two possibilities:

1. Resolve "u" from the subselect as text, and later fail to find a
match for +(text,int).
2. Resolve +(unknown, int) to +(int, int) first, then inform the
subselect that it's looking for an int (in the same way that you propose
the insert pass down some context).

I don't think the second one really makes sense though. For example:

select u+i, u||'suffix'::text from (select '1' as u, '2'::int as i) s;

In that case, "+" would be resolved to +(int, int) and || would be
resolved to ||(text, text). But "u" from the subselect can't be both an
int and text.

Then again, we probably want to fail a query like that anyway. So maybe
it does make sense as long as we can figure out a single type for "u",
and we fail otherwise.

Regards,
Jeff Davis


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-03 17:00:48
Message-ID: 16364.1430672448@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> On Fri, 2015-05-01 at 10:08 -0700, Tom Lane wrote:
>> What really ought to happen here, IMO, is that the output columns of the
>> sub-select ought to get resolved to non-unknown types while we are doing
>> parse analysis of the sub-select.

> So, what would happen for something like:
> select u+i from (select '1' as u, '2'::int as i) s;

I don't think there's any useful alternative to failing on this type of
case. You can't realistically postpone resolution of the subquery output
types long enough for outer-level expression resolution to provide
context. Even if you could, the behavior wouldn't be very well defined,
because (as you note) there might be more than one such expression leading
to contradictory results.

I think it's reasonable to try to let context inform resolution of the
subquery output types where there is exactly one immediate source of
context, such as the INSERT/SELECT case or UNION/INTERSECT/EXCEPT cases.
(Upthread, Kevin whines about our handling of UNION, but that's possibly
fixable.)

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-03 17:13:51
Message-ID: 16778.1430673231@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Kevin Grittner <kgrittn(at)ymail(dot)com> writes:
> I recall two constructs that we had in production that caused some
> pain moving to PostgreSQL.

> Here's one:

> test=# insert into x values (coalesce(null, null));
> ERROR: column "d" is of type date but expression is of type text

I don't have a lot of sympathy for that one. coalesce(null, null)
isn't legal at all per SQL spec, for essentially the reason SQL Server
gives:

> At least one of the arguments to COALESCE must be an expression that is not the NULL constant.

Otherwise the result type of coalesce() isn't well-defined, and there is
nothing at all in the spec that would suggest looking to surrounding
context to decide that. Our choice to resolve it as text rather than
failing is admittedly a bit arbitrary, but I don't find it unreasonable.

> Here the other:

> test=# select null as ts union all select null union all select now();
> ERROR: UNION types text and timestamp with time zone cannot be matched

Yeah, this one is a bit annoying, especially considering we do get it
right in related cases:

regression=# select null as ts union all (select null union all select now());
ts
-------------------------------


2015-05-03 13:05:30.639594-04
(3 rows)

It's possible this could be fixed with some rejiggering of parse analysis
so that matching of output-column types is performed across a whole
set-operation tree at once rather than on binary pairs of leaf queries.

On the other hand, a case could be made that such behavior would also be
in violation of the standard, which is perfectly clear that you process
set operations as binary pairs not holistically. There would certainly
be some compatibility risk involved in changing the resolution behavior
like that, especially for cases where the type choice affects the set
operation's behavior significantly.

regards, tom lane


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-03 17:50:19
Message-ID: 1430675419.2464.0.camel@jeff-desktop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

On Sun, 2015-05-03 at 13:00 -0400, Tom Lane wrote:
> I don't think there's any useful alternative to failing on this type of
> case. You can't realistically postpone resolution of the subquery output
> types long enough for outer-level expression resolution to provide
> context. Even if you could, the behavior wouldn't be very well defined,
> because (as you note) there might be more than one such expression leading
> to contradictory results.
>
> I think it's reasonable to try to let context inform resolution of the
> subquery output types where there is exactly one immediate source of
> context, such as the INSERT/SELECT case or UNION/INTERSECT/EXCEPT cases.
> (Upthread, Kevin whines about our handling of UNION, but that's possibly
> fixable.)

OK, your proposal sounds good to me then.

Regards,
Jeff Davis


From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-03 18:28:07
Message-ID: 1378610475.587034.1430677687479.JavaMail.yahoo@mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Jeff Davis <pgsql(at)j-davis(dot)com> writes:
>> On Fri, 2015-05-01 at 10:08 -0700, Tom Lane wrote:
>>> What really ought to happen here, IMO, is that the output
>>> columns of the sub-select ought to get resolved to non-unknown
>>> types while we are doing parse analysis of the sub-select.
>>
>> So, what would happen for something like:
>> select u+i from (select '1' as u, '2'::int as i) s;
>
> I don't think there's any useful alternative to failing on this
> type of case. You can't realistically postpone resolution of the
> subquery output types long enough for outer-level expression
> resolution to provide context.

According to SQL Fiddle: MySQL, SQL Lite, and MS SQL Server all
come up with 3 as the answer. Oracle 11g R2 (with a one-row table
to avoid the "FROM keyword not found where expected" error) gives
this rather cryptic message:

ORA-00933: SQL command not properly ended

If I pull out the subquery and run it by itself Oracle gives this:

Invalid SQL type: sqlKind = UNINITIALIZED

> Even if you could, the behavior wouldn't be very well defined,
> because (as you note) there might be more than one such
> expression leading to contradictory results.

Do you have a simple example of what you mean? I'm kinda curious
whether the products that manage to handle the above give a sane
error when it becomes ambiguous, or whether they fail in confusing
ways. It's not like I feel that we need to support a statement
just because other products do, but when I respond to complaints
from users trying to migrate to PostgreSQL, it is useful to have
examples to demonstrate the down side of what other products are
doing.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-03 18:42:47
Message-ID: 22462.1430678567@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Kevin Grittner <kgrittn(at)ymail(dot)com> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Jeff Davis <pgsql(at)j-davis(dot)com> writes:
>>> So, what would happen for something like:
>>> select u+i from (select '1' as u, '2'::int as i) s;

>> I don't think there's any useful alternative to failing on this
>> type of case. You can't realistically postpone resolution of the
>> subquery output types long enough for outer-level expression
>> resolution to provide context.

> According to SQL Fiddle: MySQL, SQL Lite, and MS SQL Server all
> come up with 3 as the answer.

Really? The :: syntax is a Postgres-ism, so you surely didn't test
this query on those.

>> Even if you could, the behavior wouldn't be very well defined,
>> because (as you note) there might be more than one such
>> expression leading to contradictory results.

> Do you have a simple example of what you mean?

Jeff already pointed out the issue, but consider

select u+i from (select '1' as u, '2'::int as i) s where u<'foo'::text;

At the very least such a query would behave differently depending on
whether we process the outer query's WHERE clause before or after its
SELECT output list.

regards, tom lane


From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-03 18:56:14
Message-ID: 1850649255.594603.1430679374313.JavaMail.yahoo@mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Kevin Grittner <kgrittn(at)ymail(dot)com> writes:
>> I recall two constructs that we had in production that caused some
>> pain moving to PostgreSQL.
>
>> Here's one:
>
>> test=# insert into x values (coalesce(null, null));
>> ERROR: column "d" is of type date but expression is of type text
>
> I don't have a lot of sympathy for that one. coalesce(null, null)
> isn't legal at all per SQL spec

I don't get that from my reading of the SQL spec. A COALESCE
clause is (and always has been) considered a short form of the CASE
clause (not to be mistaken for a function, for example). The spec
section 6.11 1) c) very explicitly requires
COALESCE(NULL, NULL)
be the exact equivalent of
CASE WHEN NULL IS NOT NULL THEN NULL ELSE NULL END

Yet in PostgreSQL the long form of the CASE clause returns the same
thing as a bare NULL, while the short form (COALESCE) gives an
error. Please indicate what in the spec makes you think that
COALESCE(NULL, NULL) should ever be treated differently from a bare
NULL, because I've looked at the spec and I'm not seeing anything
to support what you said.

> Otherwise the result type of coalesce() isn't well-defined, and there is
> nothing at all in the spec that would suggest looking to surrounding
> context to decide that.

The definition of COALESCE says that when there are different types
the result type should be determined according to section 9.3
(Result of data type combinations). Because the organization of
our code doesn't lend itself well to conforming to the standard in
that regard, I realize that we are dealing in practical
compromises; but let's not pretend the spec is not clear about
this.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-03 19:20:38
Message-ID: 691275181.611927.1430680838706.JavaMail.yahoo@mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:

> The spec section 6.11 1) c) very explicitly requires
> COALESCE(NULL, NULL)
> be the exact equivalent of
> CASE WHEN NULL IS NOT NULL THEN NULL ELSE NULL END
>
> Yet in PostgreSQL the long form of the CASE clause returns the
> same thing as a bare NULL, while the short form (COALESCE) gives
> an error.

Never mind; I got confused in my testing -- they both return NULL
of type text. I'll dig on that one some more.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-03 19:28:03
Message-ID: 23667.1430681283@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Kevin Grittner <kgrittn(at)ymail(dot)com> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I don't have a lot of sympathy for that one. coalesce(null, null)
>> isn't legal at all per SQL spec

> I don't get that from my reading of the SQL spec. A COALESCE
> clause is (and always has been) considered a short form of the CASE
> clause (not to be mistaken for a function, for example). The spec
> section 6.11 1) c) very explicitly requires
> COALESCE(NULL, NULL)
> be the exact equivalent of
> CASE WHEN NULL IS NOT NULL THEN NULL ELSE NULL END

Sure, and that isn't legal per spec either (see below).

> Yet in PostgreSQL the long form of the CASE clause returns the same
> thing as a bare NULL, while the short form (COALESCE) gives an
> error.

Hm? I get the same thing for either variant:

regression=# create table x (d date);
CREATE TABLE
regression=# insert into x values (coalesce(null, null));
ERROR: column "d" is of type date but expression is of type text
LINE 1: insert into x values (coalesce(null, null));
^
HINT: You will need to rewrite or cast the expression.
regression=# insert into x values ( CASE WHEN NULL IS NOT NULL THEN NULL ELSE NULL END );
ERROR: column "d" is of type date but expression is of type text
LINE 1: insert into x values ( CASE WHEN NULL IS NOT NULL THEN NULL ...
^
HINT: You will need to rewrite or cast the expression.

The reason these things aren't legal per spec is that the spec says that
a bare NULL keyword is a <contextually typed value specification> a/k/a
<implicitly typed value specification>, and those are only valid in
situations where a type can be inferred from the *immediate* context.
For example,
insert into x values (null);
is legal because the source of an INSERT can be a
<contextually typed table value constructor> which is a VALUES clause
that can contain a <contextually typed value specification>. On the
other hand, result subexpressions of a CASE are just <value expression>s,
and you will not find any production that allows a bare NULL literal
to be a <value expression>. So far as I can find in SQL:2008, the
only contexts where <contextually typed anything> is syntactically
legal are (1) INSERT, MERGE, and UPDATE source expressions, (2) CAST
source expressions, and (3) table-column DEFAULT expressions, all of
which have a well-defined target type available from the immediately
surrounding semantic context.

In short, the text of the spec only allows a bare NULL literal as
the immediate argument of one of those constructs. Allowing it in
any other context is an extension.

The spec doesn't have a notion of unknown-type literal in the same way
we do, but we've modeled our handling of those as being like bare NULL
literals for type resolution purposes.

Our choice has been to resolve as text in situations where there is no
other info available from immediate context. I agree with the spec
that it would be a bad idea to allow "action at a distance" in the
sense of allowing such info to propagate through multiple levels
of semantic context.

regards, tom lane


From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-03 19:29:55
Message-ID: 2007003193.618271.1430681395388.JavaMail.yahoo@mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:

> they both return NULL of type text.

Section 9.3, which the definition of COALESCE references as the
way to resolve type conflicts *starts* with this:

| Let IDTS be the set of data types specified in an application of
| this Subclause. Let DTS be the set of data types in IDTS
| excluding any data types that are undefined. If the cardinality
| of DTS is 0 (zero), then the result data type is undefined and no
| further Rules of this Subclause are evaluated.

That's pretty unambiguous that a COALESCE(NULL, NULL) clause should
yield a NULL with the data type undefined. So, if we're going by
the SQL spec, that is what it would take to conform. It's hardly a
crisis that we don't conform on this point, but it is a fact.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-03 19:33:51
Message-ID: 23845.1430681631@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Kevin Grittner <kgrittn(at)ymail(dot)com> writes:
> Section 9.3, which the definition of COALESCE references as the
> way to resolve type conflicts *starts* with this:

> | Let IDTS be the set of data types specified in an application of
> | this Subclause. Let DTS be the set of data types in IDTS
> | excluding any data types that are undefined. If the cardinality
> | of DTS is 0 (zero), then the result data type is undefined and no
> | further Rules of this Subclause are evaluated.

> That's pretty unambiguous that a COALESCE(NULL, NULL) clause should
> yield a NULL with the data type undefined.

This is irrelevant, because such a construct fails the syntax rules
and thus we never get to the question of what type should be inferred,
at least not without going outside the spec. See my other reply.

regards, tom lane


From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-03 20:07:13
Message-ID: 1648216624.644283.1430683633940.JavaMail.yahoo@mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> This is irrelevant, because such a construct fails the syntax rules
> and thus we never get to the question of what type should be inferred,
> at least not without going outside the spec. See my other reply.

Yeah, our posts have been crossing a bit. The point about <value
expression> not allowing a NULL literal is valid. I yield on that
regarding COALESCE within the spec. It is an extension to the spec
to allow a NULL literal within a COALESCE clause at all. We would
surely break a lot of working code to forbid it, though. If we
*are* going to allow it, it would be pretty confusing to have it
behave differently that what I previously outlined (regarding the
equivalent long form CASE clause).

The <result> from a long form of the CASE clause explicitly does
explicitly allow an untyped NULL literal, and forcing it to text is
wrong per section 9.3.

To save an extra post -- I did modify the statements in SQL Fiddle
to get to the point where the subquery returned a column without a
type and a column with an int type in the dialect supported. I'm
not sure how that's relevant to the issue about how they resolve
that in the outer query, but I can post the form of the query used
for each product if you think it is germane.

To restate it, this hardly seems like the most important issue to
address; I just don't think the standard gives us much cover here.
What we do for the CASE clause is clearly wrong per spec, and if we
allow a bare NULL in a COALESCE clause it would be crazy not to
have the behavior match CASE. But it is clearly good form to
always cast a NULL literal to some type, and that is a workaround
which should not be too painful for most people. We shouldn't rush
to do anything big here, but we should recognize where we stand.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-03 20:11:03
Message-ID: 24815.1430683863@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

I wrote:
> The reason these things aren't legal per spec is that the spec says that
> a bare NULL keyword is a <contextually typed value specification> a/k/a
> <implicitly typed value specification>, and those are only valid in
> situations where a type can be inferred from the *immediate* context.
> For example,
> insert into x values (null);
> is legal because the source of an INSERT can be a
> <contextually typed table value constructor> which is a VALUES clause
> that can contain a <contextually typed value specification>. On the
> other hand, result subexpressions of a CASE are just <value expression>s,
> and you will not find any production that allows a bare NULL literal
> to be a <value expression>. So far as I can find in SQL:2008, the
> only contexts where <contextually typed anything> is syntactically
> legal are (1) INSERT, MERGE, and UPDATE source expressions, (2) CAST
> source expressions, and (3) table-column DEFAULT expressions, all of
> which have a well-defined target type available from the immediately
> surrounding semantic context.

On doing a more thorough search, I see that I missed one reference:
the result expression(s) of a CASE construct are defined as

<result> ::= <result expression>
| NULL
<result expression> ::= <value expression>

which might seem to put the lie to my thesis, except that in the
Syntax Rules we read

At least one <result> in a <case specification> shall specify
a <result expression>.

That means this is legal:

CASE WHEN ... THEN 42 ELSE NULL END;

but this isn't:

CASE WHEN ... THEN NULL ELSE NULL END;

So the core point stands. The spec doesn't allow a bare NULL literal
anywhere that its type can't be determined from the most closely nested
semantic context.

It's a bit weird that they encode this as a syntax rule rather than a
semantic rule, but that's what they did.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-03 20:28:54
Message-ID: 25218.1430684934@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Kevin Grittner <kgrittn(at)ymail(dot)com> writes:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> This is irrelevant, because such a construct fails the syntax rules
>> and thus we never get to the question of what type should be inferred,
>> at least not without going outside the spec. See my other reply.

> ... It is an extension to the spec
> to allow a NULL literal within a COALESCE clause at all. We would
> surely break a lot of working code to forbid it, though.

Actually, per my last reply, what's illegal per spec is for *all* the arms
to be NULL literals (so SQL Server is indeed enforcing the spec exactly).
As long as there's an arm with an identifiable type, the CASE's result
type can be determined.

> If we
> *are* going to allow it, it would be pretty confusing to have it
> behave differently that what I previously outlined (regarding the
> equivalent long form CASE clause).

AFAICT, we do treat them the same; can you provide an example where
we don't?

> To save an extra post -- I did modify the statements in SQL Fiddle
> to get to the point where the subquery returned a column without a
> type and a column with an int type in the dialect supported. I'm
> not sure how that's relevant to the issue about how they resolve
> that in the outer query, but I can post the form of the query used
> for each product if you think it is germane.

It may not be. I suspect what is really going on is that they're
resolving the sub-SELECT output column to TEXT (or local equivalent
idiom) and then being laxer than we are about coercing that type to
other types. It would be interesting to try variants of the

select u+i from (select '1' as u, '2'::int as i) s where u<'foo'::text;

example to see what they do if the column has to be converted to two
mutually inconsistent types, assuming you can find candidate types
in each system. Another idea would be to try things like

select u+i from (select 'bar' as u, '2'::int as i) s where u<'foo'::text;

and see exactly what error gets thrown.

> To restate it, this hardly seems like the most important issue to
> address; I just don't think the standard gives us much cover here.

I stand by my opinion that the cases that are controversial here
are all illegal per spec. We may well want to allow them on usability
grounds, but what the spec does *not* provide any cover for is claiming
that the spec requires some particular non-error interpretation.

regards, tom lane


From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Postgres-Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Failure to coerce unknown type to specific type
Date: 2015-05-03 21:15:27
Message-ID: 946937656.655853.1430687727733.JavaMail.yahoo@mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-bugs pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Kevin Grittner <kgrittn(at)ymail(dot)com> writes:

>> If we
>> *are* going to allow it, it would be pretty confusing to have it
>> behave differently that what I previously outlined (regarding the
>> equivalent long form CASE clause).
>
> AFAICT, we do treat them the same; can you provide an example where
> we don't?

No, I was just arguing that if we change the CASE clause to return
untyped NULL, then the COALESCE clause should follow suit.

> It would be interesting to try variants of the
>
> select u+i from (select '1' as u, '2'::int as i) s where u<'foo'::text;
>
> example to see what they do if the column has to be converted to two
> mutually inconsistent types, assuming you can find candidate types
> in each system.

I finagled a syntax that was accepted by all servers on SQL Fiddle
and got a row with a 3 without the WHERE clause from all products
except PostgreSQL:

-- Build Schema
create table onerow (n int not null);
insert into onerow values (1);
-- Run SQL
select u+i from (select cast('1' as char) as u, 2 as i from onerow) s;

Then I added that WHERE clause.

select u+i from (select cast('1' as char) as u, 2 as i from onerow) s
where u<cast('foo' as char);

Much to my amazement, all of them *still* return a row with the
value 3, without error. I'm still picking my jaw up from the floor.

I'm OK with being in the minority on that!

> Another idea would be to try things like
>
> select u+i from (select 'bar' as u, '2'::int as i) s where u<'foo'::text;
>
> and see exactly what error gets thrown.

I changed '1' to 'bar' in the above code.

MySQL and SQL Lite return a row with a 2.

Oracle throws an error: ORA-01722: invalid number

MS SQL Server throws an error: Conversion failed when converting the varchar value 'bar ' to data type int.

Yes, my literal was three characters and the error message added a space.

>> To restate it, this hardly seems like the most important issue to
>> address; I just don't think the standard gives us much cover here.
>
> I stand by my opinion that the cases that are controversial here
> are all illegal per spec.

With that last bit you pointed out, I now agree.

> We may well want to allow them on usability
> grounds, but what the spec does *not* provide any cover for is claiming
> that the spec requires some particular non-error interpretation.

ok

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company