Re: MySQL search query is not executing in Postgres DB

Lists: pgsql-hackers
From: premanand <kottiprem(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 05:33:37
Message-ID: 1329456817046-5491531.post@n5.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

In MySQL the below query is executing properly.

SELECT * FROM <Table-name> WHERE (Table.ID LIKE '1%')

But when i try to execute the above query in Postgres, i get the following
Exception "org.postgresql.util.PSQLException: ERROR: operator does not
exist: integer ~~ unknown Hint: No operator matches the given name and
argument type(s). You might need to add explicit type casts".

If i convert the same query " SELECT * FROM <Table-name> WHERE CAST(Table.ID
as TEXT) LIKE '1%' ". This gets executed directly in Postgres DB. But i need
some query which implicitly type cast in DB, which allows me to execute the
MySQL query without any Exception. Because i remember there is a way for
integer to boolean implicit type cast. Please refer the following link.
http://archives.postgresql.org/pgsql-general/2011-01/msg00866.php

Thanks in advance.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/MySQL-search-query-is-not-executing-in-Postgres-DB-tp5491531p5491531.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: premanand <kottiprem(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 07:45:46
Message-ID: 4F3E05AA.4000408@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 17.02.2012 07:33, premanand wrote:
> In MySQL the below query is executing properly.
>
> SELECT * FROM<Table-name> WHERE (Table.ID LIKE '1%')
>
> But when i try to execute the above query in Postgres, i get the following
> Exception "org.postgresql.util.PSQLException: ERROR: operator does not
> exist: integer ~~ unknown Hint: No operator matches the given name and
> argument type(s). You might need to add explicit type casts".
>
> If i convert the same query " SELECT * FROM<Table-name> WHERE CAST(Table.ID
> as TEXT) LIKE '1%' ". This gets executed directly in Postgres DB. But i need
> some query which implicitly type cast in DB, which allows me to execute the
> MySQL query without any Exception. Because i remember there is a way for
> integer to boolean implicit type cast. Please refer the following link.
> http://archives.postgresql.org/pgsql-general/2011-01/msg00866.php

You can use CREATE CAST
(http://www.postgresql.org/docs/current/static/sql-createcast.html). Or
you can create the operator "integer ~~ text" with CREATE FUNCTION +
CREATE OPERATOR. The latter would match fewer cases, which would reduce
the chances of introducing subtle bugs elsewhere in your application.

Of course, the best fix would be to change your queries. It's quite
sloppy to rely on "integer LIKE text" without an explicit cast in the query.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: premanand <kottiprem(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 09:22:32
Message-ID: 1329470552610-5491947.post@n5.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

Thanks for the reply. It is not possible to change the query in our
environment. So we need to do casting. I'm a new bee, so it will be more
helpful if you give me some simple examples.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/MySQL-search-query-is-not-executing-in-Postgres-DB-tp5491531p5491947.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: premanand <kottiprem(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 12:53:39
Message-ID: 4F3E4DD3.3040109@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/17/2012 04:22 AM, premanand wrote:
> Hi,
>
> Thanks for the reply. It is not possible to change the query in our
> environment. So we need to do casting. I'm a new bee, so it will be more
> helpful if you give me some simple examples.
>

That's not what this list is for, You should be asking in pgsql-general,
not here, (so don't just reply to this email). But in any case, you
should take Heikki's advice seriously. What you are doing is broken. You
should fix it and not just say "Oh, we can't".

cheers

andrew


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 15:16:23
Message-ID: CA+TgmoYq3idpuuMQo2_UOCKOudWTM+-_gyLN8mCxXu+=31sKYw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 17, 2012 at 2:45 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Of course, the best fix would be to change your queries. It's quite sloppy
> to rely on "integer LIKE text" without an explicit cast in the query.

Why?

I understand why it's a bad idea to rely on integer = text doing
anything sane - is that text equality or numeric equality?

And in theory the same issue could exist here if there were another
meaning for LIKE. But there isn't.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 15:27:35
Message-ID: 26301.1329492455@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I understand why it's a bad idea to rely on integer = text doing
> anything sane - is that text equality or numeric equality?

> And in theory the same issue could exist here if there were another
> meaning for LIKE. But there isn't.

Really? LIKE is just a different spelling for operator ~~, and there is
nothing stopping people from defining more operators named that, not to
mention that there are already four of them in core PG. In particular
the bytea ~~ bytea version is explicitly intended to provide a LIKE
implementation for non-text types. I see some operators named ~~ in
contrib as well.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 16:58:10
Message-ID: CA+TgmoY6W50Fp1Gwg0cp_TLiwyctpRXOSto9L2B9xX_xb2ADoQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 17, 2012 at 10:27 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I understand why it's a bad idea to rely on integer = text doing
>> anything sane - is that text equality or numeric equality?
>
>> And in theory the same issue could exist here if there were another
>> meaning for LIKE.  But there isn't.
>
> Really?  LIKE is just a different spelling for operator ~~, and there is
> nothing stopping people from defining more operators named that, not to
> mention that there are already four of them in core PG.  In particular
> the bytea ~~ bytea version is explicitly intended to provide a LIKE
> implementation for non-text types.  I see some operators named ~~ in
> contrib as well.

As far as I know, the fact that LIKE is another spelling for ~~ is a
PostgreSQL implementation detail with which users ought not to be
burdened. But even given that, there are many situations where we
currently complain about ambiguity even though the ambiguity is
entirely hypothetical: there COULD be a ~~(int,text) operator, but
there actually isn't. Now, I'll admit that this is not an easy
problem to solve without giving up something somewhere, since it's
clearly undesirable for the meaning of something that worked before to
silently change when and if someone defines a new operator.

But on the other hand, I think that labeling the user's coding
practices as sloppy is a cop-out. There are many, many people running
on not only MySQL but also on Oracle who have written large amounts of
code that requires fewer type casts on those systems than it does on
PostgreSQL. That fact presents serious migration challenges for such
users when they move over to PostgreSQL. Labeling the code as the
problem excuses us from the need to think about how to make our type
system work any better than it does today. Boo, hiss. If we're not
doing anything about this because we have carefully examined the
subject and decided that this is a trade-off we must make because
MySQL or Oracle doesn't support XYZ and we do, then let's give that
explanation to the user instead of telling them the problem is that
their code stinks. Otherwise, we have some soul-searching to do, as
time permits.

I remember there was a time when you couldn't say "SELECT a x FROM
foo" in PostgreSQL. We told people that it was because our syntax was
more flexible - we have postfix operators, or something. I no longer
remember the details of where the grammar conflict came from. But
somebody (probably you or Hiroshi Saito, judging by the commit log)
figured out a way to get around the problem, and now that syntax works
fine in 99% of the cases people care about. That is a huge usability
improvement for people coming from other database systems where AS was
never required. I don't know whether a similar improvement is
possible in this area, but we're certainly not going to get there by
labeling the user's expectations as unreasonable. I don't think they
are, and the people who wrote MySQL and Oracle evidently agree.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 17:14:08
Message-ID: 28679.1329498848@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I remember there was a time when you couldn't say "SELECT a x FROM
> foo" in PostgreSQL. We told people that it was because our syntax was
> more flexible - we have postfix operators, or something.

Which it was, and yes that was the reason. We eventually thought of a
kluge solution that lets you omit "AS" 90% of the time, which is better
than nothing; but I doubt it would ever have been accepted if it weren't
a matter of improving standards compliance. I am pretty sure that the
SQL spec doesn't say that you should be able to apply LIKE directly to
an integer, so that issue isn't comparable to this one.

> I don't know whether a similar improvement is
> possible in this area, but we're certainly not going to get there by
> labeling the user's expectations as unreasonable. I don't think they
> are, and the people who wrote MySQL and Oracle evidently agree.

The people who wrote MySQL had very poor taste in a lot of areas, and
we are not going to blindly follow their lead. Oracle is not a terribly
presentable system either. Having said that, I don't object to any
clean improvements we can think of in this area --- but "make it work
more like MySQL" had better not be the only argument for it.

regards, tom lane


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "premanand" <kottiprem(at)gmail(dot)com>,<pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 17:27:03
Message-ID: 4F3E398702000025000457EE@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> As far as I know, the fact that LIKE is another spelling for ~~ is
a
> PostgreSQL implementation detail with which users ought not to be
> burdened.

+1

LIKE is well defined by the standard, and the ~~ operator is not
mentioned there anywhere.

On the other hand, LIKE is defined to either work on character
strings or binary strings -- there is nothing in the standard about
using it with other data types or automatic casting to support that.
Any such support would be a non-standard PostgreSQL extension. As
such, anyone wanting to write portable code should avoid that by
explicit casting (which should be portable).

> I remember there was a time when you couldn't say "SELECT a x FROM
> foo" in PostgreSQL.

That was in violation of the SQL standard, which makes AS an allowed
but optional noise word.

In spite of all that, perhaps we should have a compatibility
extension which provides more casts, in an attempt to ease the
transition from other databases? Personally, I like having the
default behavior this strict -- I think it reduces the chance of
errors, reduces the chances of accidentally having type mismatches
which defeat optimizations, and improves portability. But I have
nothing against allowing someone to give all that up to ease
transition from another product. My biggest concern is whether we
might "paint ourselves into a corner" by including such an
extension. It might shut off avenues for other cool features
because anyone using the extension would have conflicts. Perhaps
such a thing would be more appropriate on PGXN with admonitions that
it was only intended to ease conversion and that users were
encouraged to migrate to standard syntax as soon as possible.

-Kevin


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 17:29:36
Message-ID: CAFj8pRBXdFH1S-8KhJqpzcBJ=FUQT4+oss0MuoFdEc=AmpW7+w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2012/2/17 Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>:
> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> As far as I know, the fact that LIKE is another spelling for ~~ is
> a
>> PostgreSQL implementation detail with which users ought not to be
>> burdened.
>
> +1
>
> LIKE is well defined by the standard, and the ~~ operator is not
> mentioned there anywhere.
>
> On the other hand, LIKE is defined to either work on character
> strings or binary strings -- there is nothing in the standard about
> using it with other data types or automatic casting to support that.
> Any such support would be a non-standard PostgreSQL extension. As
> such, anyone wanting to write portable code should avoid that by
> explicit casting (which should be portable).
>
>> I remember there was a time when you couldn't say "SELECT a x FROM
>> foo" in PostgreSQL.
>
> That was in violation of the SQL standard, which makes AS an allowed
> but optional noise word.
>
> In spite of all that, perhaps we should have a compatibility
> extension which provides more casts, in an attempt to ease the
> transition from other databases?  Personally, I like having the
> default behavior this strict -- I think it reduces the chance of
> errors, reduces the chances of accidentally having type mismatches
> which defeat optimizations, and improves portability.  But I have
> nothing against allowing someone to give all that up to ease
> transition from another product.  My biggest concern is whether we
> might "paint ourselves into a corner" by including such an
> extension.  It might shut off avenues for other cool features
> because anyone using the extension would have conflicts.  Perhaps
> such a thing would be more appropriate on PGXN with admonitions that
> it was only intended to ease conversion and that users were
> encouraged to migrate to standard syntax as soon as possible.
>

+1

Pavel

> -Kevin
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 17:59:40
Message-ID: CA+TgmoaJC_+dPTAeiQ6=xkONJBNM4s90TCgY0tJRZ001zQhGUg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 17, 2012 at 12:14 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I don't know whether a similar improvement is
>> possible in this area, but we're certainly not going to get there by
>> labeling the user's expectations as unreasonable.  I don't think they
>> are, and the people who wrote MySQL and Oracle evidently agree.
>
> The people who wrote MySQL had very poor taste in a lot of areas, and
> we are not going to blindly follow their lead.  Oracle is not a terribly
> presentable system either.  Having said that, I don't object to any
> clean improvements we can think of in this area --- but "make it work
> more like MySQL" had better not be the only argument for it.

Hey, if I preferred MySQL to PostgreSQL, I wouldn't be here. That
doesn't mean that there are exactly 0 things that they do better than
we do. What I'm unhappy about isn't that we're not bug-compatible
with MySQL, but rather that, in this case, I like MySQL's behavior
better, and the fact that they've made it work means it's not
theoretically impossible. It just involves some trade-off that I
don't believe we've thought about hard enough.

Standards compliance is a means to an end. The purpose of having
standards is to allow for interoperable implementations of the same
underlying functionality. That doesn't mean we should copy
non-standard warts, of course, but it isn't obvious to me that this is
a wart. No one has suggested that the user's actual query has more
than one reasonable interpretation, so complaining that it's ambiguous
doesn't impress me very much.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 18:21:27
Message-ID: 4F3E9AA7.3030107@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/17/2012 12:59 PM, Robert Haas wrote:
> On Fri, Feb 17, 2012 at 12:14 PM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> I don't know whether a similar improvement is
>>> possible in this area, but we're certainly not going to get there by
>>> labeling the user's expectations as unreasonable. I don't think they
>>> are, and the people who wrote MySQL and Oracle evidently agree.
>> The people who wrote MySQL had very poor taste in a lot of areas, and
>> we are not going to blindly follow their lead. Oracle is not a terribly
>> presentable system either. Having said that, I don't object to any
>> clean improvements we can think of in this area --- but "make it work
>> more like MySQL" had better not be the only argument for it.
> Hey, if I preferred MySQL to PostgreSQL, I wouldn't be here. That
> doesn't mean that there are exactly 0 things that they do better than
> we do. What I'm unhappy about isn't that we're not bug-compatible
> with MySQL, but rather that, in this case, I like MySQL's behavior
> better, and the fact that they've made it work means it's not
> theoretically impossible. It just involves some trade-off that I
> don't believe we've thought about hard enough.
>
> Standards compliance is a means to an end. The purpose of having
> standards is to allow for interoperable implementations of the same
> underlying functionality. That doesn't mean we should copy
> non-standard warts, of course, but it isn't obvious to me that this is
> a wart. No one has suggested that the user's actual query has more
> than one reasonable interpretation, so complaining that it's ambiguous
> doesn't impress me very much.

Assuming we had the cast, What would "intval like '1%'" mean? You're
going to match 1, 10..19, 100..199, 1000..1999 ...

Now maybe there's a good use for such a test, but I'm have a VERY hard
time imagining what it might be.

cheers

andrew


From: Jeff MacDonald <jam(at)zoidtechnologies(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 18:31:53
Message-ID: 201202171331.53638.jam@zoidtechnologies.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greetings,

> > My biggest concern is whether we
> > might "paint ourselves into a corner" by including such an
> > extension. It might shut off avenues for other cool features
> > because anyone using the extension would have conflicts. Perhaps
> > such a thing would be more appropriate on PGXN with admonitions that
> > it was only intended to ease conversion and that users were
> > encouraged to migrate to standard syntax as soon as possible.
>

IMHO if you give someone syntax surgar like this and tell them to "fix it ASAP"
it will never get fixed properly. I'm all for getting new users to pgsql, but
this is not the way to do it.

Regards,
J


From: Christopher Browne <cbbrowne(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 19:28:21
Message-ID: CAFNqd5Un1UQour3PKB1Lerq0FyO-ctnMRF7QVXAVX_5JE=LpaA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 17, 2012 at 1:21 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> Assuming we had the cast, What would "intval like '1%'" mean? You're going
> to match 1, 10..19, 100..199, 1000..1999 ...
>
> Now maybe there's a good use for such a test, but I'm have a VERY hard time
> imagining what it might be.

Well, I can readily see someone encoding parts of their application
into this sort of encoding, so that, for instance, all customer
numbers beginning with "1" are deemed to be "internal accounts."

It's a pretty terrible approach to encoding data for an application;
it leads to stuff like the "oops, once they have generated 30,000
invoices, the system reaches doomsday and can't work anymore."
http://thedailywtf.com/Articles/Ive-Got-Your-Number.aspx

But nothing prevents users from designing their applications to encode
information in their ID prefixes.

And I have *zero* confidence that for PostgreSQL to rule out "LIKE
'1%'" is preventing those designs from getting built...

When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


From: Don Baccus <dhogaza(at)pacifier(dot)com>
To: Christopher Browne <cbbrowne(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 19:39:17
Message-ID: F18C1E11-E508-4D66-B3CB-6763B89D6598@pacifier.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Feb 17, 2012, at 11:28 AM, Christopher Browne wrote:
> And I have *zero* confidence that for PostgreSQL to rule out "LIKE
> '1%'" is preventing those designs from getting built...
>
> When confronted by a difficult problem, solve it by reducing it to the
> question, "How would the Lone Ranger handle this?"

He would turn to Tonto, who undoubtably would advise:

If you want to treat an integer like a string, figure out how to convert your integer into a string …

The original query strikes me as being similar to expecting the Lone Ranger's six-shooter to be capable of shooting Tonto's arrows.

----
Don Baccus
http://donb.photo.net
http://birdnotes.net
http://openacs.org


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Christopher Browne <cbbrowne(at)gmail(dot)com>
Cc: premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 19:44:03
Message-ID: 4F3EAE03.3070300@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/17/2012 02:28 PM, Christopher Browne wrote:
> On Fri, Feb 17, 2012 at 1:21 PM, Andrew Dunstan<andrew(at)dunslane(dot)net> wrote:
>> Assuming we had the cast, What would "intval like '1%'" mean? You're going
>> to match 1, 10..19, 100..199, 1000..1999 ...
>>
>> Now maybe there's a good use for such a test, but I'm have a VERY hard time
>> imagining what it might be.
> Well, I can readily see someone encoding parts of their application
> into this sort of encoding, so that, for instance, all customer
> numbers beginning with "1" are deemed to be "internal accounts."
>
> It's a pretty terrible approach to encoding data for an application;
> it leads to stuff like the "oops, once they have generated 30,000
> invoices, the system reaches doomsday and can't work anymore."
> http://thedailywtf.com/Articles/Ive-Got-Your-Number.aspx
>
> But nothing prevents users from designing their applications to encode
> information in their ID prefixes.
>
> And I have *zero* confidence that for PostgreSQL to rule out "LIKE
> '1%'" is preventing those designs from getting built...
>
> When confronted by a difficult problem, solve it by reducing it to the
> question, "How would the Lone Ranger handle this?"
>

Strings of digits used that way should not be stored in numeric fields
at all, IMNSHO, just as zip codes and phone numbers should not be. They
should be text in the first place, and if the OP had done that he
wouldn't have had any difficulty about this. I hope that's what the Lone
Ranger would do ...

cheers

andrew


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 19:52:20
Message-ID: CA+TgmoYwW449y6iEwUyUX_ThPgTvToXSbamoujwFpLV_JQsXrA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 17, 2012 at 1:21 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> Assuming we had the cast, What would "intval like '1%'" mean? You're going
> to match 1, 10..19, 100..199, 1000..1999 ...

Yep.

> Now maybe there's a good use for such a test, but I'm have a VERY hard time
> imagining what it might be.

Dunno. Presumably the test is meaningful for the OP's IDs, or he
wouldn't have written the query that way.

The time I got bitten by this was actually with LPAD(), rather than
LIKE. I had a serial column which I wanted to use to generate record
identifiers off of a sequence: B00001, B00002, B00003, B00004, etc.
So I wrote 'B' || lpad(id, 5, '0'). When the implicit casting
changes came along, I had to go back and change that to id::text.
Fortunately that wasn't a lot of work, especially since by that time I
was following pgsql-hackers enough to understand immediately why it
broke, but it did and does seem unnecessary, because there is no real
ambiguity there. Yeah, there could be ambiguity, if someone created
another LPAD() function... but no one did.

Here's yet another case where the current rules are thoroughly disagreeable.

rhaas=# create or replace function z(smallint) returns smallint as
$$select $1+1$$ language sql;
ERROR: return type mismatch in function declared to return smallint
DETAIL: Actual return type is integer.
CONTEXT: SQL function "z"

So cast the result from an integer to a smallint. What's the big deal?

But, OK, I'll do it your way:

rhaas=# create or replace function z(smallint) returns smallint as
$$select $1+1::smallint$$ language sql;
CREATE FUNCTION
rhaas=# select z(1);
ERROR: function z(integer) does not exist
LINE 1: select z(1);
^
HINT: No function matches the given name and argument types. You
might need to add explicit type casts.

Come on, really? Note that the above example works without casts if
you use int *or* bigint *or* numeric, but not smallint. That could be
fixed by causing sufficiently-small integers to lex as smallints, but
if you think implicit datatype coercions are evil, you ought to be
outraged by the fact that we are already going out of our way to blur
the line between int, bigint, and numeric. We let people write 2.0 +
3 and get 5.0 - surely it's only a short step from there to human
sacrifice, cats and dogs living together... mass hysteria! I mean,
the whole reason for rejecting integer = text is that we aren't sure
whether to coerce the text to an integer or the integer to a string,
and it's better to throw an error than to guess. But in the case of
2.0 + 3, we feel 100% confident in predicting that the user will be
happy to convert the integer to a numeric rather than the numeric to
an integer, so no error. We do that because we know that the domain
of numeric is a superset of the domain of integer, or in other words,
we are using context clues to deduce what the user probably meant
rather than forcing them to be explicit about it.

And yet in other cases, such as LIKE or LPAD with an integer rather
than a string, or just about anything involving smallint, the user is
required to be explicit, even though in most cases there is only one
reasonable implementation of the query. What is the value in erroring
out on a query that's not actually ambiguous? Numerous people here
are defending that behavior as if it were somehow morally superior,
but I think it's merely an accident of how the post-8.3 type system
happens to work. On pre-8.3 systems this all works differently, and
some of those old behaviors are worse than what we have now, while
others are better. If we really believed that implicit casts any form
were evil, we would have removed them entirely instead of trimming
them back. I don't see why it's heretical to suggest that the 8.3
casting changes brought us to exactly that point in the universe where
everything is perfect and nothing can be further improved; does anyone
seriously believe that?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 20:02:55
Message-ID: 2351.1329508975@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Assuming we had the cast, What would "intval like '1%'" mean? You're
> going to match 1, 10..19, 100..199, 1000..1999 ...

> Now maybe there's a good use for such a test, but I'm have a VERY hard
> time imagining what it might be.

Yeah, that's another point worth asking. Coercing an integer to text
and then doing LIKE on it is an extremely inefficient way to do what's
probably the wrong thing anyhow. I would be interested to know exactly
why the OP wants to do this, and whether it couldn't be done better with
some arithmetical test.

In this connection it's worth remembering that when we removed some
implicit casts in 8.3, we heard lots of yelling, and we also heard from
lots of people who found bugs in their SQL code that the implicit casts
had masked. Allowing LIKE-on-anything could be a rerun of that mess.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Christopher Browne <cbbrowne(at)gmail(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 20:04:59
Message-ID: CA+TgmobR1+M4e6LGkMxHc96CR946i6hfXqb+ShAgGJp8+ZentQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 17, 2012 at 2:44 PM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
> Strings of digits used that way should not be stored in numeric fields at
> all, IMNSHO, just as zip codes and phone numbers should not be. They should
> be text in the first place, and if the OP had done that he wouldn't have had
> any difficulty about this. I hope that's what the Lone Ranger would do ...

The argument isn't about whether the user made the right design
choices; it's about whether he should be forced to insert an explicit
type cast to get the query to do what it is unambiguously intended to
do. I don't believe it's entirely self-evident that it's always
better to store strings of integers in a text column rather than as an
integer or bigint - integers are pretty fast and compact. Even
granting that a text field would have been better, nobody's arguing
that you can't do LIKE against an integer column; we're just
discussing what syntax is required to make the user's intent
acceptably clear.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-17 20:05:43
Message-ID: 4F3EB317.3060800@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/17/2012 02:52 PM, Robert Haas wrote:
> If we really believed that implicit casts any form were evil, we would
> have removed them entirely instead of trimming them back. I don't see
> why it's heretical to suggest that the 8.3 casting changes brought us
> to exactly that point in the universe where everything is perfect and
> nothing can be further improved; does anyone seriously believe that?

I don't believe we are necessarily at a perfect place, nor have I said
it, nor has anyone else that I'm aware of. Neither am I opposed to
implementing MySQL features (or doing them better) when appropriate.

But I do believe that a test for "intval like '1%'" is very likely to
come from a broken design.

cheers

andrew


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-18 00:12:37
Message-ID: 4F3EECF5.4050408@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 2/17/12 12:04 PM, Robert Haas wrote:
> The argument isn't about whether the user made the right design
> choices; it's about whether he should be forced to insert an explicit
> type cast to get the query to do what it is unambiguously intended to
> do.

I don't find INTEGER LIKE '1%' to be unambiguous.

Prior to this discussion, if I had run across such a piece of code, I
couldn't have told you what it would do in MySQL without testing.

What *does* it do in MySQL?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Don Baccus <dhogaza(at)pacifier(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-18 00:19:48
Message-ID: 80D838BF-9DA4-4A28-B115-DAEC3E3FBF4A@pacifier.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Feb 17, 2012, at 4:12 PM, Josh Berkus wrote:

> On 2/17/12 12:04 PM, Robert Haas wrote:
>> The argument isn't about whether the user made the right design
>> choices; it's about whether he should be forced to insert an explicit
>> type cast to get the query to do what it is unambiguously intended to
>> do.
>
> I don't find INTEGER LIKE '1%' to be unambiguous.
>
> Prior to this discussion, if I had run across such a piece of code, I
> couldn't have told you what it would do in MySQL without testing.

If someone showed it to me without mention MySQL I'd say:

"oh, it's an error".

>
> What *does* it do in MySQL?

And knowing it's MySQL … "oh, probably *not* an error", but like you … I'd be mystified.

Should 01 like '0%' match?

----
Don Baccus
http://donb.photo.net
http://birdnotes.net
http://openacs.org


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-18 16:51:22
Message-ID: 8fa4bac539df891a13b01f98806190d7@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

> The time I got bitten by this was actually with LPAD(), rather than LIKE.

+1. This is one of the functions that gave some of our clients
real trouble when 8.3 came out.

> If we really believed that implicit casts any form were evil, we
> would have removed them entirely instead of trimming them back.
> I don't see why it's heretical to suggest that the 8.3 casting
> changes brought us to exactly that point in the universe where
> everything is perfect and nothing can be further improved; does
> anyone seriously believe that?

Agreed (although the last bit is a bit of a straw man). The idea
in this thread of putting some implicit casts into an extension
or other external package is not a very good one, either. Let's
apply some common sense instead, and stick to our guns on the ones
where we feel there could honestly be serious app consequences and
thus we encourage^H^Hforce people to change their code (or write all
sorts of custom casts and functions). I think the actual number of
such app circumstances is rather small, but my clients are not your*
clients, so who knows? In other words, I'll concede int==text, but
really need a strong argument for conceding things like LPAD.

* Your = everyone else, not just M. Haas.

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201202181145
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAk8/1usACgkQvJuQZxSWSsjE6ACdHy31jpHUsXo5juvXcCkzKpGH
RQAAoM/uTbM/JBkDiDjrsI1Blyg3DsWf
=7CA4
-----END PGP SIGNATURE-----


From: Rob Wultsch <wultsch(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-18 20:57:15
Message-ID: CAGdn2uj=sPnr8WBPxhD5u-JROpgBK21rbM0cdrRzS2XQJpqktw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 17, 2012 at 4:12 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 2/17/12 12:04 PM, Robert Haas wrote:
>> The argument isn't about whether the user made the right design
>> choices; it's about whether he should be forced to insert an explicit
>> type cast to get the query to do what it is unambiguously intended to
>> do.
>
> I don't find INTEGER LIKE '1%' to be unambiguous.
>
> Prior to this discussion, if I had run across such a piece of code, I
> couldn't have told you what it would do in MySQL without testing.
>
> What *does* it do in MySQL?
>

IIRC it casts each INTEGER (without any left padding) to text and then
does the comparison as per normal. Comparison of dissimilar types are
a recipe for full table scans and unexpected results. A really good
example is
select * from employees where first_name=5;
vs
select * from employees where first_name='5';

Where first_name is string the queries above have very different
behaviour in MySQL. The first does a full table scan and coerces
first_name to an integer (so '5adfs' -> 5) while the second can use an
index as it is normal string comparison. I have seen this sort of
things cause significant production issues several times.*

I have seen several companies use comparisons of dissimilar data types
as part of their stump the prospective DBA test and they stump lots of
folks.

--
Rob Wultsch
wultsch(at)gmail(dot)com


From: Don Baccus <dhogaza(at)pacifier(dot)com>
To: Rob Wultsch <wultsch(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-18 21:12:04
Message-ID: D55A0FD7-E6A5-4AF9-BE5E-31BE69F479B8@pacifier.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Feb 18, 2012, at 12:57 PM, Rob Wultsch wrote:
>
> Where first_name is string the queries above have very different
> behaviour in MySQL. The first does a full table scan and coerces
> first_name to an integer (so '5adfs' -> 5)

Oh my, I can't wait to see someone rise to the defense of *this* behavior!

----
Don Baccus
http://donb.photo.net
http://birdnotes.net
http://openacs.org


From: Christopher Browne <cbbrowne(at)gmail(dot)com>
To: Don Baccus <dhogaza(at)pacifier(dot)com>
Cc: Rob Wultsch <wultsch(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-18 21:43:12
Message-ID: CAFNqd5U1nSHRRugSg2L75NpiZNNT_MKnRuLKKAD-6gY8nz=w4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Feb 18, 2012 at 4:12 PM, Don Baccus <dhogaza(at)pacifier(dot)com> wrote:
>
> On Feb 18, 2012, at 12:57 PM, Rob Wultsch wrote:
>>
>> Where first_name is string the queries above have very different
>> behaviour in MySQL. The first does a full table scan and coerces
>> first_name to an integer (so '5adfs' -> 5)
>
> Oh my, I can't wait to see someone rise to the defense of *this* behavior!

I can see a use, albeit a clumsy one, to the notion of looking for values
WHERE integer_id_column like '1%'

It's entirely common for companies to organize general ledger account
numbers by having numeric prefixes that are somewhat meaningful.

A hierarchy like the following is perfectly logical:
- 0000 to 0999 :: Cash accounts [1]
- 1000 to 1999 :: Short Term Assets
- 2000 to 2999 :: Long Term Assets
- 3000 to 3999 :: Incomes
- 4000 to 4999 :: Costs of Goods Sold
- 5000 to 5999 :: Other Expenses
- 6000 to 6999 :: Share Capital
- 7000 to 7999 :: Retained Earnings and such

And back in the pre-computer days, accountants got very comfortable
with the shorthands that, for instance, "Income is in the 3000
series."

We are much smarter today (well, not necessarily!) and can use other
ways to indicate hierarchy, so that there's no reason to *care* what
that account number is.

But if old-school accountants that think "3000 series" *demand* that,
and as they're likely senior enough to assert their way, they're
likely to succeed in that demand, then it's pretty easy to this to
lead to somewhat clumsy "account_id like '3%'" as a search for income.

If I put my purist hat on, then the *right* answer is a range query, thus
WHERE account_id between 3000 and 3999

The new RANGE stuff that Jeff Davis has been adding into 9.2 should,
in principle, be the even better way to represent this kind of thing.

I'd think it nearly insane if someone was expecting '3%' to match not
only the '3000 thru 3999' series, but also '300 to 399' and "30 to 39"
and "3". A situation where that is the right set of results requires
a mighty strangely designed numbering system. I imagine a designer
would want to rule out the range 0-999, in such a design.

Nonetheless, the need for "where account_id like '1%'" comes from a
system designed with the above kind of thinking about account numbers,
and that approach fits mighty well with the way people thought back
when a "computer" was a person whose job it was to work out sums.

Notes:
[1] A careful observer will notice that the prefix notion doesn't
work for the first range without forcing leading zeroes onto
numbers...
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


From: Don Baccus <dhogaza(at)pacifier(dot)com>
To: Christopher Browne <cbbrowne(at)gmail(dot)com>
Cc: Rob Wultsch <wultsch(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-18 22:34:24
Message-ID: 76A1B638-FF35-47E0-A0C6-CF2A0A5AE3A1@pacifier.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On Feb 18, 2012, at 1:43 PM, Christopher Browne wrote:

> On Sat, Feb 18, 2012 at 4:12 PM, Don Baccus <dhogaza(at)pacifier(dot)com> wrote:
>>
>> On Feb 18, 2012, at 12:57 PM, Rob Wultsch wrote:
>>>
>>> Where first_name is string the queries above have very different
>>> behaviour in MySQL. The first does a full table scan and coerces
>>> first_name to an integer (so '5adfs' -> 5)
>>
>> Oh my, I can't wait to see someone rise to the defense of *this* behavior!
>
> I can see a use, albeit a clumsy one, to the notion of looking for values
> WHERE integer_id_column like '1%'
>
> It's entirely common for companies to organize general ledger account
> numbers by having numeric prefixes that are somewhat meaningful.
>
> A hierarchy like the following is perfectly logical:
> - 0000 to 0999 :: Cash accounts [1]

I asked earlier if anyone would expect 01 like '0%' to match …

Apparently so!

Your example is actually a good argument for storing account ids as text, because '0000' like '0%' *will* match.

"I'd think it nearly insane if someone was expecting '3%' to match not
only the '3000 thru 3999' series, but also '300 to 399' and "30 to 39"
and "3"."

How is PG supposed to know that integers compared to strings are always to be padded out to precisely 4 digits?

----
Don Baccus
http://donb.photo.net
http://birdnotes.net
http://openacs.org


From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Don Baccus <dhogaza(at)pacifier(dot)com>
Cc: Christopher Browne <cbbrowne(at)gmail(dot)com>, Rob Wultsch <wultsch(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-18 22:55:43
Message-ID: m2ty2ng39s.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Don Baccus <dhogaza(at)pacifier(dot)com> writes:
>> A hierarchy like the following is perfectly logical:
>> - 0000 to 0999 :: Cash accounts [1]
>
> Your example is actually a good argument for storing account ids as
> text, because '0000' like '0%' *will* match.

FWIW, I too think that if you want to process your integers as text for
some operations (LIKE) and as integer for some others, you'd better do
the casting explicitly.

In the worked-out example Christopher has been proposing, just alter the
column type to text and be done, I can't see summing up or whatever int
arithmetic usage being done on those general ledger account numbers. Use
a domain (well a CHECK constraint really) to tight things down.

As for lpad(), that's a function working on text that returns text, so
having a variant that accepts integers would not be confusing. Then
again, why aren't you using to_char() if processing integers?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

PS: having worked on telephone number prefix indexing and processing
them as text, I might have a biased opinion. You don't add up phone
numbers, though, do you?


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Don Baccus <dhogaza(at)pacifier(dot)com>
Cc: Christopher Browne <cbbrowne(at)gmail(dot)com>, Rob Wultsch <wultsch(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-18 23:03:06
Message-ID: 4F402E2A.8010601@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 02/18/2012 05:34 PM, Don Baccus wrote:
> On Feb 18, 2012, at 1:43 PM, Christopher Browne wrote:
>
>> On Sat, Feb 18, 2012 at 4:12 PM, Don Baccus<dhogaza(at)pacifier(dot)com> wrote:
>>> On Feb 18, 2012, at 12:57 PM, Rob Wultsch wrote:
>>>> Where first_name is string the queries above have very different
>>>> behaviour in MySQL. The first does a full table scan and coerces
>>>> first_name to an integer (so '5adfs' -> 5)
>>> Oh my, I can't wait to see someone rise to the defense of *this* behavior!
>> I can see a use, albeit a clumsy one, to the notion of looking for values
>> WHERE integer_id_column like '1%'
>>
>> It's entirely common for companies to organize general ledger account
>> numbers by having numeric prefixes that are somewhat meaningful.
>>
>> A hierarchy like the following is perfectly logical:
>> - 0000 to 0999 :: Cash accounts [1]
> I asked earlier if anyone would expect 01 like '0%' to match …
>
> Apparently so!
>
> Your example is actually a good argument for storing account ids as text, because '0000' like '0%' *will* match.
>
> "I'd think it nearly insane if someone was expecting '3%' to match not
> only the '3000 thru 3999' series, but also '300 to 399' and "30 to 39"
> and "3"."
>
> How is PG supposed to know that integers compared to strings are always to be padded out to precisely 4 digits?
>

By this point the Lone Ranger has committed suicide.

cheers

andrew


From: Christopher Browne <cbbrowne(at)gmail(dot)com>
To: Don Baccus <dhogaza(at)pacifier(dot)com>
Cc: Rob Wultsch <wultsch(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-19 01:14:19
Message-ID: CAFNqd5WOoZfTgW2af4Tm9M-iFTwxLWByB8fj37eVoKEzC3XKQQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Feb 18, 2012 at 5:34 PM, Don Baccus <dhogaza(at)pacifier(dot)com> wrote:
>
> On Feb 18, 2012, at 1:43 PM, Christopher Browne wrote:
>> A hierarchy like the following is perfectly logical:
>> - 0000 to 0999 :: Cash accounts [1]
>
> I asked earlier if anyone would expect 01 like '0%' to match …
>
> Apparently so!

Yes, and I was intentionally treating this as an oddity.

> Your example is actually a good argument for storing account ids as text, because '0000' like '0%' *will* match.

Absolutely.

The trouble is that if you use the term "account NUMBER" enough times,
some portion of people will think that it's a number in the sense that
it should be meaningful to add and subtract against them.

> "I'd think it nearly insane if someone was expecting '3%' to match not
> only the '3000 thru 3999' series, but also '300 to 399' and "30 to 39"
> and "3"."
>
> How is PG supposed to know that integers compared to strings are always to be padded out to precisely 4 digits?

I think it's not quite right to treat it as "how is PG supposed to
know." The problem is a bit more abstract; it occurs without having a
database involved.

The notion that the ranges (3), (30-39), (300-399), and (3000-3999)
ought to be considered connected together in the account number
classification is what seems crazy to me. But that's what "account
number starts with a 3" could be expected to imply.

At any rate, yes, this is liable to point the Lone Ranger towards
solutions that involve him not riding off into the sunset!
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Don Baccus <dhogaza(at)pacifier(dot)com>
Cc: Rob Wultsch <wultsch(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-19 04:10:01
Message-ID: CA+TgmobcwSfo3HJrWyt9EWmAkayKxWQjmOEmt+QHyNJCGfYzcg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sat, Feb 18, 2012 at 4:12 PM, Don Baccus <dhogaza(at)pacifier(dot)com> wrote:
> On Feb 18, 2012, at 12:57 PM, Rob Wultsch wrote:
>>
>> Where first_name is string the queries above have very different
>> behaviour in MySQL. The first does a full table scan and coerces
>> first_name to an integer (so '5adfs' -> 5)
>
> Oh my, I can't wait to see someone rise to the defense of *this* behavior!

Well, this gets to my point. The behavior Rob is mentioning here is
the one that caused us to make the implicit casting changes in the
first place. And, in this situation, I agree that throwing an error
is much better than silently doing something that may be quite
different from what the user expects.

However, the fact that the implicit casting changes are an improvement
in this case does not mean that they are an improvement in every case.
All I am asking for here is that we examine the various cases on
their merits rather than assuming that our way must be better than
MySQL's way, or visca versa.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-02-19 04:10:38
Message-ID: CA+TgmoZtj0gqsd0HS7EaQyTtb45oyRGHFLjmm14SvSAU3vBV3w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 17, 2012 at 7:12 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
> On 2/17/12 12:04 PM, Robert Haas wrote:
>> The argument isn't about whether the user made the right design
>> choices; it's about whether he should be forced to insert an explicit
>> type cast to get the query to do what it is unambiguously intended to
>> do.
>
> I don't find INTEGER LIKE '1%' to be unambiguous.

Please propose two reasonable interpretations.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-27 19:13:45
Message-ID: 20120827191345.GT11088@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Feb 17, 2012 at 02:52:20PM -0500, Robert Haas wrote:
> Here's yet another case where the current rules are thoroughly disagreeable.
>
> rhaas=# create or replace function z(smallint) returns smallint as
> $$select $1+1$$ language sql;
> ERROR: return type mismatch in function declared to return smallint
> DETAIL: Actual return type is integer.
> CONTEXT: SQL function "z"
>
> So cast the result from an integer to a smallint. What's the big deal?
>
> But, OK, I'll do it your way:
>
> rhaas=# create or replace function z(smallint) returns smallint as
> $$select $1+1::smallint$$ language sql;
> CREATE FUNCTION
> rhaas=# select z(1);
> ERROR: function z(integer) does not exist
> LINE 1: select z(1);
> ^
> HINT: No function matches the given name and argument types. You
> might need to add explicit type casts.
>
> Come on, really? Note that the above example works without casts if
> you use int *or* bigint *or* numeric, but not smallint. That could be
> fixed by causing sufficiently-small integers to lex as smallints, but
> if you think implicit datatype coercions are evil, you ought to be
> outraged by the fact that we are already going out of our way to blur
> the line between int, bigint, and numeric. We let people write 2.0 +
> 3 and get 5.0 - surely it's only a short step from there to human
> sacrifice, cats and dogs living together... mass hysteria! I mean,
> the whole reason for rejecting integer = text is that we aren't sure
> whether to coerce the text to an integer or the integer to a string,
> and it's better to throw an error than to guess. But in the case of
> 2.0 + 3, we feel 100% confident in predicting that the user will be
> happy to convert the integer to a numeric rather than the numeric to
> an integer, so no error. We do that because we know that the domain
> of numeric is a superset of the domain of integer, or in other words,
> we are using context clues to deduce what the user probably meant
> rather than forcing them to be explicit about it.

Is there any general interest in adjusting smallint casting?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-27 20:03:05
Message-ID: 8180.1346097785@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian <bruce(at)momjian(dot)us> writes:
> On Fri, Feb 17, 2012 at 02:52:20PM -0500, Robert Haas wrote:
>> Come on, really? Note that the above example works without casts if
>> you use int *or* bigint *or* numeric, but not smallint. That could be
>> fixed by causing sufficiently-small integers to lex as smallints,

> Is there any general interest in adjusting smallint casting?

We tried that once, years ago, and it was a miserable failure: it opened
up far too many ambiguities, eg should "int4col + 1" invoke int4pl or
int42pl? (That particular case works, because there's an exact match
to int42pl, but we found an awful lot of cases where the parser couldn't
resolve a best choice. IIRC there were dozens of failures in the
regression tests then, and there would be more now.)

There's also the problem that if "2 + 2" starts getting parsed as
smallint int2pl smallint, cases like "20000 + 20000" will overflow when
they didn't before. IMO smallint is a bit too narrow to be a useful
general-purpose integer type, so we'd end up wanting int2pl to yield
int4 to avoid unexpected overflows --- and that opens up more cans of
worms, like which version of f() gets called for f(2+2).

It's conceivable that a change in the lexer behavior combined with a
massive reorganization of the integer-related operators would bring us
to a nicer place than where we are now. But it'd be a lot of work for
dubious reward, and it would almost certainly generate a pile of
application compatibility problems.

Some history:
http://archives.postgresql.org/pgsql-hackers/2002-11/msg00468.php
http://archives.postgresql.org/pgsql-hackers/2010-09/msg00223.php
(A lot of the specific details in the 2002 thread are obsolete now,
but the general point remains, I fear.)

regards, tom lane


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-27 20:21:13
Message-ID: 20120827202113.GX11088@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Aug 27, 2012 at 04:03:05PM -0400, Tom Lane wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
> > On Fri, Feb 17, 2012 at 02:52:20PM -0500, Robert Haas wrote:
> >> Come on, really? Note that the above example works without casts if
> >> you use int *or* bigint *or* numeric, but not smallint. That could be
> >> fixed by causing sufficiently-small integers to lex as smallints,
>
> > Is there any general interest in adjusting smallint casting?
>
> We tried that once, years ago, and it was a miserable failure: it opened
> up far too many ambiguities, eg should "int4col + 1" invoke int4pl or
> int42pl? (That particular case works, because there's an exact match
> to int42pl, but we found an awful lot of cases where the parser couldn't
> resolve a best choice. IIRC there were dozens of failures in the
> regression tests then, and there would be more now.)
>
> There's also the problem that if "2 + 2" starts getting parsed as
> smallint int2pl smallint, cases like "20000 + 20000" will overflow when
> they didn't before. IMO smallint is a bit too narrow to be a useful
> general-purpose integer type, so we'd end up wanting int2pl to yield
> int4 to avoid unexpected overflows --- and that opens up more cans of
> worms, like which version of f() gets called for f(2+2).
>
> It's conceivable that a change in the lexer behavior combined with a
> massive reorganization of the integer-related operators would bring us
> to a nicer place than where we are now. But it'd be a lot of work for
> dubious reward, and it would almost certainly generate a pile of
> application compatibility problems.
>
> Some history:
> http://archives.postgresql.org/pgsql-hackers/2002-11/msg00468.php
> http://archives.postgresql.org/pgsql-hackers/2010-09/msg00223.php
> (A lot of the specific details in the 2002 thread are obsolete now,
> but the general point remains, I fear.)

Thanks, just asking. Odd int2 is so much harder than int8/numberic
casts.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-27 21:03:57
Message-ID: CA+Tgmob9XeBu4hK_zp6wW6nGRs0beDU=_94-Ww2LOD=oFqeH-g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Aug 27, 2012 at 4:03 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>> On Fri, Feb 17, 2012 at 02:52:20PM -0500, Robert Haas wrote:
>>> Come on, really? Note that the above example works without casts if
>>> you use int *or* bigint *or* numeric, but not smallint. That could be
>>> fixed by causing sufficiently-small integers to lex as smallints,
>
>> Is there any general interest in adjusting smallint casting?
>
> We tried that once, years ago, and it was a miserable failure: it opened
> up far too many ambiguities, eg should "int4col + 1" invoke int4pl or
> int42pl? (That particular case works, because there's an exact match
> to int42pl, but we found an awful lot of cases where the parser couldn't
> resolve a best choice. IIRC there were dozens of failures in the
> regression tests then, and there would be more now.)
>
> There's also the problem that if "2 + 2" starts getting parsed as
> smallint int2pl smallint, cases like "20000 + 20000" will overflow when
> they didn't before. IMO smallint is a bit too narrow to be a useful
> general-purpose integer type, so we'd end up wanting int2pl to yield
> int4 to avoid unexpected overflows --- and that opens up more cans of
> worms, like which version of f() gets called for f(2+2).

I agree that redefining the lexer behavior is a can of worms. What I
don't understand is why f(2+2) can't call f(smallint) when that's the
only extant f. It seems to me that we could do that without breaking
anything that works today: if you look for candidates and don't find
any, try again, allowing assignment casts the second time.

We really ought to put some effort into solving this problem. I've
seen a few Oracle-migration talks at conferences, and *every one* of
them has mentioned the smallint problem. It hits our customers, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-27 22:19:43
Message-ID: c4ea734feb675718c45229eb8bb08e91@biglumber.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

Tom Lane replied:
>>> Come on, really? Note that the above example works without casts if
>>> you use int *or* bigint *or* numeric, but not smallint. That could be
>>> fixed by causing sufficiently-small integers to lex as smallints,

>> Is there any general interest in adjusting smallint casting?
...
> It's conceivable that a change in the lexer behavior combined with a
> massive reorganization of the integer-related operators would bring us
> to a nicer place than where we are now. But it'd be a lot of work for
> dubious reward, and it would almost certainly generate a pile of
> application compatibility problems.

Okay, but what about a more targeted solution to the original
poster's problem? That seems doable without causing major
breakage elsewhere....

- --
Greg Sabino Mullane greg(at)turnstep(dot)com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201208271818
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAlA78m0ACgkQvJuQZxSWSshW2gCg1Xcx5zLORMIDQo2yE6QTLVuD
P88AniE9rh4Dojg0o416cWK7cYHWaq0b
=NOAR
-----END PGP SIGNATURE-----


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-27 23:13:55
Message-ID: 11931.1346109235@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I agree that redefining the lexer behavior is a can of worms. What I
> don't understand is why f(2+2) can't call f(smallint) when that's the
> only extant f. It seems to me that we could do that without breaking
> anything that works today: if you look for candidates and don't find
> any, try again, allowing assignment casts the second time.

Yeah, possibly. Where would you fit that in the existing sequence of
tests?
http://www.postgresql.org/docs/devel/static/typeconv-func.html

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-28 14:04:29
Message-ID: CA+TgmoYTM1dmwTbfwsSHjMD7BVXJ8UOty_-fXEsfuus60m1euA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Aug 27, 2012 at 7:13 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I agree that redefining the lexer behavior is a can of worms. What I
>> don't understand is why f(2+2) can't call f(smallint) when that's the
>> only extant f. It seems to me that we could do that without breaking
>> anything that works today: if you look for candidates and don't find
>> any, try again, allowing assignment casts the second time.
>
> Yeah, possibly. Where would you fit that in the existing sequence of
> tests?
> http://www.postgresql.org/docs/devel/static/typeconv-func.html

I think:

If step 4a would result in discarding all candidates, then instead
discard candidate functions for which the input types do not match and
cannot be converted -- using an ASSIGNMENT conversion -- to match.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Greg Stark <stark(at)mit(dot)edu>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-28 16:47:40
Message-ID: CAM-w4HPdRoJutHksr-bVkou=D=Y3hT_X2+dgOgec_g-PMcwREA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Aug 27, 2012 at 10:03 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> We really ought to put some effort into solving this problem. I've
> seen a few Oracle-migration talks at conferences, and *every one* of
> them has mentioned the smallint problem. It hits our customers, too.

I'm kind of puzzled how Oracle-migration talks talk about a smallint
problem given that Oracle only has NUMBER which is a variable-sized
data type. Why would Oracle people end up with an f(smallint) at all?

Perhaps just a warning on CREATE FUNCTION when one of the arguments
doesn't have an implicit cast from the canonical data type of that
hierarchy saying perhaps you should consider using that data type and
let Postgres convert instead of the more specific data type?

--
greg


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-28 17:06:42
Message-ID: 3540.1346173602@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Greg Stark <stark(at)mit(dot)edu> writes:
> Perhaps just a warning on CREATE FUNCTION when one of the arguments
> doesn't have an implicit cast from the canonical data type of that
> hierarchy saying perhaps you should consider using that data type and
> let Postgres convert instead of the more specific data type?

This might be a good idea if we could write such a test in a principled
way, but I'm not seeing how. We don't really have a concept of
"canonical data types".

Also, right at the moment it's not clear to me whether there are any
other cases besides integer literal vs smallint argument. I think
that's the only particularly surprising case within the numeric
hierarchy --- and for non-numeric types, the literal is generally going
to start out "unknown" so the whole problem doesn't arise. I feel
uncomfortable trying to invent general-purpose solutions to problems
we have only one instance of ...

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-28 17:32:51
Message-ID: CA+Tgmob2-EVgYAxMxyL4j5rOd+BpCv0Jj4TO9Jp4eyvMXXqaSw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Aug 28, 2012 at 1:06 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Also, right at the moment it's not clear to me whether there are any
> other cases besides integer literal vs smallint argument. I think
> that's the only particularly surprising case within the numeric
> hierarchy --- and for non-numeric types, the literal is generally going
> to start out "unknown" so the whole problem doesn't arise. I feel
> uncomfortable trying to invent general-purpose solutions to problems
> we have only one instance of ...

The other case that comes up regularly is someone trying to pass some
kind of number to a function such as LPAD(). There is only one LPAD()
so no ambiguity exists, but PostgreSQL doesn't even see that there's a
candidate.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-28 17:39:42
Message-ID: 4279.1346175582@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Aug 28, 2012 at 1:06 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Also, right at the moment it's not clear to me whether there are any
>> other cases besides integer literal vs smallint argument. I think
>> that's the only particularly surprising case within the numeric
>> hierarchy --- and for non-numeric types, the literal is generally going
>> to start out "unknown" so the whole problem doesn't arise. I feel
>> uncomfortable trying to invent general-purpose solutions to problems
>> we have only one instance of ...

> The other case that comes up regularly is someone trying to pass some
> kind of number to a function such as LPAD(). There is only one LPAD()
> so no ambiguity exists, but PostgreSQL doesn't even see that there's a
> candidate.

There still won't be a candidate for that one, unless you're proposing
to allow explicit-only coercions to be applied implicitly.

regards, tom lane


From: Jim Nasby <jim(at)nasby(dot)net>
To: Greg Sabino Mullane <greg(at)turnstep(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-28 17:40:28
Message-ID: 503D028C.2070409@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 8/27/12 5:19 PM, Greg Sabino Mullane wrote:
> Tom Lane replied:
>>>> >>>Come on, really? Note that the above example works without casts if
>>>> >>>you use int*or* bigint*or* numeric, but not smallint. That could be
>>>> >>>fixed by causing sufficiently-small integers to lex as smallints,
>>> >>Is there any general interest in adjusting smallint casting?
> ...
>> >It's conceivable that a change in the lexer behavior combined with a
>> >massive reorganization of the integer-related operators would bring us
>> >to a nicer place than where we are now. But it'd be a lot of work for
>> >dubious reward, and it would almost certainly generate a pile of
>> >application compatibility problems.
> Okay, but what about a more targeted solution to the original
> poster's problem? That seems doable without causing major
> breakage elsewhere....

FWIW, this causes problems for me at work as well.

For the case of

const op const

instead of trying to small-cast the 2 constants, would it be possible to large-cast them, perform the operation, and then re-cast the results of the operation? ISTM that would solve the operator issue (ie, the case of 20000+20000).
--
Jim C. Nasby, Database Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-28 18:36:35
Message-ID: CA+Tgmoa=L6YwMAifGsgX1cbT3DNC+BF=WgtQ6MGy3jstwETH7g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Aug 28, 2012 at 1:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> The other case that comes up regularly is someone trying to pass some
>> kind of number to a function such as LPAD(). There is only one LPAD()
>> so no ambiguity exists, but PostgreSQL doesn't even see that there's a
>> candidate.
>
> There still won't be a candidate for that one, unless you're proposing
> to allow explicit-only coercions to be applied implicitly.

OK, I'm confused.

rhaas=# create table foo (a text);
CREATE TABLE
rhaas=# insert into foo values (12345);
INSERT 0 1

There are no pg_cast entries for integer -> text, but this still gets
treated as an assignment cast because of some special-case logic in
find_coercion_pathway().

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-28 18:55:56
Message-ID: 6825.1346180156@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Aug 28, 2012 at 1:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> There still won't be a candidate for that one, unless you're proposing
>> to allow explicit-only coercions to be applied implicitly.

> [ not so, see kluge in find_coercion_pathway() ]

Oh, I'd forgotten that worked that way. Frankly, that makes me quite a
bit more concerned about this proposal than I was before. I do *not*
want to re-introduce silent cross-category casts to text, not even if
there's no other way to match the function/operator. I think that hack
was/is tolerable for actual assignment to a table column, because there
is very little chance that the semantics of such an assignment will come
out differently than the user expected. This is not the case when
you're matching to potentially overloaded functions or operators,
though. If we go down this route we're going to find ourselves back in
the badlands of timestamps sometimes being compared as though they were
strings, and all the other sillinesses that we got rid of in 8.3. I got
beat up enough already for taking those toys away from people; I'm not
looking forward to having to have another round of it in the future.

I could see doing what you suggest as long as we exclude the
automatic-coerce-via-IO case.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-28 20:33:33
Message-ID: CA+TgmoZNnXbOtq-rTJGBKEKpd62dxnFbqo-B0WJ0-Sc8U29QSw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Aug 28, 2012 at 2:55 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Oh, I'd forgotten that worked that way. Frankly, that makes me quite a
> bit more concerned about this proposal than I was before. I do *not*
> want to re-introduce silent cross-category casts to text, not even if
> there's no other way to match the function/operator. I think that hack
> was/is tolerable for actual assignment to a table column, because there
> is very little chance that the semantics of such an assignment will come
> out differently than the user expected.

Well, I think that when there is only one LPAD function, there is also
very little chance that the results will come out differently than the
user expected. I'm having a hard time seeing a bright line between
those two cases. Remember, I'm not proposing that we try to guess
between more alternatives than we're already trying to guess between -
only that we do something other than fail outright in situations where
we currently do.

The changes we made in 8.3 broke a bunch of cases that were actually
ambiguous. That was painful, but probably for the best. What wasn't,
in my opinion, for the best was that we also broke a lot of cases -
including this one - that were by no means ambiguous. In fact, I
believe that every place that I had to fix my application code
actually fell into the latter category: there was no actual ambiguity,
but I had to go back and insert a cast anyway.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-28 20:47:26
Message-ID: 9432.1346186846@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Aug 28, 2012 at 2:55 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Oh, I'd forgotten that worked that way. Frankly, that makes me quite a
>> bit more concerned about this proposal than I was before. I do *not*
>> want to re-introduce silent cross-category casts to text, not even if
>> there's no other way to match the function/operator. I think that hack
>> was/is tolerable for actual assignment to a table column, because there
>> is very little chance that the semantics of such an assignment will come
>> out differently than the user expected.

> Well, I think that when there is only one LPAD function, there is also
> very little chance that the results will come out differently than the
> user expected.

[ shrug... ] I'm having a hard time resisting the temptation to point
out that there are two. The real point here though is that the proposed
behavior change will affect all functions, not only the cases where you
think there is only one sane behavior. And features such as search paths
and default parameters frequently mean that there are more potential
matches than the user thought of while writing the query.

In the end, SQL is a fairly strongly typed language, especially in our
manifestation of it. I don't think we should give that up, especially
not for benefits as dubious as not having to write a cast to make it
clear that yes you really do want a timestamp to be treated as text.
IMO, saving people from the errors that inevitably arise from that sort
of sloppy thinking is a benefit, not a cost, of having a typed language.

regards, tom lane


From: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-29 01:46:00
Message-ID: 503D7458.3000203@ringerc.id.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 08/29/2012 01:32 AM, Robert Haas wrote:
> On Tue, Aug 28, 2012 at 1:06 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Also, right at the moment it's not clear to me whether there are any
>> other cases besides integer literal vs smallint argument. I think
>> that's the only particularly surprising case within the numeric
>> hierarchy --- and for non-numeric types, the literal is generally going
>> to start out "unknown" so the whole problem doesn't arise. I feel
>> uncomfortable trying to invent general-purpose solutions to problems
>> we have only one instance of ...
>
> The other case that comes up regularly is someone trying to pass some
> kind of number to a function such as LPAD(). There is only one LPAD()
> so no ambiguity exists, but PostgreSQL doesn't even see that there's a
> candidate.

Allowing Pg to assign parameters or fields by using the
normally-only-explicit casts where no ambiguity exists would be *really*
helpful in other areas, too.

In particular, this applies with assignment of fields from `text' input,
too. PostgreSQL can be incredibly frustrating to work with from
Java/JDBC where everything goes through protocol-level parameterised
statements, because you can't use Java `String' types via
PreparedStatement.setString() to assign to, say, an `xml' or `json'
field, you have to use `setObject()'.

That's OK (ish) when working with PgJDBC directly, but it breaks code
that expects this to work like it does in other databases where
setString(...) can be used to assign to anything that's castable from
varchar.

Pg doesn't allow `unknown' to be passed as the type of a parameterised
statement, so the JDBC driver can't work around this by passing such
entries as fields of "unknown" type and letting the server work it out.
It'd instead have to ask the server "what are the permissible types for
the placeholder $1 in this query" ... which AFAIK isn't possible, and
would require extra round trips too.

I currently work around this by creating additional implicit casts where
I need them, eg text->xml, text->json. It'd be lovely not to have to do
that, though.

--
Craig Ringer


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-29 02:58:02
Message-ID: CA+TgmoZ6FJB2QOiTHWGgPDJ2uStzotkvBw+RqG2Ryxa9nTzGWQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Aug 28, 2012 at 4:47 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Well, I think that when there is only one LPAD function, there is also
>> very little chance that the results will come out differently than the
>> user expected.
>
> [ shrug... ] I'm having a hard time resisting the temptation to point
> out that there are two.

Fine, but as they have different numbers of arguments it has no
bearing on the point at hand, which is that right now it is very easy
to write a call that matches unexpectedly fails to match either one.

> The real point here though is that the proposed
> behavior change will affect all functions, not only the cases where you
> think there is only one sane behavior. And features such as search paths
> and default parameters frequently mean that there are more potential
> matches than the user thought of while writing the query.

I'm totally unpersuaded by this argument. I have yet to run into a
customer who defined multiple functions with the same name and then
complained because we called the wrong one, or even because we threw
an error instead of just picking one. I have run into MANY customers
who have been forced to insert typecasts into applications to work
around our willingness to consider calling the only plausible
candidate function or operator. Now some of this is no doubt because
we have very few customers running on pre-8.3 releases (woohoo!), but
that's exactly the point: the bad old days when you could break your
application by accidentally invoking the wrong function are gone.
That problem is dead. What we ought to be focusing on now is fixing
the collateral damage.

> In the end, SQL is a fairly strongly typed language, especially in our
> manifestation of it. I don't think we should give that up, especially
> not for benefits as dubious as not having to write a cast to make it
> clear that yes you really do want a timestamp to be treated as text.
> IMO, saving people from the errors that inevitably arise from that sort
> of sloppy thinking is a benefit, not a cost, of having a typed language.

The benefit is that it allows us to be compatible with other SQL
systems. If PostgreSQL were the only database in the world, I might
agree with you, but it isn't: it's just the only one that requires you
to insert this many casts. It's hard to accept the argument that
there's no sensible alternative when other people have clearly found
something that works for them and their users. We can dig in our
heels and insist we know better, but what does that do other than
drive away users? For most people, the database is just a tool, and
they want it to work with a minimum of fuss, not force them to jump
through unexpected and unwelcome hoops. Again, if there's real
ambiguity then that is one thing, but what I'm proposing does not
change the behavior in any case we currently consider ambiguous. I
don't know of any other programming language or system where it is
considered a virtue to force the user to inject unnecessary
decorations into their code. Indeed, many systems go to quite some
lengths to minimize the amount of such decoration that is required.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-29 03:23:17
Message-ID: 25755.1346210597@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Aug 28, 2012 at 4:47 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> The real point here though is that the proposed
>> behavior change will affect all functions, not only the cases where you
>> think there is only one sane behavior. And features such as search paths
>> and default parameters frequently mean that there are more potential
>> matches than the user thought of while writing the query.

> I'm totally unpersuaded by this argument. I have yet to run into a
> customer who defined multiple functions with the same name and then
> complained because we called the wrong one, or even because we threw
> an error instead of just picking one.

That argument would hold water if we got rid of every single usage of
overloading in the system-defined operators/functions, which as you well
know is not an attractive idea. Since that's not going to happen,
arguing for this on the basis that your customers don't overload
function names is missing the point. Any loosening of the rules is
going to create issues for system-function resolution ... unless you're
going to propose that we somehow do this differently for user and system
defined functions.

> I have run into MANY customers
> who have been forced to insert typecasts into applications to work
> around our willingness to consider calling the only plausible
> candidate function or operator. Now some of this is no doubt because
> we have very few customers running on pre-8.3 releases (woohoo!), but
> that's exactly the point: the bad old days when you could break your
> application by accidentally invoking the wrong function are gone.
> That problem is dead.

The reason it's dead is that we killed it in 8.3. I don't want it
coming back to life, but I think that that will be exactly the outcome
if we let any implicit casts to text get back into the rules for
operator/function overloading resolution.

An example of the sort of problem that I don't want to hear about
ever again is somebody trying to use max() on a "point" column.
We don't have linear sort ordering for points, so this is nonsensical
and should draw an error. Which it does, today. With your proposal,
the system would silently use max(pointcol::text), producing results
that might even look plausible if the user wasn't paying too much
attention. If that's the behavior the user actually wants, fine: let
him say so with an explicit cast to text. But I don't want the system
trapping users into such hard-to-find errors because we are so focused
on mysql compatibility that we let people omit conceptually-critical
casts in the name of ease of use.

> For most people, the database is just a tool, and
> they want it to work with a minimum of fuss, not force them to jump
> through unexpected and unwelcome hoops. Again, if there's real
> ambiguity then that is one thing, but what I'm proposing does not
> change the behavior in any case we currently consider ambiguous. I
> don't know of any other programming language or system where it is
> considered a virtue to force the user to inject unnecessary
> decorations into their code.

Really? You've not had experience with very many programming languages,
then. Just about every one I've ever dealt with that's at a higher
conceptual level than C or BASIC *is* sticky about this sort of thing.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-29 04:27:18
Message-ID: 27068.1346214438@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> That problem is dead.

> The reason it's dead is that we killed it in 8.3. I don't want it
> coming back to life, but I think that that will be exactly the outcome
> if we let any implicit casts to text get back into the rules for
> operator/function overloading resolution.

To put some concreteness into what so far has been a pretty hand-wavy
discussion, I experimented with the attached patch. I'm not sure that
it exactly corresponds to what you proposed, but I think this is the
only place the consideration could be injected without a substantial
amount of code rearrangement. This results in half a dozen regression
test failures (see second attachment), which mostly consist of
"function/operator does not exist" errors changing to "function/operator
is not unique". I've not looked into exactly why each one happens ---
possibly the code is now finding multiple string-category matches where
before it found none. But it definitely illustrates my point that this
would not be without surprises.

Oh, one more thing:

regression=# select lpad(42,8);
ERROR: failed to find conversion function from integer to text

so this doesn't actually solve the problem you want to solve.
I'm not sure why that's happening, either, but evidently some
additional coercion laxity would required.

regards, tom lane

Attachment Content-Type Size
allow-assignment-coercions-1.patch text/x-patch 820 bytes
unknown_filename text/plain 5.9 KB

From: Benedikt Grundmann <bgrundmann(at)janestreet(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-29 06:59:10
Message-ID: CADbMkNOu7TdpMLwD+ehafMmmFTuE-s2r6sKripCe-WR0SmGu_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Aug 28, 2012 at 9:47 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On Tue, Aug 28, 2012 at 2:55 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Oh, I'd forgotten that worked that way. Frankly, that makes me quite a
> >> bit more concerned about this proposal than I was before. I do *not*
> >> want to re-introduce silent cross-category casts to text, not even if
> >> there's no other way to match the function/operator. I think that hack
> >> was/is tolerable for actual assignment to a table column, because there
> >> is very little chance that the semantics of such an assignment will come
> >> out differently than the user expected.
>
> > Well, I think that when there is only one LPAD function, there is also
> > very little chance that the results will come out differently than the
> > user expected.
>
> [ shrug... ] I'm having a hard time resisting the temptation to point
> out that there are two. The real point here though is that the proposed
> behavior change will affect all functions, not only the cases where you
> think there is only one sane behavior. And features such as search paths
> and default parameters frequently mean that there are more potential
> matches than the user thought of while writing the query.
>
> In the end, SQL is a fairly strongly typed language, especially in our
> manifestation of it. I don't think we should give that up, especially
> not for benefits as dubious as not having to write a cast to make it
> clear that yes you really do want a timestamp to be treated as text.
> IMO, saving people from the errors that inevitably arise from that sort
> of sloppy thinking is a benefit, not a cost, of having a typed language.
>
> regards, tom lane
>
+a very big number

I remember the pain we had when we upgraded from 8.1 to 8.4, but I also
distinctly remember that after the upgrade I was a little bit more
confident that our SQL code does the right thing. But we are a OCaml shop
if there is one thing we believe in with ferocity it is that a STRICT type
checker is a good thing (TM). You pay a little verbosity tax but in return
all the "stupid" little obvious bugs get caught and maybe even more
importantly when you later change your types the system are forced to
reconsider all cases where you used the value of (now different) type (and
that is A VERY GOOD THING in a big code base). Admittedly we are not there
yet in Postgres as functions are only (re)checked upon execution.

My 2cents,

Bene


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-29 10:01:14
Message-ID: CA+TgmobULnAk9cGZbxLWqOUVMWjiurHLvF3rs5Yt9hRnuvd3fg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Aug 28, 2012 at 11:23 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> That argument would hold water if we got rid of every single usage of
> overloading in the system-defined operators/functions, which as you well
> know is not an attractive idea. Since that's not going to happen,
> arguing for this on the basis that your customers don't overload
> function names is missing the point. Any loosening of the rules is
> going to create issues for system-function resolution ... unless you're
> going to propose that we somehow do this differently for user and system
> defined functions.

Obviously not.

> An example of the sort of problem that I don't want to hear about
> ever again is somebody trying to use max() on a "point" column.
> We don't have linear sort ordering for points, so this is nonsensical
> and should draw an error. Which it does, today.

Much as I hate to say it, I have to admit I find this to be a
compelling argument.

> Really? You've not had experience with very many programming languages,
> then. Just about every one I've ever dealt with that's at a higher
> conceptual level than C or BASIC *is* sticky about this sort of thing.

In terms of type-strictness, it runs the gamut. You have things like
Perl where datatypes barely exist at all and silent (sometimes
confusing) conversions are performed nary a second thought, and at the
other end of the spectrum you have things like ML which are incredibly
fanatic about type-checking. But both Perl and ML and, as far as I
know, most of what's in between make a virtue of terseness. The
exceptions are things like Ada and Cobol, which are not IMHO the sorts
of thing we ought to be trying to emulate.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-29 11:34:35
Message-ID: CA+TgmoaujOxBeKOz16SEKXT5vjc7A=N-F6BfT-pa+P55aoVrFQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Aug 29, 2012 at 12:27 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> To put some concreteness into what so far has been a pretty hand-wavy
> discussion, I experimented with the attached patch. I'm not sure that
> it exactly corresponds to what you proposed, but I think this is the
> only place the consideration could be injected without a substantial
> amount of code rearrangement.

Yeah, this is what I was thinking of.

> This results in half a dozen regression
> test failures (see second attachment), which mostly consist of
> "function/operator does not exist" errors changing to "function/operator
> is not unique". I've not looked into exactly why each one happens ---
> possibly the code is now finding multiple string-category matches where
> before it found none. But it definitely illustrates my point that this
> would not be without surprises.

Well, the good news is that nothing fails that would have succeeded
before, or for that matter visca versa. But after playing around with
it a little, I agree that there's danger lurking. The fact that
length(42) fails due to the ambiguity between length(text) and
length(bpchar) is mostly happy coincidence. It's hard to get excited
about the possibility of that managing to return "2". The situation
with || is even worse. If I remove textanycat and anytextcat on the
theory that textcat itself ought to be enough under the new rules,
then a whole bunch of regression test failures occur because we end up
bogusly matching the array concatenation operator somehow, and fail to
interpret an unknown literal as an array (ouch!).

The upshot here appears to be that we're kind of schizophrenic about
what we want. With things like text || anyelement, anyelement ||
text, and concat(variadic "any") we are basically asserting that we
want to treat anything that we don't recognize as a string. But then
we have other functions (like max and length) where we don't want that
behavior. I suppose that more than anything this is based on a
perception that || won't be ambiguous (though whether that perception
is entirely correct is debatable, given the array-related meanings of
that operator) but there might be more than one possible sense for
length() or max(). Is there any principled way of distinguishing
these cases, or even a rule for what we ought to do by hand in future
cases of this type, or is it totally arbitrary?

> regression=# select lpad(42,8);
> ERROR: failed to find conversion function from integer to text
>
> so this doesn't actually solve the problem you want to solve.
> I'm not sure why that's happening, either, but evidently some
> additional coercion laxity would required.

This, however, is a trivial problem; make_fn_arguments just didn't get
the memo that it might now need to look for assignment casts. See
attached.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
assignment-casting.patch application/octet-stream 1.3 KB

From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-29 11:40:09
Message-ID: 503DFF99.7020200@archidevsys.co.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 29/08/12 23:34, Robert Haas wrote:
> On Wed, Aug 29, 2012 at 12:27 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> To put some concreteness into what so far has been a pretty hand-wavy
>> discussion, I experimented with the attached patch. I'm not sure that
>> it exactly corresponds to what you proposed, but I think this is the
>> only place the consideration could be injected without a substantial
>> amount of code rearrangement.
> Yeah, this is what I was thinking of.
>
>> This results in half a dozen regression
>> test failures (see second attachment), which mostly consist of
>> "function/operator does not exist" errors changing to "function/operator
>> is not unique". I've not looked into exactly why each one happens ---
>> possibly the code is now finding multiple string-category matches where
>> before it found none. But it definitely illustrates my point that this
>> would not be without surprises.
> Well, the good news is that nothing fails that would have succeeded
> before, or for that matter visca versa. But after playing around with
> it a little, I agree that there's danger lurking. The fact that
> length(42) fails due to the ambiguity between length(text) and
> length(bpchar) is mostly happy coincidence. It's hard to get excited
> about the possibility of that managing to return "2". The situation
> with || is even worse. If I remove textanycat and anytextcat on the
> theory that textcat itself ought to be enough under the new rules,
> then a whole bunch of regression test failures occur because we end up
> bogusly matching the array concatenation operator somehow, and fail to
> interpret an unknown literal as an array (ouch!).
>
> The upshot here appears to be that we're kind of schizophrenic about
> what we want. With things like text || anyelement, anyelement ||
> text, and concat(variadic "any") we are basically asserting that we
> want to treat anything that we don't recognize as a string. But then
> we have other functions (like max and length) where we don't want that
> behavior. I suppose that more than anything this is based on a
> perception that || won't be ambiguous (though whether that perception
> is entirely correct is debatable, given the array-related meanings of
> that operator) but there might be more than one possible sense for
> length() or max(). Is there any principled way of distinguishing
> these cases, or even a rule for what we ought to do by hand in future
> cases of this type, or is it totally arbitrary?
>
>> regression=# select lpad(42,8);
>> ERROR: failed to find conversion function from integer to text
>>
>> so this doesn't actually solve the problem you want to solve.
>> I'm not sure why that's happening, either, but evidently some
>> additional coercion laxity would required.
> This, however, is a trivial problem; make_fn_arguments just didn't get
> the memo that it might now need to look for assignment casts. See
> attached.
>
>
>
You realize of course, that '42' is the answer to Life, the Universe,
and Everything? :-)

Cheers,
Gavin


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-29 15:40:36
Message-ID: 10925.1346254836@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> The upshot here appears to be that we're kind of schizophrenic about
> what we want. With things like text || anyelement, anyelement ||
> text, and concat(variadic "any") we are basically asserting that we
> want to treat anything that we don't recognize as a string. But then
> we have other functions (like max and length) where we don't want that
> behavior. I suppose that more than anything this is based on a
> perception that || won't be ambiguous (though whether that perception
> is entirely correct is debatable, given the array-related meanings of
> that operator) but there might be more than one possible sense for
> length() or max(). Is there any principled way of distinguishing
> these cases, or even a rule for what we ought to do by hand in future
> cases of this type, or is it totally arbitrary?

I would not claim that the situation around || is principled in any
fashion. Rather, || was identified as being a key pain point for
the removal of implicit-casts-to-text during 8.3 development, and we
agreed we would reduce the pain by adding operators that effectively
reintroduced the implicit casts *for that one operator only*. I felt
that was still a big step forward compared to implicit casts everywhere.
But if we'd been doing this in a green field, I doubt that you'd see
text || anyelement or anyelement || text in there. I would vote against
introducing more such things in future, on the grounds that there would
be no backwards-compatibility argument for it.

As for the concat() function, IMO it's an ugly kluge. But again, it's
not introducing any global behavior that might have side-effects on
the resolution of other function or operator names.

>> regression=# select lpad(42,8);
>> ERROR: failed to find conversion function from integer to text
>> I'm not sure why that's happening, either, but evidently some
>> additional coercion laxity would required.

> This, however, is a trivial problem; make_fn_arguments just didn't get
> the memo that it might now need to look for assignment casts. See
> attached.

Meh. I'm a bit worried about whether that might have unexpected
consequences, too.

regards, tom lane


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-29 18:14:56
Message-ID: 503E5C20.1090302@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 8/29/12 11:40 AM, Tom Lane wrote:
>>> regression=# select lpad(42,8);
>>> >>ERROR: failed to find conversion function from integer to text
>>> >>I'm not sure why that's happening, either, but evidently some
>>> >>additional coercion laxity would required.
>> >This, however, is a trivial problem; make_fn_arguments just didn't get
>> >the memo that it might now need to look for assignment casts. See
>> >attached.
> Meh. I'm a bit worried about whether that might have unexpected
> consequences, too.

We are not seriously intending to make the above query work, are we?


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-29 19:54:52
Message-ID: 1188.1346270092@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> On 8/29/12 11:40 AM, Tom Lane wrote:
>> regression=# select lpad(42,8);

> We are not seriously intending to make the above query work, are we?

Well, *I* don't want to, but apparently Robert does.

I don't really want to go much further than finding a way to handle the
"integer constant passed to smallint function argument" case. Maybe we
should consider a narrow fix for that rather than opening up the general
assignment-cast scenario.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-29 21:08:09
Message-ID: CA+TgmobVQiBopPW5pF6DGg1OQN+dW0CyVpf2kWS6Uwjq+CxW+A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Aug 29, 2012 at 3:54 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>> On 8/29/12 11:40 AM, Tom Lane wrote:
>>> regression=# select lpad(42,8);
>
>> We are not seriously intending to make the above query work, are we?
>
> Well, *I* don't want to, but apparently Robert does.
>
> I don't really want to go much further than finding a way to handle the
> "integer constant passed to smallint function argument" case. Maybe we
> should consider a narrow fix for that rather than opening up the general
> assignment-cast scenario.

If we could just do that, it would be a huge improvement. I'm not
very sanguine about the possibility of a clean fix in the lexer, but
maybe there is some other tweak to the system that would make it work.

On the more general issue, I continue to see minimal risk of harm in
allowing things like LPAD() to implicitly cast the first argument to
text. I wrote code that did that for years (to pad numeric fields out
to a specific width by adding leading zeros) and until I upgraded to
8.3 it caused me no problems. I knew what I meant, and so did the
database, and we were both happy. The argument seems to be that we
shouldn't have been happy, but we were. Some of the other examples
Tom mentions are, indeed, icky, and I don't know what to do about
that, but LPAD() does indeed seem pretty harmless to me. And, on a
more pragmatic level, as long as I work for a company that helps
people migrate from other database systems, I'm not going to be able
to stop caring about this issue even in cases where I don't personally
think implicit casting is a good idea, because other people who are
not me have tens of thousands of lines of procedural code written for
those other systems and if you tell them they've got to go through and
add hundreds or thousands of casts before they can migrate, it tends
to turn them off. Maybe there's no perfect solution to that problem,
but the status quo is definitely not perfect either.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-29 22:39:37
Message-ID: 5873.1346279977@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On the more general issue, I continue to see minimal risk of harm in
> allowing things like LPAD() to implicitly cast the first argument to
> text.

Well, I see your point about LPAD(), but the problem is how to tell
the difference between a harmless cast omission and an actual mistake
that the user will be very grateful if we point out. If we allow
implicit casts to text in the general case in function/operator calls,
we are definitely going to re-introduce a lot of room for mistakes.

Upthread you were complaining about how we'd reject calls even when
there was only one possible interpretation. I wonder whether there'd be
any value in taking that literally: that is, allow use of assignment
rules when there is, in fact, exactly one function with the right number
of parameters visible in the search path. This would solve the LPAD()
problem (at least as stated), and probably many other practical cases
too, since I admit your point that an awful lot of users do not use
function overloading. The max() example I mentioned earlier would not
get broken since there's more than one max(), and in general it seems
likely that cases where there's a real risk would involve overloaded
names.

The main downside I can see is that code that used to work is likely
to stop working as soon as someone creates a potential overloading
situation. Worse, the error message could be pretty confusing, since
if you had been successfully calling f(smallint) with f(42), you'd get
"f(integer) does not exist", not something like "f() is ambiguous",
after adding f(float8) to the mix. This seems related to the confusing
changes in regression test cases that I got in my experiments yesterday.
This may be sufficient reason to reject the idea, since the very last
thing we need in this area is any degradation in the relevance of the
error messages.

> ... as long as I work for a company that helps
> people migrate from other database systems, I'm not going to be able
> to stop caring about this issue even in cases where I don't personally
> think implicit casting is a good idea, because other people who are
> not me have tens of thousands of lines of procedural code written for
> those other systems and if you tell them they've got to go through and
> add hundreds or thousands of casts before they can migrate, it tends
> to turn them off. Maybe there's no perfect solution to that problem,
> but the status quo is definitely not perfect either.

Meh. I tend to think that a better solution to those folks' problem is
a package of add-on casts that they could install for use with their
legacy code; not dumbing down the system's error detection capability
for everyone. Peter's original try at re-adding implicit text casts
in that way didn't work very well IIRC, but maybe we could try harder.

regards, tom lane


From: David Fetter <david(at)fetter(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-30 13:06:39
Message-ID: 20120830130639.GA4314@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Aug 29, 2012 at 06:39:37PM -0400, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > On the more general issue, I continue to see minimal risk of harm
> > in allowing things like LPAD() to implicitly cast the first
> > argument to text.
>
> Well, I see your point about LPAD(), but the problem is how to tell
> the difference between a harmless cast omission and an actual
> mistake that the user will be very grateful if we point out. If we
> allow implicit casts to text in the general case in
> function/operator calls, we are definitely going to re-introduce a
> lot of room for mistakes.
>
> Upthread you were complaining about how we'd reject calls even when
> there was only one possible interpretation. I wonder whether
> there'd be any value in taking that literally: that is, allow use of
> assignment rules when there is, in fact, exactly one function with
> the right number of parameters visible in the search path.

+1 for this.

> The main downside I can see is that code that used to work is likely
> to stop working as soon as someone creates a potential overloading
> situation. Worse, the error message could be pretty confusing,
> since if you had been successfully calling f(smallint) with f(42),
> you'd get "f(integer) does not exist", not something like "f() is
> ambiguous", after adding f(float8) to the mix. This seems related
> to the confusing changes in regression test cases that I got in my
> experiments yesterday. This may be sufficient reason to reject the
> idea, since the very last thing we need in this area is any
> degradation in the relevance of the error messages.

With the ANY* parameters introduced in the past few versions, there's
a lot less incentive to create this problem. The trick here is
documenting the ANY* parameters in enough places to make sure that
incentive is reduced.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-30 13:13:08
Message-ID: CA+TgmobFyfWH1D9-rZUOg11pF2D19H4KZ6BHGotf_QFJm=dvDQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Aug 29, 2012 at 6:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Well, I see your point about LPAD(), but the problem is how to tell
> the difference between a harmless cast omission and an actual mistake
> that the user will be very grateful if we point out. If we allow
> implicit casts to text in the general case in function/operator calls,
> we are definitely going to re-introduce a lot of room for mistakes.

I concede that point. :-)

> Upthread you were complaining about how we'd reject calls even when
> there was only one possible interpretation. I wonder whether there'd be
> any value in taking that literally: that is, allow use of assignment
> rules when there is, in fact, exactly one function with the right number
> of parameters visible in the search path. This would solve the LPAD()
> problem (at least as stated), and probably many other practical cases
> too, since I admit your point that an awful lot of users do not use
> function overloading. The max() example I mentioned earlier would not
> get broken since there's more than one max(), and in general it seems
> likely that cases where there's a real risk would involve overloaded
> names.

That's an interesting idea. I like it.

> The main downside I can see is that code that used to work is likely
> to stop working as soon as someone creates a potential overloading
> situation. Worse, the error message could be pretty confusing, since
> if you had been successfully calling f(smallint) with f(42), you'd get
> "f(integer) does not exist", not something like "f() is ambiguous",
> after adding f(float8) to the mix. This seems related to the confusing
> changes in regression test cases that I got in my experiments yesterday.

One thought I had when looking at those messages was that, in some
ways, the new messages were actually less confusing than the old
messages. I mean, if you try to call f(42) and you get f(integer) does
not exist, ok, you'll probably figure out that the issue is with the
argument type, since you most likely know that an f of some type does
in fact exist. But it would be even more clear if the error message
said, ok, so there is an f, but I'm not going to call it because the
argument types don't match closely enough. The distinction would be
even more useful if the function happens to be called snuffleupagus
rather than f, because then when you call snufleupagus(42.0), it'll
tell you "i know nothing about a function by that name" whereas when
you call snuffleupagus(42) it'll tell you "i know about a function by
that name, but not with those argument types". I've certainly
encountered this confusion before whilst debugging my own and other
people's databases: is it giving me that error because the function
doesn't exist, or because of an argument type mismatch?

> This may be sufficient reason to reject the idea, since the very last
> thing we need in this area is any degradation in the relevance of the
> error messages.
>
>> ... as long as I work for a company that helps
>> people migrate from other database systems, I'm not going to be able
>> to stop caring about this issue even in cases where I don't personally
>> think implicit casting is a good idea, because other people who are
>> not me have tens of thousands of lines of procedural code written for
>> those other systems and if you tell them they've got to go through and
>> add hundreds or thousands of casts before they can migrate, it tends
>> to turn them off. Maybe there's no perfect solution to that problem,
>> but the status quo is definitely not perfect either.
>
> Meh. I tend to think that a better solution to those folks' problem is
> a package of add-on casts that they could install for use with their
> legacy code; not dumbing down the system's error detection capability
> for everyone. Peter's original try at re-adding implicit text casts
> in that way didn't work very well IIRC, but maybe we could try harder.

Well, the big problem that you run into is that when you add casts,
you tend to create situations that the type system thinks are
ambiguous. A particular example of this is textanycat, anytextcat,
and plain old textcat. If you start adding casts, the system can get
confused about which one it's supposed to call in which situation.
The frustrating thing is that we don't really care. The only reason
why there are three different operators in the first place is because
we want to make sure that everything someone does will match one of
them. But then if something matches two of them, we error out
unnecessarily.

It would be nice to have a way to say "among this group of functions,
we don't care" or perhaps "among this group of functions, here is a
preference ordering; in case of doubt, pick the one with the highest
preference". But in some sense I feel that that isn't really solving
the problem, because the only reason those extra functions exist in
the first place is to work around the fact that sometimes the system
doesn't perform typecasts in situations where we wish it did. It's
almost like we should have a way to flag argument positions and say
"for this function, in this argument position, feel free to implicitly
cast to text". Then, for example, you'd only need one quote_literal()
rather than two.

This is possibly unsatisfying as well because there likely will be
disagreement about which functions ought to have that
implicit-casting-ok behavior and which ones should not. Maybe that'd
be OK anyway; people could always hack their local catalogs if need
be. But, all things being equal, it would be nice to design this in a
way where you could package up a bunch of behavior in the form of an
extension that users could either install or not install according to
the semantics that they wish to have. Your proposal of adding casts
is appealing from that perspective, but it'd work a lot better in
practice if we could eliminate the practice of having multiple
versions of functions with the same semantics just to elicit the
desired casting behavior.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-06 15:57:52
Message-ID: CA+TgmoZLm6Kp77HXEeU_6B_OBGEnWm9TaGrDF4SrPnsvyEvw2A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Aug 30, 2012 at 9:13 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> Upthread you were complaining about how we'd reject calls even when
>> there was only one possible interpretation. I wonder whether there'd be
>> any value in taking that literally: that is, allow use of assignment
>> rules when there is, in fact, exactly one function with the right number
>> of parameters visible in the search path. This would solve the LPAD()
>> problem (at least as stated), and probably many other practical cases
>> too, since I admit your point that an awful lot of users do not use
>> function overloading. The max() example I mentioned earlier would not
>> get broken since there's more than one max(), and in general it seems
>> likely that cases where there's a real risk would involve overloaded
>> names.
>
> That's an interesting idea. I like it.

I did some experimentation with this. It seems that what Tom proposed
here is a lot cleaner than what I proposed previously, while still
increasing usability in many real-world cases. For example, in
unpatched master:

rhaas=# create function xyz(smallint) returns smallint as $$select
$1$$ language sql;
CREATE FUNCTION
rhaas=# select xyz(5);
ERROR: function xyz(integer) does not exist
LINE 1: select xyz(5);
^
HINT: No function matches the given name and argument types. You
might need to add explicit type casts.
rhaas=# create table abc (a int);
CREATE TABLE
rhaas=# select lpad(a, 5, '0') from abc;
ERROR: function lpad(integer, integer, unknown) does not exist
LINE 1: select lpad(a, 5, '0') from abc;
^
HINT: No function matches the given name and argument types. You
might need to add explicit type casts.

But, with the attached patch:

rhaas=# create function xyz(smallint) returns smallint as $$select
$1$$ language sql;
CREATE FUNCTION
rhaas=# select xyz(5);
xyz
-----
5
(1 row)

rhaas=# create table abc (a int);
CREATE TABLE
rhaas=# select lpad(a, 5, '0') from abc;
lpad
------
(0 rows)

There is only one regression test output change:

-ERROR: function int2um(integer) does not exist
+ERROR: function int2um(smallint) requires run-time type coercion

The replacement error message is coming from lookup_agg_function(),
which calls func_get_detail() and then imposes stricter checks on the
result. In the old coding func_get_detail() didn't even identify a
candidate, whereas now it does but lookup_agg_function() decides that
it isn't usable. This seems OK to me, and the error message doesn't
seem any worse either.

So that's the good news. The not-so-good news is that to make it
work, I had to modify make_fn_arguments() to pass COERCION_ASSIGNMENT
rather than COERCION_IMPLICIT to coerce_type(). Otherwise, parsing
succeeds, but then things fall over later when we try to identify the
coercion function to be used. The reason I'm nervous about is because
the code now looks like this:

node = coerce_type(pstate,
node,
actual_arg_types[i],
declared_arg_types[i], -1,
COERCION_ASSIGNMENT,
COERCE_IMPLICIT_CAST,
-1);

It seems wrong to pass COERCE_IMPLICIT_CAST along with
COERCION_ASSIGNMENT, because COERCE_IMPLICIT_CAST controls the way
that the cast is *displayed*, and COERCE_IMPLICIT_CAST means don't
display it at all. That seems like it could create a problem if we
used this new type of argument matching (because there was only one
function with a given name) and then later someone added a second one.
I thought, for example, that there might be a problem with the way
views are reverse-parsed, but it actually seems to work OK, at least
in the case I can think of to test:

rhaas=# create table look_ma (a int, b text);
CREATE TABLE
rhaas=# create view look_ma_view (a, b) as select lpad(a, 5), lpad(b,
5) from look
CREATE VIEW
rhaas=# \d+ look_ma_view
View "public.look_ma_view"
Column | Type | Modifiers | Storage | Description
--------+------+-----------+----------+-------------
a | text | | extended |
b | text | | extended |
View definition:
SELECT lpad(look_ma.a::text, 5) AS a, lpad(look_ma.b, 5) AS b
FROM look_ma;

Note that where the assignment cast was used to find the function to
call, we get a cast in the deparsed query, but in the case where we
used an implicit cast, we don't. This is exactly as I would have
hoped. I fear there might be a subtler case where there is an issue,
but so far I haven't been able to find it. If there in fact is an
issue, I think we can fix it by pushing the logic up from
func_match_argtypes where it is now into func_get_detail;
func_get_detail can then return some indication to the caller
indicating which make_fn_arguments behavior is required. However, I
don't want to add that complexity unless we actually need it for
something.

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
tom-casting.patch application/octet-stream 2.4 KB

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-19 23:19:06
Message-ID: 1353367146.6670.20.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2012-11-06 at 10:57 -0500, Robert Haas wrote:
> But, with the attached patch:
>
> rhaas=# create function xyz(smallint) returns smallint as $$select
> $1$$ language sql;
> CREATE FUNCTION
> rhaas=# select xyz(5);
> xyz
> -----
> 5
> (1 row)
>
> rhaas=# create table abc (a int);
> CREATE TABLE
> rhaas=# select lpad(a, 5, '0') from abc;
> lpad
> ------
> (0 rows)

I continue to be of the opinion that allowing this second case to work
is not desirable.


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-21 14:42:21
Message-ID: CA+TgmoaS6Z-=Mq-vw+kcsY512z8QOhi8_GXbfSFtF2EcwVyMvw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 19, 2012 at 6:19 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> On Tue, 2012-11-06 at 10:57 -0500, Robert Haas wrote:
>> But, with the attached patch:
>>
>> rhaas=# create function xyz(smallint) returns smallint as $$select
>> $1$$ language sql;
>> CREATE FUNCTION
>> rhaas=# select xyz(5);
>> xyz
>> -----
>> 5
>> (1 row)
>>
>> rhaas=# create table abc (a int);
>> CREATE TABLE
>> rhaas=# select lpad(a, 5, '0') from abc;
>> lpad
>> ------
>> (0 rows)
>
> I continue to be of the opinion that allowing this second case to work
> is not desirable.

1. Why?

2. What's your counter-proposal?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-21 15:27:25
Message-ID: CA+U5nM+ZrEopxEYEw_d3aq87jHf9cshNyuAj4qFPMCfyLqZmCw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 29 August 2012 23:39, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> The main downside I can see is that code that used to work is likely
> to stop working as soon as someone creates a potential overloading
> situation. Worse, the error message could be pretty confusing, since
> if you had been successfully calling f(smallint) with f(42), you'd get
> "f(integer) does not exist", not something like "f() is ambiguous",
> after adding f(float8) to the mix. This seems related to the confusing
> changes in regression test cases that I got in my experiments yesterday.
> This may be sufficient reason to reject the idea, since the very last
> thing we need in this area is any degradation in the relevance of the
> error messages.

It would be useful if we issued a NOTICE when an ambiguity is
introduced, rather than when using it.

Like Bison's reporting of reduce conflicts.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-21 16:01:32
Message-ID: 4716.1353513692@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> It would be useful if we issued a NOTICE when an ambiguity is
> introduced, rather than when using it.

I think that's pie in the sky, since whether there is an ambiguity
will depend not only on what set of functions exists, but what the
caller's search_path is.

regards, tom lane


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-21 22:10:01
Message-ID: 50AD5139.10607@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/21/12 9:42 AM, Robert Haas wrote:
> On Mon, Nov 19, 2012 at 6:19 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>> On Tue, 2012-11-06 at 10:57 -0500, Robert Haas wrote:
>>> But, with the attached patch:
>>>
>>> rhaas=# create function xyz(smallint) returns smallint as $$select
>>> $1$$ language sql;
>>> CREATE FUNCTION
>>> rhaas=# select xyz(5);
>>> xyz
>>> -----
>>> 5
>>> (1 row)
>>>
>>> rhaas=# create table abc (a int);
>>> CREATE TABLE
>>> rhaas=# select lpad(a, 5, '0') from abc;
>>> lpad
>>> ------
>>> (0 rows)
>>
>> I continue to be of the opinion that allowing this second case to work
>> is not desirable.
>
> 1. Why?

Because a strongly-typed system should not cast numbers to strings
implicitly. Does the equivalent of the lpad case work in any other
strongly-typed programming language?

> 2. What's your counter-proposal?

Leave things as they are.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-22 15:17:16
Message-ID: 26870.1353597436@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> On 11/21/12 9:42 AM, Robert Haas wrote:
>> On Mon, Nov 19, 2012 at 6:19 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>>> I continue to be of the opinion that allowing this second case to work
>>> is not desirable.

>> 1. Why?

> Because a strongly-typed system should not cast numbers to strings
> implicitly. Does the equivalent of the lpad case work in any other
> strongly-typed programming language?

The argument here is basically between ease of use and ability to detect
common programming mistakes. It's not clear to me that there is any
principled way to make such a tradeoff, because different people can
reasonably put different weights on those two goals.

>> 2. What's your counter-proposal?

> Leave things as they are.

FWIW, I agree with Peter. It's been like this for a long time and
whether the system would be easier to use or not, it would definitely
be uglier and harder to explain. ("Assignment casts are used only
for assignments ... except when they aren't.")

I notice that the proposed patch is devoid of documentation. Perhaps
after Robert is done writing the necessary changes to the SGML docs
about type conversions and casts, he'll agree this is pretty ugly.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-25 23:31:46
Message-ID: CA+TgmoZPVaXFQR5wBG5e_bMPcv9tqt3akd-uZz7z=9CUZjppvQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, Nov 21, 2012 at 5:10 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> Because a strongly-typed system should not cast numbers to strings
> implicitly. Does the equivalent of the lpad case work in any other
> strongly-typed programming language?

Does any other strongly-typed programming language distinguish between
explicit, assignment, and implicit casting the way that PostgreSQL
does? In order for the equivalent of the lpad case to exist in some
other programming language, I think they'd need to make that
distinction, and AFAICT no one does that. The only other programming
language I know of in which you can define what it means to cast
between two data types is C++, and it's not generally considered one
of that languages better features. AFAICT, they have implicit casts
and explicit casts, but nothing intermediate. There are dynamic_cast,
static_cast, and reinterpret_cast as well, but those trade-off
efficiency for the possibility of a segmentation fault, and have
nothing to do with the context in which the cast can be applied
automatically.

So I think the answer to your question is probably no, purely on the
grounds that we have set a new world record for byzantine casting
systems. A more fair comparison might be to look at what other SQL
systems allow. Oracle, MySQL, and SQL Server all permit implicit
casting between integer and text, and a call to LPAD with an integer
first argument works just fine in both Oracle and MySQL. It doesn't
work in SQL server, but that's only because SQL server doesn't have it
as a built-in function. FWICT, there's no general problem with
passing an integer to a function that expects varchar in any
mainstream RDBMS other than PostgreSQL.

>> 2. What's your counter-proposal?
>
> Leave things as they are.

I'd be fine with that if we had a system that allows users to pick the
behavior that they want in their particular environment, but in fact
our existing system is extremely inflexible. If you start adding
additional implicit casts to the system, you get failures trying to
invoke commonly-used system functions, because we've got overloaded
versions of them precisely to work around the fact that our casting
rules are more restrictive than real-world users want them to be. If
that's not prima facie evidence that the system doesn't work well in
the real world, I'm not sure what would qualify.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-25 23:36:57
Message-ID: CA+TgmoayyZ2ZCzaSb4SaWAeQFOTk4+8Yc3XR4e4Crx=an7jkXw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Nov 22, 2012 at 10:17 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> The argument here is basically between ease of use and ability to detect
> common programming mistakes. It's not clear to me that there is any
> principled way to make such a tradeoff, because different people can
> reasonably put different weights on those two goals.

I think that is true. But for whatever it's worth, and at the risk of
beating a horse that seems not to be dead yet in spite of the fact
that I feel I've already administered one hell of a beating, the LPAD
case is unambiguous, and therefore it is hard to see what sort of
programming mistake we are protecting users against. If there's only
one function called bob, and the user says bob(x), it is hard to see
what behavior, other than calling bob with x as an argument, would be
even mildly sensible. (Yes, OK, there are two lpad functions, but as
you pointed out previously, they take different numbers of arguments,
so the point still stands.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-25 23:42:45
Message-ID: 50B2ACF5.9010607@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


> I'd be fine with that if we had a system that allows users to pick the
> behavior that they want in their particular environment, but in fact
> our existing system is extremely inflexible. If you start adding
> additional implicit casts to the system, you get failures trying to
> invoke commonly-used system functions, because we've got overloaded
> versions of them precisely to work around the fact that our casting
> rules are more restrictive than real-world users want them to be. If
> that's not prima facie evidence that the system doesn't work well in
> the real world, I'm not sure what would qualify.

They don't even work particularly well for db hackers. I went down the
rabbit hole of trying to make TEXT = CITEXT be a default cast to CITEXT,
and after several days of struggling with breaking system functions, I
gave up.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-25 23:46:52
Message-ID: 9405.1353887212@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Nov 22, 2012 at 10:17 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> The argument here is basically between ease of use and ability to detect
>> common programming mistakes. It's not clear to me that there is any
>> principled way to make such a tradeoff, because different people can
>> reasonably put different weights on those two goals.

> I think that is true. But for whatever it's worth, and at the risk of
> beating a horse that seems not to be dead yet in spite of the fact
> that I feel I've already administered one hell of a beating, the LPAD
> case is unambiguous, and therefore it is hard to see what sort of
> programming mistake we are protecting users against.

I think we're talking past each other here. It is unarguable that
(as long as there's only one LPAD function) there is only one possible
non-error interpretation. However, you are ignoring the real
possibility that perhaps the situation *is* an error: maybe the user
typed the wrong function name, or the wrong field name, or simply
misunderstands what the function is meant to do. If it is a typo then
complaining about the datatype mismatch is a good thing to do. If it
is intentional, then requiring an explicit cast makes it clear to all
concerned that what's wanted is to convert the non-string value to a
string and then perform a string-ish operation on it.

regards, tom lane


From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-26 00:05:27
Message-ID: CAEYLb_U7iOR1WdcEqEy67=PE=CwfJ=0aG6erZ225DKuv9Ls1Tg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 25 November 2012 23:31, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> The only other programming
> language I know of in which you can define what it means to cast
> between two data types is C++, and it's not generally considered one
> of that languages better features. AFAICT, they have implicit casts
> and explicit casts, but nothing intermediate.

Well, you can make your class copy-constructable by providing a
constructor (and a copy-assignment operator) whose only argument is,
say, an int. In additional to that, you could potentially define a
conversion operator, which will make the class implicitly cast back
into an int. That is kind of a big distinction, because it doesn't
have to go both ways, and in fact it usually doesn't - plenty of
working C++ programmers don't know what a conversion operator is, but
they could all tell you how to get this behaviour:

MyClass foo = 5; // actually calls copy constructor - equivalent to
MyClass foo(5);
foo = 4; // This calls copy assignment operator

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-26 00:21:32
Message-ID: CA+TgmoZ_vx8hO6bkc4Awmc3ZOaLmx34HTSkyTd8OMEZyqin5Dg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Nov 25, 2012 at 6:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I think we're talking past each other here. It is unarguable that
> (as long as there's only one LPAD function) there is only one possible
> non-error interpretation. However, you are ignoring the real
> possibility that perhaps the situation *is* an error: maybe the user
> typed the wrong function name, or the wrong field name, or simply
> misunderstands what the function is meant to do. If it is a typo then
> complaining about the datatype mismatch is a good thing to do. If it
> is intentional, then requiring an explicit cast makes it clear to all
> concerned that what's wanted is to convert the non-string value to a
> string and then perform a string-ish operation on it.

Sure, in theory that is true, but no other RDBMS that I know about
feels a need to error out in that situation. I'm skeptical of the
contention that we're smarter than everyone else. Moreover, if
implicit casts to string are a categorically bad idea, why do we allow
them (via various evil hacks) for quote_literal(), concat(), and ||?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-26 00:24:26
Message-ID: CA+TgmoajD=0bKEetUzWkR31WuOu=rZdm6mqDoC86ayeh3_3p8w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Nov 25, 2012 at 7:05 PM, Peter Geoghegan <peter(at)2ndquadrant(dot)com> wrote:
> Well, you can make your class copy-constructable by providing a
> constructor (and a copy-assignment operator) whose only argument is,
> say, an int. In additional to that, you could potentially define a
> conversion operator, which will make the class implicitly cast back
> into an int. That is kind of a big distinction, because it doesn't
> have to go both ways, and in fact it usually doesn't - plenty of
> working C++ programmers don't know what a conversion operator is, but
> they could all tell you how to get this behaviour:
>
> MyClass foo = 5; // actually calls copy constructor - equivalent to
> MyClass foo(5);
> foo = 4; // This calls copy assignment operator

I remember this sort of thing un-fondly from my C++ days, but it
doesn't make me like our current behavior any better. As in C++, we
seem to have created a system where the only way to get even locally
sensible behavior is to throw large piles of hackery at the problem.
Getting the behavior you want globally cannot be obtained at any
price.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-26 00:36:31
Message-ID: CAEYLb_XBBE5xUZHcbym+jJ-+imBD6O_ym2ipELwiopVF9ESH5Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 26 November 2012 00:24, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I remember this sort of thing un-fondly from my C++ days, but it
> doesn't make me like our current behavior any better.

You can also make a constructor with a single argument "explicit", and
thereby prevent implicit conversions. So yes, C++ distinguishes
between
explicit, assignment, and implicit casting in a way that is, in broad
strokes, at least as sophisticated as PostgreSQL.

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-26 00:39:22
Message-ID: 10249.1353890362@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> So I think the answer to your question is probably no, purely on the
> grounds that we have set a new world record for byzantine casting
> systems. A more fair comparison might be to look at what other SQL
> systems allow. Oracle, MySQL, and SQL Server all permit implicit
> casting between integer and text, and a call to LPAD with an integer
> first argument works just fine in both Oracle and MySQL. It doesn't
> work in SQL server, but that's only because SQL server doesn't have it
> as a built-in function. FWICT, there's no general problem with
> passing an integer to a function that expects varchar in any
> mainstream RDBMS other than PostgreSQL.

I think this is ignoring the fact that we have an extensible type
system, and thus a lot more room for problems if we allow too many
implicit casts.

It might also be worth noting that some of this complexity comes from
the SQL standard. It is at least arguable that the three-way coercion
distinction exists in the standard: they have got different rules for
what happens in an explicit CAST, in an assignment context, and in plain
expressions. So it's not that relevant whether other mainstream
programming languages have comparable constructs.

Having said that, though, I think you could make an argument that
there's some support for this idea in the SQL standard. In SQL99
10.4 <routine invocation>, it appears that once you've identified
a target routine to be called, you're supposed to use the "store
assignment" rules to decide how to convert the supplied expression(s)
to the parameter data type(s). However, it's not clear to me that
that should be taken as conclusive, because the $64 question here
is exactly how sure you are in your identification of the target
routine. SQL99 doesn't seem to allow anywhere near as much function
overloading as we do --- and of course they have no notion of
overloaded or user-defined operators at all. As far as I can tell
from 10.4, you are supposed to be able to identify the target routine
without any consideration of the actual parameters' types.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-26 01:16:13
Message-ID: CA+Tgmob0nkrrH1-7ny+rYDkas5XzpVHVxgdfqs2kmLS-VbBJ8Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Nov 25, 2012 at 7:36 PM, Peter Geoghegan <peter(at)2ndquadrant(dot)com> wrote:
> On 26 November 2012 00:24, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I remember this sort of thing un-fondly from my C++ days, but it
>> doesn't make me like our current behavior any better.
>
> You can also make a constructor with a single argument "explicit", and
> thereby prevent implicit conversions. So yes, C++ distinguishes
> between
> explicit, assignment, and implicit casting in a way that is, in broad
> strokes, at least as sophisticated as PostgreSQL.

OK, I stand corrected.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-26 02:08:24
Message-ID: CA+TgmoYK7n4_HPzufUMxwSaSYo7fZ_Xr-fJ+0BG7AfUFCnEMdA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Nov 25, 2012 at 6:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I think we're talking past each other here. It is unarguable that
> (as long as there's only one LPAD function) there is only one possible
> non-error interpretation. However, you are ignoring the real
> possibility that perhaps the situation *is* an error: maybe the user
> typed the wrong function name, or the wrong field name, or simply
> misunderstands what the function is meant to do.

Yeah, but of course all of that could also be true even if the
argument types match exactly, too. I mean, I won't deny that the
presence of an exact argument-type match lends a little bit of
additional confidence that the call is the one the user intended, but
a user is more likely to confuse LPAD with RPAD than they are to
confuse either of them with a function that does something completely
different but is spelled almost the same (LDAP?).

I think it's also worth reiterating that, whatever you may think of
the LPAD case, there is a significant win here in allowing things like
foo(3) to match foo(smallint) in the absence of any other
foo-of-one-argument. Nobody's even attempted to argue that the
current behavior in that situation is desirable, and the fact that
this would fix it in nearly all of the cases that anyone cares about
strikes me as a distinct point in its favor.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-26 02:08:32
Message-ID: CA+TgmoaOtQAJr5OzZCNzwr_=h_uJURa+7hFAafc4NT3a9i4ALg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Nov 25, 2012 at 7:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I think this is ignoring the fact that we have an extensible type
> system, and thus a lot more room for problems if we allow too many
> implicit casts.

I don't deny that.

> It might also be worth noting that some of this complexity comes from
> the SQL standard. It is at least arguable that the three-way coercion
> distinction exists in the standard: they have got different rules for
> what happens in an explicit CAST, in an assignment context, and in plain
> expressions. So it's not that relevant whether other mainstream
> programming languages have comparable constructs.
>
> Having said that, though, I think you could make an argument that
> there's some support for this idea in the SQL standard. In SQL99
> 10.4 <routine invocation>, it appears that once you've identified
> a target routine to be called, you're supposed to use the "store
> assignment" rules to decide how to convert the supplied expression(s)
> to the parameter data type(s). However, it's not clear to me that
> that should be taken as conclusive, because the $64 question here
> is exactly how sure you are in your identification of the target
> routine. SQL99 doesn't seem to allow anywhere near as much function
> overloading as we do --- and of course they have no notion of
> overloaded or user-defined operators at all. As far as I can tell
> from 10.4, you are supposed to be able to identify the target routine
> without any consideration of the actual parameters' types.

FWIW, neither MySQL nor Oracle supports function overloading for plain
functions, so the question doesn't arise for them in the context of
something like LPAD(). Oracle does support overloading for package
functions, and I'm not sure exactly how they identify candidate
functions in that context, but they do complain about ambiguous calls
in some circumstances.

Personally, I'm not sure that anyone has come up with an altogether
satisfactory solution to the function overloading problem. If you
have an exact type match in every argument position for one of the
possible candidate functions, then surely any system that permits
overloading at all is going to pick that candidate. Conversely, if
you have one or many candidates all of which are completely
incompatible with the actual argument types, then any system is going
to fail. The tension is all around what to do when you have several
candidates which are about equally good. You can either reject the
call as ambiguous (which will sometimes annoy users who don't feel
that a cast should be needed) or you can use some sort of tiebreak
system to pick a candidate (which risks picking a different function
than the user expected). I tend to think it's better to err on the
side of the former, and I think we do, but there might nonetheless be
some for improvement in that area, with due regard for the possibility
of breaking currently-working applications.

That, however, is a separate question from what's under discussion
here, because the case at issue for the proposed patch is the one in
which only one possible candidate exists, and the question is whether
we ought to allow the use of assignment casts to allow the call to
work rather than fail, NOT which of several overloaded functions we
ought to pick. In any situation in which overloading is in use, the
patch as proposed changes nothing. I'm not generally very good at
interpreting the SQL standard text, but if it says that you ought to
use assignment casts to match actual argument types to the chosen
candidate function, then that seems like it's advocating for
essentially the same position that you arrived at independently and
that the patch also takes, which furthermore happens to be compatible
with what other RDBMS systems do, at least in the no-overloading case.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 09:46:58
Message-ID: 1354009618.10198.177.camel@jdavis-laptop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, 2012-11-25 at 21:08 -0500, Robert Haas wrote:
> That, however, is a separate question from what's under discussion
> here, because the case at issue for the proposed patch is the one in
> which only one possible candidate exists, and the question is whether
> we ought to allow the use of assignment casts to allow the call to
> work rather than fail, NOT which of several overloaded functions we
> ought to pick. In any situation in which overloading is in use, the
> patch as proposed changes nothing. I'm not generally very good at
> interpreting the SQL standard text, but if it says that you ought to
> use assignment casts to match actual argument types to the chosen
> candidate function, then that seems like it's advocating for
> essentially the same position that you arrived at independently and
> that the patch also takes, which furthermore happens to be compatible
> with what other RDBMS systems do, at least in the no-overloading case.

Let's say you have only one function "foo". All your queries are coded
into your application, and everything works fine, using assignment casts
where necessary.

Then the user is foolish enough to CREATE FUNCTION foo... and now their
queries start failing left and right.

In other words, "only one possible candidate exists" should be followed
by "right now" to be more precise.

That's a major violation of the principle of least astonishment, that
CREATE FUNCTION could cause such a disaster. I know that it can now, but
what you're proposing will come into play much more frequently because
most people start off with just one function by a particular name and
define more as needed.

If we do something like this, I think we should explicitly opt out of
the overloading feature at DDL time (somewhat like what Simon suggested
in another reply). E.g. CREATE {UNIQUE|OVERLOADED} FUNCTION ...

I'm not proposing that; in general I am very wary of changes to the type
system. I'm just saying that, if we do have special rules, we should
have a way to make sure that users know when the rules are changing.

Regards,
Jeff Davis


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 09:59:04
Message-ID: 1354010344.10198.188.camel@jdavis-laptop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 2012-11-21 at 15:27 +0000, Simon Riggs wrote:
> It would be useful if we issued a NOTICE when an ambiguity is
> introduced, rather than when using it.
>
> Like Bison's reporting of reduce conflicts.

This brings up a very important point, which is that a lot of the code
is frozen in applications yet invisible at DDL time. So we have to be
careful that DDL changes have a reasonable impact on the ability to
continue to compile and execute the previously-working SQL received from
the applications.

In other words, as I said in another reply, we want to avoid cases where
something seemingly innocuous (like creating a function) causes
previously-working SQL to fail due to ambiguity.

As Tom said, detecting the ambiguity at DDL time is not easy, so I'm not
suggesting that. And I know that creating a function can already cause
previously-working SQL to fail. I'm just saying we should be careful of
these situations and not make them more likely than necessary.

Regards,
Jeff Davis


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 10:10:40
Message-ID: CAFj8pRDjeovaTsW6JkaTRpWQssSy-u8DqBLankC+NnWcFPhENw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello all

2012/11/27 Jeff Davis <pgsql(at)j-davis(dot)com>:
> On Wed, 2012-11-21 at 15:27 +0000, Simon Riggs wrote:
>> It would be useful if we issued a NOTICE when an ambiguity is
>> introduced, rather than when using it.
>>
>> Like Bison's reporting of reduce conflicts.
>
> This brings up a very important point, which is that a lot of the code
> is frozen in applications yet invisible at DDL time. So we have to be
> careful that DDL changes have a reasonable impact on the ability to
> continue to compile and execute the previously-working SQL received from
> the applications.
>
> In other words, as I said in another reply, we want to avoid cases where
> something seemingly innocuous (like creating a function) causes
> previously-working SQL to fail due to ambiguity.
>
> As Tom said, detecting the ambiguity at DDL time is not easy, so I'm not
> suggesting that. And I know that creating a function can already cause
> previously-working SQL to fail. I'm just saying we should be careful of
> these situations and not make them more likely than necessary.
>

from my view - a current design works well, but for someone who see pg
first time, there can be lot of surprises.

a) PostgreSQL reports missing functions -- but there are issue in parameters
b) PostgreSQL requests explicit typing string literals to text -- and
again it reports not informative message

so minimally we can enhance a error messages

Regards

Pavel

> Regards,
> Jeff Davis
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 16:40:37
Message-ID: 20121127164037.GA1820@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 27, 2012 at 01:59:04AM -0800, Jeff Davis wrote:
> On Wed, 2012-11-21 at 15:27 +0000, Simon Riggs wrote:
> > It would be useful if we issued a NOTICE when an ambiguity is
> > introduced, rather than when using it.
> >
> > Like Bison's reporting of reduce conflicts.
>
> This brings up a very important point, which is that a lot of the code
> is frozen in applications yet invisible at DDL time. So we have to be
> careful that DDL changes have a reasonable impact on the ability to
> continue to compile and execute the previously-working SQL received from
> the applications.
>
> In other words, as I said in another reply, we want to avoid cases where
> something seemingly innocuous (like creating a function) causes
> previously-working SQL to fail due to ambiguity.
>
> As Tom said, detecting the ambiguity at DDL time is not easy, so I'm not
> suggesting that. And I know that creating a function can already cause
> previously-working SQL to fail. I'm just saying we should be careful of
> these situations and not make them more likely than necessary.

For me this highlights why looking at how application languages handle
overloading might not be as relevant --- most language don't have
possible-conflicting functions being created at run-time like a database
does. The parallels in how other databases treat overloading is
relevant.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 16:52:48
Message-ID: CAFj8pRCP55E_jNOH+m5kbW2WjS4DAZoDNtnTXSopdfthnfYr5A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2012/11/27 Bruce Momjian <bruce(at)momjian(dot)us>:
> On Tue, Nov 27, 2012 at 01:59:04AM -0800, Jeff Davis wrote:
>> On Wed, 2012-11-21 at 15:27 +0000, Simon Riggs wrote:
>> > It would be useful if we issued a NOTICE when an ambiguity is
>> > introduced, rather than when using it.
>> >
>> > Like Bison's reporting of reduce conflicts.
>>
>> This brings up a very important point, which is that a lot of the code
>> is frozen in applications yet invisible at DDL time. So we have to be
>> careful that DDL changes have a reasonable impact on the ability to
>> continue to compile and execute the previously-working SQL received from
>> the applications.
>>
>> In other words, as I said in another reply, we want to avoid cases where
>> something seemingly innocuous (like creating a function) causes
>> previously-working SQL to fail due to ambiguity.
>>
>> As Tom said, detecting the ambiguity at DDL time is not easy, so I'm not
>> suggesting that. And I know that creating a function can already cause
>> previously-working SQL to fail. I'm just saying we should be careful of
>> these situations and not make them more likely than necessary.
>
> For me this highlights why looking at how application languages handle
> overloading might not be as relevant --- most language don't have
> possible-conflicting functions being created at run-time like a database
> does. The parallels in how other databases treat overloading is
> relevant.

it is a basic problem - PostgreSQL has unique possibilities -
polymorphic parameters and almost all databases doesn't support
overloading

probably our system is very similar to Haskell

>
> --
> Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
> EnterpriseDB http://enterprisedb.com
>
> + It's impossible for everything to be true. +
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 17:07:54
Message-ID: CAHyXU0w1d-3Mbu2sU6v4neywJ5ntW3ATV6r6-4N4vLgMBNRYEQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 27, 2012 at 10:52 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> it is a basic problem - PostgreSQL has unique possibilities -
> polymorphic parameters and almost all databases doesn't support
> overloading

Speaking of polymorphism, why not just implement lpad()'s first
argument as 'anyelement'? ISTM this comes up in mostly in porting
code from other database that is utilizing standard sql functions.
This should be appropriate when the function's basic functionality and
argument signature is not dependent on input type (constrast:
to_timestamp) and there is a good portability case to be made.
Essentially, this applies to a handful of string processing routines
AFAICT.

merlin


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 17:19:43
Message-ID: CA+TgmobXax0-tODdH8ONoQK3AbxUewLM8GUOpFd-Bx0nZc2E2A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 27, 2012 at 4:46 AM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> Let's say you have only one function "foo". All your queries are coded
> into your application, and everything works fine, using assignment casts
> where necessary.
>
> Then the user is foolish enough to CREATE FUNCTION foo... and now their
> queries start failing left and right.
>
> In other words, "only one possible candidate exists" should be followed
> by "right now" to be more precise.
>
> That's a major violation of the principle of least astonishment, that
> CREATE FUNCTION could cause such a disaster. I know that it can now, but
> what you're proposing will come into play much more frequently because
> most people start off with just one function by a particular name and
> define more as needed.

I admit that there are cases where this could happen, and that it will
happen a little more than it does now. But, as you say, this can
happen now, and yet we get very few if any complaints about it,
whereas we get regular complaints about the need to insert casts that
other database systems do not require. The fact is that most
functions are not overloaded, so the esoterica of overloading affect
only a tiny number of relatively sophisticated users. The need for
extra casts cuts a much broader swath through our user base.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 17:47:11
Message-ID: 24955.1354038431@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I admit that there are cases where this could happen, and that it will
> happen a little more than it does now. But, as you say, this can
> happen now, and yet we get very few if any complaints about it,
> whereas we get regular complaints about the need to insert casts that
> other database systems do not require. The fact is that most
> functions are not overloaded, so the esoterica of overloading affect
> only a tiny number of relatively sophisticated users. The need for
> extra casts cuts a much broader swath through our user base.

I find this argument a bit specious. It probably is true that most
*user defined* functions aren't overloaded --- but that's not so true
for system-defined functions, and even less true for operators. So
the parser's behavior with overloaded calls affects all users, whether
they know it or not. It also affects developers, in that adding a
new overloaded version of a system function (that previously wasn't
overloaded) could actually reduce the number of cases for which the
function works without an explicit cast.

We have got to be really careful with changing the parser's behavior
here, or we're going to break cases that work today.

regards, tom lane


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 18:45:21
Message-ID: 1354041921.10198.204.camel@jdavis-laptop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2012-11-27 at 12:19 -0500, Robert Haas wrote:
> I admit that there are cases where this could happen, and that it will
> happen a little more than it does now. But, as you say, this can
> happen now, and yet we get very few if any complaints about it,
> whereas we get regular complaints about the need to insert casts that
> other database systems do not require. The fact is that most
> functions are not overloaded, so the esoterica of overloading affect
> only a tiny number of relatively sophisticated users. The need for
> extra casts cuts a much broader swath through our user base.

Well, I did offer a suggestion that would make your idea safer, which is
to explicitly opt out of the overloading feature at the time the
function is created, rather than making it implicit based on how many
functions happen to have the same name.

The fact that it can only hurt sophisticated users is not convincing to
me. For one thing, our users are programmers, so they should all feel
comfortable defining their own functions, and I don't want to make them
any less so. Next, sophisticated users also make mistakes.

I could also make a security argument. Even today, any user who can
create a function in your search path can make your queries start
failing. If we locked down most of the system-defined functions as
non-overloadable, and allowed users to do the same for their functions
(maybe even the default one day?), then that would greatly reduce the
exposure.

The current strictness of the overloaded functions tends to make users
more explicit about argument types, which reduces the chance of problems
at the expense of usability and compatibility. Not ideal, but if we make
it more permissive then we are permanently stuck with less information
about what types the user intended and which function they intended to
call. In such an extensible system, that worries me on several fronts.

That being said, I'm not outright in opposition to the idea of making
improvements like this, I just think we should do so cautiously.

Regards,
Jeff Davis


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 18:50:55
Message-ID: CA+TgmoawFCxquNnRZkmDoFBD_dyeKLp7JdKV2jyNo53zVCV8kQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 27, 2012 at 12:47 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I admit that there are cases where this could happen, and that it will
>> happen a little more than it does now. But, as you say, this can
>> happen now, and yet we get very few if any complaints about it,
>> whereas we get regular complaints about the need to insert casts that
>> other database systems do not require. The fact is that most
>> functions are not overloaded, so the esoterica of overloading affect
>> only a tiny number of relatively sophisticated users. The need for
>> extra casts cuts a much broader swath through our user base.
>
> I find this argument a bit specious. It probably is true that most
> *user defined* functions aren't overloaded --- but that's not so true
> for system-defined functions, and even less true for operators. So
> the parser's behavior with overloaded calls affects all users, whether
> they know it or not. It also affects developers, in that adding a
> new overloaded version of a system function (that previously wasn't
> overloaded) could actually reduce the number of cases for which the
> function works without an explicit cast.
>
> We have got to be really careful with changing the parser's behavior
> here, or we're going to break cases that work today.

Well, the whole point of writing the patch the way I did was that it
*doesn't* break any cases that work today.

But as to your point about the system catalogs, it is true that adding
an additional function could reduce the number of cases where things
work today. But I think in many cases it would eliminate the need for
overloading that we already have, and simplify things for future
developers. Right now, quote_literal() allows implicit casts to text
by having a second version that takes any anyelement argument; on the
other hand, concat() allows implicit casts to text by accepting "any"
rather than text as an argument; and || allows implicit casts to text
by defining operators for anynonarray || text, text || anynonarray,
and text || text. So we've got three quite different methods to
create implicit-cast-to-text behavior in particular cases. That's got
developer complexity too, and while this proposal wouldn't do anything
about the third case since || actually sometimes has a different
meaning, namely array concatenation, the first two wouldn't need
overloading any more. They'd just work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 19:13:43
Message-ID: CA+TgmoZF+s5WrobBX3WvgftSWM_gkgt-fO24xZFZ0V5qf0vuCA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 27, 2012 at 1:45 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> On Tue, 2012-11-27 at 12:19 -0500, Robert Haas wrote:
>> I admit that there are cases where this could happen, and that it will
>> happen a little more than it does now. But, as you say, this can
>> happen now, and yet we get very few if any complaints about it,
>> whereas we get regular complaints about the need to insert casts that
>> other database systems do not require. The fact is that most
>> functions are not overloaded, so the esoterica of overloading affect
>> only a tiny number of relatively sophisticated users. The need for
>> extra casts cuts a much broader swath through our user base.
>
> Well, I did offer a suggestion that would make your idea safer, which is
> to explicitly opt out of the overloading feature at the time the
> function is created, rather than making it implicit based on how many
> functions happen to have the same name.
>
> The fact that it can only hurt sophisticated users is not convincing to
> me. For one thing, our users are programmers, so they should all feel
> comfortable defining their own functions, and I don't want to make them
> any less so. Next, sophisticated users also make mistakes.
>
> I could also make a security argument. Even today, any user who can
> create a function in your search path can make your queries start
> failing. If we locked down most of the system-defined functions as
> non-overloadable, and allowed users to do the same for their functions
> (maybe even the default one day?), then that would greatly reduce the
> exposure.
>
> The current strictness of the overloaded functions tends to make users
> more explicit about argument types, which reduces the chance of problems
> at the expense of usability and compatibility. Not ideal, but if we make
> it more permissive then we are permanently stuck with less information
> about what types the user intended and which function they intended to
> call. In such an extensible system, that worries me on several fronts.
>
> That being said, I'm not outright in opposition to the idea of making
> improvements like this, I just think we should do so cautiously.

Fair enough. I certainly admit that I wouldn't like to release with
this code in place and then find out that it's got some critical flaw,
security or otherwise. A couple of embarrassing bugs have been found
recently in patches I wrote and committed, and I'm not looking to up
that number. That having been said, I remain unconvinced that any of
the things proposed so far are compelling reasons not to do this.
That doesn't mean there aren't any such reasons, but I am personally
unconvinced that we've found them yet. Most of the arguments so far
advanced seem to involve overloading (where this proposal doesn't
change anything vs. today); I think you're the only one who has
proposed a situation where it causes a problem (namely, a function
that is overloaded later) but in my personal opinion that's not going
to happen often enough to justify the amount of user pain the current
system imposes. Of course that's a judgement call.

I do think that applying some kind of explicit flag to the function
indicating whether it should allow implicit assignment
casting/implicit casting to text/overloading/whatever is a possibly
interesting alternative. It seems clear from our system catalogs that
implicit casting to text is sometimes a desirable behavior and
sometimes not, so it's reasonable to think that perhaps we should put
that under user control. What I like about "my" proposal (really
Tom's idea) is that it seems like it solves a pretty high percentage
of the problem cases without requiring any explicit user action. I
actually suspect we could get the "right" behavior even more often by
attaching flags to the function or argument position, but that would
also put more of the onus on the user to get the flags right -- and we
might not even agree amongst ourselves on how the flags should be set.
The fact that quote_literal() allows (by the expedient of
overloading) implicit casts to text and that lpad() does not seems
fairly random to me in hindsight; is there a general principle there
that we'd all sign on to? The nice thing about this proposal is that
it doesn't require any explicit user action. Of course that's no help
if it does the wrong thing, but since it only fixes cases that are
unambiguous and which currently fail, it's hard for me to see how
that's a real danger. That doesn't mean there ISN'T a real danger,
but I want to make sure that if we don't do this we have a clear and
understandable reason, and not just bad memories of the last time we
made a change in this area.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 19:26:14
Message-ID: 26904.1354044374@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> But as to your point about the system catalogs, it is true that adding
> an additional function could reduce the number of cases where things
> work today. But I think in many cases it would eliminate the need for
> overloading that we already have, and simplify things for future
> developers. Right now, quote_literal() allows implicit casts to text
> by having a second version that takes any anyelement argument; on the
> other hand, concat() allows implicit casts to text by accepting "any"
> rather than text as an argument; and || allows implicit casts to text
> by defining operators for anynonarray || text, text || anynonarray,
> and text || text. So we've got three quite different methods to
> create implicit-cast-to-text behavior in particular cases. That's got
> developer complexity too, and while this proposal wouldn't do anything
> about the third case since || actually sometimes has a different
> meaning, namely array concatenation, the first two wouldn't need
> overloading any more. They'd just work.

Uh, no, not really, and I think that assertion just goes to show that
this area is more subtle than you think. quote_literal() for instance
presently works for any datatype that has an explicit cast to text.
After making the change you propose above, it would only work for types
for which the cast was assignment-grade or less. concat() is even
looser: as now implemented, it works for *anything at all*, because it
relies on datatype output functions not casts to text. I'm dubious that
that inconsistency is a good thing, actually, but that's how the
committed code is written.

Now, some of us might think that backing these conversions down to only
allowing assignment-grade casts would be an improvement, in the sense
that it would actually make the type system tighter not looser than it
is today for these particular functions. But I suspect you wouldn't see
it as an improvement, given the position you're arguing from.

In fact, I'm afraid that making this change would result in requests to
downgrade existing explicit casts to be assignment-only, so that people
could be even lazier about not casting function arguments; and that is
something up with which I will not put.

regards, tom lane


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 19:59:18
Message-ID: 1354046358.1766.16.camel@sussancws0025
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2012-11-27 at 14:13 -0500, Robert Haas wrote:
> I do think that applying some kind of explicit flag to the function
> indicating whether it should allow implicit assignment
> casting/implicit casting to text/overloading/whatever is a possibly
> interesting alternative. It seems clear from our system catalogs that
> implicit casting to text is sometimes a desirable behavior and
> sometimes not, so it's reasonable to think that perhaps we should put
> that under user control. What I like about "my" proposal (really
> Tom's idea) is that it seems like it solves a pretty high percentage
> of the problem cases without requiring any explicit user action.

What user action are you concerned about? If we (eventually) made the
non-overloaded case the default, would that resolve your concerns?

Regards,
Jeff Davis


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 20:21:51
Message-ID: 28008.1354047711@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I do think that applying some kind of explicit flag to the function
> indicating whether it should allow implicit assignment
> casting/implicit casting to text/overloading/whatever is a possibly
> interesting alternative.

That idea seems possibly worth pursuing. The thing that I find scary
about the current proposal is that it applies to all functions (and
operators) willy-nilly, which seems to raise the risk of unexpected
side effects pretty high. If we could confine the behavioral change
to a relatively small number of functions for which there was consensus
that they should accept most anything, I'd feel better about it.

(Of course, we might then conclude that something close to the
quote_literal solution would work as well as a new function property.
But it's worth thinking about.)

> The fact that quote_literal() allows (by the expedient of
> overloading) implicit casts to text and that lpad() does not seems
> fairly random to me in hindsight; is there a general principle there
> that we'd all sign on to?

I don't find that random in the slightest. The entire purpose of
quote_literal is "manufacture a SQL-literal string representation of
this value", and that clearly might apply to data of any type. lpad()
is, first last and only, a textual operation. Somebody who thinks it
should apply directly to an integer is guilty of sloppy thinking at
best, or not even understanding what a data type is at worst.

regards, tom lane


From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 20:29:43
Message-ID: 20121127202943.GN5162@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> It also affects developers, in that adding a
> new overloaded version of a system function (that previously wasn't
> overloaded) could actually reduce the number of cases for which the
> function works without an explicit cast.
>
> We have got to be really careful with changing the parser's behavior
> here, or we're going to break cases that work today.

For my 2c- we have to be really careful making changes to the system
functions as well as the parser's behavior.

If we're worried about users creating overloaded versions of system
functions, well, I'd probably suggest a "don't do that then" kind of
approach..

Thanks,

Stephen


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 20:35:10
Message-ID: CA+Tgmoa-=4b51RmvccoxT_-sK2ALWwmAr5eAyk+Wk=XC+Z9_rA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 27, 2012 at 2:26 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Uh, no, not really, and I think that assertion just goes to show that
> this area is more subtle than you think. quote_literal() for instance
> presently works for any datatype that has an explicit cast to text.

That doesn't appear to be the behavior I'm seeing:

rhaas=# select quote_literal(17);
quote_literal
---------------
'17'
(1 row)

rhaas=# select * from pg_cast where castsource = 'int4'::regtype and
casttarget = 'text'::regtype;
castsource | casttarget | castfunc | castcontext | castmethod
------------+------------+----------+-------------+------------
(0 rows)

> After making the change you propose above, it would only work for types
> for which the cast was assignment-grade or less.

...but that's everything, because there's a hardcoded exception in the
code that dictates that even if there is no entry in pg_cast, an
assignment cast to text exists for every data type.

> concat() is even
> looser: as now implemented, it works for *anything at all*, because it
> relies on datatype output functions not casts to text. I'm dubious that
> that inconsistency is a good thing, actually, but that's how the
> committed code is written.

I argued at the time that CONCAT should take variadic text rather than
variadic any and was roundly ignored on the grounds that the implicit
casting to text behavior was what everyone wanted in that particular
case. My feeling is that we need to adopt a solution to this problem
partly so that people don't keep inventing (even in core code!)
one-off, hackish solutions that make certain cases behave completely
differently from the general rules.

> Now, some of us might think that backing these conversions down to only
> allowing assignment-grade casts would be an improvement, in the sense
> that it would actually make the type system tighter not looser than it
> is today for these particular functions. But I suspect you wouldn't see
> it as an improvement, given the position you're arguing from.

Actually, I think it wouldn't matter a bit, because of the exception
that says there's an assignment cast to text for everything.

> In fact, I'm afraid that making this change would result in requests to
> downgrade existing explicit casts to be assignment-only, so that people
> could be even lazier about not casting function arguments; and that is
> something up with which I will not put.

While I'm personally not excited about it, it is certainly imaginable
that someone might prefer something like text -> xml to be an
assignment casts rather than an explicit cast. But we've got an easy
response to that, which is fine, change it for your database, but
we're not changing it in the upstream copy. As a compatibility issue
with other databases, it's not really an issue; I can't remember a
single complaint about needing an explicit cast from text to xml or
integer to boolean or any of the other things that appear in pg_cast
with castcontext = 'e'.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 20:41:03
Message-ID: CA+TgmoYs-scKKK10x_29sk0ZVXv43Qfqi7JJQMx+nDEAXXspFQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 27, 2012 at 2:59 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> What user action are you concerned about? If we (eventually) made the
> non-overloaded case the default, would that resolve your concerns?

I can't quite see how a non-overloaded flag would work, unless we get
rid of schemas. But I think there are a variety of other kinds of
labeling that I think would work. I'm still not sure that's as good
as a general solution, because if nothing else it relies on us to make
the right decision as to which type to use in each case, and
considering that neither Tom nor I are particularly sold on what we
did with CONCAT(), nor am I sure that we even agree with each other on
what the right thing to do would have been there, I'm a bit skeptical
about our ability to get these decisions right. But it might still be
an improvement.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 20:48:23
Message-ID: 28572.1354049303@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Nov 27, 2012 at 2:26 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> After making the change you propose above, it would only work for types
>> for which the cast was assignment-grade or less.

> ...but that's everything, because there's a hardcoded exception in the
> code that dictates that even if there is no entry in pg_cast, an
> assignment cast to text exists for every data type.

Ugh. I had been thinking that automatic CoerceViaIO casting only
happened for explicit casts. If that can be invoked via assignment
casts, then what you're proposing really destroys the type system
entirely, at least for functions taking text: there is absolutely
no argument such a function won't accept. I cannot support this.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 21:02:27
Message-ID: CA+TgmobTQEyH8+uJjRLEAgzz4PkkrPO=VV3ZgcYZHg3nD23bxQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 27, 2012 at 3:21 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> I do think that applying some kind of explicit flag to the function
>> indicating whether it should allow implicit assignment
>> casting/implicit casting to text/overloading/whatever is a possibly
>> interesting alternative.
>
> That idea seems possibly worth pursuing. The thing that I find scary
> about the current proposal is that it applies to all functions (and
> operators) willy-nilly, which seems to raise the risk of unexpected
> side effects pretty high. If we could confine the behavioral change
> to a relatively small number of functions for which there was consensus
> that they should accept most anything, I'd feel better about it.
>
> (Of course, we might then conclude that something close to the
> quote_literal solution would work as well as a new function property.
> But it's worth thinking about.)

>> The fact that quote_literal() allows (by the expedient of
>> overloading) implicit casts to text and that lpad() does not seems
>> fairly random to me in hindsight; is there a general principle there
>> that we'd all sign on to?
>
> I don't find that random in the slightest. The entire purpose of
> quote_literal is "manufacture a SQL-literal string representation of
> this value", and that clearly might apply to data of any type. lpad()
> is, first last and only, a textual operation. Somebody who thinks it
> should apply directly to an integer is guilty of sloppy thinking at
> best, or not even understanding what a data type is at worst.

Well, considering I made that "mistake" while working with PostgreSQL
8.2, and considering further that other databases allow it, I'm a
little reluctant to accept this theory. I'm willing to bet that the
fine folks in Redwood understand what a data type is just fine, and
I'm pretty sure that I do, too. Sloppy thinking? Perhaps. But I
think you could make a perfectly fine argument that the function of
lpad() is to concatenate something onto the string representation of a
value, or conversely that the function of quote_literal() is to escape
a string. You might not agree with either of those arguments but I
don't care to label someone who does as an idiot. The problem I have
with the explicit labeling approach is that it seems to depend heavily
on how you conceptualize what the function is trying to do, and not
everyone is going to conceptualize that the same way. Clearly there
are a lot of people who expect at least some string operators to work
on numbers, including the OP, and are confused when they don't. We
can call those people nasty names but that's not going to win us many
friends.

Anyway, I'm not averse to thinking about some kind of labeling
solution but I'm not exactly sure what would work well - and I'd still
like to see some hard evidence that the collateral damage from my er
your proposal is unacceptably high. The most plausible scenario for
how this could break things that has been presented thus far is that
someone might create a function, use it with a data type that requires
assignment-casting, and then create another function, and have things
break. But as Jeff pointed out, that can happen already: in fact, it
did, in core, with pg_size_pretty(), and while you had doubts about
that change at the time, none of us realized exactly what the failure
scenario was until it was too late to change it. Would that kind of
thing happen more often under this proposal? Kind of hard to say, but
if it made us think twice before overloading system catalog functions,
it might even work out to a net positive.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 21:36:44
Message-ID: 50B5326C.6000104@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/25/12 6:36 PM, Robert Haas wrote:
> I think that is true. But for whatever it's worth, and at the risk of
> beating a horse that seems not to be dead yet in spite of the fact
> that I feel I've already administered one hell of a beating, the LPAD
> case is unambiguous, and therefore it is hard to see what sort of
> programming mistake we are protecting users against.

Upstream applications passing wrong data down to the database.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 22:05:58
Message-ID: 50B53946.1070507@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/25/12 7:21 PM, Robert Haas wrote:
> Sure, in theory that is true, but no other RDBMS that I know about
> feels a need to error out in that situation. I'm skeptical of the
> contention that we're smarter than everyone else.

Well, I think in most programming languages that have typed function
prototypes,

define foo(string)

call foo(55)

is an error. I could be convinced otherwise, but that's my current
impression. So the principle of rejecting this is not new.

If, on the other hand, we want to align more with other RDBMS that
apparently allow this, we should look closer into by what rules they do
this. If they use assignment casts (that is, the same rules that apply
when running INSERT, for example), we could look into using that as
well, but then we should do that all the time, and not only as a
fallback of some sort. Because that's (a) arbitrary, and (b) causes
failures when overloaded functions are added, which shouldn't happen
(the existing cases where adding overloaded functions cause an existing
function to fail are surely warts that should be removed, not new ones
designed in).

I wonder what implicit casts would be good for if assignment casts
applied for function and operator calls.


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 22:09:15
Message-ID: 50B53A0B.8060901@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/27/12 12:07 PM, Merlin Moncure wrote:
> Speaking of polymorphism, why not just implement lpad()'s first
> argument as 'anyelement'?

One of the arguments made here was that lpad(not-text) *should* fail.


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 22:24:05
Message-ID: CAHyXU0wgvJ0ziHwRer2cb4DR1-Ha_TErFm0EgGpy_tCXAidwBw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 27, 2012 at 4:09 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> On 11/27/12 12:07 PM, Merlin Moncure wrote:
>> Speaking of polymorphism, why not just implement lpad()'s first
>> argument as 'anyelement'?
>
> One of the arguments made here was that lpad(not-text) *should* fail.

Well, sure. My point is: why do we need to break the casting
machinery when we can simply tweak a few of the standard functions on
portability grounds?

Robert's case on lpad() has merit in the sense it has unambiguous
meaning regardless of input type; polymorphic input types were
designed to solve *exactly* that problem. SQL portability is a
secondary but also important argument.

That said, md5() surely needs some type of cast or interpretation of
non-text types. ditto to_timestamp(), etc. So messing around with
the casting rules is surely the wrong answer. I think if you relaxed
the function sigs of a few functions on this page
(http://www.postgresql.org/docs/9.2/interactive/functions-string.html),
most reported problems would go away.

One thing that worries me is introducing ambiguous cases where
previously there weren't any though.

merlin


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 22:24:16
Message-ID: 1354055056.1766.50.camel@sussancws0025
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2012-11-27 at 15:41 -0500, Robert Haas wrote:
> I can't quite see how a non-overloaded flag would work, unless we get
> rid of schemas.

It may work to pick the first schema in the search path that has any
functions by that name, and then choose the overloaded (or not)
candidate from among those functions in that one schema. Then,
non-overloaded function names would be unique within a schema.

If there are multiple functions of the same name in multiple schemas in
the search path, it does not make sense to me to lump them all together
and choose an overloaded candidate from all of them (although I think
that's what we do now). That sounds like a mistake, to me. Do you know
of any useful examples of doing that?

> But I think there are a variety of other kinds of
> labeling that I think would work.

Worth exploring.

> I'm still not sure that's as good
> as a general solution, because if nothing else it relies on us to make
> the right decision as to which type to use in each case, and
> considering that neither Tom nor I are particularly sold on what we
> did with CONCAT(), nor am I sure that we even agree with each other on
> what the right thing to do would have been there, I'm a bit skeptical
> about our ability to get these decisions right. But it might still be
> an improvement.

I'm not entirely clear on the benefits of a general solution, nor why
your solution is more general. You are still categorizing functions into
"overloaded" and "non-overloaded", but you are doing so at runtime based
on the current contents of the catalog.

Regards,
Jeff Davis


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 22:41:19
Message-ID: 5928.1354056079@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
> ... I think if you relaxed
> the function sigs of a few functions on this page
> (http://www.postgresql.org/docs/9.2/interactive/functions-string.html),
> most reported problems would go away.

That's an interesting way of approaching it. Do we have any data on
exactly which functions people do complain about?

> One thing that worries me is introducing ambiguous cases where
> previously there weren't any though.

Right, but at least we'd be confining the ambiguity to a small number
of function names. Tweaking the casting rules could have a lot of
unforeseen consequences.

regards, tom lane


From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 23:25:39
Message-ID: CAHyXU0yMUb5WgSZ_pgf+ECW0q0xyhmatDEqxDR0B7rO0ws0p=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 27, 2012 at 4:41 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> ... I think if you relaxed
>> the function sigs of a few functions on this page
>> (http://www.postgresql.org/docs/9.2/interactive/functions-string.html),
>> most reported problems would go away.
>
> That's an interesting way of approaching it. Do we have any data on
> exactly which functions people do complain about?

After a few minutes of google-fu, lpad seems to top the list (if you
don't count operator related issues which I think are mostly coding
bugs).

see: http://drupal.org/node/1338188.
also upper: http://sourceforge.net/tracker/?func=detail&aid=3447417&group_id=171772&atid=859223..
also substring: http://archives.postgresql.org/pgsql-bugs/2008-01/msg00001.php

note in two of the above cases the bugs were raised through 3rd party
issue trackers :/. Interestingly, if you look at popular
postgresql-only functions, such as regexp_replace, google seems to
indicate there's not much problem there. This, IMNSHO, reinforces
Robert's point -- but it seems to mostly bite people porting code,
running cross database code bases, or having a strong background in
other systems.

I found a few non-string cases, especially round().

merlin


From: John R Pierce <pierce(at)hogranch(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-27 23:36:57
Message-ID: 50B54E99.6030807@hogranch.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 11/27/12 2:41 PM, Tom Lane wrote:
> Tweaking the casting rules could have a lot of
> unforeseen consequences.

understatement of the year. IMHO. $0.02 worth etc.

--
john r pierce N 37, W 122
santa cruz ca mid-left coast


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-11-28 05:51:01
Message-ID: CAFj8pRCW6wOPm6oT5zJOQAJ=op5NL3R934p2sLN5wtDvi8ttdg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2012/11/28 Merlin Moncure <mmoncure(at)gmail(dot)com>:
> On Tue, Nov 27, 2012 at 4:41 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>> ... I think if you relaxed
>>> the function sigs of a few functions on this page
>>> (http://www.postgresql.org/docs/9.2/interactive/functions-string.html),
>>> most reported problems would go away.
>>
>> That's an interesting way of approaching it. Do we have any data on
>> exactly which functions people do complain about?
>
> After a few minutes of google-fu, lpad seems to top the list (if you
> don't count operator related issues which I think are mostly coding
> bugs).
>
> see: http://drupal.org/node/1338188.
> also upper: http://sourceforge.net/tracker/?func=detail&aid=3447417&group_id=171772&atid=859223..
> also substring: http://archives.postgresql.org/pgsql-bugs/2008-01/msg00001.php

some of these issues are buggy and I am happy, so it doesn't working now.

http://archives.postgresql.org/pgsql-bugs/2008-01/msg00001.php

and these issue can be simply solved by overloading.

Pavel

>
> note in two of the above cases the bugs were raised through 3rd party
> issue trackers :/. Interestingly, if you look at popular
> postgresql-only functions, such as regexp_replace, google seems to
> indicate there's not much problem there. This, IMNSHO, reinforces
> Robert's point -- but it seems to mostly bite people porting code,
> running cross database code bases, or having a strong background in
> other systems.
>
> I found a few non-string cases, especially round().
>
> merlin


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-12-09 17:16:17
Message-ID: 50C4C761.1080609@Yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I am aware that in the case at hand, the call to make_fn_arguments() is
with the only possible candidate function, so changing COERCE_IMPLICIT
to COERCE_ASSIGNMENT inside of make_fn_arguments() is correct. But I
wonder if this may have any unwanted side effects for other code paths
to make_fn_arguments(), like from optimizer/util/clauses.c or from
parser/parse_oper.c. I'm not saying it does, just asking if that could
be the case.

Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-12-10 19:07:53
Message-ID: CA+TgmoY_vX=+VBkc-c+bHug2+cBwYCK-zFCXRg=Nv4SR_hnS4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Nov 27, 2012 at 4:36 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> On 11/25/12 6:36 PM, Robert Haas wrote:
>> I think that is true. But for whatever it's worth, and at the risk of
>> beating a horse that seems not to be dead yet in spite of the fact
>> that I feel I've already administered one hell of a beating, the LPAD
>> case is unambiguous, and therefore it is hard to see what sort of
>> programming mistake we are protecting users against.
>
> Upstream applications passing wrong data down to the database.

The circumstantial evidence suggests that many users don't want to be
protected against that in the way that we are currently protecting
them - or at least not all of them do (see Merlin's email elsewhere on
this thread). What's frustrating about the status quo is that not
only do you need lots of extra casts, but there's no real way to
improve the situation. If you add an implicit cast from integer to
text, for example, then 4 || 'foo' breaks. Now, you may think that
adding an implicit cast from integer to text is a dumb idea, and maybe
it is, but don't get too hung up on that example. The point is that
if you're unhappy with the quote_literal() case (because it accepts
too much), or the lpad() case (because it doesn't accept enough), or
the foo(smallint) case (because it can't be happy with foo(42)), you
don't really have a lot of options for adjusting the behavior as
things stand today. I accept that some people think that decorating
their code with lots of extra casts helps to avoid errors, and maybe
it does, but there is plenty of evidence that a lot of users don't
want to. And we not only don't give them the behavior they want; we
don't even have a meaningful way to give the option of opting into
that behavior at initdb or create-database time.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-12-10 20:53:42
Message-ID: CAFj8pRDh-P4m-4-5VgORE2R=15FPWrzaviUn4QELX2ByrQ0TtQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello

2012/12/10 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Tue, Nov 27, 2012 at 4:36 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>> On 11/25/12 6:36 PM, Robert Haas wrote:
>>> I think that is true. But for whatever it's worth, and at the risk of
>>> beating a horse that seems not to be dead yet in spite of the fact
>>> that I feel I've already administered one hell of a beating, the LPAD
>>> case is unambiguous, and therefore it is hard to see what sort of
>>> programming mistake we are protecting users against.
>>
>> Upstream applications passing wrong data down to the database.
>
> The circumstantial evidence suggests that many users don't want to be
> protected against that in the way that we are currently protecting
> them - or at least not all of them do (see Merlin's email elsewhere on
> this thread). What's frustrating about the status quo is that not
> only do you need lots of extra casts, but there's no real way to
> improve the situation. If you add an implicit cast from integer to
> text, for example, then 4 || 'foo' breaks. Now, you may think that
> adding an implicit cast from integer to text is a dumb idea, and maybe
> it is, but don't get too hung up on that example. The point is that
> if you're unhappy with the quote_literal() case (because it accepts
> too much), or the lpad() case (because it doesn't accept enough), or
> the foo(smallint) case (because it can't be happy with foo(42)), you
> don't really have a lot of options for adjusting the behavior as
> things stand today. I accept that some people think that decorating
> their code with lots of extra casts helps to avoid errors, and maybe
> it does, but there is plenty of evidence that a lot of users don't
> want to. And we not only don't give them the behavior they want; we
> don't even have a meaningful way to give the option of opting into
> that behavior at initdb or create-database time.
>

it is looking so our design missing some feature, flag, that can
signalize safety of implicit cast - or allow more exactly to specify
casting rules for related functionality. For some functions we do this
magic inside parser and rewriter, but we don't allow this for custom
functions.

Few years ago I proposed a parser hooks, where this task can be
solved. The parser hook is probably too generic, but probably we don't
design good solution just by simple change of some parameter of
current design, and we should to enhance current design - maybe some
new parameter modifiers?

Regards

Pavel

> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-12-11 05:59:44
Message-ID: 1355205584.22087.13.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, 2012-12-10 at 14:07 -0500, Robert Haas wrote:
> And we not only don't give them the behavior they want; we
> don't even have a meaningful way to give the option of opting into
> that behavior at initdb or create-database time.

I strongly object to offering options that change the language in such a
substantial way. initdb-time options still mean that we are essentially
dividing our language, and therefore the applications that support
postgres, in half (or worse). One of the things I really like about
postgres is that we haven't forked the language with a million options
like mysql has. I don't even like the fact that we have a GUC to control
the output format of a BYTEA.

For every developer who says "wow, that mysql query just worked without
modification" there is another one who says "oh, I forgot to test with
option XYZ... postgres is too complex to support, I'm going to drop it
from the list of supported databases".

I still don't see a compelling reason why opting out of overloading on a
per-function basis won't work. Your objections seemed fairly minor in
comparison to how strongly you are advocating this use case.

In particular, I didn't get a response to:

http://archives.postgresql.org/message-id/1354055056.1766.50.camel@sussancws0025

For what it's worth, I'm glad that people like you are pushing on these
usability issues, because it can be hard for insiders to see them
sometimes.

Regards,
Jeff Davis


From: Darren Duncan <darren(at)darrenduncan(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-12-11 06:22:10
Message-ID: 50C6D112.10000@darrenduncan.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I agree with Jeff.

Options that change the language at initdb or create-database time just fragment
the language.

It is best to just have 1 language where options are providable either
dynamically per connection or otherwise lexically, so that then they are really
just shorthands for the current local usage, and the language as a whole is the
same.

That also means you can have example code out there and know it will work on any
Postgres install, invariant of static global options. If language modifiers are
local or lexical, then any example code presumably would include the switches to
turn them on for that example.

That all being said, I also think it is best to explicitly overload operators
with extra parameter types, such as defining another operator with the signature
of (Nunber,String) with the same base name as string catenation, rather than
making numbers implicitly stringify. But I can also accept implicit
stringification / language behavior changes if it is a lexical/temporary effect
that the same user is still explicitly turning on.

-- Darren Duncan

Jeff Davis wrote:
> On Mon, 2012-12-10 at 14:07 -0500, Robert Haas wrote:
>> And we not only don't give them the behavior they want; we
>> don't even have a meaningful way to give the option of opting into
>> that behavior at initdb or create-database time.
>
> I strongly object to offering options that change the language in such a
> substantial way. initdb-time options still mean that we are essentially
> dividing our language, and therefore the applications that support
> postgres, in half (or worse). One of the things I really like about
> postgres is that we haven't forked the language with a million options
> like mysql has. I don't even like the fact that we have a GUC to control
> the output format of a BYTEA.
>
> For every developer who says "wow, that mysql query just worked without
> modification" there is another one who says "oh, I forgot to test with
> option XYZ... postgres is too complex to support, I'm going to drop it
> from the list of supported databases".
>
> I still don't see a compelling reason why opting out of overloading on a
> per-function basis won't work. Your objections seemed fairly minor in
> comparison to how strongly you are advocating this use case.
>
> In particular, I didn't get a response to:
>
> http://archives.postgresql.org/message-id/1354055056.1766.50.camel@sussancws0025
>
> For what it's worth, I'm glad that people like you are pushing on these
> usability issues, because it can be hard for insiders to see them
> sometimes.
>
> Regards,
> Jeff Davis


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-12-14 07:30:42
Message-ID: 1355470242.11945.1.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2012-11-06 at 10:57 -0500, Robert Haas wrote:
> I did some experimentation with this. It seems that what Tom proposed
> here is a lot cleaner than what I proposed previously, while still
> increasing usability in many real-world cases. For example, in
> unpatched master:

It looks like we still haven't reached consensus on the design here. Are
we still discussing, or should this be moved to the next CF?

Regards,
Jeff Davis


From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-12-14 07:35:29
Message-ID: 1355470529.11945.5.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, 2012-11-27 at 14:24 -0800, Jeff Davis wrote:
> On Tue, 2012-11-27 at 15:41 -0500, Robert Haas wrote:
> > I can't quite see how a non-overloaded flag would work, unless we get
> > rid of schemas.
>
> It may work to pick the first schema in the search path that has any
> functions by that name, and then choose the overloaded (or not)
> candidate from among those functions in that one schema. Then,
> non-overloaded function names would be unique within a schema.
>
> If there are multiple functions of the same name in multiple schemas in
> the search path, it does not make sense to me to lump them all together
> and choose an overloaded candidate from all of them (although I think
> that's what we do now). That sounds like a mistake, to me. Do you know
> of any useful examples of doing that?

On second thought, this won't work very well, particularly for
operators. Users may want to overload a built-in operator, like "+", yet
still have it work on all the built-in types.

So, if we did try to declare a function non-overloaded at DDL time, the
name would have to be globally unique (not just to a schema).

Regards,
Jeff Davis


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-12-14 18:51:56
Message-ID: CA+TgmoaaCKkgOQgUtXQyc1Nt9UaigBK1xq=WpBph2jfQxSZvgQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Dec 11, 2012 at 12:59 AM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> For every developer who says "wow, that mysql query just worked without
> modification" there is another one who says "oh, I forgot to test with
> option XYZ... postgres is too complex to support, I'm going to drop it
> from the list of supported databases".

Perhaps so. That's why my first choice is still to just fix this
problem across the board. I think there is probably more than one way
of doing that that is technically safe, and I currently believe that
my patch is one of those. However, it seems that more people than not
find the extra casts that PostgreSQL forces programmers to use to be a
feature, not a bug. According to Tom, to allow people to call a
non-overloaded function without casts will "completely destroy the
type system"; Peter Eisentraut was aghast at the idea of allowing
someone to pass a non-text first argument to lpad without an explicit
cast. I recognize that not everyone's going to agree on these things
but I find those attitudes shockingly arrogant. We have regular
evidence that users are coming to PostgreSQL and then abandoning it
because these kinds of things don't work, and we know that numerous
other popular and well-respected systems allow these sorts of things
to Just Work. It is one thing to insist on casts when there is an
ambiguity about which of several overloaded functions a user intended
to call - but when there's only one, it's just masterminding. In more
than ten years of working with PostgreSQL, I've never encountered
where the restriction at issue here prevented a bug. It's only
annoyed me and broken my application code (when moving from PostgreSQL
8.2 to PostgreSQL 8.3, never mind any other database!). There is
ample evidence that I'm not the only one, but I think we have a clear
consensus to continue ignoring the problem, or at least the solutions.

I don't think there's much point in carrying this patch over to the
next CommitFest; I'll mark it as Rejected.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-12-14 19:11:46
Message-ID: 4444.1355512306@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> ... In more
> than ten years of working with PostgreSQL, I've never encountered
> where the restriction at issue here prevented a bug. It's only
> annoyed me and broken my application code (when moving from PostgreSQL
> 8.2 to PostgreSQL 8.3, never mind any other database!).

There are quite a few examples in our archives of application bugs that
would have been prevented, or later were prevented, by the 8.3 changes
that reduced the system's willingness to apply implicit casts to text.
I recall for instance cases where people got wrong/unexpected answers
because the system was sorting what-they-thought-were-timestamp values
textually.

So I find such sweeping claims to be demonstrably false, and I'm
suspicious of behavioral changes that are proposed with such arguments
backing them.

> There is ample evidence that I'm not the only one, but I think we have
> a clear consensus to continue ignoring the problem, or at least the
> solutions.

Oh, I don't think we're ignoring the problem; people beat us up about it
too often for that. But we need to pay attention to error detection not
only ease-of-use, so it's hard to be sure what's a net improvement.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-12-14 20:20:08
Message-ID: CA+TgmoaCZb+EhCGLoNM8vY=jbq1Y3qW6qe2nn_6iOPrnrPbJow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Dec 14, 2012 at 2:11 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> ... In more
>> than ten years of working with PostgreSQL, I've never encountered
>> where the restriction at issue here prevented a bug. It's only
>> annoyed me and broken my application code (when moving from PostgreSQL
>> 8.2 to PostgreSQL 8.3, never mind any other database!).
>
> There are quite a few examples in our archives of application bugs that
> would have been prevented, or later were prevented, by the 8.3 changes
> that reduced the system's willingness to apply implicit casts to text.
> I recall for instance cases where people got wrong/unexpected answers
> because the system was sorting what-they-thought-were-timestamp values
> textually.
>
> So I find such sweeping claims to be demonstrably false, and I'm
> suspicious of behavioral changes that are proposed with such arguments
> backing them.

I think you're mixing apples and oranges. The whole point of the
patch on the table - which, if you recall, was designed originally by
you and merely implemented by me - was to change the behavior only in
the cases where there's only one function with the appropriate name
and argument count. The ambiguous cases that 8.3+ helpfully prevent
are those where overloading is in use and the choice of which function
to call is somewhat arbitrary and perhaps incorrectly-foreseen by the
user. Those changes also have the side-effect of preventing a
straightforward function call from working without casts even in cases
where no overloading is in use - and making that case work is
completely different from making the ambiguous case arbitrarily pick
one of the available answers.

> Oh, I don't think we're ignoring the problem; people beat us up about it
> too often for that. But we need to pay attention to error detection not
> only ease-of-use, so it's hard to be sure what's a net improvement.

Well, that's not how the dynamic of this thread reads to me. There
seems to be massive opposition - including from you - to allowing
unambiguous function calls to resolve without casts, at least as a
categorical matter, and maybe even in the specific cases that users
most frequently care about. I simply disagree with the contention
that there's a value in making people cast to text when the target
function is not overloaded. Maybe there's some world where it's
uncommon to want to pass the text representation of a non-text value
to a non-overloaded function that accepts text, and therefore forcing
a cast upon the user to warn them "here be dragons" is warranted, but
I don't live in it. When the function IS overloaded, well, that's a
completely different situation. I've written enough C++ over the
years to understand what happens when you try too hard to be clever
with tiebreak rules. But that's not got much to do with the question
of whether the only candidate to put in an appearance can be declared
the winner.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-12-16 13:21:27
Message-ID: 50CDCAD7.7040101@Yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 12/14/2012 3:20 PM, Robert Haas wrote:
> On Fri, Dec 14, 2012 at 2:11 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>>> ... In more
>>> than ten years of working with PostgreSQL, I've never encountered
>>> where the restriction at issue here prevented a bug. It's only
>>> annoyed me and broken my application code (when moving from PostgreSQL
>>> 8.2 to PostgreSQL 8.3, never mind any other database!).
>>
>> There are quite a few examples in our archives of application bugs that
>> would have been prevented, or later were prevented, by the 8.3 changes
>> that reduced the system's willingness to apply implicit casts to text.
>> I recall for instance cases where people got wrong/unexpected answers
>> because the system was sorting what-they-thought-were-timestamp values
>> textually.
>>
>> So I find such sweeping claims to be demonstrably false, and I'm
>> suspicious of behavioral changes that are proposed with such arguments
>> backing them.
>
> I think you're mixing apples and oranges. The whole point of the
> patch on the table - which, if you recall, was designed originally by
> you and merely implemented by me - was to change the behavior only in
> the cases where there's only one function with the appropriate name
> and argument count. The ambiguous cases that 8.3+ helpfully prevent
> are those where overloading is in use and the choice of which function
> to call is somewhat arbitrary and perhaps incorrectly-foreseen by the
> user. Those changes also have the side-effect of preventing a
> straightforward function call from working without casts even in cases
> where no overloading is in use - and making that case work is
> completely different from making the ambiguous case arbitrarily pick
> one of the available answers.

FWIW I for one thought that casting more liberal in the case at hand,
where there is only one function with that name and number of arguments,
would be a good thing. My only concern with the patch presented was that
changing make_fn_assignment() in that way may have some unwanted side
effects because it is called from different locations and the usage of
COERCION_IMPLICIT may actually guard against something, that we don't
want to allow. I don't have any evidence that it does, just the concern
that it may.

Perhaps make_fn_arguments() needs to receive that coercion context as an
argument and the caller hands in COERCION_ASSIGNMENT only in the case at
hand?

Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin


From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-12-16 14:40:44
Message-ID: CA+U5nMJE8spyf7jodDKf86t8Op=ZgybkXu4_ZJgh27CO2L9=vQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 27 November 2012 22:41, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>> ... I think if you relaxed
>> the function sigs of a few functions on this page
>> (http://www.postgresql.org/docs/9.2/interactive/functions-string.html),
>> most reported problems would go away.
>
> That's an interesting way of approaching it. Do we have any data on
> exactly which functions people do complain about?
>
>> One thing that worries me is introducing ambiguous cases where
>> previously there weren't any though.
>
> Right, but at least we'd be confining the ambiguity to a small number
> of function names. Tweaking the casting rules could have a lot of
> unforeseen consequences.

There have been many good points made on this thread.

Being sloppy in all cases is a bad thing we all agree or reluctantly
admit; what is needed is the ability for a user to be able to more
closely define what they mean by such conversions, so that application
SQL can be made to work.

It certainly isn't easy to say that COLUMN LIKE '1%' would work in all
cases, since the preferred format of that data might be (xxx)
xxx-xxxx, or $xxxxx or <EURO symbol>xxxx,xx (i.e. with a comma as the
decimal separator). The format comes from the meaning of the data,
which we cannot know.

What would be useful is to be able to define default format models for
each column. If not defined, there is no implicit cast. If FORMAT is
defined then we know to apply it in the absence of a global cast.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services