Inconsistent or incomplete behavior obverse in where clause

Lists: pgsql-hackerspgsql-sql
From: Paul Ogden <pogden(at)claresco(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Inconsistent or incomplete behavior obverse in where clause
Date: 2002-11-12 17:39:13
Message-ID: NAEOJBHEEOEHNNICGFADKEMKDEAA.pogden@claresco.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-sql

Hello,
Our application development group has observed what we
feel is inconsistent behavior when comparing numeric
column references to constant/literal values in SQL.

I would appreciate comments on the best approach to
this problem that will allow for the highest
portability of our application code. I have searched
the archives and online docs, but so far have not found
anyone addressing the problem quite this way.

Assume wuActive is a numeric field ( with scale but no
precision ) in the table WU:
select count(wuid) from WU where wuActive = 0 --works fine
select count(wuid) from WU where wuActive = '0' --works fine
select count(wuid) from WU where wuActive = '0.0' --works fine
select count(wuid) from WU where wuActive = 0.0 --throws the
following exception:

"Unable to identify an operator '=' for types 'numeric' and 'double
precision' You will have to retype this query using an explicit cast"

Second, assume tPct is a numeric field ( having scale of 4 and
precision of 1 ) in the table T
select count(tid) from T where tPct > 77 --works fine
select count(tid) from T where tPct > '77' --works fine
select count(tid) from T where tPct > '77.5' --works fine
select count(tid) from T where tPct > 77.5 -- again throws
the exception:

"Unable to identify an operator '>' for types 'numeric' and 'double
precision' You will have to retype this query using an explicit cast"

This seems to occur regardless of connectivity drivers used
(ODBC, JDBC, etc..)

I am aware of the use of type casting to force the desired
behavior in these situations. I have also started to go down
the road of creating functions and operators to force numeric
to numeric comparison operations when comparing numeric to float,
but realize that this approach is fraught with pitfalls, in fact
it is interesting to us to note that with an operator in place
to force numeric = float comparisons to parse as numeric = numeric,
we started getting the opposite behavior. Queries with 'column
reference' = 0.0 worked fine, but queries with 'column reference' = 0
threw a variant of the previous exception:

"Unable to identify an operator '=' for types 'numeric' and 'integer'"

Overall, this behavior appears to be inconsistent and is not
the same behavior I have experienced with many other DBMS's.
Specifically, it seems strange that the parser does not treat
values 0.0 or 77.5 as numeric(s[,p]) when comparing the values
to a column reference known to be of type numeric (s,[p]).

Is an unquoted number in the form of NN.N always treated as a
float? If the planner could somehow recognize that the constant/
literal value was being compared to a column reference of the
type numeric (s,p) and treat the value accordingly, then would
operator identification no longer be a problem?

We are looking to maintain a high degree of portability in our
application code, and while "CAST ( expression as type )" is
fairly portable, no one here feels that it is a portable as
column reference = literal/constant value. If someone knows
of a better approach, or can point us to documentation of build or
run-time configuration that affects the query planner where this
issue is concerned, it would be much appreciated.

Thanks,

Paul Ogden
Database Administrator/Programmer
Claresco Corporation
(510) 549-2290


From: "Josh Berkus" <josh(at)agliodbs(dot)com>
To: pogden(at)claresco(dot)com, pgsql-sql(at)postgresql(dot)org
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Inconsistent or incomplete behavior obverse in where
Date: 2002-11-12 18:23:02
Message-ID: web-1822721@davinci.ethosmedia.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-sql

Paul,

> "Unable to identify an operator '=' for types 'numeric' and 'double
> precision' You will have to retype this query using an explicit cast"

This is due, as you surmised, to decimal values defaulting to floats.
While there is little problem with an = operator for numeric and
float, you would not want an implicit cast for a / operator with
numeric and float. As a result, I believe that all numeric and float
operators have been left undefined.

> I am aware of the use of type casting to force the desired
> behavior in these situations. I have also started to go down
> the road of creating functions and operators to force numeric
> to numeric comparison operations when comparing numeric to float,
> but realize that this approach is fraught with pitfalls, in fact
> it is interesting to us to note that with an operator in place
> to force numeric = float comparisons to parse as numeric = numeric,
> we started getting the opposite behavior. Queries with 'column
> reference' = 0.0 worked fine, but queries with 'column reference' = 0
>
> threw a variant of the previous exception:
>
> "Unable to identify an operator '=' for types 'numeric' and
> 'integer'"

Now, that's interesting. Why would defining a "numeric = float" have
broken "numeric = integer"? There's no reason I can think of.
Perhaps I will try this myself and see if I encounter the same
problem, or if your team modified the numeric = integer operator by
mistake.

> Overall, this behavior appears to be inconsistent and is not
> the same behavior I have experienced with many other DBMS's.
> Specifically, it seems strange that the parser does not treat
> values 0.0 or 77.5 as numeric(s[,p]) when comparing the values
> to a column reference known to be of type numeric (s,[p]).
>
> Is an unquoted number in the form of NN.N always treated as a
> float?

Yes. I believe that this is from the SQL 92 spec; hopefully someone
on this list with a copy of the Guide to the SQL Standard can quote it
for you.

> If the planner could somehow recognize that the constant/
> literal value was being compared to a column reference of the
> type numeric (s,p) and treat the value accordingly, then would
> operator identification no longer be a problem?

It's an interesting idea, and would be wonderful if it could be made to
work. However, the challenge of getting the program to correctly
recognize the context for all literal values *without* making any wrong
assumptions that would afffect the data could be substantial.

Most other RDBMSs deal with this, not by any kind of data type
context-sensitivity, but simply by supporting a large number of
implicit casts. This approach can have its own perils, as I have
experienced with MS SQL Server, where the average of splits for 120,000
transactions is significantly different if you accidentally let the
database implicitly cast the values as Float instead of Numeric.

As such, there was talk on the Hackers list at one time of *reducing*
the number of implicit casts instead of increasing them. This would
obviously make your particular problem even worse, but the proponents
of reduction point out that implicit casts can get you into real
trouble if you're not aware of them, wheras forcing explicit casts just
gets you error messages.

Hmmm ... in fact, I'd think the perfect solution would be a
compile-time option or contrib package which allows you to
enable/disable implicit casts for many data types.

> We are looking to maintain a high degree of portability in our
> application code, and while "CAST ( expression as type )" is
> fairly portable, no one here feels that it is a portable as
> column reference = literal/constant value. If someone knows
> of a better approach, or can point us to documentation of build or
> run-time configuration that affects the query planner where this
> issue is concerned, it would be much appreciated.

Hopefully someone else will respond to your message as well. I'll
re-phrase one of your questions for the Hackers list:

QUESTION: Is there any way we could distinguish between literals and
column references when processing operators? That is, while we would
*not* want to implicitly convert a float column to numeric for equality
comparison, would it be possible to convert a literal value to match
the column to which it is compared? Or is literal processing completed
before any expressions are evaluated?

-Josh Berkus


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pogden(at)claresco(dot)com
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Inconsistent or incomplete behavior obverse in where clause
Date: 2002-11-12 18:34:47
Message-ID: 10295.1037126087@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-sql

Paul Ogden <pogden(at)claresco(dot)com> writes:
> select count(wuid) from WU where wuActive = 0.0 --throws the
> following exception:

> "Unable to identify an operator '=' for types 'numeric' and 'double
> precision' You will have to retype this query using an explicit cast"

This is fixed as of 7.3. (We still have related issues for smallint
and bigint columns, unfortunately.)

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Josh Berkus" <josh(at)agliodbs(dot)com>
Cc: pogden(at)claresco(dot)com, pgsql-sql(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] Inconsistent or incomplete behavior obverse in where
Date: 2002-11-12 18:44:49
Message-ID: 10374.1037126689@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-sql

"Josh Berkus" <josh(at)agliodbs(dot)com> writes:
> Now, that's interesting. Why would defining a "numeric = float" have
> broken "numeric = integer"? There's no reason I can think of.

The problem probably is that the parser now finds two possible
interpretations that look equally good to it, so it can't choose.
It could coerce the integer constant to numeric (and use numeric=numeric)
or to float (and use the added numeric=float operator), and there's no
rule that can break the tie.

In 7.3 and 7.4 we are actually going in the direction of removing
cross-data-type operators, not adding them, because they tend to create
too many options for the parser to choose from.

regards, tom lane


From: Paul Ogden <pogden(at)claresco(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-sql(at)postgresql(dot)org
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Inconsistent or incomplete behavior obverse in where
Date: 2002-11-12 18:52:20
Message-ID: NAEOJBHEEOEHNNICGFADEENEDEAA.pogden@claresco.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-sql

Josh,
Thanks for the reply. Much of what you say is as we expected.
I see that 7.3 has addressed the "Unable to identify an operator
'=' for types 'numeric' and 'double precision'" problem, but
I'm not sure how. Context-sensitive approach? Overloaded operator
approach? Something else ( is there )?

If the release of 7.3 is soon, perhaps we can get by with the
band-aid approach of overloading the comparison operators
until such time as the new version is available. Production
for us is next spring, so maybe we'll be okay on this one.
This approach would certainly allow our development team to
right their code one way.

>
>
> Paul,
>
> > "Unable to identify an operator '=' for types 'numeric' and 'double
> > precision' You will have to retype this query using an explicit cast"
>
> This is due, as you surmised, to decimal values defaulting to floats.
> While there is little problem with an = operator for numeric and
> float, you would not want an implicit cast for a / operator with
> numeric and float. As a result, I believe that all numeric and float
> operators have been left undefined.
>
> > I am aware of the use of type casting to force the desired
> > behavior in these situations. I have also started to go down
> > the road of creating functions and operators to force numeric
> > to numeric comparison operations when comparing numeric to float,
> > but realize that this approach is fraught with pitfalls, in fact
> > it is interesting to us to note that with an operator in place
> > to force numeric = float comparisons to parse as numeric = numeric,
> > we started getting the opposite behavior. Queries with 'column
> > reference' = 0.0 worked fine, but queries with 'column reference' = 0
> >
> > threw a variant of the previous exception:
> >
> > "Unable to identify an operator '=' for types 'numeric' and
> > 'integer'"
>
> Now, that's interesting. Why would defining a "numeric = float" have
> broken "numeric = integer"? There's no reason I can think of.
> Perhaps I will try this myself and see if I encounter the same
> problem, or if your team modified the numeric = integer operator by
> mistake.
>

No, we made no modifications to numeric = integer. In fact, issuing
DROP OPERATOR (numeric,float8);
cleared that problem right up. And brought us back to square one.

> > Overall, this behavior appears to be inconsistent and is not
> > the same behavior I have experienced with many other DBMS's.
> > Specifically, it seems strange that the parser does not treat
> > values 0.0 or 77.5 as numeric(s[,p]) when comparing the values
> > to a column reference known to be of type numeric (s,[p]).
> >
> > Is an unquoted number in the form of NN.N always treated as a
> > float?
>
> Yes. I believe that this is from the SQL 92 spec; hopefully someone
> on this list with a copy of the Guide to the SQL Standard can quote it
> for you.
>
> > If the planner could somehow recognize that the constant/
> > literal value was being compared to a column reference of the
> > type numeric (s,p) and treat the value accordingly, then would
> > operator identification no longer be a problem?
>
> It's an interesting idea, and would be wonderful if it could be made to
> work. However, the challenge of getting the program to correctly
> recognize the context for all literal values *without* making any wrong
> assumptions that would afffect the data could be substantial.
>
> Most other RDBMSs deal with this, not by any kind of data type
> context-sensitivity, but simply by supporting a large number of
> implicit casts. This approach can have its own perils, as I have
> experienced with MS SQL Server, where the average of splits for 120,000
> transactions is significantly different if you accidentally let the
> database implicitly cast the values as Float instead of Numeric.
>
> As such, there was talk on the Hackers list at one time of *reducing*
> the number of implicit casts instead of increasing them. This would
> obviously make your particular problem even worse, but the proponents
> of reduction point out that implicit casts can get you into real
> trouble if you're not aware of them, wheras forcing explicit casts just
> gets you error messages.
>
> Hmmm ... in fact, I'd think the perfect solution would be a
> compile-time option or contrib package which allows you to
> enable/disable implicit casts for many data types.
>

I think this is a great idea. We're more of the postgres user class
than hacker class, so that's out of our scope to undertake, but
we'd sure use it.

> > We are looking to maintain a high degree of portability in our
> > application code, and while "CAST ( expression as type )" is
> > fairly portable, no one here feels that it is a portable as
> > column reference = literal/constant value. If someone knows
> > of a better approach, or can point us to documentation of build or
> > run-time configuration that affects the query planner where this
> > issue is concerned, it would be much appreciated.
>
> Hopefully someone else will respond to your message as well. I'll
> re-phrase one of your questions for the Hackers list:
>
> QUESTION: Is there any way we could distinguish between literals and
> column references when processing operators? That is, while we would
> *not* want to implicitly convert a float column to numeric for equality
> comparison, would it be possible to convert a literal value to match
> the column to which it is compared? Or is literal processing completed
> before any expressions are evaluated?
>
> -Josh Berkus
>

Thanks for doing that.
- Paul Ogden
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pogden(at)claresco(dot)com, pgsql-sql(at)postgresql(dot)org
Subject: Re: Inconsistent or incomplete behavior obverse in where
Date: 2002-11-12 19:13:09
Message-ID: 200211121113.09365.josh@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-sql


Paul,

> Thanks for the reply. Much of what you say is as we expected.
> I see that 7.3 has addressed the "Unable to identify an operator
> '=' for types 'numeric' and 'double precision'" problem, but
> I'm not sure how. Context-sensitive approach? Overloaded operator
> approach? Something else ( is there )?

A modification of the operators available.

> If the release of 7.3 is soon, perhaps we can get by with the
> band-aid approach of overloading the comparison operators
> until such time as the new version is available. Production
> for us is next spring, so maybe we'll be okay on this one.
> This approach would certainly allow our development team to
> right their code one way.

7.3 final is expected before December.

--
-Josh Berkus