Re: New thoughts about indexing cross-type comparisons

From: Dave Smith <dave(dot)smith(at)candata(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: New thoughts about indexing cross-type comparisons
Date: 2003-09-16 17:11:38
Message-ID: 3F67444A.9010407@candata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

If this is only dealing with constants, why not just explicitly add a
cast to the constant of the column type at the planner level. It would
solve this problem as well ...

create table test (f int2);
select * from test where f=cast('1981928928921' as int2);
ERROR: pg_atoi: error reading "1981928928921": Numerical result out of
range

select * from test where f=1981928928921;
f
---
(0 rows)

Tom Lane wrote:
> We've spent much effort trying to solve the "int8col = 42 doesn't use
> an index" class of problems. AFAIR, all the recent tries have focused
> on trying to get the parser to choose an index-compatible operator
> initially. (In this example, that would mean promoting 42 to int8 so
> that int8 = int8 would be chosen at parse time.) While I have not
> completely given up hope on that approach, it's clearly fraught with
> potential for unwanted side-effects.
>
> The other place we could try to fix it is to improve either the planner
> or the index AMs themselves to cope with cross-type comparisons for
> themselves. I have shied away from that thought because it seemed
> unreasonably difficult, but with the parser-side solution looking harder
> and harder, maybe it's time to re-evaluate.
>
> There actually was code in the planner at one time to substitute
> index-compatible operators for non-index-compatible operators.
> The algorithm was basically "if I have indexcolumn OP somevalue, where
> OP isn't a member of the index opclass but also somevalue is of a
> different datatype than indexcolumn, then look to see if somevalue can
> be binary-coerced to the datatype of indexcolumn. If so, look for an
> operator of the same name as OP and accepting the indexcolumn datatype
> on both sides. If found, and it's a member of the index opclass, then
> use that operator instead of the original."
>
> This algorithm was wrong on both practical and theoretical levels;
> in the first place it's not very helpful to only be able to handle
> binary-compatible transformations, and in the second place there isn't
> any good guarantee that it's not changing the semantics when it replaces
> the operator. For instance int4 < and oid < do not act the same.
> Depending on equality of operator names was a bad idea even then, and
> would be quite unworkable now in the world of schema search paths.
>
> However, those objections really stem from the planner not having enough
> information to know when the transformation was safe to make. What if
> we gave it that information?
>
> After some thought I think the most practical approach is to make a new
> system catalog for "secondary members of index opclasses". A secondary
> member is not one of the operators that the index can handle directly,
> but it can be transformed into one of the primary members. The new
> catalog would be called, say, pg_amsecop, and would have columns like
> opclass operatorid replacementop ltransform rtransform
> where the first two are the primary key. The idea is when we have a
> WHERE expression "indexcol OP somevalue", and we can't find OP in the
> index's operator class (ie, there's no pg_amop entry for that operator
> and opclass), we next look to see if the opclass/operator combination
> appears in pg_amsecop. If so, then we are allowed to replace the
> "indexcol OP somevalue" expression by "ltransform(indexcol)
> REPLACEMENTOP rtransform(somevalue)". ltransform and rtransform are
> pg_proc OIDs of cast functions, or zero where no run-time cast is
> needed. replacementop is the OID of the substitute operator, which
> presumably is one of the primary members of the index opclass. Now we
> have an indexable expression. The presence of the entry in pg_amsecop
> represents the opclass designer's promise to us that this is a valid,
> semantics-preserving transformation.
>
> This design might be overly general --- for example, I doubt there can
> be any need to apply a cast function to the indexable column. We could
> eliminate the ltransform column and probably also halve the number of
> entries in the table if we insist that the indexable column be on the
> left before we start looking (in other words, "somevalue OP indexcol"
> must be commuted before we look for the operator in pg_amsecop, not after).
>
> This is not a 100% solution to our problems. I don't think we could use
> it to solve the problem for int2 columns ("int2col = 42") because it'd
> be unsafe to promise that an int4-to-int2 cast could be inserted into
> an expression without changing the behavior. So we'd still want to look
> at having small integer constants be initially typed as int2, which
> leaves us with a number of unsolved issues, as noted here:
> http://archives.postgresql.org/pgsql-hackers/2002-11/msg00468.php
> So maybe the "secondary operator" idea will help, or maybe it won't do
> much for us.
>
> Comments? Does this spur any better ideas?
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faqs/FAQ.html
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2003-09-16 17:24:05 Re: New thoughts about indexing cross-type comparisons
Previous Message Tom Lane 2003-09-16 16:30:37 New thoughts about indexing cross-type comparisons