Re: Expensive function and the optimizer

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Craig A(dot) James" <cjames(at)modgraph-usa(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Expensive function and the optimizer
Date: 2005-11-08 04:12:07
Message-ID: 1158.1131423127@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

"Craig A. James" <cjames(at)modgraph-usa(dot)com> writes:
> Is there some way to explain this cost to the optimizer in a permanent
> way,

Nope, sorry. One thing you could do in the particular case at hand is
to rejigger the WHERE clause involving the function so that it requires
values from both tables and therefore can't be applied till after the
join is made. (If nothing else, give the function an extra dummy
argument that can be passed as a variable from the other table.)
This is an ugly and non-general solution of course.

> The only way I've thought of is something like this:

> select X.id from
> (select A.id, A.foo, A.row_num
> from A join B ON (A.id = B.id)
> where A.row_num >= 0 and A.row_num <= 43477
> and B.id = 52) as X
> where myfunc(X.foo, 'FooBar') order by X.row_num;

As written, that won't work because the planner will happily flatten the
query to the same thing you had before. You can put an OFFSET 0 into
the sub-select to prevent that from happening, but realize that this
creates a pretty impervious optimization fence ... the side-effects
might be undesirable when you come to look at real queries instead
of toy cases.

regards, tom lane

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Charlie Savage 2005-11-08 07:05:01 Sort performance on large tables
Previous Message Andrew McMillan 2005-11-08 02:35:20 Re: Temporary Table