Re: Hints proposal

From: "Craig A(dot) James" <cjames(at)modgraph-usa(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Hints proposal
Date: 2006-10-15 23:55:54
Message-ID: 4532CA8A.5050308@modgraph-usa.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

Josh Berkus wrote:
> I actually think the way to attack this issue is to discuss the kinds of
> errors the planner makes, and what tweaks we could do to correct them.
> Here's the ones I'm aware of:
>
> -- Incorrect selectivity of WHERE clause
> -- Incorrect selectivity of JOIN
> -- Wrong estimate of rows returned from SRF
> -- Incorrect cost estimate for index use
>
> Can you think of any others?

The one that started this discussion: Lack of cost information for functions. I think this feature is a good idea independent of the whole HINTS discussion.

At a minimum, a rough categorization is needed, such as "Lighning fast / Fast / Medium / Slow / Ludicrously slow", with some sort if milliseconds or CPU cycles associated with each category. Or perhaps something like, "This is (much faster|faster|same as|slower|much slower) than reading a block from the disk."

If I understand Tom and others, the planner already is capable of taking advantage of this information, it just doesn't have it yet. It could be part of the CREATE FUNCTION command.

CREATE OR REPLACE FUNCTION foobar(text, text, text) RETURNS text
AS '/usr/local/pgsql/lib/foobar.so', 'foobar'
COST LUDICROUSLY_SLOW
LANGUAGE 'C' STRICT;

Better yet ('tho I have no idea how hard this would be to implement...) would be an optional second function with the same parameter signature as the main function, but it would return a cost estimate:

CREATE OR REPLACE FUNCTION foobar(text, text, text) RETURNS text
AS '/usr/local/pgsql/lib/foobar.so', 'foobar'
COST foobar_cost
LANGUAGE 'C' STRICT;

The planner could call it with the same parameters it was about to use, and get an accurate estimate for the specific operation that is about to be done. In my particular case (running an NP-complete problem), there are cases where I can determine ahead of time that the function will be fast, but in most cases it is *really* slow.

Craig

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig A. James 2006-10-16 00:25:31 Re: Hints proposal
Previous Message mark 2006-10-15 23:55:42 Re: Postgresql Caching

Browse pgsql-performance by date

  From Date Subject
Next Message Craig A. James 2006-10-16 00:25:31 Re: Hints proposal
Previous Message Carlo Stonebanks 2006-10-15 22:48:28 Re: Performance Optimization for Dummies 2 - the SQL