Re: Custom function in where clause

Lists: pgsql-performance
From: Pena Kupen <kupen(at)wippies(dot)fi>
To: pgsql-performance(at)postgresql(dot)org
Subject: Custom function in where clause
Date: 2012-07-10 08:36:15
Message-ID: 203599184.371341909375800.JavaMail.kupen@wippies.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi,

I have searched solution to my problem a few days. On my query, there is big performance problem.
It seems to me, that problem is on where-part of sql and it's function.

My sql is:
select count(*)
from table_h
where
level <= get_level_value(11268,id,area) and
(date1 >= '2011-1-1' or date2>='2011-1-1') and
(date1 <= '2012-07-09' or date2<='2012-07-09')
This takes about 40sek.

select count(*)
from table_h
where
(date1 >= '2011-1-1' or date2>='2011-1-1') and
(date1 <= '2012-07-09' or date2<='2012-07-09')
when ignoring function, it takes <1sek.

Function is:
CREATE OR REPLACE FUNCTION get_level_value(_user integer, _id, _area) RETURNS integer
AS $$
DECLARE found integer;
BEGIN
SELECT 1 INTO found
FROM table_o
WHERE userid=_user AND
id=_id AND
area=_area;
IF (found) THEN
return 3;
ELSE
return 1;
END IF;
END;
$$
LANGUAGE plpgsql;

On explain, it seems to me that this function is on filter and it will execute on every row. Total resultset contains 1 700 000 rows.
QUERY PLAN
Aggregate (cost=285543.89..285543.90 rows=1 width=0) (actual time=32391.380..32391.380 rows=1 loops=1)
-> Bitmap Heap Scan on table_h (cost=11017.63..284987.40 rows=222596 width=0) (actual time=326.946..31857.145 rows=631818 loops=1)
Recheck Cond: ((date1 >= '2011-01-01'::date) OR (date2 >= '2011-01-01'::date))
Filter: (((date1 <= '2012-07-09'::date) OR (date2 <= '2012-07-09'::date)) AND (level <= get_level_value(11268, id, area)))
-> BitmapOr (cost=11017.63..11017.63 rows=669412 width=0) (actual time=321.635..321.635 rows=0 loops=1)
-> Bitmap Index Scan on date1 (cost=0.00..10626.30 rows=652457 width=0) (actual time=84.555..84.555 rows=647870 loops=1)
Index Cond: (date1 >= '2011-01-01'::date)
-> Bitmap Index Scan on date2_table_h (cost=0.00..280.03 rows=16955 width=0) (actual time=237.074..237.074 rows=15222 loops=1)
Index Cond: (date2 >= '2011-01-01'::date)

How should I handle this situation and use function?

--
kupen

--
Wippies-vallankumous on täällä! Varmista paikkasi vallankumouksen eturintamassa ja liity Wippiesiin heti!
http://www.wippies.com/


From: Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>
To: Pena Kupen <kupen(at)wippies(dot)fi>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Custom function in where clause
Date: 2012-07-10 08:44:18
Message-ID: CAK-MWwRf6SNeha9bC72v2_hd4pSDDtoHYrji1s=yjntAoBXDQA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Tue, Jul 10, 2012 at 6:36 PM, Pena Kupen <kupen(at)wippies(dot)fi> wrote:

> Hi,
>
> I have searched solution to my problem a few days. On my query, there is
> big performance problem.
> It seems to me, that problem is on where-part of sql and it's function.
>
> My sql is:
> select count(*)
> from table_h where level <=
> get_level_value(11268,id,area) and (date1 >= '2011-1-1'
> or date2>='2011-1-1') and (date1 <= '2012-07-09' or
> date2<='2012-07-09')
> This takes about 40sek.
>
> select count(*)
> from table_h where (date1 >=
> '2011-1-1' or date2>='2011-1-1') and (date1 <=
> '2012-07-09' or date2<='2012-07-09')
> when ignoring function, it takes <1sek.
>
> Function is:
> CREATE OR REPLACE FUNCTION get_level_value(_user integer, _id, _area)
> RETURNS integer
> AS $$
> DECLARE found integer;
> BEGIN
> SELECT 1 INTO found
> FROM table_o
> WHERE userid=_user AND
> id=_id AND
> area=_area;
> IF (found) THEN
> return 3;
> ELSE
> return 1;
> END IF;
> END;
> $$
> LANGUAGE plpgsql;
>
> On explain, it seems to me that this function is on filter and it will
> execute on every row. Total resultset contains 1 700 000 rows.
> QUERY PLAN
> Aggregate (cost=285543.89..285543.90 rows=1 width=0) (actual
> time=32391.380..32391.380 rows=1 loops=1)
> -> Bitmap Heap Scan on table_h (cost=11017.63..284987.40 rows=222596
> width=0) (actual time=326.946..31857.145 rows=631818 loops=1)
> Recheck Cond: ((date1 >= '2011-01-01'::date) OR (date2 >=
> '2011-01-01'::date))
> Filter: (((date1 <= '2012-07-09'::date) OR (date2 <=
> '2012-07-09'::date)) AND (level <= get_level_value(11268, id, area)))
> -> BitmapOr (cost=11017.63..11017.63 rows=669412 width=0) (actual
> time=321.635..321.635 rows=0 loops=1)
> -> Bitmap Index Scan on date1 (cost=0.00..10626.30
> rows=652457 width=0) (actual time=84.555..84.555 rows=647870 loops=1)
> Index Cond: (date1 >= '2011-01-01'::date)
> -> Bitmap Index Scan on date2_table_h (cost=0.00..280.03
> rows=16955 width=0) (actual time=237.074..237.074 rows=15222 loops=1)
> Index Cond: (date2 >= '2011-01-01'::date)
>
> How should I handle this situation and use function?
>
>
You could not have good performance using function in case where direct
JOIN is only way to have reasonable performance.
Stop using function and write join with table_o instead, or put whole query
with join inside a function.

--
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/ <http://www.postgresql-consulting.com/>

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

Skype: maxim.boguk
Jabber: maxim(dot)boguk(at)gmail(dot)com
МойКруг: http://mboguk.moikrug.ru/

"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."


From: Brendan Jurd <direvus(at)gmail(dot)com>
To: Pena Kupen <kupen(at)wippies(dot)fi>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Custom function in where clause
Date: 2012-07-10 08:45:29
Message-ID: CADxJZo0NNM0u6197HoyemXT58sQgZPTN1oM-q+S92jgO7r=JRQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On 10 July 2012 18:36, Pena Kupen <kupen(at)wippies(dot)fi> wrote:
> Hi,
>
> I have searched solution to my problem a few days. On my query, there is big
> performance problem.
> It seems to me, that problem is on where-part of sql and it's function.
>

> How should I handle this situation and use function?
>

I would start by rewriting your function in plain SQL rather than
PL/pgSQL. As a general rule, don't write a function in PL/pgSQL
unless you really need procedurality. This function does not.

For example:

CREATE OR REPLACE FUNCTION get_level_value(_user integer, _id, _area)
RETURNS integer
AS $$
-- Return 3 if there are matching records in table_o, otherwise return 1.
SELECT CASE WHEN EXISTS (
SELECT id
FROM table_o
WHERE userid=_user AND
id=_id AND
area=_area
) THEN 3 ELSE 1 END;
$$
LANGUAGE sql STABLE;

Cheers,
BJ