Lists: | pgsql-performance |
---|
From: | Pena Kupen <kupen(at)wippies(dot)fi> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Custom function in where clause |
Date: | 2012-07-10 08:36:15 |
Message-ID: | 203599184.371341909375800.JavaMail.kupen@wippies.fi |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Hi,
I have searched solution to my problem a few days. On my query, there is big performance problem.
It seems to me, that problem is on where-part of sql and it's function.
My sql is:
select count(*)
from table_h
where
level <= get_level_value(11268,id,area) and
(date1 >= '2011-1-1' or date2>='2011-1-1') and
(date1 <= '2012-07-09' or date2<='2012-07-09')
This takes about 40sek.
select count(*)
from table_h
where
(date1 >= '2011-1-1' or date2>='2011-1-1') and
(date1 <= '2012-07-09' or date2<='2012-07-09')
when ignoring function, it takes <1sek.
Function is:
CREATE OR REPLACE FUNCTION get_level_value(_user integer, _id, _area) RETURNS integer
AS $$
DECLARE found integer;
BEGIN
SELECT 1 INTO found
FROM table_o
WHERE userid=_user AND
id=_id AND
area=_area;
IF (found) THEN
return 3;
ELSE
return 1;
END IF;
END;
$$
LANGUAGE plpgsql;
On explain, it seems to me that this function is on filter and it will execute on every row. Total resultset contains 1 700 000 rows.
QUERY PLAN
Aggregate (cost=285543.89..285543.90 rows=1 width=0) (actual time=32391.380..32391.380 rows=1 loops=1)
-> Bitmap Heap Scan on table_h (cost=11017.63..284987.40 rows=222596 width=0) (actual time=326.946..31857.145 rows=631818 loops=1)
Recheck Cond: ((date1 >= '2011-01-01'::date) OR (date2 >= '2011-01-01'::date))
Filter: (((date1 <= '2012-07-09'::date) OR (date2 <= '2012-07-09'::date)) AND (level <= get_level_value(11268, id, area)))
-> BitmapOr (cost=11017.63..11017.63 rows=669412 width=0) (actual time=321.635..321.635 rows=0 loops=1)
-> Bitmap Index Scan on date1 (cost=0.00..10626.30 rows=652457 width=0) (actual time=84.555..84.555 rows=647870 loops=1)
Index Cond: (date1 >= '2011-01-01'::date)
-> Bitmap Index Scan on date2_table_h (cost=0.00..280.03 rows=16955 width=0) (actual time=237.074..237.074 rows=15222 loops=1)
Index Cond: (date2 >= '2011-01-01'::date)
How should I handle this situation and use function?
--
kupen
--
Wippies-vallankumous on täällä! Varmista paikkasi vallankumouksen eturintamassa ja liity Wippiesiin heti!
http://www.wippies.com/
From: | Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com> |
---|---|
To: | Pena Kupen <kupen(at)wippies(dot)fi> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Custom function in where clause |
Date: | 2012-07-10 08:44:18 |
Message-ID: | CAK-MWwRf6SNeha9bC72v2_hd4pSDDtoHYrji1s=yjntAoBXDQA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
On Tue, Jul 10, 2012 at 6:36 PM, Pena Kupen <kupen(at)wippies(dot)fi> wrote:
> Hi,
>
> I have searched solution to my problem a few days. On my query, there is
> big performance problem.
> It seems to me, that problem is on where-part of sql and it's function.
>
> My sql is:
> select count(*)
> from table_h where level <=
> get_level_value(11268,id,area) and (date1 >= '2011-1-1'
> or date2>='2011-1-1') and (date1 <= '2012-07-09' or
> date2<='2012-07-09')
> This takes about 40sek.
>
> select count(*)
> from table_h where (date1 >=
> '2011-1-1' or date2>='2011-1-1') and (date1 <=
> '2012-07-09' or date2<='2012-07-09')
> when ignoring function, it takes <1sek.
>
> Function is:
> CREATE OR REPLACE FUNCTION get_level_value(_user integer, _id, _area)
> RETURNS integer
> AS $$
> DECLARE found integer;
> BEGIN
> SELECT 1 INTO found
> FROM table_o
> WHERE userid=_user AND
> id=_id AND
> area=_area;
> IF (found) THEN
> return 3;
> ELSE
> return 1;
> END IF;
> END;
> $$
> LANGUAGE plpgsql;
>
> On explain, it seems to me that this function is on filter and it will
> execute on every row. Total resultset contains 1 700 000 rows.
> QUERY PLAN
> Aggregate (cost=285543.89..285543.90 rows=1 width=0) (actual
> time=32391.380..32391.380 rows=1 loops=1)
> -> Bitmap Heap Scan on table_h (cost=11017.63..284987.40 rows=222596
> width=0) (actual time=326.946..31857.145 rows=631818 loops=1)
> Recheck Cond: ((date1 >= '2011-01-01'::date) OR (date2 >=
> '2011-01-01'::date))
> Filter: (((date1 <= '2012-07-09'::date) OR (date2 <=
> '2012-07-09'::date)) AND (level <= get_level_value(11268, id, area)))
> -> BitmapOr (cost=11017.63..11017.63 rows=669412 width=0) (actual
> time=321.635..321.635 rows=0 loops=1)
> -> Bitmap Index Scan on date1 (cost=0.00..10626.30
> rows=652457 width=0) (actual time=84.555..84.555 rows=647870 loops=1)
> Index Cond: (date1 >= '2011-01-01'::date)
> -> Bitmap Index Scan on date2_table_h (cost=0.00..280.03
> rows=16955 width=0) (actual time=237.074..237.074 rows=15222 loops=1)
> Index Cond: (date2 >= '2011-01-01'::date)
>
> How should I handle this situation and use function?
>
>
You could not have good performance using function in case where direct
JOIN is only way to have reasonable performance.
Stop using function and write join with table_o instead, or put whole query
with join inside a function.
--
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/ <http://www.postgresql-consulting.com/>
Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678
Skype: maxim.boguk
Jabber: maxim(dot)boguk(at)gmail(dot)com
МойКруг: http://mboguk.moikrug.ru/
"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."
From: | Brendan Jurd <direvus(at)gmail(dot)com> |
---|---|
To: | Pena Kupen <kupen(at)wippies(dot)fi> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Custom function in where clause |
Date: | 2012-07-10 08:45:29 |
Message-ID: | CADxJZo0NNM0u6197HoyemXT58sQgZPTN1oM-q+S92jgO7r=JRQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
On 10 July 2012 18:36, Pena Kupen <kupen(at)wippies(dot)fi> wrote:
> Hi,
>
> I have searched solution to my problem a few days. On my query, there is big
> performance problem.
> It seems to me, that problem is on where-part of sql and it's function.
>
> How should I handle this situation and use function?
>
I would start by rewriting your function in plain SQL rather than
PL/pgSQL. As a general rule, don't write a function in PL/pgSQL
unless you really need procedurality. This function does not.
For example:
CREATE OR REPLACE FUNCTION get_level_value(_user integer, _id, _area)
RETURNS integer
AS $$
-- Return 3 if there are matching records in table_o, otherwise return 1.
SELECT CASE WHEN EXISTS (
SELECT id
FROM table_o
WHERE userid=_user AND
id=_id AND
area=_area
) THEN 3 ELSE 1 END;
$$
LANGUAGE sql STABLE;
Cheers,
BJ