Re: Retry: Is this possible / slow performance?

Lists: pgsql-performance
From: "Joost Kraaijeveld" <J(dot)Kraaijeveld(at)Askesis(dot)nl>
To: "Pgsql-Performance (E-mail)" <pgsql-performance(at)postgresql(dot)org>
Subject: Retry: Is this possible / slow performance?
Date: 2005-02-07 16:37:53
Message-ID: A3D1526C98B7C1409A687E0943EAC410605EF2@obelix.askesis.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi all,

A retry of the question asked before. All tables freshly vacuumed an analized.

Two queries: one with "set enable_seqscan = on" , the other with "set enable_seqscan = off". The first query lasts 59403 ms, the second query 31 ms ( the desc order variant has the same large difference: 122494 ms vs. 1297 ms). (for the query plans see below).

Can I, without changing the SQL (because it is generated by a tool) or explicitely setting "set enable_seqscan = off" for this query, trick PostgreSQL in taking the fast variant of the queryplan?

TIA

Groeten,

Joost Kraaijeveld
Askesis B.V.
Molukkenstraat 14
6524NB Nijmegen
tel: 024-3888063 / 06-51855277
fax: 024-3608416
e-mail: J(dot)Kraaijeveld(at)Askesis(dot)nl
web: www.askesis.nl

------------------------------- Query 1

begin;
set enable_seqscan = on;
declare SQL_CUR01 cursor for
SELECT A.ordernummer, B.klantnummer FROM "orders" A LEFT OUTER JOIN "klt_alg" B ON A.Klantnummer=B.Klantnummer
ORDER BY A.klantnummer;
fetch 100 in SQL_CUR01;
commit;

QUERY PLAN
Sort (cost=259968.77..262729.72 rows=1104380 width=12)
Sort Key: a.klantnummer, a.ordernummer
-> Hash Left Join (cost=42818.43..126847.70 rows=1104380 width=12)
Hash Cond: ("outer".klantnummer = "inner".klantnummer)
-> Seq Scan on orders a (cost=0.00..46530.79 rows=1104379 width=8)
-> Hash (cost=40635.14..40635.14 rows=368914 width=4)
-> Seq Scan on klt_alg b (cost=0.00..40635.14 rows=368914 width=4)

Actual running time: 59403 ms.

------------------------------- Query 2

begin;
set enable_seqscan = off;
declare SQL_CUR01 cursor for
SELECT A.ordernummer, B.klantnummer FROM "orders" A LEFT OUTER JOIN "klt_alg" B ON A.Klantnummer=B.Klantnummer
ORDER BY A.klantnummer;
fetch 100 in SQL_CUR01;
commit;

QUERY PLAN
Merge Left Join (cost=0.00..2586604.86 rows=1104380 width=12)
Merge Cond: ("outer".klantnummer = "inner".klantnummer)
-> Index Scan using orders_klantnummer on orders a (cost=0.00..2435790.17 rows=1104379 width=8)
-> Index Scan using klt_alg_klantnummer on klt_alg b (cost=0.00..44909.11 rows=368914 width=4)

Actual running time: 31 ms.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Joost Kraaijeveld" <J(dot)Kraaijeveld(at)Askesis(dot)nl>
Cc: "Pgsql-Performance (E-mail)" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Retry: Is this possible / slow performance?
Date: 2005-02-07 17:03:56
Message-ID: 27755.1107795836@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

"Joost Kraaijeveld" <J(dot)Kraaijeveld(at)Askesis(dot)nl> writes:
> Two queries: one with "set enable_seqscan = on" , the other with "set enable_seqscan = off". The first query lasts 59403 ms, the second query 31 ms ( the desc order variant has the same large difference: 122494 ms vs. 1297 ms). (for the query plans see below).

The reason for the difference is that the mergejoin plan has a much
lower startup cost than the hash plan, and since you're only fetching
100 rows the startup cost is dominant. IIRC the planner does make some
allowance for this effect when preparing a DECLARE CURSOR plan (ie,
it puts some weight on startup cost rather than considering only total
cost) ... but it's not so optimistic as to assume that you only want 100
out of an estimated 1 million+ result rows.

The best solution is probably to put a LIMIT into the DECLARE CURSOR,
so that the planner can see how much you intend to fetch.

regards, tom lane


From: PFC <lists(at)boutiquenumerique(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Joost Kraaijeveld" <J(dot)Kraaijeveld(at)askesis(dot)nl>
Cc: "Pgsql-Performance (E-mail)" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Retry: Is this possible / slow performance?
Date: 2005-02-07 18:16:47
Message-ID: opsluad9bhth1vuj@musicbox
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance


Does the planner also take into account that the Hash Join will need a
huge temporary space which will exist for the whole length of the cursor
existence (which may be quite long if he intends to fetch everything),
whereas the Merge Join should need very little space as it is sending the
rows as it fetches them using the Indexes ?

On Mon, 07 Feb 2005 12:03:56 -0500, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> "Joost Kraaijeveld" <J(dot)Kraaijeveld(at)Askesis(dot)nl> writes:
>> Two queries: one with "set enable_seqscan = on" , the other with "set
>> enable_seqscan = off". The first query lasts 59403 ms, the second query
>> 31 ms ( the desc order variant has the same large difference: 122494 ms
>> vs. 1297 ms). (for the query plans see below).
>
> The reason for the difference is that the mergejoin plan has a much
> lower startup cost than the hash plan, and since you're only fetching
> 100 rows the startup cost is dominant. IIRC the planner does make some
> allowance for this effect when preparing a DECLARE CURSOR plan (ie,
> it puts some weight on startup cost rather than considering only total
> cost) ... but it's not so optimistic as to assume that you only want 100
> out of an estimated 1 million+ result rows.
>
> The best solution is probably to put a LIMIT into the DECLARE CURSOR,
> so that the planner can see how much you intend to fetch.
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings
>