Quick Links

Re: performance drop on 8.2.4, reverting to 8.1.4

Lists:	pgsql-performance

From:	"Liviu Ionescu" <ilgb(at)livius(dot)net>
To:	<pgsql-performance(at)postgresql(dot)org>
Subject:	performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-05-18 09:02:44
Message-ID:	006501c7992b$4d29ffb0$653e10ac@ilgvaio
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

I recently tried to upgrade to 8.2.4, but major queries I wrote for 8.1.4 are now planned differently on 8.2.4 and are no longer usable. What the 8.1.4 planned as a series of 'hash left join's and took about 2 seconds now is planned as 'nested loop left joins' and takes forever.

Other request were also affected, increasing the time form miliseconds to hundreds of miliseconds, even seconds.

The worst performance hit was on the following query. I know it is a bit extreme, but worked perfectly on 8.1.4.

Regards,

Liviu

SELECT n.nodeid,
CASE
WHEN n.parentnodeid IS NULL THEN -1
ELSE n.parentnodeid
END AS parentnodeid, n.nodename, av.value AS iconname,
avt.value AS templatename, avs.value AS subclass, n.globalnodeid, n.isaddupi,
CASE
WHEN realms.nodeid IS NOT NULL THEN 'SERVER'::text
WHEN areas.nodeid IS NOT NULL THEN 'AREA'::text
WHEN rtus.nodeid IS NOT NULL THEN 'DEVICE'::text
WHEN rtunodes.nodeid IS NOT NULL THEN 'TAG'::text
ELSE NULL::text
END AS "class", realms.name AS realmname,
CASE
WHEN n.nodeclass::text = 'area'::text AND n.nodesubclass IS NOT NULL THEN true
ELSE false
END AS istemplate,
CASE
WHEN realms.nodeid IS NOT NULL THEN realms.nodeid
WHEN areas.nodeid IS NOT NULL THEN areas.realmid
WHEN rtus.nodeid IS NOT NULL THEN rtus.realmid
WHEN rtunodes.nodeid IS NOT NULL THEN r.realmid
ELSE NULL::integer
END AS realmid, rtunodes.rtuid, rtunodes.isinvalid, n.isvalid
FROM nodes n
LEFT JOIN realms ON n.nodeid = realms.nodeid
LEFT JOIN areas ON n.nodeid = areas.nodeid
LEFT JOIN rtus ON n.nodeid = rtus.nodeid
LEFT JOIN templates ON n.nodeid = templates.nodeid
LEFT JOIN templatenodes ON n.nodeid = templatenodes.nodeid
LEFT JOIN (rtunodes
JOIN rtus r ON rtunodes.rtuid = r.nodeid) ON n.nodeid = rtunodes.nodeid
LEFT JOIN ( SELECT attributes_values2_view.nodeid, attributes_values2_view.value
FROM attributes_values2_view
WHERE attributes_values2_view.attributename::text = 'iconName'::text) av ON n.nodeid = av.nodeid
LEFT JOIN ( SELECT attributes_values2_view.nodeid, attributes_values2_view.value
FROM attributes_values2_view
WHERE attributes_values2_view.attributename::text = 'addUPItemplate'::text) avt ON n.nodeid = avt.nodeid
LEFT JOIN ( SELECT attributes_values2_view.nodeid, attributes_values2_view.value
FROM attributes_values2_view
WHERE attributes_values2_view.attributename::text = 'addUPIsubclass'::text) avs ON n.nodeid = avs.nodeid
WHERE templates.nodeid IS NULL AND templatenodes.nodeid IS NULL;

CREATE OR REPLACE VIEW attributes_values2_view AS
SELECT nodeattributes.nodeid, nodeattributes.attributeid, a.name AS attributename,
t.name AS typename, a.typeid, a.valuesize, a.flags, nodeattributes.value, a.creationdate
FROM nodeattributes
LEFT JOIN attributes a USING (attributeid)
LEFT JOIN types t USING (typeid)
WHERE t.isattributetype;

the 8.2.4 plan with join_collapse_limit = 1 (with default it was worse, full of nested loops)

"Nested Loop Left Join (cost=32.01..2012.31 rows=1 width=230)"
" Join Filter: (n.nodeid = public.nodeattributes.nodeid)"
" -> Nested Loop Left Join (cost=26.47..1411.38 rows=1 width=220)"
" Join Filter: (n.nodeid = public.nodeattributes.nodeid)"
" -> Nested Loop Left Join (cost=20.93..810.45 rows=1 width=210)"
" Join Filter: (n.nodeid = public.nodeattributes.nodeid)"
" -> Nested Loop Left Join (cost=15.39..209.52 rows=1 width=200)"
" Join Filter: (n.nodeid = rtunodes.nodeid)"
" -> Nested Loop Left Join (cost=11.14..122.60 rows=1 width=187)"
" Filter: (templatenodes.nodeid IS NULL)"
" -> Hash Left Join (cost=11.14..99.52 rows=11 width=187)"
" Hash Cond: (n.nodeid = templates.nodeid)"
" Filter: (templates.nodeid IS NULL)"
" -> Hash Left Join (cost=8.70..87.95 rows=2266 width=187)"
" Hash Cond: (n.nodeid = rtus.nodeid)"
" -> Hash Left Join (cost=4.45..74.20 rows=2266 width=179)"
" Hash Cond: (n.nodeid = areas.nodeid)"
" -> Hash Left Join (cost=1.45..61.81 rows=2266 width=171)"
" Hash Cond: (n.nodeid = realms.nodeid)"
" -> Seq Scan on nodes n (cost=0.00..51.66 rows=2266 width=49)"
" -> Hash (cost=1.20..1.20 rows=20 width=122)"
" -> Seq Scan on realms (cost=0.00..1.20 rows=20 width=122)"
" -> Hash (cost=1.89..1.89 rows=89 width=8)"
" -> Seq Scan on areas (cost=0.00..1.89 rows=89 width=8)"
" -> Hash (cost=3.00..3.00 rows=100 width=8)"
" -> Seq Scan on rtus (cost=0.00..3.00 rows=100 width=8)"
" -> Hash (cost=1.64..1.64 rows=64 width=4)"
" -> Seq Scan on templates (cost=0.00..1.64 rows=64 width=4)"
" -> Index Scan using nodeid_pkey on templatenodes (cost=0.00..2.09 rows=1 width=4)"
" Index Cond: (n.nodeid = templatenodes.nodeid)"
" -> Hash Join (cost=4.25..63.93 rows=1839 width=13)"
" Hash Cond: (rtunodes.rtuid = r.nodeid)"
" -> Seq Scan on rtunodes (cost=0.00..34.39 rows=1839 width=9)"
" -> Hash (cost=3.00..3.00 rows=100 width=8)"
" -> Seq Scan on rtus r (cost=0.00..3.00 rows=100 width=8)"
" -> Hash Join (cost=5.54..600.89 rows=3 width=14)"
" Hash Cond: (a.typeid = t.typeid)"
" -> Hash Join (cost=4.38..599.23 rows=125 width=18)"
" Hash Cond: (public.nodeattributes.attributeid = a.attributeid)"
" -> Seq Scan on nodeattributes (cost=0.00..505.35 rows=23535 width=18)"
" -> Hash (cost=4.36..4.36 rows=1 width=8)"
" -> Seq Scan on attributes a (cost=0.00..4.36 rows=1 width=8)"
" Filter: ((name)::text = 'iconName'::text)"
" -> Hash (cost=1.10..1.10 rows=5 width=4)"
" -> Seq Scan on types t (cost=0.00..1.10 rows=5 width=4)"
" Filter: isattributetype"
" -> Hash Join (cost=5.54..600.89 rows=3 width=14)"
" Hash Cond: (a.typeid = t.typeid)"
" -> Hash Join (cost=4.38..599.23 rows=125 width=18)"
" Hash Cond: (public.nodeattributes.attributeid = a.attributeid)"
" -> Seq Scan on nodeattributes (cost=0.00..505.35 rows=23535 width=18)"
" -> Hash (cost=4.36..4.36 rows=1 width=8)"
" -> Seq Scan on attributes a (cost=0.00..4.36 rows=1 width=8)"
" Filter: ((name)::text = 'addUPItemplate'::text)"
" -> Hash (cost=1.10..1.10 rows=5 width=4)"
" -> Seq Scan on types t (cost=0.00..1.10 rows=5 width=4)"
" Filter: isattributetype"
" -> Hash Join (cost=5.54..600.89 rows=3 width=14)"
" Hash Cond: (a.typeid = t.typeid)"
" -> Hash Join (cost=4.38..599.23 rows=125 width=18)"
" Hash Cond: (public.nodeattributes.attributeid = a.attributeid)"
" -> Seq Scan on nodeattributes (cost=0.00..505.35 rows=23535 width=18)"
" -> Hash (cost=4.36..4.36 rows=1 width=8)"
" -> Seq Scan on attributes a (cost=0.00..4.36 rows=1 width=8)"
" Filter: ((name)::text = 'addUPIsubclass'::text)"
" -> Hash (cost=1.10..1.10 rows=5 width=4)"
" -> Seq Scan on types t (cost=0.00..1.10 rows=5 width=4)"
" Filter: isattributetype"

the 8.1.4 plan

"Hash Left Join (cost=1587.19..1775.85 rows=2270 width=230)"
" Hash Cond: ("outer".nodeid = "inner".nodeid)"
" -> Hash Left Join (cost=1086.04..1257.64 rows=2270 width=220)"
" Hash Cond: ("outer".nodeid = "inner".nodeid)"
" -> Hash Left Join (cost=584.89..745.10 rows=2270 width=210)"
" Hash Cond: ("outer".nodeid = "inner".nodeid)"
" -> Hash Left Join (cost=83.74..232.55 rows=2270 width=200)"
" Hash Cond: ("outer".nodeid = "inner".nodeid)"
" -> Hash Left Join (cost=14.47..128.10 rows=2270 width=187)"
" Hash Cond: ("outer".nodeid = "inner".nodeid)"
" Filter: ("inner".nodeid IS NULL)"
" -> Hash Left Join (cost=8.43..108.26 rows=2270 width=187)"
" Hash Cond: ("outer".nodeid = "inner".nodeid)"
" Filter: ("inner".nodeid IS NULL)"
" -> Hash Left Join (cost=6.62..94.47 rows=2270 width=187)"
" Hash Cond: ("outer".nodeid = "inner".nodeid)"
" -> Hash Left Join (cost=3.30..78.74 rows=2270 width=179)"
" Hash Cond: ("outer".nodeid = "inner".nodeid)"
" -> Hash Left Join (cost=1.24..64.48 rows=2270 width=171)"
" Hash Cond: ("outer".nodeid = "inner".nodeid)"
" -> Seq Scan on nodes n (cost=0.00..51.70 rows=2270 width=49)"
" -> Hash (cost=1.19..1.19 rows=19 width=122)"
" -> Seq Scan on realms (cost=0.00..1.19 rows=19 width=122)"
" -> Hash (cost=1.85..1.85 rows=85 width=8)"
" -> Seq Scan on areas (cost=0.00..1.85 rows=85 width=8)"
" -> Hash (cost=3.06..3.06 rows=106 width=8)"
" -> Seq Scan on rtus (cost=0.00..3.06 rows=106 width=8)"
" -> Hash (cost=1.64..1.64 rows=64 width=4)"
" -> Seq Scan on templates (cost=0.00..1.64 rows=64 width=4)"
" -> Hash (cost=5.44..5.44 rows=244 width=4)"
" -> Seq Scan on templatenodes (cost=0.00..5.44 rows=244 width=4)"
" -> Hash (cost=64.72..64.72 rows=1816 width=13)"
" -> Hash Join (cost=3.33..64.72 rows=1816 width=13)"
" Hash Cond: ("outer".rtuid = "inner".nodeid)"
" -> Seq Scan on rtunodes (cost=0.00..34.16 rows=1816 width=9)"
" -> Hash (cost=3.06..3.06 rows=106 width=8)"
" -> Seq Scan on rtus r (cost=0.00..3.06 rows=106 width=8)"
" -> Hash (cost=501.14..501.14 rows=4 width=14)"
" -> Nested Loop (cost=207.37..501.14 rows=4 width=14)"
" -> Nested Loop (cost=0.00..5.44 rows=1 width=4)"
" Join Filter: ("outer".typeid = "inner".typeid)"
" -> Seq Scan on attributes a (cost=0.00..4.28 rows=1 width=8)"
" Filter: ((name)::text = 'iconName'::text)"
" -> Seq Scan on types t (cost=0.00..1.10 rows=5 width=4)"
" Filter: isattributetype"
" -> Bitmap Heap Scan on nodeattributes (cost=207.37..493.33 rows=190 width=18)"
" Recheck Cond: (nodeattributes.attributeid = "outer".attributeid)"
" -> Bitmap Index Scan on nodeattributes_pkey (cost=0.00..207.37 rows=190 width=0)"
" Index Cond: (nodeattributes.attributeid = "outer".attributeid)"
" -> Hash (cost=501.14..501.14 rows=4 width=14)"
" -> Nested Loop (cost=207.37..501.14 rows=4 width=14)"
" -> Nested Loop (cost=0.00..5.44 rows=1 width=4)"
" Join Filter: ("outer".typeid = "inner".typeid)"
" -> Seq Scan on attributes a (cost=0.00..4.28 rows=1 width=8)"
" Filter: ((name)::text = 'addUPItemplate'::text)"
" -> Seq Scan on types t (cost=0.00..1.10 rows=5 width=4)"
" Filter: isattributetype"
" -> Bitmap Heap Scan on nodeattributes (cost=207.37..493.33 rows=190 width=18)"
" Recheck Cond: (nodeattributes.attributeid = "outer".attributeid)"
" -> Bitmap Index Scan on nodeattributes_pkey (cost=0.00..207.37 rows=190 width=0)"
" Index Cond: (nodeattributes.attributeid = "outer".attributeid)"
" -> Hash (cost=501.14..501.14 rows=4 width=14)"
" -> Nested Loop (cost=207.37..501.14 rows=4 width=14)"
" -> Nested Loop (cost=0.00..5.44 rows=1 width=4)"
" Join Filter: ("outer".typeid = "inner".typeid)"
" -> Seq Scan on attributes a (cost=0.00..4.28 rows=1 width=8)"
" Filter: ((name)::text = 'addUPIsubclass'::text)"
" -> Seq Scan on types t (cost=0.00..1.10 rows=5 width=4)"
" Filter: isattributetype"
" -> Bitmap Heap Scan on nodeattributes (cost=207.37..493.33 rows=190 width=18)"
" Recheck Cond: (nodeattributes.attributeid = "outer".attributeid)"
" -> Bitmap Index Scan on nodeattributes_pkey (cost=0.00..207.37 rows=190 width=0)"
" Index Cond: (nodeattributes.attributeid = "outer".attributeid)"

From:	"Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>
To:	Liviu Ionescu <ilgb(at)livius(dot)net>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-05-18 09:49:21
Message-ID:	20070518094921.GA13598@uio.no
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

On Fri, May 18, 2007 at 12:02:44PM +0300, Liviu Ionescu wrote:
> the 8.2.4 plan with join_collapse_limit = 1 (with default it was worse, full of nested loops)

It will probably be useful with EXPLAIN ANALYZE of your queries, not just the
EXPLAIN.

> "Nested Loop Left Join (cost=32.01..2012.31 rows=1 width=230)"

It looks like the planner thinks this is going to be really cheap -- so it's
misestimating something somewhere. Have you ANALYZEd recently?

/* Steinar */
--
Homepage: http://www.sesse.net/

From:	"Liviu Ionescu" <ilgb(at)livius(dot)net>
To:	<pgsql-performance(at)postgresql(dot)org>
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-05-18 10:14:56
Message-ID:	007501c79935$63072ec0$653e10ac@ilgvaio
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

> It will probably be useful with EXPLAIN ANALYZE of your
> queries, not just the EXPLAIN.

it took 245 seconds to complete, see below.

> It looks like the planner thinks this is going to be really
> cheap -- so it's misestimating something somewhere. Have you
> ANALYZEd recently?

yes, but to be sure I did it again before issuing the request; no improvements...

regards,

Liviu

"Nested Loop Left Join (cost=32.03..2026.70 rows=1 width=125) (actual time=16.686..244822.521 rows=2026 loops=1)"
" Join Filter: (n.nodeid = public.nodeattributes.nodeid)"
" -> Nested Loop Left Join (cost=26.55..1420.57 rows=1 width=115) (actual time=13.833..176136.527 rows=2026 loops=1)"
" Join Filter: (n.nodeid = public.nodeattributes.nodeid)"
" -> Nested Loop Left Join (cost=21.06..810.90 rows=1 width=105) (actual time=10.336..95476.175 rows=2026 loops=1)"
" Join Filter: (n.nodeid = public.nodeattributes.nodeid)"
" -> Nested Loop Left Join (cost=15.55..194.15 rows=1 width=95) (actual time=6.514..11524.892 rows=2026 loops=1)"
" Join Filter: (n.nodeid = rtunodes.nodeid)"
" -> Nested Loop Left Join (cost=11.17..107.94 rows=1 width=82) (actual time=0.661..71.751 rows=2026 loops=1)"
" Filter: (templatenodes.nodeid IS NULL)"
" -> Hash Left Join (cost=11.17..99.66 rows=1 width=82) (actual time=0.643..36.053 rows=2206 loops=1)"
" Hash Cond: (n.nodeid = templates.nodeid)"
" Filter: (templates.nodeid IS NULL)"
" -> Hash Left Join (cost=8.73..88.06 rows=2270 width=82) (actual time=0.502..27.756 rows=2270 loops=1)"
" Hash Cond: (n.nodeid = rtus.nodeid)"
" -> Hash Left Join (cost=4.34..74.11 rows=2270 width=74) (actual time=0.286..20.179 rows=2270 loops=1)"
" Hash Cond: (n.nodeid = areas.nodeid)"
" -> Hash Left Join (cost=1.43..61.83 rows=2270 width=66) (actual time=0.114..13.062 rows=2270 loops=1)"
" Hash Cond: (n.nodeid = realms.nodeid)"
" -> Seq Scan on nodes n (cost=0.00..51.70 rows=2270 width=49) (actual time=0.016..4.089 rows=2270 loops=1)"
" -> Hash (cost=1.19..1.19 rows=19 width=17) (actual time=0.056..0.056 rows=19 loops=1)"
" -> Seq Scan on realms (cost=0.00..1.19 rows=19 width=17) (actual time=0.006..0.023 rows=19 loops=1)"
" -> Hash (cost=1.85..1.85 rows=85 width=8) (actual time=0.156..0.156 rows=85 loops=1)"
" -> Seq Scan on areas (cost=0.00..1.85 rows=85 width=8) (actual time=0.007..0.070 rows=85 loops=1)"
" -> Hash (cost=3.06..3.06 rows=106 width=8) (actual time=0.200..0.200 rows=106 loops=1)"
" -> Seq Scan on rtus (cost=0.00..3.06 rows=106 width=8) (actual time=0.010..0.105 rows=106 loops=1)"
" -> Hash (cost=1.64..1.64 rows=64 width=4) (actual time=0.119..0.119 rows=64 loops=1)"
" -> Seq Scan on templates (cost=0.00..1.64 rows=64 width=4) (actual time=0.006..0.059 rows=64 loops=1)"
" -> Index Scan using nodeid_pkey on templatenodes (cost=0.00..8.27 rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=2206)"
" Index Cond: (n.nodeid = templatenodes.nodeid)"
" -> Hash Join (cost=4.38..63.51 rows=1816 width=13) (actual time=0.012..4.417 rows=1816 loops=2026)"
" Hash Cond: (rtunodes.rtuid = r.nodeid)"
" -> Seq Scan on rtunodes (cost=0.00..34.16 rows=1816 width=9) (actual time=0.009..1.290 rows=1816 loops=2026)"
" -> Hash (cost=3.06..3.06 rows=106 width=8) (actual time=0.194..0.194 rows=106 loops=1)"
" -> Seq Scan on rtus r (cost=0.00..3.06 rows=106 width=8) (actual time=0.005..0.091 rows=106 loops=1)"
" -> Hash Join (cost=5.51..611.90 rows=388 width=14) (actual time=0.033..39.896 rows=2079 loops=2026)"
" Hash Cond: (a.typeid = t.typeid)"
" -> Hash Join (cost=4.34..604.41 rows=647 width=18) (actual time=0.031..36.513 rows=2079 loops=2026)"
" Hash Cond: (public.nodeattributes.attributeid = a.attributeid)"
" -> Seq Scan on nodeattributes (cost=0.00..505.35 rows=23535 width=18) (actual time=0.008..16.826 rows=23535 loops=2026)"
" -> Hash (cost=4.28..4.28 rows=5 width=8) (actual time=0.077..0.077 rows=5 loops=1)"
" -> Seq Scan on attributes a (cost=0.00..4.28 rows=5 width=8) (actual time=0.033..0.067 rows=5 loops=1)"
" Filter: ((name)::text = 'iconName'::text)"
" -> Hash (cost=1.10..1.10 rows=6 width=4) (actual time=0.023..0.023 rows=6 loops=1)"
" -> Seq Scan on types t (cost=0.00..1.10 rows=6 width=4) (actual time=0.005..0.012 rows=6 loops=1)"
" Filter: isattributetype"
" -> Hash Join (cost=5.49..606.76 rows=233 width=14) (actual time=0.104..38.474 rows=1865 loops=2026)"
" Hash Cond: (a.typeid = t.typeid)"
" -> Hash Join (cost=4.31..601.80 rows=388 width=18) (actual time=0.101..35.484 rows=1865 loops=2026)"
" Hash Cond: (public.nodeattributes.attributeid = a.attributeid)"
" -> Seq Scan on nodeattributes (cost=0.00..505.35 rows=23535 width=18) (actual time=0.008..16.624 rows=23535 loops=2026)"
" -> Hash (cost=4.28..4.28 rows=3 width=8) (actual time=0.071..0.071 rows=3 loops=1)"
" -> Seq Scan on attributes a (cost=0.00..4.28 rows=3 width=8) (actual time=0.019..0.058 rows=3 loops=1)"
" Filter: ((name)::text = 'addUPItemplate'::text)"
" -> Hash (cost=1.10..1.10 rows=6 width=4) (actual time=0.025..0.025 rows=6 loops=1)"
" -> Seq Scan on types t (cost=0.00..1.10 rows=6 width=4) (actual time=0.004..0.012 rows=6 loops=1)"
" Filter: isattributetype"
" -> Hash Join (cost=5.48..604.19 rows=155 width=14) (actual time=0.794..33.783 rows=132 loops=2026)"
" Hash Cond: (a.typeid = t.typeid)"
" -> Hash Join (cost=4.30..600.50 rows=259 width=18) (actual time=0.791..33.550 rows=132 loops=2026)"
" Hash Cond: (public.nodeattributes.attributeid = a.attributeid)"
" -> Seq Scan on nodeattributes (cost=0.00..505.35 rows=23535 width=18) (actual time=0.008..16.623 rows=23535 loops=2026)"
" -> Hash (cost=4.28..4.28 rows=2 width=8) (actual time=0.060..0.060 rows=2 loops=1)"
" -> Seq Scan on attributes a (cost=0.00..4.28 rows=2 width=8) (actual time=0.015..0.054 rows=2 loops=1)"
" Filter: ((name)::text = 'addUPIsubclass'::text)"
" -> Hash (cost=1.10..1.10 rows=6 width=4) (actual time=0.022..0.022 rows=6 loops=1)"
" -> Seq Scan on types t (cost=0.00..1.10 rows=6 width=4) (actual time=0.003..0.009 rows=6 loops=1)"
" Filter: isattributetype"
"Total runtime: 244826.065 ms"

From:	"Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-05-18 10:46:46
Message-ID:	20070518104646.GA13811@uio.no
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

On Fri, May 18, 2007 at 01:14:56PM +0300, Liviu Ionescu wrote:
> yes, but to be sure I did it again before issuing the request; no improvements...

Is this with the join collapse limit set to 1, or with default? (Default is
generally more interesting.)

/* Steinar */
--
Homepage: http://www.sesse.net/

From:	"Liviu Ionescu" <ilgb(at)livius(dot)net>
To:	<pgsql-performance(at)postgresql(dot)org>
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-05-18 11:05:36
Message-ID:	008201c7993c$76c88a10$653e10ac@ilgvaio
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

> Is this with the join collapse limit set to 1, or with
> default? (Default is generally more interesting.)

below is the same query with the default setting.

regards,

Liviu

"Nested Loop Left Join (cost=23.35..1965.46 rows=1 width=125) (actual time=50.408..231926.123 rows=2026 loops=1)"
" Join Filter: (n.nodeid = public.nodeattributes.nodeid)"
" -> Nested Loop Left Join (cost=17.81..1357.58 rows=1 width=115) (actual time=47.103..156521.050 rows=2026 loops=1)"
" Join Filter: (n.nodeid = public.nodeattributes.nodeid)"
" -> Nested Loop Left Join (cost=12.30..752.97 rows=1 width=105) (actual time=43.924..81977.726 rows=2026 loops=1)"
" Join Filter: (n.nodeid = public.nodeattributes.nodeid)"
" -> Nested Loop Left Join (cost=6.83..150.65 rows=1 width=95) (actual time=40.603..12477.227 rows=2026 loops=1)"
" -> Nested Loop Left Join (cost=6.83..150.37 rows=1 width=78) (actual time=38.448..12459.918 rows=2026 loops=1)"
" -> Nested Loop Left Join (cost=6.83..150.08 rows=1 width=70) (actual time=31.793..12436.536 rows=2026 loops=1)"
" -> Nested Loop Left Join (cost=6.83..149.80 rows=1 width=62) (actual time=6.588..12394.366 rows=2026 loops=1)"
" Filter: (templatenodes.nodeid IS NULL)"
" -> Nested Loop Left Join (cost=6.83..149.51 rows=1 width=62) (actual time=6.525..12362.969 rows=2206 loops=1)"
" Join Filter: (n.nodeid = rtunodes.nodeid)"
" -> Hash Left Join (cost=2.44..63.29 rows=1 width=49) (actual time=0.361..14.426 rows=2206 loops=1)"
" Hash Cond: (n.nodeid = templates.nodeid)"
" Filter: (templates.nodeid IS NULL)"
" -> Seq Scan on nodes n (cost=0.00..51.70 rows=2270 width=49) (actual time=0.071..4.417 rows=2270 loops=1)"
" -> Hash (cost=1.64..1.64 rows=64 width=4) (actual time=0.152..0.152 rows=64 loops=1)"
" -> Seq Scan on templates (cost=0.00..1.64 rows=64 width=4) (actual time=0.032..0.082 rows=64 loops=1)"
" -> Hash Join (cost=4.38..63.51 rows=1816 width=13) (actual time=0.011..4.365 rows=1816 loops=2206)"
" Hash Cond: (rtunodes.rtuid = r.nodeid)"
" -> Seq Scan on rtunodes (cost=0.00..34.16 rows=1816 width=9) (actual time=0.008..1.276 rows=1816 loops=2206)"
" -> Hash (cost=3.06..3.06 rows=106 width=8) (actual time=0.241..0.241 rows=106 loops=1)"
" -> Seq Scan on rtus r (cost=0.00..3.06 rows=106 width=8) (actual time=0.029..0.136 rows=106 loops=1)"
" -> Index Scan using nodeid_pkey on templatenodes (cost=0.00..0.28 rows=1 width=4) (actual time=0.008..0.008 rows=0 loops=2206)"
" Index Cond: (n.nodeid = templatenodes.nodeid)"
" -> Index Scan using rtus_pkey on rtus (cost=0.00..0.27 rows=1 width=8) (actual time=0.016..0.016 rows=0 loops=2026)"
" Index Cond: (n.nodeid = rtus.nodeid)"
" -> Index Scan using areas_pkey on areas (cost=0.00..0.27 rows=1 width=8) (actual time=0.007..0.007 rows=0 loops=2026)"
" Index Cond: (n.nodeid = areas.nodeid)"
" -> Index Scan using realms_pkey on realms (cost=0.00..0.27 rows=1 width=17) (actual time=0.004..0.004 rows=0 loops=2026)"
" Index Cond: (n.nodeid = realms.nodeid)"
" -> Hash Join (cost=5.48..600.38 rows=155 width=14) (actual time=0.812..34.198 rows=132 loops=2026)"
" Hash Cond: (public.nodeattributes.attributeid = a.attributeid)"
" -> Seq Scan on nodeattributes (cost=0.00..505.35 rows=23535 width=18) (actual time=0.009..16.660 rows=23535 loops=2026)"
" -> Hash (cost=5.47..5.47 rows=1 width=4) (actual time=0.196..0.196 rows=2 loops=1)"
" -> Hash Join (cost=1.18..5.47 rows=1 width=4) (actual time=0.124..0.187 rows=2 loops=1)"
" Hash Cond: (a.typeid = t.typeid)"
" -> Seq Scan on attributes a (cost=0.00..4.28 rows=2 width=8) (actual time=0.044..0.103 rows=2 loops=1)"
" Filter: ((name)::text = 'addUPIsubclass'::text)"
" -> Hash (cost=1.10..1.10 rows=6 width=4) (actual time=0.047..0.047 rows=6 loops=1)"
" -> Seq Scan on types t (cost=0.00..1.10 rows=6 width=4) (actual time=0.028..0.034 rows=6 loops=1)"
" Filter: isattributetype"
" -> Hash Join (cost=5.51..601.70 rows=233 width=14) (actual time=0.103..35.496 rows=1865 loops=2026)"
" Hash Cond: (public.nodeattributes.attributeid = a.attributeid)"
" -> Seq Scan on nodeattributes (cost=0.00..505.35 rows=23535 width=18) (actual time=0.009..16.595 rows=23535 loops=2026)"
" -> Hash (cost=5.48..5.48 rows=2 width=4) (actual time=0.116..0.116 rows=3 loops=1)"
" -> Hash Join (cost=1.18..5.48 rows=2 width=4) (actual time=0.063..0.107 rows=3 loops=1)"
" Hash Cond: (a.typeid = t.typeid)"
" -> Seq Scan on attributes a (cost=0.00..4.28 rows=3 width=8) (actual time=0.017..0.056 rows=3 loops=1)"
" Filter: ((name)::text = 'addUPItemplate'::text)"
" -> Hash (cost=1.10..1.10 rows=6 width=4) (actual time=0.022..0.022 rows=6 loops=1)"
" -> Seq Scan on types t (cost=0.00..1.10 rows=6 width=4) (actual time=0.004..0.010 rows=6 loops=1)"
" Filter: isattributetype"
" -> Hash Join (cost=5.54..603.02 rows=388 width=14) (actual time=0.031..35.795 rows=2079 loops=2026)"
" Hash Cond: (public.nodeattributes.attributeid = a.attributeid)"
" -> Seq Scan on nodeattributes (cost=0.00..505.35 rows=23535 width=18) (actual time=0.008..16.766 rows=23535 loops=2026)"
" -> Hash (cost=5.50..5.50 rows=3 width=4) (actual time=0.120..0.120 rows=5 loops=1)"
" -> Hash Join (cost=1.18..5.50 rows=3 width=4) (actual time=0.074..0.110 rows=5 loops=1)"
" Hash Cond: (a.typeid = t.typeid)"
" -> Seq Scan on attributes a (cost=0.00..4.28 rows=5 width=8) (actual time=0.025..0.050 rows=5 loops=1)"
" Filter: ((name)::text = 'iconName'::text)"
" -> Hash (cost=1.10..1.10 rows=6 width=4) (actual time=0.026..0.026 rows=6 loops=1)"
" -> Seq Scan on types t (cost=0.00..1.10 rows=6 width=4) (actual time=0.004..0.010 rows=6 loops=1)"
" Filter: isattributetype"
"Total runtime: 231929.656 ms"

From:	"Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-05-18 11:14:55
Message-ID:	20070518111455.GA13978@uio.no
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

On Fri, May 18, 2007 at 02:05:36PM +0300, Liviu Ionescu wrote:
> " -> Hash Left Join (cost=2.44..63.29 rows=1 width=49) (actual time=0.361..14.426 rows=2206 loops=1)"
> " Hash Cond: (n.nodeid = templates.nodeid)"
> " Filter: (templates.nodeid IS NULL)"
> " -> Seq Scan on nodes n (cost=0.00..51.70 rows=2270 width=49) (actual time=0.071..4.417 rows=2270 loops=1)"
> " -> Hash (cost=1.64..1.64 rows=64 width=4) (actual time=0.152..0.152 rows=64 loops=1)"
> " -> Seq Scan on templates (cost=0.00..1.64 rows=64 width=4) (actual time=0.032..0.082 rows=64 loops=1)"

This seems to be the source of the misestimation. You might want to try using
"n WHERE n.nodein NOT IN (SELECT nodeid FROM templates)" instead of "n LEFT
JOIN templates USING (nodeid) WHERE templates.nodeid IS NULL" and see if it
helps.

> "Total runtime: 231929.656 ms"

Note that this is better than the version with collapse_limit set to 1. :-)

/* Steinar */
--
Homepage: http://www.sesse.net/

From:	"Liviu Ionescu" <ilgb(at)livius(dot)net>
To:	<pgsql-performance(at)postgresql(dot)org>
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-05-18 11:51:42
Message-ID:	008901c79942$e7136e10$653e10ac@ilgvaio
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

> This seems to be the source of the misestimation. You might
> want to try using "n WHERE n.nodein NOT IN (SELECT nodeid
> FROM templates)" instead of "n LEFT JOIN templates USING
> (nodeid) WHERE templates.nodeid IS NULL" and see if it helps.

it helped, the new version of the query takes 2303 ms on both 8.1.4 and 8.2.4.

any idea why the 8.2.4 planner is not happy with the initial select? was it just a big chance that it worked in 8.1.4 or the 8.2.4 planner has a problem?

or, from another perspective, is the new syntax more portable? what are the chances that after upgrading to 8.3.x to encounter new problems?

regards,

Liviu

From:	"Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-05-18 13:10:47
Message-ID:	20070518131047.GA14326@uio.no
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

On Fri, May 18, 2007 at 02:51:42PM +0300, Liviu Ionescu wrote:
> it helped, the new version of the query takes 2303 ms on both 8.1.4 and 8.2.4.

And the old one?

> any idea why the 8.2.4 planner is not happy with the initial select? was it
> just a big chance that it worked in 8.1.4 or the 8.2.4 planner has a
> problem?

I guess it was more or less by chance, especially as 8.1 did not reorder
outer joins. Others might know more about the estimation, though.

/* Steinar */
--
Homepage: http://www.sesse.net/

From:	"Liviu Ionescu" <ilgb(at)livius(dot)net>
To:	<pgsql-performance(at)postgresql(dot)org>
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-05-18 13:33:37
Message-ID:	009501c79951$24596eb0$653e10ac@ilgvaio
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

> > it helped, the new version of the query takes 2303 ms on both 8.1.4
> > and 8.2.4.
>
> And the old one?

slightly shorter, 2204 ms.

as a subjective perception, the entire application is slightly slower on 8.2.4, probably there are many queries that were manually tunned for 7.x/8.1.x and now need rewriting, which is not really what I expected from an upgrade.

Liviu

From:	"George Pavlov" <gpavlov(at)mynewplace(dot)com>
To:	"Liviu Ionescu" <ilgb(at)livius(dot)net>, <pgsql-performance(at)postgresql(dot)org>
Cc:	"Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-05-18 15:13:08
Message-ID:	8C5B026B51B6854CBE88121DBF097A86C39661@ehost010-33.exch010.intermedia.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

> > This seems to be the source of the misestimation. You might
> > want to try using "n WHERE n.nodein NOT IN (SELECT nodeid
> > FROM templates)" instead of "n LEFT JOIN templates USING
> > (nodeid) WHERE templates.nodeid IS NULL" and see if it helps.
>
> it helped, the new version of the query takes 2303 ms on both
> 8.1.4 and 8.2.4.

this is very interesting. on 8.1.x i have also repeatedly had to rewrite
joins as their equivalent IN/NOT IN alternatives in order to improve
performance, so i feel that at least under some alignments of the
planets 8.1 has similar problems.

george

From:	"Liviu Ionescu" <ilgb(at)livius(dot)net>
To:	<pgsql-performance(at)postgresql(dot)org>
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-05-18 15:40:31
Message-ID:	00a101c79962$de653d50$653e10ac@ilgvaio
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

> under some alignments of the planets 8.1 has similar problems.

8.1 might have similar problems, but the point here is different: if what
was manually tuned to work in 8.1 confuses the 8.2 planner and performance
drops so much (from 2303 to 231929 ms in my case) upgrading a production
machine to 8.2 is a risky business. I probably have hundreds of sql
statements in my application, some of them quite complex, and it is not
reasonable to check all of them in order to certify them to be 8.2
compliant.

regards,

Liviu

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-05-18 15:54:49
Message-ID:	28604.1179503689@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

"Steinar H. Gunderson" <sgunderson(at)bigfoot(dot)com> writes:
> On Fri, May 18, 2007 at 02:05:36PM +0300, Liviu Ionescu wrote:
>> " -> Hash Left Join (cost=2.44..63.29 rows=1 width=49) (actual time=0.361..14.426 rows=2206 loops=1)"
>> " Hash Cond: (n.nodeid = templates.nodeid)"
>> " Filter: (templates.nodeid IS NULL)"

> This seems to be the source of the misestimation.

Yeah. 8.2 is estimating that the "nodeid IS NULL" condition will
discard all or nearly all the rows, presumably because there aren't any
null nodeid's in the underlying table --- it fails to consider that the
LEFT JOIN may inject some nulls. 8.1 was not any brighter; the reason
it gets a different estimate is that it doesn't distinguish left-join
and WHERE clauses at all, but assumes that the result of the left join
can't have fewer rows than its left input, even after applying the
filter condition. In this particular scenario that happens to be a
better estimate. So even though 8.2 is smarter, and there is no bug
here that wasn't in 8.1 too, it's getting a worse estimate leading to
a worse plan.

This is a sufficiently common idiom that I think it's a must-fix
problem. Not sure about details yet, but it seems somehow the
selectivity estimator had better start accounting for
outer-join-injected NULLs.

regards, tom lane

From:	Kenneth Marshall <ktm(at)rice(dot)edu>
To:	Liviu Ionescu <ilgb(at)livius(dot)net>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-05-18 16:21:39
Message-ID:	20070518162139.GG7069@it.is.rice.edu
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

On Fri, May 18, 2007 at 06:40:31PM +0300, Liviu Ionescu wrote:
> > under some alignments of the planets 8.1 has similar problems.
>
> 8.1 might have similar problems, but the point here is different: if what
> was manually tuned to work in 8.1 confuses the 8.2 planner and performance
> drops so much (from 2303 to 231929 ms in my case) upgrading a production
> machine to 8.2 is a risky business. I probably have hundreds of sql
> statements in my application, some of them quite complex, and it is not
> reasonable to check all of them in order to certify them to be 8.2
> compliant.
>
> regards,
>
> Liviu
>
>
Liviu,

It is arguable, that updating the DB software version in an enterprise
environment requires exactly that: check all production queries on the
new software to identify any issues. In part, this is brought on by the
very tuning that you performed against the previous software. Restore
the 8.1 DB into 8.2. Then run the queries against both versions to
evaluate functioning and timing.

Ken

From:	"Liviu Ionescu" <ilgb(at)livius(dot)net>
To:
Cc:	<pgsql-performance(at)postgresql(dot)org>
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-05-18 16:49:30
Message-ID:	00bc01c7996c$82a66890$653e10ac@ilgvaio
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

> It is arguable, that updating the DB software version in an
> enterprise environment requires exactly that: check all
> production queries on the new software to identify any
> issues. In part, this is brought on by the very tuning that
> you performed against the previous software. Restore the 8.1
> DB into 8.2. Then run the queries against both versions to
> evaluate functioning and timing.

you're right. my previous message was not a complain, was a warning for
others to avoid the same mistake. I was overconfident and got bitten. in the
future I'll check my queries on 8.2/8.3/... on a development configuration
before upgrading the production server.

regards,

Liviu

From:	Guido Neitzer <lists(at)event-s(dot)net>
To:	PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-05-19 19:51:04
Message-ID:	E9A8847C-1906-4701-89F2-0ECDFC5F8545@event-s.net
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

On 18.05.2007, at 10:21, Kenneth Marshall wrote:

> It is arguable, that updating the DB software version in an enterprise
> environment requires exactly that: check all production queries on the
> new software to identify any issues. In part, this is brought on by
> the
> very tuning that you performed against the previous software. Restore
> the 8.1 DB into 8.2. Then run the queries against both versions to
> evaluate functioning and timing.

And it is always a good idea to do this in a "clean room environment"
aka test server and set the logging in PostgreSQL to log all queries
longer than xx ms. If you first install 8.1 on the test machine, do a
test run and then upgrade to 8.2, you can compare results from the
tests and find the queries that are slower or faster quite easily.

cug

From:	Vivek Khera <vivek(at)khera(dot)org>
To:	PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-05-23 15:47:27
Message-ID:	280803EF-0935-43AD-AD86-73E3A1700BF5@khera.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

On May 18, 2007, at 11:40 AM, Liviu Ionescu wrote:

> 8.1 might have similar problems, but the point here is different:
> if what
> was manually tuned to work in 8.1 confuses the 8.2 planner and
> performance
> drops so much (from 2303 to 231929 ms in my case) upgrading a
> production
> machine to 8.2 is a risky business. I probably have hundreds of sql

Doing any major software version upgrade on any production system is
suicidal without first vetting your entire stack against the proposed
upgrades. For best results, verify even minor version upgrades on
test systems first, with full testing of your app.

We do. It has saved us many times.

From:	"Steven Flatt" <steven(dot)flatt(at)gmail(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-06-05 21:30:14
Message-ID:	357fa7590706051430h5db6760dx9d4a46c7a8a36205@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

On 5/18/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>
> Yeah. 8.2 is estimating that the "nodeid IS NULL" condition will
> discard all or nearly all the rows, presumably because there aren't any
> null nodeid's in the underlying table --- it fails to consider that the
> LEFT JOIN may inject some nulls. 8.1 was not any brighter; the reason
> it gets a different estimate is that it doesn't distinguish left-join
> and WHERE clauses at all, but assumes that the result of the left join
> can't have fewer rows than its left input, even after applying the
> filter condition. In this particular scenario that happens to be a
> better estimate. So even though 8.2 is smarter, and there is no bug
> here that wasn't in 8.1 too, it's getting a worse estimate leading to
> a worse plan.
>
> This is a sufficiently common idiom that I think it's a must-fix
> problem. Not sure about details yet, but it seems somehow the
> selectivity estimator had better start accounting for
> outer-join-injected NULLs.
>
This problem is causing us a bit of grief as we plan to move from 8.1.4 to
8.2.4. We have many (on the order of a hundred) queries that are of the
form:

(A) LEFT JOIN (B) ON col WHERE B.col IS NULL

These queries are much slower on 8.2 than on 8.1 for what looks like the
reason outlined above. I have rewritten a few key queries to be of the
equivalent form:

(A) WHERE col NOT IN (SELECT col FROM (B))

which has resulted in a dramatic improvement. I'm really hoping that I'm
not going to need to re-write every single one of our queries that are of
the first form above. Is there any estimation as to if/when the fix will
become available? I'm hoping this isn't going to be a showstopper in us
moving to 8.2.

Thanks,
Steve

From:	"Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-06-05 21:38:47
Message-ID:	20070605213847.GA9547@uio.no
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

On Tue, Jun 05, 2007 at 05:30:14PM -0400, Steven Flatt wrote:
> (A) LEFT JOIN (B) ON col WHERE B.col IS NULL
>
> These queries are much slower on 8.2 than on 8.1 for what looks like the
> reason outlined above. I have rewritten a few key queries to be of the
> equivalent form:
>
> (A) WHERE col NOT IN (SELECT col FROM (B))

At least those _can_ be rewritten into a sane form. I have an application
with a large FULL OUTER JOIN, where _both_ sides can return NULLs. (It's
basically a diff between a "current" and a "wanted" state.)

It performs reasonably well under both 8.1 and 8.2, though. Fourteen-table
join or so :-)

/* Steinar */
--
Homepage: http://www.sesse.net/

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Steven Flatt" <steven(dot)flatt(at)gmail(dot)com>
Cc:	"Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-06-05 22:08:05
Message-ID:	10645.1181081285@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

"Steven Flatt" <steven(dot)flatt(at)gmail(dot)com> writes:
> Is there any estimation as to if/when the fix will
> become available? I'm hoping this isn't going to be a showstopper in us
> moving to 8.2.

If you're feeling desperate you could revert this patch in your local
copy:
http://archives.postgresql.org/pgsql-committers/2006-11/msg00066.php

regards, tom lane

From:	"Steven Flatt" <steven(dot)flatt(at)gmail(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-06-07 17:11:44
Message-ID:	357fa7590706071011g4e337937rc622ffed39856425@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

On 6/5/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> If you're feeling desperate you could revert this patch in your local
> copy:
> http://archives.postgresql.org/pgsql-committers/2006-11/msg00066.php
>
> regards, tom lane
>

Reverting that patch has not appeared to solve our problem. Perhaps I
didn't provide enough information, because I feel like there's more going
on here.

One instance of our problem goes like this, and I have included a
self-contained example with which you can reproduce the problem. We make
heavy use of partitioned tables, so during our schema install, we create a
lot of inherited tables (on the order of 2000) to which we also want to add
the FK constraints that exist on the parent table. The PLpgSQL function
below does this. It queries for all FK constraints that are on the parent
table but not on the child, then generates the sql to add them to
the child. (The function has been modified from the original but the main
query is the same.)

Note the "this is slow" section and the "replace with this which is fast"
section. Both queries are fast on 8.1.4 (entire function completes in 2
minutes), but not on 8.2.4. If you notice the "ELAPSED TIME"s written to
the console, the query times start equally fast but grows painfully slow
rather quickly with the "slow" version on 8.2.4.

Sorry for not providing explain analyze output, but I found it hard to tie
the output into the execution of the function. When I did stand-alone
explain analyzes, the actual times reported were similar on 8.1.4 and 8.2.4.
I think the degradation has more to do with doing many such queries in a
single transaction or something like that.

Plus, correct me if I'm wrong, but the degrading query is executed against
pg_catalog tables only, which are in general smallish, so I have a hard time
believing that even a sub-optimal query plan results in this level of
degradation.

Any help is much appreciated, thanks.
Steve

CREATE OR REPLACE FUNCTION inherit_fks_test()
RETURNS interval
VOLATILE
LANGUAGE PLpgSQL
AS '
DECLARE
childtbl varchar;
childoid oid;
rec record;
start timestamptz;
finish timestamptz;
time1 timestamptz;
time2 timestamptz;
elapsed interval;
BEGIN
start := timeofday();

EXECUTE ''SET LOCAL log_min_messages TO NOTICE'';
EXECUTE ''CREATE TABLE foo(a INT UNIQUE)'';
EXECUTE ''CREATE TABLE bar(b INT REFERENCES foo(a))'';

FOR count IN 1 .. 2000
LOOP
childtbl := ''bar_'' || count;
EXECUTE ''CREATE TABLE '' || childtbl || ''() INHERITS
(bar)'';

childoid := childtbl::regclass::oid;

time1 := timeofday();
FOR rec IN
SELECT ''ALTER TABLE ''
|| quote_ident(n.nspname) || ''.''
|| quote_ident(cl.relname)
|| '' ADD CONSTRAINT ''
|| quote_ident(parent_const.conname) || '' ''
|| parent_const.def AS cmd
FROM pg_catalog.pg_class cl
JOIN pg_catalog.pg_namespace n
ON (n.oid = cl.relnamespace)
JOIN pg_catalog.pg_inherits i
ON (i.inhrelid = cl.oid)
JOIN (
SELECT c.conname,
c.conrelid,
c.confrelid,
pg_get_constraintdef(c.oid) AS def
FROM pg_catalog.pg_constraint c
WHERE c.confrelid <> 0
) AS parent_const
ON (parent_const.conrelid = i.inhparent)

-- This is slow
-------------------------------------------------------------------------------
LEFT OUTER JOIN (
SELECT c2.conname,
c2.conrelid,
c2.confrelid,
pg_get_constraintdef(c2.oid) AS def
FROM pg_catalog.pg_constraint c2
WHERE c2.confrelid <> 0
) AS child_const
ON (child_const.conrelid = cl.oid
AND child_const.conname =
parent_const.conname
AND child_const.confrelid =
parent_const.confrelid
AND child_const.def = parent_const.def)
WHERE child_const.conname IS NULL
-------------------------------------------------------------------------------

-- Replace with this which is fast
-------------------------------------------------------------------------------
-- WHERE conname NOT IN (
-- SELECT c2.conname
-- FROM pg_catalog.pg_constraint c2
-- WHERE c2.confrelid <> 0
-- AND c2.conrelid = cl.oid
-- AND c2.conname = parent_const.conname
-- AND c2.confrelid =
parent_const.confrelid
-- AND pg_get_constraintdef(c2.oid) =
-- parent_const.def
-- )
-------------------------------------------------------------------------------

AND cl.oid = childoid
LOOP
time2 := timeofday();
EXECUTE rec.cmd;
END LOOP;

elapsed := time2 - time1;
RAISE NOTICE ''%: ELAPSED TIME: %'',count,elapsed;

END LOOP;

finish := timeofday();
RETURN finish - start;
END;
';

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Steven Flatt" <steven(dot)flatt(at)gmail(dot)com>
Cc:	"Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-06-08 01:11:19
Message-ID:	11349.1181265079@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

"Steven Flatt" <steven(dot)flatt(at)gmail(dot)com> writes:
> One instance of our problem goes like this, and I have included a
> self-contained example with which you can reproduce the problem.

This is fairly interesting, because if you run the query by hand after
the function finishes, it's pretty fast. What I think is happening is
that the plpgsql function caches a plan for the catalog query that is
predicated on pg_constraint and pg_inherits being small, and after
you've inserted a few thousand rows in them, that's not true anymore.

In CVS 8.2 (and HEAD), the core of the query seems to be
planned like this initially:

-> Hash Join (cost=1.24..8.70 rows=1 width=76)
Hash Cond: (c.conrelid = i.inhparent)
-> Seq Scan on pg_constraint c (cost=0.00..7.35 rows=27 width=76)
Filter: (confrelid <> 0::oid)
-> Hash (cost=1.23..1.23 rows=1 width=8)
-> Seq Scan on pg_inherits i (cost=0.00..1.23 rows=1 width=8)
Filter: (inhrelid = 42154::oid)

With a thousand or so rows inserted in each catalog, it likes
this plan better:

-> Nested Loop (cost=0.00..16.55 rows=1 width=76)
-> Index Scan using pg_inherits_relid_seqno_index on pg_inherits i (cost=0.00..8.27 rows=1 width=8)
Index Cond: (inhrelid = 42154::oid)
-> Index Scan using pg_constraint_conrelid_index on pg_constraint c (cost=0.00..8.27 rows=1 width=76)
Index Cond: (c.conrelid = i.inhparent)
Filter: (c.confrelid <> 0::oid)

and indeed that plan is a lot better as the catalogs grow.
But the plpgsql function cached the other plan at start.

I'm not entirely sure why 8.1 doesn't fall into the same trap ---
perhaps it's because it's unable to rearrange outer joins.
It's certainly not being any smarter than 8.2.

Anyway, it seems that you could either try to get some pg_constraint and
pg_inherits rows created before you start this function, or you could
change it to use an EXECUTE to force replanning of the inner query.
Or just start a new session after the first few hundred table creations.

I was hoping that the auto plan invalidation code in CVS HEAD would get
it out of this problem, but it seems not to for the problem-as-given.
The trouble is that it won't change plans until autovacuum analyzes the
tables, and that won't happen until the transaction commits and sends
off its I-inserted-lotsa-rows report to the stats collector. So any
given large transaction is stuck with the plans it first forms. There's
probably nothing we can do about that in time for 8.3, but it's
something to think about for future releases ...

regards, tom lane

From:	Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Steven Flatt <steven(dot)flatt(at)gmail(dot)com>, "Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-06-08 01:22:57
Message-ID:	20070608012257.GB20828@alvh.no-ip.org
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

Tom Lane escribió:

> I was hoping that the auto plan invalidation code in CVS HEAD would get
> it out of this problem, but it seems not to for the problem-as-given.
> The trouble is that it won't change plans until autovacuum analyzes the
> tables, and that won't happen until the transaction commits and sends
> off its I-inserted-lotsa-rows report to the stats collector. So any
> given large transaction is stuck with the plans it first forms. There's
> probably nothing we can do about that in time for 8.3, but it's
> something to think about for future releases ...

I think there is something we can do about this -- drop the default
value for analyze threshold. We even discussed way back that we could
drop the concept of thresholds altogether, and nobody came up with an
argument for defending them.

> it won't change plans until autovacuum analyzes the
> tables, and that won't happen until the transaction commits and sends
> off its I-inserted-lotsa-rows report to the stats collector. So any
> given large transaction is stuck with the plans it first forms. There's
> probably nothing we can do about that in time for 8.3, but it's
> something to think about for future releases ...

Ah, *within* a single large transaction :-( Yeah that's probably not
very solvable for the moment.

--
Alvaro Herrera Valdivia, Chile ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
"Ninguna manada de bestias tiene una voz tan horrible como la humana" (Orual)

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc:	Steven Flatt <steven(dot)flatt(at)gmail(dot)com>, "Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-06-08 01:31:14
Message-ID:	11710.1181266274@sss.pgh.pa.us
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Tom Lane escribi:
>> I was hoping that the auto plan invalidation code in CVS HEAD would get
>> it out of this problem, but it seems not to for the problem-as-given.
>> The trouble is that it won't change plans until autovacuum analyzes the
>> tables, and that won't happen until the transaction commits and sends
>> off its I-inserted-lotsa-rows report to the stats collector.

> I think there is something we can do about this -- drop the default
> value for analyze threshold.

Maybe worth doing, but it doesn't help for Steve's example.

regards, tom lane

From:	"Steven Flatt" <steven(dot)flatt(at)gmail(dot)com>
To:	"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	"Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Steinar H(dot) Gunderson" <sgunderson(at)bigfoot(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: performance drop on 8.2.4, reverting to 8.1.4
Date:	2007-06-12 18:31:01
Message-ID:	357fa7590706121131t5adf3795ld006955d6f77e175@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-performance

Thanks Tom and Alvaro.

To follow up on this, I re-wrote and tweaked a number of queries (including
the one provided) to change "LEFT OUTER JOIN ... WHERE col IS NULL" clauses
to "WHERE col NOT IN (...)" clauses.

This has brought performance to an acceptable level on 8.2.

Thanks for your time,
Steve

On 6/7/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> > Tom Lane escribió:
> >> I was hoping that the auto plan invalidation code in CVS HEAD would get
> >> it out of this problem, but it seems not to for the problem-as-given.
> >> The trouble is that it won't change plans until autovacuum analyzes the
> >> tables, and that won't happen until the transaction commits and sends
> >> off its I-inserted-lotsa-rows report to the stats collector.
>
> > I think there is something we can do about this -- drop the default
> > value for analyze threshold.
>
> Maybe worth doing, but it doesn't help for Steve's example.
>
> regards, tom lane
>