Join over two tables of 50K records takes 2 hours

From: Svetlin Manavski <svetlin(dot)manavski(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Join over two tables of 50K records takes 2 hours
Date: 2011-10-12 08:55:46
Message-ID: CAMm+ggTh3DRKpWyeNfRi5nUSo3UaActDKqb=7JHW-NSgcOUrfg@mail.gmail.com
Lists: pgsql-performance

Hi all,
I am running 9.0.3 with the settings listed below. I have a prohibitively
slow query in an application which otherwise performs well:

select
    cast(SD.detectorid as numeric),
    cast((SD.createdtime - 0) / 18000000000000::bigint as numeric) as timegroup,
    sum(datafromsource) + sum(datafromdestination) as numbytes,
    cast(sum(packetsfromsource) + sum(packetsfromdestination) as numeric) as numpackets
from
    appqosdata.icmptraffic SD, appqosdata.icmptraffic_classes CL
where
    SD.detectorid >= 0
    and CL.detectorid = SD.detectorid
    and CL.sessionid = SD.sessionid
    and CL.createdtime = SD.createdtime
    and SD.detectorid = 1
    and SD.createdtime >= 1317726000000000000::bigint
    and SD.createdtime < 1318326120000000000::bigint
    and CL.sessionid < 1318326120000000000::bigint
    and CL.classid = 1
group by
    SD.detectorid, timegroup

appqosdata.icmptraffic and appqosdata.icmptraffic_classes are both
partitioned.

CREATE TABLE appqosdata.icmptraffic
(
    detectorid smallint not null default(0), -- references appqosdata.detectors(id)
    sessionid bigint not null,
    createdtime bigint not null,

    ...
    -- primary key (detectorid, sessionid, createdtime) defined in the children tables
);

CREATE TABLE appqosdata.icmptraffic_classes
(
    detectorid smallint not null,
    sessionid bigint not null,
    createdtime bigint not null,

    classid integer not null

    -- definitions in the children tables:
    -- primary key (detectorid, sessionid, createdtime, classid)
    -- foreign key (detectorid, sessionid, createdtime) references appqosdata.icmptraffic (detectorid, sessionid, createdtime)
    -- foreign key (classid) references appqosdata.display_classes (id)
);
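
For reference, each child is declared in the usual 9.0 inheritance-partitioning style; a representative child might look roughly like this (the CHECK range below is illustrative, not the actual one):

CREATE TABLE appqosdata.icmptraffic_10
(
    PRIMARY KEY (detectorid, sessionid, createdtime),
    CHECK (createdtime >= 1317000000000000000::bigint   -- illustrative bounds only
       AND createdtime <  1317600000000000000::bigint)
) INHERITS (appqosdata.icmptraffic);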

"HashAggregate (cost=154.24..154.28 rows=1 width=34) (actual
time=7594069.940..7594069.983 rows=19 loops=1)"
" Output: (sd.detectorid)::numeric, ((((sd.createdtime - 0) /
18000000000000::bigint))::numeric), (sum(sd.datafromsource) +
sum(sd.datafromdestination)), ((sum(sd.packetsfromsource) +
sum(sd.packetsfromdestination)))::numeric, sd.detectorid"
" -> Nested Loop (cost=0.00..154.23 rows=1 width=34) (actual
time=0.140..7593838.258 rows=50712 loops=1)"
" Output: sd.detectorid, sd.createdtime, sd.datafromsource,
sd.datafromdestination, sd.packetsfromsource, sd.packetsfromdestination,
(((sd.createdtime - 0) / 18000000000000::bigint))::numeric"
" Join Filter: ((sd.sessionid = cl.sessionid) AND (sd.createdtime =
cl.createdtime))"
" -> Append (cost=0.00..88.37 rows=7 width=18) (actual
time=0.063..333.355 rows=51776 loops=1)"
" -> Seq Scan on appqosdata.icmptraffic_classes cl
(cost=0.00..37.48 rows=1 width=18) (actual time=0.013..0.013 rows=0
loops=1)"
" Output: cl.detectorid, cl.sessionid, cl.createdtime"
" Filter: ((cl.sessionid < 1318326120000000000::bigint)
AND (cl.detectorid = 1) AND (cl.classid = 1))"
" -> Index Scan using icmptraffic_classes_10_pkey on
appqosdata.icmptraffic_classes_10 cl (cost=0.00..8.36 rows=1 width=18)
(actual time=0.046..14.205 rows=3985 loops=1)"
" Output: cl.detectorid, cl.sessionid, cl.createdtime"
" Index Cond: ((cl.detectorid = 1) AND (cl.sessionid <
1318326120000000000::bigint) AND (cl.classid = 1))"
" -> Index Scan using icmptraffic_classes_11_pkey on
appqosdata.icmptraffic_classes_11 cl (cost=0.00..8.62 rows=1 width=18)
(actual time=0.038..52.757 rows=14372 loops=1)"
" Output: cl.detectorid, cl.sessionid, cl.createdtime"
" Index Cond: ((cl.detectorid = 1) AND (cl.sessionid <
1318326120000000000::bigint) AND (cl.classid = 1))"
" -> Index Scan using icmptraffic_classes_12_pkey on
appqosdata.icmptraffic_classes_12 cl (cost=0.00..8.60 rows=1 width=18)
(actual time=0.033..47.845 rows=13512 loops=1)"
" Output: cl.detectorid, cl.sessionid, cl.createdtime"
" Index Cond: ((cl.detectorid = 1) AND (cl.sessionid <
1318326120000000000::bigint) AND (cl.classid = 1))"
" -> Index Scan using icmptraffic_classes_13_pkey on
appqosdata.icmptraffic_classes_13 cl (cost=0.00..8.59 rows=1 width=18)
(actual time=0.030..46.504 rows=13274 loops=1)"
" Output: cl.detectorid, cl.sessionid, cl.createdtime"
" Index Cond: ((cl.detectorid = 1) AND (cl.sessionid <
1318326120000000000::bigint) AND (cl.classid = 1))"
" -> Index Scan using icmptraffic_classes_14_pkey on
appqosdata.icmptraffic_classes_14 cl (cost=0.00..8.43 rows=1 width=18)
(actual time=0.025..22.868 rows=6633 loops=1)"
" Output: cl.detectorid, cl.sessionid, cl.createdtime"
" Index Cond: ((cl.detectorid = 1) AND (cl.sessionid <
1318326120000000000::bigint) AND (cl.classid = 1))"
" -> Index Scan using icmptraffic_classes_15_pkey on
appqosdata.icmptraffic_classes_15 cl (cost=0.00..8.30 rows=1 width=18)
(actual time=0.014..0.014 rows=0 loops=1)"
" Output: cl.detectorid, cl.sessionid, cl.createdtime"
" Index Cond: ((cl.detectorid = 1) AND (cl.sessionid <
1318326120000000000::bigint) AND (cl.classid = 1))"
" -> Materialize (cost=0.00..65.13 rows=6 width=42) (actual
time=0.001..73.261 rows=50915 loops=51776)"
" Output: sd.detectorid, sd.createdtime, sd.datafromsource,
sd.datafromdestination, sd.packetsfromsource, sd.packetsfromdestination,
sd.sessionid"
" -> Append (cost=0.00..65.10 rows=6 width=42) (actual
time=0.059..244.693 rows=50915 loops=1)"
" -> Seq Scan on appqosdata.icmptraffic sd
(cost=0.00..22.60 rows=1 width=42) (actual time=0.001..0.001 rows=0
loops=1)"
" Output: sd.detectorid, sd.createdtime,
sd.datafromsource, sd.datafromdestination, sd.packetsfromsource,
sd.packetsfromdestination, sd.sessionid"
" Filter: ((sd.detectorid >= 0) AND (sd.createdtime
>= 1317726000000000000::bigint) AND (sd.createdtime <
1318326120000000000::bigint) AND (sd.detectorid = 1))"
" -> Index Scan using icmptraffic_10_pkey on
appqosdata.icmptraffic_10 sd (cost=0.00..8.35 rows=1 width=42) (actual
time=0.053..7.807 rows=3997 loops=1)"
" Output: sd.detectorid, sd.createdtime,
sd.datafromsource, sd.datafromdestination, sd.packetsfromsource,
sd.packetsfromdestination, sd.sessionid"
" Index Cond: ((sd.detectorid >= 0) AND
(sd.detectorid = 1) AND (sd.createdtime >= 1317726000000000000::bigint) AND
(sd.createdtime < 1318326120000000000::bigint))"
" -> Index Scan using icmptraffic_11_pkey on
appqosdata.icmptraffic_11 sd (cost=0.00..8.59 rows=1 width=42) (actual
time=0.025..27.957 rows=14372 loops=1)"
" Output: sd.detectorid, sd.createdtime,
sd.datafromsource, sd.datafromdestination, sd.packetsfromsource,
sd.packetsfromdestination, sd.sessionid"
" Index Cond: ((sd.detectorid >= 0) AND
(sd.detectorid = 1) AND (sd.createdtime >= 1317726000000000000::bigint) AND
(sd.createdtime < 1318326120000000000::bigint))"
" -> Index Scan using icmptraffic_12_pkey on
appqosdata.icmptraffic_12 sd (cost=0.00..8.58 rows=1 width=42) (actual
time=0.027..26.217 rows=13512 loops=1)"
" Output: sd.detectorid, sd.createdtime,
sd.datafromsource, sd.datafromdestination, sd.packetsfromsource,
sd.packetsfromdestination, sd.sessionid"
" Index Cond: ((sd.detectorid >= 0) AND
(sd.detectorid = 1) AND (sd.createdtime >= 1317726000000000000::bigint) AND
(sd.createdtime < 1318326120000000000::bigint))"
" -> Index Scan using icmptraffic_13_pkey on
appqosdata.icmptraffic_13 sd (cost=0.00..8.56 rows=1 width=42) (actual
time=0.030..26.075 rows=13430 loops=1)"
" Output: sd.detectorid, sd.createdtime,
sd.datafromsource, sd.datafromdestination, sd.packetsfromsource,
sd.packetsfromdestination, sd.sessionid"
" Index Cond: ((sd.detectorid >= 0) AND
(sd.detectorid = 1) AND (sd.createdtime >= 1317726000000000000::bigint) AND
(sd.createdtime < 1318326120000000000::bigint))"
" -> Index Scan using icmptraffic_14_pkey on
appqosdata.icmptraffic_14 sd (cost=0.00..8.41 rows=1 width=42) (actual
time=0.027..11.040 rows=5604 loops=1)"
" Output: sd.detectorid, sd.createdtime,
sd.datafromsource, sd.datafromdestination, sd.packetsfromsource,
sd.packetsfromdestination, sd.sessionid"
" Index Cond: ((sd.detectorid >= 0) AND
(sd.detectorid = 1) AND (sd.createdtime >= 1317726000000000000::bigint) AND
(sd.createdtime < 1318326120000000000::bigint))"
"Total runtime: 7594071.137 ms"

             name             |                                    current_setting
------------------------------+----------------------------------------------------------------------------------------------------------
 version                      | PostgreSQL 9.0.3 on amd64-portbld-freebsd8.0, compiled by GCC cc (GCC) 4.2.1 20070719 [FreeBSD], 64-bit
 autovacuum                   | on
 autovacuum_analyze_threshold | 500000
 autovacuum_max_workers       | 1
 autovacuum_naptime           | 1h
 autovacuum_vacuum_threshold  | 500000
 checkpoint_segments          | 64
 effective_cache_size         | 3GB
 fsync                        | on
 lc_collate                   | C
 lc_ctype                     | C
 listen_addresses             | *
 log_destination              | syslog, stderr
 log_min_duration_statement   | 5ms
 log_rotation_age             | 1d
 log_rotation_size            | 100MB
 logging_collector            | on
 max_connections              | 30
 max_stack_depth              | 2MB
 server_encoding              | UTF8
 shared_buffers               | 1793MB
 silent_mode                  | on
 synchronous_commit           | on
 syslog_facility              | local0
 TimeZone                     | Europe/Jersey
 update_process_title         | off
 wal_buffers                  | 128kB
 work_mem                     | 24MB

Thanks for any help,
Svetlin Manavski


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Svetlin Manavski <svetlin(dot)manavski(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Join over two tables of 50K records takes 2 hours
Date: 2011-10-14 04:37:48
Message-ID: 13425.1318567068@sss.pgh.pa.us
Lists: pgsql-performance

Svetlin Manavski <svetlin(dot)manavski(at)gmail(dot)com> writes:
> I am running 9.0.3 with the settings listed below. I have a prohibitively
> slow query in an application which otherwise performs well:

It's slow because the planner is choosing a nestloop join on the
strength of its estimate that there's only a half dozen rows to be
joined. You need to figure out why those rowcount estimates are so bad.
I suspect that you've shot yourself in the foot by raising
autovacuum_analyze_threshold so high --- most likely, none of those
tables have ever gotten analyzed. And what's with the high
autovacuum_naptime setting? You might need to increase
default_statistics_target too, but first see if a manual ANALYZE makes
things better.
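
The quickest test is just something along these lines (and don't forget the children, since each partition keeps its own statistics):

    analyze appqosdata.icmptraffic;
    analyze appqosdata.icmptraffic_classes;
    -- likewise for each child: analyze appqosdata.icmptraffic_10; etc.

If that's still not enough, raise the target before re-analyzing, e.g. SET default_statistics_target = 1000 (that value is only an example).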

regards, tom lane


From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Svetlin Manavski <svetlin(dot)manavski(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Join over two tables of 50K records takes 2 hours
Date: 2011-10-14 04:38:16
Message-ID: CAOR=d=2zPgFH=sGNrqEe7OQp5PLAF2+SG5W3AS5ZRcom7S6p7w@mail.gmail.com
Lists: pgsql-performance

It seems like your row estimates are way off, with the planner
expecting 1 and getting 3000 or so. Have you tried cranking up
default_statistics_target to, say, 1000, running ANALYZE, and seeing
what happens?

If that doesn't do it, try temporarily turning off nested loops:

set enable_nestloop = off;
explain analyze yourqueryhere;
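
If the plan then flips to a hash or merge join and the runtime collapses, that confirms the bad estimates are what's pushing the planner into the nestloop. Note that enable_nestloop = off is only a per-session diagnostic, not something to leave set in production.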


From: Svetlin Manavski <svetlin(dot)manavski(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, scott(dot)marlowe(at)gmail(dot)com
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Join over two tables of 50K records takes 2 hours
Date: 2011-10-14 08:35:57
Message-ID: CAMm+ggQYraUSeEbDg19uSZhgoS8WitNiwrD_WOWhWJro_iSWUQ@mail.gmail.com
Lists: pgsql-performance

Thank you guys for spotting the problem immediately.
The reason for such autovacuum thresholds is that these tables are designed
for a very high rate of inserts, and I have a specific routine to analyze them
in a more controlled way. In fact, the stats target of some of the fields is
also high. However, that routine was failing to analyze
appqosdata.icmptraffic and its children due to another issue...
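
For anyone else hitting this: on 9.0, an ANALYZE of the inheritance parent does not refresh each child's own statistics, so a routine like ours has to visit every child explicitly, roughly along these lines (a sketch only, not the actual routine):

DO $$
DECLARE
    child regclass;
BEGIN
    -- find every child of the parent and analyze it individually
    FOR child IN
        SELECT inhrelid::regclass
        FROM pg_inherits
        WHERE inhparent = 'appqosdata.icmptraffic'::regclass
    LOOP
        EXECUTE 'ANALYZE ' || child;
    END LOOP;
END $$;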

Regards,
Svetlin Manavski

On Fri, Oct 14, 2011 at 5:37 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Svetlin Manavski <svetlin(dot)manavski(at)gmail(dot)com> writes:
> > I am running 9.0.3 with the settings listed below. I have a prohibitively
> > slow query in an application which otherwise performs well:
>
> It's slow because the planner is choosing a nestloop join on the
> strength of its estimate that there's only a half dozen rows to be
> joined. You need to figure out why those rowcount estimates are so bad.
> I suspect that you've shot yourself in the foot by raising
> autovacuum_analyze_threshold so high --- most likely, none of those
> tables have ever gotten analyzed. And what's with the high
> autovacuum_naptime setting? You might need to increase
> default_statistics_target too, but first see if a manual ANALYZE makes
> things better.
>
> regards, tom lane
>


From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Svetlin Manavski <svetlin(dot)manavski(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Join over two tables of 50K records takes 2 hours
Date: 2011-10-14 09:41:34
Message-ID: CAOR=d=3kdC9QcwJt5a1rdJZWRqeuVKWkAp6xX6Pt-t-khACsyA@mail.gmail.com
Lists: pgsql-performance

On Fri, Oct 14, 2011 at 2:35 AM, Svetlin Manavski
<svetlin(dot)manavski(at)gmail(dot)com> wrote:
> Thank you guys for spotting the problem immediately.
> The reason for such autovacuum thresholds is that these tables are designed
> for a very high rate of inserts, and I have a specific routine to analyze them
> in a more controlled way. In fact, the stats target of some of the fields is
> also high. However, that routine was failing to analyze
> appqosdata.icmptraffic and its children due to another issue...

Note that it's possible to set many of the autovacuum values per table
so you don't have to do a brainyectomy on the whole autovac system.
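
For example (the numbers are purely illustrative), something like:

ALTER TABLE appqosdata.icmptraffic_10
    SET (autovacuum_analyze_threshold = 5000,
         autovacuum_analyze_scale_factor = 0.02);

applied to each hot partition lets autovacuum keep those stats fresh without touching the global thresholds.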