Re: performance issue with a specific query

Lists: pgsql-performance
From: Eliott <eliott100(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: performance issue with a specific query
Date: 2006-07-27 14:23:28
Message-ID: c2162c750607270723s628f60baqa729efbfcbeadde2@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi!

I hope I'm sending my question to the right list, please don't flame if it's
the wrong one.

I have noticed that while a query runs in about 1.5seconds on a 8.xx version
postgresql server on our 7.4.13 it takes around 15-20 minutes. Since we are
using RHEL4 on our server we are stuck with 7.4.13. The enormous time
difference between the different builds drives me crazy. Can you please help
me identifying the bottleneck or suggest anything to improve the dismal
performance.
The query is the following:

Select
car_license_plate.license_plate,
substr(date_trunc('day', car_km_fuel.transaction_time), 1, 10),
substr(date_trunc('second', car_km_fuel.transaction_time), 12, 8),
vehicle_make.make,
vehicle_type.model,
engine_size,
vehicle_fuel_type.fuel_type,
v_org_person_displayname.displayname_lastfirst,
car_km_fuel.ammount,
car_km_fuel.unit_price,
car_km_fuel.total_ammount,
currency.currency AS,
car_km_fuel.km AS,
vehicle_specific.fuel_capacity,
CASE WHEN (car_km_fuel.ammount > vehicle_specific.fuel_capacity) THEN
CAST(ROUND(CAST(car_km_fuel.ammount - vehicle_specific.fuel_capacity AS
NUMERIC), 2) AS varchar) ELSE '---' END AS "over",
car_km_fuel.notes,
CASE WHEN (prev_car_km_fuel.km IS NOT NULL AND car_km_fuel.km IS NOT NULL
AND (car_km_fuel.km - prev_car_km_fuel.km <> 0)) THEN
CAST(Round(CAST(((car_km_fuel.ammount / (car_km_fuel.km -
prev_car_km_fuel.km)) * 100) AS Numeric), 2) AS VARCHAR)
WHEN (prev_car_km_fuel.km IS NULL) THEN 'xxxx'
WHEN (car_km_fuel.km IS NULL) THEN 'error' END AS "average",
vehicle_specific.consumption_town,
org_person.email_address

FROM
car_km_fuel

LEFT JOIN
car ON car.id = car_km_fuel.car_id

LEFT JOIN
car_license_plate ON car_license_plate.car_id = car.id AND
(car_license_plate.license_plate_end_date < date_trunc('day',
car_km_fuel.transaction_time) OR car_license_plate.license_plate_end_date IS
NULL)
LEFT JOIN
vehicle_specific ON vehicle_specific.id = car.vehicle_specific_id

LEFT JOIN
vehicle_variant ON vehicle_variant.id =
vehicle_specific.vehicle_variant_id

LEFT JOIN
vehicle_type ON vehicle_type.id = vehicle_variant.vehicle_type_id

LEFT JOIN
vehicle_make ON vehicle_make.id = vehicle_type.vehicle_make_id

LEFT JOIN
vehicle_fuel_type ON vehicle_fuel_type.id = vehicle_specific.fuel_type_id

LEFT JOIN
car_driver ON car_driver.car_id = car.id AND
car_driver.allocation_date <= date_trunc('day',
car_km_fuel.transaction_time) AND
(car_driver.end_date >= date_trunc('day',
car_km_fuel.transaction_time) OR car_driver.end_date IS NULL)

LEFT JOIN
v_org_person_displayname ON v_org_person_displayname.id =
car_driver.car_driver_id

LEFT JOIN
org_person ON org_person.id = v_org_person_displayname.id

LEFT JOIN
currency ON currency.id = car_km_fuel.currency_id

LEFT JOIN
car_km_fuel AS prev_car_km_fuel ON
prev_car_km_fuel.transaction_time = (SELECT MAX(transaction_time) FROM
car_km_fuel as car_km_fuel2 WHERE car_km_fuel2.car_id = car.id AND
car_km_fuel2.transaction_time < car_km_fuel.transaction_time)

LEFT JOIN
org_company ON org_company.id = org_person.company_id

WHERE
(lower(org_company.name) LIKE lower(:param3) || '%') AND
(car_km_fuel.transaction_time >= :param1 OR :param1 IS NULL) AND
(car_km_fuel.transaction_time <= :param2 OR :param2 IS NULL)

ORDER BY
1, 2, 3;

The output of explain if the following under 7.4.13:

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=66.66..66.66 rows=1 width=917)
Sort Key: car_license_plate.license_plate,
substr((date_trunc('day'::text, car_km_fuel.transaction_time))::text,
1, 10), substr((date_trunc('second'::text,
car_km_fuel.transaction_time))::text, 12, 8)
-> Nested Loop (cost=44.93..66.65 rows=1 width=917)
-> Nested Loop Left Join (cost=44.93..62.23 rows=1 width=921)
Join Filter: ("inner".transaction_time = (subplan))
-> Nested Loop Left Join (cost=44.93..62.21 rows=1 width=917)
Join Filter: ("inner".id = "outer".currency_id)
-> Nested Loop (cost=44.93..60.92 rows=1 width=828)
-> Hash Join (cost=44.93..58.32 rows=1 width=805)
Hash Cond: ("outer".id =
"inner".car_driver_id)
-> Subquery Scan
v_org_person_displayname (cost=16.42..28.82 rows=196 width=520)
-> Merge Right Join
(cost=16.42..26.86 rows=196 width=51)
Merge Cond: ("outer".id
= "inner".company_id)
-> Index Scan using
pk_org_company on org_company co (cost=0.00..29.82 rows=47 width=27)
-> Sort
(cost=16.42..16.91 rows=196 width=28)
Sort Key: pers.company_id
-> Seq Scan on
org_person pers (cost=0.00..8.96 rows=196 width=28)
-> Hash (cost=28.51..28.51 rows=1 width=285)
-> Hash Join
(cost=19.81..28.51 rows=1 width=285)
Hash Cond:
("outer".car_id = "inner".car_id)
Join Filter:
((("outer".allocation_date)::timestamp without time zone <=
date_trunc('day'::text, "inner".transaction_time)) AND
((("outer".end_date)::timestamp without time zone >=
date_trunc('day'::text, "inner".transaction_time)) OR
("outer".end_date IS NULL)))
-> Seq Scan on
car_driver (cost=0.00..7.73 rows=173 width=16)
-> Hash
(cost=19.80..19.80 rows=4 width=285)
-> Hash Left Join
(cost=19.53..19.80 rows=4 width=285)
Hash Cond:
("outer".fuel_type_id = "inner".id)
-> Hash Left
Join (cost=18.50..18.72 rows=4 width=279)
Hash
Cond: ("outer".vehicle_make_id = "inner".id)
->
Merge Left Join (cost=17.38..17.53 rows=3 width=274)

Merge Cond: ("outer".vehicle_type_id = "inner".id)

-> Sort (cost=15.67..15.67 rows=2 width=265)

Sort Key: vehicle_variant.vehicle_type_id

-> Nested Loop Left Join (cost=0.00..15.66 rows=2 width=265)

Join Filter: ("inner".id = "outer".vehicle_variant_id)

-> Nested Loop Left Join (cost=0.00..13.83 rows=1
width=265)

Join Filter: ("inner".id =
"outer".vehicle_specific_id)

-> Nested Loop Left Join (cost=0.00..10.50 rows=1
width=234)

Join Filter:
((("inner".license_plate_end_date)::timestamp without time zone <
date_trunc('day'::text, "outer".transaction_time)) OR
("inner".license_plate_end_date IS NULL))

-> Nested Loop (cost=0.00..4.83 rows=1
width=224)

-> Seq Scan on car_km_fuel
(cost=0.00..0.00 rows=1 width=216)

Filter: (((transaction_time >=
'2005-01-01 00:00:00'::timestamp without time zone) OR (now() IS
NULL)) AND (((transaction_time)::timestamp with time zone <= now()) OR
(now() IS NULL)))

-> Index Scan using pk_car on car
(cost=0.00..4.82 rows=1 width=8)

Index Cond: (car.id =
"outer".car_id)

-> Index Scan using
ix_car_license_plate__car_id on car_license_plate (cost=0.00..5.65
rows=1 width=18)

Index Cond: (car_license_plate.car_id =
"outer".id)

-> Seq Scan on vehicle_specific (cost=0.00..2.59
rows=59 width=39)

-> Seq Scan on vehicle_variant (cost=0.00..1.37 rows=37
width=8)

-> Sort (cost=1.71..1.77 rows=22 width=17)

Sort Key: vehicle_type.id

-> Seq Scan on vehicle_type (cost=0.00..1.22 rows=22 width=17)
->
Hash (cost=1.10..1.10 rows=10 width=13)

-> Seq Scan on vehicle_make (cost=0.00..1.10 rows=10 width=13)
-> Hash
(cost=1.02..1.02 rows=2 width=14)
-> Seq
Scan on vehicle_fuel_type (cost=0.00..1.02 rows=2 width=14)
-> Index Scan using pk_org_person on
org_person (cost=0.00..2.59 rows=1 width=35)
Index Cond: (org_person.id =
"outer".car_driver_id)
-> Seq Scan on currency (cost=0.00..1.13
rows=13 width=97)
-> Seq Scan on car_km_fuel prev_car_km_fuel
(cost=0.00..0.00 rows=1 width=16)
SubPlan
-> Aggregate (cost=0.01..0.01 rows=1 width=8)
-> Seq Scan on car_km_fuel car_km_fuel2
(cost=0.00..0.00 rows=1 width=8)
Filter: ((car_id = $0) AND (transaction_time < $1))
-> Index Scan using pk_org_company on org_company
(cost=0.00..4.36 rows=1 width=4)
Index Cond: (org_company.id = "outer".company_id)
Filter: (lower((name)::text) ~~ '%'::text)

(64 rows)

If I leave off the where clause or run it on just a couple of recods, the
result is fine. Any ideas?

Regards
eliott


From: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>
To: Eliott <eliott100(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: performance issue with a specific query
Date: 2006-07-27 14:46:26
Message-ID: 1154011586.31664.85.camel@state.g2switchworks.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Thu, 2006-07-27 at 09:23, Eliott wrote:
> Hi!
>
> I hope I'm sending my question to the right list, please don't flame
> if it's the wrong one.
>
> I have noticed that while a query runs in about 1.5seconds on a 8.xx
> version postgresql server on our 7.4.13 it takes around 15-20 minutes.
> Since we are using RHEL4 on our server we are stuck with 7.4.13. The
> enormous time difference between the different builds drives me crazy.
> Can you please help me identifying the bottleneck or suggest anything
> to improve the dismal performance.

You are absolutely on the right list. A couple of points.

1: Which 8.xx? 8.0.x or 8.1.x? 8.1.x is literally light years ahead
of 7.4 in terms of performance. 8.0 is somewhere between them. The
performance difference you're seeing is pretty common.

2: Looking at your query, there are places where you're joining on
things like date_trunc(...). In 7.4 the database will not, and cannot
use a normal index on the date field for those kinds of things. It can,
however, use a funtional index on some of them. Try creating an index
on date_trunc('day',yourfieldhere) and see if that helps.

3: You are NOT Stuck on 7.4.13. I have a RHEL server that will be
running 8.1.4 or so pretty soon as a dataware house. It may get updated
to RHEL4, may not. You can either compile from the .tar.[gz|bz2] files
or download the PGDG rpms for your distro.

4: You are fighting an uphill battle. There were a LOT of improvements
made all over in the march from 7.4 to 8.1. Not all of them were simple
planner tweaks and shortcuts, but honest to goodness changes to the way
things happen. No amount of tuning can make 7.4 run as fast as 8.1.


From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: Eliott <eliott100(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: performance issue with a specific query
Date: 2006-07-27 14:52:31
Message-ID: b42b73150607270752x7137dbbeu419589b89f22c24d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On 7/27/06, Eliott <eliott100(at)gmail(dot)com> wrote:
> Hi!
>
> I hope I'm sending my question to the right list, please don't flame if it's
> the wrong one.
>
> I have noticed that while a query runs in about 1.5seconds on a 8.xx version
> postgresql server on our 7.4.13 it takes around 15-20 minutes. Since we are
> using RHEL4 on our server we are stuck with 7.4.13. The enormous time
> difference between the different builds drives me crazy. Can you please help
> me identifying the bottleneck or suggest anything to improve the dismal
> performance.
> The query is the following:
>

try turning off genetic query optimization. regarding the rhel4
issue...does rhel not come with a c compiler? :)

merlin


From: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Eliott <eliott100(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: performance issue with a specific query
Date: 2006-07-27 15:39:42
Message-ID: 44C8DE3E.4060208@commandprompt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

>>
>
> try turning off genetic query optimization. regarding the rhel4
> issue...does rhel not come with a c compiler? :)

Enterprises are not going to compile. They are going to accept the
latest support by vendor release.

Redhat has a tendency to be incredibly stupid about this particular
area of their packaging.

Joshua D. Drake

>
> merlin
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq
>

--

=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Eliott <eliott100(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: performance issue with a specific query
Date: 2006-07-27 17:25:48
Message-ID: 20060727172548.GD18774@surnet.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Joshua D. Drake wrote:
> >>
> >
> >try turning off genetic query optimization. regarding the rhel4
> >issue...does rhel not come with a c compiler? :)
>
> Enterprises are not going to compile. They are going to accept the
> latest support by vendor release.
>
> Redhat has a tendency to be incredibly stupid about this particular
> area of their packaging.

Stupid how?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Eliott <eliott100(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: performance issue with a specific query
Date: 2006-07-27 20:18:07
Message-ID: 3304.1154031487@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> Joshua D. Drake wrote:
>> Enterprises are not going to compile. They are going to accept the
>> latest support by vendor release.
>>
>> Redhat has a tendency to be incredibly stupid about this particular
>> area of their packaging.

> Stupid how?

Red Hat feels (apparently accurately, judging by their subscription
revenue ;-)) that what RHEL customers want is a platform that's stable
over multi-year application lifespans. So major incompatible changes in
the system software are not looked on with favor. That's why RHEL4
is still shipping PG 7.4.*. You can call it a stupid policy if you
like, but it's hard to argue with success.

However, there will be an RH-supported release of PG 8.1.* as an optional
add-on for RHEL4. Real Soon Now, I hope --- the release date has been
pushed back a couple times already.

regards, tom lane


From: Eliott <eliott100(at)gmail(dot)com>
To: "Scott Marlowe" <smarlowe(at)g2switchworks(dot)com>, mmoncure(at)gmail(dot)com
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: performance issue with a specific query
Date: 2006-07-28 10:04:01
Message-ID: c2162c750607280304m6d155f86t2dad5bd784fa8542@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi!

thanks for the quick help
i managed to reduce the response time from seemingly taking infinity to a
tolerable level, which is by the way a huge improvement.
What helped was the functional index on date_trunc('day',yourfieldhere) as
Scott suggested.
I tried to disable the geqo, but it didn't make any noticeable difference.
For now, I am happy with the result, if the result set stays the same I can
live without an upgrade.

> 1: Which 8.xx? 8.0.x or 8.1.x? 8.1.x is literally light years ahead
> of 7.4 in terms of performance. 8.0 is somewhere between them. The
> performance difference you're seeing is pretty common.

The benchmark was done on a small notebook running 8.1.4.1, versus 7.4.13 on
a 2gig P4 server. The difference is astounding, even without the functional
index 8.1 is 5-10 times faster than a full fledged server.

3: You are NOT Stuck on 7.4.13. I have a RHEL server that will be
> running 8.1.4 or so pretty soon as a dataware house. It may get updated
> to RHEL4, may not. You can either compile from the .tar.[gz|bz2] files
> or download the PGDG rpms for your distro.

I know, but that's was I was trying to avoid. It is not that I would use the
RHEL support provided for 7.4.13, but you know, staying official is the
whole point of subscribing to RHEL4.

Moreover, since many of our other applications are running happily running
under 7.4, I would be afraid to upgrade the whole thing.

So, again, thanks everybody for the help, you saved the day for me.

Regards
Eliott