Re: Query tuning

Lists: pgsql-performance
From: "Subbiah, Stalin" <SSubbiah(at)netopia(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Query tuning
Date: 2006-08-23 02:53:29
Message-ID: B949C470120CA7499A211214D76FBA580154443E@mxca2.corp.netopia.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Actually these servers will be upgraded to 8.1.4 in couple of months.

Here you go with explain analyze.

# explain analyze SELECT *
FROM EVENTLOG
WHERE EVENTTIME>'07/23/06 16:00:00' AND EVENTTIME<'08/22/06 16:00:00'
AND (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA'
OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA'
OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA')
ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 500;

QUERY PLAN

------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
------------------------------------------------------------------------
-------------------------------------------------------------
Limit (cost=15583110.14..15583111.39 rows=500 width=327) (actual
time=427771.568..427772.904 rows=500 loops=1)
-> Sort (cost=15583108.89..15618188.88 rows=14031998 width=327)
(actual time=427770.504..427771.894 rows=1000 loops=1)
Sort Key: eventtime, sequencenum
-> Seq Scan on eventlog (cost=0.00..2334535.17 rows=14031998
width=327) (actual time=10.370..190038.764 rows=7699388 loops=1)
Filter: ((eventtime > '2006-07-23 16:00:00'::timestamp
without time zone) AND (eventtime < '2006-08-22 16:00:00'::timestamp
without time zone) AND (((objdomainid)::text =
'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text =
'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text =
'tzRh39d0d91luNGT1weIUjLvFIcA'::text)))
Total runtime: 437884.134 ms
(6 rows)

-----Original Message-----
From: Chris [mailto:dmagick(at)gmail(dot)com]
Sent: Tuesday, August 22, 2006 6:37 PM
To: Subbiah, Stalin
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] Query tuning

Subbiah, Stalin wrote:
> Hello All,
>
> This query runs forever and ever. Nature of this table being lots of
> inserts/deletes/query, I vacuum it every half hour to keep the holes
> reusable and nightly once vacuum analyze to update the optimizer.
> We've got index on eventtime only. Running it for current day uses
> index range scan and it runs within acceptable time. Below is the
> explain of the query. Is the order by sequencenum desc prevents from
> applying limit optimization?
>
> explain SELECT *
> FROM EVENTLOG
> WHERE EVENTTIME>'07/23/06 16:00:00'
> AND EVENTTIME<'08/22/06 16:00:00'
> AND (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA'
> OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA'
> OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA')
> ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 0;
>
> QUERY PLAN
>
> ----------------------------------------------------------------------
> --
> ----------------------------------------------------------------------
> --
> ----------------------------------------------------------------------
> --
> ----------------------------------------------------------------------
> --
> -------------------------------------------------------------
> Limit (cost=15546930.29..15546931.54 rows=500 width=327)
> -> Sort (cost=15546930.29..15581924.84 rows=13997819 width=327)
> Sort Key: eventtime, sequencenum
> -> Seq Scan on eventlog (cost=0.00..2332700.25
> rows=13997819
> width=327)
> Filter: ((eventtime > '2006-07-23 16:00:00'::timestamp
> without time zone) AND (eventtime < '2006-08-22 16:00:00'::timestamp
> without time zone) AND (((objdomainid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text)))
> (5 rows)
>
> Thanks,
> Stalin
> Pg version 8.0.1, suse 64bit.

Firstly you should update to 8.0.8 - because it's in the same stream you
won't need to do a dump/initdb/reload like a major version change, it
should be a simple upgrade.

Can you send explain analyze instead of just explain?

It sounds like you're not analyz'ing enough - if you're doing lots of
updates/deletes/inserts then the statistics postgresql uses to choose
whether to do an index scan or something else will quickly be outdated
and so it'll have to go back to a full table scan every time..

Can you set up autovacuum to handle that for you more regularly?

--
Postgresql & php tutorials
http://www.designmagick.com/


From: Chris <dmagick(at)gmail(dot)com>
To: "Subbiah, Stalin" <SSubbiah(at)netopia(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Query tuning
Date: 2006-08-23 03:05:34
Message-ID: 44EBC5FE.9020404@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Subbiah, Stalin wrote:
> Actually these servers will be upgraded to 8.1.4 in couple of months.

even so, you could get some bad data in there.
http://www.postgresql.org/docs/8.0/static/release.html . Go through the
old release notes and you'll find various race conditions, crashes etc.

> Here you go with explain analyze.
>
> # explain analyze SELECT *
> FROM EVENTLOG
> WHERE EVENTTIME>'07/23/06 16:00:00' AND EVENTTIME<'08/22/06 16:00:00'
> AND (OBJDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA'
> OR OBJID='tzRh39d0d91luNGT1weIUjLvFIcA'
> OR USERDOMAINID='tzRh39d0d91luNGT1weIUjLvFIcA')
> ORDER BY EVENTTIME DESC, SEQUENCENUM DESC LIMIT 500 OFFSET 500;
>
> QUERY PLAN
>
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> -------------------------------------------------------------
> Limit (cost=15583110.14..15583111.39 rows=500 width=327) (actual
> time=427771.568..427772.904 rows=500 loops=1)
> -> Sort (cost=15583108.89..15618188.88 rows=14031998 width=327)
> (actual time=427770.504..427771.894 rows=1000 loops=1)
> Sort Key: eventtime, sequencenum
> -> Seq Scan on eventlog (cost=0.00..2334535.17 rows=14031998
> width=327) (actual time=10.370..190038.764 rows=7699388 loops=1)
> Filter: ((eventtime > '2006-07-23 16:00:00'::timestamp
> without time zone) AND (eventtime < '2006-08-22 16:00:00'::timestamp
> without time zone) AND (((objdomainid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((objid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text) OR ((userdomainid)::text =
> 'tzRh39d0d91luNGT1weIUjLvFIcA'::text)))
> Total runtime: 437884.134 ms
> (6 rows)

If you analyze the table then run this again what plan does it come back
with?

I can't read explain output properly but I suspect (and I'm sure I'll be
corrected if need be) that the sort step is way out of whack and so is
the seq scan because the stats aren't up to date enough.

Do you have an index on objdomainid, objid and userdomainid (one index
per field) ? I wonder if that will help much.

--
Postgresql & php tutorials
http://www.designmagick.com/