Re: intermittent performance problem

From: Mike Charnoky <noky(at)nextbus(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: intermittent performance problem
Date: 2009-03-10 02:21:25
Message-ID: 49B5CEA5.9010306@nextbus.com
Lists: pgsql-general

The random sampling query is normally pretty snappy. It usually takes
on the order of 1 second to sample a few thousand rows of data out of a
few million. The sampling is consistently quick, too. However, on some
days, the sampling starts off quick, then when the process starts
sampling from a different subset of data (a different range of times for
the same day), the sampling query takes a couple of minutes.

Regarding the concurrent vacuuming, this is definitely not happening. I
always check pg_stat_activity whenever the sampling process starts to
lag behind. I have never seen a vacuum running during this time.

Interesting idea to issue the EXPLAIN first... I will see if I can
instrument the sampling program to do this.
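
A minimal sketch of that instrumentation, assuming the sampling program is
written in Python and talks to the database through a DB-API 2.0 cursor
(for example one from psycopg2, which interpolates parameters client-side).
The helper name explain_then_execute and the logger name "sampler" are
hypothetical, made up for illustration; the actual program may look quite
different.

```python
import logging

log = logging.getLogger("sampler")  # hypothetical logger name


def explain_then_execute(cursor, sql, params):
    """Log the planner's EXPLAIN output for `sql`, then run `sql` itself.

    `cursor` is assumed to be a DB-API 2.0 cursor. Plain EXPLAIN (without
    ANALYZE) only plans the statement and does not execute it, so the
    statement runs exactly once, in the second execute() call.
    """
    # Ask the planner for its plan, using the same parameters as the real run.
    cursor.execute("EXPLAIN " + sql, params)
    # EXPLAIN returns one text column, one row per line of the plan.
    plan_lines = [row[0] for row in cursor.fetchall()]
    log.info("plan for sampling query:\n%s", "\n".join(plan_lines))

    # Now run the actual statement.
    cursor.execute(sql, params)
    return plan_lines
```

Calling this in place of the direct execute() for the sampling INSERT would
leave one logged plan per run, so the plans from a fast day and a slow day
can be compared afterwards.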

Thanks for your help, Tom.

Mike

Tom Lane wrote:
> Mike Charnoky <noky(at)nextbus(dot)com> writes:
>> The sampling query which runs really slow on some days looks something
>> like this:
>
>> INSERT INTO sampled_data
>> (item_name, timestmp, ... )
>> SELECT item_name, timestmp, ...
>> FROM raw_data
>> WHERE timestmp >= ? and timestmp < ?
>> AND item_name=?
>> AND some_data_field NOTNULL
>> ORDER BY random()
>> LIMIT ?;
>
> Hmph, I'd expect that that would run pretty slowly *all* the time :-(.
> There's no good way to optimize "ORDER BY random()". However, it seems
> like the first thing you should do is modify the program so that it
> issues an EXPLAIN for that right before actually doing the query, and
> then you could see if the plan is different on the slow days.
>
>> We have done a great deal of PG tuning, including the autovacuum for the
>> "raw_data" table. Autovacuum kicks in like clockwork every day on that
>> table after the sampling process finishes (after one day's worth of data
>> is deleted from "raw_data" table, a roughly 7% change in size).
>
> Also, are you sure you have ruled out the possibility that the problem
> comes from autovac kicking in *while* the update is running?
>
> regards, tom lane
>
