Re: Table maintenance: order of operations important?

Lists: pgsql-admin
From: Jeff Boes <mur(at)qtm(dot)net>
To: pgsql-admin(at)postgresql(dot)org
Subject: Table maintenance: order of operations important?
Date: 2004-05-20 13:48:34
Message-ID: fe92d61c6c9a3eea2baa0b200ea5572f@news.teranews.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

We have a number of tables from which we expire (delete) rows nightly.
Is there any value to ordering the operations, in terms of either table
physical organization or elapsed time?

E.g.,

DELETE FROM foo WHERE date_expires < now();
VACUUM ANALYZE foo;
CLUSTER foo;
REINDEX TABLE foo;

How would you choose to order these (under 7.4.x) for fastest
turn-around? Does it matter?

--
It may not always be easy, convenient, or politically correct to stand
for truth and right, but it is the right thing to do. Always.
________ --M. Russell Ballard
Jeffery Boes <>< mur(at)qtm(dot)net


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Boes <mur(at)qtm(dot)net>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: Table maintenance: order of operations important?
Date: 2004-05-20 17:52:12
Message-ID: 3030.1085075532@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Jeff Boes <mur(at)qtm(dot)net> writes:
> DELETE FROM foo WHERE date_expires < now();
> VACUUM ANALYZE foo;
> CLUSTER foo;
> REINDEX TABLE foo;

> How would you choose to order these (under 7.4.x) for fastest
> turn-around? Does it matter?

If you are going to CLUSTER then the VACUUM and the REINDEX are both
utterly redundant. The ANALYZE is still useful but should be done after
CLUSTER since its physical-order-correlation stats will be quite wrong
if done beforehand. In other words there is only one sane way to do
this and it is

DELETE FROM foo WHERE date_expires < now();
CLUSTER foo;
ANALYZE foo;

You could possibly make a case for

DELETE FROM foo WHERE date_expires < now();
CLUSTER foo;
VACUUM ANALYZE foo;

The VACUUM won't do anything useful in terms of reclaiming space (there
being none to reclaim just after a CLUSTER) but it would ensure that all
rows in the table are marked as committed-good, rather than leaving that
work to be done by the first transaction that happens to hit each row.

regards, tom lane


From: Gaetano Mendola <mendola(at)bigfoot(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Table maintenance: order of operations important?
Date: 2004-05-20 18:53:51
Message-ID: 40ACFEBF.7000406@bigfoot.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-admin

Tom Lane wrote:

> Jeff Boes <mur(at)qtm(dot)net> writes:
>
>> DELETE FROM foo WHERE date_expires < now();
>> VACUUM ANALYZE foo;
>> CLUSTER foo;
>> REINDEX TABLE foo;
>
>
>>How would you choose to order these (under 7.4.x) for fastest
>>turn-around? Does it matter?
>
>
> If you are going to CLUSTER then the VACUUM and the REINDEX are both
> utterly redundant.

Without cluster with 7.4.2 a REINDEX is redundant after a VACUUM FULL ?

Regards
Gaetano Mendola