Re: Heavily modified big table bloat even in auto vacuum is running

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-11-25 05:12:54
Message-ID: CAA4eK1Ja1bo9oC6YmXT0PqV=VrXed969Y=DRKc-SqXB1FAEZNg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 22, 2013 at 12:12 PM, Haribabu kommi
<haribabu(dot)kommi(at)huawei(dot)com> wrote:
> On 19 November 2013 10:33 Amit Kapila wrote:
>> If I understood correctly, then your patch's main intention is to
>> correct the estimate of dead tuples, so that it can lead to Vacuum
>> cleaning the table/index which otherwise is not happening as per
>> configuration value (autovacuum_vacuum_threshold) in some of the cases,
>> also it is not reducing the complete bloat (Unpatched - 1532MB
>> ~Patched - 1474MB), as the main reason of bloat is extra space in
>> index which can be reclaimed by reindex operation.
>>
>> So if above is correct then this patch has 3 advantages:
>> a. Extra Vacuum on table/index due to better estimation of dead tuples.
>> b. Space reclaim due to this extra vacuum c. may be some performance
>> advantage as it will avoid the delay in cleaning dead tuples
>>
>> I think better way to test the patch is to see how much benefit is
>> there due to above (a and b points) advantages. Different values of
>> autovacuum_vacuum_threshold can be used to test.
>
>
> The performance effect of the patch is not much visible as I think the analyze
> on the table estimates the number of dead tuples of the table with some estimation.

Yes, that seems to be the reason why you are not seeing any
performance benefit, but still I think this is useful optimization to
do, as
analyze updates both the livetuples and dead tuples and similarly
vacuum should also update both the counts. Do you see any reason
why Vacuum should only update live tuples and not deadtuples?

> Because of this reason not much performance improvement is not visible as the
> missed dead tuple calculation in vacuum is covered by the analyze.

Yeah, so might be we can check once by configuring
analyze_threshold/scalefactor in a way that analyze doesn't get
trigger during your test.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexey Vasiliev 2013-11-25 05:15:57 Re[2]: [HACKERS] Re[2]: [HACKERS] Connect from background worker thread to database
Previous Message firoz e v 2013-11-25 05:10:09 Re: [PoC] pgstattuple2: block sampling to reduce physical read