Re: Heavily modified big table bloat even in auto vacuum is running

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-10-22 04:45:02
Message-ID: CAA4eK1L8am2uEpQdd+D_aBr_i2NY3q4kYAgwEt=UnSmsMzqQRg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 21, 2013 at 10:54 AM, Haribabu kommi
<haribabu(dot)kommi(at)huawei(dot)com> wrote:
> On 20 October 2013 12:06 Amit Kapila wrote:
>>On Tue, Oct 15, 2013 at 3:37 PM, Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com> wrote:
>>> On 12 October 2013 11:30 Tom Lane wrote:
>>>>Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com> writes:
>>>
>>>>Another way to look at it is that we want to keep any increments to
>>>>n_dead_tuples that occur after VACUUM takes its snapshot. Maybe we could have VACUUM copy the n_dead_tuples value as it exists when VACUUM starts, and then send that as the value to subtract when it's done?
>>>
>>> Taking of n_dead_tuples copy and pass the same at the vacuum end to
>>> subtract from table stats may not be correct, as vacuum may not be cleaned all the dead tuples because of tuple visibility To other transactions. How about resets the n_dead_tuples as zero if it goes negative because of errors?
>
>>Wouldn't the way you are planing to change n_dead_tuples create inconsistency for n_live_tuples and n_dead_tuples, because it would have counted non deleted tuples as n_live_tuples as per below code:
>>
>>if (tupgone)
>>{
>>..
>>tups_vacuumed += 1;
>>has_dead_tuples = true;
>>}
>>else
>>{
>>num_tuples += 1;
>>hastup = true;
>>..
>>}
>>
>>So now if we just subtract tuples_deleted from n_dead_tuples, it will count the tuples deleted during vacuum both as live tuples and dead tuples.
>>There is one statistics for dead row version's that cannot be removed (nkeep), if we could use that to estimate total remaining dead tuples, then the solution can be inline with Tom's suggestion (What
>>would make more sense to me is for VACUUM to estimate the number of remaining dead tuples somehow and send that in its message.).
>
> Yes, it's correct. "nkeep" counter have the dead tuples which are recently dead and are not vacuumed. The removal of tuples vacuumed from dead tuples should be the same as "nkeep" counter.
> So if we remove the nkeep from num_tuples which gives us the proper live tuples. How about following statement at the end scan for all blocks.
>
> num_tuples -= nkeep;

Actually what I had in mind was to use nkeep to estimate n_dead_tuples
similar to how num_tuples is used to estimate n_live_tuples. I think
it will match what Tom had pointed in his response (>>>>What would
make more sense to me is for VACUUM to estimate the number
>>>>of remaining dead tuples somehow and send that in its message. However, since the whole point here is that we aren't accounting for transactions that commit while VACUUM runs, it's not very clear how to do that.)

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2013-10-22 05:51:28 Re: proposal: lob conversion functionality
Previous Message Tom Lane 2013-10-22 04:39:29 pgsql: Fix blatantly broken record_image_cmp() logic for pass-by-value