Re: Heavily modified big table bloat even in auto vacuum is running

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-11-12 03:17:24
Message-ID: CAA4eK1JjxcyBzGf2GfHpNe8JgzKQ4RE6exNsvVjc3R3PcuFi9w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 11, 2013 at 3:14 PM, Haribabu kommi
<haribabu(dot)kommi(at)huawei(dot)com> wrote:
> On 08 November 2013 18:35 Amit Kapila wrote:
>> On Fri, Nov 8, 2013 at 10:56 AM, Haribabu kommi
>> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
>> > On 07 November 2013 09:42 Amit Kapila wrote:
>> >> I am not sure whether the same calculation as done for
>> new_rel_tuples
>> >> works for new_dead_tuples, you can once check it.
>> >
>> > I didn't find any way to calculate new_dead_tuples like
>> new_rel_tuples.
>> > I will check it.
>> >
>> > The two approaches calculations are approximation values only.
>> >
>> > 1. Taking a copy of n_dead_tuples before VACUUM starts and then
>> subtract it once it is done.
>> > This approach doesn't include the tuples which are remains during
>> the vacuum operation.
>>
>> Wouldn't next or future vacuum's will make the estimate more
>> appropraite?
>
> Possible only when nkeep counter value (tuples not cleaned) is very less value.

Do you really expect too many dead tuples during Vacuum?

>> > 2. nkeep counter contains the tuples which are still visible to other
>> transactions.
>> > This approach doesn't include tuples which are deleted on pages
>> where vacuum operation is already finished.
>> >
>> > In my opinion the second approach gives the value nearer to the
>> actual
>> > value, because it includes some of the new dead tuples also. Please
>> correct me if anything wrong in my analysis.
>> I think main problem in nkeep logic is to come up with an estimation
>> algorithm similar to live tuples.
>>
>> By the way, do you have test case or can you try to write a test case
>> which can show this problem and then after fix, you can verify if the
>> problem is resolved.
>
> The simulated index bloat problem can be generated using the attached script and sql.
> With the fix of setting the dead tuples properly,

Which fix here you are referring to, is it the one which you have
proposed with your initial mail?

> the bloat is reduced and by changing the vacuum cost
> Parameters the bloat is avoided.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Johnston 2013-11-12 03:25:59 Re: MVCC snapshot timing
Previous Message Bruce Momjian 2013-11-12 02:46:09 Re: MVCC snapshot timing