Heavily modified big table bloat even in auto vacuum is running

From: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-10-11 12:57:17
Message-ID: 8977CB36860C5843884E0A18D8747B0372BC84DA@szxeml558-mbs.china.huawei.com
Lists: pgsql-hackers

Vacuum is not happening on a heavily modified big table even when the number of dead tuples exceeds the configured threshold.

This is because, at the end of vacuum, the table's dead tuple count is reset to zero, so the dead tuples that
accumulated while the vacuum was running are lost. To trigger the next vacuum on the same table, the configured
threshold number of dead tuples has to accumulate all over again.
That next vacuum then takes longer because it has more dead tuples to process, and as this repeats it leads
to table and index bloat.
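
For example, whether a table has crossed the trigger point can be watched with a query along the lines of the one below. This is only an illustrative sketch: "my_big_table" is a placeholder name, and it assumes the usual threshold + scale_factor * reltuples formula for the autovacuum trigger.

-- Rough check of whether a table is due for autovacuum.
SELECT s.relname,
       s.n_live_tup,
       s.n_dead_tup,
       current_setting('autovacuum_vacuum_threshold')::bigint
         + current_setting('autovacuum_vacuum_scale_factor')::float8 * c.reltuples
         AS vacuum_trigger_point,
       s.last_autovacuum
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
WHERE s.relname = 'my_big_table';   -- placeholder table name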

To handle the above case, instead of directly resetting the dead tuple count to zero, how about subtracting the exact
number of dead tuples removed by vacuum from the table stats? With this approach vacuum gets triggered more frequently, which reduces the bloat.
A patch for the same is attached to this mail.

Please let me know if there is any problem with this approach.

Regards,
Hari babu.

Attachment Content-Type Size
vacuum_fix_v1.patch application/octet-stream 3.2 KB

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-10-11 23:10:49
Message-ID: 1381533049.83343.YahooMailNeo@web162906.mail.bf1.yahoo.com
Lists: pgsql-hackers

Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com> wrote:

> To handle the above case instead of directly resetting the dead
> tuples as zero, how if the exact dead tuples are removed from the
> table stats. With this approach vacuum gets triggered frequently
> thus it reduces the bloat.
> Patch for the same is attached in the mail.

Please add this to the open CommitFest to ensure that it gets
reviewed:

https://commitfest.postgresql.org/action/commitfest_view/open

For more information about the process, see:

http://wiki.postgresql.org/wiki/CommitFest

You may also want to review:

http://wiki.postgresql.org/wiki/Developer_FAQ

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-10-12 06:00:16
Message-ID: 4154.1381557616@sss.pgh.pa.us
Lists: pgsql-hackers

Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com> writes:
> To handle the above case instead of directly resetting the dead tuples as zero, how if the exact dead tuples
> are removed from the table stats. With this approach vacuum gets triggered frequently thus it reduces the bloat.

This does not seem like a very good idea as-is, because it will mean that
n_dead_tuples can diverge arbitrarily far from reality over time, as a
result of accumulation of errors. It also doesn't seem like a very good
idea that VACUUM sets n_live_tuples while only adjusting n_dead_tuples
incrementally; ideally those counters should move in the same fashion.
In short, I think this patch will create at least as many problems as
it fixes.

What would make more sense to me is for VACUUM to estimate the number
of remaining dead tuples somehow and send that in its message. However,
since the whole point here is that we aren't accounting for transactions
that commit while VACUUM runs, it's not very clear how to do that.

Another way to look at it is that we want to keep any increments to
n_dead_tuples that occur after VACUUM takes its snapshot. Maybe we could
have VACUUM copy the n_dead_tuples value as it exists when VACUUM starts,
and then send that as the value to subtract when it's done?

regards, tom lane


From: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-10-15 10:07:27
Message-ID: 8977CB36860C5843884E0A18D8747B0372BC8A7E@szxeml558-mbs.china.huawei.com
Lists: pgsql-hackers

On 12 October 2013 11:30 Tom Lane wrote:
>Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com> writes:
>> To handle the above case instead of directly resetting the dead tuples
>> as zero, how if the exact dead tuples are removed from the table stats. With this approach vacuum gets triggered frequently thus it reduces the bloat.

>This does not seem like a very good idea as-is, because it will mean that n_dead_tuples can diverge arbitrarily far from reality over time, as a result of accumulation of errors. It also doesn't seem
>like a very good idea that VACUUM sets n_live_tuples while only adjusting n_dead_tuples incrementally; ideally those counters should move in the same fashion.
>In short, I think this patch will create at least as many problems as it fixes.

>What would make more sense to me is for VACUUM to estimate the number of remaining dead tuples somehow and send that in its message. However, since the whole point here is that we aren't accounting for
>transactions that commit while VACUUM runs, it's not very clear how to do that.

>Another way to look at it is that we want to keep any increments to n_dead_tuples that occur after VACUUM takes its snapshot. Maybe we could have VACUUM copy the n_dead_tuples value as it exists when
>VACUUM starts, and then send that as the value to subtract when it's done?

Taking a copy of n_dead_tuples and passing it at the end of vacuum to subtract from the table stats may not be correct, as vacuum may not have cleaned all the dead tuples because of tuple visibility
to other transactions. How about resetting n_dead_tuples to zero if it ever goes negative because of such errors?

Regards,
Hari babu.


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-10-20 06:35:59
Message-ID: CAA4eK1Lg+v_r1VdH68qyzHzfQjbfjotA=UurTGjW-bMnUKrORA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 15, 2013 at 3:37 PM, Haribabu kommi
<haribabu(dot)kommi(at)huawei(dot)com> wrote:
> On 12 October 2013 11:30 Tom Lane wrote:
>>Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com> writes:
>>> To handle the above case instead of directly resetting the dead tuples
>>> as zero, how if the exact dead tuples are removed from the table stats. With this approach vacuum gets triggered frequently thus it reduces the bloat.
>
>>This does not seem like a very good idea as-is, because it will mean that n_dead_tuples can diverge arbitrarily far from reality over time, as a result of accumulation of errors. It also doesn't seem
>>like a very good idea that VACUUM sets n_live_tuples while only adjusting n_dead_tuples incrementally; ideally those counters should move in the same fashion.
>>In short, I think this patch will create at least as many problems as it fixes.
>
>>What would make more sense to me is for VACUUM to estimate the number of remaining dead tuples somehow and send that in its message. However, since the whole point here is that we aren't accounting for
>>transactions that commit while VACUUM runs, it's not very clear how to do that.
>
>>Another way to look at it is that we want to keep any increments to n_dead_tuples that occur after VACUUM takes its snapshot. Maybe we could have VACUUM copy the n_dead_tuples value as it exists when
>>VACUUM starts, and then send that as the value to subtract when it's done?
>
> Taking of n_dead_tuples copy and pass the same at the vacuum end to subtract from table stats may not be correct, as vacuum may not be cleaned all the dead tuples because of tuple visibility
> To other transactions. How about resets the n_dead_tuples as zero if it goes negative because of errors?

Wouldn't the way you are planning to change n_dead_tuples create an
inconsistency between n_live_tuples and n_dead_tuples, because it would
have counted non-deleted tuples as n_live_tuples as per the below code:

if (tupgone)
{
    ..
    tups_vacuumed += 1;
    has_dead_tuples = true;
}
else
{
    num_tuples += 1;
    hastup = true;
    ..
}

So now, if we just subtract tuples_deleted from n_dead_tuples, the tuples
deleted during vacuum will be counted both as live tuples and as dead
tuples.
There is one statistic for dead row versions that cannot be removed
(nkeep); if we could use that to estimate the total remaining dead tuples,
then the solution would be in line with Tom's suggestion (What would make
more sense to me is for VACUUM to estimate the number of remaining
dead tuples somehow and send that in its message.).

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-10-21 05:24:07
Message-ID: 8977CB36860C5843884E0A18D8747B0372BC968A@szxeml558-mbs.china.huawei.com
Lists: pgsql-hackers

On 20 October 2013 12:06 Amit Kapila wrote:
>On Tue, Oct 15, 2013 at 3:37 PM, Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com> wrote:
>> On 12 October 2013 11:30 Tom Lane wrote:
>>>Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com> writes:
>>>> To handle the above case instead of directly resetting the dead
>>>> tuples as zero, how if the exact dead tuples are removed from the table stats. With this approach vacuum gets triggered frequently thus it reduces the bloat.
>>
>>>This does not seem like a very good idea as-is, because it will mean
>>>that n_dead_tuples can diverge arbitrarily far from reality over time, as a result of accumulation of errors. It also doesn't seem like a very good idea that VACUUM sets n_live_tuples while only adjusting n_dead_tuples incrementally; ideally those counters should move in the same fashion.
>>>In short, I think this patch will create at least as many problems as it fixes.
>>
>>>What would make more sense to me is for VACUUM to estimate the number
>>>of remaining dead tuples somehow and send that in its message. However, since the whole point here is that we aren't accounting for transactions that commit while VACUUM runs, it's not very clear how to do that.
>>
>>>Another way to look at it is that we want to keep any increments to
>>>n_dead_tuples that occur after VACUUM takes its snapshot. Maybe we could have VACUUM copy the n_dead_tuples value as it exists when VACUUM starts, and then send that as the value to subtract when it's done?
>>
>> Taking of n_dead_tuples copy and pass the same at the vacuum end to
>> subtract from table stats may not be correct, as vacuum may not be cleaned all the dead tuples because of tuple visibility To other transactions. How about resets the n_dead_tuples as zero if it goes negative because of errors?

>Wouldn't the way you are planing to change n_dead_tuples create inconsistency for n_live_tuples and n_dead_tuples, because it would have counted non deleted tuples as n_live_tuples as per below code:
>
>if (tupgone)
>{
>..
>tups_vacuumed += 1;
>has_dead_tuples = true;
>}
>else
>{
>num_tuples += 1;
>hastup = true;
>..
>}
>
>So now if we just subtract tuples_deleted from n_dead_tuples, it will count the tuples deleted during vacuum both as live tuples and dead tuples.
>There is one statistics for dead row version's that cannot be removed (nkeep), if we could use that to estimate total remaining dead tuples, then the solution can be inline with Tom's suggestion (What
>would make more sense to me is for VACUUM to estimate the number of remaining dead tuples somehow and send that in its message.).

Yes, that's correct. The "nkeep" counter holds the tuples that are recently dead and were not vacuumed. Subtracting the vacuumed tuples from the dead tuple count should leave the same value as the "nkeep" counter.
So if we subtract nkeep from num_tuples, that gives us the proper live tuple count. How about the following statement at the end of the scan over all blocks?

num_tuples -= nkeep;

Please let me know if any corrections are required.
A patch with the above implementation is attached to this mail.

Regards,
Hari babu.

Attachment Content-Type Size
vacuum_fix_v2.patch application/octet-stream 3.5 KB

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-10-22 04:45:02
Message-ID: CAA4eK1L8am2uEpQdd+D_aBr_i2NY3q4kYAgwEt=UnSmsMzqQRg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 21, 2013 at 10:54 AM, Haribabu kommi
<haribabu(dot)kommi(at)huawei(dot)com> wrote:
> On 20 October 2013 12:06 Amit Kapila wrote:
>>On Tue, Oct 15, 2013 at 3:37 PM, Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com> wrote:
>>> On 12 October 2013 11:30 Tom Lane wrote:
>>>>Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com> writes:
>>>
>>>>Another way to look at it is that we want to keep any increments to
>>>>n_dead_tuples that occur after VACUUM takes its snapshot. Maybe we could have VACUUM copy the n_dead_tuples value as it exists when VACUUM starts, and then send that as the value to subtract when it's done?
>>>
>>> Taking of n_dead_tuples copy and pass the same at the vacuum end to
>>> subtract from table stats may not be correct, as vacuum may not be cleaned all the dead tuples because of tuple visibility To other transactions. How about resets the n_dead_tuples as zero if it goes negative because of errors?
>
>>Wouldn't the way you are planing to change n_dead_tuples create inconsistency for n_live_tuples and n_dead_tuples, because it would have counted non deleted tuples as n_live_tuples as per below code:
>>
>>if (tupgone)
>>{
>>..
>>tups_vacuumed += 1;
>>has_dead_tuples = true;
>>}
>>else
>>{
>>num_tuples += 1;
>>hastup = true;
>>..
>>}
>>
>>So now if we just subtract tuples_deleted from n_dead_tuples, it will count the tuples deleted during vacuum both as live tuples and dead tuples.
>>There is one statistics for dead row version's that cannot be removed (nkeep), if we could use that to estimate total remaining dead tuples, then the solution can be inline with Tom's suggestion (What
>>would make more sense to me is for VACUUM to estimate the number of remaining dead tuples somehow and send that in its message.).
>
> Yes, it's correct. "nkeep" counter have the dead tuples which are recently dead and are not vacuumed. The removal of tuples vacuumed from dead tuples should be the same as "nkeep" counter.
> So if we remove the nkeep from num_tuples which gives us the proper live tuples. How about following statement at the end scan for all blocks.
>
> num_tuples -= nkeep;

Actually what I had in mind was to use nkeep to estimate n_dead_tuples
similar to how num_tuples is used to estimate n_live_tuples. I think
that would match what Tom had pointed out in his response (>>>>What would
make more sense to me is for VACUUM to estimate the number
>>>>of remaining dead tuples somehow and send that in its message. However, since the whole point here is that we aren't accounting for transactions that commit while VACUUM runs, it's not very clear how to do that.)

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-10-22 08:39:46
Message-ID: 8977CB36860C5843884E0A18D8747B0372BC9AC4@szxeml558-mbs.china.huawei.com
Lists: pgsql-hackers

On 22 October 2013 10:15 Amit Kapila wrote:
>>On Mon, Oct 21, 2013 at 10:54 AM, Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com> wrote:
>>
>> Yes, it's correct. "nkeep" counter have the dead tuples which are recently dead and are not vacuumed. The removal of tuples vacuumed from dead tuples should be the same as "nkeep" counter.
>> So if we remove the nkeep from num_tuples which gives us the proper live tuples. How about following statement at the end scan for all blocks.
>>
>> num_tuples -= nkeep;
>
>Actually what I had in mind was to use nkeep to estimate n_dead_tuples similar to how num_tuples is used to estimate n_live_tuples. I think it will match what Tom had pointed in his response (>>>>What
>>would make more sense to me is for VACUUM to estimate the number
>>>>>of remaining dead tuples somehow and send that in its message.
>>>>>However, since the whole point here is that we aren't accounting for
>>>>>transactions that commit while VACUUM runs, it's not very clear how
>>>>>to do that.)

I changed the patch so that the "nkeep" counter is passed to the stats collector as the relation's new dead tuple count, in the same way new_rel_tuples is passed.
The "nkeep" counter is an approximation of the relation's dead tuples.
Instead of resetting the dead tuple stats to zero, this value is used to set n_dead_tuples, in the same way n_live_tuples is set.

The patch is attached to this mail. Please let me know if any changes are required.

Regards,
Hari Babu.

Attachment Content-Type Size
vacuum_fix_v3.patch application/octet-stream 4.3 KB

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-11-07 04:12:21
Message-ID: CAA4eK1K6tsLCT9hAxV0iKw8nsaz-UTVuyc27j4HO86S0rAEiXQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 22, 2013 at 2:09 PM, Haribabu kommi
<haribabu(dot)kommi(at)huawei(dot)com> wrote:
> On 22 October 2013 10:15 Amit Kapila wrote:
>>>On Mon, Oct 21, 2013 at 10:54 AM, Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com> wrote:
>>>
>>Actually what I had in mind was to use nkeep to estimate n_dead_tuples similar to how num_tuples is used to estimate n_live_tuples. I think it will match what Tom had pointed in his response (>>>>What
>>>would make more sense to me is for VACUUM to estimate the number
>>>>>>of remaining dead tuples somehow and send that in its message.
>>>>>>However, since the whole point here is that we aren't accounting for
>>>>>>transactions that commit while VACUUM runs, it's not very clear how
>>>>>>to do that.)
>
> I changed the patch as passing the "nkeep" counter data as the new dead tuples in the relation to stats like the new_rel_tuples.
> The "nkeep" counter is an approximation of dead tuples data of a relation.
> Instead of resetting dead tuples stats as zero, used this value to set n_dead_tuples same as n_live_tuples.

Directly using nkeep might not work, as it is not guaranteed that
Vacuum will scan all the pages; we need to estimate the value similar
to new_rel_tuples, something like what is done below:

/* now we can compute the new value for pg_class.reltuples */
vacrelstats->new_rel_tuples = vac_estimate_reltuples(onerel, false,
                                                     nblocks,
                                                     vacrelstats->scanned_pages,
                                                     num_tuples);

I am not sure whether the same calculation as done for new_rel_tuples
works for new_dead_tuples; please check that once.

I am thinking that if we have to do estimation anyway, then wouldn't
it be better to do the way Tom had initially suggested (Maybe we could
have VACUUM copy the n_dead_tuples value as it exists when VACUUM
starts, and then send that as the value to subtract when it's done?)

I think the reason you gave, that the number of dead tuples calculated by
the above logic is not accurate due to the tuple visibility checks, is
right, but it will still make the value of dead tuples more appropriate
than its current value.

You can check whether there is a way to estimate the dead tuples,
similar to the new tuples, that is as solid as the current logic of
vac_estimate_reltuples(); if so, that's okay, otherwise use the other
solution (using the value of n_dead_tuples at the start of vacuum) to
solve the problem.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-11-08 05:26:16
Message-ID: 8977CB36860C5843884E0A18D8747B0372BD78B4@szxeml558-mbs.china.huawei.com
Lists: pgsql-hackers

On 07 November 2013 09:42 Amit Kapila wrote:
> On Tue, Oct 22, 2013 at 2:09 PM, Haribabu kommi
> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
> > On 22 October 2013 10:15 Amit Kapila wrote:
> >>>On Mon, Oct 21, 2013 at 10:54 AM, Haribabu kommi
> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
> >>>
> >>Actually what I had in mind was to use nkeep to estimate
> n_dead_tuples
> >>similar to how num_tuples is used to estimate n_live_tuples. I think
> >>it will match what Tom had pointed in his response (>>>>What
> >>>would make more sense to me is for VACUUM to estimate the number
> >>>>>>of remaining dead tuples somehow and send that in its message.
> >>>>>>However, since the whole point here is that we aren't accounting
> >>>>>>for transactions that commit while VACUUM runs, it's not very
> >>>>>>clear how to do that.)
> >
> > I changed the patch as passing the "nkeep" counter data as the new
> dead tuples in the relation to stats like the new_rel_tuples.
> > The "nkeep" counter is an approximation of dead tuples data of a
> relation.
> > Instead of resetting dead tuples stats as zero, used this value to
> set n_dead_tuples same as n_live_tuples.
>
> Directly using nkeep might not work as it is not guaranteed that Vacuum
> will scan all the pages, we need to estimate the value similar to
> new_rel_tuples, something like as done in below function:
>
> /* now we can compute the new value for pg_class.reltuples */
> vacrelstats->new_rel_tuples = vac_estimate_reltuples(onerel, false,
>
> nblocks,
>
> vacrelstats->scanned_pages,
>
> num_tuples);
>
> I am not sure whether the same calculation as done for new_rel_tuples
> works for new_dead_tuples, you can once check it.

I didn't find any way to calculate new_dead_tuples like new_rel_tuples.
I will check it.

> I am thinking that if we have to do estimation anyway, then wouldn't it
> be better to do the way Tom had initially suggested (Maybe we could
> have VACUUM copy the n_dead_tuples value as it exists when VACUUM
> starts, and then send that as the value to subtract when it's done?)
>
> I think the reason you gave that due to tuple visibility check the
> number of dead tuples calculated by above logic is not accurate is
> right but still it will make the value of dead tuples more appropriate
> than it's current value.
>
> You can check if there is a way to do estimation of dead tuples similar
> to new tuples, and it will be as solid as current logic of
> vac_estimate_reltuples(), then it's okay, otherwise use the other
> solution (using the value of n_dead_tuples at start of Vacuum) to solve
> the problem.

Both approaches produce approximate values only.

1. Take a copy of n_dead_tuples before VACUUM starts and then subtract it once it is done.
This approach doesn't account for the tuples that remain dead during the vacuum operation.

2. The nkeep counter contains the tuples that are still visible to other transactions.
This approach doesn't account for tuples that are deleted on pages where the vacuum operation has already finished.

In my opinion the second approach gives a value nearer to the actual one,
because it also includes some of the new dead tuples. Please correct me if anything is wrong in my analysis.

Regards,
Hari babu.


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-11-08 13:05:11
Message-ID: CAA4eK1KwrvP2TggtZaCBx9AuZ5e+4an0V5fC3zktarWMUfPkQw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Nov 8, 2013 at 10:56 AM, Haribabu kommi
<haribabu(dot)kommi(at)huawei(dot)com> wrote:
> On 07 November 2013 09:42 Amit Kapila wrote:
>> I am not sure whether the same calculation as done for new_rel_tuples
>> works for new_dead_tuples, you can once check it.
>
> I didn't find any way to calculate new_dead_tuples like new_rel_tuples.
> I will check it.
>
>> I am thinking that if we have to do estimation anyway, then wouldn't it
>> be better to do the way Tom had initially suggested (Maybe we could
>> have VACUUM copy the n_dead_tuples value as it exists when VACUUM
>> starts, and then send that as the value to subtract when it's done?)
>>
>> I think the reason you gave that due to tuple visibility check the
>> number of dead tuples calculated by above logic is not accurate is
>> right but still it will make the value of dead tuples more appropriate
>> than it's current value.
>>
>> You can check if there is a way to do estimation of dead tuples similar
>> to new tuples, and it will be as solid as current logic of
>> vac_estimate_reltuples(), then it's okay, otherwise use the other
>> solution (using the value of n_dead_tuples at start of Vacuum) to solve
>> the problem.
>
> The two approaches calculations are approximation values only.
>
> 1. Taking a copy of n_dead_tuples before VACUUM starts and then subtract it once it is done.
> This approach doesn't include the tuples which are remains during the vacuum operation.

Wouldn't the next or future vacuums make the estimate more appropriate?

> 2. nkeep counter contains the tuples which are still visible to other transactions.
> This approach doesn't include tuples which are deleted on pages where vacuum operation is already finished.
>
> In my opinion the second approach gives the value nearer to the actual value,
> because it includes some of the new dead tuples also. Please correct me if anything wrong in my analysis.

I think the main problem with the nkeep logic is coming up with an
estimation algorithm similar to the one for live tuples.

By the way, do you have a test case, or can you try to write one, that
shows this problem? Then, after the fix, you can verify whether the
problem is resolved.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-11-11 09:44:23
Message-ID: 8977CB36860C5843884E0A18D8747B0372BDA6BF@szxeml558-mbs.china.huawei.com
Lists: pgsql-hackers

On 08 November 2013 18:35 Amit Kapila wrote:
> On Fri, Nov 8, 2013 at 10:56 AM, Haribabu kommi
> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
> > On 07 November 2013 09:42 Amit Kapila wrote:
> >> I am not sure whether the same calculation as done for
> new_rel_tuples
> >> works for new_dead_tuples, you can once check it.
> >
> > I didn't find any way to calculate new_dead_tuples like
> new_rel_tuples.
> > I will check it.
> >
> >> I am thinking that if we have to do estimation anyway, then wouldn't
> >> it be better to do the way Tom had initially suggested (Maybe we
> >> could have VACUUM copy the n_dead_tuples value as it exists when
> >> VACUUM starts, and then send that as the value to subtract when it's
> >> done?)
> >>
> >> I think the reason you gave that due to tuple visibility check the
> >> number of dead tuples calculated by above logic is not accurate is
> >> right but still it will make the value of dead tuples more
> >> appropriate than it's current value.
> >>
> >> You can check if there is a way to do estimation of dead tuples
> >> similar to new tuples, and it will be as solid as current logic of
> >> vac_estimate_reltuples(), then it's okay, otherwise use the other
> >> solution (using the value of n_dead_tuples at start of Vacuum) to
> >> solve the problem.
> >
> > The two approaches calculations are approximation values only.
> >
> > 1. Taking a copy of n_dead_tuples before VACUUM starts and then
> subtract it once it is done.
> > This approach doesn't include the tuples which are remains during
> the vacuum operation.
>
> Wouldn't next or future vacuum's will make the estimate more
> appropraite?

That is possible only when the nkeep counter value (tuples not cleaned) is very small.

> > 2. nkeep counter contains the tuples which are still visible to other
> transactions.
> > This approach doesn't include tuples which are deleted on pages
> where vacuum operation is already finished.
> >
> > In my opinion the second approach gives the value nearer to the
> actual
> > value, because it includes some of the new dead tuples also. Please
> correct me if anything wrong in my analysis.
> I think main problem in nkeep logic is to come up with an estimation
> algorithm similar to live tuples.
>
> By the way, do you have test case or can you try to write a test case
> which can show this problem and then after fix, you can verify if the
> problem is resolved.

The simulated index bloat problem can be generated using the attached script and SQL.
With the fix of setting the dead tuples properly, the bloat is reduced, and by also changing the vacuum cost
parameters the bloat is avoided.
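
For reference, a minimal workload along these lines (a rough sketch only, not the attached script; "bloat_test" is a placeholder name) is:

-- Build a reasonably large table, then keep updating it so that dead
-- tuples continue to accumulate while (auto)vacuum is running on it.
CREATE TABLE bloat_test (id int PRIMARY KEY, payload text);
INSERT INTO bloat_test
SELECT g, repeat('x', 100) FROM generate_series(1, 1000000) g;

-- Repeat this from a couple of sessions for the duration of the test:
UPDATE bloat_test SET payload = repeat('y', 100)
WHERE id BETWEEN 1 AND 100000;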

The advantage observed with the fix is that autovacuum is triggered more often on
the bloated table, as it satisfies the vacuum criteria thanks to the proper dead tuple count, compared to the
original code.

Regards,
Hari babu.

Attachment Content-Type Size
Table_and_functions.sql application/octet-stream 835 bytes
script.sh application/octet-stream 8.6 KB

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-11-12 03:17:24
Message-ID: CAA4eK1JjxcyBzGf2GfHpNe8JgzKQ4RE6exNsvVjc3R3PcuFi9w@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 11, 2013 at 3:14 PM, Haribabu kommi
<haribabu(dot)kommi(at)huawei(dot)com> wrote:
> On 08 November 2013 18:35 Amit Kapila wrote:
>> On Fri, Nov 8, 2013 at 10:56 AM, Haribabu kommi
>> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
>> > On 07 November 2013 09:42 Amit Kapila wrote:
>> >> I am not sure whether the same calculation as done for
>> new_rel_tuples
>> >> works for new_dead_tuples, you can once check it.
>> >
>> > I didn't find any way to calculate new_dead_tuples like
>> new_rel_tuples.
>> > I will check it.
>> >
>> > The two approaches calculations are approximation values only.
>> >
>> > 1. Taking a copy of n_dead_tuples before VACUUM starts and then
>> subtract it once it is done.
>> > This approach doesn't include the tuples which are remains during
>> the vacuum operation.
>>
>> Wouldn't next or future vacuum's will make the estimate more
>> appropraite?
>
> Possible only when nkeep counter value (tuples not cleaned) is very less value.

Do you really expect that many dead tuples to occur during vacuum?

>> > 2. nkeep counter contains the tuples which are still visible to other
>> transactions.
>> > This approach doesn't include tuples which are deleted on pages
>> where vacuum operation is already finished.
>> >
>> > In my opinion the second approach gives the value nearer to the
>> actual
>> > value, because it includes some of the new dead tuples also. Please
>> correct me if anything wrong in my analysis.
>> I think main problem in nkeep logic is to come up with an estimation
>> algorithm similar to live tuples.
>>
>> By the way, do you have test case or can you try to write a test case
>> which can show this problem and then after fix, you can verify if the
>> problem is resolved.
>
> The simulated index bloat problem can be generated using the attached script and sql.
> With the fix of setting the dead tuples properly,

Which fix are you referring to here? Is it the one you proposed
in your initial mail?

> the bloat is reduced and by changing the vacuum cost
> Parameters the bloat is avoided.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-11-13 06:32:58
Message-ID: 8977CB36860C5843884E0A18D8747B0372BDAFDF@szxeml558-mbs.china.huawei.com
Lists: pgsql-hackers

On 12 November 2013 08:47 Amit Kapila wrote:
> On Mon, Nov 11, 2013 at 3:14 PM, Haribabu kommi
> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
> > On 08 November 2013 18:35 Amit Kapila wrote:
> >> On Fri, Nov 8, 2013 at 10:56 AM, Haribabu kommi
> >> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
> >> > On 07 November 2013 09:42 Amit Kapila wrote:
> >> > 1. Taking a copy of n_dead_tuples before VACUUM starts and then
> >> subtract it once it is done.
> >> > This approach doesn't include the tuples which are remains
> >> > during
> >> the vacuum operation.

The patch is modified to take a copy of n_dead_tuples at the start of vacuum and use
it while calculating the new dead tuples at the end of vacuum.

> >> By the way, do you have test case or can you try to write a test
> case
> >> which can show this problem and then after fix, you can verify if
> the
> >> problem is resolved.
> >
> > The simulated index bloat problem can be generated using the attached
> script and sql.
> > With the fix of setting the dead tuples properly,
>
> Which fix here you are referring to, is it the one which you have
> proposed with your initial mail?
>
> > the bloat is reduced and by changing the vacuum cost Parameters the
> > bloat is avoided.

With the simulated bloat test run for 1 hour, the bloat was as below:

Unpatched - 1532MB
Patched - 1474MB

With the patched approach the bloat is reduced.
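
(The sizes above can be measured with something like the query below; this is only a sketch, with "bloat_test" as the placeholder test table.)

-- Total size of the table including its indexes and TOAST data:
SELECT pg_size_pretty(pg_total_relation_size('bloat_test'));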

Regards,
Hari babu.

Attachment Content-Type Size
vacuum_fix_v4.patch application/octet-stream 4.2 KB

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-11-15 04:29:44
Message-ID: CAA4eK1LENy6_K-5GswMuCeHhG4B7mUCfSzZENPaUe8U0mJ5BEA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Nov 13, 2013 at 12:02 PM, Haribabu kommi
<haribabu(dot)kommi(at)huawei(dot)com> wrote:
> On 12 November 2013 08:47 Amit Kapila wrote:
>> On Mon, Nov 11, 2013 at 3:14 PM, Haribabu kommi
>> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
>> > On 08 November 2013 18:35 Amit Kapila wrote:
>> >> On Fri, Nov 8, 2013 at 10:56 AM, Haribabu kommi
>> >> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
>> >> > On 07 November 2013 09:42 Amit Kapila wrote:
>> >> > 1. Taking a copy of n_dead_tuples before VACUUM starts and then
>> >> subtract it once it is done.
>> >> > This approach doesn't include the tuples which are remains
>> >> > during
>> >> the vacuum operation.
>
> Patch is modified as take a copy of n_dead_tuples during vacuum start and use
> the same while calculating the new dead tuples at end of vacuum.
>
>> >> By the way, do you have test case or can you try to write a test
>> case
>> >> which can show this problem and then after fix, you can verify if
>> the
>> >> problem is resolved.
>> >
>> > The simulated index bloat problem can be generated using the attached
>> script and sql.
>> > With the fix of setting the dead tuples properly,
>>
>> Which fix here you are referring to, is it the one which you have
>> proposed with your initial mail?
>>
>> > the bloat is reduced and by changing the vacuum cost Parameters the
>> > bloat is avoided.
>
> With the simulated bloat test run for 1 hour the bloat occurred as below,
>
> Unpatched - 1532MB
> Patched - 1474MB

In your test run, have you checked what happens if, after the heavy update
(or once bloat occurs), you keep the system idle (or just have read
load on it) for some time? What is the result?

You haven't answered one of my questions from the previous mail
( >With the fix of setting the dead tuples properly, the bloat is
reduced and by changing the vacuum cost Parameters the bloat is
avoided.
Which fix are you referring to here?)

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-11-15 05:22:05
Message-ID: 8977CB36860C5843884E0A18D8747B0372BDB7A8@szxeml558-mbs.china.huawei.com
Lists: pgsql-hackers

On 15 November 2013 10:00 Amit Kapila wrote:
> On Wed, Nov 13, 2013 at 12:02 PM, Haribabu kommi
> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
> > On 12 November 2013 08:47 Amit Kapila wrote:
> >> On Mon, Nov 11, 2013 at 3:14 PM, Haribabu kommi
> >> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
> >> > On 08 November 2013 18:35 Amit Kapila wrote:
> >> >> On Fri, Nov 8, 2013 at 10:56 AM, Haribabu kommi
> >> >> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
> >> >> > On 07 November 2013 09:42 Amit Kapila wrote:
> >> >> > 1. Taking a copy of n_dead_tuples before VACUUM starts and then
> >> >> subtract it once it is done.
> >> >> > This approach doesn't include the tuples which are remains
> >> >> > during
> >> >> the vacuum operation.
> >
> > Patch is modified as take a copy of n_dead_tuples during vacuum start
> > and use the same while calculating the new dead tuples at end of
> vacuum.
> >
> >> >> By the way, do you have test case or can you try to write a test
> >> case
> >> >> which can show this problem and then after fix, you can verify if
> >> the
> >> >> problem is resolved.
> >> >
> >> > The simulated index bloat problem can be generated using the
> >> > attached
> >> script and sql.
> >> > With the fix of setting the dead tuples properly,
> >>
> >> Which fix here you are referring to, is it the one which you have
> >> proposed with your initial mail?
> >>
> >> > the bloat is reduced and by changing the vacuum cost Parameters
> the
> >> > bloat is avoided.
> >
> > With the simulated bloat test run for 1 hour the bloat occurred as
> > below,
> >
> > Unpatched - 1532MB
> > Patched - 1474MB
>
> In your test run, have you checked what happen if after heavy update
> (or once bloat occurs), if you keep the system idle (or just have read
> load on system) for some time, what is the result?

In the simulated test run shared in the previous mail, the system is idle for 15 minutes
after a heavy update.

With the master code, vacuum is not triggered during this idle time, as the table does
not satisfy the vacuum threshold, because the dead tuples that occurred during the vacuum
operation are not considered.

With the fix, one extra vacuum can get triggered compared to the master code after two or three
heavy updates, because of the accumulation of dead tuples.

> You haven't answered one of my questions in previous mail ( >With the
> fix of setting the dead tuples properly, the bloat is reduced and by
> changing the vacuum cost Parameters the bloat is avoided.
> Which fix here you are referring to?)

The bloat reduction is the same with the initial and the latest patch.
The vacuum cost parameter change (which doesn't contain any fix) is what avoided the bloat.

Regards,
Hari babu.


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-11-19 05:02:46
Message-ID: CAA4eK1Lm_6qasRWRnMhfwrUE9Ma3xffuJQOTxj7=vejZtoHY-A@mail.gmail.com
Lists: pgsql-hackers

On Fri, Nov 15, 2013 at 10:52 AM, Haribabu kommi
<haribabu(dot)kommi(at)huawei(dot)com> wrote:
> On 15 November 2013 10:00 Amit Kapila wrote:
>> On Wed, Nov 13, 2013 at 12:02 PM, Haribabu kommi
>> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
>> >
>> > With the simulated bloat test run for 1 hour the bloat occurred as
>> > below,
>> >
>> > Unpatched - 1532MB
>> > Patched - 1474MB
>>
>> In your test run, have you checked what happen if after heavy update
>> (or once bloat occurs), if you keep the system idle (or just have read
>> load on system) for some time, what is the result?
>
> In the simulated test run which is shared in the previous mail, after a heavy update
> the system is idle for 15 mins.
>
> With the master code, the vacuum is not triggered during this idle time as it is
> Not satisfies the vacuum threshold, because it doesn't consider the dead tuples occurred
> During vacuum operation.
>
> With the fix the one extra vacuum can gets triggered compared to master code after two or three
> heavy updates because of accumulation of dead tuples.
>
>> You haven't answered one of my questions in previous mail ( >With the
>> fix of setting the dead tuples properly, the bloat is reduced and by
>> changing the vacuum cost Parameters the bloat is avoided.
>> Which fix here you are referring to?)
>
> The bloat reduced is same with initial and latest patch.
> The vacuum cost parameters change (which doesn't contain any fix) is avoided the bloat.

If I understood correctly, your patch's main intention is to
correct the estimate of dead tuples, so that it can lead to vacuum
cleaning the table/index, which otherwise does not happen as per the
configuration value (autovacuum_vacuum_threshold) in some of the
cases. It also does not remove the bloat completely (Unpatched - 1532MB
vs Patched - 1474MB), as the main cause of bloat is the extra space in the
index, which can only be reclaimed by a reindex operation.

So if the above is correct, then this patch has 3 advantages:
a. Extra vacuum on the table/index due to better estimation of dead tuples.
b. Space reclaimed due to this extra vacuum.
c. Maybe some performance advantage, as it will avoid the delay in
cleaning dead tuples.

I think a better way to test the patch is to see how much benefit there is
due to the above advantages (points a and b). Different values of
autovacuum_vacuum_threshold can be used for the test, for example via per-table settings as sketched below.
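
As a sketch only (using the hypothetical "bloat_test" table from the earlier workload), the per-table storage parameters can override the global settings for just that table:

-- Raise the trigger point for one table so autovacuum fires less often:
ALTER TABLE bloat_test
  SET (autovacuum_vacuum_threshold = 300000,
       autovacuum_vacuum_scale_factor = 0);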

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-11-22 06:42:15
Message-ID: 8977CB36860C5843884E0A18D8747B0372BEF50E@szxeml558-mbs.china.huawei.com
Lists: pgsql-hackers

On 19 November 2013 10:33 Amit Kapila wrote:
> If I understood correctly, then your patch's main intention is to
> correct the estimate of dead tuples, so that it can lead to Vacuum
> cleaning the table/index which otherwise is not happening as per
> configuration value (autovacuum_vacuum_threshold) in some of the cases,
> also it is not reducing the complete bloat (Unpatched - 1532MB
> ~Patched - 1474MB), as the main reason of bloat is extra space in
> index which can be reclaimed by reindex operation.
>
> So if above is correct then this patch has 3 advantages:
> a. Extra Vacuum on table/index due to better estimation of dead tuples.
> b. Space reclaim due to this extra vacuum c. may be some performance
> advantage as it will avoid the delay in cleaning dead tuples
>
> I think better way to test the patch is to see how much benefit is
> there due to above (a and b points) advantages. Different values of
> autovacuum_vacuum_threshold can be used to test.

I modified the test and did a performance run with the following configuration changes:
autovacuum_vacuum_threshold set to 3 times the number of records in the table
autovacuum_naptime set to 30s

The test script generates the configured vacuum threshold's worth of dead tuples in 45 seconds.
After a 180-second test, it sleeps for 2 minutes.

After a one-hour test run, the patched approach showed one more autovacuum than
the master code. I'm not sure whether this difference would remain visible in longer runs.
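
For example, the per-table vacuum counts can be read from pg_stat_user_tables (a sketch only; "bloat_test" is the placeholder test table name):

SELECT relname, vacuum_count, autovacuum_count, analyze_count, autoanalyze_count
FROM pg_stat_user_tables
WHERE relname = 'bloat_test';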

The performance effect of the patch is not very visible because, I think, the analyze
on the table estimates the table's number of dead tuples on its own.
Because of this, not much performance improvement is visible, as the
dead tuple count missed by vacuum is covered by the analyze. Please correct
me if anything is missed in my analysis.

Updated patch and test scripts are attached in the mail.

Regards,
Hari babu.

Attachment Content-Type Size
vacuum_fix_v5.patch application/octet-stream 4.5 KB
Table_and_functions.sql application/octet-stream 1.1 KB
script.sh application/octet-stream 3.9 KB

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-11-25 05:12:54
Message-ID: CAA4eK1Ja1bo9oC6YmXT0PqV=VrXed969Y=DRKc-SqXB1FAEZNg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Nov 22, 2013 at 12:12 PM, Haribabu kommi
<haribabu(dot)kommi(at)huawei(dot)com> wrote:
> On 19 November 2013 10:33 Amit Kapila wrote:
>> If I understood correctly, then your patch's main intention is to
>> correct the estimate of dead tuples, so that it can lead to Vacuum
>> cleaning the table/index which otherwise is not happening as per
>> configuration value (autovacuum_vacuum_threshold) in some of the cases,
>> also it is not reducing the complete bloat (Unpatched - 1532MB
>> ~Patched - 1474MB), as the main reason of bloat is extra space in
>> index which can be reclaimed by reindex operation.
>>
>> So if above is correct then this patch has 3 advantages:
>> a. Extra Vacuum on table/index due to better estimation of dead tuples.
>> b. Space reclaim due to this extra vacuum c. may be some performance
>> advantage as it will avoid the delay in cleaning dead tuples
>>
>> I think better way to test the patch is to see how much benefit is
>> there due to above (a and b points) advantages. Different values of
>> autovacuum_vacuum_threshold can be used to test.
>
>
> The performance effect of the patch is not much visible as I think the analyze
> on the table estimates the number of dead tuples of the table with some estimation.

Yes, that seems to be the reason why you are not seeing any
performance benefit, but I still think this is a useful optimization to
do, as analyze updates both the live tuples and dead tuples, and similarly
vacuum should also update both counts. Do you see any reason
why vacuum should only update live tuples and not dead tuples?

> Because of this reason not much performance improvement is not visible as the
> missed dead tuple calculation in vacuum is covered by the analyze.

Yeah, so maybe we can check once by configuring the
analyze threshold/scale factor in a way that analyze doesn't get
triggered during your test, for example as sketched below.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-11-26 13:56:48
Message-ID: 8977CB36860C5843884E0A18D8747B0372BEFFE0@szxeml558-mbs.china.huawei.com
Lists: pgsql-hackers

On 25 November 2013 10:43 Amit Kapila wrote:
> On Fri, Nov 22, 2013 at 12:12 PM, Haribabu kommi
> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
> > On 19 November 2013 10:33 Amit Kapila wrote:
> >> If I understood correctly, then your patch's main intention is to
> >> correct the estimate of dead tuples, so that it can lead to Vacuum
> >> cleaning the table/index which otherwise is not happening as per
> >> configuration value (autovacuum_vacuum_threshold) in some of the
> >> cases, also it is not reducing the complete bloat (Unpatched -
> 1532MB
> >> ~Patched - 1474MB), as the main reason of bloat is extra space in
> >> index which can be reclaimed by reindex operation.
> >>
> >> So if above is correct then this patch has 3 advantages:
> >> a. Extra Vacuum on table/index due to better estimation of dead
> tuples.
> >> b. Space reclaim due to this extra vacuum c. may be some performance
> >> advantage as it will avoid the delay in cleaning dead tuples
> >>
> >> I think better way to test the patch is to see how much benefit is
> >> there due to above (a and b points) advantages. Different values of
> >> autovacuum_vacuum_threshold can be used to test.
> >
> >
> > The performance effect of the patch is not much visible as I think
> the
> > analyze on the table estimates the number of dead tuples of the table
> with some estimation.
>
> Yes, that seems to be the reason why you are not seeing any
> performance benefit, but still I think this is useful optimization to
> do, as
> analyze updates both the livetuples and dead tuples and similarly
> vacuum should also update both the counts. Do you see any reason
> why Vacuum should only update live tuples and not deadtuples?

Vacuum touches all the pages where dead tuples are present, which is not the
case with the analyzer. Because of this, the analyzer estimates the dead tuples as well.
With the proposed patch, vacuum also estimates the dead tuples.

> > Because of this reason not much performance improvement is not
> visible
> > as the missed dead tuple calculation in vacuum is covered by the
> analyze.
>
> Yeah, so might be we can check once by configuring
> analyze_threshold/scalefactor in a way that analyze doesn't get trigger
> during your test.

I ran the test for one hour with a high analyze threshold and the results are below.

          Autovacuum count    Bloat size
Master    15                  155MB
Patched   23                  134MB

The updated test script and configuration are attached to this mail.

Regards,
Hari babu.

Attachment Content-Type Size
test_script.tar.gz application/x-gzip 8.3 KB

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-11-29 06:30:10
Message-ID: CAA4eK1Kh1OXrGo20Na=3ZiRrui7KfPseXBxJ67pAmH6CueVBng@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 26, 2013 at 7:26 PM, Haribabu kommi
<haribabu(dot)kommi(at)huawei(dot)com> wrote:
> On 25 November 2013 10:43 Amit Kapila wrote:
>> On Fri, Nov 22, 2013 at 12:12 PM, Haribabu kommi
>> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
>> > On 19 November 2013 10:33 Amit Kapila wrote:
>> >> If I understood correctly, then your patch's main intention is to
>> >> correct the estimate of dead tuples, so that it can lead to Vacuum
>> >> cleaning the table/index which otherwise is not happening as per
>> >> configuration value (autovacuum_vacuum_threshold) in some of the
>> >> cases, also it is not reducing the complete bloat (Unpatched -
>> 1532MB
>> >> ~Patched - 1474MB), as the main reason of bloat is extra space in
>> >> index which can be reclaimed by reindex operation.
>> >>
>> >> So if above is correct then this patch has 3 advantages:
>> >> a. Extra Vacuum on table/index due to better estimation of dead
>> tuples.
>> >> b. Space reclaim due to this extra vacuum c. may be some performance
>> >> advantage as it will avoid the delay in cleaning dead tuples
>> >>
>> >> I think better way to test the patch is to see how much benefit is
>> >> there due to above (a and b points) advantages. Different values of
>> >> autovacuum_vacuum_threshold can be used to test.
>> >
>> >
>> > The performance effect of the patch is not much visible as I think
>> the
>> > analyze on the table estimates the number of dead tuples of the table
>> with some estimation.
>>
>> Yes, that seems to be the reason why you are not seeing any
>> performance benefit, but still I think this is useful optimization to
>> do, as
>> analyze updates both the livetuples and dead tuples and similarly
>> vacuum should also update both the counts. Do you see any reason
>> why Vacuum should only update live tuples and not deadtuples?
>
> As vacuum touches all the pages where the dead tuples are present. This is not the
> Same with analyzer. Because of this reason, the analyzer estimates the dead tuples also.
> With the proposed patch the vacuum also estimates the dead tuples.

A few questions about your latest patch:
a. Is there any reason why you are doing the estimation of dead tuples
only for autovacuum and not for plain vacuum?

/* clear and get the new stats for calculating proper dead tuples */
pgstat_clear_snapshot();
tabentry = pgstat_fetch_stat_tabentry(RelationGetRelid(onerel));

b. In the above code, to get the latest data you first clear the
snapshot and then call the pgstat function. That will in turn perform I/O
(reading the stats file) and send/receive messages to/from the stats collector
to ensure it can read the latest data. I think it will add overhead
to vacuum, especially if the 'nkeep' value calculated in
lazy_scan_heap() can serve the purpose. In my simple test [1], I
observed that the value of nkeep can serve the purpose.

Can you please try the test once with the 'nkeep' approach patch?

>> > Because of this reason not much performance improvement is not
>> visible
>> > as the missed dead tuple calculation in vacuum is covered by the
>> analyze.
>>
>> Yeah, so might be we can check once by configuring
>> analyze_threshold/scalefactor in a way that analyze doesn't get trigger
>> during your test.
>
> I ran the test for one hour with a high analyze_threshold and results are below.
>
> Auto vacuum count Bloat size
> Master 15 155MB
> Patched 23 134MB
>
> Updated test script and configuration is attached in the mail.

I just had a brief look at your test; please check whether you can simplify
your script file and make the test results come out in 15~20 minutes.
Don't put too much effort into it; if you can do it easily then it is okay.

[1] Simple test case to verify the value of dead tuples:
Session-1
-----------------
a. Create table t1(c1 int);
b. insert into t1 values(generate_series(1,1000));
c. delete from t1;
d. Vacuum t1;
-- here I stopped in the debugger, after fetching the dead tuple count for
the first time (line 235, vacuumlazy.c, after applying your patch) as per
your code (modified a bit so that I can get the value for Vacuum)

Session-2
-----------------
a. insert into t1 values (generate_series(1000,1500));
b. delete from t1;

Session-1
-----------------
e. Verified the value of nkeep in lazy_scan_heap(); it is 501, which is
what we expect.
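
For reference, nkeep is the counter that lazy_scan_heap() bumps for tuples
it sees but cannot remove yet; roughly this case in the per-tuple switch
(a sketch of the existing vacuumlazy.c code, comments mine):

    case HEAPTUPLE_RECENTLY_DEAD:
        /*
         * The tuple was deleted, but the deleting transaction may still
         * be visible to somebody, so VACUUM must keep it for now.
         */
        nkeep += 1;
        break;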

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-11-29 13:25:21
Message-ID: 8977CB36860C5843884E0A18D8747B0372BF0B5B@szxeml558-mbs.china.huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 29 November 2013 12:00 Amit Kapila wrote:
> On Tue, Nov 26, 2013 at 7:26 PM, Haribabu kommi
> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
> > On 25 November 2013 10:43 Amit Kapila wrote:
> >> On Fri, Nov 22, 2013 at 12:12 PM, Haribabu kommi
> >> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
> >> > On 19 November 2013 10:33 Amit Kapila wrote:
> >> >> If I understood correctly, then your patch's main intention is to
> >> >> correct the estimate of dead tuples, so that it can lead to
> Vacuum
> >> >> cleaning the table/index which otherwise is not happening as per
> >> >> configuration value (autovacuum_vacuum_threshold) in some of the
> >> >> cases, also it is not reducing the complete bloat (Unpatched -
> >> 1532MB
> >> >> ~Patched - 1474MB), as the main reason of bloat is extra space
> in
> >> >> index which can be reclaimed by reindex operation.
> >> >>
> >> >> So if above is correct then this patch has 3 advantages:
> >> >> a. Extra Vacuum on table/index due to better estimation of dead
> >> tuples.
> >> >> b. Space reclaim due to this extra vacuum c. may be some
> >> >> performance advantage as it will avoid the delay in cleaning dead
> >> >> tuples
> >> >>
> >> >> I think better way to test the patch is to see how much benefit
> is
> >> >> there due to above (a and b points) advantages. Different values
> >> >> of autovacuum_vacuum_threshold can be used to test.
> >> >
> >> >
> >> > The performance effect of the patch is not much visible as I think
> >> the
> >> > analyze on the table estimates the number of dead tuples of the
> >> > table
> >> with some estimation.
> >>
> >> Yes, that seems to be the reason why you are not seeing any
> >> performance benefit, but still I think this is useful optimization
> to
> >> do, as
> >> analyze updates both the livetuples and dead tuples and similarly
> >> vacuum should also update both the counts. Do you see any reason
> >> why Vacuum should only update live tuples and not deadtuples?
> >
> > As vacuum touches all the pages where the dead tuples are present.
> > This is not the Same with analyzer. Because of this reason, the
> analyzer estimates the dead tuples also.
> > With the proposed patch the vacuum also estimates the dead tuples.
>
> Few questions about your latest patch:
> a. Is there any reason why you are doing estimation of dead tuples only
> for Autovacuum and not for Vacuum.

No, changed.

> /* clear and get the new stats for calculating proper dead tuples */
> pgstat_clear_snapshot(); tabentry =
> pgstat_fetch_stat_tabentry(RelationGetRelid(onerel));
> b. In the above code, to get latest data you are first clearing
> snapshot and then calling pgstat function. It will inturn perform I/O
> (read of stats file) and send/receive message from stats collector
> to ensure it can read latest data. I think it will add overhead
> to Vacuum, especially if 'nkeep' calculated in function
> lazy_scan_heap() can serve the purpose. In my simple test[1], I
> observed
> that value of keep can serve the purpose.
>
> Can you please once try the test on 'nkeep' approach patch.

Using both the nkeep and the snapshot approaches, I ran the test for
40 minutes with a high analyze_threshold; the results are below.

                    Auto vacuum count   Bloat size
Master                     11             220MB
Patched_nkeep              14             215MB
Patched_snapshot           18             198MB

Both approaches show a good improvement in the test.
The updated patches, test script, and configuration are attached to this mail.

Regards,
Hari babu.

Attachment Content-Type Size
vacuum_fix_v6_nkeep.patch application/octet-stream 3.6 KB
test_script.tar.gz application/x-gzip 8.3 KB
vacuum_fix_v6_snapshot.patch application/octet-stream 3.7 KB

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-12-06 06:26:52
Message-ID: CAA4eK1KpnqQb4R7BMiVHEyjivbch8wzniYSDQ-YAAFBi77Km6A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Nov 29, 2013 at 6:55 PM, Haribabu kommi
<haribabu(dot)kommi(at)huawei(dot)com> wrote:
> On 29 November 2013 12:00 Amit Kapila wrote:
>> On Tue, Nov 26, 2013 at 7:26 PM, Haribabu kommi
>> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
>> Few questions about your latest patch:
>> a. Is there any reason why you are doing estimation of dead tuples only
>> for Autovacuum and not for Vacuum.
>
> No, changed.
>
>> /* clear and get the new stats for calculating proper dead tuples */
>> pgstat_clear_snapshot(); tabentry =
>> pgstat_fetch_stat_tabentry(RelationGetRelid(onerel));
>> b. In the above code, to get latest data you are first clearing
>> snapshot and then calling pgstat function. It will inturn perform I/O
>> (read of stats file) and send/receive message from stats collector
>> to ensure it can read latest data. I think it will add overhead
>> to Vacuum, especially if 'nkeep' calculated in function
>> lazy_scan_heap() can serve the purpose. In my simple test[1], I
>> observed
>> that value of keep can serve the purpose.
>>
>> Can you please once try the test on 'nkeep' approach patch.
>
> Using the nkeep and snapshot approach, I ran the test for 40 mins with a
> high analyze_threshold and results are below.
>
> Auto vacuum count Bloat size
> Master 11 220MB
> Patched_nkeep 14 215MB
> Patched_snapshot 18 198MB
>
> Both the approaches are showing good improvement in the test.
> Updated patches, test script and configuration is attached in the mail.

I think updating the dead tuple count using nkeep is a good idea, as a
similar thing is done for Analyze in acquire_sample_rows().
One minor point: I think it is better to also log the dead tuples in the
below log message:

ereport(LOG,
(errmsg("automatic vacuum of table \"%s.%s.%s\": index scans: %d\n"
"pages: %d removed, %d remain\n"
"tuples: %.0f removed, %.0f remain\n"

i.e. change the "tuples" line to something like:

"tuples: %.0f removed, %.0f remain, %.0f dead\n"

About your test: how is the data collected when running this script? Are
you manually stopping it after 40 minutes? I ran it for more than an hour
and the final result didn't come.
As I mentioned last time, please simplify your test; in its current form
it is difficult for another person to make sense of it.
Write comments at the top of it, in step form, explaining what exactly it
is doing and how to collect data using it (for example, whether I need to
wait till the script ends, and how long the test takes to complete).

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-12-12 06:54:51
Message-ID: 8977CB36860C5843884E0A18D8747B0372BF34D5@szxeml558-mbs.china.huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On 06 December 2013 11:57 Amit Kapila wrote:
> On Fri, Nov 29, 2013 at 6:55 PM, Haribabu kommi
> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
> > On 29 November 2013 12:00 Amit Kapila wrote:
> >> On Tue, Nov 26, 2013 at 7:26 PM, Haribabu kommi
> >> <haribabu(dot)kommi(at)huawei(dot)com> wrote:
> >> Few questions about your latest patch:
> >> a. Is there any reason why you are doing estimation of dead tuples
> >> only for Autovacuum and not for Vacuum.
> >
> > No, changed.
> >
> >> /* clear and get the new stats for calculating proper dead tuples */
> >> pgstat_clear_snapshot(); tabentry =
> >> pgstat_fetch_stat_tabentry(RelationGetRelid(onerel));
> >> b. In the above code, to get latest data you are first clearing
> >> snapshot and then calling pgstat function. It will inturn perform
> I/O
> >> (read of stats file) and send/receive message from stats
> >> collector to ensure it can read latest data. I think it will add
> overhead
> >> to Vacuum, especially if 'nkeep' calculated in function
> >> lazy_scan_heap() can serve the purpose. In my simple test[1], I
> >> observed
> >> that value of keep can serve the purpose.
> >>
> >> Can you please once try the test on 'nkeep' approach patch.
> >
> > Using the nkeep and snapshot approach, I ran the test for 40 mins
> with
> > a high analyze_threshold and results are below.
> >
> > Auto vacuum count Bloat size
> > Master 11 220MB
> > Patched_nkeep 14 215MB
> > Patched_snapshot 18 198MB
> >
> > Both the approaches are showing good improvement in the test.
> > Updated patches, test script and configuration is attached in the
> mail.
>
> I think updating dead tuple count using nkeep is good idea as similar
> thing is done for Analyze as well in acquire_sample_rows().
> One minor point, I think it is better to log dead tuples is below error
> message:
> ereport(LOG,
> (errmsg("automatic vacuum of table \"%s.%s.%s\": index scans: %d\n"
> "pages: %d removed, %d remain\n"
> "tuples: %.0f removed, %.0f remain\n"
>
> "tuples: %.0f removed, %.0f remain, %.0f dead\n"
>
>
> About your test, how to collect the data by running this script, are
> you manually stopping it after 40 mins, because I ran it for more than
> an hour, the final result didn't came.
> As I mentioned you last time, please simplify your test, for other
> person in its current form, it is difficult to make meaning out of it.
> Write comments on top of it in steps form to explain what exactly it is
> doing and how to take data using it (for example, do I need to wait,
> till script ends; how long this test can take to complete).

A simplified test and an updated patch taking care of the above comments are attached to this mail.
I was not able to reduce the test duration, but I changed the test so that it automatically exits after a 45-minute run.
Please check the vacuum_test.sh file for more details on running the test.

                 Auto vacuum count   Bloat size
Master                  15             220MB
Patched_nkeep           18             213MB

Please let me know your suggestions.

Regards,
Hari babu.

Attachment Content-Type Size
vacuum_fix_v7_nkeep.patch application/octet-stream 5.0 KB
test_script.tar.gz application/x-gzip 8.3 KB

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-12-16 13:21:56
Message-ID: 52AEFE74.6050206@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Amit, if you think this patch is ready now, please mark it as Ready for
Committer. Otherwise, I encourage you to continue the discussion and
possibly resubmit the patch to the next commitfest.


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-12-16 14:26:51
Message-ID: CAA4eK1JKrhyoYUnVKBKzJWsPNJx3u7QWHojcf2jteWsWgFqP+Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Dec 16, 2013 at 6:51 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> Amit, if you think this patch is ready now, please mark it as Ready for
> Committer. Otherwise, I encourage your to continue the discussion and
> possibly resubmit the patch to the next commitfest.

Only one test is pending from my side before I can conclude. I am planning
to complete it tomorrow.
Is that okay?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2013-12-17 06:17:16
Message-ID: CAA4eK1L5CQFsAT6REMQktAAr0RbVz5inXMwwQxA03ChzF4xrCw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Dec 12, 2013 at 12:24 PM, Haribabu kommi
<haribabu(dot)kommi(at)huawei(dot)com> wrote:
> On 06 December 2013 11:57 Amit Kapila wrote:
>
> A simplified test and updated patch by taking care the above comment are attached in the mail.
> I am not able to reduce the test duration but changed as the test automatically exists after 45 mins run.
> Please check vacuum_test.sh file more details for running the test.
>
> Auto vacuum count Bloat size
> Master 15 220MB
> Patched_nkeep 18 213MB

I ran the attached test and the numbers are as below:

                 Auto vacuum count   Bloat size
Master                  16             222MB
Patched_nkeep           23             216MB

Here "Bloat size" means the table size after the test finished, and
"Auto vacuum count" means the number of times autovacuum was triggered
during the test run.

This clearly shows that setting the number of dead tuples at the end of
Vacuum improves the situation.
I am marking this (based on patch vacuum_fix_v7_nkeep.patch) as Ready
For Committer.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: tirtho <tirtho(at)hotmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2014-01-16 16:54:48
Message-ID: 1389891288375-5787477.post@n5.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Is this now fixed? If so, where do I find the patch for Postgres 9.2.2?

Thanks

- Tirthankar

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Heavily-modified-big-table-bloat-even-in-auto-vacuum-is-running-tp5774274p5787477.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: tirtho <tirtho(at)hotmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2014-01-17 03:46:50
Message-ID: CAA4eK1+hzxYyX7_5_LAihGouwW6PU7ZSSEquRQTU-RLpNbbPaA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jan 16, 2014 at 10:24 PM, tirtho <tirtho(at)hotmail(dot)com> wrote:
> Is this now fixed? If so, where do I find the patch for postgres 9.2.2.

This is not yet fixed; the patch is in the Ready For Committer stage.

You can confirm the status here:
https://commitfest.postgresql.org/action/patch_view?id=1258

I have verified that it still applies on HEAD, so I moved it to the current
CommitFest.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2014-01-19 00:29:37
Message-ID: 27056.1390091377@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> I am marking this (based on patch vacuum_fix_v7_nkeep.patch) as Ready
> For Committer.

I've reviewed and committed this patch, with one significant change.
If you look at the way that vacuumlazy.c computes new_rel_tuples, it's
based on scanned_tuples (lazy_scan_heap's num_tuples), which is the total
number of surviving tuples *including* the recently-dead ones counted in
nkeep. This is the number that we want to put into pg_class.reltuples,
I think, but it's wrong for the pgstats stuff to use it as n_live_tuples
if we're going to count the recently-dead ones as dead. That is, if we're
improving the approximation that n_dead_tuples is zero after a vacuum,
the fix should involve reducing the n_live_tuples estimate as well as
increasing the n_dead_tuples estimate.
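
In code terms, the end-of-vacuum reporting now looks roughly like this
(a sketch of the shape of the committed change, assuming
pgstat_report_vacuum() carries a dead-tuple count; see vacuumlazy.c for
the actual code):

    new_live_tuples = new_rel_tuples - vacrelstats->new_dead_tuples;
    if (new_live_tuples < 0)
        new_live_tuples = 0;    /* guard against a negative estimate */

    pgstat_report_vacuum(RelationGetRelid(onerel),
                         onerel->rd_rel->relisshared,
                         new_live_tuples,
                         vacrelstats->new_dead_tuples);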

Using your test script against the unpatched code, it's easy to see that
there's a large (and wrong) value of n_live_tup reported by an autovacuum,
which gets corrected by the next autoanalyze. For instance note these
successive readouts from the pg_stat_all_tables query:

 n_live_tup | autovacuum_count | autoanalyze_count | n_dead_tup
------------+------------------+-------------------+------------
     497365 |                9 |                 8 |    4958346
     497365 |                9 |                 8 |    5458346
    1186555 |               10 |                 8 |          0
    1186555 |               10 |                 8 |     500000
     499975 |               10 |                 9 |    2491877

Since we know the true number of live tuples is always exactly 500000
in this test, that jump is certainly wrong. With the committed patch,
the behavior is significantly saner:

 n_live_tup | autovacuum_count | autoanalyze_count | n_dead_tup
------------+------------------+-------------------+------------
     483416 |                2 |                 2 |    5759861
     483416 |                2 |                 2 |    6259861
     655171 |                3 |                 2 |     382306
     655171 |                3 |                 2 |     882306
     553942 |                3 |                 3 |    3523744

Still some room for improvement, but it's not so silly anymore.

It strikes me that there may be an obvious way to improve the number
further, based on the observation in this thread that nkeep doesn't need
to be scaled up because VACUUM should have scanned every page that could
contain dead tuples. Namely, that we're arriving at new_rel_tuples by
scaling up num_tuples linearly --- but perhaps we should only scale up
the live-tuples fraction of that count, not the dead-tuples fraction.
By scaling up dead tuples too, we are presumably still overestimating
new_rel_tuples somewhat, and the behavior that I'm seeing with this test
script seems to confirm that. I haven't time to pursue this idea at the
moment, but perhaps someone else would like to.

The n_dead_tup values seem to still be on the high side (not the low
side) when I run this test. Not too sure why that is.

Also, I don't see any particularly meaningful change in the rate of
autovacuuming or autoanalyzing when using default postgresql.conf
settings. I see that your test case changes the autovac parameters quite
a bit, but I didn't bother installing those values here. This may or may
not mean much; the fix is clearly correct on its own terms.

On the whole, this patch is not really addressing the question of
accounting for transactions changing the table concurrently with
VACUUM; it's only fixing the impedance mismatch between pgstats wanting
to count live and dead tuples separately while VACUUM wasn't telling
it that. That's a good thing to do, but I think we still have some
issues here.

regards, tom lane


From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2014-01-19 04:40:46
Message-ID: CAA4eK1+4_J2BUs4AAhD+Q98ZNRkpF2FiCNuoHq8WWU_mfiqbpw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Sun, Jan 19, 2014 at 5:59 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
>> I am marking this (based on patch vacuum_fix_v7_nkeep.patch) as Ready
>> For Committer.
>
> I've reviewed and committed this patch, with one significant change.
> If you look at the way that vacuumlazy.c computes new_rel_tuples, it's
> based on scanned_tuples (lazy_scan_heap's num_tuples), which is the total
> number of surviving tuples *including* the recently-dead ones counted in
> nkeep. This is the number that we want to put into pg_class.reltuples,
> I think, but it's wrong for the pgstats stuff to use it as n_live_tuples
> if we're going to count the recently-dead ones as dead. That is, if we're
> improving the approximation that n_dead_tuples is zero after a vacuum,
> the fix should involve reducing the n_live_tuples estimate as well as
> increasing the n_dead_tuples estimate.
>
> Using your test script against the unpatched code, it's easy to see that
> there's a large (and wrong) value of n_live_tup reported by an autovacuum,
> which gets corrected by the next autoanalyze. For instance note these
> successive readouts from the pg_stat_all_tables query:
>
> n_live_tup | autovacuum_count | autoanalyze_count | n_dead_tup
> ------------+------------------+-------------------+------------
> 497365 | 9 | 8 | 4958346
> 497365 | 9 | 8 | 5458346
> 1186555 | 10 | 8 | 0
> 1186555 | 10 | 8 | 500000
> 499975 | 10 | 9 | 2491877
>
> Since we know the true number of live tuples is always exactly 500000
> in this test, that jump is certainly wrong. With the committed patch,
> the behavior is significantly saner:
>
> n_live_tup | autovacuum_count | autoanalyze_count | n_dead_tup
> ------------+------------------+-------------------+------------
> 483416 | 2 | 2 | 5759861
> 483416 | 2 | 2 | 6259861
> 655171 | 3 | 2 | 382306
> 655171 | 3 | 2 | 882306
> 553942 | 3 | 3 | 3523744
>
> Still some room for improvement, but it's not so silly anymore.
>
> It strikes me that there may be an obvious way to improve the number
> further, based on the observation in this thread that nkeep doesn't need
> to be scaled up because VACUUM should have scanned every page that could
> contain dead tuples. Namely, that we're arriving at new_rel_tuples by
> scaling up num_tuples linearly --- but perhaps we should only scale up
> the live-tuples fraction of that count, not the dead-tuples fraction.
> By scaling up dead tuples too, we are presumably still overestimating
> new_rel_tuples somewhat, and the behavior that I'm seeing with this test
> script seems to confirm that.

After reading your analysis, the first thought that occurred to me is that
we could directly subtract nkeep from num_tuples to get better scaling of
the live tuples. However, the scaling routine vac_estimate_reltuples()
expects scanned_tuples, and it is shared by both Analyze and Vacuum, whose
mechanisms for calculating live and dead tuples seem to be a bit different,
so directly passing num_tuples minus nkeep to this routine might create
some problems. Still, I think this idea is definitely worth pursuing to
improve the estimates of live tuples in Vacuum.
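
For illustration, the naive form of the idea would look roughly like this
(a sketch only, not tested, and subject to the caveat above that
vac_estimate_reltuples() currently expects the full scanned_tuples count;
variable names follow lazy_scan_heap()):

    /*
     * Scale up only the live tuples actually seen; add nkeep back in
     * unscaled, since VACUUM visits every page that can hold dead tuples.
     */
    double  scanned_live = num_tuples - nkeep;

    vacrelstats->new_rel_tuples =
        vac_estimate_reltuples(onerel, false,
                               nblocks,
                               vacrelstats->scanned_pages,
                               scanned_live) + nkeep;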

> I haven't time to pursue this idea at the moment, but perhaps someone else would like to.

I think this idea is worth adding to the TODO list.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Haribabu kommi <haribabu(dot)kommi(at)huawei(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Heavily modified big table bloat even in auto vacuum is running
Date: 2014-01-19 07:54:34
Message-ID: 4446.1390118074@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
> On Sun, Jan 19, 2014 at 5:59 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> It strikes me that there may be an obvious way to improve the number
>> further, based on the observation in this thread that nkeep doesn't need
>> to be scaled up because VACUUM should have scanned every page that could
>> contain dead tuples. Namely, that we're arriving at new_rel_tuples by
>> scaling up num_tuples linearly --- but perhaps we should only scale up
>> the live-tuples fraction of that count, not the dead-tuples fraction.
>> By scaling up dead tuples too, we are presumably still overestimating
>> new_rel_tuples somewhat, and the behavior that I'm seeing with this test
>> script seems to confirm that.

> After reading your analysis, first thought occurred to me is that we can
> directly subtract nkeep from num_tuples to account for better scaling
> of live tuples, but I think the scaling routine vac_estimate_reltuples()
> is expecting scanned_tuples and this routine is shared by both
> Analyze and Vacuum where the mechanism to calculate the live
> and dead tuples seems to be bit different, so may be directly passing
> a subtract of num_tuples and nkeep to this routine might create some
> problem. However I think this idea is definitely worth pursuing to
> improve the estimates of live tuples in Vacuum.

Yeah, it seemed like it would require some rethinking of the way
vac_estimate_reltuples() works. It's probably not that hard, but it
looked like it'd require more thought than I wanted to put into it on
a Saturday ;-)

regards, tom lane