Re: Reduce maximum error in tuples estimation after vacuum.

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Kyotaro HORIGUCHI'" <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reduce maximum error in tuples estimation after vacuum.
Date: 2013-07-03 05:23:44
Message-ID: 000601ce77ad$7d3388e0$779a9aa0$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thursday, June 27, 2013 4:58 PM Amit Kapila wrote:
> On Wednesday, June 26, 2013 7:40 AM Kyotaro HORIGUCHI wrote:
> > I've recovered from messing up.
> >
> > <snip>
> > > Please let me have a bit of time to diagnose this.
> >
> > I was completely messed up and walking on the wrong way. I looked
> into
> > the vacuum for UPDATEs, not DELETE's so it's quite resonable to have
> > such results.
> >
> > The renewed test script attached shows the verbose output of vacuum
> > after the deletes. I had following output from it.
> >
> > # I belive this runs for you..
> >
> > | INFO: "t": found 989999 removable, 110 nonremovable row
> > | versions in 6308 out of 10829 pages
> >
> > On such a case of partially-scanned, lazy_scan_heap() tries to
> estimate
> > resulting num_tuples in vac_estimate_reltuples() assuming the
> > uniformity of tuple density, which failes for such a a strong
> imbalance
> > made by bulk updates.
> >
> > Do you find any differences between what you will have and the
> > following I had?
>
> I could see the same output with your latest script, also I could
> reproduce
> the test if I run the test with individual sql statements.
> One of the main point for reproducing individual test was to keep
> autovacuum
> = off.

I checked further that why I could not reproduce the issue with
autovacuum=on.
The reason is that it starts analyzer which changes the value for reltuples
in pg_class and after that the estimated and real values become same.
Kindly refer below code:

relation_needs_vacanalyze()
{
..
anltuples = tabentry->changes_since_analyze;
..
anlthresh = (float4) anl_base_thresh + anl_scale_factor * reltuples;
..
*doanalyze = (anltuples > anlthresh);
}

Test Results
--------------

postgres=# drop table if exists t;
DROP TABLE
postgres=# create table t (a int, b int, c int, d int default 0, e int
default 0
, f int default 0);
CREATE TABLE
postgres=# insert into t (select a, (random() * 100000)::int from
generate_serie
s((select count(*) from t) + 1, 1000000) a);
INSERT 0 1000000
postgres=# update t set b = b + 1 where a < (select count(*) from t) * 0.7;

UPDATE 699999
postgres=# vacuum t;
VACUUM
postgres=# delete from t where a < (select count(*) from t) * 0.99;
DELETE 989999
postgres=# vacuum t;
VACUUM
postgres=# select c.relpages, s.n_live_tup, c.reltuples, (select count(*)
from t
) as tuples, reltuples::float / (select count(*) from t) as ratio from
pg_stat_
user_tables s, pg_class c where s.relname = 't' and c.relname = 't';
relpages | n_live_tup | reltuples | tuples | ratio
----------+------------+-----------+--------+------------------
6370 | 417600 | 417600 | 10001 | 41.7558244175582
(1 row)

Here I waited for 1 minute (sufficient time so that analyzer should get
trigger if required).
Infact if you run Analyze t, that also would have served the purpose.

postgres=# select c.relpages, s.n_live_tup, c.reltuples, (select count(*)
from t
) as tuples, reltuples::float / (select count(*) from t) as ratio from
pg_stat_
user_tables s, pg_class c where s.relname = 't' and c.relname = 't';
relpages | n_live_tup | reltuples | tuples | ratio
----------+------------+-----------+--------+-------
6370 | 10001 | 10001 | 10001 | 1
(1 row)

Now if subsequent analyzer run corrects the estimate, don't you think that
it is sufficient for the problem reported?

With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2013-07-03 06:12:23 Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: Proposal for Allow postgresql.conf values to be changed via SQL [review])
Previous Message Josh Berkus 2013-07-03 04:42:43 Re: [9.4 CF 1] The Commitfest Slacker List