Re: gincostestimate

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jan Urbański <wulczer(at)wulczer(dot)org>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: gincostestimate
Date: 2010-08-07 14:36:23
Message-ID: 4471.1281191783@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> 1. The use of rd_amcache is very questionable.

Attached is an alternate patch that I think you should give serious
consideration to. The basic idea here is to only update the metapage
stats data during VACUUM, and not bother with incremental updates during
other operations. That gets rid of a lot of the complexity and
opportunities for bugs-of-omission in the original approach, and also
reduces contention for the metapage as well as WAL traffic.
gincostestimate can compensate fairly well for index growth since the
last VACUUM by scaling up the recorded values by the known growth ratio
of the overall index size. (Note that the index->pages count passed to
gincostestimate is accurate, having been recently gotten from
RelationGetNumberOfBlocks.) Of course, this is only approximate, but
considering that the equations the values are going to be fed into are
even more approximate, I don't see a problem with that.

I also dropped the use of rd_amcache, instead having ginGetStats()
just read the metapage every time. Since the planner stats per se
are now only updated during vacuum, it would be reasonable to use
rd_amcache to remember them, but there's still a problem with
nPendingPages. I think that keeping it simple is the way to go,
at least until someone can show a performance problem with this way.

I didn't do anything about the questionable equations in
gincostestimate. Those need to either be fixed, or documented as
to why they're correct. Other than that I think this could be
committed.

regards, tom lane

PS: I still haven't tested this further than running the regression
tests, since I see little point in trying to check its estimation
behavior until those equations are fixed. However, the hstore
regression test did expose a core dump in gincostestimate (failing
to guard against null partial_matches), which I have fixed here.

Attachment Content-Type Size
gincostestimate-0.21.gz application/octet-stream 5.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mike Fowler 2010-08-07 15:38:06 Re: Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function
Previous Message Pavel Stehule 2010-08-07 11:39:03 Re: patch (for 9.1) string functions