Re: Collect frequency statistics for arrays

From: Noah Misch <noah(at)leadboat(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Nathan Boley <npboley(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Collect frequency statistics for arrays
Date: 2012-01-23 15:58:10
Message-ID: 20120123155810.GA12821@tornado.leadboat.com
Lists: pgsql-hackers

On Mon, Jan 23, 2012 at 01:21:20AM +0400, Alexander Korotkov wrote:
> Updated patch is attached. I've updated comment
> of mcelem_array_contained_selec with more detailed description of
> probability distribution assumption. Also, I found that "rest" behavior
> should be better described by Poisson distribution, relevant changes were
> made.

Thanks. That makes more of the math clear to me. I do not follow all of it,
but I feel that the comments now have enough information that I could go about
doing so.

> + /* Take care about events with low probabilities. */
> + if (rest > DEFAULT_CONTAIN_SEL)
> + {

Why the change from "rest > 0" to this in the latest version?

> + /* emit some statistics for debug purposes */
> + elog(DEBUG3, "array: target # mces = %d, bucket width = %d, "
> + "# elements = %llu, hashtable size = %d, usable entries = %d",
> + num_mcelem, bucket_width, element_no, i, track_len);

That should be UINT64_FMT. (I introduced that error in v0.10.)
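
For anyone following along, here is a minimal standalone sketch of why "%llu" is the wrong specifier for a 64-bit counter. It is not code from the patch. It assumes element_no is a uint64, and it uses the standard PRIu64 macro from <inttypes.h> as a stand-in for PostgreSQL's UINT64_FMT, which the tree defines in its own headers. The names MY_UINT64_FMT and format_element_count are hypothetical, chosen only for illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for PostgreSQL's UINT64_FMT.  "%llu" is only
 * correct where uint64_t happens to be unsigned long long; a format
 * macro picks the right length modifier for the platform.
 */
#define MY_UINT64_FMT "%" PRIu64

/*
 * Format an element count into buf.  Adjacent string literals are
 * concatenated at compile time, so the macro splices into the format
 * string.  Returns the number of characters snprintf would write.
 */
int
format_element_count(char *buf, size_t buflen, uint64_t element_no)
{
	return snprintf(buf, buflen,
					"# elements = " MY_UINT64_FMT, element_no);
}
```

The same splicing would apply to the elog() format string quoted above: close the literal before the specifier, insert the macro, and reopen the literal.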

I've attached a new version that includes the UINT64_FMT fix, some edits of
your newest comments, and a rerun of pgindent on the new files. I see no
other issues precluding commit, so I am marking the patch Ready for Committer.
If I made any of the comments worse, please post another update.

Thanks,
nm

Attachment: arrayanalyze-0.13.patch.gz (application/x-gunzip, 24.3 KB)
