Re: Patch Review: Collect frequency statistics and selectivity estimation for arrays

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Nathan Boley <npboley(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch Review: Collect frequency statistics and selectivity estimation for arrays
Date: 2011-07-15 07:40:12
Message-ID: CAPpHfdu_Hn3++doYxAh1sYKUwyH6Pk0jnis6r=RkeMXE5Wb20A@mail.gmail.com
Lists: pgsql-hackers

Hi!

Thank you for the review. I have a few questions about it.

On Fri, Jul 15, 2011 at 2:13 AM, Nathan Boley <npboley(at)gmail(dot)com> wrote:

> First, it makes me uncomfortable that you are using the MCV and histogram
> slot kinds in a way that is very different from other data types.
>
> I realize that tsvector uses MCV in the same way that you do but:
>
> 1) I don't like that very much either.
> 2) tsvector is different in that equality (in the btree sense)
> doesn't make sense, whereas it does for arrays.
>
> Using the histogram slot for the array lengths is also very surprising
> to me.
>
> Why not just use a new STA_KIND? It's not like we are running out of
> room, and this will be the second 'container' type that splits the
> container and stores stats about the elements.
>
So, do you think we should collect both btree and frequency/length
statistics for arrays?

> 1) In calc_distr you go to some lengths to avoid round-off errors. Since
> it is certainly just the order of the estimate that matters, why not just
> perform the calculation in log space?
>
It seems to me that I didn't do anything to avoid round-off errors there...
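
For reference, here is a minimal sketch of what "performing the calculation
in log space" generally means. It is not the code of calc_distr from the
patch; the function names and the sample probabilities are assumptions made
up for illustration. The point is only that a product of many small
probabilities can underflow to zero, while the sum of their logarithms
stays representable and still preserves the ordering of estimates.

```python
import math

def product_direct(probs):
    """Multiply the probabilities directly (may underflow to 0.0)."""
    result = 1.0
    for p in probs:
        result *= p
    return result

def log_product(probs):
    """Return log(p1 * p2 * ... * pn), computed as a sum of logarithms."""
    return sum(math.log(p) for p in probs)

# Hypothetical input: 100 independent events, each with probability 1e-5.
probs = [1e-5] * 100

direct = product_direct(probs)   # true value 1e-500 is below the double range
log_value = log_product(probs)   # finite, roughly 100 * log(1e-5)

# Two estimates that are indistinguishable after underflow
# remain comparable in log space.
shorter = log_product([1e-5] * 100)
longer = log_product([1e-5] * 101)
```

Comparing `shorter` and `longer` still orders the two estimates correctly,
whereas the direct products of both inputs collapse to the same value.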

------
With best regards,
Alexander Korotkov.
