From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: collect frequency statistics for arrays
Date: 2011-06-13 19:10:36
Message-ID: BANLkTikOowSvYoZWUE8b4uS7JdOZ=A-y4w@mail.gmail.com
Lists: pgsql-hackers
On Mon, Jun 13, 2011 at 8:16 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> If the data type is hashable, you could consider building a hash table
> on the MCVs and then do a probe for each element in the array. I
> think that's better than the other way around because there can't be
> more than 10k MCVs, whereas the input constant could be arbitrarily
> long. I'm not entirely sure whether this case is important enough to
> be worth spending a lot of code on, but then again it might not be
> that much code.
>
Unfortunately, the most time-consuming operation isn't the element
comparisons; it is the complex computations in the calc_distr function.
> Another option is to bound the number of operations you're willing to
> perform to some reasonable limit, say, 10 * default_statistics_target.
> Work out ceil((10 * default_statistics_target) /
> number-of-elements-in-const) and consider at most that many MCVs.
> When this limit kicks in you'll get a less-accurate selectivity
> estimate, but that's a reasonable price to pay for not blowing out
> planning time.
Good option. I'm going to add such a condition to my patch.
------
With best regards,
Alexander Korotkov.