Re: gsoc08, text search selectivity, pg_statistics holding an array of a different type

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: Jan Urbański <j(dot)urbanski(at)students(dot)mimuw(dot)edu(dot)pl>, "Postgres - Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: gsoc08, text search selectivity, pg_statistics holding an array of a different type
Date: 2008-05-10 00:26:23
Message-ID: 8729.1210379183@sss.pgh.pa.us
Lists: pgsql-hackers

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:
> Jan Urbański wrote:
>> It is no longer true with the design that I planned to use. The
>> typanalyze function for the tsvector type returns an array of
>> most-frequent lexemes (cstrings actually) from the tsvectors, not an
>> array of tsvectors. The question is: is this approach OK? Should
>> typanalyze functions be able to communicate the type of their result to
>> analyze_rel() ? I'm thinking of extending the VacAttrStats structure, so
>> a typanalyze func could set the proper fields to the proper values.

> Hmm. One idea is to store an array of tsvectors, with only one lexeme in
> each tsvector.

Jan's right: this is an oversight in the design of the VacAttrStats API.
The existing pg_statistics "slot" types all need an array of the same
datatype as the underlying column, but it's obvious when you think about
it that there could be kinds of statistics that need to be stored as an
array of some other type. I'm good with the idea of extending
VacAttrStats for the purpose.
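
To make the shape of such an extension concrete, here is a minimal sketch, not the actual VacAttrStats definition. Every name in it (SketchAttrStats, slot_elem_type, the TYPE_* and KIND_* constants, NUM_SLOTS) is a hypothetical stand-in invented for illustration. The assumption is that each slot carries an optional element type, with zero meaning "same type as the column", so the existing slot kinds keep working unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are the real PostgreSQL definitions. */
typedef uint32_t Oid;

#define NUM_SLOTS 4

enum { TYPE_NONE = 0, TYPE_TSVECTOR = 1, TYPE_CSTRING = 2 };
enum { KIND_UNUSED = 0, KIND_MOST_COMMON_LEXEMES = 1 };

typedef struct SketchAttrStats
{
    Oid column_type;               /* type of the column being analyzed */
    int slot_kind[NUM_SLOTS];      /* KIND_UNUSED if the slot is empty */
    Oid slot_elem_type[NUM_SLOTS]; /* TYPE_NONE means "same as column_type" */
} SketchAttrStats;

/*
 * Element type the caller should use when building the array for a slot.
 * Falls back to the column type when the typanalyze function set nothing,
 * which is the behaviour the existing slot kinds rely on.
 */
static Oid
slot_array_type(const SketchAttrStats *stats, int slot)
{
    assert(slot >= 0 && slot < NUM_SLOTS);
    if (stats->slot_elem_type[slot] != TYPE_NONE)
        return stats->slot_elem_type[slot];
    return stats->column_type;
}

/*
 * What a tsvector typanalyze function might do under this sketch:
 * announce that slot 0 holds lexemes, stored as some other type.
 */
static void
sketch_ts_typanalyze(SketchAttrStats *stats)
{
    stats->slot_kind[0] = KIND_MOST_COMMON_LEXEMES;
    stats->slot_elem_type[0] = TYPE_CSTRING;
}
```

With this layout the caller never needs to know about tsvector specifically: it asks slot_array_type() for each slot and builds the array accordingly. TYPE_CSTRING is used here only because it is the type Jan proposed.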

(Whether it's actually a good idea to store the entries as cstrings is
another question...)

regards, tom lane
