Re: ts_headline

From: Stephen Davies <scldad(at)sdc(dot)com(dot)au>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: ts_headline
Date: 2008-02-22 09:10:39
Message-ID: 200802221940.39577.scldad@sdc.com.au
Lists: pgsql-general pgsql-patches

Not quite:-(

It is the ts_headline with the explicit "english" configuration that "fails"
rather than the implicit "simple".

That's what is so weird.

As you say, the ts_vector has "databas" so the "english" version of
ts_headline should work - but it doesn't. The "simple" version does, despite
the above.

Weird!

Stephen

On Friday 22 February 2008 19:33, Richard Huxton wrote:
> Stephen Davies wrote:
> > OK. The first level explanation is that my default config is "simple".
>
> Aha! Actually, that's the whole explanation.
>
> > This explains the different query results as "english" reduces "database"
> > to "databas" while "simple" does not reduce it at all.
>
> Exactly.
>
> > The "document" is parsed/indexed using "english" explicitly so my queries
> > need to be explicit also (not an issue as all "real" queries are
> > generated rather than typed).
>
> Or change your default configuration to match the one you're using.
>
> > However, I still cannot see a reason for the ts_headline results. If
> > anything, they should be the other way around.
> >
> > I suspect that ts_headline may only work properly when no configuration
> > is specified - regardless of the default setting.
>
> No. What's happening is that your tsvector representation of the
> document (which gets indexed) contains lexemes processed by your
> "english" config. So, it will have something like:
> ... databas: 123, 129, 200 ...
> Of course, when you do a tsquery search with "simple" configuration it
> doesn't do any stemming, so it is actually looking for a lexeme
> called "database", which it can't find.
>
> Since it can't find anything, it falls back to displaying just the start
> of the document. Since the alternative would be to display nothing, that
> makes a certain amount of sense.
>
> To check this, try: ts_headline(t, to_tsquery('simple','databas')) and
> you should get your database results.
>
>
> Moral of the story: if you specify a configuration anywhere, specify it
> everywhere.
>
> Thanks for working through this Stephen - good question specification btw.
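
For anyone following the thread, the configuration mismatch discussed above can be checked with something like the following. This is only a minimal sketch, assuming a PostgreSQL 8.3 or later session; the sample strings are illustrative and are not taken from the original table or queries.

```sql
-- Show which configuration is used when none is given explicitly.
SHOW default_text_search_config;

-- "english" stems the word; "simple" only lower-cases it.
SELECT to_tsvector('english', 'database') AS english_vector,
       to_tsvector('simple',  'database') AS simple_vector;

-- The same difference applies on the query side.
SELECT to_tsquery('english', 'database') AS english_query,
       to_tsquery('simple',  'database') AS simple_query;

-- Passing the configuration explicitly to every call keeps them consistent.
-- The document text here is made up for illustration.
SELECT ts_headline('english',
                   'A database stores data.',
                   to_tsquery('english', 'database'));
```

Comparing the first two results side by side shows whether the lexemes produced at index time and at query time can match at all.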

--
========================================================================
This email is for the person(s) identified above, and is confidential to
the sender and the person(s). No one else is authorised to use or
disseminate this email or its contents.

Stephen Davies Consulting Voice: 08-8177 1595
Adelaide, South Australia. Fax: 08-8177 0133
Computing & Network solutions. Mobile:0403 0405 83
