Lists: | pgsql-performance |
---|
From: | Benoit Delbosc <bdelbosc(at)nuxeo(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Slow fulltext query plan |
Date: | 2012-04-12 22:09:23 |
Message-ID: | 4F875293.1000100@nuxeo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Hi,
I would like to understand why the following query execution don't use
any fulltext indexes
and takes more than 300s (using lot of temporary files):
EXPLAIN ANALYZE SELECT hierarchy.id
FROM hierarchy
JOIN fulltext ON fulltext.id = hierarchy.id,
TO_TSQUERY('whatever') query1,
TO_TSQUERY('whatever') query2
WHERE (query1 @@ nx_to_tsvector(fulltext.fulltext)) OR (query2 @@
nx_to_tsvector(fulltext.fulltext_title));
The query plan is here:
http://explain.depesz.com/s/YgP
While if I replace the query2 by query1 in the second clause:
EXPLAIN ANALYZE SELECT hierarchy.id
FROM hierarchy
JOIN fulltext ON fulltext.id = hierarchy.id,
TO_TSQUERY('whatever') query1,
TO_TSQUERY('whatever') query2
WHERE (query1 @@ nx_to_tsvector(fulltext.fulltext)) OR (query1 @@
nx_to_tsvector(fulltext.fulltext_title));
It is 5 order of magniude faster (15ms) using the gin indexes:
http://explain.depesz.com/s/RLa
The nx_to_tsvector is an immutable function with the following code:
SELECT TO_TSVECTOR('english', SUBSTR($1, 1, 250000))
Here is the list of indexes:
hierarchy: "hierarchy_pk" PRIMARY KEY, btree (id)
fulltext: "fulltext_fulltext_idx" gin
(nx_to_tsvector(fulltext::character varying))
fulltext: "fulltext_fulltext_title_idx" gin
(nx_to_tsvector(fulltext_title::character varying))
fulltext and fulltext_title are text type.
And some PostgreSQL configuration:
PostgreSQL 9.1.2 on x86_64-unknown-linux-gnu
shared_buffers: 4GB
effective_cache_size: 10GB
work_mem: 20MB
Thanks for your work and enlightenment
ben
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Benoit Delbosc <bdelbosc(at)nuxeo(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Slow fulltext query plan |
Date: | 2012-04-12 22:25:59 |
Message-ID: | 6273.1334269559@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Benoit Delbosc <bdelbosc(at)nuxeo(dot)com> writes:
> EXPLAIN ANALYZE SELECT hierarchy.id
> FROM hierarchy
> JOIN fulltext ON fulltext.id = hierarchy.id,
> TO_TSQUERY('whatever') query1,
> TO_TSQUERY('whatever') query2
> WHERE (query1 @@ nx_to_tsvector(fulltext.fulltext)) OR (query2 @@
> nx_to_tsvector(fulltext.fulltext_title));
Is there a reason why you're writing the query in such a
non-straightforward way, rather than just
EXPLAIN ANALYZE SELECT hierarchy.id
FROM hierarchy
JOIN fulltext ON fulltext.id = hierarchy.id
WHERE (TO_TSQUERY('whatever') @@ nx_to_tsvector(fulltext.fulltext))
OR (TO_TSQUERY('whatever') @@ nx_to_tsvector(fulltext.fulltext_title));
?
regards, tom lane
From: | Benoit Delbosc <bdelbosc(at)nuxeo(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Slow fulltext query plan |
Date: | 2012-04-12 22:56:11 |
Message-ID: | 4F875D8B.6020802@nuxeo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
On 13/04/2012 00:25, Tom Lane wrote:
> Benoit Delbosc<bdelbosc(at)nuxeo(dot)com> writes:
>> EXPLAIN ANALYZE SELECT hierarchy.id
>> FROM hierarchy
>> JOIN fulltext ON fulltext.id = hierarchy.id,
>> TO_TSQUERY('whatever') query1,
>> TO_TSQUERY('whatever') query2
>> WHERE (query1 @@ nx_to_tsvector(fulltext.fulltext)) OR (query2 @@
>> nx_to_tsvector(fulltext.fulltext_title));
> Is there a reason why you're writing the query in such a
> non-straightforward way, rather than just
>
> EXPLAIN ANALYZE SELECT hierarchy.id
> FROM hierarchy
> JOIN fulltext ON fulltext.id = hierarchy.id
> WHERE (TO_TSQUERY('whatever') @@ nx_to_tsvector(fulltext.fulltext))
> OR (TO_TSQUERY('whatever') @@ nx_to_tsvector(fulltext.fulltext_title));
>
> ?
>
This query is written by a framework, also I thought that is a common
pattern that can be found in the documentation:
http://www.postgresql.org/docs/9.1/interactive/textsearch-controls.html
if you think this a wrong way to do it then I will try to fix the framework.
btw your version takes 15ms :)
Thanks
ben
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Benoit Delbosc <bdelbosc(at)nuxeo(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Slow fulltext query plan |
Date: | 2012-04-14 01:16:43 |
Message-ID: | 772.1334366203@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Benoit Delbosc <bdelbosc(at)nuxeo(dot)com> writes:
> On 13/04/2012 00:25, Tom Lane wrote:
>> Is there a reason why you're writing the query in such a
>> non-straightforward way, rather than just
>>
>> EXPLAIN ANALYZE SELECT hierarchy.id
>> FROM hierarchy
>> JOIN fulltext ON fulltext.id = hierarchy.id
>> WHERE (TO_TSQUERY('whatever') @@ nx_to_tsvector(fulltext.fulltext))
>> OR (TO_TSQUERY('whatever') @@ nx_to_tsvector(fulltext.fulltext_title));
> This query is written by a framework, also I thought that is a common
> pattern that can be found in the documentation:
> http://www.postgresql.org/docs/9.1/interactive/textsearch-controls.html
Well, "common pattern" would be stretching it. Anyway I've concluded
that this is in fact a planner bug. There will be a fix in 9.2, but I'm
not going to take the risk of back-patching it, so you might want to
think about changing that framework.
regards, tom lane