Lists: | pgsql-performance |
---|
From: | Hervé Piedvache <footcow(at)noos(dot)fr> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | TSearch2 and optimisation ... |
Date: | 2004-08-25 22:48:32 |
Message-ID: | 200408260048.32274.footcow@noos.fr |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Hi,
I'm a little beginner with Tsearch2 ....
I have simples tables like this :
# \d article
Table "public.article"
Column | Type |
Modifiers
------------+-----------------------------+-----------------------------------------------------------------
id | integer | not null default
nextval('public.article_rss_id_rss_seq'::text)
id_site | integer | not null
title | text |
url | text |
desc | text |
r_date | timestamp without time zone | default now()
r_update | timestamp without time zone | default now()
idxfti | tsvector |
Indexes:
"article_id_key" unique, btree (id)
"idxfti_idx" gist (idxfti)
"ix_article_update" btree (r_update)
"ix_article_url" btree (url)
"ix_id_site" btree (id_site)
Foreign-key constraints:
"$1" FOREIGN KEY (id_site) REFERENCES site (id_site)
Triggers:
tsvectorupdate BEFORE INSERT OR UPDATE ON article FOR EACH ROW EXECUTE
PROCEDURE tsearch2('idxfti', 'title', 'desc')
# \d site_rss
Table "public.site"
Column | Type | Modifiers
--------------+---------+---------------------------------------------------------------
id_site | integer | not null default
nextval('public.site_id_site_seq'::text)
site_name | text |
site_url | text |
url | text |
language | text |
datecrea | date | default now()
id_category | integer |
time_refresh | integer |
active | integer |
error | integer |
Indexes:
"site_id_site_key" unique, btree (id_site)
"ix_site_id_category" btree (id_category)
"ix_site_url" btree (url)
# \d user_choice
Table "public.user_choice"
Column | Type | Modifiers
---------+---------+-----------
id_user | integer |
id_site | integer |
Indexes:
"ix_user_choice_all" unique, btree (id_user, id_site)
I have done a simple request, looking for title or description having Postgres
inside order by rank and date, like this :
SELECT a.title, a.id, a.url, to_char(a.r_date, 'DD/MM/YYYY HH24:MI:SS') as dt,
s.site_name, s.id_site, case when exists (select id_user from user_choice u
where u.id_site=s.id_site and u.id_user = 1) then 1 else 0 end as bookmarked
FROM article a, site s
WHERE s.id_site = a.id_site
AND idxfti @@ to_tsquery('postgresql')
ORDER BY rank(idxfti, to_tsquery('postgresql')) DESC, a.r_date DESC;
The request takes about 4 seconds ... I have about 1 400 000 records in
article and 36 000 records in site table ... it's a Bi-Pentium III 933 MHz
server with 1 Gb memory ... I'm using Postgresql 7.4.5
For me this result is very very slow I really need a quicker result with less
than 1 second ...
The next time I call the same request I have got the result in 439 ms ... but
If I replace "Postgresql" in my find with "Linux" for example I will get the
next result in 5 seconds ... :o(
Is it a bad use of Tsearch2 ... or a bad table structure ... or from my
request ... ? I have no idea how to optimise this ...
Explain gives me this result :
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Sort (cost=10720.91..10724.29 rows=1351 width=191)
Sort Key: rank(a.idxfti, '\'postgresql\''::tsquery), a.r_date
-> Merge Join (cost=4123.09..10650.66 rows=1351 width=191)
Merge Cond: ("outer".id_site = "inner".id_site)
-> Index Scan using site_id_site_key on site s (cost=0.00..2834.96
rows=35705 width=28)
-> Sort (cost=4123.09..4126.47 rows=1351 width=167)
Sort Key: a.id_site
-> Index Scan using idxfti_idx on article a
(cost=0.00..4052.84 rows=1351 width=167)
Index Cond: (idxfti @@ '\'postgresql\''::tsquery)
Filter: (idxfti @@ '\'postgresql\''::tsquery)
SubPlan
-> Seq Scan on user_choice u (cost=0.00..2.69 rows=1 width=4)
Filter: ((id_site = $0) AND (id_user = 1))
(13 rows)
Any idea are well done ;o)
Regards,
--
Bill Footcow
From: | Josh Berkus <josh(at)agliodbs(dot)com> |
---|---|
To: | Hervé Piedvache <footcow(at)noos(dot)fr>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: TSearch2 and optimisation ... |
Date: | 2004-08-25 23:50:53 |
Message-ID: | 200408251650.53233.josh@agliodbs.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Herve'
> The request takes about 4 seconds ... I have about 1 400 000 records in
> article and 36 000 records in site table ... it's a Bi-Pentium III 933 MHz
> server with 1 Gb memory ... I'm using Postgresql 7.4.5
> For me this result is very very slow I really need a quicker result with
> less than 1 second ...
> The next time I call the same request I have got the result in 439 ms ...
> but If I replace "Postgresql" in my find with "Linux" for example I will
> get the next result in 5 seconds ... :o(
Hmmm. It sounds like your system is unable to keep all of the data cached in
memory. What else do you have going on on that machine?
> Explain gives me this result :
Please do "EXPLAIN ANALYZE" so that we can see where time is actually spent.
--
Josh Berkus
Aglio Database Solutions
San Francisco
From: | Hervé Piedvache <herve(at)elma(dot)fr> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Cc: | Josh Berkus <josh(at)agliodbs(dot)com>, Hervé Piedvache <footcow(at)noos(dot)fr> |
Subject: | Re: TSearch2 and optimisation ... |
Date: | 2004-08-26 07:22:01 |
Message-ID: | 200408260922.01797.herve@elma.fr |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Josh,
Le Jeudi 26 Août 2004 01:50, Josh Berkus a écrit :
> > The request takes about 4 seconds ... I have about 1 400 000 records in
> > article and 36 000 records in site table ... it's a Bi-Pentium III 933
> > MHz server with 1 Gb memory ... I'm using Postgresql 7.4.5
> > For me this result is very very slow I really need a quicker result with
> > less than 1 second ...
> > The next time I call the same request I have got the result in 439 ms ...
> > but If I replace "Postgresql" in my find with "Linux" for example I will
> > get the next result in 5 seconds ... :o(
>
> Hmmm. It sounds like your system is unable to keep all of the data cached
> in memory. What else do you have going on on that machine?
There is an Apache + PHP running in same time ...
> > Explain gives me this result :
>
> Please do "EXPLAIN ANALYZE" so that we can see where time is actually
> spent.
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Sort (cost=10740.35..10743.73 rows=1351 width=190) (actual
time=7054.603..7054.707 rows=139 loops=1)
Sort Key: rank(a.idxfti, '\'postgresql\''::tsquery), a.r_date
-> Merge Join (cost=4123.09..10670.10 rows=1351 width=190) (actual
time=5476.749..7052.766 rows=139 loops=1)
Merge Cond: ("outer".id_site = "inner".id_site)
-> Index Scan using site_id_site_key on site s (cost=0.00..2846.52
rows=35705 width=28) (actual time=43.985..1548.903 rows=34897 loops=1)
-> Sort (cost=4123.09..4126.47 rows=1351 width=166) (actual
time=5416.836..5416.983 rows=139 loops=1)
Sort Key: a.id_site
-> Index Scan using idxfti_idx on article a
(cost=0.00..4052.84 rows=1351 width=166) (actual time=109.766..5415.108
rows=139 loops=1)
Index Cond: (idxfti @@ '\'postgresql\''::tsquery)
Filter: (idxfti @@ '\'postgresql\''::tsquery)
SubPlan
-> Seq Scan on user_choice u (cost=0.00..2.69 rows=1 width=4)
(actual time=0.146..0.146 rows=0 loops=139)
Filter: ((id_site = $0) AND (id_user = 1))
Total runtime: 7056.126 ms
Thanks for your help ...
--
Hervé Piedvache
Elma Ingénierie Informatique
6 rue du Faubourg Saint-Honoré
F-75008 - Paris - France
Pho. 33-144949901
Fax. 33-144949902
From: | Josh Berkus <josh(at)agliodbs(dot)com> |
---|---|
To: | Hervé Piedvache <herve(at)elma(dot)fr>, pgsql-performance(at)postgresql(dot)org |
Cc: | Hervé Piedvache <footcow(at)noos(dot)fr> |
Subject: | Re: TSearch2 and optimisation ... |
Date: | 2004-08-26 17:48:58 |
Message-ID: | 200408261048.58726.josh@agliodbs.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Herve'
> (cost=0.00..4052.84 rows=1351 width=166) (actual time=109.766..5415.108
> rows=139 loops=1)
> Index Cond: (idxfti @@ '\'postgresql\''::tsquery)
> Filter: (idxfti @@ '\'postgresql\''::tsquery)
From this, it looks like your FTI index isn't fitting in your sort_mem.
What's sort_mem at now? Can you increase it?
Overall, though, I'm not sure you can get this sub-1s without a faster
machine. Although I'm doing FTI on about 25MB of FTI text on a
single-processor machine, and getting 40ms response times, so maybe we can
...
--
Josh Berkus
Aglio Database Solutions
San Francisco
From: | Hervé Piedvache <footcow(at)noos(dot)fr> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Cc: | Josh Berkus <josh(at)agliodbs(dot)com>, Hervé Piedvache <herve(at)elma(dot)fr> |
Subject: | Re: TSearch2 and optimisation ... |
Date: | 2004-08-26 19:30:40 |
Message-ID: | 200408262130.40500.footcow@noos.fr |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-performance |
Le Jeudi 26 Août 2004 19:48, Josh Berkus a écrit :
> Herve'
>
> > (cost=0.00..4052.84 rows=1351 width=166) (actual time=109.766..5415.108
> > rows=139 loops=1)
> > Index Cond: (idxfti @@ '\'postgresql\''::tsquery)
> > Filter: (idxfti @@ '\'postgresql\''::tsquery)
> >
> >From this, it looks like your FTI index isn't fitting in your sort_mem.
>
> What's sort_mem at now? Can you increase it?
shared_buffers = 3000
sort_mem = 10240
> Overall, though, I'm not sure you can get this sub-1s without a faster
> machine. Although I'm doing FTI on about 25MB of FTI text on a
> single-processor machine, and getting 40ms response times, so maybe we can
> ...
Sorry I missed understand what you mean here ...
You tell me to upgrade the hardware but you manage a 25 Mb with a single
processor ?? What you mean ?
My database is about 450 Mb ...
Regards,
--
Bill Footcow