Lists: | pgsql-sql |
---|
From: | <ogjunk-pgjedan(at)yahoo(dot)com> |
---|---|
To: | pgsql-sql(at)postgresql(dot)org |
Subject: | Trimming the cost of ORDER BY in a simple query |
Date: | 2005-05-03 04:35:31 |
Message-ID: | 20050503043531.23516.qmail@web31106.mail.mud.yahoo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
Hello,
I have a simple query with a pretty high cost (EXPLAIN ...), and I'm
wondering if I can somehow trim it.
Query (shows the last 7 dates):
=> SELECT DISTINCT date_part('year', uu.add_date), date_part('month',
uu.add_date), date_part('day', uu.add_date) FROM user_url uu WHERE
uu.user_id=1 ORDER BY date_part('year', uu.add_date) DESC,
date_part('month', uu.add_date) DESC, date_part('day', uu.add_date)
DESC LIMIT 7;
QUERY PLAN:
---------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=4510.14..4522.93 rows=2 width=8) (actual
time=19.924..20.160 rows=7 loops=1)
-> Unique (cost=4510.14..4522.93 rows=2 width=8) (actual
time=19.919..20.139 rows=7 loops=1)
-> Sort (cost=4510.14..4513.34 rows=1279 width=8) (actual
time=19.915..20.004 rows=78 loops=1)
Sort Key: date_part('year'::text, add_date),
date_part('month'::text, add_date), date_part('day'::text, add_date)
-> Index Scan using foo on user_url uu
(cost=0.00..4444.14 rows=1279 width=8) (actual time=0.095..14.761
rows=1225 loops=1)
Index Cond: (user_id = 1)
Total runtime: 20.313 ms
(7 rows)
It looks like the cost is all in ORDER BY, and if I remove ORDER BY the
execution time goes from 20-90 ms to less than 1 ms.
I do need the 7 most recent add_dates. Is there a more efficient way
of grabbing them?
Thanks,
Otis
From: | Mischa Sandberg <mischa(dot)sandberg(at)telus(dot)net> |
---|---|
To: | ogjunk-pgjedan(at)yahoo(dot)com |
Cc: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: Trimming the cost of ORDER BY in a simple query |
Date: | 2005-05-03 06:31:05 |
Message-ID: | 1115101865.42771aa9ee0d3@webmail.telus.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
Quoting ogjunk-pgjedan(at)yahoo(dot)com:
> I have a simple query with a pretty high cost (EXPLAIN ...), and I'm
> wondering if I can somehow trim it.
>
> Query (shows the last 7 dates):
>
> => SELECT DISTINCT date_part('year', uu.add_date),
> date_part('month',
> uu.add_date), date_part('day', uu.add_date) FROM user_url uu
> WHERE
> uu.user_id=1 ORDER BY date_part('year', uu.add_date) DESC,
> date_part('month', uu.add_date) DESC, date_part('day',
> uu.add_date)
> DESC LIMIT 7;
>
> QUERY PLAN:
>
---------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=4510.14..4522.93 rows=2 width=8) (actual
> time=19.924..20.160 rows=7 loops=1)
> -> Unique (cost=4510.14..4522.93 rows=2 width=8) (actual
> time=19.919..20.139 rows=7 loops=1)
> -> Sort (cost=4510.14..4513.34 rows=1279 width=8) (actual
> time=19.915..20.004 rows=78 loops=1)
> Sort Key: date_part('year'::text, add_date),
> date_part('month'::text, add_date), date_part('day'::text, add_date)
> -> Index Scan using foo on user_url uu
> (cost=0.00..4444.14 rows=1279 width=8) (actual time=0.095..14.761
> rows=1225 loops=1)
> Index Cond: (user_id = 1)
> Total runtime: 20.313 ms
> (7 rows)
>
>
> It looks like the cost is all in ORDER BY, and if I remove ORDER BY
> the
> execution time goes from 20-90 ms to less than 1 ms.
>
> I do need the 7 most recent add_dates. Is there a more efficient
> way
> of grabbing them?
The query analyzer is using the sort to detect and return distinct
values, as well. So there's not much point in trying to remove it.
From: | Ragnar Hafstað <gnari(at)simnet(dot)is> |
---|---|
To: | ogjunk-pgjedan(at)yahoo(dot)com |
Cc: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: Trimming the cost of ORDER BY in a simple query |
Date: | 2005-05-03 07:22:18 |
Message-ID: | 1115104938.5853.25.camel@localhost.localdomain |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
On Mon, 2005-05-02 at 21:35 -0700, ogjunk-pgjedan(at)YAHOO(dot)COM wrote:
> Query (shows the last 7 dates):
>
> => SELECT DISTINCT date_part('year', uu.add_date), date_part('month',
> uu.add_date), date_part('day', uu.add_date) FROM user_url uu WHERE
> uu.user_id=1 ORDER BY date_part('year', uu.add_date) DESC,
> date_part('month', uu.add_date) DESC, date_part('day', uu.add_date)
> DESC LIMIT 7;
>
i assume add_date is a timestamp, because otherwise you could
ORDER by add_date, and use an index on (desc, add_date):
SELECT DISTINCT .... WHERE uu.user_id=1
ORDER BY uu.user_id DESC ,uu.add_date DESC LIMIT 7;
this will not work for a timestamp add_date
> QUERY PLAN:
...
> Total runtime: 20.313 ms
20 ms does not sound like a big problem.
if 20 ms is really too much for you, and add_date IS a timestamp,
you might think about adding a date column , maintained
by triggers of by your application, and add an index on (user_id, x)
another possibility (total guess) is a functional index
create index func_id_date on user_url(user_id,(add_date::date));
SELECT ... ORDER BY uu.user_id DESC ,uu.add_date::date DESC LIMIT 7;
I have no idea if this will work, or if the planner will use such
an index.
gnari
From: | Ezequiel Tolnay <mail(at)etolnay(dot)com(dot)ar> |
---|---|
To: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: Trimming the cost of ORDER BY in a simple query |
Date: | 2005-05-03 08:08:23 |
Message-ID: | d57bhc$2nfr$1@news.hub.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
Ragnar Hafstað wrote:
> another possibility (total guess) is a functional index
> create index func_id_date on user_url(user_id,(add_date::date));
> SELECT ... ORDER BY uu.user_id DESC ,uu.add_date::date DESC LIMIT 7;
> I have no idea if this will work, or if the planner will use such
> an index.
To make sure the index is used you could build the query using a
subquery like this:
SELECT DISTINCT date_part('year', add_date), date_part('month',
add_date), date_part('day', add_date)
FROM (
SELECT uu.add_date::date as add_date
FROM user_url uu
GROUP BY uu.user_id, uu.add_date
HAVING uu.user_id=1
ORDER BY uu.add_date::date DESC
LIMIT 7
) x;
Perhaps a select distinct instead of the group by in the subquery would
make use the index?
Cheers,
Ezequiel Tolnay
From: | Harald Fuchs <use_reply_to(at)protecting(dot)net> |
---|---|
To: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: Trimming the cost of ORDER BY in a simple query |
Date: | 2005-05-03 10:59:48 |
Message-ID: | puu0lk60d7.fsf@srv.protecting.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-sql |
In article <1115104938(dot)5853(dot)25(dot)camel(at)localhost(dot)localdomain>,
Ragnar Hafstað <gnari(at)simnet(dot)is> writes:
> 20 ms does not sound like a big problem.
> if 20 ms is really too much for you, and add_date IS a timestamp,
> you might think about adding a date column , maintained
> by triggers of by your application, and add an index on (user_id, x)
Maybe a functional index on "date_trunc('day', add_date)" would be helpful?