Re: significant slow down with various LIMIT

Lists: pgsql-performance
From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: <andrey(dot)perliev(at)gmail(dot)com>,<pgsql-performance(at)postgresql(dot)org>
Subject: Re: significant slow down with various LIMIT
Date: 2010-04-12 12:09:08
Message-ID: 4BC2C7140200002500030705@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

[rearranging to put related information together]

norn

Since the LIMIT 3 and LIMIT 4 queries generated exactly the same
plan, the increased time for LIMIT 4 suggests that there are 3
matching rows which are near the end of the index it is scanning, but
the fourth one is much farther in.

Since what you're showing suggests that the active portion of your
data is heavily cached, you might benefit from decreasing
random_page_cost, and possibly also seq_page_cost.

> 8GB RAM

> effective_cache_size = 1536MB

> Please also note that this hardware isn't dedicated DB server, but
> also serve as web server and file server.

Even with those other uses, you're likely to actually be using 6 GB
or 7 GB for cache. I'd set effective_cache_size in that range.

> max_connections = 250
> work_mem = 128MB

While probably not related to this problem, that's a dangerous
combination. What if all 250 connections are active with a query
which uses work_mem memory? A single connection can actually be
using several work_mem allocations at once.

> 2 SATA 750GB (pg db installed in software RAID 0)

You do realize that if either drive dies you lose all your data on
that pair of drives, right? I hope the value of the data and well
tested backup procedures keeps the loss to something which is
acceptable.

> I have about 3 million rows in core_object, 1.5 million in
> plugin_plugin_addr and 1.5 million in plugins_guide_address.
> When there were 300 000+ objects queries works perfectly, but as db
> enlarge things go worse...

With a relational database, it's not unusual for the most efficient
plan to depend on the quantity of data in the tables. It is
important that your statistics are kept up-to-date so that plans can
adapt to the changing table sizes or data distributions. The
effective_cache_size and cost parameters are also used to calculate
the costs of various plans, so adjusting those may help the optimizer
make good choices.

-Kevin


From: norn <andrey(dot)perliev(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: significant slow down with various LIMIT
Date: 2010-04-12 13:32:02
Message-ID: f4ecdf51-a74f-45f7-9b6d-87244787ca6b@x3g2000yqd.googlegroups.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Kevin,
I appreciate your help very much!

> Since the LIMIT 3 and LIMIT 4 queries generated exactly the same
> plan, the increased time for LIMIT 4 suggests that there are 3
> matching rows which are near the end of the index it is scanning, but
> the fourth one is much farther in.
Yes, you are right, I checked id of the rows and found that problem
occurred when one of id is 1377077 and next one is 132604.
This table was modified with several new rows and the problem now
happens between limit 4 and 5, this is another evidence of your
rightness!

Followed by your advices I set the following:
effective_cache_size=6144MB
random_page_cost=0.25
seq_page_cost = 0.25
max_connections = 50 # thanks for pointing! I don't really need so
much

> > 2 SATA 750GB (pg db installed in software RAID 0)
>
> You do realize that if either drive dies  you lose all your data on
> that pair of drives, right?  I hope the value of the data and well
> tested backup procedures keeps the loss to something which is
> acceptable.
Thanks for attention! I have some regular backup procedures already,
so there are no extra risk related to drive failure...

I restarted Postgresql with new settings and got no performance
improvements in this particular query...
Do you have ideas how much random_page_cost and seq_page_cost should
be decreased?
Also I wish to notice, that I made REINDEX DATABASE while tried to
solve the problem by myself.. this doesn't help...


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "norn" <andrey(dot)perliev(at)gmail(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: significant slow down with various LIMIT
Date: 2010-04-12 21:28:36
Message-ID: 4BC34A340200002500030759@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

norn <andrey(dot)perliev(at)gmail(dot)com> wrote:

> I restarted Postgresql with new settings and got no performance
> improvements in this particular query...

The cost settings help the optimizer make good decisions about plan
choice. I guess I don't have much reason to believe, at this point,
that there is a better plan for it to choose for this query. Do you
think you see one? What would that be? (We might be able to force
that plan and find out if you're right, which can be a valuable
diagnostic step, even if the way it gets forced isn't a
production-quality solution.)

Are you able to share the table descriptions? (That might help us
suggest an index or some such which might help.)

> Do you have ideas how much random_page_cost and seq_page_cost
> should be decreased?

It really depends on how much of your active data set is cached. If
it is effectively fully cached, you might want to go to 0.01 for
both (or even lower). Many of our databases perform best with
seq_page_cost = 1 and random_page_cost = 2. With some, either of
those "extremes" causes some queries to optimize poorly, and we've
had luck with 0.3 and 0.5. This is one worth testing with your
workload, because you can make some queries faster at the expense of
others; sometimes it comes down to which needs better response time
to keep your users happy.

-Kevin


From: norn <andrey(dot)perliev(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: significant slow down with various LIMIT
Date: 2010-04-13 06:07:19
Message-ID: e1a8a6b8-d5c3-4bf1-8df7-007e5950b698@q23g2000yqd.googlegroups.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Apr 13, 5:28 am, Kevin(dot)Gritt(dot)(dot)(dot)(at)wicourts(dot)gov ("Kevin Grittner")
wrote:
> The cost settings help the optimizer make good decisions about plan
> choice.  I guess I don't have much reason to believe, at this point,
> that there is a better plan for it to choose for this query.  Do you
> think you see one?  What would that be?  (We might be able to force
> that plan and find out if you're right, which can be a valuable
> diagnostic step, even if the way it gets forced isn't a
> production-quality solution.)
I have no deep knowledge of Postgresql, so I've no idea which plan is
the best, but I am wondering why there are so big gap between two
limits and how to avoid this...

> Are you able to share the table descriptions?  (That might help us
> suggest an index or some such which might help.)
sure, here it is

# \d core_object
Table "public.core_object"
Column | Type |
Modifiers
-----------+---------
+----------------------------------------------------------
id | integer | not null default
nextval('core_object_id_seq'::regclass)
typeid_id | integer | not
null
Indexes:
"core_object_pkey" PRIMARY KEY, btree
(id)
"core_object_pkey_desc" btree (id
DESC)
"core_object_typeid_id" btree
(typeid_id)
Foreign-key
constraints:
"core_object_typeid_id_fkey" FOREIGN KEY (typeid_id) REFERENCES
core_obj_typeset(id) DEFERRABLE INITIALLY DEFERRED
Referenced by:
TABLE "plugins_plugin_addr" CONSTRAINT
"plugins_plugin_addr_oid_id_fkey" FOREIGN KEY (oid_id) REFERENCES
core_object(id) DEFERRABLE INITIALLY
DEFERRED
...and many others, so I skipped as irrelevant....

# \d plugins_plugin_addr
Table "public.plugins_plugin_addr"
Column | Type | Modifiers
---------------+---------
+------------------------------------------------------------------
id | integer | not null default
nextval('plugins_plugin_addr_id_seq'::regclass)
oid_id | integer | not null
sub_attrib_id | integer | not null
address_id | integer | not null
Indexes:
"plugins_plugin_addr_pkey" PRIMARY KEY, btree (id)
"plugins_plugin_addr_sub_attrib_id_key" UNIQUE, btree
(sub_attrib_id)
"plugins_plugin_addr_address_id" btree (address_id)
"plugins_plugin_addr_oid_id" btree (oid_id)
Foreign-key constraints:
"plugins_plugin_addr_address_id_fkey" FOREIGN KEY (address_id)
REFERENCES plugins_guide_address(id) DEFERRABLE INITIALLY DEFERRED
"plugins_plugin_addr_oid_id_fkey" FOREIGN KEY (oid_id) REFERENCES
core_object(id) DEFERRABLE INITIALLY DEFERRED
"plugins_plugin_addr_sub_attrib_id_fkey" FOREIGN KEY
(sub_attrib_id) REFERENCES plugins_sub_attrib(id) DEFERRABLE INITIALLY
DEFERRED

# \d plugins_guide_address
Table
"public.plugins_guide_address"
Column | Type |
Modifiers
--------------+------------------------
+--------------------------------------------------------------------
id | integer | not null default
nextval('plugins_guide_address_id_seq'::regclass)
country_id | integer |
region_id | integer |
city_id | integer |
zip_id | integer |
street_id | integer |
house | character varying(20) |
district_id | integer |
code | character varying(23) |
significance | smallint |
alias_fr | character varying(300) |
alias_ru | character varying(300) |
alias_en | character varying(300) |
alias_de | character varying(300) |
alias_it | character varying(300) |
alias_len | smallint |
Indexes:
"plugins_guide_address_pkey" PRIMARY KEY, btree (id)
"plugins_guide_address_uniq" UNIQUE, btree (country_id, region_id,
district_id, city_id, street_id, house)
"plugins_guide_address_alias_ru" btree (alias_ru)
"plugins_guide_address_city_id" btree (city_id)
"plugins_guide_address_code" btree (code)
"plugins_guide_address_country_id" btree (country_id)
"plugins_guide_address_district_id" btree (district_id)
"plugins_guide_address_house" btree (house)
"plugins_guide_address_house_upper" btree (upper(house::text))
"plugins_guide_address_region_id" btree (region_id)
"plugins_guide_address_significance" btree (significance)
"plugins_guide_address_street_id" btree (street_id)
"plugins_guide_address_zip_id" btree (zip_id)
Foreign-key constraints:
"plugins_guide_address_city_id_fkey" FOREIGN KEY (city_id)
REFERENCES plugins_guide_city(id) DEFERRABLE INITIALLY DEFERRED
"plugins_guide_address_country_id_fkey" FOREIGN KEY (country_id)
REFERENCES plugins_guide_country(id) DEFERRABLE INITIALLY DEFERRED
"plugins_guide_address_district_id_fkey" FOREIGN KEY (district_id)
REFERENCES plugins_guide_district(id) DEFERRABLE INITIALLY DEFERRED
"plugins_guide_address_region_id_fkey" FOREIGN KEY (region_id)
REFERENCES plugins_guide_region(id) DEFERRABLE INITIALLY DEFERRED
"plugins_guide_address_street_id_fkey" FOREIGN KEY (street_id)
REFERENCES plugins_guide_street(id) DEFERRABLE INITIALLY DEFERRED
"plugins_guide_address_zip_id_fkey" FOREIGN KEY (zip_id)
REFERENCES plugins_guide_zip(id) DEFERRABLE INITIALLY DEFERRED
Referenced by:
TABLE "plugins_guide_ziphelper" CONSTRAINT
"plugins_guide_ziphelper_address_id_fkey" FOREIGN KEY (address_id)
REFERENCES plugins_guide_address(id) DEFERRABLE INITIALLY DEFERRED
TABLE "plugins_plugin_addr" CONSTRAINT
"plugins_plugin_addr_address_id_fkey" FOREIGN KEY (address_id)
REFERENCES plugins_guide_address(id) DEFERRABLE INITIALLY DEFERRED

------------end---------------


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "norn" <andrey(dot)perliev(at)gmail(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: significant slow down with various LIMIT
Date: 2010-04-13 17:24:18
Message-ID: 4BC4627202000025000307E5@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

norn <andrey(dot)perliev(at)gmail(dot)com> wrote:

> I am wondering why there are so big gap between two limits and how
> to avoid this...

I think we've already established that it is because of the
percentage of the table which must be scanned to get to the desired
number of rows. The problem is exacerbated by the fact that it's a
"backward" scan on the index, which is slower than a forward scan --
mainly because disks spin in one direction, and the spacing of the
sectors is optimized for forward scans.

There are a couple things to try which will give a more complete
picture of what might work to make the run time more predictable.
Please try these, and run EXPLAIN ANALYZE of your problem query each
way.

(1) Try it without the ORDER BY clause and the LIMIT.

(2) Temporarily take that top index out of consideration. (Don't
worry, it'll come back when you issue the ROLLBACK -- just don't
forget the BEGIN statement.)

BEGIN;
DROP INDEX plugins_plugin_addr_oid_id;
explain analyze <your query>
ROLLBACK;

(3) Try it like this (untested, so you may need to fix it up):

explain analyze
SELECT core_object.id
from (SELECT id, city_id FROM "plugins_guide_address")
"plugins_guide_address"
JOIN "plugins_plugin_addr"
ON ("plugins_plugin_addr"."address_id"
= "plugins_guide_address"."id")
JOIN "core_object"
ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
WHERE "plugins_guide_address"."city_id" = 4535
ORDER BY "core_object"."id" DESC
LIMIT 4 -- or whatever it normally takes to cause the problem
;

-Kevin


From: Chris Bowlby <excalibur(at)accesswave(dot)ca>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: significant slow down with various LIMIT
Date: 2010-04-13 17:59:55
Message-ID: 4BC4B11B.6090106@accesswave.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

I'm also wondering if a re-clustering of the table would work based on
the index that's used.

such that:

CLUSTER core_object USING plugins_plugin_addr_oid_id;

and see if that makes any change in the differences that your seeing.

On 04/13/2010 02:24 PM, Kevin Grittner wrote:
> norn <andrey(dot)perliev(at)gmail(dot)com> wrote:
>
>
>> I am wondering why there are so big gap between two limits and how
>> to avoid this...
>>
>
> I think we've already established that it is because of the
> percentage of the table which must be scanned to get to the desired
> number of rows. The problem is exacerbated by the fact that it's a
> "backward" scan on the index, which is slower than a forward scan --
> mainly because disks spin in one direction, and the spacing of the
> sectors is optimized for forward scans.
>
> There are a couple things to try which will give a more complete
> picture of what might work to make the run time more predictable.
> Please try these, and run EXPLAIN ANALYZE of your problem query each
> way.
>
> (1) Try it without the ORDER BY clause and the LIMIT.
>
> (2) Temporarily take that top index out of consideration. (Don't
> worry, it'll come back when you issue the ROLLBACK -- just don't
> forget the BEGIN statement.)
>
> BEGIN;
> DROP INDEX plugins_plugin_addr_oid_id;
> explain analyze <your query>
> ROLLBACK;
>
> (3) Try it like this (untested, so you may need to fix it up):
>
> explain analyze
> SELECT core_object.id
> from (SELECT id, city_id FROM "plugins_guide_address")
> "plugins_guide_address"
> JOIN "plugins_plugin_addr"
> ON ("plugins_plugin_addr"."address_id"
> = "plugins_guide_address"."id")
> JOIN "core_object"
> ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
> WHERE "plugins_guide_address"."city_id" = 4535
> ORDER BY "core_object"."id" DESC
> LIMIT 4 -- or whatever it normally takes to cause the problem
> ;
>
> -Kevin
>
>


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "norn" <andrey(dot)perliev(at)gmail(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: significant slow down with various LIMIT
Date: 2010-04-14 14:31:33
Message-ID: 4BC58B75020000250003085F@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:

> (3) Try it like this (untested, so you may need to fix it up):
>
> explain analyze
> SELECT core_object.id
> from (SELECT id, city_id FROM "plugins_guide_address")
> "plugins_guide_address"
> JOIN "plugins_plugin_addr"
> ON ("plugins_plugin_addr"."address_id"
> = "plugins_guide_address"."id")
> JOIN "core_object"
> ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
> WHERE "plugins_guide_address"."city_id" = 4535
> ORDER BY "core_object"."id" DESC
> LIMIT 4 -- or whatever it normally takes to cause the problem
> ;

Hmph. I see I didn't take that quite where I intended.
Forget the above and try this:

explain analyze
SELECT core_object.id
from (SELECT id, city_id FROM "plugins_guide_address"
WHERE "city_id" = 4535) "plugins_guide_address"
JOIN "plugins_plugin_addr"
ON ("plugins_plugin_addr"."address_id"
= "plugins_guide_address"."id")
JOIN "core_object"
ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
ORDER BY "core_object"."id" DESC
LIMIT 4 -- or whatever it normally takes to cause the problem
;

-Kevin


From: norn <andrey(dot)perliev(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: significant slow down with various LIMIT
Date: 2010-04-15 14:23:28
Message-ID: e19dfd9e-2818-4e69-b45c-5e21d11a2c46@k33g2000yqc.googlegroups.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Kevin,
thanks for your time!
Here the requested tests.

> (1) Try it without the ORDER BY clause and the LIMIT.
W/o the 'order by' it works instantly (about 1ms!)
Limit (cost=0.00..3.59 rows=5 width=4) (actual time=0.127..0.229
rows=5 loops=1)
-> Nested Loop (cost=0.00..277863.53 rows=386544 width=4) (actual
time=0.125..0.224 rows=5 loops=1)
-> Nested Loop (cost=0.00..91136.78 rows=386544 width=4)
(actual time=0.106..0.154 rows=5 loops=1)
-> Index Scan using plugins_guide_address_city_id on
plugins_guide_address (cost=0.00..41109.07 rows=27673 width=4)
(actual time=0.068..0.080 rows=5 loops=1)
Index Cond: (city_id = 4535)
-> Index Scan using plugins_plugin_addr_address_id on
plugins_plugin_addr (cost=0.00..1.63 rows=14 width=8) (actual
time=0.011..0.012 rows=1 loops=5)
Index Cond: (plugins_plugin_addr.address_id =
plugins_guide_address.id)
-> Index Scan using core_object_pkey on core_object
(cost=0.00..0.47 rows=1 width=4) (actual time=0.011..0.012 rows=1
loops=5)
Index Cond: (core_object.id =
plugins_plugin_addr.oid_id)
Total runtime: 0.328 ms
(10 rows)

W/o the limit it takes 1.4 seconds, which is anyway better than...
Sort (cost=199651.74..200618.10 rows=386544 width=4) (actual
time=1153.167..1157.841 rows=43898 loops=1)
Sort Key: core_object.id
Sort Method: quicksort Memory: 3594kB
-> Hash Join (cost=81234.35..163779.93 rows=386544 width=4)
(actual time=122.050..1128.909 rows=43898 loops=1)
Hash Cond: (core_object.id = plugins_plugin_addr.oid_id)
-> Seq Scan on core_object (cost=0.00..46467.07
rows=3221307 width=4) (actual time=0.011..378.677 rows=3221349
loops=1)
-> Hash (cost=76402.55..76402.55 rows=386544 width=4)
(actual time=121.170..121.170 rows=43898 loops=1)
-> Nested Loop (cost=368.81..76402.55 rows=386544
width=4) (actual time=8.645..104.842 rows=43898 loops=1)
-> Bitmap Heap Scan on plugins_guide_address
(cost=368.81..26374.83 rows=27673 width=4) (actual time=8.599..15.590
rows=26583 loops=1)
Recheck Cond: (city_id = 4535)
-> Bitmap Index Scan on
plugins_guide_address_city_id (cost=0.00..361.89 rows=27673 width=0)
(actual time=7.856..7.856 rows=26583 loops=1)
Index Cond: (city_id = 4535)
-> Index Scan using
plugins_plugin_addr_address_id on plugins_plugin_addr
(cost=0.00..1.63 rows=14 width=8) (actual time=0.002..0.003 rows=2
loops=26583)
Index Cond: (plugins_plugin_addr.address_id
= plugins_guide_address.id)
Total runtime: 1162.193 ms
(15 rows)

>(2) Temporarily take that top index out of consideration
It works nice! Query takes about 0.6 seconds as expected!

explain analyze SELECT core_object.id from "core_object" INNER JOIN
"plugins_plugin_addr" ON ("core_object"."id" =
"plugins_plugin_addr"."oid_id") INNER JOIN "plugins_guide_address" ON
("plugins_plugin_addr"."address_id" = "plugins_guide_address"."id")
WHERE "plugins_guide_address"."city_id" = 4535 ORDER BY
"core_object"."id" DESC;

Limit (cost=112274.36..112275.66 rows=5 width=4) (actual
time=200.758..637.039 rows=5 loops=1)
-> Merge Join (cost=112274.36..213042.22 rows=386544 width=4)
(actual time=200.754..637.035 rows=5 loops=1)
Merge Cond: (core_object.id = plugins_plugin_addr.oid_id)
-> Index Scan Backward using core_object_pkey on
core_object (cost=0.00..86916.44 rows=3221307 width=4) (actual
time=0.115..302.512 rows=1374693 loops=1)
-> Sort (cost=112274.36..113240.72 rows=386544 width=4)
(actual time=154.635..154.635 rows=5 loops=1)
Sort Key: plugins_plugin_addr.oid_id
Sort Method: quicksort Memory: 3594kB
-> Nested Loop (cost=368.81..76402.55 rows=386544
width=4) (actual time=9.522..126.206 rows=43898 loops=1)
-> Bitmap Heap Scan on plugins_guide_address
(cost=368.81..26374.83 rows=27673 width=4) (actual time=9.367..21.311
rows=26583 loops=1)
Recheck Cond: (city_id = 4535)
-> Bitmap Index Scan on
plugins_guide_address_city_id (cost=0.00..361.89 rows=27673 width=0)
(actual time=8.577..8.577 rows=26583 loops=1)
Index Cond: (city_id = 4535)
-> Index Scan using
plugins_plugin_addr_address_id on plugins_plugin_addr
(cost=0.00..1.63 rows=14 width=8) (actual time=0.002..0.003 rows=2
loops=26583)
Index Cond: (plugins_plugin_addr.address_id
= plugins_guide_address.id)
Total runtime: 637.620 ms
(15 rows)

> (3) Try it like this (untested, so you may need to fix it up):
explain analyze
SELECT core_object.id
from (SELECT id, city_id FROM "plugins_guide_address"
WHERE "city_id" = 4535) "plugins_guide_address"
JOIN "plugins_plugin_addr"
ON ("plugins_plugin_addr"."address_id"
= "plugins_guide_address"."id")
JOIN "core_object"
ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
ORDER BY "core_object"."id" DESC
LIMIT 5;
Limit (cost=0.00..11.51 rows=5 width=4) (actual
time=494.600..4737.867 rows=5 loops=1)
-> Merge Join (cost=0.00..889724.50 rows=386544 width=4) (actual
time=494.599..4737.862 rows=5 loops=1)
Merge Cond: (plugins_plugin_addr.oid_id = core_object.id)
-> Nested Loop (cost=0.00..789923.00 rows=386544 width=4)
(actual time=450.359..4269.608 rows=5 loops=1)
-> Index Scan Backward using
plugins_plugin_addr_oid_id on plugins_plugin_addr
(cost=0.00..45740.51 rows=1751340 width=8) (actual time=0.038..321.285
rows=1374690 loops=1)
-> Index Scan using plugins_guide_address_pkey on
plugins_guide_address (cost=0.00..0.41 rows=1 width=4) (actual
time=0.003..0.003 rows=0 loops=1374690)
Index Cond: (public.plugins_guide_address.id =
plugins_plugin_addr.address_id)
Filter: (public.plugins_guide_address.city_id =
4535)
-> Index Scan Backward using core_object_pkey on
core_object (cost=0.00..86916.44 rows=3221307 width=4) (actual
time=0.008..288.625 rows=1374693 loops=1)
Total runtime: 4737.964 ms
(10 rows)

So, as we can see, dropping index may help, but why? What shall I do
in my particular situation? Probably analyzing my tests help you
giving some recommendations, I hope so! :)

Thanks again for your time!

On Apr 14, 10:31 pm, Kevin(dot)Gritt(dot)(dot)(dot)(at)wicourts(dot)gov ("Kevin Grittner")
wrote:
> "Kevin Grittner" <Kevin(dot)Gritt(dot)(dot)(dot)(at)wicourts(dot)gov> wrote:
> > (3) Try it like this (untested, so you may need to fix it up):
>
> > explain analyze
> > SELECT core_object.id
> >   from (SELECT id, city_id FROM "plugins_guide_address")
> >        "plugins_guide_address"
> >   JOIN "plugins_plugin_addr"
> >     ON ("plugins_plugin_addr"."address_id"
> >        = "plugins_guide_address"."id")
> >   JOIN "core_object"
> >     ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
> >   WHERE "plugins_guide_address"."city_id" = 4535
> >   ORDER BY "core_object"."id" DESC
> >   LIMIT 4 -- or whatever it normally takes to cause the problem
> > ;
>
> Hmph.  I see I didn't take that quite where I intended.
> Forget the above and try this:
>
> explain analyze
> SELECT core_object.id
>   from (SELECT id, city_id FROM "plugins_guide_address"
>           WHERE "city_id" = 4535) "plugins_guide_address"
>   JOIN "plugins_plugin_addr"
>     ON ("plugins_plugin_addr"."address_id"
>        = "plugins_guide_address"."id")
>   JOIN "core_object"
>     ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
>   ORDER BY "core_object"."id" DESC
>   LIMIT 4 -- or whatever it normally takes to cause the problem
> ;
>
> -Kevin
>
> --
> Sent via pgsql-performance mailing list (pgsql-performa(dot)(dot)(dot)(at)postgresql(dot)org)
> To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-performance


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "norn" <andrey(dot)perliev(at)gmail(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: significant slow down with various LIMIT
Date: 2010-04-20 21:57:11
Message-ID: 4BCDDCE70200002500030C1E@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

norn <andrey(dot)perliev(at)gmail(dot)com> wrote:

>> (1) Try it without the ORDER BY clause and the LIMIT.
> W/o the 'order by' it works instantly (about 1ms!)

> W/o the limit it takes 1.4 seconds

>>(2) Temporarily take that top index out of consideration
> It works nice! Query takes about 0.6 seconds as expected!

> So, as we can see, dropping index may help, but why? What shall I
> do in my particular situation? Probably analyzing my tests help
> you giving some recommendations, I hope so! :)

The combination of the ORDER BY DESC and the LIMIT causes it to
think it can get the right data most quickly by scanning backwards
on the index. It's wrong about that. With the information from the
additional plans, it seems that this bad estimate might be why it's
not recognizing the plan which is actually four orders of magnitude
faster:

Index Scan using plugins_guide_address_city_id
on plugins_guide_address
Index Cond: (city_id = 4535)
estimated rows=27673
actual rows=5

Try this:

ALTER TABLE ALTER plugins_guide_address
ALTER COLUMN city_id SET STATISTICS 1000;
ANALYZE plugins_guide_address;

Then try your query.

I have one more diagnostic query to test, if the above doesn't work:

explain analyze
SELECT id FROM
(
SELECT core_object.id
FROM "core_object"
JOIN "plugins_plugin_addr"
ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
JOIN "plugins_guide_address"
ON ("plugins_plugin_addr"."address_id" =
"plugins_guide_address"."id")
WHERE "plugins_guide_address"."city_id" = 4535
) x
ORDER BY id DESC
LIMIT 4;

-Kevin


From: norn <andrey(dot)perliev(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: significant slow down with various LIMIT
Date: 2010-04-21 08:24:08
Message-ID: 20427682-16f8-4340-afd7-1222977c440d@12g2000yqi.googlegroups.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

> Try this:
>
> ALTER TABLE ALTER plugins_guide_address
>   ALTER COLUMN city_id SET STATISTICS 1000;
> ANALYZE plugins_guide_address;
>
> Then try your query.
No luck... The same query time...

> I have one more diagnostic query to test, if the above doesn't work:
>
> explain analyze
> SELECT id FROM
>   (
>     SELECT core_object.id
>       FROM "core_object"
>       JOIN "plugins_plugin_addr"
>         ON ("core_object"."id" = "plugins_plugin_addr"."oid_id")
>       JOIN "plugins_guide_address"
>         ON ("plugins_plugin_addr"."address_id" =
>             "plugins_guide_address"."id")
>       WHERE "plugins_guide_address"."city_id" = 4535
>   ) x
>   ORDER BY id DESC
>   LIMIT 4;

Limit (cost=0.00..8.29 rows=4 width=4) (actual time=0.284..1322.792
rows=4 loops=1)
-> Merge Join (cost=0.00..993770.68 rows=479473 width=4) (actual
time=0.281..1322.787 rows=4 loops=1)
Merge Cond: (plugins_plugin_addr.oid_id = core_object.id)
-> Nested Loop (cost=0.00..887841.46 rows=479473 width=4)
(actual time=0.194..1201.318 rows=4 loops=1)
-> Index Scan Backward using
plugins_plugin_addr_oid_id on plugins_plugin_addr
(cost=0.00..51546.26 rows=1980627 width=8) (actual time=0.117..87.035
rows=359525 loops=1)
-> Index Scan using plugins_guide_address_pkey on
plugins_guide_address (cost=0.00..0.41 rows=1 width=4) (actual
time=0.003..0.003 rows=0 loops=359525)
Index Cond: (plugins_guide_address.id =
plugins_plugin_addr.address_id)
Filter: (plugins_guide_address.city_id = 4535)
-> Index Scan Backward using core_object_pkey on
core_object (cost=0.00..91309.16 rows=3450658 width=4) (actual
time=0.079..73.071 rows=359525 loops=1)
Total runtime: 1323.065 ms
(10 rows)


From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "norn" <andrey(dot)perliev(at)gmail(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: significant slow down with various LIMIT
Date: 2010-04-21 13:52:55
Message-ID: 4BCEBCE70200002500030C3F@gw.wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

I wrote:

> ALTER TABLE ALTER plugins_guide_address
> ALTER COLUMN city_id SET STATISTICS 1000;

One too many ALTERs in there. Should be:

ALTER TABLE plugins_guide_address
ALTER COLUMN city_id SET STATISTICS 1000;

-Kevin


From: norn <andrey(dot)perliev(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: significant slow down with various LIMIT
Date: 2010-04-21 15:40:42
Message-ID: e3368ad6-8a61-4671-bff0-fe4a003d4e65@z7g2000yqb.googlegroups.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

On Apr 21, 9:52 pm, Kevin(dot)Gritt(dot)(dot)(dot)(at)wicourts(dot)gov ("Kevin Grittner")
wrote:
> I wrote:
> > ALTER TABLE ALTER plugins_guide_address
> >   ALTER COLUMN city_id SET STATISTICS 1000;
>
> One too many ALTERs in there.  Should be:
>
> ALTER TABLE plugins_guide_address
>   ALTER COLUMN city_id SET STATISTICS 1000;

Yeah, I noticed it and ran correctly.