Can I do better than this heapscan and sort?

From: Andy Halsall <halsall_andy(at)hotmail(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: Can I do better than this heapscan and sort?
Date: 2012-06-21 20:07:01
Message-ID: BLU123-W20B062EC217D902A05BAC1F5FD0@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


I have two tables node and relationship. Each relationship record connects two nodes and has an application keys (unfortunately named) that can be used by the application to look-up a relationship and get from one node to the other.

My query uses a node id and a description of a relationship from the node, and must find the "next" relationship that the node has. It does this by finding all the relationships that could be "next", ordering them and then getting the first.

Details are below but I end up with 6896 candidates for "next".

If I'm reading the output correctly it takes 13.509 ms to apply the filter and another 7 ms or so to do the sort of the remaining 6896 nodes.

Have tried many index combinations to try and improve the results. I suspect that with so many nodes to sort, postgresql will opt for heap scan rather than index. But why does it not use the IDX_order_sort_down_2 index for the sort?

Thanks,
Andy


Details..........

Version
-------
PostgreSQL 9.1.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.5.2, 64-bit

Tables
------
CREATE TABLE node (
node_id bigint NOT NULL,
node_type int4 NOT NULL,
c_state int4 NOT NULL,
d_state int4 NOT NULL,
sort_key bigint NOT NULL,
permissions bytea NOT NULL,
audit bytea NOT NULL,
pkg_id bytea NULL,
created timestamp NOT NULL
);

CREATE TABLE relationship (
rel_id bigint NOT NULL,
rel_type integer NOT NULL,
s_t_n bigint NOT NULL,
t_s_n bigint NOT NULL,
state integer NOT NULL,
control integer NOT NULL,
sort_key bigint NOT NULL,
prime_key bytea NULL,
prime_key_len integer NOT NULL,
sec_key bytea NULL,
sec_key_len integer NOT NULL,
up_sort_key bigint NOT NULL,
up_prime_key bytea NULL,
up_prime_key_len integer NOT NULL,
up_sec_key bytea NULL,
up_sec_key_len integer NOT NULL,
permissions bytea NOT NULL,
t_s_n_type integer NOT NULL,
created timestamp NOT NULL
);

Constraints
-----------
-- Primary keys
ALTER TABLE node ADD CONSTRAINT PK_node PRIMARY KEY (node_id);

ALTER TABLE relationship ADD CONSTRAINT PK_relationship PRIMARY KEY (rel_id);

-- Foreign keys
ALTER TABLE relationship ADD CONSTRAINT FK_node_s FOREIGN KEY (s_t_n) REFERENCES node (node_id);

ALTER TABLE relationship ADD CONSTRAINT FK_node_n FOREIGN KEY (t_s_n) REFERENCES node (node_id);


Indexes
-------
CREATE INDEX IDX_node_type ON node (node_type ASC) TABLESPACE ds_appex_ts_10
;
CREATE INDEX IDX_node_sort_key ON node (sort_key ASC) TABLESPACE ds_appex_ts_10
;
CREATE INDEX IDX_relationship_s_t_n ON relationship (s_t_n ASC) TABLESPACE ds_appex_ts_10
;
CREATE INDEX IDX_relationship_t_s_n ON relationship (t_s_n ASC) TABLESPACE ds_appex_ts_10
;
CREATE INDEX IDX_relationship_type ON relationship (rel_type ASC) TABLESPACE ds_appex_ts_10
;
CREATE INDEX IDX_relationship_prime_key ON relationship (prime_key ASC) TABLESPACE ds_appex_ts_10
;
CREATE INDEX IDX_relationship_u_prime_key ON relationship (up_prime_key ASC) TABLESPACE ds_appex_ts_10
;
CREATE INDEX IDX_relationship_sec_key ON relationship (sec_key ASC) TABLESPACE ds_appex_ts_10
;
CREATE INDEX IDX_order_first ON node(sort_key DESC, node_id DESC) TABLESPACE ds_appex_ts_10
;
CREATE INDEX IDX_order_sort_down_1 ON relationship(sort_key DESC, prime_key ASC NULLS FIRST, sec_key ASC NULLS FIRST) TABLESPACE ds_appex_ts_10
;
CREATE INDEX IDX_order_sort_down_2 ON relationship(sort_key DESC, prime_key ASC NULLS FIRST, sec_key DESC NULLS FIRST) TABLESPACE ds_appex_ts_10
;
CREATE INDEX IDX_order_sort_up ON relationship(up_sort_key DESC, up_prime_key ASC NULLS FIRST, up_sec_key ASC NULLS FIRST) TABLESPACE ds_appex_ts_10
;

Query
-----
CREATE OR REPLACE FUNCTION sp_get_rel_sort_dup_sec_desc(in_rel_type1 integer, in_rel_type2 integer, in_node_type integer, in_own_guid bigint, in_prev_prime_key bytea, in_prev_prime_key_len integer, in_prev_sec_key bytea, in_prev_sec_key_len integer, in_prev_sort_key bigint, in_ctrl integer) RETURNS select_rel_holder AS
'
declare
h select_rel_holder%rowtype;

begin
SELECT INTO h r.rel_id, r.t_s_n, r.rel_type, r.sort_key,
r.state,r.permissions, r.control,
r.prime_key, r.prime_key_len, r.sec_key, r.sec_key_len,
r.up_prime_key, r.up_prime_key_len, r.up_sec_key, r.up_sec_key_len
FROM relationship r
WHERE r.s_t_n = in_own_guid AND (r.rel_type = in_rel_type1 OR r.rel_type = in_rel_type2)
AND
(
(
(
r.prime_key > in_prev_prime_key
OR
( r.prime_key = in_prev_prime_key AND r.sec_key < in_prev_sec_key)
)
AND
r.sort_key = in_prev_sort_key
)

OR
r.sort_key < in_prev_sort_key
)
AND t_s_n_type = in_node_type
AND r.control >= in_ctrl
ORDER BY sort_key DESC, prime_key ASC NULLS FIRST, sec_key DESC NULLS FIRST LIMIT 1;
RETURN h;
end
'
language 'plpgsql' STABLE;


EXPLAIN ANALYZE output
-------------------------------
Limit (cost=48.90..48.90 rows=1 width=89) (actual time=21.480..21.480 rows=1 loops=1)
Output: rel_id, t_s_n, rel_type, sort_key, state, permissions, control, prime_key, prime_key_len, sec_key, sec_key_len, up_prime_key, up_prime_key_l
en, up_sec_key, up_sec_key_len

-> Sort (cost=48.90..48.90 rows=1 width=89) (actual time=21.479..21.479 rows=1 loops=1)
Output: rel_id, t_s_n, rel_type, sort_key, state, permissions, control, prime_key, prime_key_len, sec_key, sec_key_len, up_prime_key, up_prime
_key_len, up_sec_key, up_sec_key_len
Sort Key: r.sort_key, r.prime_key, r.sec_key
Sort Method: top-N heapsort Memory: 25kB

-> Bitmap Heap Scan on public.relationship r (cost=3.39..48.89 rows=1 width=89) (actual time=1.034..13.509 rows=6986 loops=1)
Output: rel_id, t_s_n, rel_type, sort_key, state, permissions, control, prime_key, prime_key_len, sec_key, sec_key_len, up_prime_key, up
_prime_key_len, up_sec_key, up_sec_key_len
Recheck Cond: (r.s_t_n = $4)
Filter: ((r.control >= $10) AND (r.t_s_n_type = $3) AND ((r.rel_type = $1) OR (r.rel_type = $2)) AND ((((r.prime_key > $5) OR ((r.prime_
key = $5) AND (r.sec_key < $7))) AND (r.sort_key = $9)) OR (r.sort_key < $9)))

-> Bitmap Index Scan on idx_relationship_s_t_n (cost=0.00..3.39 rows=18 width=0) (actual time=0.951..0.951 rows=6989 loops=1)
Index Cond: (r.s_t_n = $4)

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Maxim Boguk 2012-06-22 07:02:32 Performance of a large array access by position (tested version 9.1.3)
Previous Message Albe Laurenz 2012-06-21 08:34:38 Re: moving tables