Using bitmap index scans-more efficient

From: Kyle Bateman <kyle(at)actarg(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Using bitmap index scans-more efficient
Date: 2006-08-13 17:46:42
Message-ID: 44DF6582.4000806@actarg.com
Lists: pgsql-sql

How can I use bitmap index scans more effectively? (version 8.1.0)

I have a financial ledger (actually a view, grouping several other tables)
containing about a million records. Each record contains an account code and
a project code. I can query for all the transactions belonging to any single
project code and it is very fast and efficient (milliseconds/project).

But projects are organized in a hierarchical structure, so I also need to
query the ledger for transactions belonging to a particular project and/or all
its progeny. Depending on the method, this is taking several seconds to
several minutes per project.

For testing purposes, I'll present results from a smaller version of the
ledger, noting the query time for each approach.

It is most efficient to enumerate the group of projects using "in" (0.144 seconds):

select * from ledger where proj in (4737,4789,4892,4893,4894,4895,4933,4934,4935);

---------------------------------------------------------------------------
Nested Loop Left Join  (cost=19.73..4164.10 rows=7 width=85)
  ->  Nested Loop  (cost=19.73..4139.08 rows=7 width=81)
        ->  Nested Loop  (cost=19.73..4100.07 rows=7 width=63)
              ->  Bitmap Heap Scan on apinv_items i  (cost=19.73..1185.71 rows=487 width=55)
                    Recheck Cond: ((proj = 4737) OR (proj = 4789) OR (proj = 4892) OR (proj = 4893) OR (proj = 4894) OR (proj = 4895) OR (proj = 4933) OR (proj = 4934) OR (proj = 4935))
                    Filter: ((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar))
                    ->  BitmapOr  (cost=19.73..19.73 rows=495 width=0)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4737)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4789)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4892)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4893)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4894)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4895)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4933)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4934)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4935)
              ->  Index Scan using apinv_hdr_pkey on apinv_hdr h  (cost=0.00..5.97 rows=1 width=21)
                    Index Cond: (("outer".vendid = h.vendid) AND (("outer".invnum)::text = (h.invnum)::text))
        ->  Index Scan using vend_org_pkey on vend_org v  (cost=0.00..5.56 rows=1 width=26)
              Index Cond: (v.org_id = "outer".vendid)
  ->  Seq Scan on acct a  (cost=0.00..3.54 rows=1 width=4)
        Filter: ((code)::text = 'ap'::text)
---------------------------------------------------------------------------

Problem is, the project list has to be hard-coded into the SQL statement.
What I really need is transactions belonging to "project 4737 and all its progeny."
So I've tried using a many-to-many table proj_prog that describes which projects
are progeny of which other projects. Unfortunately, the query time then goes up
by a factor of 6 (to 0.85 seconds).
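
For reference, the proj_prog table looks roughly like this (a sketch inferred
from the queries and plans below, not the actual DDL):

-- One row per (ancestor, descendant) pair; a project's progeny includes
-- the project itself, as in the enumerated list above.
create table proj_prog (
    proj_id integer not null,   -- ancestor project
    prog_id integer not null,   -- descendant project
    primary key (proj_id, prog_id)
);

-- The progeny of project 4737:
select prog_id from proj_prog where proj_id = 4737;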

Examples:
select * from ledger where proj = any (array(select prog_id from proj_prog where proj_id = 4737));
select * from ledger where proj = any (array[4737,4789,4892,4893,4894,4895,4933,4934,4935]);

---------------------------------------------------------------------------
Nested Loop Left Join  (cost=13584.99..17647.39 rows=850 width=85)
  InitPlan
    ->  Index Scan using proj_prog_pkey on proj_prog  (cost=0.00..38.04 rows=21 width=4)
          Index Cond: (proj_id = 4737)
  ->  Merge Join  (cost=13543.42..17565.44 rows=850 width=81)
        Merge Cond: ("outer".vendid = "inner".org_id)
        ->  Merge Join  (cost=13543.42..17405.05 rows=850 width=63)
              Merge Cond: (("outer".vendid = "inner".vendid) AND (("outer".invnum)::text = "inner"."?column10?"))
              ->  Index Scan using apinv_hdr_pkey on apinv_hdr h  (cost=0.00..3148.16 rows=51016 width=21)
              ->  Sort  (cost=13543.42..13693.47 rows=60020 width=55)
                    Sort Key: i.vendid, (i.invnum)::text
                    ->  Seq Scan on apinv_items i  (cost=0.00..7197.27 rows=60020 width=55)
                          Filter: (((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar)) AND (proj = ANY ($0)))
        ->  Index Scan using vend_org_pkey on vend_org v  (cost=0.00..145.52 rows=1799 width=26)
  ->  Materialize  (cost=3.54..3.55 rows=1 width=4)
        ->  Seq Scan on acct a  (cost=0.00..3.54 rows=1 width=4)
              Filter: ((code)::text = 'ap'::text)

---------------------------------------------------------------------------

The worst case is queries of the following form (about 5 seconds):

select * from ledger where proj in (select prog_id from proj_prog where proj_id = 4737);
select l.* from ledger l, proj_prog p where l.proj = p.prog_id and p.proj_id = 4737;

---------------------------------------------------------------------------
Hash Join  (cost=19032.47..23510.23 rows=6 width=85)
  Hash Cond: ("outer".proj = "inner".prog_id)
  ->  Nested Loop Left Join  (cost=18994.38..23378.41 rows=1700 width=85)
        ->  Hash Join  (cost=18990.84..23340.87 rows=1700 width=81)
              Hash Cond: ("outer".vendid = "inner".org_id)
              ->  Merge Join  (cost=18935.35..23255.64 rows=1700 width=63)
                    Merge Cond: (("outer".vendid = "inner".vendid) AND (("outer".invnum)::text = "inner"."?column10?"))
                    ->  Index Scan using apinv_hdr_pkey on apinv_hdr h  (cost=0.00..3148.16 rows=51016 width=21)
                    ->  Sort  (cost=18935.35..19235.45 rows=120041 width=55)
                          Sort Key: i.vendid, (i.invnum)::text
                          ->  Seq Scan on apinv_items i  (cost=0.00..4152.99 rows=120041 width=55)
                                Filter: ((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar))
              ->  Hash  (cost=50.99..50.99 rows=1799 width=26)
                    ->  Seq Scan on vend_org v  (cost=0.00..50.99 rows=1799 width=26)
        ->  Materialize  (cost=3.54..3.55 rows=1 width=4)
              ->  Seq Scan on acct a  (cost=0.00..3.54 rows=1 width=4)
                    Filter: ((code)::text = 'ap'::text)
  ->  Hash  (cost=38.04..38.04 rows=21 width=4)
        ->  Index Scan using proj_prog_pkey on proj_prog p  (cost=0.00..38.04 rows=21 width=4)
              Index Cond: (proj_id = 4737)
---------------------------------------------------------------------------

I would like to get performance like the first query's, but without having to
enumerate the projects by hand (i.e., with a single, general query).
The key seems to be the bitmap index scans.

Any ideas about whether/how this can be done?

Thanks!

Kyle Bateman

---------------------------------------------------------------------------
BTW, the ledger view is built roughly as follows:

create view rp_v_api as
  select
    h.adate                             as adate,
    (i.price * i.quant)::numeric(14,2)  as amount,
    substring(v.org_name from 1 for 40) as descr,
    i.proj                              as proj,
    i.acct                              as acct,
    1                                   as cr_proj,
    a.acct_id                           as cr_acct
  from (
    apinv_hdr h
      join apinv_items i on i.vendid = h.vendid and i.invnum = h.invnum
      join vend_org v on v.org_id = h.vendid
      left join acct a on a.code = 'ap'
  )
  where i.status in ('en','cl','pd');


From: Florian Weimer <fweimer(at)bfk(dot)de>
To: Kyle Bateman <kyle(at)actarg(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Using bitmap index scans-more efficient
Date: 2006-08-14 06:40:14
Message-ID: 82mza77rch.fsf@mid.bfk.de
Lists: pgsql-sql

* Kyle Bateman:

> Any ideas about whether/how this can be done?

If the project tree is fairly consistent, it's convenient to encode it
using intervals instead of parent/child relationships. IIRC, Celko's "SQL
for Smarties" explains how to do this, and Kristian Koehntopp has
written some PHP code to implement it.
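
A minimal sketch of that interval (nested-set) encoding, with illustrative
table and column names rather than anything from Kyle's schema:

-- Each project carries an interval [lft, rgt]; every descendant's interval
-- lies inside its ancestor's interval.
create table proj_tree (
    proj_id integer primary key,
    lft     integer not null,
    rgt     integer not null
);
create index i_proj_tree_lft_rgt on proj_tree (lft, rgt);

-- All progeny of project 4737 (including itself):
select c.proj_id
from proj_tree p
join proj_tree c on c.lft between p.lft and p.rgt
where p.proj_id = 4737;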

--
Florian Weimer <fweimer(at)bfk(dot)de>
BFK edv-consulting GmbH http://www.bfk.de/
Durlacher Allee 47 tel: +49-721-96201-1
D-76131 Karlsruhe fax: +49-721-96201-99


From: Kyle Bateman <kyle(at)actarg(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Cc: Florian Weimer <fweimer(at)bfk(dot)de>
Subject: Re: Using bitmap index scans-more efficient
Date: 2006-08-15 01:06:34
Message-ID: 44E11E1A.8020507@actarg.com
Lists: pgsql-sql

Florian Weimer wrote:

> * Kyle Bateman:
>
>> Any ideas about whether/how this can be done?
>
> If the project tree is fairly consistent, it's convenient to encode it
> using intervals instead of parent/child relationships. IIRC, Celko's "SQL
> for Smarties" explains how to do this, and Kristian Koehntopp has
> written some PHP code to implement it.

I agree that this produces a more efficient query for finding the
projects that are the progeny of another project, but I'm trying to
figure out how that helps me select the right project transactions from
my ledger efficiently.

This query produces wonderful results (very fast):

select * from ledger where proj >= 4737 and proj <= 4740;

But I'm assuming that using an interval-encoded project tree, I would
have to do something like the following to get a progeny group:

select * from ledger l, proj p where p.proj_id = l.proj and p.left >
1234 and p.right < 2345;

The problem (at least according to my initial testing) is that this
forces a join against the entire ledger, and I end up in my slowest
performance group (about 5 seconds).

What am I missing?

Kyle


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kyle Bateman <kyle(at)actarg(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org, Florian Weimer <fweimer(at)bfk(dot)de>
Subject: Re: Using bitmap index scans-more efficient
Date: 2006-08-15 01:53:37
Message-ID: 27367.1155606817@sss.pgh.pa.us
Lists: pgsql-sql

Kyle Bateman <kyle(at)actarg(dot)com> writes:
> But I'm assuming that using an interval-encoded project tree, I would
> have to do something like the following to get a progeny group:

> select * from ledger l, proj p where p.proj_id = l.proj and p.left >
> 1234 and p.right < 2345;

btree has no idea about the constraint (that I imagine exists) that left
<= right. If you're just doing a simple index on (left, right) then the
above query requires scanning all index entries with left > 1234. It
would probably help to say

select * from ledger l, proj p where p.proj_id = l.proj and
p.left > 1234 and p.left < 2345 and p.right < 2345;

so that you can constrain the range of "left" values scanned.
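
To make that concrete, a sketch of the index this assumes (using lft/rgt as
column names, since left and right would need quoting in real DDL):

-- Two-column btree on the interval bounds.
create index i_proj_lft_rgt on proj (lft, rgt);

-- With only "lft > 1234" the scan must walk every index entry past 1234;
-- adding "lft < 2345" bounds the scan to the subtree's range, and the
-- rgt test filters any remaining rows.
select l.*
from ledger l
join proj p on p.proj_id = l.proj
where p.lft > 1234 and p.lft < 2345 and p.rgt < 2345;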

regards, tom lane


From: Kyle Bateman <kyle(at)actarg(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-sql(at)postgresql(dot)org, Florian Weimer <fweimer(at)bfk(dot)de>
Subject: Re: Using bitmap index scans-more efficient
Date: 2006-08-15 21:58:39
Message-ID: 44E2438F.8000105@actarg.com
Lists: pgsql-sql

Tom Lane wrote:

> Kyle Bateman <kyle(at)actarg(dot)com> writes:
>
>> But I'm assuming that using an interval-encoded project tree, I would
>> have to do something like the following to get a progeny group:
>>
>> select * from ledger l, proj p where p.proj_id = l.proj and p.left >
>> 1234 and p.right < 2345;
>
> btree has no idea about the constraint (that I imagine exists) that left
> <= right. If you're just doing a simple index on (left, right) then the
> above query requires scanning all index entries with left > 1234. It
> would probably help to say
>
> select * from ledger l, proj p where p.proj_id = l.proj and
>   p.left > 1234 and p.left < 2345 and p.right < 2345;
>
> so that you can constrain the range of "left" values scanned.

Thanks for the replies, Tom and Florian.

My problem is not that it is difficult (or costly) to determine the
progeny of a given project. I can determine this in about 90 msec
regardless of whether I use an adjacency model, interval-encoding, or
materialized path (current implementation). The problem is, when I try
to extract the ledger entries belonging to that progeny from a set of a
million records, it seems to need to process all million records rather
than being able to index right into the ones I want.

I'm not very good at reading explain output, but I tried to set up the
query Tom suggests by creating an interval-encoded project table
(proj_int) and then joining it to my ledger like so:

select l.* from ledger l, proj_int i where l.proj = i.proj_id and i.lft >= 5283 and i.lft < 5300 and i.rgt <= 5300;

On my mini-test-ledger of 100,000 entries, this takes the longest time
(5 seconds) with the following explain output:

Hash Join  (cost=19018.46..23411.52 rows=14 width=85)
  Hash Cond: ("outer".proj = "inner".proj_id)
  ->  Nested Loop Left Join  (cost=18994.38..23378.41 rows=1700 width=85)
        ->  Hash Join  (cost=18990.84..23340.87 rows=1700 width=81)
              Hash Cond: ("outer".vendid = "inner".org_id)
              ->  Merge Join  (cost=18935.35..23255.64 rows=1700 width=63)
                    Merge Cond: (("outer".vendid = "inner".vendid) AND (("outer".invnum)::text = "inner"."?column10?"))
                    ->  Index Scan using apinv_hdr_pkey on apinv_hdr h  (cost=0.00..3148.16 rows=51016 width=21)
                    ->  Sort  (cost=18935.35..19235.45 rows=120041 width=55)
                          Sort Key: i.vendid, (i.invnum)::text
                          ->  Seq Scan on apinv_items i  (cost=0.00..4152.99 rows=120041 width=55)
                                Filter: ((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar))
              ->  Hash  (cost=50.99..50.99 rows=1799 width=26)
                    ->  Seq Scan on vend_org v  (cost=0.00..50.99 rows=1799 width=26)
        ->  Materialize  (cost=3.54..3.55 rows=1 width=4)
              ->  Seq Scan on acct a  (cost=0.00..3.54 rows=1 width=4)
                    Filter: ((code)::text = 'ap'::text)
  ->  Hash  (cost=24.06..24.06 rows=10 width=4)
        ->  Bitmap Heap Scan on proj_int i  (cost=2.26..24.06 rows=10 width=4)
              Recheck Cond: ((lft >= 5283) AND (lft < 5300) AND (rgt <= 5300))
              ->  Bitmap Index Scan on i_proj_int_lft_rgt  (cost=0.00..2.26 rows=10 width=0)
                    Index Cond: ((lft >= 5283) AND (lft < 5300) AND (rgt <= 5300))

That is roughly equivalent to my materialized path method:

select l.* from ledger l where l.proj in (select proj_id from proj_v where 4737 = any(ppath));

And is quite slow compared to 150 msec when enumerating the progeny projects like so:

select l.* from ledger l where l.proj in (4737,4789,4892,4893,4894,4895,4933,4934,4935);

Nested Loop Left Join  (cost=19.73..4164.10 rows=7 width=85)
  ->  Nested Loop  (cost=19.73..4139.08 rows=7 width=81)
        ->  Nested Loop  (cost=19.73..4100.07 rows=7 width=63)
              ->  Bitmap Heap Scan on apinv_items i  (cost=19.73..1185.71 rows=487 width=55)
                    Recheck Cond: ((proj = 4737) OR (proj = 4789) OR (proj = 4892) OR (proj = 4893) OR (proj = 4894) OR (proj = 4895) OR (proj = 4933) OR (proj = 4934) OR (proj = 4935))
                    Filter: ((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar))
                    ->  BitmapOr  (cost=19.73..19.73 rows=495 width=0)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4737)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4789)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4892)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4893)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4894)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4895)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4933)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4934)
                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                                Index Cond: (proj = 4935)
              ->  Index Scan using apinv_hdr_pkey on apinv_hdr h  (cost=0.00..5.97 rows=1 width=21)
                    Index Cond: (("outer".vendid = h.vendid) AND (("outer".invnum)::text = (h.invnum)::text))
        ->  Index Scan using vend_org_pkey on vend_org v  (cost=0.00..5.56 rows=1 width=26)
              Index Cond: (v.org_id = "outer".vendid)
  ->  Seq Scan on acct a  (cost=0.00..3.54 rows=1 width=4)
        Filter: ((code)::text = 'ap'::text)

I'm still stumped...


From: Kyle Bateman <kyle(at)actarg(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-sql(at)postgresql(dot)org, Florian Weimer <fweimer(at)bfk(dot)de>
Subject: Re: Using bitmap index scans-more efficient
Date: 2006-08-15 23:15:37
Message-ID: 44E25599.4060905@actarg.com
Lists: pgsql-sql

Tom Lane wrote:

> Kyle Bateman <kyle(at)actarg(dot)com> writes:
>
>> But I'm assuming that using an interval-encoded project tree, I would
>> have to do something like the following to get a progeny group:
>>
>> select * from ledger l, proj p where p.proj_id = l.proj and p.left >
>> 1234 and p.right < 2345;

Here's an interesting result:

I created a function proj_left(int4) that returns the left interval
number for a given project. Then I created an index on the underlying
table for the ledger view (which took forever to build) like so:

create index i_test on apinv_items (proj_left(proj));

Now my query:

select * from ledger where proj_left(dr_proj) >= 5283 and proj_left(dr_proj) < 5300;

is very speedy. Problem is, I had to mark the function proj_left() as
immutable, which it can not be since the left and right values for a
given project will change any time a project is added, removed, or moved
around the hierarchy :(
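
For reference, a sketch of what such a function and index might look like (the
actual definitions aren't shown in this message, and in reality the function is
only stable, not immutable):

-- Hypothetical helper: look up a project's left interval bound.
create function proj_left(int4) returns int4 as $$
    select lft from proj_int where proj_id = $1;
$$ language sql immutable;  -- a lie to the planner: lft changes when the tree is edited

-- Expression index on the underlying table of the ledger view:
create index i_test on apinv_items (proj_left(proj));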

So is there any good way to tell the planner to do several individual
index scans for the projects involved in the desired progeny, OR the
results together, and return the combined result? That is what it seems
to choose in the case of the query:

select * from ledger where proj in (4737,4789,4892,4893,4894,4895,4933,4934,4935);
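
Short of a planner-level answer, the workaround implied above is to run two
statements and let the application interpolate the progeny list, which
reproduces the fast BitmapOr plan (a sketch using queries already shown in
this thread):

-- Step 1: fetch the progeny ids (fast on its own).
select prog_id from proj_prog where proj_id = 4737;

-- Step 2: substitute the returned ids into the ledger query.
select * from ledger where proj in (4737,4789,4892,4893,4894,4895,4933,4934,4935);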


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kyle Bateman <kyle(at)actarg(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org, fweimer(at)bfk(dot)de
Subject: Re: Using bitmap index scans-more efficient
Date: 2006-08-16 16:58:12
Message-ID: 26676.1155747492@sss.pgh.pa.us
Lists: pgsql-sql

Kyle Bateman <kyle(at)actarg(dot)com> writes:
> I'm wondering if this might expose a weakness in the optimizer having to
> do with left joins.

Before 8.2 the optimizer has no ability to rearrange the order of outer
joins. Do you have time to try your test case against CVS HEAD?

regards, tom lane


From: Kyle Bateman <kyle(at)actarg(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Using bitmap index scans-more efficient
Date: 2006-08-16 20:46:06
Message-ID: 44E3840E.4080502@actarg.com
Lists: pgsql-sql

Tom Lane wrote:

> Kyle Bateman <kyle(at)actarg(dot)com> writes:
>
>> I'm wondering if this might expose a weakness in the optimizer having to
>> do with left joins.
>
> Before 8.2 the optimizer has no ability to rearrange the order of outer
> joins. Do you have time to try your test case against CVS HEAD?

OK, I figured it out--grabbed the latest snapshot (hope that is what you
need).

My results are similar:

select l.* from ledg_v1 l, proj p where l.proj = p.proj_id and 5 = p.par;   (24 msec)

Nested Loop  (cost=0.00..1991.93 rows=480 width=23)
  ->  Nested Loop  (cost=0.00..4.68 rows=6 width=8)
        ->  Seq Scan on acct a  (cost=0.00..1.12 rows=1 width=4)
              Filter: ((code)::text = 'ap'::text)
        ->  Index Scan using i_proj_par on proj p  (cost=0.00..3.49 rows=6 width=4)
              Index Cond: (5 = par)
  ->  Index Scan using i_ledg_proj on ledg l  (cost=0.00..330.17 rows=83 width=19)
        Index Cond: (l.proj = "outer".proj_id)

select l.* from ledg_v2 l, proj p where l.proj = p.proj_id and 5 = p.par;   (1.25 sec)

Hash Join  (cost=4.63..16768.43 rows=480 width=23)
  Hash Cond: ("outer".proj = "inner".proj_id)
  ->  Nested Loop Left Join  (cost=1.13..14760.13 rows=400000 width=23)
        ->  Seq Scan on ledg l  (cost=0.00..6759.00 rows=400000 width=19)
        ->  Materialize  (cost=1.13..1.14 rows=1 width=4)
              ->  Seq Scan on acct a  (cost=0.00..1.12 rows=1 width=4)
                    Filter: ((code)::text = 'ap'::text)
  ->  Hash  (cost=3.49..3.49 rows=6 width=4)
        ->  Index Scan using i_proj_par on proj p  (cost=0.00..3.49 rows=6 width=4)
              Index Cond: (5 = par)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kyle Bateman <kyle(at)actarg(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Using bitmap index scans-more efficient
Date: 2006-08-16 21:46:26
Message-ID: 28876.1155764786@sss.pgh.pa.us
Lists: pgsql-sql

Kyle Bateman <kyle(at)actarg(dot)com> writes:
> Tom Lane wrote:
>> Before 8.2 the optimizer has no ability to rearrange the order of outer
>> joins. Do you have time to try your test case against CVS HEAD?

> OK, I figured it out--grabbed the latest snapshot (hope that is what you
> need).
> My results are similar:

Are you sure you found a recent version? I get this from CVS HEAD:

ledger=# explain analyze select l.* from ledg_v2 l, proj p where l.proj = p.proj_id and 5 = p.par;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Nested Loop Left Join (cost=5.79..1386.74 rows=400 width=23) (actual time=0.125..1.543 rows=329 loops=1)
-> Nested Loop (cost=4.66..1377.61 rows=400 width=19) (actual time=0.109..1.072 rows=329 loops=1)
-> Index Scan using i_proj_par on proj p (cost=0.00..8.41 rows=5 width=4) (actual time=0.023..0.028 rows=4 loops=1)
Index Cond: (5 = par)
-> Bitmap Heap Scan on ledg l (cost=4.66..272.83 rows=81 width=19) (actual time=0.073..0.213 rows=82 loops=4)
Recheck Cond: (l.proj = p.proj_id)
-> Bitmap Index Scan on i_ledg_proj (cost=0.00..4.66 rows=81 width=0) (actual time=0.041..0.041 rows=82 loops=4)
Index Cond: (l.proj = p.proj_id)
-> Materialize (cost=1.13..1.14 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=329)
-> Seq Scan on acct a (cost=0.00..1.12 rows=1 width=4) (actual time=0.009..0.011 rows=1 loops=1)
Filter: ((code)::text = 'ap'::text)
Total runtime: 1.696 ms
(12 rows)

Yours is doing the left join inside the join to proj, which of course is
terrible because it has to form the whole 400K-row join result.
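
For illustration only (the actual view definitions are not shown in the
thread), definitions consistent with those two plans might look roughly like
this, the difference being whether the join to acct is an inner or a left join:

-- ledg_v1: inner join to acct; the planner is free to drive from proj.
create view ledg_v1 as
    select l.*, a.acct_id as cr_acct
    from ledg l
    join acct a on a.code = 'ap';

-- ledg_v2: left join to acct; a pre-8.2 planner cannot reorder it, so the
-- whole 400K-row left join is formed before the join to proj.
create view ledg_v2 as
    select l.*, a.acct_id as cr_acct
    from ledg l
    left join acct a on a.code = 'ap';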

regards, tom lane


From: Kyle Bateman <kyle(at)actarg(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: PG 8.2beta reordering working for this case?
Date: 2006-10-08 01:00:44
Message-ID: 45284DBC.3060901@actarg.com
Lists: pgsql-sql

Kyle Bateman wrote:

> Tom Lane wrote:
>
>> Before 8.2 the optimizer has no ability to rearrange the order of outer
>> joins. Do you have time to try your test case against CVS HEAD?

I've done some more refinement on my accounting ledger system that has
clarified some of the problems I was having with performance joining to
a union. Here's an interesting test case. I just tried it with PG
8.2beta1. I don't claim to understand the new join-reordering feature very
well, but it seems like this is related to it. Since it's still a
performance problem in 8.2, I thought this might be a helpful test case
during beta.

To see this demo, run create.sql on a clean database. It will create
all the needed tables, views and test data.

Then run q1 and q2 which are very efficient. Then q3 is the slow one.
The reason is, it does the union, producing 300,000 records before
trying to select by project. It seems like the optimizer should
internally rewrite the query to look more like what is in q4 (which is
very fast).
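
The attached queries aren't reproduced here, but the shape of the problem is
roughly the following (illustrative SQL, not the actual contents of q3.sql and
q4.sql):

-- q3-style: join to the unioned ledger view; the union of all subledgers
-- (300,000 records) is formed before the project restriction is applied.
select l.* from ledger l, proj_prog p
where l.proj = p.prog_id and p.proj_id = 4737;

-- q4-style: push the restriction into each branch by hand, so each
-- subledger is probed by index before the union is formed.
select * from subledg_1 where proj in (select prog_id from proj_prog where proj_id = 4737)
union
select * from subledg_2 where proj in (select prog_id from proj_prog where proj_id = 4737);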

Is there a way to make the optimizer do this?

Kyle Bateman

Attachment Content-Type Size
create.sql text/x-sql 6.3 KB
q1.sql text/x-sql 310 bytes
q2.sql text/x-sql 290 bytes
q3.sql text/x-sql 409 bytes
q4.sql text/x-sql 454 bytes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kyle Bateman <kyle(at)actarg(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: PG 8.2beta reordering working for this case?
Date: 2006-10-08 17:49:31
Message-ID: 25779.1160329771@sss.pgh.pa.us
Lists: pgsql-sql

Kyle Bateman <kyle(at)actarg(dot)com> writes:
> Is there a way to make the optimizer do this?

Sorry, that's not happening for 8.2. Consider using a union all (not
union) across the subledg_N tables directly and then joining to that.
That boils down to being a partitioning case and I think probably will
be covered by the 8.2 improvements. There isn't any understanding of
how to commute joins and unions though ... (offhand I'm not even sure
of the conditions under which such a thing would be safe).
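
A minimal sketch of that suggestion (the subledg_N names are from this thread;
the column lists and the number of subledgers are assumed):

-- Rebuilding the general ledger over the subledgers with UNION ALL avoids
-- the duplicate-eliminating sort that plain UNION requires.
create or replace view ledger as
    select * from subledg_1
    union all
    select * from subledg_2
    union all
    select * from subledg_3;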

regards, tom lane


From: Kyle Bateman <kyle(at)actarg(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: PG 8.2beta reordering working for this case?
Date: 2006-10-09 05:27:40
Message-ID: 4529DDCC.6010909@actarg.com
Lists: pgsql-sql

Tom Lane wrote:

> Kyle Bateman <kyle(at)actarg(dot)com> writes:
>
>> Is there a way to make the optimizer do this?
>
> Sorry, that's not happening for 8.2. Consider using a union all (not
> union) across the subledg_N tables directly and then joining to that.
> That boils down to being a partitioning case and I think probably will
> be covered by the 8.2 improvements.

Yup, union all is much more efficient. The difference between union and
union all hadn't really occurred to me, but it makes sense that it
eliminates the need for a unique sort. The q3 query went from 10 seconds
to 1 second just from switching to union all in the general ledger.

BTW, explain analyze still says 10 seconds of run time (and takes 10
seconds to run), but when I remove the explain analyze, the query runs
in about a second. What's that all about?

Also, I came up with the view shown in the attachment. It is still much
faster (40 ms) than joining to the union-all ledger. I'm not sure why,
because I'm not sure explain analyze is telling me the real story (I see
a sequential scan of the ledgers in there when it runs for 10 seconds),
and I'm not sure what it's doing when it runs in 1 second.

Kyle

Attachment Content-Type Size
workaround.sql text/x-sql 429 bytes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kyle Bateman <kyle(at)actarg(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: PG 8.2beta reordering working for this case?
Date: 2006-10-09 14:15:23
Message-ID: 1600.1160403323@sss.pgh.pa.us
Lists: pgsql-sql

Kyle Bateman <kyle(at)actarg(dot)com> writes:
> BTW, explain analyze still says 10 seconds of run time (and takes 10
> seconds to run), but when I remove the explain analyze, the query runs
> in about a second. What's that all about?

Instrumentation overhead ... you're probably running this on a PC with a
very slow clock-reading capability. Each node output row counted by
explain analyze takes two gettimeofday() calls, and apparently it's not
unusual for those to take several microseconds on cheap motherboards,
even when the CPU is nominally very fast.
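
For instance, if the busiest plan node emits a few hundred thousand rows (the
union here produces about 300,000 records), two gettimeofday() calls per row
at, say, 10 microseconds each already adds roughly 300,000 x 2 x 10 us = 6
seconds of pure measurement overhead, which is consistent with explain analyze
reporting 10 seconds while the bare query runs in about 1 second.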

regards, tom lane