Quick Links

Convert check constraints into One-Time_Filter on prepared statements

Lists:	pgsql-hackers

From:	ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Convert check constraints into One-Time_Filter on prepared statements
Date:	2008-10-13 09:00:44
Message-ID:	20081013172100.87A1.52131E4D@oss.ntt.co.jp
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hello,

Constraint exclusion (CE) is very useful for partitioned tables,
but prepared statements interfere with CE if the parameter
contains partition keys, because CE is a planner-time optimization
but the actual parameters are given at executor-time.

I have an idea to use constraint exclusion and prepared statements
together -- converting check constraints into One-Time Filter.

For example, when we have "test" table partitioned by test_{year}:

CREATE TABLE test PARTITIONED BY
PARTITION test_2008 CHECK('2008-01-01' <= t AND t < '2009-01-01')
PARTITION test_2009 CHECK('2009-01-01' <= t AND t < '2010-01-01')
PARTITION test_2010 CHECK('2010-01-01' <= t AND t < '2011-01-01')

and prepare a statement that have a partitioned key in the parameter:

PREPARE p(timestamp) AS
SELECT * FROM test WHERE $1 <= t AND t < $1 + '1 mon';

Then planner converts check constraints into One-Time Filter.
Plan will be the following:

EXPLAIN EXECUTE p('2008-07-01');
-------------------------------------------------------------------------
Append
-> Result
One-Time Filter: (('2008-01-01' <= $1) AND ($1 < '2009-01-01'))
-> Index Scan on test_2008_t_key
Index Cond: (($1 <= t) AND (t < ($1 + '1 mon')))
-> Result
One-Time Filter: (('2009-01-01' <= $1) AND ($1 < '2010-01-01'))
-> Index Scan on test_2009_t_key
Index Cond: (($1 <= t) AND (t < ($1 + '1 mon')))
-> Result
One-Time Filter: (('2010-01-01' <= $1) AND ($1 < '2011-01-01'))
-> Index Scan on test_2010_t_key
Index Cond: (($1 <= t) AND (t < ($1 + '1 mon')))

We can avoid internal scans when One-Time Filter returns false for each
partition. So we can reuse the plan and receive benefit from CE.

Is this concept ok and worth trying?
If it is reasonable, I'll try it. Comments welcome.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

From:	Gregory Stark <stark(at)enterprisedb(dot)com>
To:	ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Convert check constraints into One-Time_Filter on prepared statements
Date:	2008-10-13 09:38:20
Message-ID:	8763nwsulv.fsf@oxford.xeocode.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Lists:	pgsql-hackers

ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp> writes:

> EXPLAIN EXECUTE p('2008-07-01');
> -------------------------------------------------------------------------
> Append
> -> Result
> One-Time Filter: (('2008-01-01' <= $1) AND ($1 < '2009-01-01'))
> -> Index Scan on test_2008_t_key
> Index Cond: (($1 <= t) AND (t < ($1 + '1 mon')))
>
> We can avoid internal scans when One-Time Filter returns false for each
> partition. So we can reuse the plan and receive benefit from CE.
>
> Is this concept ok and worth trying?
> If it is reasonable, I'll try it. Comments welcome.

It's tempting to go straight for the special case where we can detect that the
constraints are mutually exclusive and we can build a lookup table or do a
binary search to find the single partition which is included.

But the this would handle the general case and would be a huge help. I've
looked at plans where partitioning represented a huge slowdown and the time
seemed to just disappear down the hole. All the time was being spent in just
the index lookup startup and probe for dozens of partitions which returned no
rows.

In combination with the ordered append node -- which is still sitting in my
experimental directory -- this would basically "fix" partitioning. The planner
would recognize that the result is ordered and merge-joinable and it wouldn't
waste any time doing any index probes.

There would be a much smaller waste of time setting up the degenerate heap
merge and checking all those conditions in the executor. These things will
matter when we get to scaling up to hundreds of plan nodes but the situation
would still be much better than today.

The other reason I think it's still worth doing this even though it's not as
effective as covering the special case of mutually exclusive partitions is
that there's still a place for this even once we have a more mature
partitioning scheme.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning