Re: Select queries which violates table constrains

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Joni Martikainen <joni(at)shade-fx(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Select queries which violates table constrains
Date: 2014-05-12 14:41:05
Message-ID: 5370DD81.6040006@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 05/10/2014 09:24 PM, Joni Martikainen wrote:
> Hi,
>
> I investigated some select query performance issues and noticed that
> postgresql misses some obvious cases while processing SELECT query. I
> mean the case where WHERE clause contains statement which condition
> would be against table structure. (excuse my language, look the code)
>
> Example:
> Let the table be :
>
> CREATE TABLE test
> (
> id numeric(3,0) NOT NULL,
> somecolumn numeric(5,0) NOT NULL,
> CONSTRAINT id_pk PRIMARY KEY (id)
> );
>
> Simple table with "somecolumn" column which has constraint NOT NULL.
>
> Let's do a following query to the table.
>
> SELECT somecolumn FROM test WHERE somecolumn IS NULL;
>
> Result is empty result set which is obvious because any null value would
> be against the table constrain.
> The thing here is that postgresql does SeqScan to this table in order to
> find out if there is any null values.
>
> Explain:
> "Seq Scan on test (cost=0.00..1.06 rows=1 width=5)"
> " Filter: (somecolumn IS NULL)"
> "Planning time: 0.778 ms"
>
> SeqScan can be avoided by making index for "somecolumn" and indexing all
> the null values. That index would be empty and very fast but also very
> pointless since table constraint here is simple.
> No one would do such a query in real life but some programmatically
> generated queries does this kind of things. Only way I found to go
> around this problem was to create those empty indexies but I think the
> query optimizer could be smarter here.
>
> I took a look of the optimizer code and I didn't find any code which
> avoids this kind of situations. (I expect that it would be optimizer's
> task to find out this kind of things)
>
> I was thinking some feature for optimizer where the optimizer could add
> a hint for an executor if some query plan path leads to the empty result
> set case. If executor sees this hint it could avoid doing seqscan and
> actually even index scans. This kind of query constraint vs. table
> constraint comparison should be anyway cheaper process to execute than
> seqscan.
>
> The question is that, is there any reason why such an optimization phase
> could not be implemented? Another question is that how is the query
> engine handling the partitioned table case? Am i right that table
> partitions are solved by table constrains and indexies are used to
> validate which child table to look for? And so forth could this kind of
> new optimization phase benefit partitioned tables?

Actually, the planner can perform that optimization. The trick is called
"constraint exclusion". It is typically used for partitioning, where the
WHERE-clause restricts the query to a single partition, and you would
otherwise have to scan all the partitions. It is not usually a very
useful optimization, and it is somewhat expensive to check for that
case, so it is disabled by default except for partitioned tables. But if
you do "set constraint_exclusion=on", you will get the plan you're
looking for:

postgres=# set constraint_exclusion=on;
SET
postgres=# explain SELECT somecolumn FROM test WHERE somecolumn IS NULL;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=0)
One-Time Filter: false
Planning time: 0.071 ms
(3 rows)

- Heikki

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2014-05-12 14:41:12 Re: Proposal for CSN based snapshots
Previous Message Tom Lane 2014-05-12 14:38:59 Re: Select queries which violates table constrains