Re: inherit support for foreign tables

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: inherit support for foreign tables
Date: 2014-01-21 19:09:39
Message-ID: CA+TgmoZiskpRfmSZuY01Uh3qBqPRb3-UgXnS-euUO5cw9Yst3Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 20, 2014 at 9:44 PM, Shigeru Hanada
<shigeru(dot)hanada(at)gmail(dot)com> wrote:
> Thanks for the comments.
>
> 2014/1/21 KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>:
>>> In addition, an idea which I can't throw away is to assume that all
>>> constraints defined on foreign tables as ASSERTIVE. Foreign tables
>>> potentially have dangers to have "wrong" data by updating source data
>>> not through foreign tables. This is not specific to an FDW, so IMO
>>> constraints defined on foreign tables are basically ASSERTIVE. Of
>>> course PG can try to maintain data correct, but always somebody might
>>> break it.
>>> qu
>>>
>> Does it make sense to apply "assertive" CHECK constraint on the qual
>> of ForeignScan to filter out tuples with violated values at the local
>> side, as if row-level security feature doing.
>> It enables to handle a situation that planner expects only "clean"
>> tuples are returned but FDW driver is unavailable to anomalies.
>>
>> Probably, this additional check can be turned on/off on the fly,
>> if FDW driver has a way to inform the core system its capability,
>> like FDW_CAN_ENFORCE_CHECK_CONSTRAINT that informs planner to skip
>> local checks.
>
> Hmm, IIUC you mean that local users can't (or don't need to) know that
> data which violates the local constraints exist on remote side.
> Applying constraints to the data which is modified through FDW would
> be necessary as well. In that design, FDW is a bidirectional filter
> which provides these features:
>
> 1) Don't push wrong data into remote data source, by applying local
> constraints to the result of the modifying query executed on local PG.
> This is not perfect filter, because remote constraints don't mapped
> automatically or perfectly (imagine constraints which is available on
> remote but is not supported in PG).
> 2) Don't retrieve wrong data from remote to local PG, by applying
> local constraints
>
> I have a concern about consistency. It has not been supported, but
> let's think of Aggregate push-down invoked by a query below.
>
> SELECT count(*) FROM remote_table;
>
> If this query was fully pushed down, the result is the # of records
> exist on remote side, but the result would be # of valid records when
> we don't push down the aggregate. This would confuse users.
>
>>> Besides CHECK constraints, currently NOT NULL constraints are
>>> virtually ASSERTIVE (not enforcing). Should it also be noted
>>> explicitly?
>>>
>> Backward compatibility….
>
> Yep, backward compatibility (especially visible ones to users) should
> be minimal, ideally zero.
>
>> NOT NULL [ASSERTIVE] might be an option.
>
> Treating [ASSERTIVE | NOT ASSERTIVE] like DEFERRABLE, and allow
> ingASSERTIVE for only foreign tables? It makes sense, though we need
> consider exclusiveness . But It needs to default to ASSERTIVE on
> foreign tables, and NOT ASSERTIVE (means "forced") on others. Isn't
> is too complicated?
>
> CREATE FOREIGN TABLE foo (
> id int NOT NULL ASSERTIVE CHECK (id > 1) ASSERTIVE,
> …
> CONSTRAINT chk_foo_name_upper CHECK (upper(name) = name) ASSERTIVE
> ) SERVER server;
>
> BTW, I noticed that this is like push-down-able expressions in
> JOIN/WHERE. We need to check a CHECK constraint defined on a foreign
> tables contains only expressions which have same semantics as remote
> side (in practice, built-in and immutable)?

I don't think that that ASSERTIVE is going to fly, because "assertive"
means (sayeth the Google) "having or showing a confident and forceful
personality", which is not what we mean here. It's tempting to do
something like try to replace the keyword "check" with "assume" or
"assert" or (stretching) "assertion", but that would require whichever
one we picked to be a fully-reserved keyword, which I can't think is
going to get much support here, for entirely understandable reasons.
So I think we should look for another option.

Currently, constraints can be marked NO INHERIT (though this seems to
have not been fully documented, as the ALTER TABLE page doesn't
mention it anywhere) or NOT VALID, so I'm thinking maybe we should go
with something along those lines. Some ideas:

- NO CHECK. The idea of writing CHECK (id > 1) NO CHECK is pretty
hilarious, though.
- NO VALIDATE. But then people need to understand that NOT VALID
means "we didn't validate it yet" while "no validate" means "we don't
ever intend to validate it", which could be confusing.
- NO ENFORCE. Requires a new (probably unreserved) keyword.
- NOT VALIDATED or NOT CHECKED. Same problems as NO CHECK and NO
VALIDATE, respectively, plus now we have to create a new keyword.

Another idea is to apply an extensible-options syntax to constraints,
like we do for EXPLAIN, VACUUM, etc. Like maybe:

CHECK (id > 1) OPTIONS (enforced false, valid true)

Yet another idea is to consider validity a three-state property:
either the constraint is valid (because we have checked it and are
enforcing it), or it is not valid (because we are enforcing it but
have not checked the pre-existing data), or it is assumed true
(because we are not checking or enforcing it but are believing it
anyway). So then we could have a syntax like this:

CHECK (id > 1) VALIDATE { ON | OFF | ASSERTION }

Other ideas?

One thing that's bugging me a bit about this whole line of attack is
that, in the first instance, the whole goal here is to support
inheritance hierarchies that mix ordinary tables with foreign tables.
If you have a table with children some of which are inherited and
others of which are not inherited, you're very likely going to want
your constraints enforced for real on the children that are tables and
assumed true on the children that are foreign tables, and none of what
we're talking about here gets us to that, because we normally want the
constraints to be identical throughout the inheritance hierarchy.
Maybe there's some way around that, but I'm back to wondering if it
wouldn't be better to simply silently force any constraints on a
foreign-table into assertion mode. That could be done without any new
syntax at all, and frankly I think it's what people are going to want
more often than not.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2014-01-21 19:11:36 Re: Performance Improvement by reducing WAL for Update Operation
Previous Message Tom Lane 2014-01-21 19:06:31 Re: NOT Null constraint on foreign table not working