Re: IMPORT FOREIGN SCHEMA statement

From: David Fetter <david(at)fetter(dot)org>
To: Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: IMPORT FOREIGN SCHEMA statement
Date: 2014-05-25 19:41:18
Message-ID: 20140525194118.GB32355@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, May 23, 2014 at 10:08:06PM +0200, Ronan Dunklau wrote:
> Hello,
>
> Since my last proposal didn't get any strong rebuttal, please find attached a
> more complete version of the IMPORT FOREIGN SCHEMA statement.

Thanks!

Please to send future patches to this thread so people can track them
in their mail.

> I tried to follow the SQL-MED specification as closely as possible.
>
> This adds discoverability to foreign servers. The structure of the statement
> as I understand it is simple enough:
>
> IMPORT FOREIGN SCHEMA remote_schema FROM SERVER some_server [ (LIMIT TO |
> EXCEPT) table_list ] INTO local_schema.
>
> The import_foreign_schema patch adds the infrastructure, and a new FDW
> routine:
>
> typedef List *(*ImportForeignSchema_function) (ForeignServer *server,
> ImportForeignSchemaStmt * parsetree);
>
> This routine must return a list of CreateForeignTableStmt mirroring whatever
> tables were found on the remote side, which will then be executed.
>
> The import_foreign_schema_postgres_fdw patch proposes an implementation of
> this API for postgres_fdw. It will import a foreign schema using the right
> types as well as nullable information.

In the case of PostgreSQL, "the right types" are obvious until there's
a user-defined one. What do you plan to do in that case?

> Regarding documentation, I don't really know where it should have been put. If
> I missed something, let me know and I'll try to correct it.

It's not exactly something you missed, but I need to bring it up
anyway before we go too far. The standard missed two crucial concepts
when this part of it was written:

1. No single per-database-type universal type mapping can be correct.

People will have differing goals for type mapping, and writing a whole
new FDW for each of those goals is, to put it mildly, wasteful. I
will illustrate with a concrete and common example.

MySQL's datetime type encourages usages which PostgreSQL's
corresponding type, timestamptz, simply disallows, namely '0000-00-00
00:00:00' as its idea of UNKNOWN or NULL.

One way PostgreSQL's mapping could work is to map it to TEXT, which
would preserve the strings exactly and be in some sense an identity
map. It would also make the type somewhat useless in its original
intended form.

Another one would map the type is to a composite data type
mysql_datetime(tstz timestamptz, is_wacky boolean) which would
capture, for example, ('2014-04-01 00:00:00+00', false) for the UTC
start of April Fools' Day this year, and (NULL, true) for '0000-00-00
00:00:00'.

There are doubtless others, and there is no principled way to assign
any one of them as universally correct.

This brings me to the next crucial concept the standard missed:

2. The correct mapping may not be the identity, and furthermore, the
inbound and outbound mappings might in general not be mere inversions
of each other.

MySQL (no aspersions intended) again provides a concrete example with
its unsigned integer types. Yes, it's possible to create a domain
over INT8 which simulates UINT4, a domain over NUMERIC which simulates
UINT8, etc., but again this process's correctness depends on
circumstances.

To address these problems, I propose the following:

- We make type mappings settable at the level of:
- FDW
- Instance (a.k.a. cluster)
- Database
- Schema
- Table
- Column

using the existing ALTER command and some way of spelling out how
a remote type maps to a local type.

This would consist of:
- The remote type
- The local type to which it maps
- The inbound transformation (default identity)
- The outbound transformation (default identity)

At any given level, the remote type would need to be unique. To
communicate this to the system, we either invent new syntax, with
all the hazards attendant thereto, or we could use JSON or similar
serialization.

ALTER FOREIGN TABLE foo
ADD TYPE MAPPING
FROM "datetime"
TO TEXT
WITH (
INBOUND TRANSFORMATION IDENTITY,
OUTBOUND TRANSFORMATION IDENTITY
) /* Ugh!!! */

vs.

ALTER FOREIGN TABLE foo ADD (mapping '{
"datetime": "text",
"inbound": "IDENTITY",
outbound: "IDENTITY"
}')

Each FDW would have some set of default mappings and some way to
override them as above.

What say?

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-05-25 20:02:24 Re: Shared memory changes in 9.4?
Previous Message Maciek Sakrejda 2014-05-25 19:26:20 Shared memory changes in 9.4?