Re: per-column generic option

Lists: pgsql-hackers
From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: per-column generic option
Date: 2011-06-14 08:56:05
Message-ID: 4DF72225.2010609@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

I would like to propose support for per-column generic option, which is
defined in the SQL/MED standard. In 9.0 release, support for foreign
tables and per-table generic option have been added, but support for
per-column generic option hasn't.

Please examine the description below and attached patch
per_column_option_v1.patch. Any comments or questions are welcome.

Possible use cases
~~~~~~~~~~~~~~~~~~
Purpose of per-column generic option is passing column-specific settings
to the foreign-data wrapper.

1) For file_fdw, per-column generic option can be used to represent
per-column COPY option FORCE_NOT_NULL with boolean value (currently
file_fdw doesn't support FORCE_NOT_NULL option).

2) For postgresql_fdw (even though it has not been implemented yet),
per-column generic option could be used to represent the name of the
column on the foreign side. It is similar to per-table generic option
such as "nspname" and "relname" for namespace name/relation name,
proposed in the last development cycles. Such option would be named
"attname" after pg_attribute.attname.

Catalog design
~~~~~~~~~~~~~~
This proposal requires changing some catalogs.

1) To store per-column generic options, new attribute attfdwoptions
(text[]) was added at tail of pg_attribute. This is similar to the
generic option of other FDW objects such as FDW, server, user mapping
and foreign table. Existing attribute attoptions is not used for
generic options.

2) To conform the SQL/MED standard, an information_schema view
COLUMN_OPTIONS was added. Also underlying view
_pg_foreign_table_columns was added to show only columns which current
user has any access privilege. This fashion is same as other FDW views.

Syntax design
~~~~~~~~~~~~~
Per-column generic options can be operated via CREATE FOREIGN TABLE
statement and ALTER FOREIGN TABLE statement. Similar to other generic
options, ADD/SET/DROP can be specified for ALTER FOREIGN TABLE.

1) In CREATE FOREIGN TABLE statement, per-column generic options can be
specified in a column definition without operation qualifier such as
SET, ADD and DROP; all options are treated as ADD. Similar to other FDW
objects, multiple options can be specified for one column by separating
option-value pairs with comma.

-- multiple options can be specified for one column at once
CREATE FOREIGN TABLE foo (
c1 int OPTIONS (opt1 'value1'),
c2 text OPTIONS (opt2 'values2', opt3 'value3'),
c3 date OPTIONS (opt4 'value4) NOT NULL
) SERVER server;

To avoid syntax conflict between "OPTIONS (...)" and "DEFAULT b_expr"
(b_expr can end with a token "OPTION"), I placed OPTIONS (...) between
data type and any other column qualifier such as default values and
constraints.

The SQL/MED standard doesn't consider any column qualifier other than
data type, so it defines the syntax simply as below. I think new syntax
conforms the standard...

CREATE FOREIGN TABLE foo (
{ column_name data_type [ OPTIONS ( option 'value' [, ...] ) ] }
[, ... ]
) SERVER server [ OPTIONS (...) ]

Please note that CREATE FOREIGN TABLE shares the columnDef, a syntax
element for a column definition, with CREATE TABLE. I thought that they
should so, and I didn't introduce separated syntax for foreign tables.

2) Similar to other FDW objects' ALTER statement, ALTER FOREIGN TABLE
ALTER COLUMN accepts ADD/SET/DROP operation for each option. DROP
requires only option name.

ALTER FOREIGN TABLE foo ALTER COLUMN c1
OPTIONS (SET opt1 'VALUE1'); -- should be set in advance
ALTER FOREIGN TABLE foo ALTER COLUMN c1
OPTIONS (ADD opt2 'VALUE1', DROP opt1);

Similar to other ALTER FOREIGN TABLE commands, ALTER COLUMN ... OPTIONS
(...) can be contained in the list of ALTER commands.

ALTER FOREIGN TABLE foo
ALTER COLUMN col1 OPTIONS (opt1 'val1'),
ALTER COLUMN col2 SET NOT NULL;

psql support
~~~~~~~~~~~~
1) psql should support describing per-column generic options, so \dec
command was added. If the form \dec+ is used, generic options are also
displayed. Output sample is:

postgres=# \dec csv_branches
List of foreign table columns
Schema | Table | Column
--------+--------------+----------
public | csv_branches | bid
public | csv_branches | bbalance
public | csv_branches | filler
(3 rows)

postgres=# \dec+ csv_branches
List of foreign table columns
Schema | Table | Column | Options
--------+--------------+----------+------------------------
public | csv_branches | bid | {force_not_null=false}
public | csv_branches | bbalance | {force_not_null=true}
public | csv_branches | filler |
(3 rows)

Here I found an inconsistency about privilege to see generic options
(not only column but also FDW and server et al). The
information_schema.*_options only shows options which are associated to
objects that current user can access, but \de*+ doesn't have such
restriction. \de* commands should be fixed to hide forbidden objects?

2) psql can support tab-completion CREATE/ALTER FOREIGN TABLE statement
about OPTIONS, but the patch doesn't include this feature.

pg_dump support
~~~~~~~~~~~~~~~
Sorry, I overlooked this issue till writing this post... I'm going to
work on this and post revised patch soon. Please examine other parts first.

Documents
~~~~~~~~~
1) Is "generic options" proper term to mean FDW-specific option
associated to a FDW object? It's used in the SQL/MED standard, but
seems not popular... "FDW option" would be better than "generic option"?

Regards,
--
Shigeru Hanada

Attachment Content-Type Size
per_column_option_v1.patch text/plain 36.2 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-06-14 12:20:42
Message-ID: BANLkTin+Q1vczEtxb5Joz+65b9AY6Mwqtw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I haven't looked at the patch yet, but here are a few comments on the
design, which overall looks good.

2011/6/14 Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>:
> 1) psql should support describing per-column generic options, so \dec
> command was added.  If the form \dec+ is used, generic options are also
> displayed.  Output sample is:

I would not add a new backslash command for this - it's unlikely to be
useful to see this information across all tables. It would be more
helpful to somehow (not sure of the details) incorporate this into the
output of running \d on a foreign table.

> Here I found an inconsistency about privilege to see generic options
> (not only column but also FDW and server et al).  The
> information_schema.*_options only shows options which are associated to
> objects that current user can access, but \de*+ doesn't have such
> restriction.  \de* commands should be fixed to hide forbidden objects?

It's less important whether \de* is consistent with information_schema
in this regard than it is whether it is consistent with other psql
backslash commands, e.g. \dv or \db or \dC. AFAIK those commands do
not filter by privilege.

> 1) Is "generic options" proper term to mean FDW-specific option
> associated to a FDW object?  It's used in the SQL/MED standard, but
> seems not popular...  "FDW option" would be better than "generic option"?

I think FDW option is much clearer.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-06-15 08:57:33
Message-ID: 4DF873FD.7020105@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

(2011/06/14 21:20), Robert Haas wrote:
> I haven't looked at the patch yet, but here are a few comments on the
> design, which overall looks good.

Thanks for the review. Please find attached a revised patch.

In addition to responding to your comments, I also added pg_dump
support. Now pg_dump dumps per-column generic options with ALTER
FOREIGN TABLE ALTER COLUMN statement just after CREATE FOREIGN TABLE.
Once I've though to dump them in each column definition of a CREATE
FOREIGN TABLE statement, but that seems to makes the statement too complex.

> 2011/6/14 Shigeru Hanada<shigeru(dot)hanada(at)gmail(dot)com>:
>> 1) psql should support describing per-column generic options, so \dec
>> command was added. If the form \dec+ is used, generic options are also
>> displayed. Output sample is:
>
> I would not add a new backslash command for this - it's unlikely to be
> useful to see this information across all tables. It would be more
> helpful to somehow (not sure of the details) incorporate this into the
> output of running \d on a foreign table.

Hm, belatedly I noticed that relation-kind-specific column are added
preceding to verbose-only columns such as expression for indexes and
column values for sequences. It seems suitable place to show per-column
generic options. Please see attached "desc_results.txt" as sample.

I also noticed that relation-kind-specific information are not mentioned
in any document (at least in the section of psql[1]), even about
existing ones such as sequence values and index definition. I also
added short brief of them to psql document.

BTW, while working around \d command, I noticed that we can avoid
variable width (# of columns) query result, which is used to fetch
column information, with using NULL as placeholder (and it has already
been used partially). I think that it would enhance maintainability
little, so I've separated this fix to another patch
avoid_variable_width_result.patch. The main patch
per_column_option_v2.patch assumes that this fix has been applied.

[1] http://developer.postgresql.org/pgdocs/postgres/app-psql.html

>> Here I found an inconsistency about privilege to see generic options
>> (not only column but also FDW and server et al). The
>> information_schema.*_options only shows options which are associated to
>> objects that current user can access, but \de*+ doesn't have such
>> restriction. \de* commands should be fixed to hide forbidden objects?
>
> It's less important whether \de* is consistent with information_schema
> in this regard than it is whether it is consistent with other psql
> backslash commands, e.g. \dv or \db or \dC. AFAIK those commands do
> not filter by privilege.

Agreed, I'll leave \de* to show results unconditionally.

>> 1) Is "generic options" proper term to mean FDW-specific option
>> associated to a FDW object? It's used in the SQL/MED standard, but
>> seems not popular... "FDW option" would be better than "generic option"?
>
> I think FDW option is much clearer.

So do I, but I didn't touch them because "generic option" appears in
many documents, source files including comments and psql's \d* output.
Most of them have been there since 8.4. Is it acceptable to change them
to "FDW option", at least for only documents?

OTOH, psql's \d* commands use "Options" as column header of FDW options
and reloptions. I also left them because I thought that this would not
cause misunderstanding.

Regards,
--
Shigeru Hanada

Attachment Content-Type Size
desc_results.txt text/plain 929 bytes
avoid_variable_width_result.patch text/plain 2.0 KB
per_column_option_v2.patch text/plain 42.1 KB

From: David Fetter <david(at)fetter(dot)org>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-06-16 23:44:58
Message-ID: 20110616234458.GD22300@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Jun 14, 2011 at 05:56:05PM +0900, Shigeru Hanada wrote:
> Hi,
>
> I would like to propose support for per-column generic option, which
> is defined in the SQL/MED standard. In 9.0 release, support for
> foreign tables and per-table generic option have been added, but
> support for per-column generic option hasn't.
>
> Please examine the description below and attached patch
> per_column_option_v1.patch. Any comments or questions are welcome.

Sorry not to respond sooner.

First, the per-column generic options are a great thing for us to
have. :)

I have an idea I've been using for the next release of DBI-Link that
has varying levels of data type mapping. In general, these mappings
would be units of executable code, one in-bound, and one out-bound,
for each of:

Universe (everything, default "mapping" is the identity map, i.e. a no-op)
Database type (e.g. MySQL)
Instance (e.g. mysql://foo.bar.com:5432)
Database
Schema
Table
Column

I didn't include row in the hierarchy because I couldn't think of a
way to identify rows across DBMSs and stable over time.

The finest-grain transformation that's been set would be the one
actually used.

Here's an example of a non-trivial mapping.

Database type:
MySQL
Foreign data type:
datetime
PostgreSQL data type:
timestamptz
Transformation direction:
Import
Transformation:
CASE
WHEN DATA = '0000-00-00 00:00:00'
THEN NULL
ELSE DATA
END

Here, I'm making the simplifying assumption that there is a bijective
mapping between data types.

Is there some way to fit the per-column part of such a mapping into
this scheme? We'd need to do some dependency tracking in order to be
able to point to the appropriate code...

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-06-17 10:19:39
Message-ID: 4DFB2A3B.8050101@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

(2011/06/17 8:44), David Fetter wrote:
> Sorry not to respond sooner.
>
> First, the per-column generic options are a great thing for us to
> have. :)

Thanks for the comments. :-)

> I have an idea I've been using for the next release of DBI-Link that
> has varying levels of data type mapping. In general, these mappings
> would be units of executable code, one in-bound, and one out-bound,
> for each of:
>
> Universe (everything, default "mapping" is the identity map, i.e. a no-op)
> Database type (e.g. MySQL)
> Instance (e.g. mysql://foo.bar.com:5432)
> Database
> Schema
> Table
> Column

Some of them seem to be able to be mapped to FDW object, e.g. Database
to SERVER and Table to FOREIGN TABLE.

> I didn't include row in the hierarchy because I couldn't think of a
> way to identify rows across DBMSs and stable over time.
>
> The finest-grain transformation that's been set would be the one
> actually used.
>
> Here's an example of a non-trivial mapping.
>
> Database type:
> MySQL
> Foreign data type:
> datetime
> PostgreSQL data type:
> timestamptz
> Transformation direction:
> Import
> Transformation:
> CASE
> WHEN DATA = '0000-00-00 00:00:00'
> THEN NULL
> ELSE DATA
> END
>
> Here, I'm making the simplifying assumption that there is a bijective
> mapping between data types.
>
> Is there some way to fit the per-column part of such a mapping into
> this scheme? We'd need to do some dependency tracking in order to be
> able to point to the appropriate code...

IIUC, you are talking about using FDW options as storage of data type
mapping setting, or mapping definition itself, right? If so, a foreign
table needs to be created to use per-column FDW options. Does it suit
to your idea?

BTW, I couldn't get what you mean by "dependency tracking". You mean
the dependency between foreign column and local column? It might
include essence of your idea... Would you explain the detail?

Regards,
--
Shigeru Hanada


From: David Fetter <david(at)fetter(dot)org>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-06-17 12:59:31
Message-ID: 20110617125931.GB7628@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jun 17, 2011 at 07:19:39PM +0900, Shigeru Hanada wrote:
> (2011/06/17 8:44), David Fetter wrote:
> > Sorry not to respond sooner.
> >
> > First, the per-column generic options are a great thing for us to
> > have. :)
>
> Thanks for the comments. :-)
>
> > I have an idea I've been using for the next release of DBI-Link that
> > has varying levels of data type mapping. In general, these mappings
> > would be units of executable code, one in-bound, and one out-bound,
> > for each of:
> >
> > Universe (everything, default "mapping" is the identity map, i.e. a no-op)
> > Database type (e.g. MySQL)
> > Instance (e.g. mysql://foo.bar.com:5432)
> > Database
> > Schema
> > Table
> > Column
>
> Some of them seem to be able to be mapped to FDW object, e.g. Database
> to SERVER and Table to FOREIGN TABLE.

Yes, I see there are a few missing. "Universe" doesn't really need
much of anything, as far as I can tell, except if we wanted to do
something that affected SQL/MED globally. Is that hierarchy otherwise
OK? DB2 may have one more level between Instance and Database Type,
that latter being the province of an individual FDW.

> > I didn't include row in the hierarchy because I couldn't think of a
> > way to identify rows across DBMSs and stable over time.
> >
> > The finest-grain transformation that's been set would be the one
> > actually used.
> >
> > Here's an example of a non-trivial mapping.
> >
> > Database type:
> > MySQL
> > Foreign data type:
> > datetime
> > PostgreSQL data type:
> > timestamptz
> > Transformation direction:
> > Import
> > Transformation:
> > CASE
> > WHEN DATA = '0000-00-00 00:00:00'
> > THEN NULL
> > ELSE DATA
> > END
> >
> > Here, I'm making the simplifying assumption that there is a bijective
> > mapping between data types.
> >
> > Is there some way to fit the per-column part of such a mapping into
> > this scheme? We'd need to do some dependency tracking in order to be
> > able to point to the appropriate code...
>
> IIUC, you are talking about using FDW options as storage of data
> type mapping setting, or mapping definition itself, right? If so, a
> foreign table needs to be created to use per-column FDW options.
> Does it suit to your idea?

Yes. The only mildly disturbing thing about how that would work is
that "magic" key names would actually point to executable code, so
there would be some kind of non-uniform processing of the options, and
(possibly quite unlikely) ways to escalate privilege.

> BTW, I couldn't get what you mean by "dependency tracking". You
> mean the dependency between foreign column and local column? It
> might include essence of your idea... Would you explain the detail?

I think the dependency between the mapping between the foreign column
and the local one is already handled. On that subject, it's possible
to make an argument that this mapping might need to be expanded so
that in general, M foreign columns map to N local ones (distinct M and
N), but that's a research topic, so let's not worry about it now.

The dependency tracking I have in mind is of the actual executable
code. If the inbound mapping has what amounts to a pointer to a
function, it shouldn't be possible to drop that function without
CASCADE, and if we're caching such functions, the cache needs to be
refreshed any time the function changes.

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-06-26 09:34:07
Message-ID: BANLkTikaHdN4je-sH4NvbzGNPCs4tFtUZw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I checked your patch.

The backend portion seems to me OK, but I have several questions/comments.

* This patch should be rebased.
It conflicts with the latest bin/psql/describe.c and
include/catalog/catversion.h.
IIRC, we should not touch catversion.h in submission stage. (It might
be a task of
committer when a patch get upstreamed.)

* It might be an option to extend attreloptions, instead of the new
attfdwoptions.
Although I didn't track the discussion when pg_foreign_table catalog
that provides
relation level fdw-options, was it impossible or unreasonable to extend existing
design of reloptions/attoptions?
Right now, it accepts only hard-wired options listed at reloptions.c.
But, it seems
to me worthwhile, if it could accept options validated by loadable modules.

* pg_dump shall die when we run it for older postgresql version.

This patch does not modify queries to older postgresql version at
getTableAttrs().
In the result, this index shall be set by -1.
+ i_attfdwoptions = PQfnumber(res, "attfdwoptions");

Then, PGgetvalue() returns NULL for unranged column number, and strdup()
shall cause segmentation fault.
+ tbinfo->attfdwoptions[j] = strdup(PQgetvalue(res, j,
i_attfdwoptions));

In fact, I tried to run the patched pg_dump towards v9.0.2
[kaigai(at)vmlinux pg_dump]$ ./pg_dump postgres
pg_dump: column number -1 is out of range 0..14
Segmentation fault

My recommendation is to append "NULL as attfdwoptions" on the queries to
older versions. It eventually makes PGgetvalue() to return an empty string,
then strdup() does not cause a problem.

Thanks,

2011年6月15日10:57 Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>:
> (2011/06/14 21:20), Robert Haas wrote:
>> I haven't looked at the patch yet, but here are a few comments on the
>> design, which overall looks good.
>
> Thanks for the review. Please find attached a revised patch.
>
> In addition to responding to your comments, I also added pg_dump
> support. Now pg_dump dumps per-column generic options with ALTER
> FOREIGN TABLE ALTER COLUMN statement just after CREATE FOREIGN TABLE.
> Once I've though to dump them in each column definition of a CREATE
> FOREIGN TABLE statement, but that seems to makes the statement too complex.
>
>> 2011/6/14 Shigeru Hanada<shigeru(dot)hanada(at)gmail(dot)com>:
>>> 1) psql should support describing per-column generic options, so \dec
>>> command was added. If the form \dec+ is used, generic options are also
>>> displayed. Output sample is:
>>
>> I would not add a new backslash command for this - it's unlikely to be
>> useful to see this information across all tables. It would be more
>> helpful to somehow (not sure of the details) incorporate this into the
>> output of running \d on a foreign table.
>
> Hm, belatedly I noticed that relation-kind-specific column are added
> preceding to verbose-only columns such as expression for indexes and
> column values for sequences. It seems suitable place to show per-column
> generic options. Please see attached "desc_results.txt" as sample.
>
> I also noticed that relation-kind-specific information are not mentioned
> in any document (at least in the section of psql[1]), even about
> existing ones such as sequence values and index definition. I also
> added short brief of them to psql document.
>
> BTW, while working around \d command, I noticed that we can avoid
> variable width (# of columns) query result, which is used to fetch
> column information, with using NULL as placeholder (and it has already
> been used partially). I think that it would enhance maintainability
> little, so I've separated this fix to another patch
> avoid_variable_width_result.patch. The main patch
> per_column_option_v2.patch assumes that this fix has been applied.
>
> [1] http://developer.postgresql.org/pgdocs/postgres/app-psql.html
>
>>> Here I found an inconsistency about privilege to see generic options
>>> (not only column but also FDW and server et al). The
>>> information_schema.*_options only shows options which are associated to
>>> objects that current user can access, but \de*+ doesn't have such
>>> restriction. \de* commands should be fixed to hide forbidden objects?
>>
>> It's less important whether \de* is consistent with information_schema
>> in this regard than it is whether it is consistent with other psql
>> backslash commands, e.g. \dv or \db or \dC. AFAIK those commands do
>> not filter by privilege.
>
> Agreed, I'll leave \de* to show results unconditionally.
>
>>> 1) Is "generic options" proper term to mean FDW-specific option
>>> associated to a FDW object? It's used in the SQL/MED standard, but
>>> seems not popular... "FDW option" would be better than "generic option"?
>>
>> I think FDW option is much clearer.
>
> So do I, but I didn't touch them because "generic option" appears in
> many documents, source files including comments and psql's \d* output.
> Most of them have been there since 8.4. Is it acceptable to change them
> to "FDW option", at least for only documents?
>
> OTOH, psql's \d* commands use "Options" as column header of FDW options
> and reloptions. I also left them because I thought that this would not
> cause misunderstanding.
>
> Regards,
> --
> Shigeru Hanada
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>


From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-06-27 14:47:51
Message-ID: 4E089817.7070001@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

(2011/06/26 18:34), Kohei KaiGai wrote:
> I checked your patch.

Thanks for the review! Please find attached a revised patch.

> The backend portion seems to me OK, but I have several questions/comments.
>
> * This patch should be rebased.
> It conflicts with the latest bin/psql/describe.c and
> include/catalog/catversion.h.
> IIRC, we should not touch catversion.h in submission stage. (It might
> be a task of
> committer when a patch get upstreamed.)

I've rebased against current HEAD, and reverted catversion.h.

> * It might be an option to extend attreloptions, instead of the new
> attfdwoptions.
> Although I didn't track the discussion when pg_foreign_table catalog
> that provides
> relation level fdw-options, was it impossible or unreasonable to extend existing
> design of reloptions/attoptions?
> Right now, it accepts only hard-wired options listed at reloptions.c.
> But, it seems
> to me worthwhile, if it could accept options validated by loadable modules.

IIRC someone has objected against storing FDW options in
reloptions/attoptions, but I couldn't find such post. I'll follow the
discussion again.

IMHO, though at present I don't have clear proof, separating FDW options
from access method options seems better than merging them, but I should
learn more about AM mechanism to clarify this issue. Please check other
issues first.

> * pg_dump shall die when we run it for older postgresql version.
>
> This patch does not modify queries to older postgresql version at
> getTableAttrs().
> In the result, this index shall be set by -1.
> + i_attfdwoptions = PQfnumber(res, "attfdwoptions");
>
> Then, PGgetvalue() returns NULL for unranged column number, and strdup()
> shall cause segmentation fault.
> + tbinfo->attfdwoptions[j] = strdup(PQgetvalue(res, j,
> i_attfdwoptions));
>
> In fact, I tried to run the patched pg_dump towards v9.0.2
> [kaigai(at)vmlinux pg_dump]$ ./pg_dump postgres
> pg_dump: column number -1 is out of range 0..14
> Segmentation fault
>
> My recommendation is to append "NULL as attfdwoptions" on the queries to
> older versions. It eventually makes PGgetvalue() to return an empty string,
> then strdup() does not cause a problem.

Fixed in the way you've recommended, and tested against 8.4. I should
have noticed that same technique is used in some other places...

BTW, I also have found an unnecessary FIXME comment and removed it.
Please see the line 2845 of src/backend/catalog/heap.c
(InsertPgAttributeTuple) for the correction.

Regards,
--
Shigeru Hanada

Attachment Content-Type Size
per_column_option_v3.patch text/plain 46.7 KB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-06-27 18:05:23
Message-ID: BANLkTi=LtNQCK3ARa4_aGzfE+Pzhw0AfVQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

2011/6/27 Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>:
>> * It might be an option to extend attreloptions, instead of the new
>> attfdwoptions.
>> Although I didn't track the discussion when pg_foreign_table catalog
>> that provides
>> relation level fdw-options, was it impossible or unreasonable to extend existing
>> design of reloptions/attoptions?
>> Right now, it accepts only hard-wired options listed at reloptions.c.
>> But, it seems
>> to me worthwhile, if it could accept options validated by loadable modules.
>
> IIRC someone has objected against storing FDW options in
> reloptions/attoptions, but I couldn't find such post.  I'll follow the
> discussion again.

I think they should definitely be separate.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: David Fetter <david(at)fetter(dot)org>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-06-27 23:42:20
Message-ID: 20110627234220.GB11795@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, Jun 17, 2011 at 05:59:31AM -0700, David Fetter wrote:
> On Fri, Jun 17, 2011 at 07:19:39PM +0900, Shigeru Hanada wrote:

> > > Here's an example of a non-trivial mapping.
> > >
> > > Database type:
> > > MySQL
> > > Foreign data type:
> > > datetime
> > > PostgreSQL data type:
> > > timestamptz
> > > Transformation direction:
> > > Import
> > > Transformation:
> > > CASE
> > > WHEN DATA = '0000-00-00 00:00:00'
> > > THEN NULL
> > > ELSE DATA
> > > END
> > >
> > > Here, I'm making the simplifying assumption that there is a bijective
> > > mapping between data types.

Any word on this?

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: David Fetter <david(at)fetter(dot)org>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-06-28 12:06:40
Message-ID: 4E09C3D0.9050802@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Sorry for the long delay...

(2011/06/17 21:59), David Fetter wrote:
> On Fri, Jun 17, 2011 at 07:19:39PM +0900, Shigeru Hanada wrote:
>> (2011/06/17 8:44), David Fetter wrote:
>>> Sorry not to respond sooner.
>>>
>>> First, the per-column generic options are a great thing for us to
>>> have. :)
>>
>> Thanks for the comments. :-)
>>
>>> I have an idea I've been using for the next release of DBI-Link that
>>> has varying levels of data type mapping. In general, these mappings
>>> would be units of executable code, one in-bound, and one out-bound,
>>> for each of:
>>>
>>> Universe (everything, default "mapping" is the identity map, i.e. a no-op)
>>> Database type (e.g. MySQL)
>>> Instance (e.g. mysql://foo.bar.com:5432)
>>> Database
>>> Schema
>>> Table
>>> Column
>>
>> Some of them seem to be able to be mapped to FDW object, e.g. Database
>> to SERVER and Table to FOREIGN TABLE.
>
> Yes, I see there are a few missing. "Universe" doesn't really need
> much of anything, as far as I can tell, except if we wanted to do
> something that affected SQL/MED globally. Is that hierarchy otherwise
> OK?

Yes, maybe some levels in your hierarchy can be mapped to SQL/MED
objects, and you can store options with them.

Universe : N/A (I'm not sure but custom GUC might suit for this)
Database type : FOREIGN DATA WRAPPER
Instance : N/A
Database : SERVER
Schema : N/A
Table : FOREIGN TABLE
Column : column of FOREIGN TABLE(WIP)

> DB2 may have one more level between Instance and Database Type,
> that latter being the province of an individual FDW.

I'm not familiar with DB2, but it would be difficult to map such level
to one of existing SQL/MED object types.

>>> I didn't include row in the hierarchy because I couldn't think of a
>>> way to identify rows across DBMSs and stable over time.
>>>
>>> The finest-grain transformation that's been set would be the one
>>> actually used.

Yeah, I think it's generally convenient for users if a FDW allows to
override settings which were defined for object on upper level.
For instance, if I'm dealing many files which have same format, and if
we could set "format" option for file_fdw on the server, all I have to
do for each foreign table is to specify "filename". I think that that's
usual use case.

>>> Here's an example of a non-trivial mapping.
>>>
>>> Database type:
>>> MySQL
>>> Foreign data type:
>>> datetime
>>> PostgreSQL data type:
>>> timestamptz
>>> Transformation direction:
>>> Import
>>> Transformation:
>>> CASE
>>> WHEN DATA = '0000-00-00 00:00:00'
>>> THEN NULL
>>> ELSE DATA
>>> END
>>>
>>> Here, I'm making the simplifying assumption that there is a bijective
>>> mapping between data types.
>>>
>>> Is there some way to fit the per-column part of such a mapping into
>>> this scheme? We'd need to do some dependency tracking in order to be
>>> able to point to the appropriate code...
>>
>> IIUC, you are talking about using FDW options as storage of data
>> type mapping setting, or mapping definition itself, right? If so, a
>> foreign table needs to be created to use per-column FDW options.
>> Does it suit to your idea?
>
> Yes. The only mildly disturbing thing about how that would work is
> that "magic" key names would actually point to executable code, so
> there would be some kind of non-uniform processing of the options, and
> (possibly quite unlikely) ways to escalate privilege.

How are you planning to define a mapping for an object other than
column? ISTM that you need to combine N mappings for such object, N is
the number of distinct types used under the level, so FDW seems to have
to cover all kind of transformation. Maybe you need to retrieve options
from lower level to upper level, until you find one which is suitable
for the combination of types.

>> BTW, I couldn't get what you mean by "dependency tracking". You
>> mean the dependency between foreign column and local column? It
>> might include essence of your idea... Would you explain the detail?
>
> I think the dependency between the mapping between the foreign column
> and the local one is already handled. On that subject, it's possible
> to make an argument that this mapping might need to be expanded so
> that in general, M foreign columns map to N local ones (distinct M and
> N), but that's a research topic, so let's not worry about it now.
>
> The dependency tracking I have in mind is of the actual executable
> code. If the inbound mapping has what amounts to a pointer to a
> function, it shouldn't be possible to drop that function without
> CASCADE, and if we're caching such functions, the cache needs to be
> refreshed any time the function changes.

Agreed, such dependency would have to be maintained by the system.
Dependencies from column to FDW (through foreign table and server) have
been managed with pg_depend. Cache invalidation would be need to be
implemented by dbi-link.

Current dependency graph about SQL/MED objects is:

column -> foreign table ----> server -> FDW
user mapping _/

VALIDATOR function might be able to be used to maintain pg_depend
entries when options are set/changed/dropped via CREATE/ALTER, though
it's not main purpose of VALIDATOR.

regards,
--
Shigeru Hanada


From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-03 09:50:53
Message-ID: CADyhKSXE_Ze6uprbw2ixn25N5O3vLV1oWbJDpe8iGe1Ftei=cg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hanada-san,

I checked the per-column generic option patch.
Right now, I have nothing to comment on anymore.
So, it should be reviewed by committers.

Thanks,

2011年6月27日16:47 Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>:
> (2011/06/26 18:34), Kohei KaiGai wrote:
>> I checked your patch.
>
> Thanks for the review! Please find attached a revised patch.
>
>> The backend portion seems to me OK, but I have several questions/comments.
>>
>> * This patch should be rebased.
>> It conflicts with the latest bin/psql/describe.c and
>> include/catalog/catversion.h.
>> IIRC, we should not touch catversion.h in submission stage. (It might
>> be a task of
>> committer when a patch get upstreamed.)
>
> I've rebased against current HEAD, and reverted catversion.h.
>
>> * It might be an option to extend attreloptions, instead of the new
>> attfdwoptions.
>> Although I didn't track the discussion when pg_foreign_table catalog
>> that provides
>> relation level fdw-options, was it impossible or unreasonable to extend existing
>> design of reloptions/attoptions?
>> Right now, it accepts only hard-wired options listed at reloptions.c.
>> But, it seems
>> to me worthwhile, if it could accept options validated by loadable modules.
>
> IIRC someone has objected against storing FDW options in
> reloptions/attoptions, but I couldn't find such post. I'll follow the
> discussion again.
>
> IMHO, though at present I don't have clear proof, separating FDW options
> from access method options seems better than merging them, but I should
> learn more about AM mechanism to clarify this issue. Please check other
> issues first.
>
>> * pg_dump shall die when we run it for older postgresql version.
>>
>> This patch does not modify queries to older postgresql version at
>> getTableAttrs().
>> In the result, this index shall be set by -1.
>> + i_attfdwoptions = PQfnumber(res, "attfdwoptions");
>>
>> Then, PGgetvalue() returns NULL for unranged column number, and strdup()
>> shall cause segmentation fault.
>> + tbinfo->attfdwoptions[j] = strdup(PQgetvalue(res, j,
>> i_attfdwoptions));
>>
>> In fact, I tried to run the patched pg_dump towards v9.0.2
>> [kaigai(at)vmlinux pg_dump]$ ./pg_dump postgres
>> pg_dump: column number -1 is out of range 0..14
>> Segmentation fault
>>
>> My recommendation is to append "NULL as attfdwoptions" on the queries to
>> older versions. It eventually makes PGgetvalue() to return an empty string,
>> then strdup() does not cause a problem.
>
> Fixed in the way you've recommended, and tested against 8.4. I should
> have noticed that same technique is used in some other places...
>
> BTW, I also have found an unnecessary FIXME comment and removed it.
> Please see the line 2845 of src/backend/catalog/heap.c
> (InsertPgAttributeTuple) for the correction.
>
> Regards,
> --
> Shigeru Hanada
>
>

--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>


From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-04 01:17:06
Message-ID: 4E111492.3000005@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

(2011/07/03 18:50), Kohei KaiGai wrote:
> I checked the per-column generic option patch.
> Right now, I have nothing to comment on anymore.
> So, it should be reviewed by committers.

Thanks for the review!.

Regards,
--
Shigeru Hanada


From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-04 12:03:22
Message-ID: 4E11AC0A.5040005@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

(2011/07/04 10:17), Shigeru Hanada wrote:
> (2011/07/03 18:50), Kohei KaiGai wrote:
>> I checked the per-column generic option patch.
>> Right now, I have nothing to comment on anymore.
>> So, it should be reviewed by committers.
>
> Thanks for the review!.

I would like to propose adding force_not_null support to file_fdw, as
the first use case of per-column FDW option. Attached patch, which
assumes that per_column_option_v3.patch has been applied, implements
force_not_null option as per-column FDW option.

Overview
========
This option is originally supported by COPY FROM command, so I think
it's reasonable to support it in file_fdw too. It would provides more
flexible parsing capability. In fact, this option has been supported
by the internal routines which are shared with COPY FROM, but currently
we don't have any way to specify it.

Difference between COPY
=======================
For COPY FROM, FORCE_NOT_NULL is specified as a list of column names
('*' is not allowed). For file_fdw, per-table FDW option can be used
like other options, but then file_fdw needs parser which can identify
valid column. I think it's too much work, so I prefer per-column FDW
option which accepts boolean value string. The value 'true' means that
the column doesn't be matched against NULL string, same as ones listed
for COPY FROM.

Example:

If you have created a foreign table with:

CREATE FOREIGN TABLE foo (
c1 int OPTIONS (force_not_null 'false'),
c2 text OPTIONS (force_not_null 'true')
) SERVER file OPTIONS (file '/path/to/file', format 'csv', null '');

values which are read from the file for c1 are matched against
null-representation-string '', but values for c2 are NOT. Empty strings
for c2 are stored as empty strings; they don't treated as NULL value.

Regards,
--
Shigeru Hanada

Attachment Content-Type Size
force_not_null_v1.patch text/plain 12.0 KB

From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-10 03:49:28
Message-ID: 1310269656-sup-2089@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Shigeru Hanada escribió:
> (2011/06/26 18:34), Kohei KaiGai wrote:
> > I checked your patch.
>
> Thanks for the review! Please find attached a revised patch.

Err, \dec seems to have a line in describe.h but nowhere else; are you
going to introduce that command?

The new ALTER TABLE grammar seems a bit strange -- ADD, SET, DROP. Is
this defined by the SQL/MED standard? It seems at odds with our
handling of attoptions (and the pg_dump query seems rather bizarre in
comparison to the handling of attoptions there; why do we need
pg_options_to_table when attoptions do not?).

What's the reason for this:

@@ -3681,7 +3691,7 @@ AlterFdwStmt: ALTER FOREIGN DATA_P WRAPPER name opt_fdw_options alter_generic_op
/* Options definition for CREATE FDW, SERVER and USER MAPPING */
create_generic_options:
OPTIONS '(' generic_option_list ')' { $$ = $3; }
- | /*EMPTY*/ { $$ = NIL; }
+ | /*EMPTY*/ { $$ = NIL }
;

I think this should be removed:

+ foreach (clist, column->fdwoptions)
+ {
+ DefElem *option = (DefElem *) lfirst(clist);
+ elog(DEBUG3, "%s=%s", option->defname, strVal(option->arg));
+ }

As for whether attfdwoptions needs to be separate from attoptions, I am
not sure that they really need to be; but if they are, I think we should
use a different name than attfdwoptions, because we might want to use
them for something else. Maybe attgenoptions (and note that the alter
table code already calls them "generic options" so I'm not sure why the
rest of the code is calling them FDW options) ... but then I really
start to question whether they need to be separate from attoptions.

Currently, attoptions are used to store n_distinct and
n_distinct_inherited. Do those options make sense for foreign tables?
If they do make sense for some types of foreign servers, maybe we should
decree that they need to be specifically declared as such by the FDW --
that is, the FDW needs to provide its own attribute_reloptions routine
(or equivalent therein) for validation that includes those core options.
There is no saying that, even if all existing core options such as
n_distinct apply to a FDW now, more core options that we might invent in
the future are going to apply as well.

In short: in my opinion, attoptions and attfdwoptions need to be one
thing and the same.

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-11 01:21:19
Message-ID: 624F0D91-1E58-401D-A62C-F51DDDFDF671@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jul 9, 2011, at 10:49 PM, Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:
> In short: in my opinion, attoptions and attfdwoptions need to be one
> thing and the same.

I feel the opposite. In particular, what happens when a future release of PostgreSQL adds an attoption that happens to have the same name as somebody's per-column FDW option? Something breaks, that's what...

Another point: We don't commingle these concepts at the table level. It doesn't make sense to have table reloptions separate from table FDW options but then go and make the opposite decision at the column level.

...Robert


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-11 04:11:02
Message-ID: 1310357354-sup-7095@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Excerpts from Robert Haas's message of dom jul 10 21:21:19 -0400 2011:
> On Jul 9, 2011, at 10:49 PM, Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:
> > In short: in my opinion, attoptions and attfdwoptions need to be one
> > thing and the same.
>
> I feel the opposite. In particular, what happens when a future release of PostgreSQL adds an attoption that happens to have the same name as somebody's per-column FDW option? Something breaks, that's what...

Hmm, if you follow my proposal above, that wouldn't actually happen,
because the core options do not apply to foreign columns.

> Another point: We don't commingle these concepts at the table level.
> It doesn't make sense to have table reloptions separate from table FDW
> options but then go and make the opposite decision at the column
> level.

That's a point. I remember feeling uneasy at the fact that we were
doing things like that, at the time, yes :-)

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-11 14:13:37
Message-ID: 4E1B0511.4000209@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thanks for the review.

(2011/07/10 12:49), Alvaro Herrera wrote:
> Err, \dec seems to have a line in describe.h but nowhere else; are you
> going to introduce that command?

\dec command is obsolete, so I removed that line.

> The new ALTER TABLE grammar seems a bit strange -- ADD, SET, DROP. Is
> this defined by the SQL/MED standard?

Yes, syntax for altering foreign table is defined by the SQL/MED
standard as below, and <alter generic option> is common to all SQL/MED
objects:

<alter foreign table statement> ::=
ALTER FOREIGN TABLE <table name> <alter foreign table action>

<alter foreign table action> ::=
<add basic column definition>
| <alter basic column definition>
| <drop basic column definition>
| <alter generic options>

<alter generic options> ::=
OPTIONS <left paren> <alter generic option list> <right paren>

<alter generic option list> ::=
<alter generic option> [ { <comma> <alter generic option> }... ]

<alter generic option> ::= [ <alter operation> ] <option name> [ <option
value> ]
<alter operation> ::=
ADD
| SET
| DROP

<generic option> ::= <option name> [ <option value> ]

<option value> ::= <character string literal>

FYI, default for <alter operation> is ADD.

> It seems at odds with our
> handling of attoptions (and the pg_dump query seems rather bizarre in
> comparison to the handling of attoptions there; why do we need
> pg_options_to_table when attoptions do not?).

That's because of the syntax difference between FDW options and AM
options. AM options should be dumped as "key=value, key=value, ...",
but FDW options should be dumped as "key 'value', key 'value', ...". The
pg_options_to_table() is used to construct list in the latter form.
The way used to handle per-column options in my patch is same as the way
used for other existing FDW objects, such as FDW, server, and user mapping.

> What's the reason for this:
>
> @@ -3681,7 +3691,7 @@ AlterFdwStmt: ALTER FOREIGN DATA_P WRAPPER name opt_fdw_options alter_generic_op
> /* Options definition for CREATE FDW, SERVER and USER MAPPING */
> create_generic_options:
> OPTIONS '(' generic_option_list ')' { $$ = $3; }
> - | /*EMPTY*/ { $$ = NIL; }
> + | /*EMPTY*/ { $$ = NIL }
> ;

Reverted this unintended change.

> I think this should be removed:
>
> + foreach (clist, column->fdwoptions)
> + {
> + DefElem *option = (DefElem *) lfirst(clist);
> + elog(DEBUG3, "%s=%s", option->defname, strVal(option->arg));
> + }

Removed, the codes were used only for debug.

> As for whether attfdwoptions needs to be separate from attoptions, I am
> not sure that they really need to be; but if they are, I think we should
> use a different name than attfdwoptions, because we might want to use
> them for something else. Maybe attgenoptions (and note that the alter
> table code already calls them "generic options" so I'm not sure why the
> rest of the code is calling them FDW options) ... but then I really
> start to question whether they need to be separate from attoptions.

For now I got +1 for attfdwoptions and +1 for attgenoptions for the
naming. I prefer attgenoptions because it follows SQL/MED standard, but
I don't have strong feeling for naming, so I've inspected usage in the
current HEAD... Hm, "gen.*option" appears twice much as "fdw.*option"
in the source code with case insensitive grep, and most of "fdw.*option"
were hit "fdwoptions", per-wrapper options. ISTM that "generic option"
would be better to mean options used by FDW for consistency, so I
unified the wording to "generic option" from "fdw option". I hope that
the name "generic option" doesn't confuse users who aren't familiar to
SQL/MED standard.

> Currently, attoptions are used to store n_distinct and
> n_distinct_inherited. Do those options make sense for foreign tables?
> If they do make sense for some types of foreign servers, maybe we should
> decree that they need to be specifically declared as such by the FDW --
> that is, the FDW needs to provide its own attribute_reloptions routine
> (or equivalent therein) for validation that includes those core options.
> There is no saying that, even if all existing core options such as
> n_distinct apply to a FDW now, more core options that we might invent in
> the future are going to apply as well.
>
> In short: in my opinion, attoptions and attfdwoptions need to be one
> thing and the same.

The n_distinct might make sense for every foreign tables in a sense,
though syntax to set it is not supported. It would allow users to
specify not-FDW-specific statistics information to control costs for the
scan. However each FDW would be able to support such option too. I
think that reloptions and attoptions should be used by only PG core, and
FDW should use generic options. So I prefer separated design.

The attached patch fixes issues other than generic options separation.

Regards,
--
Shigeru Hanada

Attachment Content-Type Size
per_column_option_v4.patch text/plain 45.5 KB

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-11 15:44:04
Message-ID: 1310399044.27274.0.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On lör, 2011-07-09 at 23:49 -0400, Alvaro Herrera wrote:
> The new ALTER TABLE grammar seems a bit strange -- ADD, SET, DROP. Is
> this defined by the SQL/MED standard? It seems at odds with our
> handling of attoptions

Well, I believe the SQL/MED options were actually implemented first and
the attoptions afterwards. But it's probably not unwise to keep them
separate, even though the syntaxes could have been made more similar.


From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-12 05:31:45
Message-ID: 4E1BDC41.30708@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

(2011/07/11 10:21), Robert Haas wrote:
> On Jul 9, 2011, at 10:49 PM, Alvaro Herrera<alvherre(at)commandprompt(dot)com> wrote:
>> In short: in my opinion, attoptions and attfdwoptions need to be one
>> thing and the same.
>
> I feel the opposite. In particular, what happens when a future release
> of PostgreSQL adds an attoption that happens to have the same name as
> somebody's per-column FDW option? Something breaks, that's what...
>
> Another point: We don't commingle these concepts at the table level.
> It doesn't make sense to have table reloptions separate from table FDW
> options but then go and make the opposite decision at the column
> level.

I'm afraid that I've misunderstood the discussion. Do you mean that
per-table options should be stored in reloptions, but per-column should
be separated from attoptions? (I think I've misread...)

Could you tell me little more detail why it doesn't make sense to have
table reloptions separate from table FDW options?

Regards,
--
Shigeru Hanada


From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-12 07:11:54
Message-ID: 4E1BF3BA.7000703@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

(2011/07/12 0:44), Peter Eisentraut wrote:
> On lör, 2011-07-09 at 23:49 -0400, Alvaro Herrera wrote:
>> The new ALTER TABLE grammar seems a bit strange -- ADD, SET, DROP. Is
>> this defined by the SQL/MED standard? It seems at odds with our
>> handling of attoptions
>
> Well, I believe the SQL/MED options were actually implemented first and
> the attoptions afterwards. But it's probably not unwise to keep them
> separate, even though the syntaxes could have been made more similar.

As you say, syntax for attoptions/reloptions seem to satisfy the
requirement of SQL/MED; SET for ADD/SET and RESET for DROP.

But at this time it would break backward compatibility. I think it's
reasonable to unify the syntax for handling SQL/MED options at every
level to "OPTIONS (key 'value', ...)".

Regards,
--
Shigeru Hanada


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-12 12:19:52
Message-ID: D60B6762-965A-4C86-93A9-1EECC29009C4@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Jul 12, 2011, at 12:31 AM, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com> wrote:
> (2011/07/11 10:21), Robert Haas wrote:
>> On Jul 9, 2011, at 10:49 PM, Alvaro Herrera<alvherre(at)commandprompt(dot)com> wrote:
>>> In short: in my opinion, attoptions and attfdwoptions need to be one
>>> thing and the same.
>>
>> I feel the opposite. In particular, what happens when a future release
>> of PostgreSQL adds an attoption that happens to have the same name as
>> somebody's per-column FDW option? Something breaks, that's what...
>>
>> Another point: We don't commingle these concepts at the table level.
>> It doesn't make sense to have table reloptions separate from table FDW
>> options but then go and make the opposite decision at the column
>> level.
>
> I'm afraid that I've misunderstood the discussion. Do you mean that
> per-table options should be stored in reloptions, but per-column should
> be separated from attoptions? (I think I've misread...)

No, I was arguing that they should both be separate.

...Robert


From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-12 12:25:36
Message-ID: 4E1C3D40.8070900@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

(2011/07/12 21:19), Robert Haas wrote:
> On Jul 12, 2011, at 12:31 AM, Shigeru Hanada<shigeru(dot)hanada(at)gmail(dot)com> wrote:
>> I'm afraid that I've misunderstood the discussion. Do you mean that
>> per-table options should be stored in reloptions, but per-column should
>> be separated from attoptions? (I think I've misread...)
>
> No, I was arguing that they should both be separate.

Thanks, I'm relieved. :)

Regards,
--
Shigeru Hanada


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-12 13:56:30
Message-ID: 1310478960-sup-8358@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Excerpts from Shigeru Hanada's message of mar jul 12 03:11:54 -0400 2011:
> (2011/07/12 0:44), Peter Eisentraut wrote:
> > On lör, 2011-07-09 at 23:49 -0400, Alvaro Herrera wrote:
> >> The new ALTER TABLE grammar seems a bit strange -- ADD, SET, DROP. Is
> >> this defined by the SQL/MED standard? It seems at odds with our
> >> handling of attoptions
> >
> > Well, I believe the SQL/MED options were actually implemented first and
> > the attoptions afterwards. But it's probably not unwise to keep them
> > separate, even though the syntaxes could have been made more similar.
>
> As you say, syntax for attoptions/reloptions seem to satisfy the
> requirement of SQL/MED; SET for ADD/SET and RESET for DROP.

Speaking of which -- what's the difference between ADD and SET for SQL/MED
options?

--
Álvaro Herrera <alvherre(at)commandprompt(dot)com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-12 21:26:16
Message-ID: 1310505976.17676.4.camel@vanquo.pezone.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On tis, 2011-07-12 at 09:56 -0400, Alvaro Herrera wrote:
> Speaking of which -- what's the difference between ADD and SET for
> SQL/MED options?

ADD add to the existing options, SET overwrites all options with what
you specify.


From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-12 21:56:11
Message-ID: 4E1CC2FB.6020409@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

(2011/07/12 22:56), Alvaro Herrera wrote:
> Speaking of which -- what's the difference between ADD and SET for SQL/MED
> options?

ADD can only add new option; it can't overwrite existing option's value.
To overwrite existing option's value, you need to use SET instead.

Regards,
--
Shigeru Hanada


From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: per-column generic option
Date: 2011-07-14 19:17:01
Message-ID: 4E1F40AD.1070506@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

All,

Is the spec for this feature still under discussion? I don't seem to
see a consensus on this thread.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: per-column generic option
Date: 2011-07-15 02:57:35
Message-ID: 4E1FAC9F.7090100@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

(2011/07/15 4:17), Josh Berkus wrote:
> All,
>
> Is the spec for this feature still under discussion? I don't seem to
> see a consensus on this thread.

Yeah, a remaining concern is whether generic (FDW) options should be
separated from existing attoptions or not.

Indeed, reloptions/attoptions mechanism seems to be also applicable to
generic options, since both need to store multiple key-value pairs, but
IMHO generic options should be separated from reloptions/attoptions for
several reasons:

1) FDW options are handled by only FDW, but reloptions/attoptions are
handled by PG core modules such as planner, AM and autovacuum. If we
can separate them completely, they would be able to share one attribute,
but I worry that some of reloptions/attoptions make sense for some FDW.
For instance, n_distinct might be useful to control costs of a foreign
table scan. Though attoptions can't be set via CREATE/ALTER FOREIGN
TABLE yet.

2) In future, newly added option might conflict somebody's FDW option.
Robert Haas has pointed out this issue some days ago. FDW validator
would reject unknown options, so every FDW would have to know all of
reloptions/attoptions to avoid this issue.

3) It would be difficult to unify syntax to set options from perspective
of backward compatibility and syntax consistency. Other SQL/MED objects
have the syntax such as "OPTIONS (key 'value', ...)", but
reloptions/attoptions have the syntax such as "SET (key = value, ...)".
Without syntax unification, some tools should care relkind before
handling attoptions. For instance, pg_dump should choose syntax used to
dump attoptions. It seems undesired complexity.

Any comments/objections/questions are welcome.

Regards,
--
Shigeru Hanada

* 英語 - 自動検出
* 英語
* 日本語

* 英語
* 日本語

<javascript:void(0);>


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-18 19:09:21
Message-ID: CA+TgmoY-NhO8XOxn_jwZE-=M9J_Rtc6wSO1z2QxcTyCa3_s66Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jul 11, 2011 at 12:11 AM, Alvaro Herrera
<alvherre(at)commandprompt(dot)com> wrote:
> Excerpts from Robert Haas's message of dom jul 10 21:21:19 -0400 2011:
>> On Jul 9, 2011, at 10:49 PM, Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:
>> > In short: in my opinion, attoptions and attfdwoptions need to be one
>> > thing and the same.
>>
>> I feel the opposite. In particular, what happens when a future release of PostgreSQL adds an attoption that happens to have the same name as somebody's per-column FDW option?  Something breaks, that's what...
>
> Hmm, if you follow my proposal above, that wouldn't actually happen,
> because the core options do not apply to foreign columns.

Well, not at the moment. But I think it's altogether likely that we
might want them to in the future. The foreign data wrapper support we
have right now is basically a stub until we get around to improving
it, so we don't (for example) analyze foreign tables, which means that
n_distinct is not relevant. But that's something we presumably want
to change at some point. Eventually, I would anticipate that we'll
have quite a few more column options and most will apply to both
tables and foreign tables, so I'm not keen to bake in something that
makes that potentially problematic. I think we should understand
attoptions as things that modify the behavior of PostgreSQL, while
attfdw/genoptions are there solely for the foreign data wrapper to
use. An extra nullable field in pg_attribute isn't costing us
anything non-trivial, and the syntactic and definitional clarity seems
entirely worth it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-18 19:26:44
Message-ID: 4245.1311017204@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> ... I think we should understand
> attoptions as things that modify the behavior of PostgreSQL, while
> attfdw/genoptions are there solely for the foreign data wrapper to
> use. An extra nullable field in pg_attribute isn't costing us
> anything non-trivial, and the syntactic and definitional clarity seems
> entirely worth it.

+1. We paid the price of allowing nullable columns in pg_attribute long
ago. One more isn't going to cost anything, especially since virtually
every row in that catalog already contains at least one null.

I'm not too thrilled with the terminology of "generic options", though.
I think this should be understood as specifically "FDW-owned options".
If the column isn't reserved for the use of the FDW, then you get right
back into the problem of who's allowed to use it and what if there's a
collision.

regards, tom lane


From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: per-column generic option
Date: 2011-07-18 19:31:52
Message-ID: CA+TgmoZQ_2c9L0nsHvudkQ3eXPsfPPS7KRadG91bDXVyyP35iw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Jul 18, 2011 at 3:26 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> ... I think we should understand
>> attoptions as things that modify the behavior of PostgreSQL, while
>> attfdw/genoptions are there solely for the foreign data wrapper to
>> use.  An extra nullable field in pg_attribute isn't costing us
>> anything non-trivial, and the syntactic and definitional clarity seems
>> entirely worth it.
>
> +1.  We paid the price of allowing nullable columns in pg_attribute long
> ago.  One more isn't going to cost anything, especially since virtually
> every row in that catalog already contains at least one null.
>
> I'm not too thrilled with the terminology of "generic options", though.
> I think this should be understood as specifically "FDW-owned options".
> If the column isn't reserved for the use of the FDW, then you get right
> back into the problem of who's allowed to use it and what if there's a
> collision.

I concur. The SQL/MED standard is welcome to refer to them as generic
options, but at least FTTB they are going to be entirely for FDWs in
our implementation, and naming them that way is therefore a Good
Thing. If the SQL committee decides to use them in other places and
we choose to support that in some future release for some
as-yet-unclear purpose, well, it won't be the first time we've
modified the system catalog schema.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company