Re: VARIANT / ANYTYPE datatype

From: Darren Duncan <darren(at)darrenduncan(dot)net>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: VARIANT / ANYTYPE datatype
Date: 2011-05-10 21:19:32
Message-ID: 4DC9ABE4.9070904@darrenduncan.net
Lists: pgsql-hackers

Alvaro Herrera wrote:
> Excerpts from Bruce Momjian's message of mar may 10 16:21:36 -0400 2011:
>> Darren Duncan wrote:
>>> To follow-up, an additional feature that would be useful and resembles union
>>> types is the variant where you could declare a union type first and then
>>> separately other types could declare they are a member of the union. I'm
>>> talking about loosely what mixins or type-roles or interfaces etc are in other
>>> languages. The most trivial example would be declaring an ENUM-alike first and
>>> then separately declaring the component values where the latter declare they are
>>> part of the ENUM, and this could make it easier to add or change ENUM values.
>>> But keep in mind that this is a distinct concept from what we're otherwise
>>> talking about as being union types. -- Darren Duncan
>> Should this be a TODO item?
>
> The general idea of C-style unions, sure. Mixin-style stuff ... not sure.
> Seems like it'd be pretty painful.

From the perspective of users, the single greatest distinction between these two
kinds of unions is whether they are closed or open, and that is the primary
reason to choose one over the other.

A closed union is the C-style, where the union type declares what other types or
values it ranges over. The closed union is best when the union definer can
reasonably assume that the union will never, or only rarely, be changed, and in
particular can assume that application or database code knows about the parts
that it treats specially, so that if the closed union type ever is changed, any
code designed to use it can be changed at the same time.

A good example of a closed union would be a boolean type which ranges over just
the two singletons false and true, or an order type which ranges over just the
three singletons decrease, same, increase, or a type which enumerates the 7
days of the week, as this is unlikely to change in the life of a system.
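
As a rough illustration only (Python rather than SQL, and not a proposed
PostgreSQL syntax), the order type above can be sketched as an enumeration
whose definition lists every member. The names Order, compare and describe are
hypothetical and chosen just for this sketch.

```python
from enum import Enum


class Order(Enum):
    """Closed union: the type itself declares every value it ranges over."""
    DECREASE = -1
    SAME = 0
    INCREASE = 1


def compare(a, b):
    """Return the Order singleton describing how b relates to a."""
    if b < a:
        return Order.DECREASE
    if b > a:
        return Order.INCREASE
    return Order.SAME


def describe(order):
    # Code written against a closed union can enumerate every case. If a
    # member were ever added to Order, this function would need to be
    # changed at the same time.
    if order is Order.DECREASE:
        return "went down"
    if order is Order.SAME:
        return "unchanged"
    if order is Order.INCREASE:
        return "went up"
    raise ValueError(f"unhandled member: {order!r}")
```

The point of the sketch is that describe() has special knowledge of each
member, which is reasonable only because the member list is expected to stay
fixed.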

An open union is the mixin style, where the component types declare they are
part of the union. The open union is best when it is likely that new
user-defined or extension-defined types will join the union later, and we want
to have code that can be generic or polymorphic over any types that can be used
in particular ways.

An example of an open union type could be number, which all the numeric types
compose, so that you know you can use the generic numeric operators on values
simply because their types compose the number union type, and this still works
if more numeric types appear later. Likewise, a string open union could include
both text and blob, as both support catenation and substring matching or
extraction, for example.
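
A minimal sketch of the number example, again in Python and purely
illustrative: the union is declared with no members, and each component type
registers itself separately. The class name Number and the function double are
hypothetical; ABC.register is the standard-library mechanism used here to stand
in for "a type declares it is a member of the union".

```python
from abc import ABC
from fractions import Fraction


class Number(ABC):
    """Open union: declared first, with no list of members."""


# Each component type declares its own membership, separately from the
# definition of the union.
Number.register(int)
Number.register(float)


def double(x):
    """Generic code: works for any type that has joined the Number union."""
    if not isinstance(x, Number):
        raise TypeError(f"{type(x).__name__} is not a member of Number")
    return x + x


# A type that comes along later joins the union without any change to
# Number or to double().
Number.register(Fraction)
```

Here double() never names int, float or Fraction, so it keeps working when
further numeric types register later.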

This would aid operator overloading in a generic way, letting you use the same
syntax for different types, while whether types may be mixed is decided case by
case; e.g., you could support "add(int,int)" and "add(real,real)" without
supporting "add(int,real)" etc., but the syntax "add(x,y)" is shared, and you do
this while still having a strong type system.
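
The add() case can be sketched as follows, assuming dispatch on the exact types
of both arguments. This is an illustration in Python, not how PostgreSQL
resolves functions; the registry _ADD_IMPLS and the decorator overload are
hypothetical helpers, and Python's float stands in for real.

```python
# Hypothetical registry mapping an exact (type, type) signature to its
# implementation.
_ADD_IMPLS = {}


def overload(*types):
    """Register an implementation of add() for one exact signature."""
    def decorator(fn):
        _ADD_IMPLS[types] = fn
        return fn
    return decorator


@overload(int, int)
def _add_int(x, y):
    return x + y


@overload(float, float)
def _add_real(x, y):
    return x + y


def add(x, y):
    """Shared syntax add(x, y); only registered signatures are accepted."""
    impl = _ADD_IMPLS.get((type(x), type(y)))
    if impl is None:
        # No implicit mixing: add(int, real) fails unless someone
        # registers that signature explicitly.
        raise TypeError(f"no add({type(x).__name__},{type(y).__name__})")
    return impl(x, y)
```

With this sketch, add(1, 2) and add(1.5, 2.5) both succeed, while add(1, 2.0)
raises TypeError until an (int, float) implementation is registered.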

Supporting the open union is closer to supporting ANYTYPE than supporting the
closed union is.

-- Darren Duncan
