Re: Refactoring the Type System

From: Darren Duncan <darren(at)darrenduncan(dot)net>
To: PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Refactoring the Type System
Date: 2010-11-14 03:54:21
Message-ID: 4CDF5D6D.8050900@darrenduncan.net
Lists: pgsql-hackers

David Fetter wrote:
> For the past couple of years, I've been hearing from the PostGIS
> people among others that our type system just isn't flexible enough
> for their needs. It's really starting to show its age, or possibly
> design compromises that seemed reasonable a decade or more ago, but
> are less so now.
>
> To that end, I've put up a page on the wiki that includes a list of
> issues to be addressed. It's intended to be changed, possibly
> completely.
>
> http://wiki.postgresql.org/wiki/Refactor_Type_System
>
> What might the next version of the type system look like?

Are you talking about changes to the type system as users see it, or just changes
to how the existing behavior is implemented internally? If you mean the type
system as users see it, which is what the other replies to this thread seem to
assume, though the URL you pointed to looks more like it is about internals ...

As a statement which may surprise no one who's heard me talk about it before ...

I've mostly completed a type system specification that would be usable by
Postgres; it is the most fundamental part of my Muldis D language.

The type system is arguably the most central piece of any DBMS, around which
everything else is defined and built.

You have data, which is structured in some way, and operators that work on it.

If you look at a DBMS from the perspective of being a programming language
implementation, you find that a database is just a variable that holds a value
of a structured type. In the case of a relational database, said database is a
tuple whose attribute values are relations; or in the case of
namespaces/schemas, the database tuple has tuple attributes having relation
attributes.
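
To make the "database is a variable" view concrete, here is a minimal sketch in
Python. It is illustrative only and is not Muldis D syntax; the helper names
(make_tuple, make_relation, relation_names) and the sample data are assumptions
made up for this example.

```python
# A tuple is modelled as a frozenset of (attribute name, value) pairs,
# and a relation as a frozenset of such tuples. Both are immutable values.
def make_tuple(**attrs):
    return frozenset(attrs.items())

def make_relation(*rows):
    return frozenset(make_tuple(**row) for row in rows)

# The whole database is ONE variable. Its value is a tuple whose
# attributes are relations (hypothetical sample data).
database = make_tuple(
    suppliers=make_relation(
        {"sno": 1, "name": "Smith"},
        {"sno": 2, "name": "Jones"},
    ),
    parts=make_relation(
        {"pno": 10, "colour": "red"},
    ),
)

def relation_names(db):
    """List the relation-valued attributes of the database tuple."""
    return sorted(name for name, _ in db)
```

In this model, an update is just the assignment of a new tuple value to the
database variable.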

If a database is a variable, then all database constraints are type constraints
on the declared type of that variable, and you can make said constraints
arbitrarily complicated.
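
As a rough sketch of that idea, a database constraint can be written as a
predicate over the whole database value, checked whenever a new value is
assigned to the variable. The Python below is illustrative only; the names
(is_valid_database, assign) and the customers/orders example are assumptions,
and relations are modelled loosely as lists of dicts for readability.

```python
def is_valid_database(db):
    """Type constraint on the database value: every order must
    reference a customer that exists (a foreign-key-style rule)."""
    customer_ids = {c["id"] for c in db["customers"]}
    return all(o["customer_id"] in customer_ids for o in db["orders"])

def assign(new_value):
    """Accept a new database value only if it satisfies the constraint,
    i.e. only if it is a member of the declared type."""
    if not is_valid_database(new_value):
        raise ValueError("value is not a member of the database type")
    return new_value

good = {
    "customers": [{"id": 1}, {"id": 2}],
    "orders": [{"order_id": 100, "customer_id": 2}],
}
bad = {
    "customers": [{"id": 1}],
    "orders": [{"order_id": 101, "customer_id": 9}],
}

database = assign(good)  # accepted; assign(bad) would raise ValueError
```

The predicate can be made as complicated as needed, since it sees the whole
database value at once.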

From basic structures like nestable tuples and relations, plus a complement of
basic types like numbers and strings, and arbitrary constraints, you can define
data types of any shape or form.

A key component of a good type system is that users can define data types, and
moreover where possible, system-defined types are defined in the same ways as
users define types. For example, stuff like temporal types or geospatial types
are prime candidates for being defined like user-defined types.

If you define all structures using tuples and relations, you can easily flatten
this out on the implementation end and basically do everything as associated
flat relation variables as you do now.
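
One way to picture the flattening, as a sketch only: a relation with a
relation-valued attribute can be split into two flat relations linked by a key.
The Python below is an assumption about how this might look, not a description
of any existing implementation; flatten_orders and the sample data are
hypothetical.

```python
def flatten_orders(orders):
    """Split nested rows (order_id, customer, items) into two flat
    relations: one row per order and one row per order item."""
    order_rows = set()
    item_rows = set()
    for order_id, customer, items in orders:
        order_rows.add((order_id, customer))
        for sku, qty in items:
            # The order_id is repeated so the item row can be
            # joined back to its parent order.
            item_rows.add((order_id, sku, qty))
    return order_rows, item_rows

# Each order carries a relation-valued "items" attribute.
nested = {
    (1, "alice", frozenset({("bolt", 4), ("nut", 4)})),
    (2, "bob", frozenset({("gear", 1)})),
}

flat_orders, flat_items = flatten_orders(nested)
```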

So what I propose is both very flexible and easy to implement, scale, and
optimize, relatively speaking.

You don't have to kludge things by implementing arrays as blobs for example; you
can implement them as relations instead. Geospatial types can just be tuples.
Arrays of structured types can just be relations with an attribute per type
attribute. Arrays of simple types can just be unary relations.
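
A small sketch of arrays represented as relations, in illustrative Python. One
assumption goes beyond the text above: each row carries an explicit index
attribute so that element order survives the round trip; without it, a unary
relation models an unordered set of values. The function names are made up for
this example.

```python
def array_to_relation(values):
    """Array of simple values -> relation of (index, value) rows."""
    return frozenset((i, v) for i, v in enumerate(values))

def relation_to_array(rel):
    """Rebuild the array by ordering rows on the index attribute."""
    return [v for _, v in sorted(rel)]

def points_to_relation(points):
    """Array of structured values (x, y) -> relation with one
    attribute per component, plus the index attribute."""
    return frozenset((i, x, y) for i, (x, y) in enumerate(points))

numbers = array_to_relation([30, 10, 20])
points = points_to_relation([(1.0, 2.0), (3.0, 4.0)])
```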

You can also emulate all of the existing Pg features and syntax that you have
now over the type system I've defined, maintaining compatibility too.

I also want to emphasize that, while I drew inspiration from many sources when
defining Muldis D, and there was and still is a lot I don't know about
Postgres, the more I use and learn Postgres, the more often I find that how it
does things is similar to, and compatible with, the design I came up with
independently for Muldis D.

-- Darren Duncan
