Re: Bug and/or feature? Complex data types in tables...

Lists: pgsql-generalpgsql-hackers
From: "Chris Travers" <chris(at)travelamericas(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Bug and/or feature? Complex data types in tables...
Date: 2003-12-31 04:16:12
Message-ID: 000901c3cf8e$28884230$3f285e3d@winxp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

Hi all;

I just made an interesting discovery. Not sure if it is a good thing or
not, and using it certainly breakes first normal form.... Not even sure if
it really works. However, as I am able to CRASH the backend, there is a bug
here somewhere...

test=# select version();
version

----------------------------------------------------------------------------
----
-----
PostgreSQL 7.4 on i686-pc-cygwin, compiled by GCC gcc (GCC) 3.3.1 (cygming
spec
ial)
(1 row)

Try the following example:

CREATE TABLE test1 (
test_id SERIAL,
test_text TEXT
);

CREATE TABLE test2 (
other_test test1,
test_text text
);

The table is created without any problem. Of course there is no way of
inserting anything into the table, you write a function to create the data
type. So I created the following function:

CREATE FUNCTION test1 (int, text) returns test1 as '
declare retval test1;
begin
retval.test_id := $1;
retval.test_text := $2;
return retval;
end;
' language plpgsql.

Now I can insert into the table. But I cannot get anything out of the
table! If I try a simple
SELECT * from test2;
I get: ERROR: cannot display a value of type record

So, I figured I would write a function to turn the record into text. The
function I wrote is:
CREATE FUNCTION test1_to_text(test1) returns text as '
declare retval text;
begin
retval := test1.test_id;
retval := retval::text;
retval := retval|| '':'';
retval := retval|| test1.test_text;
return retval;
end;
' language plpgsql;

Here is where the crash occurs (after a brief hang):
test=# select test1_to_text(other_test) from test2;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

Interestingly I can do:
test=# select test1_to_text(test1('1', 'hi there'));
test1_to_text
---------------
1:hi there
(1 row)

Best Wishes,
Chris Travers


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Chris Travers" <chris(at)travelamericas(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Bug and/or feature? Complex data types in tables...
Date: 2003-12-31 16:46:34
Message-ID: 10921.1072889194@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

"Chris Travers" <chris(at)travelamericas(dot)com> writes:
> Try the following example:

> CREATE TABLE test1 (
> test_id SERIAL,
> test_text TEXT
> );

> CREATE TABLE test2 (
> other_test test1,
> test_text text
> );

This should in fact be disallowed, I think. Back in the pre-SQL days of
Berkeley Postgres, there actually was a feature that involved declaring
table columns this way, but it did NOT work the way you think ;-), and
in any case it has been broken for many years.

I'm not sure why we've never taken the step of preventing complex types
from being declared as fields of other types. I suppose there's some
thought that we'll eventually support it, but I don't believe that that
day is real close.

regards, tom lane


From: "Chris Travers" <chris(at)travelamericas(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Bug and/or feature? Complex data types in tables...
Date: 2004-01-01 01:20:25
Message-ID: 002501c3d006$92958700$5e44053d@winxp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

This concept of using complex types in tables actually does have one
legitimate use. When used with casts and functions, you could use it as a
"poor-man's datatype" development method.

Here is a hypothetical example. Imagine for a moment that there was no CIDR
datatype. I could create a datatype as a set of ints and then create
casting functions which I could use for display of the data. This would be
similar to C except that it could be done by people like myself whose C
coding skills are not up to the level where I or anyone else would want them
in the database backend ;-)

Best Wishes,
Chris Travers


From: Michael Glaesemann <grzm(at)myrealbox(dot)com>
To: "Chris Travers" <chris(at)travelamericas(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Bug and/or feature? Complex data types in tables...
Date: 2004-01-01 15:48:22
Message-ID: EDFFE6AC-3C71-11D8-A298-000A95C88220@myrealbox.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

On Dec 31, 2003, at 7:20 PM, Chris Travers wrote:

> This concept of using complex types in tables actually does have one
> legitimate use. When used with casts and functions, you could use it
> as a
> "poor-man's datatype" development method.
>
> Here is a hypothetical example. Imagine for a moment that there was
> no CIDR
> datatype. I could create a datatype as a set of ints and then create
> casting functions which I could use for display of the data. This
> would be
> similar to C except that it could be done by people like myself whose C
> coding skills are not up to the level where I or anyone else would
> want them
> in the database backend ;-)

This is a situation where PostgreSQL's CREATE DOMAIN, or CREATE TYPE
support would be useful, I think. Is there a reason these wouldn't work
as well as using a "table type"?

Happy New Year!
Michael Glaesemann
grzm myrealbox com


From: "Chris Travers" <chris(at)travelamericas(dot)com>
To: "Michael Glaesemann" <grzm(at)myrealbox(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Bug and/or feature? Complex data types in tables...
Date: 2004-01-02 13:44:16
Message-ID: 015401c3d136$884af9c0$1f00053d@winxp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

creating a complex type and using it in a table would create the same
problem, would it not?
If my type has more than one component, then it would not work well.

Here is a better example. Imagine creating a type for complex numbers.
Each complex number has 2 components: a real component (x, numeric) and an
imaginary component (y, numeric). The standard representation is x +/- yi,
so if the real component is 3.4 and the imaginary component is 5, it would
be written 3.4 + 5i.

Storing this data in the database would require either:
1: A text string which would be parsed by the app. (not really very
useful)
2: A native datatype consisting of 2 numeric components, that could be cast
as text by the rules above.

Obviously the second one is best. Currently in PostgreSQL, I would have to
write this in C, but with complex types, I could write this mostly in
PLPGSQL!

Best Wishes,
Chris Travers

----- Original Message -----
From: "Michael Glaesemann" <grzm(at)myrealbox(dot)com>
To: "Chris Travers" <chris(at)travelamericas(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>; "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Sent: Thursday, January 01, 2004 10:48 PM
Subject: Re: [GENERAL] Bug and/or feature? Complex data types in tables...

> On Dec 31, 2003, at 7:20 PM, Chris Travers wrote:
>
> > This concept of using complex types in tables actually does have one
> > legitimate use. When used with casts and functions, you could use it
> > as a
> > "poor-man's datatype" development method.
> >
> > Here is a hypothetical example. Imagine for a moment that there was
> > no CIDR
> > datatype. I could create a datatype as a set of ints and then create
> > casting functions which I could use for display of the data. This
> > would be
> > similar to C except that it could be done by people like myself whose C
> > coding skills are not up to the level where I or anyone else would
> > want them
> > in the database backend ;-)
>
> This is a situation where PostgreSQL's CREATE DOMAIN, or CREATE TYPE
> support would be useful, I think. Is there a reason these wouldn't work
> as well as using a "table type"?
>
> Happy New Year!
> Michael Glaesemann
> grzm myrealbox com
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
>
>


From: Michael Glaesemann <grzm(at)myrealbox(dot)com>
To: "Chris Travers" <chris(at)travelamericas(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Bug and/or feature? Complex data types in tables...
Date: 2004-01-02 18:00:09
Message-ID: 80D1617A-3D4D-11D8-A298-000A95C88220@myrealbox.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

On Jan 2, 2004, at 7:44 AM, Chris Travers wrote:

> creating a complex type and using it in a table would create the same
> problem, would it not?
> If my type has more than one component, then it would not work well.

After a bit of experimentation, I see what you mean:
test=# select version();
version
------------------------------------------------------------------------
-----------------------------------------------
PostgreSQL 7.4 on powerpc-apple-darwin7.2.0, compiled by GCC gcc (GCC)
3.3 20030304 (Apple Computer, Inc. build 1495)
(1 row)

test=# create type complex_number as (real numeric, imaginary numeric);
CREATE TYPE
test=# create table numbers (num complex_number);
ERROR: column "num" has composite type complex_number
ERROR: column "num" has composite type complex_number
test=# create table complex_number_table (real numeric not null,
imaginary numeric not null);
CREATE TABLE
test=# create table numbers (num complex_number_table);
CREATE TABLE

You'd think the first CREATE TABLE numbers would work, and the second
wouldn't. (Unless I'm doing something wrong.)

> Here is a better example. Imagine creating a type for complex numbers.
> Each complex number has 2 components: a real component (x, numeric)
> and an
> imaginary component (y, numeric). The standard representation is x
> +/- yi,
> so if the real component is 3.4 and the imaginary component is 5, it
> would
> be written 3.4 + 5i.

In the language of Date and Darwen, you're talking about possible
representations, or possreps, I believe. One possible representation of
a complex number would be x +/- yi, another could be (x,y)

> Storing this data in the database would require either:
> 1: A text string which would be parsed by the app. (not really very
> useful)
> 2: A native datatype consisting of 2 numeric components, that could
> be cast
> as text by the rules above.
>
> Obviously the second one is best.

Definitely. The default TIMESTAMP possrep is much different from how
it's represented internally. For that matter, NUMERIC is, too. There's
no reason to necessarily store the value in the form the user sees.

> Currently in PostgreSQL, I would have to
> write this in C, but with complex types, I could write this mostly in
> PLPGSQL!

(To avoid confusion since we're talking about complex numbers, I'm
assuming you mean what PostgreSQL refers to as composite types.) It
definitely would be nice to be able to define composite types that can
be used as attributes and functions. It seems like there's quite a bit
of, er, functionality with composite types already. I don't have
pl/pgsql installed, but I was able to create some simple operators with
just SQL (see below). They're not perfect (and don't let us use
composite types in tables); just exploring what I could do. I wonder
what it would take to allow these user-defined types defined in
PostgreSQL (rather than C) usable in tables.

Michael Glaesemann
grzm myrealbox com

test=# create or replace function THE_REAL(complex_number) returns
numeric as 'select $1.real as real;' language sql;
CREATE FUNCTION
test=# create or replace function THE_IMAGINARY(complex_number) returns
numeric as 'select $1.imaginary as real;' language sql;
CREATE FUNCTION
test=# create function complex_number(numeric,numeric) returns
complex_number as 'select $1,$2;' language sql;
CREATE FUNCTION
test=# select THE_REAL(complex_number(4::numeric,3::numeric));
the_real
----------
4
(1 row)

test=# select THE_IMAGINARY(complex_number(4::numeric,3::numeric));
the_imaginary
---------------
3
(1 row)

test=# create or replace function
display_ordpair_complex_number(complex_number) returns text as 'select
''('' || $1.real || '','' || $1.imaginary || '')'';' language sql;
CREATE FUNCTION
test=# create function display_irep_complex_number(complex_number)
returns text as 'select $1.real || '' '' || $1.imaginary || ''i'';'
language sql;
CREATE FUNCTION
test=# select
display_ordpair_complex_number(complex_number(4::numeric,3::numeric));
display_ordpair_complex_number
--------------------------------
(4,3)
(1 row)

test=# select
display_irep_complex_number(complex_number(4::numeric,3::numeric));
display_irep_complex_number
-----------------------------
4 3i
(1 row)


From: "Chris Travers" <chris(at)travelamericas(dot)com>
To: "Michael Glaesemann" <grzm(at)myrealbox(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Bug and/or feature? Complex data types in tables...
Date: 2004-01-03 03:59:44
Message-ID: 008f01c3d1b9$4f947470$1800053d@winxp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

> (To avoid confusion since we're talking about complex numbers, I'm
> assuming you mean what PostgreSQL refers to as composite types.) It
> definitely would be nice to be able to define composite types that can
> be used as attributes and functions. It seems like there's quite a bit
> of, er, functionality with composite types already. I don't have
> pl/pgsql installed, but I was able to create some simple operators with
> just SQL (see below). They're not perfect (and don't let us use
> composite types in tables); just exploring what I could do. I wonder
> what it would take to allow these user-defined types defined in
> PostgreSQL (rather than C) usable in tables.

AFAICS, there are only one thing missing and it could probably be worked
around if the backend did nto crash when you try to retrieve the information
via a casting function. It is:

Some way to define a standard input and output representation (how it is
done in C).

If you can define your own casts, you can then select complex::text from
mytable (but this would crash the backend again :-( )

Of course for complex numbers, you might be able use a domain off of points,
and then define special operators for them (for example, adding complex
numbers is meaningful, but adding points is not). But this might not work
for other sorts of types.

Best Wishes,
Chris Travers


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Chris Travers" <chris(at)travelamericas(dot)com>
Cc: "Michael Glaesemann" <grzm(at)myrealbox(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Bug and/or feature? Complex data types in tables...
Date: 2004-01-03 05:31:10
Message-ID: 18914.1073107870@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

"Chris Travers" <chris(at)travelamericas(dot)com> writes:
> AFAICS, there are only one thing missing and it could probably be worked
> around if the backend did nto crash when you try to retrieve the information
> via a casting function. It is:
> Some way to define a standard input and output representation (how it is
> done in C).

Actually, we could very easily punt on that, instead saying you have to
select out individual fields or else write your own formatting function.

The thing we are missing (i.e., what makes it crash) is an internal
representation that allows a tuple to be embedded as a field of a larger
tuple. I've looked at this a couple of times, and each time concluded
that it was more work than I could afford to spend at the moment. The
support-such-as-it-is for tuple return values uses a structure that has
embedded pointers, and it doesn't make any effort to get rid of
out-of-line TOAST pointers within the tuple. Neither one of those
things are acceptable for a tuple that's trying to act like a Datum.

regards, tom lane


From: elein <elein(at)varlena(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Chris Travers <chris(at)travelamericas(dot)com>, Michael Glaesemann <grzm(at)myrealbox(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Bug and/or feature? Complex data types in tables...
Date: 2004-01-12 01:55:34
Message-ID: 20040111175534.C12147@cookie.varlena.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

Understanding in complete detail why it is hard to
have tuples as column values, I still see the need
and uses for composite datatypes.

As Chris said, it allows people to use objects at
a plpgsql level without having to throw them into
creating datatypes with C. This is very important
because it explodes the object capabilities of
postgres exponentially. That is why it was designed
that way in the first place--to use objects in a relational
database.

There has been very slow adoption of OR. One of PostgreSQL's
reasons (Illustra and informix had other reasons) is that
you always have to go down to C to do complex things.
Obvious and easily understood complex features like composite
data types should have always been supported at a higher level.

I'm not trolling for a R vs. OR flamefest or criticizing
the decision. I do understand the technical issues
involved. However, from an advocacy and usablity and
feature rich point of view this particular feature is
valuable IMHO.

Sorry to join the discussion so late.

elein

On Sat, Jan 03, 2004 at 12:31:10AM -0500, Tom Lane wrote:
> "Chris Travers" <chris(at)travelamericas(dot)com> writes:
> > AFAICS, there are only one thing missing and it could probably be worked
> > around if the backend did nto crash when you try to retrieve the information
> > via a casting function. It is:
> > Some way to define a standard input and output representation (how it is
> > done in C).
>
> Actually, we could very easily punt on that, instead saying you have to
> select out individual fields or else write your own formatting function.
>
> The thing we are missing (i.e., what makes it crash) is an internal
> representation that allows a tuple to be embedded as a field of a larger
> tuple. I've looked at this a couple of times, and each time concluded
> that it was more work than I could afford to spend at the moment. The
> support-such-as-it-is for tuple return values uses a structure that has
> embedded pointers, and it doesn't make any effort to get rid of
> out-of-line TOAST pointers within the tuple. Neither one of those
> things are acceptable for a tuple that's trying to act like a Datum.
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)


From: Michael Glaesemann <grzm(at)myrealbox(dot)com>
To: "Chris Travers" <chris(at)travelamericas(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Bug and/or feature? Complex data types in tables...
Date: 2004-01-15 08:58:16
Message-ID: F5060B4A-4738-11D8-B443-000A95C88220@myrealbox.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

Hi Chris,

I know this thread is a little old, but it's something I'm interested
in learning more about.

On Jan 2, 2004, at 9:59 PM, Chris Travers wrote:
> AFAICS, there are only one thing missing and it could probably be
> worked
> around if the backend did nto crash when you try to retrieve the
> information
> via a casting function. It is:
>
> Some way to define a standard input and output representation (how it
> is
> done in C).
>
> If you can define your own casts, you can then select complex::text
> from
> mytable (but this would crash the backend again :-( )

Could you explain this a little more? My strengths (such as they are)
are more on relational theory rather than implementation.

My interpretation of what you're saying (which is probably just
restating what's obvious to others) is that there isn't a way to define
the input and output functions in (the PostgreSQL flavor of) SQL. You
have to do it in C, as described in the "User-Defined Types" section
(33.10).

I'm unclear about what follows. Using SELECT complex::text FROM mytable
would be used to get data out of the table. How would you get it in?
How do user-defined casts help out with this?

Thanks for your time! I'm slowing trying to learn here. I'm interested
in figuring out how to implement point and interval/duration types for
temporal work, but know I have a lot to learn to make this possible.

Michael Glaesemann
grzm myrealbox com


From: Michael Glaesemann <grzm(at)myrealbox(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org, "Chris Travers" <chris(at)travelamericas(dot)com>
Subject: Re: Bug and/or feature? Complex data types in tables...
Date: 2004-01-15 09:18:59
Message-ID: DA36B099-473B-11D8-9C78-000A95C88220@myrealbox.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

Hi Tom

On Jan 3, 2004, at 2:31 PM, Tom Lane wrote:
> The thing we are missing (i.e., what makes it crash) is an internal
> representation that allows a tuple to be embedded as a field of a
> larger
> tuple. I've looked at this a couple of times, and each time concluded
> that it was more work than I could afford to spend at the moment. The
> support-such-as-it-is for tuple return values uses a structure that has
> embedded pointers, and it doesn't make any effort to get rid of
> out-of-line TOAST pointers within the tuple. Neither one of those
> things are acceptable for a tuple that's trying to act like a Datum.

Would you mind explaining this a little more, or pointing me to where I
can learn more about this? I looked through the html docs for TOAST,
and only found a brief mention regarding large objects and user-defined
types, but it doesn't get into it in very much detail. (Well, there's
the sliced bread index entry, also. :)

Michael Glaesemann
grzm myrealbox com


From: Michael Glaesemann <grzm(at)myrealbox(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org, "Chris Travers" <chris(at)travelamericas(dot)com>
Subject: Re: Bug and/or feature? Complex data types in tables...
Date: 2004-01-15 09:23:10
Message-ID: 6FD6055A-473C-11D8-9C78-000A95C88220@myrealbox.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

Hi Tom

On Jan 3, 2004, at 2:31 PM, Tom Lane wrote:
> The thing we are missing (i.e., what makes it crash) is an internal
> representation that allows a tuple to be embedded as a field of a
> larger
> tuple. I've looked at this a couple of times, and each time concluded
> that it was more work than I could afford to spend at the moment. The
> support-such-as-it-is for tuple return values uses a structure that has
> embedded pointers, and it doesn't make any effort to get rid of
> out-of-line TOAST pointers within the tuple. Neither one of those
> things are acceptable for a tuple that's trying to act like a Datum.

Would you mind explaining this a little more, or pointing me to where I
can learn more about this? I looked through the html docs for TOAST,
and only found a brief mention regarding large objects and user-defined
types, but it doesn't get into it in very much detail. (Well, there's
the sliced bread index entry, also. :)

Michael Glaesemann
grzm myrealbox com


From: Michael Glaesemann <grzm(at)myrealbox(dot)com>
To: "Chris Travers" <chris(at)travelamericas(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Bug and/or feature? Complex data types in tables...
Date: 2004-01-15 09:23:21
Message-ID: 76363C6F-473C-11D8-9C78-000A95C88220@myrealbox.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

Hi Chris,

I know this thread is a little old, but it's something I'm interested
in learning more about.

On Jan 2, 2004, at 9:59 PM, Chris Travers wrote:
> AFAICS, there are only one thing missing and it could probably be
> worked
> around if the backend did nto crash when you try to retrieve the
> information
> via a casting function. It is:
>
> Some way to define a standard input and output representation (how it
> is
> done in C).
>
> If you can define your own casts, you can then select complex::text
> from
> mytable (but this would crash the backend again :-( )

Could you explain this a little more? My strengths (such as they are)
are more on relational theory rather than implementation.

My interpretation of what you're saying (which is probably just
restating what's obvious to others) is that there isn't a way to define
the input and output functions in (the PostgreSQL flavor of) SQL. You
have to do it in C, as described in the "User-Defined Types" section
(33.10).

I'm unclear about what follows. Using SELECT complex::text FROM mytable
would be used to get data out of the table. How would you get it in?
How do user-defined casts help out with this?

Thanks for your time! I'm slowing trying to learn here. I'm interested
in figuring out how to implement point and interval/duration types for
temporal work, but know I have a lot to learn to make this possible.

Michael Glaesemann
grzm myrealbox com


From: Michael Glaesemann <grzm(at)myrealbox(dot)com>
To: "Chris Travers" <chris(at)travelamericas(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Bug and/or feature? Complex data types in tables...
Date: 2004-01-15 11:29:41
Message-ID: 1C75A818-474E-11D8-878B-000A95C88220@myrealbox.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers


On Jan 15, 2004, at 6:40 PM, Chris Travers wrote:
>> I'm unclear about what follows. Using SELECT complex::text FROM
>> mytable
>> would be used to get data out of the table. How would you get it in?
>> How do user-defined casts help out with this?
>
> Simple, you have to define a complex() function which takes real and
> imaginary components and returns a complex type.

Sorry. This is what I had shown already. I was trying to figure out how
casts were involved.

> You can then insert it as:
> insert into complex_nums (c_num, text_rep) values (complex('2', '3'),
> complex_to_text(complex('2','3'));
> This will successfully return, however, you can only retrieve the
> value that
> is stored as the text string. Anything else causes the backend to
> *crash.*

So basically you're storing a representation of the complex number (or
other type) as text, rather than as the type itself. So, as Tom
mentioned, you're left writing a function to parse that text string and
return a the value as the composite type you want. Am I catching on?

>>
>> Thanks for your time! I'm slowing trying to learn here. I'm interested
>> in figuring out how to implement point and interval/duration types for
>> temporal work, but know I have a lot to learn to make this possible.
>>
> Is there a problem with the built in definitions of point and interval?

I'm thinking along the lines of the temporal proposals Date, Darwen,
and Lorentzos ("Temporal Data and the Relational Model"). Their
"interval" type is more along the lines of a beginning and end time,
such as ['2003-1-23':'2003-1-25'], rather than just '2 days'. They
generalize this to be useful for any ordered, discrete sequence, such
as integers, even numbers, weekdays, part number sequences (if
appropriate), or primes. A point type, in this case, is a value that is
part of such a sequence, e.g., '2003-1-23' could be a date point type
useful in a date interval. Here's a link to Hugh Darwen's summary of
the book:

<http://www.hughdarwen.freeola.com/TheThirdManifesto.web/
TemporalData.pdf>

Michael Glaesemann
grzm myrealbox com


From: Michael Glaesemann <grzm(at)myrealbox(dot)com>
To: "Chris Travers" <chris(at)travelamericas(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Bug and/or feature? Complex data types in tables...
Date: 2004-01-15 11:49:53
Message-ID: EEB379A2-4750-11D8-B824-000A95C88220@myrealbox.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers


On Jan 15, 2004, at 6:50 PM, Chris Travers wrote:

>> Would you mind explaining this a little more, or pointing me to where
>> I
>> can learn more about this? I looked through the html docs for TOAST,
>> and only found a brief mention regarding large objects and
>> user-defined
>> types, but it doesn't get into it in very much detail. (Well, there's
>> the sliced bread index entry, also. :)
>
> Tom can correct me if I am wrong, but iirc, there is a limit to how
> much
> information can be stored inline in a table. In order to store larger
> rows,
> these can be compressed or moved out of the table into TOAST.
> PostgreSQL
> needs to be able to know how to handle these issues. TOAST is then
> significant because it allows you to store, say 1GB of text in a field
> without using a large number of pages in the table and thus slowing
> down the
> seq_scan's, and possibly introducing other problems.

Okay. This much I think I follow.

> With complex types, this could become far harder, especially if you
> want to
> move only parts of the complex type into TOAST...

This part I'm not sure I understand. (Again, you're meaning composite
types in general, not complex types (x + yi) in particular, right?). I
did find the TOAST developers site where there's a little more
information about TOAST. What you're saying is that it might be
difficult to figure out how to split a composite type to off-load part
of it onto a TOAST table?

> I would settle for an implimentation that:
> 1: Moved all or none of the entity into TOAST, (i.e. not moving
> individual
> components) as this is not done for other datatypes.

Thus you don't need to figure out how to split it, right?

> 2: Could only do functional indexing of complex types, as this would
> get
> around the issues of display and searching.
> 3: Required explicit casting to simple data types.

Could you give an example of this last one?

Michael Glaesemann
grzm myrealbox com


From: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
To: Michael Glaesemann <grzm(at)myrealbox(dot)com>
Cc: Chris Travers <chris(at)travelamericas(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Bug and/or feature? Complex data types in tables...
Date: 2004-01-15 13:31:21
Message-ID: 20040115133120.GA8382@dcc.uchile.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

On Thu, Jan 15, 2004 at 08:29:41PM +0900, Michael Glaesemann wrote:

> >Is there a problem with the built in definitions of point and interval?
>
> I'm thinking along the lines of the temporal proposals Date, Darwen,
> and Lorentzos ("Temporal Data and the Relational Model"). Their
> "interval" type is more along the lines of a beginning and end time,
> such as ['2003-1-23':'2003-1-25'], rather than just '2 days'.

Maybe it's tangential to this discussion, but there's a type to store
that kind of intervals. It's called tinterval. ISTM it's not too much
documented, and I haven't really used it.

alvherre=> select tinterval('2003-10-28', '2004-11-05');
tinterval
-----------------------------------------------------
["2003-10-28 00:00:00-03" "2004-11-05 00:00:00-03"]
(1 fila)

alvherre=> select tinterval('2003-10-28', '2004-11-05') && tinterval('2003-10-05', '2003-10-15');
?column?
----------
f
(1 fila)

alvherre=> select tinterval('2003-10-28', '2004-11-05') && tinterval('2003-10-05', '2003-11-15');
?column?
----------
t
(1 fila)

(There are other operators, of course ...)

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
Criptografía: Poderosa técnica algorítmica de codificación que es
empleada en la creación de manuales de computadores.


From: Michael Glaesemann <grzm(at)myrealbox(dot)com>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: pgsql-general(at)postgresql(dot)org, Chris Travers <chris(at)travelamericas(dot)com>
Subject: Re: Bug and/or feature? Complex data types in tables...
Date: 2004-01-15 17:47:55
Message-ID: F2F2B8F2-4782-11D8-B824-000A95C88220@myrealbox.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

Hi Alvaro,

On Jan 15, 2004, at 10:31 PM, Alvaro Herrera wrote:

> Maybe it's tangential to this discussion, but there's a type to store
> that kind of intervals. It's called tinterval. ISTM it's not too much
> documented, and I haven't really used it.
>
> alvherre=> select tinterval('2003-10-28', '2004-11-05');
> tinterval
> -----------------------------------------------------
> ["2003-10-28 00:00:00-03" "2004-11-05 00:00:00-03"]
> (1 fila)
>
> alvherre=> select tinterval('2003-10-28', '2004-11-05') &&
> tinterval('2003-10-05', '2003-10-15');
> ?column?
> ----------
> f
> (1 fila)
>
> alvherre=> select tinterval('2003-10-28', '2004-11-05') &&
> tinterval('2003-10-05', '2003-11-15');
> ?column?
> ----------
> t
> (1 fila)
>
> (There are other operators, of course ...)

Well, lookee here! Thanks a lot, Alvaro!. Tangential or not, it's
definitely something I'm interested in! More exploring to do!

Michael Glaesemann
grzm myrealbox com


From: Richard Huxton <dev(at)archonet(dot)com>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Bug and/or feature? Complex data types in tables...
Date: 2004-01-15 19:23:09
Message-ID: 200401151923.09120.dev@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

On Thursday 15 January 2004 13:31, Alvaro Herrera wrote:
>
> Maybe it's tangential to this discussion, but there's a type to store
> that kind of intervals. It's called tinterval. ISTM it's not too much
> documented, and I haven't really used it.

Just a note to echo Michael's thanks for the pointer on this one. A whole type
I've not come across before - excellent :-)

--
Richard Huxton
Archonet Ltd


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Michael Glaesemann <grzm(at)myrealbox(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org, "Chris Travers" <chris(at)travelamericas(dot)com>
Subject: Re: [GENERAL] Bug and/or feature? Complex data types in tables...
Date: 2004-01-16 00:40:45
Message-ID: 25848.1074213645@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

[ moved to pg-hackers, since it's *way* off topic for -general ]

Michael Glaesemann <grzm(at)myrealbox(dot)com> writes:
> On Jan 3, 2004, at 2:31 PM, Tom Lane wrote:
>> The thing we are missing (i.e., what makes it crash) is an internal
>> representation that allows a tuple to be embedded as a field of a
>> larger
>> tuple. I've looked at this a couple of times, and each time concluded
>> that it was more work than I could afford to spend at the moment. The
>> support-such-as-it-is for tuple return values uses a structure that has
>> embedded pointers, and it doesn't make any effort to get rid of
>> out-of-line TOAST pointers within the tuple. Neither one of those
>> things are acceptable for a tuple that's trying to act like a Datum.

> Would you mind explaining this a little more, or pointing me to where I
> can learn more about this?

Well, to make composite data types into real first-class citizens, we
have to be able to represent their values as ordinary Datums that don't
act differently from run-of-the-mill Datums, except when some operation
that actually wants to understand the contents of the value is invoked.
I think the only workable representation is as a variable-length datum
along the lines of

int32 length word (overall length of datum)
OID type indicator (OID of the composite type)
header fields similar to a normal on-disk tuple
null bitmap if needed
values of fields (themselves also Datums)

It's possible we could leave out the type OID, but given that we found
it useful to include an element type OID in array headers, I'm betting
we want one for composite-type values too. Without it, we must always
know the exact composite type makeup from context. (But see below.)

Now, this structure could be TOASTed as a whole, since it's just a
varlena data type. But we cannot expect the toasting routines to look
inside it --- that would imply that it's not like other varlena data
types after all. That means that the contained fields had better not be
out-of-line TOAST value references, because there's no way to keep track
of them and keep from deleting the referenced value too soon. (It would
be workable for them to be compressed inline, but I suspect we don't
really want that either, since it'd interfere with attempts to compress
the overall datum.) So somehow we'd need to expand out any toasted
component fields, at least before attempting to store such a datum on
disk. Not sure where to do that cleanly.

The other point was that what's actually returned at the moment from a
function-returning-tuple is a Datum that contains a pointer to a
TupleTableSlot, not a pointer to a datum of this kind. (Look in
executor/functions.c and executor/execQual.c to see the related code.)
We'd need to change that API, which would be a good thing to do, but
it'd no doubt break some user-written functions.

In particular I do not know how we'd handle functions declared to return
RECORD --- in general, they may need to return composite types that are
defined on-the-fly and don't have any associated type OID. The
TupleTableSlot convention works for this since it can include a
tupledescriptor that was built on the fly. We can't have tupdescs
embedded in datums stored on disk, however.

regards, tom lane


From: Joe Conway <mail(at)joeconway(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Glaesemann <grzm(at)myrealbox(dot)com>, pgsql-hackers(at)postgreSQL(dot)org, Chris Travers <chris(at)travelamericas(dot)com>
Subject: Re: [GENERAL] Bug and/or feature? Complex data types in
Date: 2004-01-16 05:07:37
Message-ID: 40077199.7080906@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

Tom Lane wrote:
> int32 length word (overall length of datum)
> OID type indicator (OID of the composite type)
> header fields similar to a normal on-disk tuple
> null bitmap if needed
> values of fields (themselves also Datums)
>
> It's possible we could leave out the type OID, but given that we found
> it useful to include an element type OID in array headers, I'm betting
> we want one for composite-type values too. Without it, we must always
> know the exact composite type makeup from context. (But see below.)

Makes sense. But see below...

> Now, this structure could be TOASTed as a whole, since it's just a
> varlena data type. But we cannot expect the toasting routines to look
> inside it --- that would imply that it's not like other varlena data
> types after all. That means that the contained fields had better not be
> out-of-line TOAST value references, because there's no way to keep track
> of them and keep from deleting the referenced value too soon.

Why wouldn't we handle this just like we do when we build an array from
elemental datums (i.e. allocate sufficient space and copy the individual
datums into the structure)?

Continuing the analogy:

int32 size; /* overall length of datum */
int flags; /* null-bitmap indicator, others reserved */
Oid relid; /* OID of the composite type */
int16 t_natts; /* number of attributes */
bits8 t_bits[1]; /* null bitmap if needed */
Datum *values /* values of fields */

values would be built similar to how its done in
construct_md_array/CopyArrayEls/ArrayCastAndSet

The overlying datatype would be similar to anyarray.

AFAICS SQL2003 (and SQL99) defines something similar to this as a "row
type". It looks like this:

ROW ( column definition list )

But it also seems to equate a table's-row type to a "row type" in
section 4.8 (Row types):

"A row type is a sequence of (<field name> <data type>) pairs, called
fields. It is described by a row type descriptor. A row type descriptor
consists of the field descriptor of every field of the row type.

The most specific type of a row of a table is a row type. In this
case, each column of the row corresponds to the field of the row type
that has the same ordinal position as the column."

So maybe as an extension to the standard, we could allow something like:

ROW composite_type_name

Example:

CREATE TABLE foo (id int, tup ROW (f1 int, f2 text));

or ...

CREATE TABLE bar (f1 int, f2 text);
CREATE TABLE foo (id int, tup ROW bar);

> The other point was that what's actually returned at the moment from a
> function-returning-tuple is a Datum that contains a pointer to a
> TupleTableSlot, not a pointer to a datum of this kind.

If you had something akin to arrayin/arrayout, would this still need to
be changed?

Joe


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: Michael Glaesemann <grzm(at)myrealbox(dot)com>, pgsql-hackers(at)postgreSQL(dot)org, Chris Travers <chris(at)travelamericas(dot)com>
Subject: Re: [GENERAL] Bug and/or feature? Complex data types in tables...
Date: 2004-01-16 15:02:32
Message-ID: 29619.1074265352@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-general pgsql-hackers

Joe Conway <mail(at)joeconway(dot)com> writes:
> Tom Lane wrote:
>> ... That means that the contained fields had better not be
>> out-of-line TOAST value references, because there's no way to keep track
>> of them and keep from deleting the referenced value too soon.

> Why wouldn't we handle this just like we do when we build an array from
> elemental datums (i.e. allocate sufficient space and copy the individual
> datums into the structure)?

Well, the problem is that there are tuples and there are tuples. We do
*not* want to force expansion of TOAST references every time we build an
intermediate tuple to return up one level in a plan. That would cost
gobs of memory, and it's possible the expanded value will never actually
be used at all (eg, the row might fail a join qual further up the plan).
Ideally the forced expansion should only occur if a composite-type tuple
is actually about to be stored on disk. Maybe this says that the
toaster routines are the right place to take care of it after all, but
I'm not quite sure where it should go.

BTW, you could argue that TOAST references in a constructed array ought
not be expanded until/unless the array gets written to disk, too. But
the expense of scanning a large array on the off chance there's some
TOAST references in there might dissuade us from doing that. (Hmm ...
maybe use a flag bit in the array flag word?)

>> The other point was that what's actually returned at the moment from a
>> function-returning-tuple is a Datum that contains a pointer to a
>> TupleTableSlot, not a pointer to a datum of this kind.

> If you had something akin to arrayin/arrayout, would this still need to
> be changed?

I don't see the connection. This is an internal representation either
way, and there's no point at which one would want to invoke an I/O
routine.

regards, tom lane