Re: enum types and binary queries

Lists: pgsql-hackerspgsql-patches
From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: enum types and binary queries
Date: 2007-08-30 17:13:17
Message-ID: b42b73150708301013y58eb976bi38dc8acb3471a603@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

I noticed that enums are not available to be queried as binary through
the protocol. Is this a known issue? Too late to fix for 8.3? This
is kind of a pain, because it forces any query that returns an enum to
return the entire result as text. afaik, enums are the only POD type
to behave this way.

postgres=# create type foo as enum('foo');
CREATE TYPE

postgres=# copy (select 'foo'::foo) to '/home/postgres/foo.txt' binary;
ERROR: no binary output function available for type foo

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: enum types and binary queries
Date: 2007-08-30 18:48:02
Message-ID: 46D710E2.3070604@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Merlin Moncure wrote:
> I noticed that enums are not available to be queried as binary through
> the protocol. Is this a known issue? Too late to fix for 8.3? This
> is kind of a pain, because it forces any query that returns an enum to
> return the entire result as text. afaik, enums are the only POD type
> to behave this way.
>
> postgres=# create type foo as enum('foo');
> CREATE TYPE
>
> postgres=# copy (select 'foo'::foo) to '/home/postgres/foo.txt' binary;
> ERROR: no binary output function available for type foo
>

The trouble is that an enum doesn't have an immutable internal binary value.

I guess we could say that the binary value is the integer offset of the
value in the enum ordering, and translate it back on input. Providing
the binary IO funcs shouldn't be too hard if we do that, unless I'm
missing something.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
Cc: "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: enum types and binary queries
Date: 2007-08-30 18:58:48
Message-ID: 7511.1188500328@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

"Merlin Moncure" <mmoncure(at)gmail(dot)com> writes:
> I noticed that enums are not available to be queried as binary through
> the protocol.

What do you think the binary representation ought to be? Copying OIDs
seems pretty useless.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: enum types and binary queries
Date: 2007-08-30 19:21:28
Message-ID: 8516.1188501688@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Merlin Moncure wrote:
>> I noticed that enums are not available to be queried as binary through
>> the protocol.

> The trouble is that an enum doesn't have an immutable internal binary value.

> I guess we could say that the binary value is the integer offset of the
> value in the enum ordering, and translate it back on input. Providing
> the binary IO funcs shouldn't be too hard if we do that, unless I'm
> missing something.

That would be expensive to get, wouldn't it? The main point of binary
I/O is to be fast, so I'm not excited about an expensive conversion.

What if we just defined the binary format as being identical to the text
format, ie, the enum label? A bit silly but at least it would eliminate
the gotcha that binary I/O fails.

regards, tom lane


From: "korry(dot)douglas" <korry(dot)douglas(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Merlin Moncure" <mmoncure(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: enum types and binary queries
Date: 2007-08-30 19:47:24
Message-ID: 46D71ECC.2080500@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


> What if we just defined the binary format as being identical to the text
> format, ie, the enum label? A bit silly but at least it would eliminate
> the gotcha that binary I/O fails.
>
Seems like it would make a lot more sense to the client application that
way too...

-- Korry


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: enum types and binary queries
Date: 2007-08-30 21:44:16
Message-ID: 46D73A30.9000108@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> What if we just defined the binary format as being identical to the text
> format, ie, the enum label? A bit silly but at least it would eliminate
> the gotcha that binary I/O fails.
>
>
>

So we should pretty much mimic text_send/text_recv? If so I can probably
get that done fairly quickly.

cheers

andrew


From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Subject: Re: enum types and binary queries
Date: 2007-08-31 12:49:05
Message-ID: b42b73150708310549w28da546av7564ee7e4e9abb6d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On 8/30/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Merlin Moncure" <mmoncure(at)gmail(dot)com> writes:
> > I noticed that enums are not available to be queried as binary through
> > the protocol.
>
> What do you think the binary representation ought to be? Copying OIDs
> seems pretty useless.

I'm on the fence on this one.

I actually think this would be ok, if you mean pg_enum.oid, or the
string would be fine too. I agree that binary protocol is supposed to
be fast, and I can prefetch the pg_enum table to the client and do the
magic there. Many other binary formats do similarly inscrutable
things.

One other very small observation: afaict, there appears to be no way
to list enum contents in psql (although you can list the enums
themselves in the type listing). Maybe this should be possible? I'm
willing to take a stab at these things if Andrew is busy.

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: enum types and binary queries
Date: 2007-08-31 13:14:47
Message-ID: 46D81447.8060007@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Merlin Moncure wrote:
>
> One other very small observation: afaict, there appears to be no way
> to list enum contents in psql (although you can list the enums
> themselves in the type listing).

select enum_range(null::myenumtype);

> I'm willing to take a stab at these things if Andrew is busy.
>
>
>

I should have a cut of binary I/O done today, which I will send to you
for testing and TomD for comment. We need to get it in by beta as it's a
catalog change.

cheers

andrew


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
Cc: "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Subject: Re: enum types and binary queries
Date: 2007-08-31 13:38:34
Message-ID: 22949.1188567514@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

"Merlin Moncure" <mmoncure(at)gmail(dot)com> writes:
> On 8/30/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> What do you think the binary representation ought to be? Copying OIDs
>> seems pretty useless.

> I actually think this would be ok, if you mean pg_enum.oid, or the
> string would be fine too. I agree that binary protocol is supposed to
> be fast, and I can prefetch the pg_enum table to the client and do the
> magic there. Many other binary formats do similarly inscrutable
> things.

Well, inscrutable is one thing and unportable is another. It's supposed
to be possible to reload binary COPY data into a fresh database --- with
maybe some restrictions on the architecture being similar, for the more
machine-specific datatypes such as float. If we emit raw OIDs then this
will never work, since the same type definition made in a fresh database
would have the same OIDs only by awe-inspiring coincidence.

Andrew's idea of using the enum ordinal value would meet that test, but
at least with the current layout of pg_enum it would be quite expensive
to do the conversion in either direction --- you'd have to fetch
multiple catalog rows. I think we'd have to add another column showing
the ordinal value, and put an index on it, to make I/O reasonably fast.
Doesn't really seem worth it.

regards, tom lane


From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Subject: Re: enum types and binary queries
Date: 2007-08-31 13:41:11
Message-ID: 87odgn2554.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

"Merlin Moncure" <mmoncure(at)gmail(dot)com> writes:

> On 8/30/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> "Merlin Moncure" <mmoncure(at)gmail(dot)com> writes:
>> > I noticed that enums are not available to be queried as binary through
>> > the protocol.
>>
>> What do you think the binary representation ought to be? Copying OIDs
>> seems pretty useless.
>
> I'm on the fence on this one.
>
> I actually think this would be ok, if you mean pg_enum.oid, or the
> string would be fine too. I agree that binary protocol is supposed to
> be fast, and I can prefetch the pg_enum table to the client and do the
> magic there. Many other binary formats do similarly inscrutable
> things.

I think it would be ok only if a pg_dump/pg_restore reliably restored the same
oid->enum value mapping. Otherwise a binary dump is useless. But as I
understand it that's the case currently, is it?

You would also have to have some way for a client to simply look up the enum
mapping. Something like an SRF that returned setof(oid,name).

My first instinct was to just use the enum name but I'm starting to think this
would be better. It is more in line with the promise of enums which is that
the names are just labels and the data internally is fixed size.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com


From: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: enum types and binary queries
Date: 2007-08-31 14:37:12
Message-ID: 20070831143712.GE5925@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Gregory Stark escribió:

> I think it would be ok only if a pg_dump/pg_restore reliably restored the same
> oid->enum value mapping. Otherwise a binary dump is useless. But as I
> understand it that's the case currently, is it?

That doesn't work if the dump is restored on a database that already has
those OIDs used for another enum. The easy workaround would be to
"rewrite" the data to use the new OID, but believe me, you don't want to
go down that route, lest madness await you at the end.

--
Alvaro Herrera http://www.advogato.org/person/alvherre
"Uno puede defenderse de los ataques; contra los elogios se esta indefenso"


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Tom Dunstan <tom(at)tomd(dot)cc>
Subject: Re: [HACKERS] enum types and binary queries
Date: 2007-08-31 14:44:43
Message-ID: 46D8295B.4040908@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
>
> Andrew's idea of using the enum ordinal value would meet that test, but
> at least with the current layout of pg_enum it would be quite expensive
> to do the conversion in either direction --- you'd have to fetch
> multiple catalog rows. I think we'd have to add another column showing
> the ordinal value, and put an index on it, to make I/O reasonably fast.
> Doesn't really seem worth it.
>
>
>

Yeah. I think we should treat enums just as we do text, for this purpose.

Here's a patch (minus catalog bump) which I think does that.

cheers

andrew

Attachment Content-Type Size
enumbin.patch text/x-patch 5.4 KB

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>
Cc: "Merlin Moncure" <mmoncure(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Subject: Re: enum types and binary queries
Date: 2007-08-31 14:48:11
Message-ID: 873axz221g.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches


"Alvaro Herrera" <alvherre(at)commandprompt(dot)com> writes:

> Gregory Stark escribió:
>
>> I think it would be ok only if a pg_dump/pg_restore reliably restored the same
>> oid->enum value mapping. Otherwise a binary dump is useless. But as I
>> understand it that's the case currently, is it?

er, lost a "not" in the editing of that, sorry.

> That doesn't work if the dump is restored on a database that already has
> those OIDs used for another enum. The easy workaround would be to
> "rewrite" the data to use the new OID, but believe me, you don't want to
> go down that route, lest madness await you at the end.

enum OIDs are unique across enums? This seems like a strange way to do it. I
recall conversations about this a while back though and there were limitations
of the type system that led to this, right?

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Merlin Moncure" <mmoncure(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>
Subject: Re: enum types and binary queries
Date: 2007-08-31 14:56:15
Message-ID: 24381.1188572175@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> enum OIDs are unique across enums? This seems like a strange way to do it.

That decision was already made, we are not revisiting it (at least not
for 8.3).

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Tom Dunstan <tom(at)tomd(dot)cc>
Subject: Re: [HACKERS] enum types and binary queries
Date: 2007-08-31 15:00:21
Message-ID: 24489.1188572421@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Here's a patch (minus catalog bump) which I think does that.

Looks sane in a very quick once-over, but I didn't test it.

regards, tom lane


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: enum types and binary queries
Date: 2007-08-31 15:14:29
Message-ID: 46D83055.3000109@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Gregory Stark wrote:
> enum OIDs are unique across enums? This seems like a strange way to do it. I
> recall conversations about this a while back though and there were limitations
> of the type system that led to this, right?
>

No, not the type system as such. It stems from this quote from Tom:

> If an output function depends on anything more than the contents of
> the object it's handed, it's vulnerable to being lied to.
> http://archives.postgresql.org/pgsql-hackers/2005-04/msg00998.php

So the value passed to the enum_out function has to be sufficiently
unique to be able to look up the label. This arrangement got the best
combination of compactness and simplicity that we could come up with at
the time.

Incidentally, FWIW, I have heard tales of considerable speedup from
people being able to avoid using FKs/lookup tables by using enums.

cheers

andrew


From: Decibel! <decibel(at)decibel(dot)org>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: enum types and binary queries
Date: 2007-08-31 17:17:14
Message-ID: 20070831171714.GZ38801@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Fri, Aug 31, 2007 at 08:49:05AM -0400, Merlin Moncure wrote:
> On 8/30/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > "Merlin Moncure" <mmoncure(at)gmail(dot)com> writes:
> > > I noticed that enums are not available to be queried as binary through
> > > the protocol.
> >
> > What do you think the binary representation ought to be? Copying OIDs
> > seems pretty useless.
>
> I'm on the fence on this one.
>
> I actually think this would be ok, if you mean pg_enum.oid, or the
> string would be fine too. I agree that binary protocol is supposed to
> be fast, and I can prefetch the pg_enum table to the client and do the
> magic there. Many other binary formats do similarly inscrutable
> things.

The last time I worked on a project where we had C code access the
database, we added stuff to map C enums to ints in the database (along
with a parent table to store the enum label). ISTM it'd be good if we
had a way to get the numeric index out of an enum. I also like the idea
of having a fixed ordering to the labels in an enum.

> One other very small observation: afaict, there appears to be no way
> to list enum contents in psql (although you can list the enums
> themselves in the type listing). Maybe this should be possible? I'm
> willing to take a stab at these things if Andrew is busy.

Is there an SRF that will return this info? ISTM you should be able to
get the labels programmatically as well as via psql.
--
Decibel!, aka Jim Nasby decibel(at)decibel(dot)org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: "Decibel!" <decibel(at)decibel(dot)org>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: enum types and binary queries
Date: 2007-08-31 17:41:47
Message-ID: 46D852DB.1020203@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Decibel! wrote:
>
> The last time I worked on a project where we had C code access the
> database, we added stuff to map C enums to ints in the database (along
> with a parent table to store the enum label). ISTM it'd be good if we
> had a way to get the numeric index out of an enum.

If you mean here that you want to find the position in the ordering of
an enum value, it would be trivial to write, searching in the results
from enum_range().

Next cycle it might be worth adding a column to pg_enum with the
explicit order. My serious worry, though, is that it might lead people
to think they could alter that column and thereby change the ordering,
which of course they can't. (Quite apart from anything else, a mutable
ordering would play havoc with enums used in indexes.)

The technique of using a lookup table that you seem to refer to doesn't
need any special support from the catalogs or the type system. It's used
today in millions of applications. But it can have quite a high cost in
extra joins required to get the labels and extra application complexity.
For a case where the values in the domain of labels are truly fixed,
enums offer a much more performant and much simpler way to go.

> I also like the idea
> of having a fixed ordering to the labels in an enum.
>

I do not understand what this sentence means. The ordering *is* fixed -
it is defined by the order in which the labels are given in the create
type statement. And the oids that are assigned to the values in pg_enum
are sorted before being assigned to the labels precisely so that they
reflect this ordering. So rest assured that a given enum type will have
a fixed ordering, and it will be consistent across a dump/restore. What
will not necessarily be consistent is the actual oids used, making the
oids unsuitable for use in binary output as noted upthread.
>
>> One other very small observation: afaict, there appears to be no way
>> to list enum contents in psql (although you can list the enums
>> themselves in the type listing). Maybe this should be possible? I'm
>> willing to take a stab at these things if Andrew is busy.
>>
>
> Is there an SRF that will return this info? ISTM you should be able to
> get the labels programmatically as well as via psql.
>

Maybe you need to read
http://developer.postgresql.org/pgdocs/postgres/functions-enum.html to
see info we have made available. We fully expect this list of functions
to grow as we discover how enums are used in practice.

cheers

andrew


From: Decibel! <decibel(at)decibel(dot)org>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: enum types and binary queries
Date: 2007-08-31 17:58:36
Message-ID: 20070831175836.GD38801@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On Fri, Aug 31, 2007 at 01:41:47PM -0400, Andrew Dunstan wrote:
> The technique of using a lookup table that you seem to refer to doesn't
> need any special support from the catalogs or the type system. It's used
> today in millions of applications. But it can have quite a high cost in
> extra joins required to get the labels and extra application complexity.
> For a case where the values in the domain of labels are truly fixed,
> enums offer a much more performant and much simpler way to go.

AIUI, in C code it's easiest to deal with the int value that a C enum
gets, rather than dealing with a label coming back from the database. I
know that's what we did where I worked; the enum column stored the
corresponding C int, and that's what was used as the PK in the lookup
table.

ISTM it'd be good if we could do the same with our enums (pass the int
value back instead of a label).

> >I also like the idea
> >of having a fixed ordering to the labels in an enum.
> >
>
> I do not understand what this sentence means. The ordering *is* fixed -
> it is defined by the order in which the labels are given in the create
> type statement. And the oids that are assigned to the values in pg_enum
> are sorted before being assigned to the labels precisely so that they
> reflect this ordering. So rest assured that a given enum type will have
> a fixed ordering, and it will be consistent across a dump/restore. What
> will not necessarily be consistent is the actual oids used, making the
> oids unsuitable for use in binary output as noted upthread.

What if the OID counter wraps in the middle of adding the labels? (IE:
create a 4 label ENUM when the OID counter is 1 number away from
wrapping).

If we ever add support for adding additional labels to enums this could
be an issue too.

> Maybe you need to read
> http://developer.postgresql.org/pgdocs/postgres/functions-enum.html to
> see info we have made available. We fully expect this list of functions
> to grow as we discover how enums are used in practice.

Looks good... should still be a psql command, imo. Perhaps as part of
\dT...
--
Decibel!, aka Jim Nasby decibel(at)decibel(dot)org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: "Decibel!" <decibel(at)decibel(dot)org>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: enum types and binary queries
Date: 2007-08-31 18:26:01
Message-ID: 46D85D39.7060208@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Decibel! wrote:
> On Fri, Aug 31, 2007 at 01:41:47PM -0400, Andrew Dunstan wrote:
>
>> The technique of using a lookup table that you seem to refer to doesn't
>> need any special support from the catalogs or the type system. It's used
>> today in millions of applications. But it can have quite a high cost in
>> extra joins required to get the labels and extra application complexity.
>> For a case where the values in the domain of labels are truly fixed,
>> enums offer a much more performant and much simpler way to go.
>>
>
> AIUI, in C code it's easiest to deal with the int value that a C enum
> gets, rather than dealing with a label coming back from the database. I
> know that's what we did where I worked; the enum column stored the
> corresponding C int, and that's what was used as the PK in the lookup
> table.
>
> ISTM it'd be good if we could do the same with our enums (pass the int
> value back instead of a label).
>

Jim, you and I have had this discussion before. The answer is the same
as the last time you asked about this: because it breaks the abstraction.

If the C code doesn't need the label then why store it at all? You can
just make the database field an int.

If you want to get the ordering offset of a particular enum value you
can write a function in about 10 lines of C that will give it to you.

>
>>> I also like the idea
>>> of having a fixed ordering to the labels in an enum.
>>>
>>>
>> I do not understand what this sentence means. The ordering *is* fixed -
>> it is defined by the order in which the labels are given in the create
>> type statement. And the oids that are assigned to the values in pg_enum
>> are sorted before being assigned to the labels precisely so that they
>> reflect this ordering. So rest assured that a given enum type will have
>> a fixed ordering, and it will be consistent across a dump/restore. What
>> will not necessarily be consistent is the actual oids used, making the
>> oids unsuitable for use in binary output as noted upthread.
>>
>
> What if the OID counter wraps in the middle of adding the labels? (IE:
> create a 4 label ENUM when the OID counter is 1 number away from
> wrapping).
>

It will not be a problem. I have just explained that we sort them first.
This is a furfy that has been raised before and explained before. See
pg_enum.c starting around line 52. In particular:

/* sort them, just in case counter wrapped from high to low */
qsort(oids, n, sizeof(Oid), oid_cmp);

> If we ever add support for adding additional labels to enums this could
> be an issue too.
>

I doubt we will be doing it. You can get the effect by defining a new
type and using the old labels.

>
>> Maybe you need to read
>> http://developer.postgresql.org/pgdocs/postgres/functions-enum.html to
>> see info we have made available. We fully expect this list of functions
>> to grow as we discover how enums are used in practice.
>>
>
> Looks good... should still be a psql command, imo. Perhaps as part of
> \dT...
>

We'd have to special case enums, or provide a special \d command to
handle them. Not sure either is worth it when we have the functions anyway.

cheers

andrew


From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, "Tom Dunstan" <tom(at)tomd(dot)cc>
Subject: Re: [HACKERS] enum types and binary queries
Date: 2007-08-31 18:32:34
Message-ID: b42b73150708311132k563c3e94le52989667b8b186f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On 8/31/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> > Here's a patch (minus catalog bump) which I think does that.
>
> Looks sane in a very quick once-over, but I didn't test it.

works fine (here was my test). thanks for quick resolution to this
issue. strings returned in binary format is IMO ok.

enum.c:
include "libpq-fe.h"
#include "string.h"
#include <stdlib.h>

int main(int argc, char **argv)
{
PGconn *c = PQconnectdb("user=postgres");
PGresult *r;
r = PQexecParams(c, "select 'foo'::foo", 0, NULL, NULL, NULL, NULL, 1);

ExecStatusType t = PQresultStatus(r);

if(t != PGRES_COMMAND_OK & t != PGRES_TUPLES_OK)
{
printf("%s", PQresultErrorMessage(r));
exit(1);
}

char* f = PQgetvalue(r,0,0);
int len = 3;
int format = 1;
PQclear(r);

r = PQexecParams(c, "select $1::foo", 1, NULL, (const char* const
*)&f, &len, &format, 1);

if(t != PGRES_COMMAND_OK & t != PGRES_TUPLES_OK)
{
printf("%s", PQresultErrorMessage(r));
exit(1);
}

PQfinish(c);
}


From: "Merlin Moncure" <mmoncure(at)gmail(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, "Tom Dunstan" <tom(at)tomd(dot)cc>
Subject: Re: [HACKERS] enum types and binary queries
Date: 2007-08-31 18:36:59
Message-ID: b42b73150708311136w5af78f33r98b6275315ba8c2c@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

On 8/31/07, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On 8/31/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> > > Here's a patch (minus catalog bump) which I think does that.
> >
> > Looks sane in a very quick once-over, but I didn't test it.
>
> works fine (here was my test). thanks for quick resolution to this
> issue. strings returned in binary format is IMO ok.
> if(t != PGRES_COMMAND_OK & t != PGRES_TUPLES_OK)

oops, this line was wrong. the enum is fine though.

merlin


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: "Decibel!" <decibel(at)decibel(dot)org>, Merlin Moncure <mmoncure(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: enum types and binary queries
Date: 2007-08-31 18:55:47
Message-ID: 46D86433.6030303@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Andrew Dunstan wrote:
>
>
> This is a furfy that has been raised before and explained before.

Of course, as usual I misspelled it, the word is "furphy". I didn't
realise that it was an Australianism. It means more or less "a red
herring". Wikipedia says that it is a term particularly popular in
Australian politics, and as some people know my late father was a noted
practitioner of that art ;-)

cheers

andrew


From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Patches (PostgreSQL)" <pgsql-patches(at)postgresql(dot)org>, Tom Dunstan <tom(at)tomd(dot)cc>
Subject: Re: [HACKERS] enum types and binary queries
Date: 2007-09-04 16:46:19
Message-ID: 46DD8BDB.5010004@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers pgsql-patches

Merlin Moncure wrote:
> On 8/31/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>>
>>> Here's a patch (minus catalog bump) which I think does that.
>>>
>> Looks sane in a very quick once-over, but I didn't test it.
>>
>
> works fine (here was my test). thanks for quick resolution to this
> issue. strings returned in binary format is IMO ok.
>
>
>

Patch applied (with catalog bump).

cheers

andrew