Using ENUM with huge NAMEDATALEN

Lists: pgsql-performance
From: "David Andersen" <david(at)andersen(dot)gs>
To: pgsql-performance(at)postgresql(dot)org
Subject: Using ENUM with huge NAMEDATALEN
Date: 2008-07-26 21:22:09
Message-ID: c6bf5b380807261422i77ed5b3evf55da6891e2e4170@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi All,

This is my first post so please be gentle if I make any mistakes.

I know that ENUMs were not intended to be used instead of joins, but I
believe that I my case there will be huge saves on using Enums. Every night
I will drop the database and rebuild it from CSV files. The database will be
quite large with around 200 million rows in a fact table (only numbers).
There are about 10 dimension tables (lookup table with ID and text columns).
What is very special is that the software I am 100% dependent on to generate
queries, joins the fact table will all dimension tables in all cases. Mostly
my queries do not include any WHERE clauses (apart from those needed for
joins) only GROUP BY and HAVING. Joining 200 million rows with side tables
that on average have about 50000 rows, is heavy. When I need results within
10 seconds, I believe there is a need for Enums.

One of the side tables is text with length of up to 900 characters. My
question is whether I could build PostgreSQL with NAMEDATALEN (which
controls the max size of Enums) equal to 900? Or do you see any things going
wrong then (like stability problems or whatever)?

Regards,

David


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "David Andersen" <david(at)andersen(dot)gs>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Using ENUM with huge NAMEDATALEN
Date: 2008-07-26 21:44:24
Message-ID: 4191.1217108664@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

"David Andersen" <david(at)andersen(dot)gs> writes:
> One of the side tables is text with length of up to 900 characters. My
> question is whether I could build PostgreSQL with NAMEDATALEN (which
> controls the max size of Enums) equal to 900?

I wouldn't recommend it. Consider changing pg_enum.enumlabel to type
TEXT instead.

regards, tom lane


From: "David Andersen" <david(at)andersen(dot)gs>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Using ENUM with huge NAMEDATALEN
Date: 2008-07-26 21:48:50
Message-ID: c6bf5b380807261448h65aac228i112cdca380bf4b4@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi Tom,

Thanks a lot for the tip! I will try this. You probably saved be a few days
of work!

Regards,

David

On Sat, Jul 26, 2008 at 11:44 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> "David Andersen" <david(at)andersen(dot)gs> writes:
> > One of the side tables is text with length of up to 900 characters. My
> > question is whether I could build PostgreSQL with NAMEDATALEN (which
> > controls the max size of Enums) equal to 900?
>
> I wouldn't recommend it. Consider changing pg_enum.enumlabel to type
> TEXT instead.
>
> regards, tom lane
>
>
>


From: "David Andersen" <david(at)andersen(dot)gs>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Using ENUM with huge NAMEDATALEN
Date: 2008-07-27 00:13:47
Message-ID: c6bf5b380807261713m2c413388y22814f22a8f6fd6a@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

Hi again,

I am attempting to alter pg_enum.enumlabel to Text, but I seem to run into a
strange permission problem with regards to system tables. I am not allowed
to modify them even if I am a superuser. Some output to make sure I do not
make any novice mistakes:

C:\Program Files\PostgreSQL\8.3\bin>psql -d postgres -U postgres
Welcome to psql 8.3.0, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit

Warning: Console code page (850) differs from Windows code page (1252)
8-bit characters might not work correctly. See psql reference
page "Notes for Windows users" for details.

postgres=# GRANT ALL privileges ON TABLE pg_enum TO postgres;
GRANT
postgres=# alter table pg_enum ALTER COLUMN enumlabel TYPE text;
ERROR: permission denied: "pg_enum" is a system catalog
STATEMENT: alter table pg_enum ALTER COLUMN enumlabel TYPE text;
ERROR: permission denied: "pg_enum" is a system catalog
postgres=# \du
List of roles
Role name | Superuser | Create role | Create DB | Connections | Member of
-----------+-----------+-------------+-----------+-------------+-----------
postgres | yes | yes | yes | no limit | {}
tull | yes | yes | yes | no limit | {}
(2 rows)

postgres=#

Thanks in advance for your help.

Regards,

David

On Sat, Jul 26, 2008 at 11:48 PM, David Andersen <david(at)andersen(dot)gs> wrote:

> Hi Tom,
>
> Thanks a lot for the tip! I will try this. You probably saved be a few days
> of work!
>
> Regards,
>
> David
>
>
> On Sat, Jul 26, 2008 at 11:44 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> "David Andersen" <david(at)andersen(dot)gs> writes:
>> > One of the side tables is text with length of up to 900 characters. My
>> > question is whether I could build PostgreSQL with NAMEDATALEN (which
>> > controls the max size of Enums) equal to 900?
>>
>> I wouldn't recommend it. Consider changing pg_enum.enumlabel to type
>> TEXT instead.
>>
>> regards, tom lane
>>
>>
>>
>


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "David Andersen" <david(at)andersen(dot)gs>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Using ENUM with huge NAMEDATALEN
Date: 2008-07-27 00:36:00
Message-ID: 5850.1217118960@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-performance

"David Andersen" <david(at)andersen(dot)gs> writes:
> I am attempting to alter pg_enum.enumlabel to Text, but I seem to run into a
> strange permission problem with regards to system tables. I am not allowed
> to modify them even if I am a superuser.

ALTER TABLE is hardly gonna be sufficient on a system catalog anyway, as
knowledge of its rowtype is generally hardwired into the C code. You'd
have to modify src/include/catalog/pg_enum.h and then go around and find
all the references to enumlabel and fix them to know it's text not name.
Fortunately, this being not a widely used catalog, there shouldn't be
too many places to fix. Right offhand, it looks like the indexing.h
definition of its index and about three places in pg_enum.c would be all
that have to change.

Note that this would be an initdb-forcing change and so you should also
bump the catversion number.

regards, tom lane