Re: Efficient use of space in large table?

From: Manfred Koizar <mkoi-pg(at)aon(dot)at>
To: Josh Jore <josh(at)greentechnologist(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Efficient use of space in large table?
Date: 2002-07-05 13:36:37
Message-ID: 535biug0vv0v2olbhpt46nr8281ktqijtq@4ax.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, 4 Jul 2002 21:43:10 -0500 (CDT), Josh Jore
<josh(at)greentechnologist(dot)org> wrote:
>I was just wondering - I've got two large tables and I was wondering
>if there is anyway to shrink them somewhat. I imagined compression for
>non-indexed columns or something. Is varchar or char more efficient than
>text?
>
Josh,

first of all, text is ok. You might want to store NULL instead of ''
to squeeze out a few bytes here and there.

Now I have even more questions instead of answers :-)

PG version? OS?
Do you have lots of UPDATEs/DELETEs?
Do you ANALYZE regularly?
Please show us the outputs of
VACUUM VERBOSE sospeople;
VACUUM VERBOSE votes;

>
>A size summary
> relname | size | type | reltuples
>------------------------+-------+---------- +----------
> sospeople | 599MB | table | 2M
> sospeople_fn | 71MB | index | 2M
> sospeople_ln | 73MB | index | 2M
> sospeople_zip | 73MB | index | 2M
> votes | 937MB | table | 15M

Could you show us the tuple counts in your relations, just like the
wild guess I have inserted?

>
>
> Table "sospeople"
> Attribute | Type | Modifier
>------------------+---------+---------------------
> sosid | text | not null default ''
> countyname | text | not null
> firstname | text |
> middlename | text |
> lastname | text |
> suffix | text |
> homephone | text |
> registrationdate | text |
> birthyear | integer |
> status | text |
> precinctcode | text |
> housenumber | text |
> streetname | text |
> unittype | text |
> unit | text |
> address2 | text |
> city | text |
> state | text |
> zipcode | text |

What are the average sizes of these text columns? If there are long
repeated values (e.g. countyname), it might help do pull these out
into a separate table

CREATE TABLE county (id serial, name text);

and replace
> countyname | text | not null
by
county_id NOT NULL REFERENCES county

>Indices: sospeople_fn,
> sospeople_ln,
> sospeople_zip
>
Also post the results of
\d sospeople_fn
\d sospeople_ln
\d sospeople_zip

> Table "votes"
> Attribute | Type | Modifier
>------------+---------------+----------
> sosid | character(10) |
> electionid | integer |
> votetype | character(1) |

No index here?
Using sosno INT instead of sosid CHAR(10) could save you (vaguely
estimated) up to 10% of space for this table. OTOH you would have to
insert sosno into sospeople, so this would only be a win, if votes has
far more rows than sospeople.

>I'm sort of just grasping at straws for something to get the data
>smaller.

Is your problem really related to space, or to speed?

Servus
Manfred

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ricardo Junior 2002-07-05 13:42:18 Re: Indexes not working, please help.
Previous Message Michael Walzl 2002-07-05 13:24:54 Null in the where-clause