PostgreSQL Arrays and Performance

From: Marc Philipp <mail(at)marcphilipp(dot)de>
To: pgsql-general(at)postgresql(dot)org
Subject: PostgreSQL Arrays and Performance
Date: 2006-01-03 15:46:35
Message-ID: 927255D6-5FD9-4D06-93B6-C3FE25395C61@marcphilipp.de
Lists: pgsql-general

A few performance issues with PostgreSQL's arrays have led us to ask
how Postgres actually stores variable-length arrays. First, let me
explain our situation.

We have a rather large table containing a simple integer primary key
and a couple more columns of fixed size. However, there is a dates
column of type "timestamp without time zone[]" that is apparently
causing some severe performance problems.

During a daily update process new timestamps are collected and
existing data rows are being updated (new rows are also being added).
These changes affect a large percentage of the existing rows.

What we have been observing in the last few weeks is that the
overall database size is increasing rapidly because of this table,
and vacuum processes seem to deadlock with other processes querying
data from this table.

Therefore, the database keeps growing and becomes more and more
unusable. The only thing that helps is dumping and restoring it,
which is not something you are eager to do on a large live system on
a daily basis.

This problem led us to the question of how these arrays are stored
internally. Are they stored "in-place" with the other columns or
merely as a pointer to another file?

Would it be more efficient to not use an array for this purpose but
split the table in two parts?
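
To make the "two parts" idea concrete, here is a rough sketch of the
layout we have in mind: one row per timestamp in a separate table
instead of one array per row. It uses Python's sqlite3 module only so
that it is self-contained and runnable; our real schema lives in
PostgreSQL, and all table, column and function names below (items,
item_dates, add_timestamps, dates_for) are hypothetical.

```python
import sqlite3

# Hypothetical schema for illustration only; not our actual tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (
        id    INTEGER PRIMARY KEY,
        label TEXT NOT NULL          -- stands in for the fixed-size columns
    );
    CREATE TABLE item_dates (
        item_id INTEGER NOT NULL REFERENCES items(id),
        ts      TEXT NOT NULL        -- one timestamp per row, no array
    );
    CREATE INDEX item_dates_item_id_idx ON item_dates (item_id);
""")

conn.execute("INSERT INTO items (id, label) VALUES (1, 'example')")


def add_timestamps(conn, item_id, timestamps):
    """Append new timestamps as child rows; the parent row is not touched."""
    conn.executemany(
        "INSERT INTO item_dates (item_id, ts) VALUES (?, ?)",
        [(item_id, ts) for ts in timestamps],
    )


def dates_for(conn, item_id):
    """Return all timestamps of one item in ascending order."""
    rows = conn.execute(
        "SELECT ts FROM item_dates WHERE item_id = ? ORDER BY ts",
        (item_id,),
    )
    return [row[0] for row in rows]


# The daily update would then only insert rows.
add_timestamps(conn, 1, ["2006-01-02 08:00:00", "2006-01-03 08:00:00"])
```

With this layout the daily process would add rows to item_dates rather
than rewrite an array value in the main table. Whether that actually
behaves better in PostgreSQL is exactly what we are asking.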

Any help is appreciated!

Marc Philipp
