Re: Column storage positions

From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: Stephan Szabo <sszabo(at)megazone(dot)bigpanda(dot)com>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Phil Currier <pcurrier(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Column storage positions
Date: 2007-02-21 22:14:25
Message-ID: 45DCC441.5000709@phlo.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Stephan Szabo wrote:
> On Wed, 21 Feb 2007, Alvaro Herrera wrote:
>> Did I miss something in what you were trying to say? I assume you must
>> already know this.
>
> I think so. What I was mentioning was that I was pretty sure that there
> was a message with someone saying that they actually tried something that
> did this and that they found left-most varchar access was slightly slower
> after the reordering although general access was faster. I believe the
> table case was alternating smallint and varchar columns, but I don't know
> what was tested for the retrieval. If that turns out to be able to be
> supported by other tests, then for some access patterns, the rearranged
> version might be slower.

Here is the original quote:
----------------------------------------------------------------
The results were encouraging: on a table
with 20 columns of alternating smallint and varchar(10) datatypes,
selecting the max() of one of the rightmost int columns across 1
million rows ran around 3 times faster. The same query on the
leftmost varchar column (which should suffer the most from this
change) predictably got a little slower (about 10%);
----------------------------------------------------------------

What the OP doesn't mention is how the exact layouts looked before
and after the reordering - maybe a nullable field fixed-length field
got moved before the varchar column in question, which would disable
offset caching I guess.

Let's say the reodering algorithm is changed to only move non-nullable
fixed-width columns to the left - can anyone see an access pattern that
would run slower after the reodering? I certainly can't - because the set
of columns for which offset caching works after the reodering would
be a superset of the one for which it works before the reordering.

BTW, this is a good case for why the storage order should - directly or
indirectly - be tweakable. You can either optimize for space, and _then_
for speed - which is what the OP did I think - or first for speed, and then
for space. If the dba cannot choose the strategy, there will always be
workloads where the engine does it the wrong way around.

greetings, Florian Pflug

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2007-02-21 22:16:02 Re: [previously on HACKERS] "Compacting" a relation
Previous Message Andrew Dunstan 2007-02-21 22:12:28 Re: Column storage positions