Re: Variable length varlena headers redux

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gregory Stark <gsstark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Variable length varlena headers redux
Date: 2007-02-13 14:25:32
Message-ID: 45D1CA5C.3010506@enterprisedb.com
Lists: pgsql-hackers

Gregory Stark wrote:
> "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> For example it'd be easy to implement the previously-discussed design
>> involving storing uncompressed length words in network byte order:
>> SET_VARLENA_LEN does htonl() and VARSIZE does ntohl() and nothing else in
>> the per-datatype functions needs to change. Another idea that we were
>> kicking around is to make an explicit distinction between little-endian and
>> big-endian hardware: on big-endian hardware, store the two TOAST flag bits
>> in the MSBs as now, but on little-endian, store them in the LSBs, shifting
>> the length value up two bits. This would probably be marginally faster than
>> htonl/ntohl depending on hardware and compiler intelligence, but either way
>> you get to guarantee that the flag bits are in the physically first byte,
>> which is the critical thing needed to be able to tell the difference between
>> compressed and uncompressed length values.
>
> Actually I think neither htonl nor bitshifting the entire 4-byte word is going
> to really work here. Both will require 4-byte alignment. Instead I think we
> have to access the length byte by byte as a (char*) and do arithmetic. Since
> it's the pointer being passed to VARSIZE that isn't too hard, but it might
> perform poorly.

We would still require all datums with a 4-byte header to be 4-byte
aligned, right? When reading, you would first check whether the header
is compressed or uncompressed. If it is compressed, read the 1-byte
header; if it is uncompressed, read the 4-byte header and do the ntohl
or bitshifting. There is no need to do ntohl or bitshifting on
unaligned datums.
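
A rough sketch of that read path, just to make the idea concrete. The bit
layout here is an assumption for illustration, not something settled in
this thread: a set high bit in the first byte marks a 1-byte header, and
otherwise the length sits in the low 30 bits of a 4-byte word stored in
network byte order. All the sketch_* names are made up.

```c
#include <arpa/inet.h>   /* htonl, ntohl (POSIX) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical layout, for illustration only:
 *   first byte 1xxxxxxx -> 1-byte header, low 7 bits hold the total size
 *   first byte 0xxxxxxx -> 4-byte header in network byte order,
 *                          low 30 bits hold the total size
 */
#define SKETCH_SHORT_FLAG 0x80u
#define SKETCH_SHORT_MASK 0x7Fu
#define SKETCH_LONG_MASK  0x3FFFFFFFu

/* Return the total size of a datum, looking at the first byte only at first. */
static size_t
sketch_varsize(const void *datum)
{
    const unsigned char *p = (const unsigned char *) datum;
    uint32_t    word;

    /* 1-byte header: a single byte read, so alignment does not matter. */
    if (p[0] & SKETCH_SHORT_FLAG)
        return p[0] & SKETCH_SHORT_MASK;

    /*
     * 4-byte header: such datums are assumed to be 4-byte aligned, so a
     * direct word load would also work; memcpy just keeps the sketch
     * free of aliasing concerns.
     */
    memcpy(&word, p, sizeof(word));
    return ntohl(word) & SKETCH_LONG_MASK;
}

/* Store a 1-byte header (total size must fit in 7 bits). */
static void
sketch_set_short_len(void *datum, unsigned int total_size)
{
    ((unsigned char *) datum)[0] =
        (unsigned char) (SKETCH_SHORT_FLAG | (total_size & SKETCH_SHORT_MASK));
}

/* Store a 4-byte header in network byte order (flag bits left clear). */
static void
sketch_set_long_len(void *datum, uint32_t total_size)
{
    uint32_t    word = htonl(total_size & SKETCH_LONG_MASK);

    memcpy(datum, &word, sizeof(word));
}
```

The point is only that the byte-swapping branch is reached after the first
byte has already told us the datum has a 4-byte, and therefore aligned,
header.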

>> The important point here is that VARSIZE() still works, so only code that
>> creates a new varlena value need be affected, not code that examines one.
>
> So what would VARSIZE() return, the size of the payload plus VARHDRSZ
> regardless of what actual size the header was? That seems like it would break
> the least existing code though removing all the VARHDRSZ offsets seems like it
> would be cleaner.

My vote would be to change every caller. Though there are a lot of
callers, it's a very simple change.

To make it possible to compile an external module against both 8.2 and
8.3, you could have a simple #ifdef block to map the new macro to the
old behavior. Or we could backport the macro definitions as Magnus
suggested.
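
Something along these lines is what I have in mind for the #ifdef block.
This is a sketch only: SET_VARLENA_LEN is the name used earlier in this
thread and may not be the final one, and the struct and VARATT_SIZEP
definitions below are simplified stand-ins for what the 8.2 headers
provide, included just so the fragment compiles on its own.

```c
#include <stdint.h>

/* Simplified stand-ins for the 8.2-era definitions (not the real headers). */
struct varlena
{
    int32_t     vl_len;
    char        vl_dat[1];
};

#define VARHDRSZ            ((int32_t) sizeof(int32_t))
#define VARATT_SIZEP(PTR)   (((struct varlena *) (PTR))->vl_len)

/*
 * Compatibility shim for an external module: if the server headers do
 * not define the new macro, fall back to the old direct assignment.
 */
#ifndef SET_VARLENA_LEN
#define SET_VARLENA_LEN(PTR, len)   (VARATT_SIZEP(PTR) = (len))
#endif
```

Module code would then call SET_VARLENA_LEN(result, size) everywhere and
build unchanged against either version.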

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
