Re: Variable length varlena headers redux

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Gregory Stark" <gsstark(at)mit(dot)edu>, "Bruce Momjian" <bruce(at)momjian(dot)us>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Variable length varlena headers redux
Date: 2007-02-13 14:41:48
Message-ID: 87fy9a86hf.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Heikki Linnakangas" <heikki(at)enterprisedb(dot)com> writes:

>> Actually I think neither htonl nor bitshifting the entire 4-byte word is going
>> to really work here. Both will require 4-byte alignment. Instead I think we
>> have to access the length byte by byte as a (char*) and do arithmetic. Since
>> it's the pointer being passed to VARSIZE that isn't too hard, but it might
>> perform poorly.
>
> We would still require all datums with a 4-byte header to be 4-byte aligned,
> right? When reading, you would first check if it's a compressed or uncompressed
> header. If compressed, read the 1 byte header, if uncompressed, read the 4-byte
> header and do htonl or bitshifting. No need to do htonl or bitshifting on
> unaligned datums.

It's not easy to require datums with 4-byte headers to be 4-byte aligned. How
do you know where to look for the bits to show it's an uncompressed header if
you don't know where it's aligned yet?

It could be done if you rule that if you're on an unaligned byte and see a \0
then scan forward until the aligned byte. But that seems just as cpu expensive
as just doing the arithmetic. And wastes space to boot.

I'm thinking VARSIZE would look something like:

#define VARSIZE((datum)) \
((((uint8*)(datum))[0] & 0x80) ?
(((uint8*)(datum))[0] & 0x7F) : \
(((uint8*)(datum))[0]<< 24 | ((uint8*)(datum))[1]<<16 | ((uint8*)(datum))[2]<<8 | ((uint8*)(datum))[0]))

Which is effectively the same as doing ntohl except that it only works for
left hand sides -- luckily VARSIZE always has a lhs. It also works for
unaligned accesses. It's going to be fairly slow but no slower than doing an
unaligned access looking at nul padding bytes.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2007-02-13 14:44:03 Re: Variable length varlena headers redux
Previous Message Tom Lane 2007-02-13 14:38:57 Re: HOT for PostgreSQL 8.3