Re: Plan for compressed varlena headers

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "PostgreSQL Development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Plan for compressed varlena headers
Date: 2007-02-15 14:54:14
Message-ID: 87wt2jsc89.fsf@stark.xeocode.com
Lists: pgsql-hackers

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Gregory Stark <stark(at)enterprisedb(dot)com> writes:
>> 1) Replace the VARATT_SIZEP macro with SET_VARLENA_LEN.
>
> If we're going to do this then it's time to play the name game;

Least...fun...game...evar...

> A first-cut proposal:
>
> VARHDRSZ same as now, ie, size of 4-byte header
> VARSIZE(x) for *reading* a 4-byte-header length word
> VARDATA(x) same as now, ie, ptr + 4 bytes
> SET_VARSIZE(x, len) for *writing* a 4-byte-header length word

There's also VARATT_CDATA which I suppose I should rename to VARCDATA. I
may not even need it once I hit tuptoaster.c since that file works directly
with the structure members anyways.

I suppose we should also rename VARATT_IS_{COMPRESSED,EXTERNAL,EXTENDED}?
Is VAR_IS_* OK, or does that sound too generic?

> We'll also need names for the macros that can read the length and find
> the data of a datum in either-1-or-4-byte-header format. These should
> probably be named as variants of VARSIZE and VARDATA, but I'm not sure
> what exactly; any thoughts?

I can't think of any good names for the "automatic" macros. Right now I have
VARSIZE_ANY(ptr) but that doesn't seem particularly pleasing.

For the internal macros for each specific size I have:

#define VARDATA_4B(PTR) ((PTR)->va_4byte.va_data)
#define VARDATA_2B(PTR) ((PTR)->va_2byte.va_data)
#define VARDATA_1B(PTR) ((PTR)->va_1byte.va_data)

#define VARSIZE_IS_4B(PTR) (((PTR)->va_1byte.va_header & ~0x3F) == 0x00)
#define VARSIZE_IS_2B(PTR) (((PTR)->va_1byte.va_header & ~0x1F) == 0x20)
#define VARSIZE_IS_1B(PTR) (((PTR)->va_1byte.va_header & ~0x7F) == 0x80)

#define VARSIZE_4B(PTR) (ntohl((PTR)->va_4byte.va_header) & 0x3FFFFFFF)
#define VARSIZE_2B(PTR) (ntohs((PTR)->va_2byte.va_header) & 0x1FFF)
#define VARSIZE_1B(PTR) ( ((PTR)->va_1byte.va_header) & 0x7F)

#define SET_VARSIZE_4B(PTR,len) ((PTR)->va_4byte.va_header = htonl(len))
#define SET_VARSIZE_2B(PTR,len) ((PTR)->va_2byte.va_header = htons((len) | 0x2000))
#define SET_VARSIZE_1B(PTR,len) ((PTR)->va_1byte.va_header = (len) | 0x80)
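
As a rough illustration of how an "automatic" reader could sit on top of the
per-size macros above, here is a minimal sketch that works on raw bytes
instead of the varlena union. The sketch_* names are hypothetical, and the
manual big-endian byte stores stand in for htonl/htons. Two assumptions are
baked in: the 2-byte test is tried before the 4-byte one, because a first
byte of 001xxxxx also passes the 00xxxxxx test, and 4-byte lengths are kept
below 0x20000000 so the two formats cannot be confused.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Tag bits in the first header byte, following the macros above:
 *   1xxxxxxx  1-byte header,  7-bit length
 *   001xxxxx  2-byte header, 13-bit length
 *   000xxxxx  4-byte header (assumption: length < 0x20000000)
 */

/* Write the smallest header that can hold len; returns the header size. */
static inline size_t sketch_set_varsize(uint8_t *p, uint32_t len)
{
    assert(len < 0x20000000u);          /* assumption, see lead-in */
    if (len <= 0x7F) {
        p[0] = (uint8_t) (len | 0x80);
        return 1;
    }
    if (len <= 0x1FFF) {
        p[0] = (uint8_t) ((len >> 8) | 0x20);
        p[1] = (uint8_t) (len & 0xFF);
        return 2;
    }
    /* most significant byte first, i.e. network byte order */
    p[0] = (uint8_t) (len >> 24);
    p[1] = (uint8_t) (len >> 16);
    p[2] = (uint8_t) (len >> 8);
    p[3] = (uint8_t) len;
    return 4;
}

/* Header size in bytes, decided from the first byte only. */
static inline size_t sketch_varhdrsz_any(const uint8_t *p)
{
    if ((p[0] & 0x80) == 0x80)
        return 1;
    if ((p[0] & 0xE0) == 0x20)          /* must be tested before 4-byte */
        return 2;
    return 4;
}

/* Read the stored length, whichever header format is present. */
static inline uint32_t sketch_varsize_any(const uint8_t *p)
{
    switch (sketch_varhdrsz_any(p)) {
    case 1:
        return p[0] & 0x7F;
    case 2:
        return (((uint32_t) p[0] << 8) | p[1]) & 0x1FFF;
    default:
        return (((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
                ((uint32_t) p[2] << 8) | p[3]) & 0x3FFFFFFF;
    }
}
```

With this layout a length of 100 is stored as the single byte 0xE4, and 5000
as the two bytes 0x33 0x88. Whether the stored length counts the header
itself is left open here.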

I had a separate version for little-endian but it was driving me nuts having
two versions to keep tweaking. I also had the magic constants as #defines but
it really didn't enhance readability at all so I took them out when I rewrote
this just now.

Incidentally, I profiled htonl against a right shift on my machine (an Intel
2GHz Core Duo). htonl is four times slower, but that's 3.2ns versus 0.8ns.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
