Short varlena headers

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Short varlena headers
Date: 2007-02-22 17:59:37
Message-ID: 87slcykr92.fsf@stark.xeocode.com
Lists: pgsql-patches


I've tried repeatedly to send this patch but it doesn't seem to be getting
through. It's not in the archives or my inbox. Now that I look, I realize that
none of the WIP versions of the patch that I sent arrived either. That's
fairly disappointing since I had made an effort to keep people apprised of the
development.

Here's a working patch that provides 1-byte headers and unaligned storage for
varlena data types storing < 128 bytes (actually <= 126 bytes).

http://community.enterprisedb.com/varlena/patch-varvarlena-9.patch.gz
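
To make the "<= 126 bytes" limit concrete, here is a minimal sketch of one
layout that would produce it. Everything in it is an assumption made for
illustration and is not taken from the patch: the flag bit, the SKETCH_*
macros and the sketch_* helpers are hypothetical names. It assumes the high
bit of the first byte marks a 1-byte header and the low 7 bits hold the total
size including the header byte, so the largest total is 127 and the largest
payload is 126.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout (not from the patch): high bit set = 1-byte header,
 * low 7 bits = total size in bytes, header byte included. */
#define SKETCH_SHORT_FLAG      0x80
#define SKETCH_SHORT_MAX_TOTAL 0x7F                          /* 127 bytes */
#define SKETCH_SHORT_MAX_DATA  (SKETCH_SHORT_MAX_TOTAL - 1)  /* 126 bytes */

/* Can a payload of data_len bytes be stored with a 1-byte header? */
static int
sketch_fits_short(size_t data_len)
{
    return data_len <= SKETCH_SHORT_MAX_DATA;
}

/* Write header + payload at dst with no alignment; returns bytes written. */
static size_t
sketch_pack_short(uint8_t *dst, const void *data, size_t data_len)
{
    assert(sketch_fits_short(data_len));
    dst[0] = (uint8_t) (SKETCH_SHORT_FLAG | (data_len + 1));
    memcpy(dst + 1, data, data_len);
    return data_len + 1;
}

/* Does the datum at p start with a 1-byte header? */
static int
sketch_is_short(const uint8_t *p)
{
    return (p[0] & SKETCH_SHORT_FLAG) != 0;
}

/* Payload length of a datum already known to have a 1-byte header. */
static size_t
sketch_short_data_len(const uint8_t *p)
{
    return (size_t) (p[0] & 0x7F) - 1;
}
```

In this model a 5-byte payload such as "hello" takes 6 bytes in total, and a
127-byte payload no longer fits and would need the regular 4-byte header.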

Things it does:

1) Changes the varlena api to use SET_VARSIZE instead of VARATT_SIZEP

2) Changes the heap_form_tuple api in a subtle way: attributes in a heap tuple
may need to be detoasted even if the tuple is never written out to disk.

3) Changes the GETSTRUCT api in another subtle way: it's no longer safe to
access the first varlena in a tuple directly through the GETSTRUCT
interface. At least not unless special care is taken.

4) Saves a *ton* of space because it saves 3 of the 4 bytes of varlena
overhead *and* the up to 4 bytes of alignment padding before every varlena.
That turns out to be a lot more space than I realized. I'll post some
sample schemas with space savings later.

5) Passes all postgres regression tests.
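
The space arithmetic behind item 4 can be sketched roughly as follows. This is
only a model under stated assumptions, not code from the patch: it assumes the
regular header is 4 bytes and 4-byte aligned (so padding is 0 to 3 bytes in
this model), that the short header is 1 byte with no alignment, and the
SKETCH_ALIGN4 macro and sketch_cost_* functions are hypothetical names.

```c
#include <stddef.h>

/* Round an offset up to the next multiple of 4 (assumed varlena alignment). */
#define SKETCH_ALIGN4(off) (((off) + 3) & ~(size_t) 3)

/* Bytes consumed by a value stored at `offset` with an aligned 4-byte header:
 * alignment padding + header + payload. */
static size_t
sketch_cost_long(size_t offset, size_t data_len)
{
    size_t start = SKETCH_ALIGN4(offset);

    return (start - offset) + 4 + data_len;
}

/* Bytes consumed with an unaligned 1-byte header: header + payload.
 * The offset is irrelevant because no padding is ever inserted. */
static size_t
sketch_cost_short(size_t offset, size_t data_len)
{
    (void) offset;
    return 1 + data_len;
}
```

For example, a 3-byte string starting at offset 5 costs 10 bytes in the long
form (3 padding + 4 header + 3 data) and 4 bytes in the short form, so in this
model the overhead is larger than the data itself.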

Things it doesn't do:

1) 2-byte headers for objects that exceed 128 bytes :(

2) 0-byte headers for single ascii characters :(

3) avoid htonl/ntohl by using low order bits on little-endian machines

4) provide an escape hatch for types or columns that don't want this
behaviour. Currently int2vector and oidvector are specifically exempted
since they're used in the system tables, sometimes through the GETSTRUCT
api. I doubt anything not used in the system tables has any business being
exempted, which only leaves us with the occasional text attribute. I plan
to double check that those aren't problems.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
