Re: Variable length varlena headers redux

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gregory Stark <stark(at)enterprisedb(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Variable length varlena headers redux
Date: 2007-02-10 03:45:38
Message-ID: 200702100345.l1A3jcT16281@momjian.us
Lists: pgsql-hackers

Bruce Momjian wrote:
> Tom Lane wrote:
> > Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> > > That seems like an awful lot of copying and pallocs that aren't there
> > > currently though. And it'll make us reluctant to change over frequently used
> > > data types like text -- which are precisely the ones that would gain us the
> > > most.
> >
> > > It seems to me that it might be better to change to storing varlena lengths in
> > > network byte order instead. That way we can dedicate the leading bits to toast
> > > flags and read more bytes as necessary.
> >
> > This'll add its own overhead ... but probably less than pallocs and
> > data-copying would. And I agree we can find (pretty much) all the
> > places that need changing by the expedient of deliberately renaming
> > the macros and struct fields.
>
> I think we should go with the pallocs and see how it performs. That is
> certainly going to be easier to do, and we can test it pretty easily.
>
> One palloc optimization idea would be to split out the representation so
> the length is stored separately from the data in memory, and we could use
> an int32 for the length, and point to the shared buffer for the data.
> However I don't think our macros can handle that so it might be a
> non-starter.
>
> However, I think we should find out if the palloc is a problem before
> avoiding it.

Another idea for reducing palloc overhead: we know every short column is
at most 128 + 4 = 132 bytes, so we could allocate one 132-byte buffer for
each short column in the scan, and just re-use that buffer for every
row.
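
A rough sketch of the buffer re-use idea, to make it concrete. This is
not PostgreSQL code: the ColumnScratch struct, expand_short_value(), and
the packed layout (one length byte followed by the data) are all
hypothetical, and the 128 + 4 limit is simply taken from the paragraph
above. The expanded length counts the header itself, which is an
assumption modeled on the usual varlena convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Limits assumed from the message: 128 data bytes plus a 4-byte header. */
#define SHORT_DATA_MAX   128
#define FULL_HEADER_SIZE 4
#define SCRATCH_SIZE     (SHORT_DATA_MAX + FULL_HEADER_SIZE)

/* Hypothetical per-column scratch space, set up once per scan. */
typedef struct ColumnScratch
{
    unsigned char buf[SCRATCH_SIZE];
} ColumnScratch;

/*
 * Expand a short value into the scratch buffer instead of palloc'ing.
 *
 * Assumed packed layout (illustration only): packed[0] holds the data
 * length, followed by that many data bytes.  The expanded form is a
 * 4-byte total length (header included) followed by the data.
 *
 * The returned pointer is the scratch buffer itself, so its contents
 * are overwritten by the next call for the same column.
 */
static unsigned char *
expand_short_value(ColumnScratch *scratch, const unsigned char *packed)
{
    uint32_t    datalen = packed[0];
    uint32_t    total = datalen + FULL_HEADER_SIZE;

    assert(datalen <= SHORT_DATA_MAX);

    memcpy(scratch->buf, &total, FULL_HEADER_SIZE);
    memcpy(scratch->buf + FULL_HEADER_SIZE, packed + 1, datalen);
    return scratch->buf;
}
```

The catch with this approach is lifetime: anything that needs the value
to survive past the current row would still have to copy it out of the
scratch buffer.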

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
