Re: Vertical Partitioning with TOAST

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Junji TERAMOTO <teramoto(dot)junji(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Vertical Partitioning with TOAST
Date: 2005-12-08 19:04:22
Message-ID: 439883B6.6000502@Yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 12/8/2005 1:42 PM, Jim C. Nasby wrote:

> On Thu, Dec 08, 2005 at 10:03:43AM -0500, Bruce Momjian wrote:
>> Jim C. Nasby wrote:
>> > On Thu, Dec 08, 2005 at 12:59:47AM -0500, Tom Lane wrote:
>> > > "Jim C. Nasby" <jim(at)nasby(dot)net> writes:
>> > > > This seems like a useful feature to add, allowing for easy built-in
>> > > > verticle partitioning. Are there issues with the patch as-is?
>> > >
>> > > Other than the ones mentioned by the poster?
>> > >
>> > > It seemed to me more like a not-too-successful experiment than something
>> > > ready for application. If you take the viewpoint that this is just
>> > > another TOAST storage strategy, I think it's pretty useless. A large
>> > > field value is going to get toasted anyway with the regular strategy,
>> > > and if your column happens to contain some values that are not large,
>> > > forcing them out-of-line anyway is simply silly. (You could make a case
>> > > for making the threshold size user-controllable, but I don't see the
>> > > case for setting the threshold to zero, which is what this amounts to.)
>> >
>> > Valid point. I do think there's a lot of benefit to being able to set
>> > the limit much lower than what it currently defaults to today. We have a
>> > client that has a queue-type table that is updated very frequently. One
>> > of the fields is text, that is not updated as frequently. Keeping this
>> > table vacuumed well enough has proven to be problematic, because any
>> > delay to vacuuming quickly results in a very large amount of bloat.
>> > Moving that text field into a seperate table would most likely be a win.
>> >
>> > Presumably this would need to be settable on at least a per-table basis.
>> >
>> > Would adding such a variable be a good beginner TODO, or is it too
>> > invasive?
>>
>> Well, we have now:
>>
>> ALTER TABLE ALTER [ COLUMN ] column
>> SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN }
>>
>> What else is needed?
>
> As Tom suggested, I think it would be best to be able to change the size
> at which a field gets stored externally. I think it also makes sense to
> have this reverse the normal order of compress first, then if it still
> doesn't fit store it externally. I forsee this typically being useful
> when you have fields that are between ~100 and 1000 bytes in size, and
> I'm doubtful that compression would do much good there. But I wouldn't
> rule out this being useful on fields that can also sometimes contain
> much larger amounts of data, so I don't think it makes sense to disable
> compression completely. So, I think this leaves two new options:

It's not the size of a field that triggers toasting. It is the size of
the entire tuple. As long as that is > BLKSIZE/4, the toaster will pick
the currently largest inline value and do "something" with it.
"something" is either compressing or (if not allowed or already done)
moving external.

Jan

>
> SET STORAGE EXTERNAL [THRESHOLD x]
> If a field is over x in size, it's stored externally.
>
> SET STORAGE EXTENDED [THRESHOLD x]
> If a field is over x in size, it's stored externally. If it's over
> BLCKSZ/4 it will also be compressed (I think that's how things work
> now).
>
> Actually, that's rather ugly. I think it would be better to just break
> external storage and compression out into their own attributes:
>
> SET STORAGE [ ALLOW EXTERNAL [THRESHOLD x] ] [ ALLOW COMPRESSION [THRESHOLD x] ]
>
> ALLOW EXTERNAL: if a field is larger than x bytes (BLCKSZ/4 by default)
> then it will be stored externally. May be specified along with ALLOW
> COMPRESSION.
>
> ALLOW COMPRESSION: if a field is larger than x bytes (BLCKSZ/4 by default)
> then it will be COMPRESSED. May be specified along with ALLOW EXTERNAL.
> --
> Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
> Pervasive Software http://pervasive.com work: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2005-12-08 19:04:59 Re: Replication on the backend
Previous Message Jim C. Nasby 2005-12-08 18:57:02 Improving free space usage (was: Reducing relation locking overhead)