Re: extensible external toast tuple support

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: extensible external toast tuple support
Date: 2013-06-18 08:58:50
Message-ID: 20130618085850.GA5646@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2013-06-18 00:56:17 -0700, Hitoshi Harada wrote:
> On Fri, Jun 14, 2013 at 4:06 PM, Andres Freund <andres(at)2ndquadrant(dot)com>wrote:
>
> >
> > Here's the updated version. It shouldn't contain any obvious WIP pieces
> > anymore, although I think it needs some more documentation. I am just
> > not sure where to add it yet, postgres.h seems like a bad place :/
> >
> >
> I have a few comments and questions after reviewing this patch.

Cool!

> - heap_tuple_fetch_attr doesn't need to be updated to reflect ONDISK macro?

It calls toast_fetch_datum() for any kind of external datum, so it
should be fine since ONDISK is handled in there.

> - I'm not sure if plural for datum is good to use. Datum values?

Not sure at all either. Supposedly data is the plural for datum, but for
the capitalized version Datums is used in other places, so I think it
might be fine.

> - -1 from me to use enum for tag types, as I don't think it needs to be.
> This looks more like other "kind" macros such as relkind, but I know
> there's pros/cons

Well, relkind cannot easily be a enum because it needs to be stored in a
char field. I like using enums because it gives you the chance of using
switch()es at some point and getting warned about missed cases.

Why do you dislike it?

> - don't we need cast for tag value comparison in VARTAG_SIZE macro, since
> tag is unit8 and enum is signed int?

I think it should be fine (and the compiler doesn't warn) due to the
integer promotion rules.

> - Is there better way to represent ONDISK size, instead of magic number
> 18? I'd suggest constructing it with sizeof(varatt_external).

I explicitly did not want to do that, since the numbers really don't
have anything to do with the struct size anymore. Otherwise the next
person reading that part will be confused because there's a 8byte struct
with the enum value 1. Or somebody will try adding another type of
external tuple that also needs 18 bytes by copying the sizeof(). Which
will fail in ugly, non-obvious ways.

> Other than that, the patch applies fine and make check runs, though I don't
> think the new code path is exercised well, as no one is creating INDIRECT
> datum yet.

Yea, I only use the API in the changeset extraction patch. That actually
worries me to. But I am not sure where to introduce usage of it in core
without making the patchset significantly larger. I was thinking of
adding an option to heap_form_tuple/heap_fill_tuple to allow it to
reference _4B Datums instead of copying them, but that would require
quite some adjustments on the callsites.

> Also, I wonder how we could add more compression in datum, as well as how
> we are going to add more compression options in database. I'd love to see
> pluggability here, as surely the core cannot support dozens of different
> compression algorithms, but because the data on disk is critical and we
> cannot do anything like user defined functions. The algorithms should be
> optional builtin so that once the system is initialized the the plugin
> should not go away. Anyway pluggable compression is off-topic here.

Separate patchset by now, yep ;).

Thanks!

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2013-06-18 09:01:28 Re: A minor correction in comment in heaptuple.c
Previous Message Amit Langote 2013-06-18 08:56:34 A minor correction in comment in heaptuple.c