Re: intarray internals

From: Volkan YAZICI <yazicivo(at)ttnet(dot)net(dot)tr>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: intarray internals
Date: 2006-05-06 14:38:24
Message-ID: 20060506143824.GB202@alamut
Lists: pgsql-general pgsql-hackers

Hi,

First, thanks so much for your response.

On May 06 12:13, Martijn van Oosterhout wrote:
> On Sat, May 06, 2006 at 12:46:01AM +0300, Volkan YAZICI wrote:
> > [1]
> > What's the function of execute() in _int_bool.c? As far as I can
> > understand, some other functions (eg. execconsistent()) calling
> > execute() with specific check methods (like checkcondition_bit()) but
> > I still couldn't figure out which functionality execute() stands for.
>
> It's a boolean expression evaluator. The query given is some kind of
> boolean expression. That function is a recursive function that evaluates
> the expression given certain information.

I thought the same, but the code (and its variable handling and
coercion) is too messy to confirm this easily, IMHO. There are nearly
no comments at all explaining the use of curitem->val or calcnot.
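
To illustrate what "a recursive boolean expression evaluator with a
pluggable check method" means, here is a minimal, self-contained sketch.
It is not the code from _int_bool.c: the Node layout, eval_expr(),
check_in_set() and IntSet are all hypothetical names made up for this
example. The calc_not flag only borrows its name from calcnot; treating
NOT as "possibly true" when the flag is off is an assumption about why
such a flag would exist (a lossy check cannot prove absence).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical expression tree; the real code uses its own item layout. */
typedef enum { NODE_VAL, NODE_NOT, NODE_AND, NODE_OR } NodeType;

typedef struct Node
{
    NodeType type;
    int val;                    /* operand, used only by NODE_VAL */
    const struct Node *left;    /* sole child of NODE_NOT */
    const struct Node *right;
} Node;

/* Check method: does the value satisfy the condition for this context? */
typedef bool (*CheckFn) (const void *ctx, int val);

/* Recursively evaluate the expression rooted at node. */
static bool
eval_expr(const Node *node, const void *ctx, bool calc_not, CheckFn check)
{
    switch (node->type)
    {
        case NODE_VAL:
            return check(ctx, node->val);
        case NODE_NOT:
            /* Assumption: if negation cannot be decided, answer "maybe". */
            if (!calc_not)
                return true;
            return !eval_expr(node->left, ctx, calc_not, check);
        case NODE_AND:
            return eval_expr(node->left, ctx, calc_not, check) &&
                   eval_expr(node->right, ctx, calc_not, check);
        case NODE_OR:
            return eval_expr(node->left, ctx, calc_not, check) ||
                   eval_expr(node->right, ctx, calc_not, check);
    }
    return false;
}

/* Example context: a plain array of integers. */
typedef struct
{
    const int *items;
    size_t n;
} IntSet;

/* Example check method: exact membership test by linear search. */
static bool
check_in_set(const void *ctx, int val)
{
    const IntSet *set = ctx;
    size_t i;

    for (i = 0; i < set->n; i++)
        if (set->items[i] == val)
            return true;
    return false;
}
```

With a tree for "1 & !2", eval_expr() accepts the set {1, 3} and
rejects {1, 2} when calc_not is true. Swapping check_in_set() for
another check method changes what "contains" means without touching
the evaluator, which is presumably the point of passing the check
method in.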

> > [2]
> > In g_int_decompress(), shouldn't
> >
> >     if (ARRISVOID(in))
> >         PG_RETURN_POINTER(entry);
> >
> > part be replaced with
> >
> >     if (ARRISVOID(in))
> >     {
> >         if (in != (ArrayType *) DatumGetPointer(entry->key))
> >             pfree(in);
> >         PG_RETURN_POINTER(entry);
> >     }
>
> You very rarely need to pfree() anything explicitly. However, the code
> has just tested if in is VOID. If it is, you obviously don't need to
> free it. Or it may not be big enough to bother explicitly freeing.

Yep, it shouldn't be that big. I just wanted to follow the same style as
in the previous page. (See the "if (ARRISVOID(r))" check in
g_int_compress().)

> > [3]
> > Again, in g_int_decompress(), I couldn't figure out the purpose of
> > the lines below:
> >
> >     din = ARRPTR(in);
> >     lenr = internal_size(din, lenin);
> >
> >     for (i = 0; i < lenin; i += 2)
> >         for (j = din[i]; j <= din[i + 1]; j++)
> >             if ((!i) || *(dr - 1) != j)
> >                 *dr++ = j;
> >
> > If I understand right, the above loop tries to reconstruct the array
> > with smaller intervals, to be able to make more accurate predicates
> > while digging into nodes. If so, AFAICS, the g_int_compress() and
> > g_int_decompress() methods can be improved (quite a bit?).
>
> Well, it's probably trying to undo whatever g_int_compress() does. If
> you can explain the algorithm used it will probably be clearer.

Actually, the algorithms used in the g_int_compress() and
g_int_decompress() methods are quite awesome. (I don't know if this is
the authors' creation, but if so, kudos.) The problem, I think, is that
they're quite lossy compression methods. To clarify, here's a small
explanation of the algorithm used (if I understood right):

g_int_compress():
    if (integer array length > constant limit)
    {
        Transform {A, B, C, ..., Z} array into
                  {A, A, B, B, ..., Z, Z}

        while (integer array length > constant limit)
        {
            Select the two neighbouring couples whose difference is
            minimal and merge them into one range (removing the two
            inner endpoints from the list).
        }
    }

g_int_decompress():
    for (iterate over compressed array items)
    {
        Transform {..., 47, 50, ...} into {..., 47, 48, 49, 50, ...}
    }
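
To make the lossiness concrete, here is a minimal C sketch of the two
steps described above. It is not the contrib/intarray code:
merge_closest() and expand_ranges() are hypothetical names, plain int
arrays stand in for ArrayType, and the merge step reads "select two
couples whose difference is minimum" as "join the two neighbouring
ranges separated by the smallest gap", which is an assumption.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of contrib/intarray.
 * ranges holds n_pairs sorted, non-overlapping {lo, hi} couples laid out
 * as lo0, hi0, lo1, hi1, ...  Joins the two neighbouring couples with the
 * smallest gap between them and returns the new number of couples.
 */
static size_t
merge_closest(int *ranges, size_t n_pairs)
{
    size_t best = 0;
    size_t i;

    if (n_pairs < 2)
        return n_pairs;

    /* Find the couple whose distance to its right neighbour is smallest. */
    for (i = 1; i + 1 < n_pairs; i++)
    {
        int gap = ranges[2 * (i + 1)] - ranges[2 * i + 1];
        int best_gap = ranges[2 * (best + 1)] - ranges[2 * best + 1];

        if (gap < best_gap)
            best = i;
    }

    /* Couple 'best' absorbs couple 'best + 1'; shift the rest down. */
    ranges[2 * best + 1] = ranges[2 * (best + 1) + 1];
    for (i = best + 1; i + 1 < n_pairs; i++)
    {
        ranges[2 * i] = ranges[2 * (i + 1)];
        ranges[2 * i + 1] = ranges[2 * (i + 1) + 1];
    }
    return n_pairs - 1;
}

/*
 * Hypothetical helper: expand {lo, hi} couples into every integer they
 * cover. out must be large enough; returns the number of values written.
 */
static size_t
expand_ranges(const int *ranges, size_t n_pairs, int *out)
{
    size_t n = 0;
    size_t i;
    int j;

    for (i = 0; i < n_pairs; i++)
        for (j = ranges[2 * i]; j <= ranges[2 * i + 1]; j++)
            if (n == 0 || out[n - 1] != j)  /* skip a repeated boundary */
                out[n++] = j;
    return n;
}
```

Starting from {1, 1, 5, 5, 47, 47, 50, 50}, one merge_closest() call
joins the last two couples into {47, 50}, and expand_ranges() then
produces 1, 5, 47, 48, 49, 50. The values 48 and 49 were never in the
input, which is the kind of loss in question.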

As you can see, both the compression and decompression methods are
quite lossy. I'm not sure whether this has any negative impact on node
traversal, where more accurate predicates would help, but I am
currently weighing the "performance gain * physical storage gain / CPU
consumption loss" ratio we would get by using a lossless data
compression method instead. I'd appreciate hearing your ideas (and
experiences).

Regards.
