Re: WIP: extensible enums

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: extensible enums
Date: 2010-10-19 08:00:26
Message-ID: AANLkTimf8DZF6uCfSjHzYRUm8ChjWXDN38rGNmzLPW5U@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 19 October 2010 05:21, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:
>
>
> On 10/18/2010 10:52 AM, Tom Lane wrote:
>>
>> We could possibly deal with enum types that follow the existing
>> convention if we made the cache entry hold a list of all the original,
>> known-to-be-sorted OIDs.  (This could be reasonably compact and cheap to
>> probe if it were represented as a starting OID and a Bitmapset of delta
>> values, since we can assume that the initial set of OIDs is pretty close
>> together.)  But we have to have that cache entry, and we have to consult
>> it on every single comparison, so it's definitely going to be slower
>> than before.
>>
>> So I'm thinking the comparison procedure goes like this:
>>
>> 1. Both OIDs even?
>>        If so, just compare them numerically, and we're done.
>>
>> 2. Lookup cache entry for enum type.
>>
>> 3. Both OIDs in list of known-sorted OIDs?
>>        If so, just compare them numerically, and we're done.
>>
>> 4. Search the part of the cache entry that lists sort positions.
>>        If not both present, refresh the cache entry.
>>        If still not present, throw error.
>>
>> 5. Compare by sort positions.
>>
>> Step 4 is the slowest part but would be avoided in most cases.
>> However, step 2 is none too speedy either, and would usually
>> be required when dealing with pre-existing enums.
>
> OK, I've made adjustments that I think do what you're suggesting.
>
> Patch is attached.
>

Ah, I'd missed the point about the bitmapset. In the most common case,
most of the enum elements are probably going to be in the right order,
so you save a lot by identifying that case quickly.

I didn't have time to play with hash maps myself, but I don't think it
will make much difference now because hopefully the binary search
isn't going to be hit a lot anyway.

There are a couple of things that look odd about this code though (I
haven't tested it, but they look wrong):

In the loop identifying the longest range:

for (list_end = start_pos+1; list_end < num; list_end++)
if (mycache->sort_order_list[list_end].sort_order <
mycache->sort_order_list[list_end - 1].sort_order ||
mycache->sort_order_list[list_end].enum_oid -
mycache->sort_order_list[list_end].enum_oid >= BITMAPSIZE)

That last test isn't doing anything. Shouldn't the second "list_end"
should be "start_pos"?

In the loop that sets bits in the bitmap, it looks like it's assuming
the range starts at 0. I haven't had any caffeine yet, so maybe I'm
misunderstanding it, but I can't see anywhere that sets bitmap_base.

Regards,
Dean

> Alternatively this can be pulled from
> <git(at)github(dot)com:adunstan/postgresql-dev.git>
>
> cheers
>
> andrew
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thom Brown 2010-10-19 08:19:16 Re: WIP: extensible enums
Previous Message David Fetter 2010-10-19 06:36:07 Re: How to determine failed connection attempt due to invalid authorization (libpq)?