why is gist index taking so much space on the disc

Lists: pgsql-hackers
From: Grzegorz Jaskiewicz <gj(at)pointblue(dot)com(dot)pl>
To: pgsql-hackers(at)postgresql(dot)org
Subject: why is gist index taking so much space on the disc
Date: 2005-11-21 15:58:25
Message-ID: EAAFDF94-A127-464F-80DA-6B6959F5130E@pointblue.com.pl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi folks

my conquers with Gist index for custom type are nearly finished. It
is working as it is now, but there are few problems here and there.
One of em, being amount of disc space index it self takes. The type
stucture it self takes 160bytes. Adding 100.000 rows into table -
CREATE TABLE blah (a serial, b customType);
with my gist index takes around 2GB on disc ! 100.000 is a large
number, but the purpose of having gist in first place is defeated if
that machine can't handle fast I/O or has at least 3GB of ram, first
to hold index in cache, secondly to operate postgres caching (shared
memory).
Is it normal that index is so hudge ? Even tho my type has built in
masks (element that can match few different values), and %. up front
the string (which behaves just like the sql % in b ~ '%.something').
And both are used to build "unions" for pick-split, and other
operations. Is it because of pick-split it self ? It does good work
in splitting up table of elements into two separate ones, by sorting
them first, than creating common "mask" for L and P. And by scanning
whole table again, and putting elements matching into L or P. L and P
elements sometimes overlap, but so far I can't find better solution.
Having to iterate 10 or 20 times using k-means (the type holds tree a
like structure) isn't going to boost efficiency either.
This index works, and it is very fast, but still large.

So final question, what should I do to make that index much smaller
on the disc.

--
GJ

"If we knew what we were doing, it wouldn't be called Research, would
it?" - AE


From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Grzegorz Jaskiewicz <gj(at)pointblue(dot)com(dot)pl>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: why is gist index taking so much space on the disc
Date: 2005-11-21 17:00:30
Message-ID: 4381FD2E.7010500@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> So final question, what should I do to make that index much smaller on
> the disc.

Tune your penalty and picksplit function. Gevel module can help you to look
inside of index ( http://www.sai.msu.su/~megera/postgres/gist/gevel ).

Usially, index becomes big when picksplit works bad: during split it place one
key on one page and all other keys on another page. So you have a huge number of
page with single value.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Grzegorz Jaskiewicz <gj(at)pointblue(dot)com(dot)pl>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: why is gist index taking so much space on the disc
Date: 2005-11-21 18:32:20
Message-ID: 20051121183214.GF16764@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 21, 2005 at 04:58:25PM +0100, Grzegorz Jaskiewicz wrote:
> my conquers with Gist index for custom type are nearly finished. It
> is working as it is now, but there are few problems here and there.
> One of em, being amount of disc space index it self takes. The type
> stucture it self takes 160bytes. Adding 100.000 rows into table -
> CREATE TABLE blah (a serial, b customType);

Let's see, 160bytes means you'll get aboud 50 keys per page. So you
would expect 2000 leaf page, 40 level 1 pages. This should be less than
20-30MB

<snip>

> Is it normal that index is so hudge ? Even tho my type has built in
> masks (element that can match few different values), and %. up front
> the string (which behaves just like the sql % in b ~ '%.something').
> And both are used to build "unions" for pick-split, and other
> operations. Is it because of pick-split it self ? It does good work
> in splitting up table of elements into two separate ones, by sorting
> them first, than creating common "mask" for L and P. And by scanning
> whole table again, and putting elements matching into L or P. L and P
> elements sometimes overlap, but so far I can't find better solution.

You mean you sometimes put the same elements in the two halves? You
shouldn't do that. The whole point is that the search will descend any
node that matches consistant, but any single key should only appear
once in each index.

picksplit should *split* the set, not return two sets about the same
size as you started...

Hope this helps,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


From: "Kevin McArthur" <postgresql-list(at)stormtide(dot)ca>
To: <pgsql-hackers(at)postgresql(dot)org>, "Grzegorz Jaskiewicz" <gj(at)pointblue(dot)com(dot)pl>
Subject: Re: why is gist index taking so much space on the disc
Date: 2005-11-21 19:08:36
Message-ID: 002701c5eece$fa2ec550$0701a8c0@kdesktop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Take the query.

select a,b from dupa where b::text in (select b::text from dupa group by
b::text having count(b) > 2);

This is acceptable to create a unique constraint, however, we cannot mark
the column unique, without defining btree operators, which clearly are not
possible for sorting. Is there any way to base the operators based on the
text representation of the type for strict equality (not to be confused with
same or equivilent) and thus use that not as an ordering method, but as a
simple equality for uniqueness.

Kevin McArthur

----- Original Message -----
From: "Grzegorz Jaskiewicz" <gj(at)pointblue(dot)com(dot)pl>
To: <pgsql-hackers(at)postgresql(dot)org>
Sent: Monday, November 21, 2005 7:58 AM
Subject: [HACKERS] why is gist index taking so much space on the disc

> Hi folks
>
> my conquers with Gist index for custom type are nearly finished. It is
> working as it is now, but there are few problems here and there.
> One of em, being amount of disc space index it self takes. The type
> stucture it self takes 160bytes. Adding 100.000 rows into table - CREATE
> TABLE blah (a serial, b customType);
> with my gist index takes around 2GB on disc ! 100.000 is a large number,
> but the purpose of having gist in first place is defeated if that machine
> can't handle fast I/O or has at least 3GB of ram, first to hold index in
> cache, secondly to operate postgres caching (shared memory).
> Is it normal that index is so hudge ? Even tho my type has built in masks
> (element that can match few different values), and %. up front the string
> (which behaves just like the sql % in b ~ '%.something'). And both are
> used to build "unions" for pick-split, and other operations. Is it
> because of pick-split it self ? It does good work in splitting up table
> of elements into two separate ones, by sorting them first, than creating
> common "mask" for L and P. And by scanning whole table again, and putting
> elements matching into L or P. L and P elements sometimes overlap, but so
> far I can't find better solution. Having to iterate 10 or 20 times using
> k-means (the type holds tree a like structure) isn't going to boost
> efficiency either.
> This index works, and it is very fast, but still large.
>
> So final question, what should I do to make that index much smaller on
> the disc.
>
> --
> GJ
>
> "If we knew what we were doing, it wouldn't be called Research, would
> it?" - AE
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match
>


From: Grzegorz Jaskiewicz <gj(at)pointblue(dot)com(dot)pl>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: why is gist index taking so much space on the disc
Date: 2005-11-21 19:14:44
Message-ID: FCB2D3B8-9CCD-404E-8728-45CDE4DBE93F@pointblue.com.pl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


On 2005-11-21, at 19:32, Martijn van Oosterhout wrote:

> On Mon, Nov 21, 2005 at 04:58:25PM +0100, Grzegorz Jaskiewicz wrote:
>> my conquers with Gist index for custom type are nearly finished. It
>> is working as it is now, but there are few problems here and there.
>> One of em, being amount of disc space index it self takes. The type
>> stucture it self takes 160bytes. Adding 100.000 rows into table -
>> CREATE TABLE blah (a serial, b customType);
>
> Let's see, 160bytes means you'll get aboud 50 keys per page. So you
> would expect 2000 leaf page, 40 level 1 pages. This should be less
> than
> 20-30MB
>
yep;

> You mean you sometimes put the same elements in the two halves? You
> shouldn't do that. The whole point is that the search will descend any
> node that matches consistant, but any single key should only appear
> once in each index.
>
> picksplit should *split* the set, not return two sets about the same
> size as you started...

Nope, I mean that 'masks' created to match either 'half' sometimes
match elements in the other one.
This shouldn't be a big deal, just one level to go down on query to
much more specific result set.
I have fixed that with, somewhat hack.
Here's some pseudo code

sort_data(input);
find_split_point(input);
mask1 = generate_two_masks(input[0]);
mask2 = generate_two_masks(input[1]);

foreach(input) {
bool a = matches1(input);
bool b = matches2(input);
if ( a && b ) {
if ( left_index == 0 ) {
left[left_index++] = input;
}
else {
if ( right_index == 0 ) {
right[right_index++] = input;
continue;
}
/* this part is new code, and helped a lot, now gist index takes much
less space
and is much faster, because of lower I/O consumption*/
if ( loop_index % 2 ) {
right[right_index++] = input;
}
else {
left[left_index++] = input;
}
}
}
else {
if ( a) left[left_index++] = input;
if ( b) right[right_index++] = input;
}
}

mask1 = generate(left );
mask2 = generate(right);

return (left, right, blah, blih, others);

Ok, so the part with i%2 helped a lot, it distributes elements
matching both masks evenly.

Thanks guys.

I will play with k-means, and see if they will work better with no
hacks. Either way, I have to have some code that will handle "matches
both" case.

Thanks again.

--
GJ

"If we knew what we were doing, it wouldn't be called Research, would
it?" - AE


From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Grzegorz Jaskiewicz <gj(at)pointblue(dot)com(dot)pl>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: why is gist index taking so much space on the disc
Date: 2005-11-21 19:45:32
Message-ID: 20051121194527.GG16764@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, Nov 21, 2005 at 08:14:44PM +0100, Grzegorz Jaskiewicz wrote:
> >You mean you sometimes put the same elements in the two halves? You
> >shouldn't do that. The whole point is that the search will descend any
> >node that matches consistant, but any single key should only appear
> >once in each index.
> >
> >picksplit should *split* the set, not return two sets about the same
> >size as you started...
>
> Nope, I mean that 'masks' created to match either 'half' sometimes
> match elements in the other one.
> This shouldn't be a big deal, just one level to go down on query to
> much more specific result set.
> I have fixed that with, somewhat hack.

It's not a hack, that's how it's supposed to work. An entry should only
appear once in the index, but it could appear in multiple places. Like
you say, some entries can go into either half.

B-Trees are the rather special case that you can always split a set of
values into two non-overlapping sets. With geometric types (like your
bitmasks) you can't avoid overlap sometimes so you have to follow
multiple branches to check if an element is there or not.

Your pseudo code is good and will work fine. However, ideally you want
to divide the overlap in such a way that later splits work better.
Maybe by trying to decide which mask is "closer".

The better the splitting the more efficient your tree will become.
Ofcourse, perfect splitting may be expensive but then it depends on how
many inserts vs how many selects. If you do a lot of searches it may be
worth the time.

BTW, I glad you're making progress and hopefully you might be able to
publish some code. PostgreSQL could do with some example GiST indexes
on bitmaps.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
Cc: Grzegorz Jaskiewicz <gj(at)pointblue(dot)com(dot)pl>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: why is gist index taking so much space on the disc
Date: 2005-11-22 00:28:52
Message-ID: Pine.GSO.4.63.0511220320010.29329@ra.sai.msu.su
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Mon, 21 Nov 2005, Martijn van Oosterhout wrote:

> On Mon, Nov 21, 2005 at 08:14:44PM +0100, Grzegorz Jaskiewicz wrote:
>>> You mean you sometimes put the same elements in the two halves? You
>>> shouldn't do that. The whole point is that the search will descend any
>>> node that matches consistant, but any single key should only appear
>>> once in each index.
>>>
>>> picksplit should *split* the set, not return two sets about the same
>>> size as you started...
>>
>> Nope, I mean that 'masks' created to match either 'half' sometimes
>> match elements in the other one.
>> This shouldn't be a big deal, just one level to go down on query to
>> much more specific result set.
>> I have fixed that with, somewhat hack.
>
> It's not a hack, that's how it's supposed to work. An entry should only
> appear once in the index, but it could appear in multiple places. Like
> you say, some entries can go into either half.
>
> B-Trees are the rather special case that you can always split a set of
> values into two non-overlapping sets. With geometric types (like your
> bitmasks) you can't avoid overlap sometimes so you have to follow
> multiple branches to check if an element is there or not.
>
> Your pseudo code is good and will work fine. However, ideally you want
> to divide the overlap in such a way that later splits work better.
> Maybe by trying to decide which mask is "closer".
>
> The better the splitting the more efficient your tree will become.
> Ofcourse, perfect splitting may be expensive but then it depends on how
> many inserts vs how many selects. If you do a lot of searches it may be
> worth the time.

Martijn is perfectly right here. You, probably, need to read a bit
some classical papers, for example,
"R-TREES: A dynamic index structure for spatial searching" by Antonin
Guttman.

_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83