Re: plans for bitmap indexes?

Lists: pgsql-hackers
From: "Dave Page" <dpage(at)vale-housing(dot)co(dot)uk>
To: "Yann Michel" <yann-postgresql(at)spline(dot)de>, "Reini Urban" <rurban(at)x-ray(dot)at>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: plans for bitmap indexes?
Date: 2004-10-08 09:09:18
Message-ID: E7F85A1B5FF8D44C8A1AF6885BC9A0E43069EF@ratbert.vale-housing.co.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org
> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Yann Michel
> Sent: 08 October 2004 08:28
> To: Reini Urban
> Cc: pgsql-hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] plans for bitmap indexes?
>
>
> I don't know what this means to my questin, but anyway...
>
> > If you don't want to code that in the app, the database
> backend is the
> > solution. The database is the golden hammer.
>
>
> Therefore I asked wheather there are any thoughts of
> implementing them.

I think what Reini was asking was why do you think you need bitmap
indexes as opposed to any existing type?

Regards, Dave.


From: Yann Michel <yann-postgresql(at)spline(dot)de>
To: Dave Page <dpage(at)vale-housing(dot)co(dot)uk>
Cc: Yann Michel <yann-postgresql(at)spline(dot)de>, Reini Urban <rurban(at)x-ray(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: plans for bitmap indexes?
Date: 2004-10-08 09:45:28
Message-ID: 20041008094528.GA9375@zoom.spline.inf.fu-berlin.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Fri, Oct 08, 2004 at 10:09:18AM +0100, Dave Page wrote:
> I think what Reini was asking was why do you think you need bitmap
> indexes as opposed to any existing type?

due to I'm developing a datawarehousing application we have lots of
fact-data in our central fact-table. As I know how to improve
performance on Oracle based datawarehouses, I'm used to add bitmap
indexes for atributes having only a few distinct values.
So I was looking for any comparable indexing technology but didn't find
any so far.

Regards,
Yann


From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Yann Michel <yann-postgresql(at)spline(dot)de>
Cc: Dave Page <dpage(at)vale-housing(dot)co(dot)uk>, Reini Urban <rurban(at)x-ray(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: plans for bitmap indexes?
Date: 2004-10-08 10:06:26
Message-ID: 1097229986.2632.9.camel@fuji.krosing.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On R, 2004-10-08 at 12:45, Yann Michel wrote:
> Hi,
>
> On Fri, Oct 08, 2004 at 10:09:18AM +0100, Dave Page wrote:
> > I think what Reini was asking was why do you think you need bitmap
> > indexes as opposed to any existing type?
>
> due to I'm developing a datawarehousing application we have lots of
> fact-data in our central fact-table. As I know how to improve
> performance on Oracle based datawarehouses, I'm used to add bitmap
> indexes for atributes having only a few distinct values.
> So I was looking for any comparable indexing technology but didn't find
> any so far.

There is currently no suitable index type for this type of queries (huge
tables with a few distinct values).

You may try to optimise performance by partitioning your fact tables on
these few dimension values by using table inheritance or union all
views.

There was a discussion on partitioning postgres tables on
pgsql-performance list a few days ago.

-------------
Hannu


From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Yann Michel <yann-postgresql(at)spline(dot)de>
Cc: Dave Page <dpage(at)vale-housing(dot)co(dot)uk>, Reini Urban <rurban(at)x-ray(dot)at>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: plans for bitmap indexes?
Date: 2004-10-08 10:09:54
Message-ID: Pine.GSO.4.61.0410081408550.23275@ra.sai.msu.su
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Fri, 8 Oct 2004, Yann Michel wrote:

> Hi,
>
> On Fri, Oct 08, 2004 at 10:09:18AM +0100, Dave Page wrote:
>> I think what Reini was asking was why do you think you need bitmap
>> indexes as opposed to any existing type?
>
> due to I'm developing a datawarehousing application we have lots of
> fact-data in our central fact-table. As I know how to improve
> performance on Oracle based datawarehouses, I'm used to add bitmap
> indexes for atributes having only a few distinct values.
> So I was looking for any comparable indexing technology but didn't find
> any so far.

Yann, you may try our contrib/intarray module

>
> Regards,
> Yann
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


From: Chris Browne <cbbrowne(at)acm(dot)org>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: plans for bitmap indexes?
Date: 2004-10-09 17:31:36
Message-ID: 604ql3x0g7.fsf@dba2.int.libertyrms.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

yann-postgresql(at)spline(dot)de (Yann Michel) writes:
> On Fri, Oct 08, 2004 at 10:09:18AM +0100, Dave Page wrote:
>> I think what Reini was asking was why do you think you need bitmap
>> indexes as opposed to any existing type?
>
> due to I'm developing a datawarehousing application we have lots of
> fact-data in our central fact-table. As I know how to improve
> performance on Oracle based datawarehouses, I'm used to add bitmap
> indexes for atributes having only a few distinct values. So I was
> looking for any comparable indexing technology but didn't find any
> so far.

The most nearly comparable thing is be the notion of "partial
indexes," where, supposing you had 60 region codes (e.g. - 50 US
states, 10 Canadian provinces), you might set up indices thus:

TABLE=my_table
FIELD=stateprov
FILE=$HOME/regionlist.txt
for region in `cat $FILE`; do
query="create index ${TABLE}_partial_on_${region} on $TABLE($FIELD) where $FIELD = '$region';"
echo $query | psql -d datawarehouse
done

That would set up 60 (or whatever $HOME/regionlist.txt indicated)
partial indices on the table on that field.

By the way, I thought ahead a little, in that; doing the same thing
for country codes might be as easy as replacing:

FIELD=country
FILE=$HOME/countrylist.txt

The partial indexes will not ALWAYS be useful; in cases where they
aren't, it is not inconceivable that there are improvements to be made
in the query optimizer...
--
let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/linuxxian.html
A VAX is virtually a computer, but not quite.


From: Yann Michel <yann-postgresql(at)spline(dot)de>
To: Chris Browne <cbbrowne(at)acm(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: plans for bitmap indexes?
Date: 2004-10-14 21:08:54
Message-ID: 20041014210854.GA6742@zoom.spline.inf.fu-berlin.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On Sat, Oct 09, 2004 at 01:31:36PM -0400, Chris Browne wrote:
>
> The most nearly comparable thing is be the notion of "partial
> indexes," where, supposing you had 60 region codes (e.g. - 50 US
> states, 10 Canadian provinces), you might set up indices thus:
>
[...]
>
> The partial indexes will not ALWAYS be useful; in cases where they
> aren't, it is not inconceivable that there are improvements to be made
> in the query optimizer...

So what you are suggesting here is the "tree-fashioned-static way" of
real bitmap indexes. I.E. each time a new value is inserted vor any kind
of thus indexes column you have to create a new index which is not very
usefull as you can think of. In addition nothing about the real
granularity is known to the optimizer to let it guess the best execution
plan, i.e. to do a full table scan or use an index. That means if one
attributes value is representative for 80 percent it is usefull to do a
full table scan whereas if its value is representative for only 5
percent the index might be better. But as I understood the partial index
concept, no statistics for "value representation" are available.

Therefore I started to do read some articles and books about bitmap
index implementations to maby contribute... we will see...

BTW: Is there any more documented CVS-version available? I mean it would
be really nice to read some comments from time to time or at least more
comments about each function/method's purpose or functionality.

Regards,
Yann


From: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
To: Yann Michel <yann-postgresql(at)spline(dot)de>
Cc: Chris Browne <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: plans for bitmap indexes?
Date: 2004-10-14 21:18:12
Message-ID: 20041014211812.GA11062@dcc.uchile.cl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Oct 14, 2004 at 11:08:54PM +0200, Yann Michel wrote:

> BTW: Is there any more documented CVS-version available? I mean it would
> be really nice to read some comments from time to time or at least more
> comments about each function/method's purpose or functionality.

Huh, the code is reasonably commented. Certainly not following
Javadoc-like standards, but it can be read.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"La rebeldía es la virtud original del hombre" (Arthur Schopenhauer)


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: Yann Michel <yann-postgresql(at)spline(dot)de>, Chris Browne <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: plans for bitmap indexes?
Date: 2004-10-15 15:27:05
Message-ID: 9529.1097854025@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl> writes:
> On Thu, Oct 14, 2004 at 11:08:54PM +0200, Yann Michel wrote:
>> BTW: Is there any more documented CVS-version available? I mean it would
>> be really nice to read some comments from time to time or at least more
>> comments about each function/method's purpose or functionality.

> Huh, the code is reasonably commented. Certainly not following
> Javadoc-like standards, but it can be read.

Also, have you read the Internals volume of the SGML docs? Mostly
pretty high-level stuff, but that's what you need to get started.
The developer's FAQ is also required reading for newbies.
There are also README files in various parts of the source tree that
provide information about various sub-modules.

regards, tom lane


From: Yann Michel <yann-postgresql(at)spline(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>, Yann Michel <yann-postgresql(at)spline(dot)de>, Chris Browne <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: plans for bitmap indexes?
Date: 2004-10-15 16:49:39
Message-ID: 20041015164939.GA29832@uff.spline.inf.fu-berlin.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hi Tom,

On Fri, Oct 15, 2004 at 11:27:05AM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl> writes:
> > On Thu, Oct 14, 2004 at 11:08:54PM +0200, Yann Michel wrote:
> >> BTW: Is there any more documented CVS-version available? I mean it would
> >> be really nice to read some comments from time to time or at least more
> >> comments about each function/method's purpose or functionality.
>
> > Huh, the code is reasonably commented. Certainly not following
> > Javadoc-like standards, but it can be read.
>
> Also, have you read the Internals volume of the SGML docs? Mostly
> pretty high-level stuff, but that's what you need to get started.
> The developer's FAQ is also required reading for newbies.
> There are also README files in various parts of the source tree that
> provide information about various sub-modules.

I have not jet been reading all of it but some of the README files. I
will keep that hint in mind but first of all I'll read something about
bitmap compression and other relevant techniques before starting to
discover the index internals of postgresql... ;-) I've been using all
kinds of functions in oracle for a long time but never had the
experience to implement any indexing strategies. The only thing I did
were some operating system extensions for minix during my os-studies
(scheduling, driver, acl etc.)

If there is anything additional/special to know further, I apreciate any
hints.

Regards and thanks,
Yann