Re: Grouped Index Tuples

Lists: pgsql-hackers
From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Grouped Index Tuples
Date: 2007-02-22 12:47:42
Message-ID: 45DD90EE.1080202@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I've brought the GIT patch up-to-date with CVS head. The latest version
can be found at http://community.enterprisedb.com/git/

I also reran the CPU bound test cases with the latest patch.

I want this in 8.3 in some form, and I have the time to do any required
changes. If someone wants to see more tests, I can arrange that as well.

The patch is pretty big at the moment. I think the best way to proceed
with this is to extract some smaller, incremental patches from it that
just refactor the current b-tree code. After that, the final patch that
implements GIT should be much smaller and more readable. And there's
still a bunch of todo items there as well...

But before I start doing that, I need some review and general agreement
on the design. What I don't want to happen is that three days after the
feature freeze, someone finally looks at it and finds a major issue or
just thinks it's an unreadable mess, and we no longer have the time to
fix it.

One question that I'm sure someone will ask is do we need this if we
have bitmap indexes? Both aim at having a smaller index, after all. The
use cases are quite different; GIT is effective whenever you have a
table that's reasonably well-clustered. Unlike the bitmap indexam, GIT's
effectiveness doesn't depend on the number of distinct values, in
particular it works well with unique indexes. GIT is comparable to
clustered indexes in other DBMSs (in fact we might want to call GIT that
in the end).

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Grouped Index Tuples
Date: 2007-02-23 00:30:32
Message-ID: 1172190633.3874.214.camel@silverbirch.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, 2007-02-22 at 12:47 +0000, Heikki Linnakangas wrote:

> One question that I'm sure someone will ask is do we need this if we
> have bitmap indexes? Both aim at having a smaller index, after all.
> The use cases are quite different; GIT is effective whenever you have
> a table that's reasonably well-clustered. Unlike the bitmap indexam,
> GIT's effectiveness doesn't depend on the number of distinct values,
> in particular it works well with unique indexes. GIT is comparable to
> clustered indexes in other DBMSs (in fact we might want to call GIT
> that in the end).

A few thoughts:

Whether its proof, or merely strong interest, we need to see where the
cross-over point is between bitmap indexes (BMI) and GIT. I'm assuming
there is one, but others may need convincing that there is a clear win
in common use cases. Graphs, numbers etc..

On your TODO list, it would be good to differentiate between ideas and
must-complete items. My suggested list would be:

Must complete
- cost model for optimizer
- user documentation
- performance tests

Good to complete
- duplicate key flag (should increase range of overlap with BMI)
- range re-check optimization (based upon your results)
- better index AM
- HOT, so that GIT can provide longer term benefits

Others
...

I seem to recall you mentioning there was a way to record that the heap
tuples are stored in monotonic order and so the sort can be avoided when
you return the tuples. Is that still true?

Also
"the patch roughly doubles the amount of WAL generated by WAL"
not sure what that means

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Grouped Index Tuples
Date: 2008-03-06 22:05:29
Message-ID: 200803062205.m26M5Tm23460@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


Heikki, are you going to revise this for 8.4?

---------------------------------------------------------------------------

Heikki Linnakangas wrote:
> I've brought the GIT patch up-to-date with CVS head. The latest version
> can be found at http://community.enterprisedb.com/git/
>
> I also reran the CPU bound test cases with the latest patch.
>
> I want this in 8.3 in some form, and I have the time to do any required
> changes. If someone wants to see more tests, I can arrange that as well.
>
> The patch is pretty big at the moment. I think the best way to proceed
> with this is to extract some smaller, incremental patches from it that
> just refactor the current b-tree code. After that, the final patch that
> implements GIT should be much smaller and more readable. And there's
> still a bunch of todo items there as well...
>
> But before I start doing that, I need some review and general agreement
> on the design. What I don't want to happen is that three days after the
> feature freeze, someone finally looks at it and finds a major issue or
> just thinks it's an unreadable mess, and we no longer have the time to
> fix it.
>
> One question that I'm sure someone will ask is do we need this if we
> have bitmap indexes? Both aim at having a smaller index, after all. The
> use cases are quite different; GIT is effective whenever you have a
> table that's reasonably well-clustered. Unlike the bitmap indexam, GIT's
> effectiveness doesn't depend on the number of distinct values, in
> particular it works well with unique indexes. GIT is comparable to
> clustered indexes in other DBMSs (in fact we might want to call GIT that
> in the end).
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
>
> http://www.postgresql.org/about/donate

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>
Cc: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Grouped Index Tuples
Date: 2008-03-07 10:52:34
Message-ID: 47D11E72.8000703@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Bruce Momjian wrote:
> Heikki, are you going to revise this for 8.4?

Probably not. I have other features I want to work on at the moment.

> ---------------------------------------------------------------------------
>
> Heikki Linnakangas wrote:
>> I've brought the GIT patch up-to-date with CVS head. The latest version
>> can be found at http://community.enterprisedb.com/git/
>>
>> I also reran the CPU bound test cases with the latest patch.
>>
>> I want this in 8.3 in some form, and I have the time to do any required
>> changes. If someone wants to see more tests, I can arrange that as well.
>>
>> The patch is pretty big at the moment. I think the best way to proceed
>> with this is to extract some smaller, incremental patches from it that
>> just refactor the current b-tree code. After that, the final patch that
>> implements GIT should be much smaller and more readable. And there's
>> still a bunch of todo items there as well...
>>
>> But before I start doing that, I need some review and general agreement
>> on the design. What I don't want to happen is that three days after the
>> feature freeze, someone finally looks at it and finds a major issue or
>> just thinks it's an unreadable mess, and we no longer have the time to
>> fix it.
>>
>> One question that I'm sure someone will ask is do we need this if we
>> have bitmap indexes? Both aim at having a smaller index, after all. The
>> use cases are quite different; GIT is effective whenever you have a
>> table that's reasonably well-clustered. Unlike the bitmap indexam, GIT's
>> effectiveness doesn't depend on the number of distinct values, in
>> particular it works well with unique indexes. GIT is comparable to
>> clustered indexes in other DBMSs (in fact we might want to call GIT that
>> in the end).
>>
>> --
>> Heikki Linnakangas
>> EnterpriseDB http://www.enterprisedb.com
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 7: You can help support the PostgreSQL project by donating at
>>
>> http://www.postgresql.org/about/donate
>

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com