Grouped Index Tuples

Lists: pgsql-hackers
From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Grouped Index Tuples
Date: 2006-12-07 10:30:11
Message-ID: 4577ED33.60602@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

I've cut a new version of the GIT patch I posted earlier, and collected
all my dispersed todo-lists, post-it notes, performance results,
supplementary patches etc. I had to a single web-page:

http://community.enterprisedb.com/git/

Perhaps the most interesting stuff apart from the patch itself is the
performance results. I've run some CPU bound tests to measure the extra
CPU overhead it causes. The CPU overhead is significant, the worst case
being a select of a single row from a table with just one integer column.

However, the I/O savings are also the greatest for that same test case,
as the table grows and the test becomes I/O bound. I don't have the
numbers now, but earlier runs showed that the duration of the test was
roughly halved, which makes sense because the patch reduced the index
size so that it fit in memory, reducing the number of physical I/Os
required per select from 2 to 1.

ISTM that if we want to enable GIT automatically, we need a way to
either reduce the CPU overhead, or have a smart heuristic to tune the
feature so that it's only enabled when it's beneficial.

Thoughts?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Grouped Index Tuples
Date: 2006-12-10 19:16:44
Message-ID: 20061210191643.GN44124@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Dec 07, 2006 at 10:30:11AM +0000, Heikki Linnakangas wrote:
> I've cut a new version of the GIT patch I posted earlier, and collected
> all my dispersed todo-lists, post-it notes, performance results,
> supplementary patches etc. I had to a single web-page:
>
> http://community.enterprisedb.com/git/
>
> Perhaps the most interesting stuff apart from the patch itself is the
> performance results. I've run some CPU bound tests to measure the extra
> CPU overhead it causes. The CPU overhead is significant, the worst case
> being a select of a single row from a table with just one integer column.
>
> However, the I/O savings are also the greatest for that same test case,
> as the table grows and the test becomes I/O bound. I don't have the
> numbers now, but earlier runs showed that the duration of the test was
> roughly halved, which makes sense because the patch reduced the index
> size so that it fit in memory, reducing the number of physical I/Os
> required per select from 2 to 1.
>
> ISTM that if we want to enable GIT automatically, we need a way to
> either reduce the CPU overhead, or have a smart heuristic to tune the
> feature so that it's only enabled when it's beneficial.

The maintain_cluster_order patch is useful by itself, and handles an
existing TODO regarding pulling pages out of WAL in a specified order to
maintain clustering. I think it'd be good to submit that patch
separately. Even if we get HOT into the backend, the cluster patch would
still be useful for cases where you sometimes have to update fields in a
clustered index.

On usage, ISTM it would be better to turn on GIT only for a clustered
index and not the PK? I'm guessing your automatic case is intended for
SERIAL PKs, but maybe it would be better to just make that explicit.
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)


From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Grouped Index Tuples
Date: 2006-12-10 19:56:29
Message-ID: 457C666D.4050005@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jim C. Nasby wrote:
> On Thu, Dec 07, 2006 at 10:30:11AM +0000, Heikki Linnakangas wrote:
>> I've cut a new version of the GIT patch I posted earlier, and collected
>> all my dispersed todo-lists, post-it notes, performance results,
>> supplementary patches etc. I had to a single web-page:
>>
>> http://community.enterprisedb.com/git/
>>
>> Perhaps the most interesting stuff apart from the patch itself is the
>> performance results. I've run some CPU bound tests to measure the extra
>> CPU overhead it causes. The CPU overhead is significant, the worst case
>> being a select of a single row from a table with just one integer column.
>>
>> However, the I/O savings are also the greatest for that same test case,
>> as the table grows and the test becomes I/O bound. I don't have the
>> numbers now, but earlier runs showed that the duration of the test was
>> roughly halved, which makes sense because the patch reduced the index
>> size so that it fit in memory, reducing the number of physical I/Os
>> required per select from 2 to 1.
>>
>> ISTM that if we want to enable GIT automatically, we need a way to
>> either reduce the CPU overhead, or have a smart heuristic to tune the
>> feature so that it's only enabled when it's beneficial.
>
> The maintain_cluster_order patch is useful by itself, and handles an
> existing TODO regarding pulling pages out of WAL in a specified order to
> maintain clustering.

Pull pages out of WAL? That must be a typo...

> I think it'd be good to submit that patch
> separately. Even if we get HOT into the backend, the cluster patch would
> still be useful for cases where you sometimes have to update fields in a
> clustered index.

Yeah, I submitted it in August for the first time. The way it's written
now is the most non-intrusive way I could think of: it just adds a new
optional indexam method. That has a small performance drawback: When
inserting, the B-tree needs to be descended twice, once for the
amsuggestblock, and then second time in aminsert. It would make sense to
keep the index page pinned to avoid the descend, but that requires API
changes.

> On usage, ISTM it would be better to turn on GIT only for a clustered
> index and not the PK? I'm guessing your automatic case is intended for
> SERIAL PKs, but maybe it would be better to just make that explicit.

As the patch stands, GIT is enabled by default for clustered indexes.
And also by default, a PK index is created as the clustered index for
table, if it's a simple single column integer key. It's a bit arbitrary,
but lacking a better heuristic, it's a reasonable guess that should
enable the feature in the most common cases where it helps.

Yeah, I'm guessing that a table with a serial PK becomes naturally
clustered by PK.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
Subject: Re: Grouped Index Tuples
Date: 2006-12-11 21:40:29
Message-ID: 457DD04D.1080701@cheapcomplexdevices.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Jim C. Nasby wrote:
> On usage, ISTM it would be better to turn on GIT only for a clustered
> index and not the PK? I'm guessing your automatic case is intended for
> SERIAL PKs, but maybe it would be better to just make that explicit.

Not necessarily; since often (in my tables at least) the data for
come columns has some local grouping of similar values even though
it's not the clustered index.

The obvious example is addresses (city, state, street, etc may not appear
clustered - but if you cluster your table by zip code, GIT will work
well for them).

Other examples would be tables containing dates and product names - even
though there's no total monotonic ordering if your product families
get refreshed every couple years you'll find that one range of product
names shows up mostly in old years, and others in new years - so GIT
could prove useful there - despite not being a clustered index.


From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
Cc: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Grouped Index Tuples
Date: 2006-12-12 10:40:09
Message-ID: 457E8709.2020808@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Ron Mayer wrote:
> Jim C. Nasby wrote:
>> On usage, ISTM it would be better to turn on GIT only for a clustered
>> index and not the PK? I'm guessing your automatic case is intended for
>> SERIAL PKs, but maybe it would be better to just make that explicit.
>
> Not necessarily; since often (in my tables at least) the data for
> come columns has some local grouping of similar values even though
> it's not the clustered index.

Yes, there's a lot of cases like that.

My real goal is to make it cheap enough in the case where there is no
clustering, that we could just enable it on all indexes by default. At
the moment, it looks like it's indeed near-zero cost when the table is
in random order, but the CPU overhead is too great in many workloads to
have it always enabled. More autotuning logic would be needed, or a
significant reduction in overhead.

But as it is, you can always turn it on explicitly if you think it'd help.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Grouped Index Tuples
Date: 2006-12-12 20:26:32
Message-ID: 200612122026.kBCKQWA03551@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Heikki Linnakangas wrote:
> Jim C. Nasby wrote:
> > On Thu, Dec 07, 2006 at 10:30:11AM +0000, Heikki Linnakangas wrote:
> >> I've cut a new version of the GIT patch I posted earlier, and collected
> >> all my dispersed todo-lists, post-it notes, performance results,
> >> supplementary patches etc. I had to a single web-page:
> >>
> >> http://community.enterprisedb.com/git/
> >>
> >> Perhaps the most interesting stuff apart from the patch itself is the
> >> performance results. I've run some CPU bound tests to measure the extra
> >> CPU overhead it causes. The CPU overhead is significant, the worst case
> >> being a select of a single row from a table with just one integer column.
> >>
> >> However, the I/O savings are also the greatest for that same test case,
> >> as the table grows and the test becomes I/O bound. I don't have the
> >> numbers now, but earlier runs showed that the duration of the test was
> >> roughly halved, which makes sense because the patch reduced the index
> >> size so that it fit in memory, reducing the number of physical I/Os
> >> required per select from 2 to 1.
> >>
> >> ISTM that if we want to enable GIT automatically, we need a way to
> >> either reduce the CPU overhead, or have a smart heuristic to tune the
> >> feature so that it's only enabled when it's beneficial.
> >
> > The maintain_cluster_order patch is useful by itself, and handles an
> > existing TODO regarding pulling pages out of WAL in a specified order to
> > maintain clustering.
>
> Pull pages out of WAL? That must be a typo...

I assume he meant FSM (free space map).

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +


From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Grouped Index Tuples
Date: 2006-12-13 04:56:41
Message-ID: 20061213045641.GR6551@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Tue, Dec 12, 2006 at 03:26:32PM -0500, Bruce Momjian wrote:
> Heikki Linnakangas wrote:
> > > The maintain_cluster_order patch is useful by itself, and handles an
> > > existing TODO regarding pulling pages out of WAL in a specified order to
> > > maintain clustering.
> >
> > Pull pages out of WAL? That must be a typo...
>
> I assume he meant FSM (free space map).

Yup. Brainfart.
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)