Re: multi-column index

From: Manfred Koizar <mkoi-pg(at)aon(dot)at>
To: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: multi-column index
Date: 2005-03-18 10:34:03
Message-ID: 985l31t5jn1bvmhaihifslitbc940pdo5f@email.aon.at
Lists: pgsql-performance

On Thu, 17 Mar 2005 23:48:30 -0800, Ron Mayer
<rm_pg(at)cheapcomplexdevices(dot)com> wrote:
>Would this also help estimates in the case where values in a table
>are tightly clustered, though not in strictly ascending or descending
>order?

No, I was just expanding the existing notion of correlation from single
columns to index tuples.

>For example, address data has many fields that are related
>to each other (postal codes, cities, states/provinces).

This looks like a case for cross-column statistics, though you might not
have meant it as such. I guess what you're talking about can also be
described with a single column. In a list like

3 3 ... 3 1 1 ... 1 7 7 ... 7 4 4 ... 4 ...

equal items are "clustered" together but the values are not "correlated"
to their positions. This would require a whole new column
characteristic, something like the probability that we find the same
value in adjacent heap tuples, or the number of different values we can
expect on one heap page. The latter might even be easy to compute
during ANALYSE.
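
To make the distinction concrete, here is a minimal Python sketch, not anything taken from the PostgreSQL source. It compares the existing notion (correlation of values with their physical position) against the two candidate characteristics mentioned above. The function names, the run length of 100 and the figure of 50 tuples per page are all hypothetical choices for illustration.

```python
from statistics import mean

def position_correlation(values):
    """Pearson correlation between each value and its physical position."""
    positions = range(len(values))
    mv, mp = mean(values), mean(positions)
    cov = sum((v - mv) * (p - mp) for v, p in zip(values, positions))
    var_v = sum((v - mv) ** 2 for v in values)
    var_p = sum((p - mp) ** 2 for p in positions)
    if var_v == 0 or var_p == 0:
        return 0.0  # correlation is undefined for a constant column
    return cov / (var_v * var_p) ** 0.5

def adjacent_same_fraction(values):
    """Fraction of adjacent tuple pairs that hold the same value."""
    pairs = list(zip(values, values[1:]))
    return sum(a == b for a, b in pairs) / len(pairs)

def avg_distinct_per_page(values, tuples_per_page):
    """Average number of different values found on one (hypothetical) page."""
    pages = [values[i:i + tuples_per_page]
             for i in range(0, len(values), tuples_per_page)]
    return mean(len(set(page)) for page in pages)

# The list from the text: runs of equal items in no particular order.
# Run length and page size are assumptions made for this example.
values = [3] * 100 + [1] * 100 + [7] * 100 + [4] * 100

print("correlation with position:", position_correlation(values))
print("same value in adjacent tuples:", adjacent_same_fraction(values))
print("distinct values per page:", avg_distinct_per_page(values, 50))
```

For such a list the adjacency fraction is close to 1 and each page holds very few distinct values, while the positional correlation stays well away from 1, which is the gap a new column characteristic would have to cover.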

Servus
Manfred
