Re: Fast insertion indexes: why no developments

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Leonardo Francalanci <m_lists(at)yahoo(dot)it>
Cc: PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fast insertion indexes: why no developments
Date: 2013-11-05 13:18:16
Message-ID: CAGTBQpZg7Dpav4dKsL3CKYJ8Ur1PfZPuikapSntrSWnsYkCRBw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 5, 2013 at 6:57 AM, Leonardo Francalanci <m_lists(at)yahoo(dot)it> wrote:
> Simon Riggs wrote
>> Minmax indexes seem to surprise many people, so broad generalisations
>> aren't likely to be useful.
>>
>> I think the best thing to do is to publish some SQL requests that
>> demonstrate in detail what you are trying to achieve and test them
>> against minmax indexes. That way we can discuss what does work and
>> what doesn't work well enough yet.
>
> While I do believe in testing (since "In theory there is no difference
> between theory and practice. In practice there is"), I would like to know
> the "properties" of the minmax index before trying it.
> What is it supposed to be good at? What are the pros/cons? We can't ask all
> the users to just "try" the index and see if it works for them.
> As I said, my understanding is that is very efficient (both in insertion and
> in searching) when data is somehow ordered in the table. But maybe I got it
> wrong...

Well, for one, random inserts (with random data) on a min-max index
have a roughly 1/N chance of requiring a write to disk, and (N-1)/N
chance of being completely free (or maybe a read to verify a write
isn't needed, but that'll probably hit shared buffers), where N is the
number of tuples per page. Per index page that is.

Of course, non-random workloads are a different matter.

Min-max indexes always require a sequential scan of the min-max index
itself when querying. That works when you intend to query enough
tuples to make up the cost (that is, more tuples than M * N *
random_cost / seq_cost), where M is the number of pages in the index.
Well, actually, since they result in better io patterns as well, the
tradeoff is probably a little bit more tricky than that, in favor of
min-max indexes.

Min-max indexes tend to be very compact, so M is usually low.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Sameer Thakur 2013-11-05 13:30:08 Re: pg_stat_statements: calls under-estimation propagation
Previous Message Albe Laurenz 2013-11-05 13:17:48 Re: UTF8 national character data type support WIP patch and list of open issues.