Re: Fast insertion indexes: why no developments

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Yann Fontana <yann(dot)fontana(at)gmail(dot)com>
Cc: Leonardo Francalanci <m_lists(at)yahoo(dot)it>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fast insertion indexes: why no developments
Date: 2013-11-05 07:49:00
Message-ID: CA+U5nMJpR+_aHdXowrd2L5OFTrcRLWx9_umS+SGw=MfLRkVjpQ@mail.gmail.com
Lists: pgsql-hackers

On 30 October 2013 14:34, Yann Fontana <yann(dot)fontana(at)gmail(dot)com> wrote:
>
>
>> On 30 October 2013 11:23, Leonardo Francalanci <m_lists(at)yahoo(dot)it> wrote:
>>
>> >> In terms of generality, do you think it's worth a man-year of developer
>> >> effort to replicate what you have already achieved? Who would pay?
>
>
> I work on an application that does exactly what Leonardo described. We hit
> the exact same problem, and came up with the exact same solution (down
> to the 15-minute interval). But I have also worked on various other
> datamarts (all using Oracle), and they are all subject to this problem in
> some form: B-tree indexes slow down bulk data inserts too much and need to
> be disabled or dropped and then recreated after the load. In some cases this
> is done easily enough, in others it's more complicated (example: every day,
> a process imports from 1 million to 1 billion records into a table partition
> that may contain from 0 to 1 billion records. To be as efficient as
> possible, you need some logic to compare the number of rows to insert to the
> number of rows already present, in order to decide whether to drop the
> indexes or not).
>
> Basically, my point is that this is a common problem for datawarehouses and
> datamarts. In my view, indexes that don't require developers to work around
> poor insert performance would be a significant feature in a
> "datawarehouse-ready" DBMS.
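
The drop-or-keep decision Yann describes (compare the number of rows to insert with the number already present) could be sketched roughly as below. This is only an illustration: the function name and the ratio_threshold value are hypothetical and do not come from the thread; a real cutoff would have to be measured on the actual workload.

```python
def should_drop_indexes(rows_to_insert: int,
                        rows_present: int,
                        ratio_threshold: float = 0.1) -> bool:
    """Guess whether dropping and rebuilding indexes beats maintaining them.

    ratio_threshold is a hypothetical tuning knob, not a recommended value.
    """
    if rows_to_insert <= 0:
        # Nothing to load, so there is no reason to touch the indexes.
        return False
    if rows_present == 0:
        # Empty partition: building the index after the load is the cheap path.
        return True
    # Large loads relative to the existing data favour drop-and-rebuild.
    return rows_to_insert / rows_present >= ratio_threshold
```

With these assumed defaults, loading 1 million rows into a partition that already holds 1 billion keeps the indexes, while loading 1 billion rows into a partition holding 1 million drops them.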

Everybody on this thread is advised to look closely at MinMax indexes
before starting any further work.

MinMax indexes will give us access to many new kinds of plan, and with
regard to inserts they are about as close to perfectly efficient, by
which I mean almost zero overhead, as it is possible to get.
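
To illustrate why inserts are so cheap with this kind of index, here is a toy sketch of the general min/max idea: keep one (min, max) summary per range of consecutive rows, update only that summary on insert, and skip ranges whose summary cannot match a query. It is not the PostgreSQL implementation; the class name, rows_per_range, and the method names are all hypothetical.

```python
class MinMaxSketch:
    """Toy min/max summary over append-only data (illustrative only)."""

    def __init__(self, rows_per_range: int = 4):
        self.rows_per_range = rows_per_range
        self.summaries = []   # one [min, max] pair per range of rows
        self.row_count = 0

    def append(self, value):
        # An insert touches only the summary of the range the row lands in.
        range_no = self.row_count // self.rows_per_range
        if range_no == len(self.summaries):
            self.summaries.append([value, value])
        else:
            summary = self.summaries[range_no]
            summary[0] = min(summary[0], value)
            summary[1] = max(summary[1], value)
        self.row_count += 1

    def candidate_ranges(self, lo, hi):
        # Ranges whose [min, max] overlaps [lo, hi] must still be scanned;
        # every other range can be skipped without reading its rows.
        return [i for i, (mn, mx) in enumerate(self.summaries)
                if mx >= lo and mn <= hi]
```

The summaries stay tiny compared with a B-tree, and the sketch works best when values arrive roughly in order (timestamps, for example), because then each range covers a narrow band of values and most ranges can be skipped.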

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
