Re: Fast insertion indexes: why no developments

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Leonardo Francalanci <m_lists(at)yahoo(dot)it>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fast insertion indexes: why no developments
Date: 2013-10-29 14:28:40
Message-ID: CAHyXU0yQPgCHF6siBhnmyLGUDx=k_k66Y6okEVaEw1PF29Y+LQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Oct 29, 2013 at 2:53 AM, Leonardo Francalanci <m_lists(at)yahoo(dot)it> wrote:
> Hi,
>
>
> I don't see much interest in insert-efficient indexes. These are the ones I've found:
>
> - LSM-tree (used by Cassandra and SQLite4?)
> - Y-Tree (http://www.bossconsulting.com/oracle_dba/white_papers/DW%20in%20oracle/P23%20(ytree%20index%20structure%20for%20DWs).pdf )
> - Fractal indexes (TokuDB, patented)
>
> While I understand that b*trees are still the best compromise in insertion/search speed, disk size, concurrency, and more in general in OLTP workloads, they are useless when it comes to insertion in big data tables (>50M rows) of random values (not ordered values).
>
> I would like to know if the lack of development in this area (not only in Postgresql, but in databases in general) is due to:
>
> 1) complex implementation
> 2) poor search performance
> 3) poor concurrency performance
> 4) not interesting for most users
> 5) something else???
>
> I thought this was going to change due to the fast-insertion speeds needs of "Social Applications", but only TokuDB seems to be the only "successful" player in the area (I don't know how much of it is due to good marketing). Most other DB technology claims faster insertion speed (MongoDB and the like...) but in the end they rely on the old b*tree + sharding instead of using different indexing mechanisms (with the exception of Cassandra).

Another point to add: I don't really see btree as a barrier to
performance for most of the problems I face. The real barriers to
database performance are storage, contention, and query planning.
Postgres btreee indexes are pretty fast and for stuff like bulk
insertions there are some optimization techniques available (such as
sharding or create index concurrently).

Stuff I'd like to see in terms of postgres indexing:
*) faster wal logged hash index
*) composite gist/gin
*) faster gist/gin (to the extent that it's possible).

merlin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2013-10-29 14:32:37 Re: CLUSTER FREEZE
Previous Message Tom Lane 2013-10-29 14:14:33 Re: Fast insertion indexes: why no developments