Red-black tree for GIN

From: Teodor Sigaev <teodor(at)sigaev(dot)ru>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Red-black tree for GIN
Date: 2009-11-23 13:28:26
Message-ID: 4B0A8DFA.7050009@sigaev.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi there,

attached is a patch, which contains implementation of a red-black
tree, a self-balanced implementation of binary tree. The main goal of
this patch is to improve creation time of GIN index in the corner cases.
While creation, GIN collects data in memory in binary tree until it
reach some limit and then it flush tree to disk. Some data could
produces unbalanced binary tree, for example, sorted data, so the tree
degenerates to the list with O(N^2) processing time (see thread
http://archives.postgresql.org/pgsql-performance/2009-03/msg00340.php)
), which cause very slow index creation. Tom has fixed that by limiting
depth of tree (committed to 8.3 and 8.4), but we found it's not enough
and propose to use red-black tree, which is very good for skewed data and has
almost the same performance for unsorted data, see
http://www.sai.msu.su/~megera/wiki/2009-07-27 and
http://www.sai.msu.su/~megera/wiki/2009-04-03 for more information.

Implementation of red-black tree has several currently unused methods,
but they will be used in next patches.

--
Teodor Sigaev E-mail: teodor(at)sigaev(dot)ru
WWW: http://www.sigaev.ru/

Attachment Content-Type Size
rbtree-0.5.gz application/x-tar 6.2 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Teodor Sigaev 2009-11-23 13:33:03 point_ops for GiST
Previous Message Magnus Hagander 2009-11-23 12:54:48 Re: forget patch win32.mak.