Re: Making a tree with "millions and millions" of dynamic

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Christian Fowler <google(at)NOSPAM(dot)gravesweeper(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Making a tree with "millions and millions" of dynamic
Date: 2003-12-05 09:04:27
Message-ID: Pine.GSO.4.58.0312051203090.27913@ra.sai.msu.su
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Christian,

you may try our contrib/ltree module
see http://www.sai.msu.su/~megera/postgres/gist/ltree
With tuned postgresql and reasonable hardware you might get what you're
looking for.

Oleg

On Tue, 2 Dec 2003, Christian Fowler wrote:

>
>
> I have a VERY LARGE pile of geographic data that I am importing into a database (db of choice is postgres,
> though may hop to oracle if necessary). The data is strictly hierarchical - each node has one, and only one
> parent. The depth should not exceed 6 or 7 levels. The initial import will have about 6 million leaves, and
> 3 million branches. I would expect the leaves to grow significantly, in number easily tripling. However, the
> branches will likely stay very constant in number, but I expect there locations to shift around somewhat
> (affecting up to thousand of children).
>
> For selection, it is crucial for me to get:
>
> 1. path generation speed
> 2. immediate sibling speed
> 3. children count speed
>
>
> I have implemented a first run using Joe Celko's Nested Sets (w/ a mod to store tree level for speed). The
> update propigation issue is the achilles heel of this approach for me. I have read Vadim Tropashko Nested
> Interval concept ( http://www.dbazine.com/tropashko4.html ) , and my brain is painfully stretched enough to
> get the general idea. I have a feeling i will hit the arithmetic issues others of reported.
>
> So it seems Materialized Path is my only option, however I am concerned about LIKE performance for the right
> hand side of the tree, where the path is 8digits x 6 levels = 48 chars. Should I be concerned? I need
> split-second real-time performance, and can't imagine it will be as fast the Nested Set arithmatic approach.
> I can flatten out the import to insure the upper tree has the smallest numbers, however, it will save at
> most 8 chars on the path.
>
> I've been googling through USENET archives watching the big debates of years gone by. I've grazed much
> knowledge, but now am curious if anyone has any thoughts/advice/war stories about a data set this large.
>
> (and if any fellow postgres fans have some reassuring words about real-time performance of a 20 million row
> tree, i'd love to hear ;-)
>
>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Thierry Missimilly 2003-12-05 09:40:41 Re: What is WAL used for?
Previous Message Peter Eisentraut 2003-12-05 06:59:27 Re: Groups vs. Roles