From: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
---|---|
To: | Christian Fowler <google(at)NOSPAM(dot)gravesweeper(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Making a tree with "millions and millions" of dynamic |
Date: | 2003-12-05 09:04:27 |
Message-ID: | Pine.GSO.4.58.0312051203090.27913@ra.sai.msu.su |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Christian,
you may try our contrib/ltree module
see http://www.sai.msu.su/~megera/postgres/gist/ltree
With tuned postgresql and reasonable hardware you might get what you're
looking for.
Oleg
On Tue, 2 Dec 2003, Christian Fowler wrote:
>
>
> I have a VERY LARGE pile of geographic data that I am importing into a database (db of choice is postgres,
> though may hop to oracle if necessary). The data is strictly hierarchical - each node has one, and only one
> parent. The depth should not exceed 6 or 7 levels. The initial import will have about 6 million leaves, and
> 3 million branches. I would expect the leaves to grow significantly, in number easily tripling. However, the
> branches will likely stay very constant in number, but I expect there locations to shift around somewhat
> (affecting up to thousand of children).
>
> For selection, it is crucial for me to get:
>
> 1. path generation speed
> 2. immediate sibling speed
> 3. children count speed
>
>
> I have implemented a first run using Joe Celko's Nested Sets (w/ a mod to store tree level for speed). The
> update propigation issue is the achilles heel of this approach for me. I have read Vadim Tropashko Nested
> Interval concept ( http://www.dbazine.com/tropashko4.html ) , and my brain is painfully stretched enough to
> get the general idea. I have a feeling i will hit the arithmetic issues others of reported.
>
> So it seems Materialized Path is my only option, however I am concerned about LIKE performance for the right
> hand side of the tree, where the path is 8digits x 6 levels = 48 chars. Should I be concerned? I need
> split-second real-time performance, and can't imagine it will be as fast the Nested Set arithmatic approach.
> I can flatten out the import to insure the upper tree has the smallest numbers, however, it will save at
> most 8 chars on the path.
>
> I've been googling through USENET archives watching the big debates of years gone by. I've grazed much
> knowledge, but now am curious if anyone has any thoughts/advice/war stories about a data set this large.
>
> (and if any fellow postgres fans have some reassuring words about real-time performance of a 20 million row
> tree, i'd love to hear ;-)
>
>
>
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
From | Date | Subject | |
---|---|---|---|
Next Message | Thierry Missimilly | 2003-12-05 09:40:41 | Re: What is WAL used for? |
Previous Message | Peter Eisentraut | 2003-12-05 06:59:27 | Re: Groups vs. Roles |