Re: Making a tree with "millions and millions" of dynamic nodes

From: "Joe \"Nuke Me Xemu\" Foster" <bftsi0!joe(at)news(dot)hub(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Making a tree with "millions and millions" of dynamic nodes
Date: 2003-12-03 04:22:24
Message-ID: 1070425373.522288@news-1.nethere.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

"Mikito Harakiri" <mikharakiri(at)iahu(dot)com> wrote in message <news:3kczb(dot)23$2A3(dot)117(at)news(dot)oracle(dot)com>...

> "Christian Fowler" <google(at)NOSPAM(dot)gravesweeper(dot)com> wrote in message
> news:6b-dnQJnDI6YvlCiXTWc-g(at)speakeasy(dot)net(dot)(dot)(dot)
> > For selection, it is crucial for me to get:
> >
> > 1. path generation speed
>
> Path to the root? It is coded explicitly in the materialized path. All you
> need to do is parsing the path and generating a list of prefixes that you
> use within your query like this:
>
> select * from hierarchy where path in ('1','1.2','1.2.1', '1.2.1.1')
>
> If you have an index, and your rdbms optimiser is smart enough, the query
> should execute fast.
>
> > 2. immediate sibling speed
>
> Here is the kludge:
>
> select * from hierarchy where path in ('1.2.1.1','1.2.1.2','1.2.1.3',
> '1.2.1.4','1.2.1.5')
>
> Again, I assume that your query would use an index here.
>
> If you get exactly 5 records, then you can't be sure that your node has that
> many children. You have to repeat query like this:
> select * from hierarchy where path in (
> '1.2.1.1','1.2.1.2','1.2.1.3', '1.2.1.4','1.2.1.5'
> ,'1.2.1.6','1.2.1.7','1.2.1.8', '1.2.1.9','1.2.1.10')
> ,'1.2.1.11','1.2.1.12','1.2.1.13', '1.2.1.4','1.2.1.15')
> ,'1.2.1.16','1.2.1.17','1.2.1.18', '1.2.1.19','1.2.1.20')
> ,'1.2.1.21','1.2.1.22','1.2.1.23', '1.2.1.4','1.2.1.25')
>
> Yet again, there might be more than 25 children, so you'll have to run yet
> again more "generos" query.

Here's an alternative which may not perform very well but may
still be better than risking a full table-scan...

select * from hierarchy
where path like '1.2.1.%'
and path not like '1.2.1.%.%'

What if you use a fixed number of chars per level? The sibling
and immediate children queries then become something more like,

select * from hierarchy
where path like '0001.0002.0001.____'

select count(*) from hierarchy
where path like '0001.0002.0001.1234.____'

Since the characters per level is fixed, you can get rid of the
dots too, and parsing becomes a matter of length and substring.
Finding all descendants could be implemented using BETWEEN in
either materialized path scheme:

select * from hierarchy
where path between '1.2.1.1234.' and '1.2.1.1234~'

select * from hierarchy
where path between '0001.0002.0001.1234.' and '0001.0002.0001.1234~'

select * from hierarchy
where path between '0001000200011234.' and '0001000200011234~'

The '.' and '~' may need changing depending on collating order.

--
Joe Foster <mailto:jlfoster%40znet.com> DC8s in Spaace: <http://www.xenu.net/>
WARNING: I cannot be held responsible for the above They're coming to
because my cats have apparently learned to type. take me away, ha ha!

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message pg 2003-12-03 06:07:36 last update time of a table
Previous Message Lamar Owen 2003-12-03 04:09:14 Re: perl(Pg) (S)RPM