From: "Nathan Barnett" <nbarnett(at)cellularphones(dot)com>
To: "'Gordan Bobic'" <gordan(at)freeuk(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: RE: Database cluster?
Date: 2000-11-30 14:34:57
Message-ID: 71975481CD04D4118E57004033A2596E0DFB5B@ip205.82.136.216.in-addr.arpa
Lists: pgsql-general

This scheme is in use today as what is called shared-nothing clustering,
which is employed by IBM's DB2 and Microsoft SQL Server 2000. Either of
these products will work in the manner that you are looking for.

---------------------------------------------
Nathan Barnett

-----Original Message-----
From: pgsql-general-owner(at)postgresql(dot)org
[mailto:pgsql-general-owner(at)postgresql(dot)org]On Behalf Of Gordan Bobic
Sent: Thursday, November 30, 2000 4:02 AM
To: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] Database cluster?

> > > I am considering splitting the database into tables residing on
> > > separate machines, and connect them on one master node.
> > >
> > > The question I have is:
> > >
> > > 1) How can I do this using PostgreSQL?
> >
> > You can't.
>
> I'll jump in with a bit more info. Splitting tables across multiple
> machines would do nothing more than make the entire system run at a
> snail's pace. Yes, it would slow it down immensely, because you just
> couldn't move data between machines quickly enough.

I don't believe that is the case. In my case, queries typically return
comparatively small amounts of data - around 100 records at most. The
amount of data that needs to be transferred is comparatively small, and
even over 10 Mbit Ethernet it would take at most about a second to
transfer. This is a much smaller delay than the query time itself, which
can take 10 seconds or more. Remember that I said there are tables with
over 30M records? Doing multi-table joins on things like this takes a
long time...
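
As a back-of-the-envelope check of that transfer time (a sketch only,
assuming, say, 1 kB per record):

records = 100
record_bytes = 1024                # assumed average row size
link_bits_per_sec = 10_000_000     # 10 Mbit Ethernet, ignoring overhead
seconds = records * record_bytes * 8 / link_bits_per_sec
print(f"{seconds:.3f} s to ship {records} records")  # well under a second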

So, splitting the data in such a way that one table is queried, and then
the tables joined from it are queried in parallel, would cause a
significant speed-up.

For example, say we have tables T1, T2 and T3.

T1 has fields F1.1, F1.2 and F1.3, T2 has F2.1 and T3 has F3.1 (plus,
most likely, lots of other fields).

Say I want to do
SELECT *
FROM T1, T2, T3
WHERE F1.1 = F2.1 AND F1.2 = F3.1 AND F1.3 = 'somedata';

Then F1.3 could be searched for 'somedata'. When the records are found,
this could be cross-matched remotely, in parallel for F1.1=F2.1 and
F1.2=F3.1, on different machines.

This means that depending on the type, configuration and the usage of the
database, a potentially massive improvement in performance could be
achieved, especially on multi-table joins which span lots of BIG tables.
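
To make that concrete, here is a rough sketch of the idea in Python
(purely illustrative: the three in-memory lists stand in for the three
servers, and query_node() is a made-up helper that would really be a
remote SQL call to another PostgreSQL backend):

from concurrent.futures import ThreadPoolExecutor

# Pretend each "node" holds one table: node 1 -> T1, node 2 -> T2, node 3 -> T3.
NODE_T1 = [{"f1_1": 1, "f1_2": 10, "f1_3": "somedata"},
           {"f1_1": 2, "f1_2": 20, "f1_3": "other"}]
NODE_T2 = [{"f2_1": 1, "t2_data": "t2-row"}]
NODE_T3 = [{"f3_1": 10, "t3_data": "t3-row"}]

def query_node(table, column, values):
    """Stand-in for running SELECT ... WHERE column = ANY(values) on a remote node."""
    return [row for row in table if row[column] in values]

# Step 1: restrict T1 locally on F1.3 = 'somedata'.
t1_hits = [row for row in NODE_T1 if row["f1_3"] == "somedata"]

# Step 2: cross-match against T2 and T3 in parallel, each on its own node.
with ThreadPoolExecutor() as pool:
    t2_future = pool.submit(query_node, NODE_T2, "f2_1", [r["f1_1"] for r in t1_hits])
    t3_future = pool.submit(query_node, NODE_T3, "f3_1", [r["f1_2"] for r in t1_hits])
    t2_hits, t3_hits = t2_future.result(), t3_future.result()

# Step 3: join the partial results on the master node.
for r1 in t1_hits:
    for r2 in (r for r in t2_hits if r["f2_1"] == r1["f1_1"]):
        for r3 in (r for r in t3_hits if r["f3_1"] == r1["f1_2"]):
            print({**r1, **r2, **r3})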

Somebody mentioned the fact that postgres uses IPC for communicating
between processes. I think there are tools for clustering (I am not sure if
Mosix supports transparently allowing IPC across nodes) which can work
around that.

>       Why? Well, whenever you join two tables that are on different
> machines, the tables have to go across whatever sort of connection you
> have between the machines. Even if you use gigabit ethernet, you are
> still running at a mere fraction of the bandwidth of the computer's
> internal bus - and at orders of magnitude greater latency. You'd have
> lots of CPU's sitting around, doing absolutely nothing, waiting for
> data to come across the wire.

Gigabit ethernet has around the same bandwidth as the PCI bus. I suppose
it all depends on what machine you have running this. This would be true
in the case that the database server is a nice big Alpha with several
CPUs.

>       There are alternatives, such as IP-over-SCSI. That reduces the
> latency of ethernet quite a bit, and gives you much more bandwidth
> (say, up to 160 megabytes/second). However, that's still a pittance
> compared to the main system bus inside your computer.

But SCSI is still 160 MB/s burst (not sustained, unless you're using very
expensive arrays). And gigabit Ethernet is about 125 MB/s, albeit with
greater latency.

>       That's one of the greatest hurdles to distributed computing.
> That's why the applications that are best adapted to distributed
> computing are those that don't require much data over the wire - which
> certainly doesn't apply to databases. : )

I think it depends on whether the amount of data is the problem, or
fitting it together.

Can somebody please explain to me further why I am wrong in all this?

Regards.

Gordan


From: "Gordan Bobic" <gordan(at)freeuk(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Database cluster?
Date: 2000-11-30 14:44:23
Message-ID: 005701c05adc$0c7e90a0$8000000a@localdomain
Lists: pgsql-general

Thanks.

I have just had another thought. If all the tables are split across several
computers, this would help as well.

For example, if we have 100 records and 2 database servers, each server
could have 50 of those 100 records on it. When a selection is required,
each server would look through its much smaller database, and report back
the "hits". This would, effectively, provide a near linear speedup in the
query time, while introducing only a minor network overhead (or a major
one, depending on how much data is transferred).

Some extra logic could then be implemented for related tables that would
allow the most closely related records from the different tables to be
"clustered" (as in kind of remotely similar to the CLUSTER command) on the
same server, for faster response time and minimized network usage
requirements. The "vacuum" or "cluster" features could be used overnight to
re-optimize the distribution of records across the servers.

In all this, a "master" node could be used for coordinating the whole
operation. We could ask the master node to do a query, and it would
automatically, knowing what slaves it has, fire off that query on them.
Each slave would then, in parallel, execute the query and return a subset
of the data we were looking for. This data would then be joined into one
recordset before it is returned to the client that requested it.
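
A rough sketch of what I mean, assuming nothing more than the Python
standard library (the in-memory "shards" stand in for the slave servers,
and run_on_shard() is a made-up helper that would really send the SQL to
one PostgreSQL backend):

from concurrent.futures import ThreadPoolExecutor

SHARDS = [
    [{"id": 1, "name": "alpha"}, {"id": 3, "name": "gamma"}],  # slave 1: half the rows
    [{"id": 2, "name": "beta"},  {"id": 4, "name": "delta"}],  # slave 2: the other half
]

def run_on_shard(shard, predicate):
    """Stand-in for executing the same SELECT on one slave and returning its hits."""
    return [row for row in shard if predicate(row)]

def scatter_gather(predicate):
    # Fire the query at every slave in parallel, then join the partial
    # result sets into one recordset on the master before returning it.
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = pool.map(lambda shard: run_on_shard(shard, predicate), SHARDS)
    return [row for partial in partials for row in partial]

print(scatter_gather(lambda row: row["id"] >= 2))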

As far as I can see, as long as the amounts of data shifted aren't huge
enough to cause problems with network congestion, and the query time
remains dominant over the data transfer time across the network, this
should provide a rather scalable system. I understand that the form of
database clustering I am mentioning here is fairly rudimentary and
unsophisticated, but it would certainly be a very useful feature.

Are there any plans to implement this sort of functionality in PostgreSQL?
Or is this a lot more complicated than it seems...

Regards.

Gordan



From: Doug Semig <dougslist(at)semig(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Database cluster?
Date: 2000-11-30 15:37:37
Message-ID: 3.0.6.32.20001130103737.007d8ca0@sloth.c3net.net
Lists: pgsql-general

You're almost describing a Teradata DBM.

What an amazing machine! Last I heard about 6 years ago, though, AT&T was
rewriting it as an NT app instead of running on proprietary hardware. The
proprietary hardware was essentially a cluster of 80486 computers (at the
time).

What they had done was implement a pyramid structure of 80486 computers.
The lowest level of computers had hard disks and stored the data. Two of
the lowest-level computers would "report" to a single higher-up computer.
Two of these higher-up computers would "report" to yet another single
higher-up computer, until there was only one computer at the top left to
report to.

The thing that impacted me the most about this architecture was that
sorting was practically built in. So all the intermediary computers had
to do was merge the sorted result sets from their lower-level computers.
Blazing!
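
To illustrate just the merge step (a hedged sketch, nothing
Teradata-specific about it): each lower-level node hands up an
already-sorted result set, and the node above it only has to merge them.

import heapq

leaf_a = [1, 4, 7, 9]    # sorted results from one leaf node
leaf_b = [2, 3, 8, 10]   # sorted results from its sibling

merged = list(heapq.merge(leaf_a, leaf_b))  # linear merge, no re-sort needed
print(merged)                               # [1, 2, 3, 4, 7, 8, 9, 10]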

And data was stored on a couple of leaf-level computers for redundancy.

I miss that old beast. But I certainly cannot afford the multimillion
dollars required to get one for myself. We lovingly called the one we
worked with the "Pteradactyl," which is the old name for that bird-like
dinosaur (evidently there's a new word for the bird-like dinosaur, the
pteranodon or something?).

Doug



From: "Gordan Bobic" <gordan(at)freeuk(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Database cluster?
Date: 2000-11-30 16:02:42
Message-ID: 006201c05ae6$f98b2e80$8000000a@localdomain
Lists: pgsql-general

> You're almost describing a Teradata DBM.

I knew someone must have thought of it before. ;-)

[snip]

> The thing that impacted me the most about this architecture was that
> sorting was practically built in. So all the intermediary computers had
> to do was merge the sorted result sets from their lower-level computers.
> Blazing!

They effectively implemented a binary tree in hardware. One hell of an
indexing mechanism. :-)

> I miss that old beast. But I certainly cannot afford the multimillion
> dollars required to get one for myself.

I suppose it would depend on how many computers you want to have in this
cluster. The main reason why clusters are getting popular recently
(albeit not yet for databases, or so it would seem) is that they are
cheaper than anything else with similar performance.

The main question remains - are there any plans to implement something
similar to this with PostgreSQL? I would volunteer to help with some
coding, if a "group" was formed to work on this "clustering" module.

Regards.

Gordan


From: Doug Semig <dougslist(at)semig(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Database cluster?
Date: 2000-11-30 19:04:46
Message-ID: 3.0.6.32.20001130140446.007e3b10@sloth.c3net.net
Lists: pgsql-general

I actually analyzed it once. I came to the conclusion that to do it right
it would be easier to make an almost entirely new db but use the same
external interfaces as PostgreSQL. To do a kludge of it, one might just
implement a tier that sits between the user and a bunch of standard
PostgreSQL backends.

It'd make a neat companion project, though. Like PG/Enterprise or
PG/Warehouse or something.

Doug



From: Peter Korsgaard <jacmet(at)control(dot)auc(dot)dk>
To: Doug Semig <dougslist(at)semig(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Database cluster?
Date: 2000-11-30 21:16:11
Message-ID: Pine.GSO.4.21.0011302159290.10266-100000@brille.control.auc.dk
Lists: pgsql-general

On Thu, 30 Nov 2000, Doug Semig wrote:

> I actually analyzed it once. I came to the conclusion that to do it right
> it would be easier to make an almost entirely new db but use the same
> external interfaces as PostgreSQL. To do a kludge of it, one might just
> implement a tier that sits between the user and a bunch of standard
> PostgreSQL backends.
>
> It'd make a neat companion project, though. Like PG/Enterprise or
> PG/Warehouse or something.

I'm currently developing a simple version of such a system as a
university project. It is a fairly simple approach with a proxy or a
distributor in front of a bunch of standard PostgreSQL database servers.

The proxy monitors and forwards the requests from the clients to the
database servers. If it is a read-only request, the query is forwarded to
the database server currently experiencing the lowest load/most free
memory; otherwise it is sent to all database servers.

This approach obviously only performs well in systems with a high ratio of
read-only queries, such as search engines and so on.
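
A minimal sketch of that routing rule (illustrative only; the load
figures and the execute() helper are made-up stand-ins for the real
status daemons and database driver):

READ_ONLY_PREFIXES = ("SELECT",)

servers = [
    {"name": "db1", "load": 0.7},
    {"name": "db2", "load": 0.2},
    {"name": "db3", "load": 0.5},
]

def execute(server, sql):
    print(f"{server['name']}: {sql}")  # stand-in for sending the query

def route(sql):
    if sql.lstrip().upper().startswith(READ_ONLY_PREFIXES):
        # Read-only: forward to the least-loaded backend only.
        execute(min(servers, key=lambda s: s["load"]), sql)
    else:
        # Write: send to every backend so the copies stay identical.
        for server in servers:
            execute(server, sql)

route("SELECT count(*) FROM hits")
route("INSERT INTO hits VALUES (42)")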

--
Bye, Peter Korsgaard


From: "Steve Wolfe" <steve(at)iboats(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Database cluster?
Date: 2000-11-30 22:13:48
Message-ID: 002701c05b1a$d95b71e0$50824e40@iboats.com
Lists: pgsql-general

> The proxy monitors and forwards the requests from the clients to the
> database servers. If it is a read-only request the query is forwarded to
> the databaseserver currently experiencing the lowest load/most free
> memory, otherwise it is sent to all database servers.
>
> This approach obviously only performs well in systems with a high ratio of
> read-only queries, such as search engines and so on.

The tough part is synchronization, should one of the machines drop out of
the cluster and need to be re-added without bringing the others down. In
order to get around that, each query needs to be logged on the master
node with a timestamp, so that the failed node can "catch up" in real
time. That brings about other considerations, as well....
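
Something like this toy sketch, say (the names are made up and the log is
just an in-memory list; a real implementation would persist it):

import time

query_log = []  # (timestamp, sql) pairs kept on the master node

def log_write(sql):
    query_log.append((time.time(), sql))

def catch_up(last_seen):
    """Return the writes a re-joining node still has to apply, in order."""
    return [sql for ts, sql in query_log if ts > last_seen]

log_write("INSERT INTO hits VALUES (1)")
rejoin_point = time.time()  # a node drops out of the cluster here
log_write("INSERT INTO hits VALUES (2)")
log_write("UPDATE hits SET n = n + 1 WHERE id = 1")

print(catch_up(rejoin_point))  # only the writes the failed node missed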

steve


From: Peter Korsgaard <jacmet(at)control(dot)auc(dot)dk>
To: Steve Wolfe <steve(at)iboats(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Database cluster?
Date: 2000-11-30 22:15:31
Message-ID: Pine.GSO.4.21.0011302313340.10266-100000@brille.control.auc.dk
Lists: pgsql-general

On Thu, 30 Nov 2000, Steve Wolfe wrote:

> The tough part is synchronization, should one of the machines drop out
> of the cluster and need to be re-added without bringing the others
> down. In order to get around that, each query needs to be logged on the
> master node with a timestamp, so that the failed node can "catch up" in
> real time. That brings about other considerations, as well....

Exactly! That's also why I have decided not to implement that feature, as
the deadline is in 3 weeks ;) If a database server fails, it is simply
discarded from the cluster and can only be reconnected by taking the
system offline and doing the synchronisation manually.

--
Bye, Peter Korsgaard


From: "Gordan Bobic" <gordan(at)freeuk(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Database cluster?
Date: 2000-12-01 08:58:58
Message-ID: 005301c05b75$d928b9c0$8000000a@localdomain
Lists: pgsql-general

> I actually analyzed it once. I came to the conclusion that to do it
> right it would be easier to make an almost entirely new db but use the
> same external interfaces as PostgreSQL.

I admit that I am not really too up-to-date on database theory, but I am a
bit surprised at that...

> To do a kludge of it, one might just implement a tier that sits between
> the user and a bunch of standard PostgreSQL backends.

That is precisely what I was thinking about. There would have to be a
"master" node that controls what goes where, and distributes the load.
This "shouldn't" be too difficult (although I am not totally sure what I
mean by that). The nasty bit would probably be hacking the optimizer, the
SQL command "CLUSTER", and VACUUM to take account of, and efficiently
use, all the extra room for improving the performance.

Automating a "near-optimal" distribution of tables across machines could be
a bit of a difficult problem from a theory side, but it ought to be
possible. There are several options here.

One could just put one table on each server, which is unlikely to be all
that beneficial, although in a multi-table join, you'd want to search the
smallest tables first.

Then, there's the option of just splitting each table across multiple
machines. There is also the possibility of having some records overlap
between machines, if the on-line optimizer decides that that would be
useful for performance, and then sorting out the syncing somehow.

Or, one could set up an even more sophisticated system where only the
tables and data that would benefit from being together would be on the same
server, so there could be a section of two tables on one server, the rest
of those two tables and a section of another table on another server, etc.
Basically, make both the table and record allocations completely dynamic
between the servers.

I am not sure how useful each of these splits would be, but it is certainly
something well worth exploring theoretically before the actual
implementation, because I reserve the right to be wrong in thinking that
any of these methods would produce an actual improvement in performance.
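
As a toy illustration of the simplest record-level split (names made up,
and ignoring the optimizer question entirely), which server "owns" a
record could be decided by hashing its key:

import zlib

SERVERS = ["db1", "db2", "db3"]

def home_server(table, key):
    # Stable hash of (table, key) -> one of the servers.
    return SERVERS[zlib.crc32(f"{table}:{key}".encode()) % len(SERVERS)]

for key in range(1, 7):
    print(key, "->", home_server("T1", key))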

And, of course, there would be the bit of getting the optimizer and partial
replication to work properly across servers, which may not be an easy task.

> It'd make a neat companion project, though. Like PG/Enterprise or
> PG/Warehouse or something.

I agree. It would be really neat. Something like Mosix, but for
databases. And it just sounds like something that would be really useful
for large databases, especially as we start reaching the steep part of
the price/performance curve for database servers.

Regards.

Gordan



From: "Gordan Bobic" <gordan(at)freeuk(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Database cluster?
Date: 2000-12-01 09:05:33
Message-ID: 005601c05b75$df0dd1e0$8000000a@localdomain
Lists: pgsql-general

> > I actually analyzed it once. I came to the conclusion that to do it
> > right it would be easier to make an almost entirely new db but use the
> > same external interfaces as PostgreSQL. To do a kludge of it, one
> > might just implement a tier that sits between the user and a bunch of
> > standard PostgreSQL backends.
> >
> > It'd make a neat companion project, though. Like PG/Enterprise or
> > PG/Warehouse or something.
>
> I'm currently developing a simple version of such a system as an
> university project. It is a fairly simple aproach with a proxy or a
> distributor in front of a bunch of standard postgresl database servers.
>
> The proxy monitors and forwards the requests from the clients to the
> database servers. If it is a read-only request, the query is forwarded
> to the database server currently experiencing the lowest load/most free
> memory; otherwise it is sent to all database servers.

Surely, you have to send off a query to all servers that the specific
tables you are searching are split across. How are you handling splitting
of the data?

> This approach obviously only performs well in systems with a high ratio
> of read-only queries, such as search engines and so on.

This implies lots of servers with identical data on them. Am I right?

Could you tell us a bit more about your project (if it's not breaching any
non-disclosure agreements, that is)? It could be a good starting point for
what we are talking about implementing.

Regards.

Gordan


From: Peter Korsgaard <jacmet(at)control(dot)auc(dot)dk>
To: Gordan Bobic <gordan(at)freeuk(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Database cluster?
Date: 2000-12-01 09:31:45
Message-ID: Pine.GSO.4.21.0012011015120.15910-100000@brille.control.auc.dk
Lists: pgsql-general

On Fri, 1 Dec 2000, Gordan Bobic wrote:

> > The proxy monitors and forwards the requests from the clients to the
> > database servers. If it is a read-only request the query is forwarded to
> > the databaseserver currently experiencing the lowest load/most free
> > memory, otherwise it is sent to all database servers.
>
> Surely, you have to send off a query to all servers that the specific
> tables you are searching are split across. How are you handling splitting
> of the data?

I'm not ;) All the database servers of the cluster simply each contain
a copy of all the data. I know this isn't an optimal solution, but this
project is only running for a very short period (around 2 months).

> > read-only queries, such as search engines and so on.
>
> This implies lots of servers with identical data on them. Am I right?
>
> Could you tell us a bit more about your project (if it's not breaching any
> non-disclosure agreements, that is)? It could be a good starting point for
> what we are talking about implementing.

There are no NDAs involved in this project. We are planning on releasing
the software under the GPL after the deadline (22nd of December).

There isn't much more to tell about the system at this time. Currently we
are finishing off the lower levels of the system, like the database
driver (currently only Postgres), status daemons through SNMP, and so on.
At this time the actual scheduling hasn't been implemented.

--
Bye, Peter Korsgaard


From: "Gordan Bobic" <gordan(at)freeuk(dot)com>
To: "pggeneral" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Database cluster?
Date: 2000-12-01 09:48:25
Message-ID: 003301c05b7d$d6eefcc0$8000000a@localdomain
Lists: pgsql-general

> > The proxy monitors and forwards the requests from the clients to the
> > database servers. If it is a read-only request, the query is forwarded
> > to the database server currently experiencing the lowest load/most free
> > memory; otherwise it is sent to all database servers.
> >
> > This approach obviously only performs well in systems with a high ratio
> > of read-only queries, such as search engines and so on.
>
> The tough part is synchronization, should one of the machines drop out of
> the cluster and need to be re-added without bringing the others down.
> In order to get around that, each query needs to be logged on the master
> node with a timestamp, so that the failed node can "catch up" in
> real time. That brings about other considerations, as well....

What about using something like a RAID-type setup on the database
servers? So that, for example, all data (records or tables, depending on
the split) exists on more than one server. That way, if a server goes
down, all the data would still exist on the other servers. A server
failure could trigger a replication operation that would re-distribute
the data across the remaining servers in such a way that if another one
fails, data integrity would still be preserved. Similarly, when a new
server is added, this should trigger an operation that would
re-distribute records in such a way as to ensure redundancy with the new
number of servers (in this case, move some records from the old servers
to the new one and load balance it). This, again, "shouldn't" be too hard
to implement, if we could sort out the design first.
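
A toy sketch of that placement idea (purely illustrative, standard
library only): every record lives on two servers, and when one server
fails, the records that lost a copy get a replacement copy elsewhere.

import zlib

def placement(key, servers, copies=2):
    """Pick `copies` distinct servers for a record, spread by a stable hash."""
    start = zlib.crc32(str(key).encode()) % len(servers)
    return [servers[(start + i) % len(servers)] for i in range(copies)]

servers = ["db1", "db2", "db3", "db4"]
records = {key: placement(key, servers) for key in range(8)}

# db3 fails: any record that lost a copy gets a replacement copy elsewhere.
survivors = [s for s in servers if s != "db3"]
for key, homes in records.items():
    if "db3" in homes:
        homes.remove("db3")
        spare = next(s for s in placement(key, survivors) if s not in homes)
        homes.append(spare)

print(records)  # every record is back on two distinct servers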

However, this could be (theoretically) difficult to combine with
"near-optimal" table/record distribution. The two might work against each
other. There should be a way of enforcing the redundancy part, and then
making a "best effort" to do it in such a way as to gain the most
performance improvement. This _could_ be done, and off the top of my
head, I cannot see any reason why it wouldn't work reasonably well.
Obviously, a server failure would cause the whole thing to slow down for
a while until the records were re-distributed and the optimizer
re-assessed the situation, but that is true of any redundant system that
has to remain at least operational during a failure. After the
replication is completed, it should resume at its full speed (dictated by
the number of available servers).

Any opinions on this?

Regards.

Gordan


From: "Gordan Bobic" <gordan(at)freeuk(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Database cluster?
Date: 2000-12-01 09:51:24
Message-ID: 003401c05b7d$d7215760$8000000a@localdomain
Lists: pgsql-general

> On Thu, 30 Nov 2000, Steve Wolfe wrote:
>
> > The tough part is synchronization, should one of the machines drop out
> > of the cluster and need to be re-added without bringing the others
> > down. In order to get around that, each query needs to be logged on
> > the master node with a timestamp, so that the failed node can "catch
> > up" in real time. That brings about other considerations, as well....
>
> Exactly! That's also why I have decided not to implement that feature,
> as the deadline is in 3 weeks ;)

Fair!

> If a database server fails, it is simply
> discarded from the cluster and can only be reconnected by taking the
> system offline and doing the synchronisation manually.

Where's all that enthusiasm for a new project gone? Ever thought of
making a "new and improved" version 2 or something? I think it's a great
idea, and if we can get a few people together to implement it, it would
be a worthwhile exercise. And right now, it looks like you have a bit of
a head start. ;-)

Regards.

Gordan


From: "Gordan Bobic" <gordan(at)freeuk(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Database cluster?
Date: 2000-12-01 10:26:27
Message-ID: 008c01c05b81$33e183a0$8000000a@localdomain
Lists: pgsql-general

> > > The proxy monitors and forwards the requests from the clients to the
> > > database servers. If it is a read-only request, the query is
> > > forwarded to the database server currently experiencing the lowest
> > > load/most free memory; otherwise it is sent to all database servers.
> >
> > Surely, you have to send off a query to all servers that the specific
> > tables you are searching are split across. How are you handling
> > splitting of the data?
>
> I'm not ;)

That's what I thought you were implying...

> All the database servers of the cluster simply each contain
> a copy of all the data. I know this isn't an optimal solution, but this
> project is only running for a very short period (around 2 months).

It's actually a rather fair compromise, given the time scale (IMHO). It's
like mirroring disks with RAID: you get the same write performance, but
multiply the read performance across multiple disks. It sounds like
you're doing the same thing with databases.

And as your application is a web search engine, this should work quite
well, especially under heavy load / lots of hits.

> > > read-only queries, such as search engines and so on.
> >
> > This implies lots of servers with identical data on them. Am I right?
> >
> > Could you tell us a bit more about your project (if it's not breaching
> > any non-disclosure agreements, that is)? It could be a good starting
> > point for what we are talking about implementing.
>
> There are no NDAs involved in this project. We are planning on releasing
> the software under the GPL after the deadline (22nd of December).

Hmm... I'll be looking forward to having a look at that. Could you post us
a URL, once it is released? Are you intending to continue development of
this? I'm just thinking about how well this approach would work with
something like what Valter has suggested here recently...

Regards.

Gordan