Networking Forums

Cluster: what type to use?

 
 
Morris Ebbets

 
      10-09-2005, 07:00 PM
I work in a scientific computation group. In the next year or so we want to
migrate from our Sun servers (one v1280 and one e4500; they're old hence
slow) to a set of Linux workstations.

To share resources, we'd like to cluster the workstations. The cluster
should have the following properties:
* Load balancing
* Each node should be able to access our RAID without having to use a slow
connection like NFS
* Software for cluster should be relatively easy to install and maintain,
and shouldn't be inordinately expensive (say <~ $30 K)

We _don't_ need the cluster to be capable of true parallelism, because the
computations we run don't need that kind of power (and the software we use,
primarily matlab and some freeware stuff, isn't really set up for parallel
computation anyway). (Argument here is that we don't need Beowulf, and I'm
supposing that because it offers true parallel computing, there is likely
some higher "cost" in terms of installation/maintenance, but I could be dead
wrong about that.)

Anyone have any suggestions for cluster packages that would meet our needs?

TIA,

S


 
 
 
 
 
Juha Laiho

 
      10-10-2005, 06:30 PM
"Morris Ebbets" <(E-Mail Removed)> said:
>I work in a scientific computation group. In the next year or so we want to
>migrate from our Sun servers (one v1280 and one e4500; they're old hence
>slow) to a set of Linux workstations.
>
>To share resources, we'd like to cluster the workstations. The cluster
>should have the following properties:
>* Load balancing
>* Each node should be able to access our RAID without having to use a slow
>connection like NFS
>* Software for cluster should be relatively easy to install and maintain,
>and shouldn't be inordinately expensive (say <~ $30 K)


I'm not saying I have (any kind of) answer for you, but I have a couple
of questions which may help others in finding the answer.

- looks like you're not looking for failover/fault-tolerance, right?
- please elaborate on load balancing -- you said in your other text that
"true parallelism" is not needed -- so, please explain what it is you
mean by load balancing; what it should do?
- "each node should be able to access our RAID" -- should there be a file
system shared across the nodes (each node having simultaneous and equal
read/write access to a set of files), or do you just have a bunch of
disk space you want to provide to the machines, with no need for sharing?
Do you have the disk server already, or should this be part of the
specification you're looking for -- if you have the disk server/disk
subsystem already, it would help to know what it is.

Then, NFS isn't necessarily that slow -- esp. if you can provide it
a switched segment of its own (say, a 1 Gbit/s segment with jumbo frames
enabled). Just make sure that flooding protections in the switch don't
kick in. Perhaps trunk more than one connection to the server, so that
the server has more bandwidth than any single client.
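
As a rough back-of-the-envelope check on the jumbo-frame suggestion, the sketch below estimates the best-case payload rate on a dedicated 1 Gbit/s segment for a standard 1500-byte MTU and a 9000-byte jumbo MTU. It assumes NFS over TCP with minimal IPv4 and TCP headers and no options; the function names are illustrative only, and real NFS throughput also depends on the disks, the server, and the RPC layer.

```python
# Per-frame overhead on Ethernet that does not count toward the MTU:
# preamble + start delimiter (8), header (14), FCS (4), inter-frame gap (12).
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12   # 38 bytes
# Minimal IPv4 and TCP headers, which do count toward the MTU (assumption:
# no IP or TCP options).
IP_TCP_HEADERS = 20 + 20              # 40 bytes

LINK_BYTES_PER_SEC = 1_000_000_000 / 8  # 1 Gbit/s expressed in bytes/s


def payload_rate(mtu):
    """Upper bound on TCP payload bytes per second for a given MTU."""
    efficiency = (mtu - IP_TCP_HEADERS) / (mtu + ETHERNET_OVERHEAD)
    return LINK_BYTES_PER_SEC * efficiency


def frames_per_second(mtu):
    """Full-size frames per second needed to saturate the link."""
    return LINK_BYTES_PER_SEC / (mtu + ETHERNET_OVERHEAD)


for mtu in (1500, 9000):
    print(f"MTU {mtu}: {payload_rate(mtu) / 1e6:.1f} MB/s payload, "
          f"{frames_per_second(mtu):,.0f} frames/s")
```

Under these assumptions the payload ceiling only moves from roughly 119 MB/s to roughly 124 MB/s, but the number of frames each host has to process per second drops by almost a factor of six, which is where jumbo frames tend to help most.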
--
Wolf a.k.a. Juha Laiho Espoo, Finland
(GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V
PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++
"...cancel my subscription to the resurrection!" (Jim Morrison)
 
 
Morris Ebbets

 
      10-11-2005, 06:38 PM

"Juha Laiho" <(E-Mail Removed)> wrote in message
news:diec02$iv9$(E-Mail Removed)-int...
> "Morris Ebbets" <(E-Mail Removed)> said:
> >I work in a scientific computation group. In the next year or so we want to
> >migrate from our Sun servers (one v1280 and one e4500; they're old hence
> >slow) to a set of Linux workstations.
> >
> >To share resources, we'd like to cluster the workstations. The cluster
> >should have the following properties:
> >* Load balancing
> >* Each node should be able to access our RAID without having to use a slow
> >connection like NFS
> >* Software for cluster should be relatively easy to install and maintain,
> >and shouldn't be inordinately expensive (say <~ $30 K)

>
> I'm not saying I have (any kind of) answer for you, but I have a couple
> of questions which may help others in finding the answer.


Thanks for your kind, informative reply.

> - looks like you're not looking for failover/fault-tolerance, right?


Correct.

> - please elaborate on load balancing -- you said in your other text that
> "true parallelism" is not needed -- so, please explain what it is you
> mean by load balancing; what it should do?


Just that it wouldn't be efficient if one of the CPUs got many heavy jobs
and the others were free.

If we had separate workstations and users could log into any workstation
they liked, by random chance it might occur that an unreasonable fraction of
the heavy jobs were placed on one CPU.
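
The concern above can be illustrated with a small simulation: a minimal sketch comparing "users log into a random workstation" with a dispatcher that always picks the least-loaded host. The job costs, host count, and function names are made up for illustration, and a real batch system would use measured load rather than known job costs.

```python
import random


def random_login(jobs, n_hosts, rng):
    """Each job lands on a host chosen uniformly at random."""
    load = [0.0] * n_hosts
    for cost in jobs:
        load[rng.randrange(n_hosts)] += cost
    return load


def least_loaded(jobs, n_hosts):
    """Each job goes to the host with the least work assigned so far."""
    load = [0.0] * n_hosts
    for cost in jobs:
        idx = min(range(n_hosts), key=load.__getitem__)  # ties: lowest index
        load[idx] += cost
    return load


rng = random.Random(0)
# Hypothetical workload: mostly light jobs, with an occasional heavy one.
jobs = [rng.choice([1, 1, 1, 10]) for _ in range(40)]
n_hosts = 4

print("ideal per-host load:", sum(jobs) / n_hosts)
print("random login       :", sorted(random_login(jobs, n_hosts, rng)))
print("least-loaded       :", sorted(least_loaded(jobs, n_hosts)))
```

With the greedy rule, the busiest host can never exceed the ideal per-host load by more than the cost of one job, whereas random placement carries no such guarantee.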

> - "each node should be able to access our RAID" -- should there be a file
> system shared across the nodes (each node having simultaneous and equal
> read/write access to a set of files), or do you just have a bunch of
> disk space you want to provide to the machines, with no need for sharing?
> Do you have the disk server already, or should this be part of the
> specification you're looking for -- if you have the disk server/disk
> subsystem already, it would help to know what it is.


Hmm...don't know enough about disk storage to answer in detail. What I mean
is the following.
* We already have a RAID, equipped with a Veritas FS. I think it's being
run by the Sun v1280 right now, but am not sure.
* There's a lot of data.
* To my admittedly naive eye, it doesn't make sense to assign particular
disk space to particular machines, because that would tie particular CPUs to
particular disk space.

> Then, NFS isn't necessarily that slow -- esp. if you can provide it
> a switched segment of its own (say, a 1 Gbit/s segment with jumbo frames
> enabled). Just make sure that flooding protections in the switch don't
> kick in. Perhaps trunk more than one connection to the server, so that
> the server has more bandwidth than any single client.


OK. Don't know much about that stuff either, but it's very informative to
know that NFS isn't necessarily a terrible bottleneck, whatever else its
faults might be.

Thanks,

S




 
 
Juha Laiho

 
      10-12-2005, 04:14 PM
"Morris Ebbets" <(E-Mail Removed)> said:
>"Juha Laiho" <(E-Mail Removed)> wrote in message
>news:diec02$iv9$(E-Mail Removed)-int...
>> "Morris Ebbets" <(E-Mail Removed)> said:
>> >I work in a scientific computation group. In the next year or so
>> >we want to migrate from our Sun servers (one v1280 and one e4500;
>> >they're old hence slow) to a set of Linux workstations.
>> >
>> >To share resources, we'd like to cluster the workstations.

....
>> - please elaborate on load balancing -- you said in your other text that
>> "true parallelism" is not needed -- so, please explain what it is you
>> mean by load balancing; what it should do?

>
>Just that it wouldn't be efficient if one of the CPUs got many heavy jobs
>and the others were free.
>
>If we had separate workstations and users could log into any workstation
>they liked, by random chance it might occur that an unreasonable fraction of
>the heavy jobs were placed on one CPU.


Ok - I think I have the picture now. At least some years ago there was
a product called LSF from Platform Computing, Inc., which might fit your
needs. I think there could be some open source alternatives available as
well, but haven't been following the situation.
--
Wolf a.k.a. Juha Laiho Espoo, Finland
(GC 3.0) GIT d- s+: a C++ ULSH++++$ P++@ L+++ E- W+$@ N++ !K w !O !M V
PS(+) PE Y+ PGP(+) t- 5 !X R !tv b+ !DI D G e+ h---- r+++ y++++
"...cancel my subscription to the resurrection!" (Jim Morrison)
 
 
Morris Ebbets

 
      10-12-2005, 07:34 PM

"Juha Laiho" <(E-Mail Removed)> wrote in message
news:dijcou$oou$(E-Mail Removed)-int...
> "Morris Ebbets" <(E-Mail Removed)> said:
> >"Juha Laiho" <(E-Mail Removed)> wrote in message
> >news:diec02$iv9$(E-Mail Removed)-int...
> >> "Morris Ebbets" <(E-Mail Removed)> said:
> >> >I work in a scientific computation group. In the next year or so
> >> >we want to migrate from our Sun servers (one v1280 and one e4500;
> >> >they're old hence slow) to a set of Linux workstations.
> >> >
> >> >To share resources, we'd like to cluster the workstations.

> ...
> >> - please elaborate on load balancing -- you said in your other text that
> >> "true parallelism" is not needed -- so, please explain what it is you
> >> mean by load balancing; what it should do?

> >
> >Just that it wouldn't be efficient if one of the CPUs got many heavy jobs
> >and the others were free.
> >
> >If we had separate workstations and users could log into any workstation
> >they liked, by random chance it might occur that an unreasonable fraction of
> >the heavy jobs were placed on one CPU.

>
> Ok - I think I have the picture now. At least some years ago there was
> a product called LSF from Platform Computing, Inc., which might fit your
> needs. I think there could be some open source alternatives available as
> well, but haven't been following the situation.


OK; thanks for the tips!




 
 
Douglas O'Neal

 
      10-13-2005, 05:28 PM
Juha Laiho wrote:
> "Morris Ebbets" <(E-Mail Removed)> said:
> <snip>
>
> Ok - I think I have the picture now. At least some years ago there was
> a product called LSF from Platform Computing, Inc., which might fit your
> needs. I think there could be some open source alternatives available as
> well, but haven't been following the situation.



One open-source alternative to LSF is Grid Engine (formerly Sun Grid Engine
and also available commercially from Sun as N1 Grid Engine). Look at
http://gridengine.sunsource.net for more info.
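
For a feel of what using such a batch system might look like, here is a hedged sketch that renders a small Grid Engine job script for a non-interactive MATLAB run. The `#$` lines are directives that `qsub` reads from the script; the job name and the MATLAB function `fit_model` are hypothetical, and the options shown (`-S`, `-N`, `-cwd`, `-j y`) should be checked against the `qsub` man page of the installed version.

```python
def render_job_script(name, matlab_command):
    """Return the text of a shell script suitable for `qsub <script>`.

    `name` and `matlab_command` are placeholders chosen by the caller;
    nothing here contacts a scheduler.
    """
    lines = [
        "#!/bin/sh",
        "#$ -S /bin/sh",      # shell used to run the job
        f"#$ -N {name}",      # job name shown in the queue
        "#$ -cwd",            # run in the directory the job was submitted from
        "#$ -j y",            # merge stderr into stdout
        f'matlab -nodisplay -nosplash -r "{matlab_command}; exit"',
    ]
    return "\n".join(lines) + "\n"


script = render_job_script("fit1", "fit_model(1)")
print(script)
```

The user would save the output to a file and submit it with `qsub`, leaving the scheduler to pick a lightly loaded workstation instead of the user choosing one by hand.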

Doug
 