Cyborg Parallel Cluster



Some History

    The original Beowulf cluster consisted of 16 nodes built by Don Becker (formerly of NASA) and Dan Ridge (University of Maryland).  It underwent many transitions.  It was administered by Kimberly Engle and later rebuilt and administered by Josephine Palencia.

    The department built another cluster of 32 nodes and called it the Cyborg Project. Cyborg consisted of dual-processor motherboards carrying 32x2 Pentium III 450 MHz processing elements. The main console has 2 SMC 10/100 Fast Ethernet cards connected to the nodes through 3COM switches. The switches became necessary because of the high volume of network traffic generated by node-to-node communication.

    Although the hardware layout is the same as that of the original Beowulf system at NASA, the cluster was set up so that each node is an independent machine that mounts only user binaries from the console.



The Hardware

      CLUSTER I : Cyborg.physics.drexel.edu

Dual-processor motherboards, 32 (x2) Pentium III 450 MHz PEs

ASUS P2B-D(S) motherboard (console)

512 (256) MB 8 ns PC-100 memory, console (node): total 8500 MB

3 (2) SMC 100 Mbps Fast Ethernet cards (node)

8.4 GB Seagate IDE + 9.0 GB Seagate SCSI disk (console): total 4000 GB

Acer CD-RW

                                                                                                                                                             

      CLUSTER II:  Beowulf.physics.drexel.edu                          

       

16 first-generation Pentium processors, 166 MHz, with the Intel Triton chipset

Tyan motherboard

(64) 48 MB EDO memory

2 SMC 100 Mbps Fast Ethernet cards

2.2 GB IDE disk

The Software Environment


Red Hat Linux
MOSIX, courtesy of the Computer Science department, Hebrew University of Jerusalem
PVM from the Oak Ridge National Laboratory
MPI from Argonne National Laboratory

                     

Network Communication

    There is an optimal number of nodes that can be added to the cluster, beyond which network congestion inhibits the system's normal communication channels. Depending on the parallel tasks being run and how much node-to-node communication occurs, this number is close to 30 units. An ordinary 10/100 hub cannot handle the congestion, so we used two 24-port 3COM switches.
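
    The effect of node count on network load can be illustrated with a rough sketch. The Python below is a toy model, not a measurement of Cyborg: it assumes every node may talk to every other node, that a hub divides one 100 Mbps collision domain evenly among all active nodes, and that a switch is non-blocking and gives each port the full link rate. The function names (node_pairs, hub_share_mbps, switch_share_mbps) are hypothetical and chosen only for this illustration.

```python
def node_pairs(n: int) -> int:
    """Number of distinct node-to-node pairs in a cluster of n nodes."""
    return n * (n - 1) // 2


def hub_share_mbps(n: int, link_mbps: float = 100.0) -> float:
    """Idealized per-node bandwidth on a hub.

    Assumption: all n nodes transmit at once and split a single
    shared collision domain evenly (collisions are ignored).
    """
    return link_mbps / n


def switch_share_mbps(n: int, link_mbps: float = 100.0) -> float:
    """Idealized per-node bandwidth on a switch.

    Assumption: the switch is non-blocking, so each port keeps the
    full link rate regardless of n.
    """
    return link_mbps


if __name__ == "__main__":
    for n in (2, 8, 16, 32):
        print(
            f"{n:2d} nodes: {node_pairs(n):3d} pairs, "
            f"hub ~{hub_share_mbps(n):6.2f} Mbps/node, "
            f"switch ~{switch_share_mbps(n):6.2f} Mbps/node"
        )
```

    Under these assumptions the number of communicating pairs grows quadratically with the node count while a hub's per-node share shrinks in proportion to it. The model does not reproduce the figure of about 30 nodes, which depends on the actual workload.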

       

Heat

      Another important aspect of building the cluster is adequate cooling. With several drives in each of the 32 machines, the heat build-up, if not adequately dissipated, would make the system unstable, because both the CPUs and the drives are sensitive to heat. The main console of Cyborg alone has 5 fans: 2 for the drives, 2 case fans, and 1 for the CPU. Each node has 4 fans per case.