Beowulf Project at CESDIS
The Beowulf clusters that have been constructed at the
Goddard Space Flight Center are presented along with
links to
Processor and Motherboard information.
Goddard Clusters
There are currently several smaller clusters at GSFC
and two large Beowulf clusters.
The large clusters each use half of a
144-port Foundry switch as their network backplane.
Usually, the clusters are employed for distinct
purposes so it makes sense to treat them as
individual clusters; however, when wired together,
theHIVE and Ecgtheow form a 256-processor cluster
which will be used during the summer of 1999 to
conduct scaling tests.
theHIVE
theHIVE is the
Highly-parallel Integrated Virtual Environment
constructed by Dr. John E. Dorband.
This is a 64-node cluster with 128 P6 processors, 24 GBytes of memory and
0.8 TBytes of disk. The network backbone is half
of the 144-port Foundry switch.
Read about it on its own
homepage.
Bulk Data Server
The Bulk Data Server, Ecgtheow, was built in May and June 1997.
It received a significant upgrade in the winter of 1998-99.
It now has 64 nodes with 128 Intel P6 processors running at 200 MHz.
The "fat" tree network configuration has been replaced
with the other half of the 144-port Foundry switch
used by theHIVE.
The memory has been increased to 8GBytes and the disk
space is now at 1.4 TBytes.
Each node consists of:
- Intel "Providence" PR440FX Motherboard with on-board
Intel Fast Ethernet.
- Two Pentium Pro CPUs running at 200 MHz
- 128M of memory
- 3 IDE disks
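As a quick consistency check, the per-node figures above can be multiplied out and compared with the cluster totals quoted earlier. The sketch below is only arithmetic on the numbers given in this section; the constant names are illustrative, and the average disk size assumes decimal units (1 TByte = 1000 GBytes) and an even split across disks, neither of which is stated on this page.

```python
# Per-node and cluster figures quoted in the text above.
NODES = 64
CPUS_PER_NODE = 2
MEM_PER_NODE_MB = 128
DISKS_PER_NODE = 3
TOTAL_DISK_TB = 1.4

# Aggregate processor count and memory.
total_cpus = NODES * CPUS_PER_NODE             # matches the quoted 128 processors
total_mem_gb = NODES * MEM_PER_NODE_MB / 1024  # matches the quoted 8 GBytes

# Assumption: 1 TByte = 1000 GBytes and disk space is evenly spread over all disks.
avg_disk_gb = TOTAL_DISK_TB * 1000 / (NODES * DISKS_PER_NODE)

print(f"processors: {total_cpus}")
print(f"memory:     {total_mem_gb:.0f} GBytes")
print(f"avg disk:   {avg_disk_gb:.1f} GBytes per drive (derived, not quoted)")
```

The processor and memory totals agree with the figures given for the upgraded system; the per-drive size is a derived estimate, not a quoted specification.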
Older clusters
Older GSFC Beowulf clusters include a 16-processor Pentium-based cluster,
"Hrothgar", a
five-processor Pentium-based cluster with multiple signal processing boards,
and our original 486-based cluster,
"Wiglaf".
Each Hrothgar node consists of:
- A Pentium processor running at 100 MHz
- A PCI motherboard based on the Intel Triton chipset
- 256K of synchronous cache
- 32M of memory
- 1.2G EIDE disk attached to the motherboard's 17MB/sec. bus master IDE
controller.
- Two or three 100 Mb/s Fast Ethernet
adapters.
The processors are currently connected by channel bonding across
two Fast Ethernet switches.
The original Beowulf prototype, Wiglaf, has been "decommissioned."
It is interesting to note the progress that has been made
over the last few years.
The highlights of the system components and characteristics
of this 16-node system are:
- DX4 processor running at 100 MHz internally.
This was a hybrid between the 80486 and the Intel P5 Pentium.
Its features included:
  - '486 execution core with improved microcode
  - SMM (System Management Mode) power management from the SL series
  - a 16KB cache, the same as the P5 and twice the 8K of the '486
  - made with the same 3.3V, 0.6 micron process and on the same process
    lines as the P5-90 and P5-100 processors.
- The motherboards were based on the SiS 82471 chipset.
This was the highest performance low-cost
'486 support chipset available at the time we purchased
the system. Each motherboard had:
  - 3 VL-bus slots, 2 bus-master capable
  - 4 ISA-only slots
  - 256K secondary cache with 2-1-1-1 burst refill.
  - "green" power-saving circuitry.
- Each processor had 16M of 60ns DRAM.
The 60ns memories were only slightly more expensive
than the usual 70ns or 80ns variety,
and allowed us to use a shorter delay
when accessing main memory.
The higher memory bandwidth was especially
important when the internally clock-tripled processor
did block memory moves. Having only 16M of memory
is the principal reason the system is difficult to
use and maintain, given our current machines.
- Each node had a 540M or 1G EIDE disk.
The EIDE disks connected to a VL bus controller based
on the DTC805 chip. The measured performance is about
4.5 MB/sec., close to the physical head data rate of
the drive (nominally 3.5-5.6MB/sec, depending on the zone).
- Three 10 Mb/s bus-master Ethernet cards.
Scalable communication was implemented by
duplicating the hardware address of a primary network
adaptor to the secondary interfaces, and marking all packets
received on the internal networks as coming from a single
pseudo-interface. This scheme constrains each internal
network to connect to each node. With these constraints
the Ethernet packet contents are independent of the actual
interface used and we avoid the software routing overhead
of handling more general interconnect topologies. The
only additional computation over using a single network
interface is the computationally simple task of distributing
the packets over the available device transmit queues.
The method alternated packets among the available network
interfaces.
The system-visible interface to this "channel bonding"
is the 'ifenslave' command. This command is analogous
to the 'ifconfig' command used to set up the primary
network interface. The 'ifenslave' command copies the
configuration of a "master" channel to a slave channel.
It can optionally configure the slave channel to run
in a receive-only mode, which is useful when initially
configuring or shutting down the additional network
interfaces.
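
The channel bonding scheme described above can be illustrated with a small user-space model. This is a minimal sketch, not the kernel code: the `Interface` and `BondedChannel` names, the interface names, and the hardware address are all hypothetical. It shows only the three ideas from the text: the master's hardware address is copied to the secondary interfaces, outgoing packets alternate among the transmit queues, and received packets are reported as coming from a single pseudo-interface (assumed here to carry the master's name).

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import List, Tuple


@dataclass
class Interface:
    """Hypothetical model of one network adapter and its transmit queue."""
    name: str
    hw_addr: str
    tx_queue: List[bytes] = field(default_factory=list)


class BondedChannel:
    """Toy model of a master interface with enslaved secondary interfaces."""

    def __init__(self, master: Interface, slaves: List[Interface]):
        # Duplicate the primary adapter's hardware address on every slave,
        # so packet contents do not depend on the interface actually used.
        for slave in slaves:
            slave.hw_addr = master.hw_addr
        self.master = master
        self.interfaces = [master] + slaves
        self._next = cycle(self.interfaces)

    def transmit(self, packet: bytes) -> str:
        """Queue a packet on the next interface in turn; return its name."""
        iface = next(self._next)
        iface.tx_queue.append(packet)
        return iface.name

    def receive(self, arrived_on: str, packet: bytes) -> Tuple[str, bytes]:
        """Report every packet as arriving on the single pseudo-interface."""
        return (self.master.name, packet)


# Three adapters per node, as in the 16-node prototype (address is made up).
bond = BondedChannel(
    Interface("eth0", "00:40:05:00:00:01"),
    [Interface("eth1", ""), Interface("eth2", "")],
)
used = [bond.transmit(bytes([i])) for i in range(6)]
print(used)
```

Because the only per-packet decision is which queue comes next, the extra cost over a single interface stays small, which is the point the text makes about avoiding software routing overhead.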
Links to Processor and Motherboard information
Dual Pentium II motherboards
Contact:
Phil Merkey
merk@cesdis.gsfc.nasa.gov.