Beowulf Bulk Data Server

Overview

The goal of the "Beowulf Data Server" project is to develop a

for a high performance bulk data server constructed entirely from low-cost commodity off-the-shelf components.

Processors

The prototype system is being constructed from dual processor Intel P6 motherboards. While the processing power of the P6 might seem to be overkill for a machine that's designed for I/O, there are several factors that led us to this selection:

Disks

Disks are a vital element of a mass storage system: in our system they will be among the most expensive items. They are also one of our most-studied components.

An initial surprise to many people is that we are focusing on EIDE disks, when conventional wisdom holds that SCSI disks have better performance. The reason is the price/performance ratio between the two interface types. Most disk drives are available with either EIDE or SCSI interfaces, with the SCSI interface model commanding a premium of between $100 and 40%. Since the physical structure of the drives and heads is identical, the sustained I/O performance is nearly the same.
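The price/performance argument can be made concrete with a small calculation. The following is a minimal sketch only: the `DriveOption` class and every price and transfer rate in it are hypothetical placeholders chosen to match the shape of the argument (a $100 premium, nearly equal sustained rates), not figures from our evaluations.

```python
from dataclasses import dataclass


@dataclass
class DriveOption:
    """One interface variant of a drive (all values are illustrative)."""
    name: str
    price_usd: float            # hypothetical street price
    sustained_mb_per_s: float   # hypothetical sustained transfer rate

    def dollars_per_mb_per_s(self) -> float:
        """Cost of each MB/s of sustained bandwidth; lower is better."""
        return self.price_usd / self.sustained_mb_per_s


# Placeholder numbers: same mechanism, SCSI model priced $100 higher,
# sustained rates nearly identical.
eide = DriveOption("EIDE", price_usd=250.0, sustained_mb_per_s=10.0)
scsi = DriveOption("SCSI", price_usd=350.0, sustained_mb_per_s=10.5)

for drive in (eide, scsi):
    print(f"{drive.name}: ${drive.dollars_per_mb_per_s():.2f} per MB/s")
```

With real prices and measured rates substituted in, the same ratio gives a direct comparison between the two interface types.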

SCSI disks have the advantage of "tagged command queueing" and "disconnects", but these features are often outweighed by the lower overhead of issuing IDE commands and by having few devices on an IDE chain.

A major advantage of a SCSI interface, the ability to connect many devices to a single interface, is also not significant in this application: if every connected device is to deliver its full I/O bandwidth, a SCSI bus is limited to just a few devices.
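The bus-saturation point can be estimated by dividing the usable bus bandwidth by the sustained rate of one disk. This is a rough sketch under stated assumptions: the function name `max_streaming_disks` is made up for illustration, the 40 MB/s figure is the nominal Ultra Wide SCSI rate, and the 10 MB/s per-disk rate and the 0.8 efficiency factor are assumed values, not measurements.

```python
import math


def max_streaming_disks(bus_mb_per_s: float,
                        disk_mb_per_s: float,
                        efficiency: float = 1.0) -> int:
    """Number of disks that can stream at full rate on one shared bus.

    efficiency scales the nominal bus rate down to account for protocol
    overhead (an assumed fudge factor, not a measured value).
    """
    usable = bus_mb_per_s * efficiency
    return math.floor(usable / disk_mb_per_s)


# Nominal Ultra Wide SCSI bus (40 MB/s) with assumed 10 MB/s disks.
print(max_streaming_disks(40.0, 10.0))        # ideal bus
print(max_streaming_disks(40.0, 10.0, 0.8))   # assumed 80% efficiency
```

Under these assumptions the bus saturates well before the addressing limit of a wide SCSI bus (15 devices) is reached, which is the sense in which the many-devices advantage does not apply here.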

A final consideration is the cost of the SCSI adapter and cables. High performance SCSI adapters are expensive compared to bus-mastering EIDE adapters, and the cables for a high-performance SCSI bus ("ultra-wide, fast, differential") cost a significant fraction of the disk drive price.
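To see how adapter and cable costs change the comparison, the interface hardware can be amortized over the drives that share it. The sketch below is illustrative only: `cost_per_drive` is a hypothetical helper, and all prices and drive counts are assumed placeholder values.

```python
def cost_per_drive(drive_price: float,
                   interface_cost: float,
                   drives_per_adapter: int) -> float:
    """Drive price plus its share of the adapter and cable cost.

    interface_cost is the combined price of one adapter and its cables,
    split evenly across the drives attached to that adapter.
    """
    if drives_per_adapter < 1:
        raise ValueError("an adapter must serve at least one drive")
    return drive_price + interface_cost / drives_per_adapter


# Placeholder prices, four drives per adapter in both cases.
eide_total = cost_per_drive(250.0, interface_cost=55.0, drives_per_adapter=4)
scsi_total = cost_per_drive(350.0, interface_cost=400.0, drives_per_adapter=4)

print(f"EIDE: ${eide_total:.2f} per drive")
print(f"SCSI: ${scsi_total:.2f} per drive")
```

Because few drives can share one bus at full bandwidth, the adapter and cable cost is spread over a small number of drives, so an expensive interface adds noticeably to the cost of each one.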

A summary of our most recent IDE disk evaluations is available here.

Server Software Design

(Work in progress)

Some old pictures

Here are some pictures of the cluster as it was originally configured:

  1. The first picture shows all 50 nodes of the cluster. Notice the administrative node, "Ecgtheow," with its console, and the high-speed networks in the center column of the cluster. Also note the Maspar and the Beowulf cluster theHIVE in the background.
  2. The second picture gives a closer view of the networks. One can see the "fat" tree structure, which uses Gigabit Ethernet for the fat part of the tree. The center column holds the high-speed switches: the orange or gray fibre optic cables are the connections to the Packet Engines Gigabit Ethernet switch. The next two inner columns of the cluster are the "router" nodes: note the connections to the fibre optics and the Quad Ethernet adapters. The four remaining nodes on a row are the "data" nodes and form the "leaves" for the corresponding router nodes.

Sponsors

This project is funded by the DARPA Information Technology Office and the NASA HPCC program.


Contact: Phil Merkey merk@cesdis.gsfc.nasa.gov
Page last modified: 1999/05/04 16:32:54 GMT
CESDIS is operated for NASA by the USRA