We designed another artificial test to determine the approximate disk bandwidth of the Beowulf prototype and the limiting factors on remote interprocessor file accesses. The results of this experiment as originally run on the Beowulf prototype are shown in Figure 3, and the demonstration system results are shown in Figure 5. The experiment measured the throughput of simultaneous file transfers across a mix of intraprocessor and interprocessor copies for a range of file sizes. Seven simultaneous file transfers were performed, each of which could be either local or remote. A local file transfer involved only one processor, which copied a file from its local disk to another file on the same disk. A remote file transfer involved two processors: one ran a process that read a file from its local disk and wrote it across the network, and the other ran a process that read the data from the network and wrote it to its own local disk. No processor was ever involved in more than one file transfer, which avoided local disk contention. A problem that arises when conducting an experiment of this type is that the Linux operating system automatically caches files as they are accessed in an attempt to reduce the cost of future accesses. To ensure that all file transfers involved only uncached files, we copied a dummy 32 MB file prior to each run so that previously cached file data would be displaced. Local file transfers were performed using the Linux implementation of the POSIX read() and write() system calls, while remote file transfers additionally used TCP/IP to carry the files across the network.
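As an illustration of the local transfer described above, the following minimal sketch copies a file with read() and write() and reports the achieved rate. It is not the benchmark code used in the experiment. The names `copy_file`, `throughput_mb_s`, and `CHUNK`, the 64 KB buffer size, and the final `fsync` call are all assumptions made for the example; the experiment's actual buffer size and synchronization policy are not stated.

```python
import os
import time

CHUNK = 64 * 1024  # assumed buffer size; the experiment's value is not stated


def copy_file(src_path, dst_path, chunk=CHUNK):
    """Copy src_path to dst_path using read()/write().

    Returns (bytes_copied, elapsed_seconds).
    """
    start = time.perf_counter()
    total = 0
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                buf = os.read(src, chunk)
                if not buf:  # empty read means end of file
                    break
                offset = 0
                while offset < len(buf):  # write() may accept fewer bytes
                    offset += os.write(dst, buf[offset:])
                total += len(buf)
            # Assumption: flush to disk so the timing is not purely a
            # measurement of writes into the operating system's cache.
            os.fsync(dst)
        finally:
            os.close(dst)
    finally:
        os.close(src)
    return total, time.perf_counter() - start


def throughput_mb_s(nbytes, seconds):
    """Convert a byte count and duration into MB/s (1 MB = 2**20 bytes)."""
    return nbytes / (1024 * 1024) / seconds
```

A run over one file size would call `copy_file` and pass its two return values to `throughput_mb_s`. A remote transfer would replace the destination file descriptor with a TCP socket on the sending side and the source descriptor with a socket on the receiving side.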
Figure 5 shows the results of running this experiment in 2-channel mode for file sizes ranging from 1 to 16 MB, with the number of remote file copies varied from 0 (all local copies) to 7 (all remote copies). As one would expect, the highest file transfer rate, 11.7 MB/s, occurred when all file transfers were local. The lowest transfer rate achieved was 6.6 MB/s. In the prototype system (see Figure 3), the network clearly constrained the disk throughput. This is no longer the case. As the number of remote file transfers increases, the curves no longer converge on the maximum sustained network performance. They now remain rather flat, degrading by only about 3.5% for each additional remote file transfer, compared with the 15% seen in the Beowulf prototype with 10 Mbps Ethernet.
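To put the two per-transfer degradation figures side by side, the short calculation below estimates what fraction of the all-local throughput would remain after seven remote transfers. It assumes that each additional remote transfer reduces throughput by a constant percentage of the current value (a compounding model). That model and the helper name `retained_fraction` are assumptions for illustration only; the output is arithmetic on the quoted percentages, not a measured result.

```python
def retained_fraction(loss_per_transfer, remote_transfers):
    """Fraction of all-local throughput left after the given number of
    remote transfers, assuming a constant compounding loss per transfer."""
    return (1.0 - loss_per_transfer) ** remote_transfers


# Percentages quoted in the text; the compounding model is an assumption.
demo = retained_fraction(0.035, 7)       # demonstration system
prototype = retained_fraction(0.15, 7)   # prototype with 10 Mbps Ethernet

print(f"demonstration system retains {demo:.1%} of all-local throughput")
print(f"prototype retains {prototype:.1%} of all-local throughput")
```

Under this assumption the demonstration system keeps roughly three quarters of its all-local throughput with all seven transfers remote, while the prototype keeps roughly one third.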