The strategy of moving a product upward from modest origins has succeeded before. At its founding, Sun Microsystems took off-the-shelf parts and a freely available operating system (Berkeley UNIX) and set out to overthrow the high-end UNIX workstations of the day. Comparable performance, and especially large cost savings, were benefits too great for customers to overlook. Apollo perished in the competition with Sun, and Cray, the big name in supercomputers, is no more, thanks to competitive pressures in the high-end computing space.
High-end computing focuses on a few simple problems with complicated solutions. Beyond the basic question of raw performance there is the problem of scalability: How can we keep adding computing power to a system and gain corresponding increases in productivity? How do we manage all that power? Another is the problem of availability: How can we maintain a high-powered computing system that is available 168 hours a week, one that will not stumble or fall when a component fails, whether a processor, memory chip, disk drive, network connection, or anything else?
The Open Source community has attacked these problems in its usual way, with many different projects springing up and then cooperating or fading as the technology as a whole moves forward. The Community has used clustering as the most effective way of dealing with the problems of low-cost high-performance computing, and proprietary software has helped here as well. A few of the projects and successful implementations of them are covered in this chapter; links to many of the others can be found at http://linas.org/linux/.
The uses for high-end Open Source systems currently run from weather analysis to encryption cracking to genetic research, and the projects discussed in this chapter may be run on discarded hardware or on workstation or even mainframe processors. Except for the mainframe projects, all of the innovations involve ganging sets of processors in different ways.
As ordinary desktop computers have become more powerful, users are discovering that they can join them in clusters to undertake computing tasks formerly reserved for the biggest machines. All the hardware of a supercomputer, including its specially designed and enormously expensive high-end chips, is built to minimize communications delays within the machine, and that contributes to its great speed. Clusters of desktop computers can now imitate the computing power of a supercomputer; the chief difficulty is getting the machines within the cluster to communicate efficiently enough to approach the supercomputer's speed. The result is processing power approaching that of a supercomputer, at somewhat lower speed and at greatly reduced cost. The Incyte Genomics computing staff estimates that its clustered Linux server farms (described later) do the work of supercomputers at one-hundredth of the cost.
The Linux kernel itself was slow to get started in clustering (see the section "SMP" later in this chapter), partly because Linus Torvalds did not have that many machines around the house. But one element of the Linux community went early and eagerly into distributed computing, the use of many different machines to solve a large problem. High-end computing problems tend to sort themselves into two types: one sort, such as cracking encryption, does not require interaction between the various machines as they work on their assigned portions of the work; the other, such as weather prediction, requires constant manipulation of interacting variables and works best if each processor can communicate directly with all the other processors. The second type of problem runs substantially slower on multiple machines unless some communication arrangements are made. The special hardware design of supercomputers achieved this intercommunication, and was responsible for much of their great expense.
Because Linux needed to start at the bottom of cluster computing, a problem involving parallel machines that simply received and executed centrally assigned problems was a natural starting point. It struck many Linux fans and other computing enthusiasts that cracking computer encryption keys would be an exciting development and demonstration ground for distributed computing. In early 1997 a group of hackers (in the good sense, remember!) organized an effort that was to become distributed.net (http://www.distributed.net/). The group supplied client pieces for a wide variety of computers (making this a heterogeneous distributed computing project, like the Titanic work described later), and invited users to install the clients on their machines to participate. The client software used the spare computing power of the local machines as available to do the work of grinding through the possible solutions. A central server organized the division of the work and the checking of results as they came in.
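The division of labor described above can be sketched in miniature. The following Python sketch is a toy, not the distributed.net client: the XOR "cipher," the tiny two-character keyspace, and all function names are illustrative assumptions. It shows the essential shape of the protocol, a central server splitting the keyspace into blocks that clients can search independently, with no communication between clients:

```python
from itertools import product
from string import ascii_lowercase

# Toy "encryption": XOR each byte of the plaintext with a repeating key.
# (Applying it twice with the same key decrypts.)
def encrypt(plaintext: bytes, key: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(plaintext))

def assign_work(keyspace, n_clients):
    """The central server's job: split the keyspace into per-client blocks."""
    blocks = [[] for _ in range(n_clients)]
    for i, key in enumerate(keyspace):
        blocks[i % n_clients].append(key)
    return blocks

def client_search(block, ciphertext, known_plaintext):
    """Each client tries only its own block; no inter-client talk is needed."""
    for key in block:
        if encrypt(ciphertext, key) == known_plaintext:
            return key
    return None

# Demo: a 2-character lowercase key, vastly smaller than a real 56-bit key.
keyspace = [bytes(k) for k in product(ascii_lowercase.encode(), repeat=2)]
secret = b"ok"
ciphertext = encrypt(b"attack at dawn", secret)

found = None
for block in assign_work(keyspace, n_clients=4):
    found = client_search(block, ciphertext, b"attack at dawn") or found
print(found)  # b'ok'
```

Because the blocks are independent, adding clients divides the search time almost linearly, which is exactly why this class of problem suited early Linux distributed computing so well.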
The first challenge was the cracking of a 56-bit key; this took 212 days in 1997. The group immediately took up the much more difficult task of breaking a 64-bit key; the task is still not finished, but the estimated time for it continues to drop as more powerful computers join the network and the overall project organization and software improve. The client software is Open Source, and anyone is free to submit improvements. In the most recent worldwide demonstration of this method, CS Communication & Systèmes (http://www.cie-signaux.fr/) offered a challenge for which distributed.net organized over 38,000 participants in 140 countries to crack the 56-bit key in only 62 days. In the meantime the Electronic Frontier Foundation (http://www.eff.org/) has demonstrated much faster results using heavy equipment dedicated entirely to key-cracking.
From an Open Source point of view, these demonstrations show a) the weakness of the preferred government encryption schemes, which at that time permitted the export of keys no longer than 56 bits, and only to friendly countries; and b) the power of ordinary people to rival the computing resources that governments can bring to code-cracking. The ability to harness dozens of different kinds of computers and operating systems by using open standards and Open Source software is yet another demonstration of the limits of the proprietary approach to serious computing.
The dislocations of widely available massive computer power are only beginning to be felt. It is now possible for ambitious secondary schools to collect old i486-based computers and wire them together to get supercomputing power, and the Chinese were not slow to buy up such machines in the U.S. and take them back to China to build their own supercomputers. The Stone Soup Supercomputer at Oak Ridge National Laboratory (http://stonesoup.esd.ornl.gov/) received its name because it cost nothing to build, being made up of donated obsolete computers that had been lying unused around offices at the site. The Beowulf software (discussed later in the section, "Beowulf") with which it was built makes it easy to add or substitute newer machines as they are donated; Stone Soup has increased its power with the addition of cast-off Alpha- and Pentium-based machines.
From hobbyists and scientific researchers the technology passed into the commercial world. The expensive film Titanic would have cost far more if its producer had not saved money by using computer animation to avoid hiring flocks of extras and building an enormous ship model and a tank to sink it in. Passengers on the ship, waves around the ship, and even the wind's rippling of the lifeboat covers were all depicted with computer-rendered animation. The Render Ranch, as the server farm was called, was built in 1996, and included over 100 Alpha machines running Linux and another 50-odd running NT, all coordinated with hundreds of SGI machines and wired together with 100Mbps Ethernet.
Beowulf (http://www.beowulf.org/) is the most famous of the Linux distributed computing systems, and its software, continually updated, is often shipped free with Linux distributions. It began at Greenbelt, Maryland in the laboratories of a NASA contractor, the Center of Excellence in Space Data and Information Sciences (CESDIS) in 1994 as a 16-node cluster. Its design solves one of the problems inherent in regular supercomputers, whose hardware, being designed first, must always race ahead of the software that runs on it. Beowulf software is designed to stay ahead of hardware improvements; as newer hardware is added it is simply plugged into the system. One method that Beowulf uses to increase the speed of communication within the cluster is to use multiple Ethernets.
An important aspect of cost-cutting and ease-of-use in Linux cluster supercomputing is the use of Commercial Off-the-Shelf components (COTS). Their use was a focus of the LoBoS (Lots of Boxes on Shelves) project at the National Institutes of Health (http://www.lobos.nih.gov/) and of the Los Lobos project newly announced by IBM and the University of New Mexico. Los Lobos will cluster 64 servers to reach a processing speed of 375 GigaFLOPS, which should make it number 24 on the list of the world's 500 fastest supercomputers (http://www.top500.org/), where Linux supercomputers regularly appear. The use of network switches limits the number of computers that can be directly interconnected to 64, but other means, such as clusters of clusters, can join more.
Incyte Genomics, Inc. (http://www.incyte.com/) supposedly has the largest farm of Linux servers, with some 2000 processors linked together for genome mapping. It had been using 4-processor Alpha 4100 machines costing $140,000 each, and decided to save money and increase computing power by switching to Pentium II dual-processor machines at $5,000 each. The company found it could pay a tenth the price in equipment and still obtain the same computing power. The large Alpha machines are still used, now to manage the server farm. The PC equipment may not be as reliable as the more expensive hardware, but the entire system runs without shutdown, at a lower cost, tended by a single system administrator; porting an application from UNIX to Linux requires very little time. The project is so successful that Incyte plans to keep expanding it.
These savings spur the growing adoption not only of COTS supercomputer systems, but also of Open Source software for other operations on a massive scale. A large Web site is not actually a supercomputer, but it does involve linking large numbers of hosting machines. A large hosting site for Web pornography, Cave Creek, figures that a $1500 AMD server running BSD on a K-6 processor will deliver 90 percent of the computing power of a $25,000 server from Sun. As in supercomputing, the superior performance of the larger machines cannot justify the large difference in price.
The wide use of Intel chips should not discourage you if you own or prefer the PowerPC processor. Terra Soft Solutions (http://www.terrasoftsolutions.com/), the makers of Yellow Dog Linux for Apple and IBM PowerPC systems, also produces Black Lab Linux (http://www.blacklablinux.com/), both for workstations and for clustering Apple PowerPC G4 chips. It also is possible to buy Linux supercomputing systems as ready-to-run Alpha clusters (http://www.compaq.com/solutions/customsystems/hps/linux.html), and Mission Critical Linux (http://www.missioncriticallinux.com/) supplies enterprise-level Linux systems and support, including on SGI hardware. Intel, however, is the focus of new SGI technology, for the company is developing clustering for the forthcoming Intel Itanium 64-bit processor (IA-64). The IA-64 Linux Project (formerly called the Trillian Project, http://linuxia64.org/) was begun by VA Linux to ensure that Linux will be running on the Itanium when it appears. The effort is supported by the major Linux distributions, but understandably not by Corel, whose specialty is desktop Linux. Indeed, there are signs that Linux will be the first operating system ready to run on the new high-powered chip, ahead of Sun and Microsoft. Hewlett-Packard is a member both of the IA-64 project and of the Monterey Project, in which it combines with SCO and IBM to produce a 64-bit UNIX for the Itanium.
The KLAT2 parallel supercomputer at the University of Kentucky was built in April 2000 and uses 64 Athlon chips running at 700MHz. It arranges the connections between the chips in a new configuration called a "flat neighborhood" network. The project cost about $41,000, and its builders point out that its cost of $650 per GFLOPS (a billion floating-point operations per second) is well below a typical Beowulf project cost of $3,000 per GFLOPS. KLAT2 can work at more than 64 GFLOPS. The operating system is Linux, and the source code for the networking will be made Open Source.
Don’t think that Beowulf Linux systems are confined to users with only a few hundred thousand dollars to spend. The NOAA Forecast Systems Laboratory (FSL) is installing one of the fastest computing systems in the world at a cost of $15 million over three years (http://www-ad.fsl.noaa.gov/ac/HPCS/text0.htm). The first phase of Jet consists of 256 Alpha nodes, stock items supplied by Compaq, and running lightly modified Red Hat Linux for Alpha. Because weather prediction is one of those problems that require each processor to talk to each of the others, the networking is handled by Myrinet from Myricom (http://www.myri.com). High Performance Technologies (http://www.hpti.com/) is organizing the entire project. In late 2000 FSL will double the number of nodes, and two years after that the 512 nodes will be replaced with newer processors and the number of nodes doubled again to 1024. By then it will be processing at a rate of about 4 TeraFLOPS (some four trillion computations per second). Even in its initial phase with only a twelfth of its eventual power on tap, the Jet supercomputer will be twenty times faster than the fastest current computer at the Laboratory.
Symmetric Multi-Processing (SMP) puts multiple processors in a single server and has them share a single operating system, memory, and input/output devices (the "backplane"). Linux has not shown itself to be particularly strong here, managing only 4 processors in 1999's version 2.2, but large improvements should appear with version 2.4, including 8-processor support. One difficulty with SMP, however, is that if a single processor goes down, the whole system fails. This brings us to the question of availability.
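The essential idea of SMP, one operating system image scheduling independent work onto however many processors the box contains, can be illustrated from user space. This Python sketch is only an analogy (the chunking scheme and names are illustrative assumptions, not kernel code): it splits a CPU-bound sum into one chunk per available processor and lets the OS schedule the workers across them:

```python
import os
from concurrent.futures import ProcessPoolExecutor

def partial_sum(bounds):
    """CPU-bound work for one processor: sum one slice of the range."""
    lo, hi = bounds
    return sum(range(lo, hi))

if __name__ == "__main__":
    n_cpus = os.cpu_count() or 1   # processors the single OS image manages
    N = 1_000_000
    step = N // n_cpus
    # One contiguous chunk per processor; the last absorbs the remainder.
    chunks = [(i * step, (i + 1) * step) for i in range(n_cpus)]
    chunks[-1] = (chunks[-1][0], N)
    with ProcessPoolExecutor(max_workers=n_cpus) as pool:
        total = sum(pool.map(partial_sum, chunks))
    print(total == N * (N - 1) // 2)  # True
```

The analogy also shows SMP's fragility as the author describes it: the chunks share one result, so losing any worker (or the single machine they all run on) loses the whole computation.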
It is not enough that high-end computers crunch the numbers; for important projects and global businesses to run on them they have to have high availability. In the commercial world, Tandem supplies the famous fault-tolerant computers that have no single point of failure. BSD is the most robust of the Open Source systems, but Linux is improving, and commercial systems that overlay it with high-availability features are becoming available. This is a large market; while high-end high-availability systems are expected to sell only 57,000 units this year, low-end (Linux and SCO) solutions are projected at 1 million units, and 40 percent of those will be in clusters.
High-availability technology includes fault-tolerance. It is not enough to build in redundancy and to avoid single points of failure; the systems must provide for failover, the switching of work to running components when a component fails. Even load-balancing, typically handled by a single front-end server to the cluster (such as in the TurboLinux proprietary solution), must be distributed so that the failure of an entire server (let alone a component of a server) cannot halt the system. Linux is still weak in this regard, but at least one company supplies a commercial system that provides fault tolerance. RSF-1 from High-Availability.com (http://www.high-availability.com) is proprietary (although free for non-commercial use); it runs on Solaris (SPARC/Intel), AIX, NT, SGI, Linux, FreeBSD, and HP-UX. The Linux Community is working to build an Open Source High-Availability Linux (http://linux-ha.org/).
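Failover itself is conceptually simple. The sketch below is a toy model with assumed class and function names, not RSF-1 or the Linux-HA code; it shows the heartbeat idea those systems build on: each node periodically reports that it is alive, and when the primary misses its heartbeat window, a standby is promoted so work continues:

```python
import time

class Node:
    """Hypothetical cluster node that records a heartbeat timestamp."""
    def __init__(self, name):
        self.name = name
        self.last_heartbeat = time.monotonic()
        self.alive = True

    def beat(self):
        self.last_heartbeat = time.monotonic()

def failover(nodes, primary, timeout=2.0, now=None):
    """If the primary has missed its heartbeat window, promote a standby."""
    now = time.monotonic() if now is None else now
    if now - primary.last_heartbeat <= timeout:
        return primary                      # primary is still healthy
    primary.alive = False                   # declare it dead...
    for node in nodes:                      # ...and hand work to a live standby
        if node.alive and node is not primary:
            return node
    raise RuntimeError("no live nodes: total cluster failure")

# Demo: the primary stops beating, so work fails over to the standby.
primary, standby = Node("a"), Node("b")
standby.beat()
new_primary = failover([primary, standby], primary,
                       timeout=2.0, now=primary.last_heartbeat + 5.0)
print(new_primary.name)  # b
```

Real systems must also guard against the monitor itself failing (hence the insistence above on distributing even the load-balancing front end), which this single-monitor toy does not attempt.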
One approach to high availability is the Virtual Server, a cluster arrangement that appears to the user as a single machine. This is done by having one machine balance the load among the others, with software that detects and routes around failures and permits the easy addition of nodes to the cluster. The network connection among the machines may be a local area network (LAN) or a wide area network (WAN), so the Virtual Server can be geographically dispersed. The Linux Virtual Server project (http://www.LinuxVirtualServer.org/) is directed from China, but has considerable help from American and European developers.
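A minimal model of such a front end, using hypothetical names and a plain round-robin policy, might look like the following. (The actual Linux Virtual Server works at the kernel networking layer and supports several scheduling algorithms; this sketch only shows the dispatch-around-failures idea.)

```python
from itertools import cycle

class VirtualServer:
    """One front-end address dispatching requests to a pool of real servers."""
    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        self.healthy.discard(backend)    # detected failure: route around it

    def add_node(self, backend):
        self.backends.append(backend)    # easy addition of nodes
        self.healthy.add(backend)
        self._ring = cycle(self.backends)

    def dispatch(self):
        # Try each backend at most once per request before giving up.
        for _ in range(len(self.backends)):
            b = next(self._ring)
            if b in self.healthy:
                return b
        raise RuntimeError("all backends down")

vs = VirtualServer(["node1", "node2", "node3"])
vs.mark_down("node2")
print([vs.dispatch() for _ in range(4)])  # ['node1', 'node3', 'node1', 'node3']
```

To the client every request still goes to the one virtual address; the failure of node2 changes only which real machine answers.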
For those who have been waiting for Linux to reach the mainframe itself, the day is already here. The original IBM mainframe port, "Bigfoot" for the S/370, was a Community project; there is also a version for the S/390 (http://linas.org/linux/i370.html), downloadable from http://oss.software.ibm.com/developerworks/opensource/linux390/index.html.
One strength of the S/390 is its ability to run multiple virtual machines. A single mainframe, for instance, could easily run 5,000 Linux virtual machines to serve that many users; testing has shown that over 41,000 Linux virtual machines can be run on a single S/390! This approach, however, is hardly cost-effective compared with using smaller machines. IBM is using Linux to keep its older machines, their data, and their users from being stranded. The S/390 can simultaneously run an older operating system and its applications while also running new operating systems and applications such as Linux. This practice will enable smooth transitioning from old systems to new systems, and also enable spare processor power to be put to use.
The long-range view of IBM may be that the S/390 running Linux will be able to outrun mainframe competitors such as the Sun Enterprise 10000. Sun uses 64 processors to get high performance from this machine, but the Linux kernel, optimized for a single processor and run on the S/390, may very well beat the Sun machine. After all, Linux running on a Sun SPARC chip is reported to outperform Solaris on the same chip.
Actual deployment of Linux on the S/390 so far has been to replace NT servers. The Oklahoma Department of Corrections, for instance, intends to replace 40 such servers with Linux/390; by using existing resources this will save considerably in system hardware (which would need upgrading to Windows 2000), floor space, and system administration.
Cluster computing is only one of a number of technologies that must be tamed for Open Source software to reach enterprise-level computing. At the present time the most serious piece missing appears to be On-Line Transaction Processing (OLTP), the ability to perform not just enormous numbers of transactions per minute, but to propagate the results through an entire system — airline reservation systems are an example of this sort of computing. With time and the efforts of the large database firms that are embracing Linux, it is possible that OLTP will eventually become a Linux function.
The rapid progress of Linux over the past year came from two salutary shocks delivered in April 1999 by the D.H. Brown Associates report on Linux (http://www.dhbrown.com/dhbrown/linux.html) and by the Mindcraft benchmarking of Linux and NT sponsored by Microsoft (see Chapter 6, "The GNU GPL and the Open Source Definition"). Linux is now taking SMP seriously, and the Brown criticism concerning the lack of a good file system will soon be met now that SGI has released the Linux source code for its XFS file system. High-end companies watching the rise of Linux like to comfort themselves with the thought that no real innovations can come out of Open Source, but only imitations of continually evolving and superior high-end technology. In this view, Linux is forever doomed to play the dog chasing the tail lights. The Brown report was skeptical that Linux could ever reach the top tier because no one would pay for all the needed improvements. The report did suggest that Intel might have a motivation to do so, since the sale of Intel chips as opposed to mainframe chips would increase. For that matter, why not Compaq as well? When we consider that Linux users and the Open Source GNUPro developers may be the only welcoming committee when Itanium ships, Linux supporters can think a few comforting thoughts themselves. . . . And for now, Linux doesn’t have to beat AIX or Solaris; all it has to do is beat Windows.
Next chapter: The Secret Battlefield: Embedded Systems
Copyright © 2000 by IDG Books Worldwide, Inc. Published under IDG Books Worldwide, Inc. Open Content License.