Taking parallel processing to a new level
(IDG) -- Ever wonder just how much processing power your data warehousing and mining products are going to need once your database starts growing into the tens and hundreds of terabytes? How could you possibly afford a server that can process that much information quickly and efficiently -- without it keeling over from exhaustion?
The server vendors hope you will continue to buy larger and larger servers to handle this gargantuan task, but that's just throwing brute force (and a lot of money) at the problem. Even parallel database server products are starting to hit limits. Still, they do have the right idea: work smarter, not harder.
A new generation of upcoming products based on distributed object processing will solve this problem. Microsoft has revealed that it is working on a component load balancer product which involves distributed ActiveX/COM+ object processing on multiple servers; its Millennium project, a future OS still in the development phase, is built around this architecture as well.
These products spread the processing across many servers, assigning each a subset of the work, in a method known as parallel processing. Rather than using expensive specialized computers, they use a large cluster of small, ordinary servers, each running its own operating system, to take a number of jobs, process them, and send the output to the primary system. This is where cluster computing is heading. The ability to handle a large task in small bits, or lots and lots of small tasks across an entire cluster, makes an entire system more affordable and more scalable. Or, in the case of our opening example, it takes a Web-accessible, 100-terabyte database out of the realm of fantasy and makes it a real-world possibility.
You may think NT is going to get left behind by Unix when it comes to distributed objects and clustering; one group is already working to prove you wrong. The National Center for Supercomputing Applications (NCSA) -- the same group that released the original Mosaic Web browser -- has created just such a distributed application processing system based on Windows COM+. To support this distributed computing environment, the group has created the NT Supercluster system, possibly the largest NT cluster of any kind.
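The scatter-and-gather idea described above (split a job, hand each server a subset, send the output back to the primary system) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: threads stand in for the cluster's servers, a simple sum stands in for the real work, and the names `split_work`, `process_chunk`, and `run_on_cluster` are hypothetical, not part of any product mentioned here.

```python
from concurrent.futures import ThreadPoolExecutor


def split_work(items, workers):
    """Divide items into at most `workers` roughly equal, contiguous chunks."""
    size, extra = divmod(len(items), workers)
    chunks, start = [], 0
    for i in range(workers):
        end = start + size + (1 if i < extra else 0)
        if end > start:              # skip empty chunks when items < workers
            chunks.append(items[start:end])
        start = end
    return chunks


def process_chunk(chunk):
    """Stand-in for the work one node does on its subset (here: a sum)."""
    return sum(chunk)


def run_on_cluster(items, workers=4):
    """Scatter chunks to workers, then gather partial results on the 'primary'."""
    chunks = split_work(items, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)             # the primary combines the partial outputs


if __name__ == "__main__":
    print(run_on_cluster(list(range(1, 101)), workers=4))
```

The primary only ever sees the small partial results, which is what lets the approach scale out: adding workers shrinks each chunk without changing the combining step.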
The Supercluster works
The NT Supercluster is based on a model known as the cluster of workstations. It uses a large number of smaller workstation-class machines that are connected by high-speed network interfaces. The Supercluster also runs some software that distributes jobs in parallel across the cluster. This same model was first used with Unix systems, and has been implemented in several well-known products, such as Beowulf (Linux) and Inktomi Traffic Server (Solaris). In fact, one of the largest search engines on the Web, HotBot, uses the Inktomi product.
The use of workstations running in parallel grew out of many companies' desire to use machines cheaper than traditional servers, although low-end servers pretty much match the price point of workstations in the Wintel market. The NT Supercluster uses a combination of three classes of workstations:
The Kayak groups are connected with proprietary Myrinet 1.28 Gbps interfaces, and the Compaq workstations have 100 Mbps Fast Ethernet connections. Myricom's Myrinet high-speed network interface is commonly used as a cluster interconnect because it not only provides link speeds in excess of 1 Gbps directly between components, but also has a latency -- the time it takes to deliver data from one end to the other -- of about 5 microseconds. This figure is quite low compared with the several-hundred-microsecond latency of LAN interfaces such as Ethernet. Each HP system has a Myrinet interface that is connected to a 16-port Myrinet switch, making the total latency between any two nodes around 15 microseconds.
The HP nodes perform all the parallel processing work assigned to them. The Compaq systems are used as batch job systems for applications that cannot execute in parallel. Each Compaq workstation thus runs either a single-threaded application or a serial application within its own system. Since a serial application does not need to communicate with others until it is done, there is no real need for a high-speed interconnect in the Compaq cluster.
All nodes are connected to a number of keyboard-video-mouse (KVM) switches from Raritan. These are hierarchically connected, allowing sysadmins to control any individual node from one or two consoles.
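To see why latency matters as much as link speed, a rough model is delivery time = latency + message size / bandwidth. The sketch below applies that model to the figures quoted above. It is a back-of-the-envelope estimate, not a benchmark: it ignores protocol overhead and contention, and the 300-microsecond Ethernet latency is an assumed stand-in for "several hundred". The function name `transfer_time_us` is hypothetical.

```python
def transfer_time_us(size_bytes, latency_us, bandwidth_bps):
    """Estimated one-way delivery time in microseconds: latency plus wire time."""
    wire_time_us = size_bytes * 8 / bandwidth_bps * 1_000_000
    return latency_us + wire_time_us


# Figures from the text: Myrinet at 1.28 Gbps, about 15 us node to node.
MYRINET = {"latency_us": 15.0, "bandwidth_bps": 1.28e9}
# Fast Ethernet at 100 Mbps; 300 us is an ASSUMED value for "several hundred".
FAST_ETHERNET = {"latency_us": 300.0, "bandwidth_bps": 100e6}

if __name__ == "__main__":
    for size in (64, 1024, 1_000_000):
        m = transfer_time_us(size, **MYRINET)
        e = transfer_time_us(size, **FAST_ETHERNET)
        print(f"{size:>9} bytes: Myrinet {m:10.1f} us, Fast Ethernet {e:10.1f} us")
```

Under this model a 1 KB message is dominated by latency on both networks, which is why parallel applications that exchange many small messages gain the most from the faster interconnect, while serial batch jobs gain little.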
E pluribus unum
When all this equipment is added together, you have 160 machines and 320 processors of various models, currently the largest NT cluster of any kind publicly known. There are other NT-based clusters for Web sites out there that may be larger, but those servers are not as tightly coupled as the NCSA system's.
On top of the raw hardware and operating systems, you need software that controls the cluster nodes and communicates between them. NCSA's project relies on the High Performance Virtual Machine (HPVM) system, developed at the University of California in San Diego, for this purpose. This is the support system that actually builds the cluster. It provides several different programming interfaces to develop parallel applications, and features a Java-based front-end for cluster management. In fact, the NT Supercluster is the one and only implementation of the HPVM III, the largest model.
The HPVM system uses Platform Computing's LSF multicluster tool -- the same tool used in several Unix cluster solutions, like Solaris's -- as a distributed load-sharing facility and a job-scheduling system. Controlled by upper layers, LSF locates the available nodes within the cluster with the smallest loads, and assigns jobs accordingly. To run applications on the NT Supercluster, programmers have to use the message passing interface (MPI), a very popular application programming interface for parallel processing.
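The "find the node with the smallest load and give it the job" policy can be illustrated with a simple greedy scheduler. This is a sketch of the general idea only, not LSF's actual algorithm or API: the function `assign_jobs`, the node names, and the load units are all hypothetical.

```python
import heapq


def assign_jobs(node_loads, job_costs):
    """Assign each job to the node with the smallest current load.

    node_loads: dict mapping node name -> current load (arbitrary units)
    job_costs:  list of (job_name, cost) pairs, handled in the order given
    Returns a dict mapping job_name -> node name.
    """
    heap = [(load, name) for name, load in node_loads.items()]
    heapq.heapify(heap)
    placement = {}
    for job, cost in job_costs:
        load, name = heapq.heappop(heap)            # least-loaded node right now
        placement[job] = name
        heapq.heappush(heap, (load + cost, name))   # that node is busier afterwards
    return placement


if __name__ == "__main__":
    nodes = {"n1": 5, "n2": 1, "n3": 3}
    jobs = [("a", 4), ("b", 1), ("c", 2)]
    print(assign_jobs(nodes, jobs))
```

A min-heap keeps the least-loaded node at the front, so each placement costs O(log n) even on a cluster with hundreds of nodes. A real scheduler would also refresh load figures as jobs finish, which this sketch leaves out.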
Symera, a distributed object server system
Unlike OS-based clusters, the NT Supercluster is a middleware cluster with server components distributed across the nodes to perform the processing. It does not provide services for integrating systems together for high availability, but rather focuses on how to distribute a job across many servers to be processed by each COM+ server component.
To work with COM+, NCSA developed Symera, a symbiotic extensible resource architecture. Symera is a distributed architecture for building COM+ servers. It is a management system at the COM+ object level that allocates cluster resources, schedules jobs, implements object migration, and observes and controls the execution of processes.
By allocating resources and jobs, you can confine some of the processing to a select group within the cluster, while other nodes run other parallel computing jobs. Object migration is very useful, since it frees object processing from being tied to a single server node and allows it to be moved to other nodes with lighter loads. This distributes the processing across the cluster more evenly, so that you can actually get near-linear scaling of processing power. This is one of the factors that distinguish a network operating system from a distributed network operating system.
Symera also includes application libraries that implement all these features in such a way that they can be controlled from the management system. The libraries are designed so that they will scale appropriately according to the number of nodes used for processing.
For a product comparable to this middleware cluster, think of Enterprise JavaBeans (EJB). This Java-based technology defines a system that creates pools of server objects for processing incoming requests. EJB servers, such as BEA/Weblogic Tengah, allow Java programmers to create scalable server applications that run on multiprocessor node systems.
Now take this idea and scale it to a large cluster environment, and use COM+ instead of Java, and you have an idea of what Symera does.
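The load-evening effect of object migration can be shown with a toy rebalancer that repeatedly moves one object from the busiest node to the idlest one. This is a sketch of the concept only and makes no claim about how Symera decides what to move: the function `rebalance`, the node and object names, and the cost units are hypothetical, and a real system must also transfer each object's live state, which is ignored here.

```python
def rebalance(nodes):
    """Greedily migrate objects from the busiest node to the idlest one.

    nodes: dict mapping node name -> dict of {object_name: cost}
    Mutates `nodes` in place and returns a list of (object, source, target) moves.
    """
    if not nodes:
        return []

    def load(name):
        return sum(nodes[name].values())

    moves = []
    while True:
        busiest = max(nodes, key=load)
        idlest = min(nodes, key=load)
        gap = load(busiest) - load(idlest)
        # Moving an object narrows the gap only if its cost is below the gap.
        movable = {o: c for o, c in nodes[busiest].items() if 0 < c < gap}
        if not movable:
            return moves
        # Prefer the object whose cost is closest to half the gap.
        obj = min(movable, key=lambda o: abs(movable[o] - gap / 2))
        nodes[idlest][obj] = nodes[busiest].pop(obj)
        moves.append((obj, busiest, idlest))


if __name__ == "__main__":
    cluster = {"n1": {"a": 4, "b": 3, "c": 2}, "n2": {}}
    print(rebalance(cluster))
    print({name: sum(objs.values()) for name, objs in cluster.items()})
```

Each move strictly reduces the imbalance between the two nodes involved, so the loop always terminates. It stops once no single object can be moved without making things worse.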
Build your own hot rod
Interesting projects such as Symera and the NCSA NT Supercluster deserve prominence and recognition as genuine achievements, but application vendors have largely ignored them. Unfortunately, a breakthrough technology is not guaranteed to be a commercial success.
These two projects were launched to examine the lengths to which NT and COM+ clustering can go, and have proven that NT can scale. They both certainly caught Microsoft's eye. According to Jae Allen, assistant director of NCSA's grid division, under which the projects fall, Microsoft has given gift money to support the projects for the past two and a half years, and Redmond researchers have had technical discussions with their counterparts at NCSA.
Microsoft's internal Millennium project is conceptually, though not physically, similar. This is still a research operating system and may become a high-end COM+ clustering product, but not for several years. In the meantime, with time and money, it is possible to build your own COM+ cluster using HPVM, hardware, and the Symera system. Both HPVM and Symera are publicly available. HPVM does use a commercial product, Platform's LSF tool, which requires that you purchase a license.
The following is a quick estimate of what it would cost to build a system on a similar scale to the NCSA NT Supercluster, based on average market costs for products. Assume each parallel node configuration (like the HPs) costs about $6,000, and each serial node about $3,000. A Myrinet interface card runs about $1,000, and a 16-port switch runs about $6,000. Each 16-port KVM switch should cost about $1,000.
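For readers who want to rerun the arithmetic with their own node counts, the unit prices quoted above can be dropped into a small calculator. Treat this as a rough sketch under stated assumptions: the text gives only the 160-machine total, so the 128 parallel / 32 serial split used in the example is hypothetical, and the calculator assumes a single flat layer of switches with every port used for a node (no uplinks, cabling, or LSF licensing). The function name `cluster_cost` is likewise hypothetical.

```python
import math

# Unit prices in US dollars, as quoted in the text.
PARALLEL_NODE = 6_000
SERIAL_NODE = 3_000
MYRINET_CARD = 1_000
MYRINET_SWITCH = 6_000   # 16-port
KVM_SWITCH = 1_000       # 16-port
PORTS_PER_SWITCH = 16


def cluster_cost(parallel_nodes, serial_nodes):
    """Estimate hardware cost for a cluster with the given node counts."""
    total_nodes = parallel_nodes + serial_nodes
    # ASSUMPTION: one Myrinet card per parallel node, switches fully populated.
    myrinet_switches = math.ceil(parallel_nodes / PORTS_PER_SWITCH)
    # ASSUMPTION: every node hangs off a KVM port, single layer of KVM switches.
    kvm_switches = math.ceil(total_nodes / PORTS_PER_SWITCH)
    return (
        parallel_nodes * (PARALLEL_NODE + MYRINET_CARD)
        + serial_nodes * SERIAL_NODE
        + myrinet_switches * MYRINET_SWITCH
        + kvm_switches * KVM_SWITCH
    )


if __name__ == "__main__":
    # Hypothetical split of the 160 machines; not taken from the article.
    print(f"${cluster_cost(128, 32):,}")   # prints $1,050,000
    print(f"${cluster_cost(16, 0):,}")     # a small 16-node parallel cluster
```

Under these assumptions the nodes themselves dominate the bill; the interconnect and KVM switches together add well under a fifth of the total.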
The grand total comes to just over a million dollars, plus additional licensing costs for LSF (which were not available to us), and a few odd extras. Given that most supercomputers and parallel computers start at around a million dollars for an average system, this is a fairly good deal, especially since it relies only on dual-processor systems. Using quad-processor Intel Xeons would run an extra $8,000 per node, but would also theoretically double the overall power.
Most people won't be constructing such huge clusters, unless they happen to be planning to build and test nuclear weapons. Even a 16-node cluster can do wonders for data mining or other jobs that require large-scale processing. You could use such a cluster to maintain an auction Web site or even simulate the stock market.
The NCSA Symera and NT Supercluster projects are proof that NT can achieve scales of clustering once thought to be only in the domain of Unix servers.
Rawn Shah helped create Windows TechEdge. He has written on NT since 1993 for a number of top publications, and has managed systems and developed software in both Unix and Windows environments. He can be reached at rawn.shah@windowstechedge.com.
© 2001 Cable News Network. All Rights Reserved.