COMPUTING

Taking parallel processing to a new level

August 9, 1999
Web posted at: 11:16 a.m. EDT (1516 GMT)

by Rawn Shah

From...
Windows TechEdge

(IDG) -- Ever wonder just how much processing power your data warehousing and mining products are going to need once your database starts growing into the tens and hundreds of terabytes? How could you possibly afford a server that can process that much information quickly and efficiently -- without it keeling over from exhaustion?

The server vendors hope you will continue to buy larger and larger servers to handle this gargantuan task, but that's just throwing brute force (and a lot of money) at the problem. Even parallel database server products are starting to hit limits. Still, they do have the right idea: work smarter, not harder.

A new generation of products based on distributed object processing promises to solve this problem. Microsoft has revealed that it is working on a component load balancer that distributes ActiveX/COM+ object processing across multiple servers; its Millennium project, a future OS still in development, is built around this architecture as well.

These products spread the processing across many servers, assigning each a subset of the work, in a method known as parallel processing. Rather than using expensive specialized computers, they use a large cluster of small, ordinary servers, each running its own operating system, to take a number of jobs, process them, and send the output to the primary system.
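The split-process-combine pattern described above can be sketched in a few lines of Python. This is only an illustration: threads stand in for the separate servers, and the function names and the sum-of-squares workload are invented for the example rather than taken from any product mentioned here.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_subsets(items, worker_count):
    """Deal items round-robin into worker_count subsets."""
    return [items[i::worker_count] for i in range(worker_count)]


def process_subset(subset):
    """Stand-in for the real work a node would do; here, a sum of squares."""
    return sum(x * x for x in subset)


def run_on_cluster(items, worker_count=4):
    """Farm subsets out to workers, then combine results on the 'primary'."""
    subsets = split_into_subsets(items, worker_count)
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        partial_results = list(pool.map(process_subset, subsets))
    return sum(partial_results)


if __name__ == "__main__":
    print(run_on_cluster(list(range(1000))))
```

Each worker sees only its own subset, and only the small partial results travel back to the primary system, which is what keeps the approach cheap to scale.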

This is where cluster computing is heading. The ability to handle a large task in small bits, or lots and lots of small tasks across an entire cluster, makes an entire system more affordable and more scalable. Or, in the case of our opening example, it takes a Web-accessible, 100-terabyte database out of the realm of fantasy and makes it a real-world possibility.

You may think NT is going to get left behind by Unix when it comes to distributed objects and clustering; one group is already working to prove you wrong.

The National Center for Supercomputing Applications (NCSA) -- the same group that released the original Mosaic Web browser -- has created just such a distributed application processing system based on Windows COM+. To support this distributed computing environment, the group has created the NT Supercluster system, possibly the largest NT cluster of any kind.

The Supercluster works

The NT Supercluster is based on a model known as the cluster of workstations. It uses a large number of smaller workstation-class machines that are connected by high-speed network interfaces. The Supercluster also runs some software that distributes jobs in parallel across the cluster.

This same model was first used with Unix systems, and has been implemented in several well-known products, such as Beowulf (Linux) and Inktomi Traffic Server (Solaris). In fact, one of the largest search engines on the Web, HotBot, uses the Inktomi product. The use of workstations running in parallel grew out of many companies' desire to use machines cheaper than traditional servers, although low-end servers pretty much match the price point of workstations in the Wintel market.

The NT Supercluster uses a combination of three classes of workstations:

  • 64 HP Kayak XU PCs with dual 550 MHz Pentium III Xeons, with 1 GB RAM and 7.5 GB storage

  • 64 HP Kayak XU PCs with dual 300 MHz Pentium IIs, with 512 MB RAM and 3 GB storage

  • 32 Compaq Professional Workstation 6000 PCs with dual 333 MHz Pentium IIs, with 512 MB RAM and 7 GB storage

The Kayak groups are connected with proprietary Myrinet 1.28 Gbps interfaces, and the Compaq workstations have 100 Mbps Fast Ethernet connections. Myricom's Myrinet high-speed network interface is commonly used as a cluster interconnect because it not only provides link speeds in excess of 1 Gbps directly between components, but also has a latency -- the time it takes to deliver data from one end to the other -- of about 5 microseconds. This figure is quite low; compare it with the latency of several hundred microseconds typical of LAN interfaces such as Ethernet.
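To see why latency matters as much as link speed, a rough model helps: delivery time is the fixed latency plus the time the bits spend on the wire. The sketch below uses the 5-microsecond, 1.28 Gbps, and 100 Mbps figures quoted above. The 300-microsecond Ethernet latency is an assumed stand-in for "several hundred," and the model ignores switch hops and protocol overhead.

```python
def transfer_time_us(message_bytes, latency_us, link_bits_per_second):
    """Approximate one-way delivery time: fixed latency plus time on the wire."""
    wire_time_us = message_bytes * 8 / link_bits_per_second * 1_000_000
    return latency_us + wire_time_us


# Latency and link speed for Myrinet are the figures quoted in the article.
MYRINET = {"latency_us": 5, "link_bits_per_second": 1.28e9}
# 300 microseconds is an assumed value for "several hundred microseconds".
FAST_ETHERNET = {"latency_us": 300, "link_bits_per_second": 100e6}

if __name__ == "__main__":
    for size in (64, 1_000_000):
        myrinet = transfer_time_us(size, **MYRINET)
        ethernet = transfer_time_us(size, **FAST_ETHERNET)
        print(f"{size} bytes: Myrinet {myrinet:.1f} us, Ethernet {ethernet:.1f} us")
```

For the small messages that tightly coupled parallel jobs exchange constantly, the fixed latency dominates the total, so the gap between the two interconnects is far wider than the ratio of their link speeds alone would suggest.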

Each HP system has a Myrinet interface that is connected to a 16-port Myrinet switch, making the total latency between any two nodes around 15 microseconds. The HP nodes perform all the parallel processing work assigned to them.

The Compaq systems are used as batch job systems for applications which cannot execute in parallel. Each Compaq workstation thus runs either a single-threaded application or a serial application within its own system. Since a serial application does not need to communicate with others until it is done, there is no real need for a high-speed interconnect in the Compaq cluster.

All nodes are connected to a number of keyboard-video-mouse (KVM) switches from Raritan. These are hierarchically connected, allowing sysadmins to control any individual node from one or two consoles.

E pluribus unum

When all this equipment is added together, you have 160 machines and 320 processors of various models, currently the largest NT cluster of any kind publicly known. There are other NT-based clusters for Web sites out there that may be larger, but those servers are not as tightly coupled as the NCSA system's.

On top of the raw hardware and operating systems, you need software that controls the cluster nodes and communicates between them. NCSA's project relies on the High Performance Virtual Machine (HPVM) system, developed at the University of California in San Diego, for this purpose. This is the support system that actually builds the cluster. It provides several different programming interfaces to develop parallel applications, and features a Java-based front-end for cluster management. In fact, the NT Supercluster is the one and only implementation of the HPVM III, the largest model.

The HPVM system uses Platform Computing's LSF multicluster tool -- the same tool used in several Unix cluster solutions, such as those for Solaris -- as a distributed load-sharing facility and a job-scheduling system. Controlled by upper layers, LSF locates the available nodes within the cluster with the smallest loads, and assigns jobs accordingly. To run applications on the NT Supercluster, programmers have to use the Message Passing Interface (MPI), a very popular application programming interface for parallel processing.
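The idea of sending each job to the node with the smallest load can be shown with a short sketch. This is not LSF's actual algorithm or interface; the function name, the node names, and the load units are hypothetical, chosen only to illustrate least-loaded assignment.

```python
import heapq


def assign_jobs(node_loads, job_costs):
    """Give each job to whichever node currently has the smallest load.

    node_loads: dict of node name -> current load (arbitrary units)
    job_costs:  list of (job name, cost) pairs, in arrival order
    Returns a dict of job name -> node name.
    """
    # A min-heap keeps the least-loaded node at the front.
    heap = [(load, name) for name, load in node_loads.items()]
    heapq.heapify(heap)
    placement = {}
    for job, cost in job_costs:
        load, name = heapq.heappop(heap)
        placement[job] = name
        # The chosen node is now busier; put it back with its new load.
        heapq.heappush(heap, (load + cost, name))
    return placement


if __name__ == "__main__":
    loads = {"n1": 5, "n2": 1, "n3": 3}
    jobs = [("a", 3), ("b", 1), ("c", 1)]
    print(assign_jobs(loads, jobs))
```

A real scheduler would also refresh its load figures as jobs finish; the sketch only tracks the load it adds itself.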

Symera, a distributed object server system

Unlike OS-based clusters, the NT Supercluster is a middleware cluster with server components distributed across the nodes to perform the processing. It does not provide services for integrating systems together for high availability, but rather focuses on how to distribute a job across many servers to be processed by each COM+ server component.

To work with COM+, NCSA developed Symera, a symbiotic extensible resource architecture. Symera is a distributed architecture for building COM+ servers. It is a management system at the COM+ object level that allocates cluster resources, schedules jobs, implements object migration, and observes and controls the execution of processes. By allocating resources and jobs, you can confine some of the processing to a select group within the cluster, while other nodes run other parallel computing jobs.

Object migration is very useful, since it frees object processing from being tied to a single server node and allows it to be moved to other nodes with lighter loads. This distributes the processing across the cluster more evenly, so that you can actually get near-linear scaling of processing power. This is one of the factors that distinguish a network operating system from a distributed network operating system.
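A toy model shows how migration evens out load. The sketch below is not Symera's implementation: the data layout and function names are invented, and each object is reduced to a single positive number representing the work it costs its host node.

```python
def migrate_once(placements):
    """Move one object from the busiest node to the least busy node.

    placements: dict of node name -> list of positive object costs hosted there.
    Returns True if an object was moved, False if no move would help.
    """
    def load(node):
        return sum(placements[node])

    busiest = max(placements, key=load)
    idlest = min(placements, key=load)
    gap = load(busiest) - load(idlest)
    # Only an object smaller than the gap narrows it when moved.
    candidates = [cost for cost in placements[busiest] if cost < gap]
    if not candidates:
        return False
    # Prefer the object whose cost is closest to half the gap.
    moved = min(candidates, key=lambda cost: abs(gap - 2 * cost))
    placements[busiest].remove(moved)
    placements[idlest].append(moved)
    return True


def rebalance(placements):
    """Repeat single migrations until none helps; return the number of moves."""
    moves = 0
    while migrate_once(placements):
        moves += 1
    return moves


if __name__ == "__main__":
    cluster = {"a": [4, 3, 2, 1], "b": [], "c": [2]}
    print(rebalance(cluster), cluster)
```

Because every move shifts work from a heavier node to a lighter one without overshooting, the loads converge rather than oscillate.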

Symera also includes application libraries that implement all these features in such a way that they can be controlled from the management system. The libraries are designed so that they will scale appropriately according to the number of nodes used for processing.

For a product comparable to this middleware cluster, think of Enterprise JavaBeans (EJB). This Java-based technology defines a system that creates pools of server objects for processing incoming requests. EJB servers, such as BEA/Weblogic Tengah, allow Java programmers to create scalable server applications that run on multiprocessor node systems. Now take this idea and scale it to a large cluster environment, and use COM+ instead of Java, and you have an idea of what Symera does.
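The pooling idea behind this comparison can be illustrated generically. The sketch below is neither the EJB API nor Symera's; the class names are hypothetical, and it shows only the core pattern of a fixed set of server objects being borrowed for each request and then returned.

```python
import queue


class ServerObject:
    """Stand-in for a pooled server component."""

    def __init__(self, ident):
        self.ident = ident
        self.handled = 0

    def handle(self, request):
        self.handled += 1
        return f"object {self.ident} handled {request}"


class ObjectPool:
    """Fixed-size pool: borrow an idle object per request, then return it."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for ident in range(size):
            self._idle.put(ServerObject(ident))

    def serve(self, request):
        obj = self._idle.get()  # blocks if every object is busy
        try:
            return obj.handle(request)
        finally:
            self._idle.put(obj)  # hand the object back for reuse


if __name__ == "__main__":
    pool = ObjectPool(2)
    for i in range(3):
        print(pool.serve(f"req{i}"))
```

In a cluster setting, the pooled objects would live on different nodes instead of in one process, but the borrow-and-return cycle is the same.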

Build your own hot rod

Projects such as Symera and the NCSA NT Supercluster deserve prominence and praise as genuine achievements, but application vendors have largely ignored them. Unfortunately, just because you have a breakthrough technology does not mean it will be a commercial success.

These two projects were launched to examine the lengths to which NT and COM+ clustering can go, and have proven that NT can scale. They both certainly caught Microsoft's eye. According to Jae Allen, assistant director of NCSA's grid division, under which the projects fall, Microsoft has given gift money to support the projects for the past two and a half years, and Redmond researchers have had technical discussions with their counterparts at NCSA.

Microsoft's internal Millennium project is conceptually, though not physically, similar. This is still a research operating system and may become a high-end COM+ clustering product, but not for several years.

In the meantime, with time and money, it is possible to build your own COM+ cluster using HPVM, hardware, and the Symera system. Both HPVM and Symera are publicly available. HPVM does use a commercial product, Platform's LSF tool, which requires that you purchase a license.

The following is a quick estimate of what it would cost to build a system on a similar scale to the NCSA NT Supercluster, based on average market costs for products. Assume each parallel node (configured like the HPs) costs about $6,000, and each serial node about $3,000. A Myrinet interface card runs about $1,000, and a 16-port switch runs about $6,000. Each 16-port KVM switch should cost about $1,000.

Item               Cost            Total
Parallel nodes     128 x $6,000    $768,000
Serial nodes       32 x $3,000     $96,000
Myrinet cards      128 x $1,000    $128,000
Myrinet switches   8 x $6,000      $48,000
KVM switches       8 x $1,000      $8,000
Grand total                        $1,048,000

The grand total comes to just over a million dollars, plus additional licensing costs for LSF (which were not available to us), and a few odd extras. Given that most supercomputers and parallel computers start at around a million dollars for an average system, this is a fairly good deal, especially since it relies only on dual-processor systems. Using quad-processor Intel Xeons would run an extra $8,000 per node, but would also theoretically double the overall power.
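The arithmetic behind the estimate is easy to check or rerun with your own prices. The quantities and unit prices below are the ones quoted above. Applying the quad-processor upgrade only to the 128 parallel nodes is an assumption, since the estimate does not say which nodes would be upgraded.

```python
# (item, quantity, unit price in dollars), as quoted in the estimate
LINE_ITEMS = [
    ("Parallel nodes", 128, 6_000),
    ("Serial nodes", 32, 3_000),
    ("Myrinet cards", 128, 1_000),
    ("Myrinet switches", 8, 6_000),
    ("KVM switches", 8, 1_000),
]


def total_cost(items):
    """Sum quantity times unit price over all line items."""
    return sum(quantity * price for _, quantity, price in items)


def quad_upgrade_cost(parallel_nodes=128, extra_per_node=8_000):
    """Extra cost of quad-processor nodes (assumed: parallel nodes only)."""
    return parallel_nodes * extra_per_node


if __name__ == "__main__":
    base = total_cost(LINE_ITEMS)
    print(f"Base hardware: ${base:,}")
    print(f"With quad-processor upgrade: ${base + quad_upgrade_cost():,}")
```

Under that assumption, the upgrade adds $1,024,000, roughly doubling the hardware bill for a theoretical doubling of processing power.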

Most people won't be constructing such huge clusters, unless they happen to be planning to build and test nuclear weapons. Even a 16-node cluster can do wonders for data mining or other jobs that require large-scale processing. You could use such a cluster to maintain an auction Web site or even simulate the stock market.

The NCSA Symera and NT Supercluster projects are proof that NT can achieve scales of clustering once thought to be only in the domain of Unix servers.

Rawn Shah helped create Windows TechEdge. He has written on NT since 1993 for a number of top publications, and has managed systems and developed software in both Unix and Windows environments. He can be reached at rawn.shah@windowstechedge.com.

