CNN.com
COMPUTING

From...
PC World

How to build computers that don't crash

March 15, 1999
Web posted at: 4:56 p.m. EST (2156 GMT)

by Stan Miastkowski

(IDG) -- Where do you want to reboot today?

You don't? Well, it's not exactly a secret that you may have no choice. Every desktop version of Windows locks up from time to time, necessitating the old "three-fingered salute" (Ctrl-Alt-Del) or a complete system reboot.

Most of us just live with this inconvenience, but that's not an option for servers running critical applications, such as those behind a nationwide telecommunications network, an assembly line on a factory floor, or life-critical medical equipment.

Highly reliable redundant computer hardware has been available for years, but creating truly "crashless" computers has required very expensive hardware, specially modified operating systems, and carefully designed proprietary applications. However, Texas Micro says it's developing a virtually crashless server that uses unmodified Windows NT Server software and off-the-shelf apps.

The company, which got its start in the eighties building ruggedized PCs for the rigors of oil exploration, says its servers will offer "Five-Nines" availability, meaning they'll be up and running 99.999 percent of the time -- which translates to a total of roughly 5 minutes of downtime per year.
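
The "Five-Nines" figure is simple arithmetic. As a minimal sketch (the function name and the 365-day year are assumptions made for illustration, not anything published by Texas Micro), the yearly downtime allowed at a given availability can be worked out like this:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year: 525,600 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime allowed at the given availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    levels = [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]
    for label, percent in levels:
        minutes = downtime_minutes_per_year(percent)
        print(f"{label}: {minutes:.2f} minutes per year")
```

At 99.999 percent the result is a little over 5 minutes a year; each "nine" removed multiplies the allowed downtime by ten.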

A three-fold path

Texas Micro, which intends to ship its servers by year end, is employing a three-tiered approach to reach its goal.

The first tier is the company's own high-reliability PC hardware, which features a ruggedized design and redundant components such as power supplies, and puts almost everything in individual hot-swappable modules that can be changed without powering down the machine.

Next is the Intelligent Platform Management Interface, a system management tool that combines hardware and software to automatically and continuously monitor server hardware. IPMI is designed to anticipate impending failures, as well as detect and recover from many failures as they occur.
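
To make "anticipate impending failures" concrete, here is a rough sketch of the kind of check a hardware monitor might run on a sensor such as a temperature probe. It does not use any real IPMI interface: the function name, the straight-line extrapolation, and the lookahead window are all assumptions chosen for illustration.

```python
from typing import List


def classify_sensor(readings: List[float], limit: float, lookahead: int = 5) -> str:
    """Classify a series of sensor readings (oldest first).

    Returns "failed" if the latest reading is at or over the limit,
    "warning" if a straight-line extrapolation of the last two readings
    reaches the limit within `lookahead` more samples, and "ok" otherwise.
    """
    if not readings:
        raise ValueError("need at least one reading")
    latest = readings[-1]
    if latest >= limit:
        return "failed"  # the failure has already occurred
    if len(readings) >= 2:
        slope = latest - readings[-2]  # change per sample
        if slope > 0 and latest + slope * lookahead >= limit:
            return "warning"  # trending toward the limit
    return "ok"


if __name__ == "__main__":
    print(classify_sensor([40.0, 41.0, 42.0], limit=70.0))
    print(classify_sensor([50.0, 60.0], limit=70.0))
    print(classify_sensor([60.0, 75.0], limit=70.0))
```

A "warning" result is the anticipation case: nothing has failed yet, but the trend suggests it soon will, so an operator could swap a module before the server goes down.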

Third, and the heart of Texas Micro's design, is System Directed Checkpointing technology. A typical SDC system consists of two identical Windows NT servers, each equipped with a special communications board and connected by a cable separate from the main network. The hard drives of the master server are continuously mirrored to the second server.

Such mirroring is common in dual-server systems. But SDC goes way beyond drive mirroring: It keeps the contents of the two servers' memory identical. Critical system parameters that can't easily be mirrored, such as the contents of the server cache and the state of the processor's internal registers, are stored in "snapshots" (checkpoints) that are made 20 times a second and sent to the second server. If the second server stops receiving these snapshots, it knows that something's wrong and takes over as the primary server within a fraction of a second, resuming from the point before the error occurred.
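
The takeover logic described above can be sketched as a small simulation. This is not Texas Micro's implementation: the `Standby` class, the rule of waiting for three missed snapshots, and the simulated clock are all assumptions. Only the rate of 20 snapshots a second comes from the article.

```python
from dataclasses import dataclass
from typing import Optional

CHECKPOINT_INTERVAL_MS = 50   # 20 snapshots per second, per the article
MISSED_BEFORE_FAILOVER = 3    # assumption: tolerate a few late snapshots


@dataclass
class Standby:
    """Hypothetical second server that watches for checkpoints."""
    last_checkpoint_ms: int = 0
    last_state: Optional[dict] = None
    is_primary: bool = False

    def receive_checkpoint(self, now_ms: int, state: dict) -> None:
        self.last_checkpoint_ms = now_ms
        self.last_state = dict(state)  # keep a copy of the snapshot

    def tick(self, now_ms: int) -> None:
        """Promote this server if checkpoints have stopped arriving."""
        silence_ms = now_ms - self.last_checkpoint_ms
        timeout_ms = MISSED_BEFORE_FAILOVER * CHECKPOINT_INTERVAL_MS
        if not self.is_primary and silence_ms > timeout_ms:
            self.is_primary = True  # would resume work from last_state


def simulate(primary_dies_at_step: int, total_steps: int) -> Standby:
    """Run a simulated clock; the primary stops sending at the given step."""
    standby = Standby()
    for step in range(total_steps):
        now_ms = step * CHECKPOINT_INTERVAL_MS
        if step < primary_dies_at_step:
            standby.receive_checkpoint(now_ms, {"step": step})
        standby.tick(now_ms)
    return standby


if __name__ == "__main__":
    result = simulate(primary_dies_at_step=10, total_steps=20)
    print(result.is_primary, result.last_state)
```

With these assumed numbers the second server takes over after 200 milliseconds of silence and resumes from the last snapshot it received. A shorter timeout would fail over faster but risks a false takeover when a snapshot is merely late.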

Catastrophe-proof?

Of course, the system isn't fully redundant again until the problem that caused the first server to fail is corrected. And what happens if a software problem that caused the first server to fail crashes the second server immediately?
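
The question matters because a redundant pair only pays off when the two servers fail for different reasons. A toy model, not drawn from Texas Micro's research, shows why. It assumes that some fraction of failures hit both servers at once (the `common_mode_fraction` parameter is hypothetical) and that the rest are independent.

```python
def pair_unavailability(single_unavailability: float,
                        common_mode_fraction: float = 0.0) -> float:
    """Estimate the chance that both servers of a pair are down at once.

    Toy mixture model (an assumption for illustration): with probability
    `common_mode_fraction` a fault takes down both servers together, and
    otherwise the two servers fail independently of each other.
    """
    u = single_unavailability
    c = common_mode_fraction
    if not (0.0 <= u <= 1.0 and 0.0 <= c <= 1.0):
        raise ValueError("both arguments must be between 0 and 1")
    return c * u + (1.0 - c) * u * u


if __name__ == "__main__":
    for fraction in (0.0, 0.1, 1.0):
        estimate = pair_unavailability(0.01, fraction)
        print(f"common-mode fraction {fraction}: pair down {estimate:.6f} of the time")
```

In this model, a server that is down 1 percent of the time yields a pair that is down 0.01 percent of the time when failures are fully independent, but the advantage shrinks quickly as more failures are shared. That is why the rarity of such shared failures is central to the company's claim.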

A Texas Micro spokesperson says the company's extensive research shows that truly catastrophic server failures of this type are extremely rare, and claims that most failures are short-lived timing problems that correct themselves. In cases like these, the entire redundant system repairs itself and can be back to normal in seconds. (Some software failures, as well as hardware failures, do require human intervention, however. The monitoring system sends alarms to system managers whenever problems occur.)

Not for the mainstream -- yet

The company says it can't yet say how much SDC will add to the cost of a fully redundant dual-server setup, but hastens to add that the hardware the system requires is relatively inexpensive. The high-reliability PC hardware needed, however, is another story. The company declined to comment on the additional cost of its ruggedized PCs over off-the-shelf servers.

Although Texas Micro will initially market "Five-Nines" systems to its traditional customer base, the company hopes to work with major PC makers to eventually bring the technology to servers in small-to-medium businesses.

Meanwhile, you and I need to back up our desktop PCs -- and be prepared for the three-fingered salute.

