Taking a crack at backup - administrators exploring fault-tolerant software as alternative to costly hardware approaches - Technology Information
John P. Mello, Jr.
As critical data stakes out LANs, administrators explore fault-tolerant software solutions as an alternative to more-costly hardware approaches
Client/server computing has diffused information power within many businesses. With that diffusion has come a migration of mission-critical data from traditional data centers to the distributed networks that have sprouted in their place. As such data moves to local area networks, the issue of how to preserve it in the face of disasters becomes a burning one.
Why has this issue of mission-critical data preservation come to the LAN environment? One reason is that with the advent of Web-enabled applications, workgroups have extended their enterprise operations to include remote branch office workers, telecommuters, customers and suppliers. "As we move into electronic commerce and people place orders for goods directly over the Internet, if a company's system is down, the company is no longer operating," says Larry Sherman, director of marketing for Isis Distributed Systems, a division of Stratus Computer in Marlborough, Mass.
The movement of mission-critical data to PC LANs is only going to increase, say observers. "People are putting more and more data out there that is more and more critical," says C.D. Larson, a storage management specialist at IBM's storage system division in San Jose, Calif. "A few years ago, people were making commitments to the client/server idea, but it wasn't where the rubber met the road. Today, people have Oracle databases, [Novell's fault-tolerant software] SFT III and Lotus Notes, and they're running their businesses on the basis of these systems."
Indeed, 43% of the data housed on corporate PC LANs today is mission-critical, according to a disaster recovery survey conducted by research group David Michaelson & Associates.
"The number of people with a PC on their desk has gone through the roof, but no one thought anything mission-critical was being done out there," says Jack Latchford, director of new product engineering at SunGard, a disaster recovery company located in Wayne, Pa. "Now they're finding there's stuff out there the organization can't do without, and they have to find ways to back it up."
It's unlikely that mission-critical applications would have made the move to LANs if the only way to bulletproof them was through expensive, one-to-one, fault-tolerant hardware solutions. In fact, a new wave of inexpensive software solutions is making highly available LAN applications a realistic goal. "There are two forces at work here," Isis' Sherman says. "The cost of availability has dramatically decreased over what it was just a couple of years ago. But on the other side of the equation, the cost of the system being down is much, much higher than it was a couple of years ago. And that's making high availability a requirement for more users."
This need for high availability is changing the way companies deal with data backup. Of respondents to the David Michaelson & Associates survey, 77% employ a continuous or daily backup for their PC LANs, and 89% follow some kind of backup procedure. This is up significantly from 1993, when 45% of organizations used continuous or daily backup for PC LANs.
Conventional approaches to backups, which involve copying all the data files on a network, can prove wanting for local area networks -- especially for offsite backups via a telecommunications link.
"It's a bandwidth-intensive problem because whole files are transmitted, even if only one bit has changed. You run out of hours in a night," says SunGard's Latchford.
Furthermore, should a LAN fail, companies want the recovery time to be as short as it was in the legacy environment. "Not only does the data have to be protected, but the window it takes to get the system functioning again can't be too large," Latchford says. "It used to be that three days or a week was okay. Now people think that if they're out of touch with their customers for that long, they lose market share in a big way."
That's why a growing number of IS managers are installing some kind of realtime continuous backup for their LANs that enables them to be up and running within 15 minutes to an hour after failover. "In the mainframe world, there are systems where as a transaction is written at the production site, it's sent out over the telecom circuit to the disaster recovery site. So if a company's production building goes away for any reason -- fire, flood, whatever -- right up to the last transaction is at the recovery center," Latchford explains. "Now there's software that will do that for NetWare and Windows NT."
One such program is Doubletake from Network Specialists Inc., Hoboken, N.J. The advantage Doubletake has over conventional backup systems is that it can update data without rewriting an entire file, says Latchford, who has incorporated the software into some of SunGard's operations. "So if you have a 150 megabyte database and you change one name, that change is all that's transmitted to the mirrored copy of the data," he says.
According to Network Specialists, Doubletake bypasses the applications layer of the network by intercepting files at the operating system layer. When the software is initialized on the main server, it creates a mirror image of the files on a backup server. From then on, Doubletake intercepts write requests to the main server and copies them to the backup server, thus keeping the files on the backup server up-to-date.
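The write-interception idea can be illustrated with a small sketch. This is not Doubletake's code or API. The class name MirroredFile, its methods and the bytes_sent counter are all hypothetical, and two in-memory buffers stand in for files on the main and backup servers. It shows only the general technique described above: one full copy at initialization, then forwarding each write rather than the whole file.

```python
class MirroredFile:
    """Hypothetical stand-in for a file on a main server plus its mirror."""

    def __init__(self, initial: bytes):
        self.primary = bytearray(initial)
        # Initialization: the mirror starts as a full copy, sent once.
        self.mirror = bytearray(initial)
        self.bytes_sent = len(initial)

    def write(self, offset: int, data: bytes) -> None:
        """Apply a write to the primary and forward only that write."""
        self._apply(self.primary, offset, data)
        self._apply(self.mirror, offset, data)
        # Only the changed bytes cross the link, not the whole file.
        self.bytes_sent += len(data)

    @staticmethod
    def _apply(buf: bytearray, offset: int, data: bytes) -> None:
        end = offset + len(data)
        if end > len(buf):
            # Grow the buffer with zero bytes if the write extends it.
            buf.extend(b"\x00" * (end - len(buf)))
        buf[offset:end] = data


record = MirroredFile(b"name=Smith;city=Wayne")
sent_after_init = record.bytes_sent
record.write(5, b"Jones")  # change one name in place
extra_bytes = record.bytes_sent - sent_after_init
```

In this toy case the name change costs five bytes on the link, while a conventional backup would resend the whole record. A real product would also have to deal with write ordering, link outages and resynchronization, none of which is modeled here.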
Another fault-tolerant software solution playing in the distributed computing space is Orbix+Isis from Isis Distributed Systems. Orbix+Isis provides fault tolerance through a process that Isis calls "active replication."
Here's how it works: Multiple copies of an application are set up to run on multiple servers. Orbix+Isis knows where these copies reside and acts as a traffic cop for the network. As the application is changed, the changes are recorded to the copies. All this activity takes place in the background and is invisible to the clients on the system; to them, it looks as if they're working with a single application on the network. A query to the application is load-distributed so only one server needs to act on it, which cuts down on system overhead. If one of the servers fails, the other keeps the network's clients functioning without interrupting work.
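A rough sketch of that behavior, under stated assumptions, might look like the following. It is not the Orbix+Isis API. Replica, ReplicaGroup and their methods are hypothetical names, the replicas live in one process rather than on separate servers, and simple round-robin stands in for whatever load-distribution scheme the product actually uses.

```python
from itertools import count


class Replica:
    """One hypothetical copy of the application and its state."""

    def __init__(self, name):
        self.name = name
        self.state = {}
        self.alive = True


class ReplicaGroup:
    """Toy 'traffic cop': clients talk to the group, never to a replica."""

    def __init__(self, names):
        self.replicas = [Replica(n) for n in names]
        self._turn = count()

    def _live(self):
        live = [r for r in self.replicas if r.alive]
        if not live:
            raise RuntimeError("no live replicas")
        return live

    def update(self, key, value):
        # Changes are recorded on every live copy.
        for replica in self._live():
            replica.state[key] = value

    def query(self, key):
        # A query is handled by a single live copy, chosen round-robin.
        live = self._live()
        chosen = live[next(self._turn) % len(live)]
        return chosen.name, chosen.state.get(key)

    def fail(self, name):
        # Simulate a server failure; the remaining copies keep serving.
        for replica in self.replicas:
            if replica.name == name:
                replica.alive = False


group = ReplicaGroup(["server_a", "server_b"])
group.update("order_17", "shipped")
first = group.query("order_17")   # answered by one copy
group.fail("server_a")
second = group.query("order_17")  # still answered after the failure
```

The client code never names a server, which is the point of the design: when one copy fails, the same calls keep working against the survivor. A real system would also need to bring a recovered replica back up to date, which this sketch leaves out.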
Proximity issues can also be avoided with Orbix+Isis because the servers running the multiple applications don't have to be close to each other. "That's one of the big differences between our approach and a cluster," notes Isis' Sherman. "In a cluster, everything has to be in the same room because you're sharing disks and the disk cables can't be very long. With our approach, those servers could be across town, across the country or across the world, as long as you have enough network bandwidth between the clients and the servers."
Software solutions like Orbix+Isis may save companies money over hardware-based solutions. A hardware solution consisting of a two-node Unix cluster could cost anywhere from $150,000 to $200,000. Orbix+Isis, says Isis, can cover the same bases for around $50,000.
But the program isn't for everyone because it requires some tinkering with applications. "If you're the kind of user who has simply bought an off-the-shelf application, you may not be able to use Isis," says Sherman. "If you're the provider of the application or you write your own applications -- as much of the corporate world does -- then it might make sense."
Creating a mirror image of an application's data is fine, but it has its drawbacks. One of them is that some of the most common causes of network downtime -- software error, human error and data corruption -- can foul up both the primary data source and its mirror. According to Network Integrity Inc. of Marlborough, Mass., its LANtegrity software can protect a LAN from those failures.
Designed to work with Novell's NetWare, LANtegrity allows a single server to automatically take over for any down or failed server in a network. And it gives that server the power to back up data on all enterprise servers, including remote servers, and keep tabs on all archival copies.
Network Integrity claims that LANtegrity fills in for a downed server in about 15 seconds. At the St. James, La., plant of Chevron Chemical, headquartered in San Ramon, Calif., four hardware failures have occurred since LANtegrity was installed. "LANtegrity stood in automatically for the crashes and our end users were not even aware of the failure," says Brian Durham, the lead systems analyst at the plant.
At Sherwin Williams, Cleveland, LANtegrity protects a network that runs the paint and coating company's office automation applications. "None of the applications it's backing up are mission-critical, but like everything else these days, they're business-critical," says Mark Danczak, systems engineer. "The issue with LAN servers is a lot of times you need them to gain access to the mission-critical applications."
He says his firm chose LANtegrity because it addresses both the backup and fault-tolerance concerns of the enterprise, where it monitors several NetWare servers. As files on those servers change, LANtegrity moves copies of them to its server, then writes the copies to tape. This process takes place continuously and in realtime, says Danczak.
He says implementing a system like LANtegrity's is much cheaper than installing a one-to-one fault tolerance system. A one-to-one system essentially doubles hardware costs because every server on the network is mirrored by a backup server. Danczak estimates that LANtegrity costs about a quarter of what it would cost to implement a one-to-one system.
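The arithmetic behind that kind of comparison can be sketched as follows. Every figure here is a made-up placeholder, not a price from Network Integrity or an estimate from Danczak, and the function names are hypothetical. The sketch shows only why the added cost of one-to-one mirroring grows with the number of servers while a single shared standby does not.

```python
def added_cost_one_to_one(n_servers, server_cost):
    """Every production server gets its own backup server."""
    return n_servers * server_cost


def added_cost_shared_standby(server_cost, software_cost):
    """One standby server plus software covers all production servers."""
    return server_cost + software_cost


# Hypothetical inputs, chosen only to illustrate the shape of the math.
n_servers = 8
server_cost = 10_000
software_cost = 10_000

one_to_one = added_cost_one_to_one(n_servers, server_cost)
shared = added_cost_shared_standby(server_cost, software_cost)
ratio = shared / one_to_one
```

With these placeholder numbers the shared standby comes to a quarter of the one-to-one figure, but the ratio depends entirely on the server count and the prices plugged in.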
But like most business solutions, realtime backups have trade-offs, says Gregory Smith, the Reston, Va.-based director of corporate finance systems at Sallie Mae, which provides second funding and servicing of student loans. "The trade-off is network performance," he says. "I'm a firm believer that unless you're running anything close to 100 megabits across your network, I don't think that 'intraday' backup is a viable capability because it will severely affect the performance of your network."
Will software solutions now seeping into the market make one-to-one fault-tolerant systems a thing of the past? "I wouldn't say software solutions will displace hardware solutions, but they do offer an attractive alternative," says Isis' Sherman. "Both solutions will continue to exist and both will have their place."
COPYRIGHT 1996 Wiesner Publications, Inc.
COPYRIGHT 2004 Gale Group