Taneja Blog

Backup and Archive: Two Different Animals

Are you using older backups as your archives? Are your archives sitting on tape? For years, this has been the norm because on the surface this approach looks cheap and easy. But like some other things that are cheap and easy, you may be in for a few unwanted surprises if you continue in your errant ways.

Historically, backup and archiving were mostly about making and retaining a copy of the production data. Many backup products in the 90s seemed to be all about backup, regardless of what that did to the recovery process. Archives, if people even had them, were about keeping data around as cheaply as possible, mostly to meet regulatory requirements (which meant that it was mostly done in certain industries with compliance requirements). Both used tape, so it was natural for a "backup" to become an "archive" and get shipped to some remote site after some period of time. But things have changed considerably:

  1. With increasingly stringent requirements for RPO and RTO, the focus of data protection has clearly shifted to recovery
  2. Operations have moved to a 7x24 clock, driving concerns about the implications of backup on production application environments
  3. The focus of archiving has expanded to include accessibility, primarily to meet the demands of an increasingly litigious corporate environment

Pushing the envelope on tape technologies to try to address the first two items above led to an unintended consequence: people became very aware of the recovery reliability issues with tape media when used to meet backup requirements. Tape is a sequential access medium, but backups and restores basically need a random access medium. Tape is also primarily an offline medium, which means it does not lend itself well to the types of discovery operations that have to be performed against archives to find responsive materials for lawsuits. A study we did last year indicated that discovery operations against tape cost 10x as much as the same operations performed against disk, where computerized search can be leveraged. With the average cost of a lawsuit in the range of half a million dollars for large enterprises, disk-based e-discovery could save hundreds of thousands of dollars if at least several lawsuits were being handled per year. Plus, imagine the judge's reaction when media reliability issues keep you from producing responsive materials that you clearly should be able to produce. Disk was the obvious answer, if its cost could be brought down significantly.
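To make that arithmetic concrete, here is a rough sketch in Python. Only the 10x ratio comes from the study mentioned above. The per-lawsuit cost of disk-based discovery and the number of lawsuits per year are hypothetical inputs chosen purely for illustration, and the function name is ours, not something from the study.

```python
# Rough illustration only: every dollar figure below is a hypothetical input.
TAPE_TO_DISK_COST_RATIO = 10  # from the study: discovery on tape costs about 10x disk


def annual_discovery_savings(disk_discovery_cost_per_suit, lawsuits_per_year,
                             ratio=TAPE_TO_DISK_COST_RATIO):
    """Return the yearly saving from running discovery against disk instead of tape."""
    tape_cost_per_suit = disk_discovery_cost_per_suit * ratio
    return (tape_cost_per_suit - disk_discovery_cost_per_suit) * lawsuits_per_year


# Hypothetical example: $10,000 of disk-based discovery per lawsuit, 3 lawsuits a year.
savings = annual_discovery_savings(10_000, 3)
print(f"Estimated annual savings: ${savings:,.0f}")  # prints "Estimated annual savings: $270,000"
```

Plug in your own discovery costs and caseload; the point is only that the savings scale with the number of lawsuits, which is why the caseload matters so much to the conclusion.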

Today, backup is about recovery; archiving is about cost-effective retention and searchability. The two business objectives drive different requirements, but there is a single medium that is well matched to the foundational requirements of both: disk. Different software functionality is required for each, but this again raises the question of whether your backups should just age into becoming your archives.

We recommend that backup and archive be managed separately. First, since most restore requests come from the most recent backups, the "backup" problem has more of a short-term focus. Disaster recovery has less of a short-term focus, mostly because of operational limitations on how to get that data to a remote site but also because of the requirement that it support multiple comprehensive recovery points. Archiving clearly has a long-term focus but should NOT just be a process that occurs at the end of the backup data life cycle. To optimize your existing storage infrastructure for performance, cost, and protection, data should be archived well before it is no longer needed for backup and/or DR purposes. This has very positive implications for managing primary storage and the costs associated with it (see my blog from February 24, 2009).
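One way to picture "managed separately" is as two independent clocks rather than one life cycle. The Python sketch below is a minimal illustration under stated assumptions: the `RetentionPolicy` class, its field names, and the 90-day and 180-day thresholds are all hypothetical, not settings from any particular backup or archive product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical policy: two independent clocks, one per business objective."""
    archive_after_days: int     # move inactive data to the archive tier at this age
    backup_retention_days: int  # keep backup copies for recovery this long


def should_archive(last_modified, today, policy):
    """True once data has been inactive long enough to leave primary storage."""
    return (today - last_modified).days >= policy.archive_after_days


def backup_expired(backup_taken, today, policy):
    """True once a backup copy is older than the recovery window."""
    return (today - backup_taken).days > policy.backup_retention_days


# Hypothetical numbers: archive after 90 days of inactivity, keep backups for 180 days.
policy = RetentionPolicy(archive_after_days=90, backup_retention_days=180)
today = date(2009, 2, 25)

print(should_archive(date(2008, 11, 1), today, policy))  # True: 116 days inactive
print(backup_expired(date(2008, 11, 1), today, policy))  # False: still inside the backup window
```

In this example the data is archived while its backups are still being retained, which is the ordering the recommendation above calls for: the archive decision does not wait for the backup life cycle to end.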

In dealing with end users on this issue, two conclusions are evident:

  • Backups and archives should be managed separately, and you should seriously consider disk-based options for both if you aren't using them already
  • Archiving to tape is NOT cost-effective from an overall TCO point of view if you're dealing with multiple concurrent lawsuits on a regular basis

Premiered: 02/25/09
Author: Taneja Group
Topic(s): Active Archive Archiving Backup Tape Archive Disk-based Backup


Eric - Great post topic! We totally agree that backup is not archive!

We believe there are only three components of a storage environment:

1) Primary Storage - This is purpose built for handling transactional data.

2) Backup Storage - This is purpose built for disaster recovery scenarios.

3) Archive Storage - This is purpose built for fixed and/or reference information.

Using #3 effectively can have dramatic savings on #1 and #2.

That’s what Permabit does better than anyone else in the industry!

By mivanov on 02/26/09

