Washington Technology, 07/05/05 issue; Vol. 20 No. 13

Long live e-records!

By Alice Lipowicz
Staff writer


Miles of files at the National Archives

            Forget gigabytes or even terabytes.

            The National Archives and Records Administration’s creation of a permanent online archive of its electronic records is one of the few projects anywhere in which data storage is measured by the petabyte — a quadrillion bytes — and that is what fascinates Steve Hansen.

            “It is hard to get your mind around the sheer volume of it,” said Hansen, chief engineer for a Lockheed Martin Corp. team competing for the contract. The other team is led by Harris Corp. of Melbourne, Fla.

            But the project is significant not just for its mammoth size. It is the highest-profile example of the growing trend of information lifecycle management, a strategy for managing records from their creation to their use to how they are archived.

            Data storage is becoming a much more strategic decision. In past years, data storage was bought as needed, with regular migrations of the data to new formats as the old formats and applications became obsolete.

            But over the last several years, powerful forces have complicated data management and storage, including steep increases in the volume of data to be stored, and new regulations as well as business and legal requirements requiring data to be more accessible over longer periods of time.

            Major corporate scandals, such as Enron’s financial collapse and Martha Stewart’s conviction, have focused attention on electronic records, such as e-mails, as evidence in lawsuits and in shaping public opinion. New compliance regulations such as the Sarbanes-Oxley and the Health Insurance Portability and Accountability acts include tough new requirements for storing data.

            Facing these demands, many government agencies and corporations are applying a lifecycle approach to electronic records, assessing the data’s value now and in the future, and designing systems to manage and store the data based on those priorities. Major IT systems integrators have entered the field to chase what could be a multibillion-dollar global market.

            The total global storage market is estimated at $65 billion for 2005 and is expected to grow to $80 billion by 2009, according to research firm IDC of Framingham, Mass. The firm doesn’t have an estimate for spending on the lifecycle approach for records management. But the National Archives project may bolster the lifecycle approach.

70 years in the making

            The lifecycle concept may have originated with the National Archives itself 70 years ago.

            “Information lifecycle management is an obvious copycat of the basic application of records archiving,” said Kenneth Thibodeau, director of the electronic archives project for the National Archives.

            Typically, many agencies stored old data in proprietary formats, mostly offline and out of immediate reach. But that approach is changing as demand rises for online access over longer periods of time, and as technologies such as Extensible Markup Language have developed. XML is an open, nonproprietary standard that is interoperable with multiple applications and data formats. It does not become obsolete over time.

            Even so, many users are heavily invested in proprietary systems and technologies, including massive amounts of data stored in Adobe Systems Inc.’s Portable Document Format files and Microsoft Corp.’s Word documents. Switching to other formats can carry steep upfront costs, although alternative formats may save money over the long term. In addition, standards and protocols for e-records management and storage are not mature, particularly for long-term, accessible records.

            “There are no standards yet for information lifecycle management,” said Michael Peterson of Santa Barbara, Calif., program director for the Storage Networking Industry Association’s data management forum.

            Industry association members are developing such standards, Peterson said, through projects such as SNIA’s own “100-year-archive” committee and through lessons expected from the National Archives electronic records project.

            The data management and storage industry is evolving, complex and fragmented, with sectors devoted to storage and management software, hardware and services. Many industry observers are looking to the National Archives project to solve — or at least shine a bright light on — the stickiest technical problems of long-term e-records management, such as how to keep huge volumes of data accessible, searchable and authentic while formats and applications become obsolete.

            The project “will be influential in solving some of the problems vexing the broader market,” said Charles Brett, managing principal for electronic records management at Xerox Corp.

            The project is supposed to create an accessible, authentic, secure archive that functions in perpetuity and transcends today’s and even tomorrow’s technologies.

            “The goal is to make the information available, so that it’s independent of any particular hardware or software,” said Clyde Relick, project manager for the team headed by Bethesda, Md.-based Lockheed Martin. Other team members include BearingPoint Inc., McLean, Va.; EDS Corp., Plano, Texas; Fenestra Technologies Corp., Germantown, Md.; Filetech Storage Systems Inc., Farmington Hills, Mich.; Metier Ltd., Washington; and Science Applications International Corp., San Diego.

            In the short term, the National Archives project also will set data management and storage standards for all other federal agencies to follow and could have a big impact on IT electronic records projects governmentwide.

            “It will be extremely influential for all federal agencies initially,” said Karen Knockel, program manager for the Harris team. Other members include Booz Allen Hamilton Inc., McLean, Va.; CACI International Inc., Arlington, Va.; and Information Manufacturing Corp., Manassas, Va.

            The project has been budgeted at $136 million through October 2006 but is expected to cost hundreds of millions, or even billions, more. The National Archives is expected to pick a winner in August.

Grow along

            Though Relick and Knockel declined to say what their proposed solutions entail, experts said XML and metadata, which is data about data, are likely to be part of the answer.

            “We need a data architecture that is scalable and evolvable over time,” Thibodeau said. The archives’ own research with the San Diego Supercomputer Center suggested that such an architecture might be created with XML, but “obviously we don’t have the answer yet,” he said.

            The National Archives project will handle electronic records dating back to 1970, which constitute about a terabyte in total, but it must be able to handle about three petabytes a year when it becomes operational in 2007, Thibodeau said.

            The records include about 100 million White House e-mails from the Clinton and Bush administrations. The e-mails themselves are not difficult to archive, “but the attachments will kill you,” he said.

            New technology is likely to assist National Archives personnel in automating data management at each step: imaging, formatting, authenticating, securing and storing the records. For example, newly developed, “content-addressed” software may be used to automatically classify which data is likely to be in frequent demand and ought to be made immediately accessible online, and which data is in less-frequent demand and could be saved on magnetic tape, Thibodeau said. The data on the tapes could be made accessible to a researcher within four to five minutes per item, at best, he said.

            He estimated 70 percent of the e-records would need infrequent accessibility, and 30 percent would need active accessibility.
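            To put those shares in concrete terms, the short sketch below applies Thibodeau's 70/30 estimate to the roughly three petabytes a year the system is expected to handle. It assumes the decimal definition of a petabyte used above (a quadrillion bytes); the function name and the idea of applying the split to a single year's intake are illustrative assumptions, not part of the National Archives' plan.

```python
# Illustrative arithmetic only; names and the per-year framing are assumptions.
PETABYTE = 10**15   # "a quadrillion bytes"
TERABYTE = 10**12


def split_by_access(total_bytes, infrequent_percent=70):
    """Split a volume of data into (active, infrequent) byte counts.

    Integer arithmetic is used so the result is exact at petabyte scale.
    """
    infrequent = total_bytes * infrequent_percent // 100
    active = total_bytes - infrequent
    return active, infrequent


yearly_intake = 3 * PETABYTE
active, infrequent = split_by_access(yearly_intake)

print(f"Active access:     {active // TERABYTE} TB per year")
print(f"Infrequent access: {infrequent // TERABYTE} TB per year")
```

            Under these assumptions, most of each year's intake would be a candidate for lower-cost storage such as tape, with the remainder kept immediately accessible online.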

            Regardless of the outcome of the National Archives project, the lifecycle approach appears to be gaining steam in the data storage industry. Companies need to reduce their storage costs with strategic decisions, said Russ Kennedy, director of software product management for information lifecycle management at StorageTek Corp., a data storage company in Louisville, Colo.

            “With lifecycle management, you decide the value of an individual object and, based on its value, you decide how long it is to be retained, in what storage format and with what devices. Then you can decide whether it ought to be moved to a lower-cost tier of storage that fits the information’s value,” he said.
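            The decision Kennedy describes can be sketched as a simple rule that maps a record's value and recent use to a storage tier. The sketch below is hypothetical: the scoring scale, the thresholds and the tier names are assumptions chosen for illustration and do not describe StorageTek's products or any vendor's actual policy.

```python
from dataclasses import dataclass


@dataclass
class Record:
    name: str
    value_score: int        # hypothetical 0-100 rating of business or legal value
    days_since_access: int  # how long since anyone last requested the record


def choose_tier(record):
    """Pick a storage tier from a record's value and recent demand.

    All thresholds are illustrative assumptions, not an industry standard.
    """
    # High-value or recently used records stay immediately accessible.
    if record.value_score >= 70 or record.days_since_access <= 30:
        return "online disk"
    # Mid-value records move to a cheaper but still fairly quick tier.
    if record.value_score >= 30:
        return "nearline disk"
    # Everything else goes to the lowest-cost tier.
    return "magnetic tape"


records = [
    Record("contract.pdf", value_score=90, days_since_access=400),
    Record("status-report.doc", value_score=50, days_since_access=400),
    Record("old-newsletter.txt", value_score=10, days_since_access=400),
]
for record in records:
    print(record.name, "->", choose_tier(record))
```

            A real lifecycle system would also have to record how long each object must be retained and re-evaluate tier assignments as an object's value changes over time.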

            StorageTek in June announced its new IntelliStore data management and archiving system, a lifecycle-based system that permits multiple tiers of storage.

            Another popular solution that stresses the lifecycle approach is the Centera data archiving and networked storage system from EMC Corp. of Hopkinton, Mass. Centera can expand to hold petabytes of data and store it according to a lifecycle strategy for a long-term archive, said Kenneth Steinhardt, EMC’s director of technical analysis.

            “Instead of being places where information goes to die, archives are becoming the place where information goes to live in perpetuity,” Steinhardt said.

