Privacy and civil-rights concerns have dogged the use of data mining for counterterrorism and national security purposes, but it is quietly thriving nonetheless.
Although several high-profile federal data-mining programs have been shut down, contractors supporting the work say new opportunities continue to develop for agencies to use commercial data mining and analytics.
"Data mining facilitates the modeling that would take years to do manually," said Jesus Mena, chief strategy officer at InferX Corp., a software company in McLean, Va., that specializes in data mining and analytics for systems integrators and the military.
Mena helped the Homeland Security Department's Office of Inspector General review DHS data-mining programs in 2006.
"Each of [DHS'] components has missions, and they cannot accomplish those missions without advanced analytics," Mena said.
"Without a doubt, data mining is an area of growth," said Patrick Crago, president of Multi-Threaded Inc., an information technology company in Herndon, Va., with customers at defense and intelligence agencies. "Our customers are becoming a lot more sophisticated in their use of data. Data mining can help you analyze unstructured data, which has a volume that is larger than structured data by a factor of 10 to 1."
Despite progress, the public's concerns about using data mining for homeland security have slowed its adoption and forced agencies to turn to more modest forms and different terminologies, industry experts say. In addition, some proponents claim that newer data-mining techniques could offer better privacy protections.
"People are doing it but not calling it data mining," said Gary Monroe, director of federal operations at MicroStrategy Inc., of McLean, Va. "It is a very valid technology that can be applied to uncover trends. Speaking for myself, we try to steer away from the term 'data mining' because it has preconceived negative connotations."
Although commercial and government data-mining applications are growing, the commercial ones are more popular: For every $1 in federal data mining, there is $20 worth of commercial data mining, Monroe said.
DHS demand
Data mining is broadly defined as the analysis of large amounts of data to uncover hidden relationships and patterns. In one type of analysis, keywords are used to search large amounts of data to determine patterns and associations and ultimately develop behavioral profiles. Those profiles can identify other people who fit the pattern and possibly predict their behavior.
For example, marketers can sort through data to identify the behavior and characteristics of people who bought a particular item most quickly at a Web site. Then they develop marketing strategies to target more likely buyers.
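The marketing example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any vendor or agency named in this article does it: the visitor records, field names, 10-minute cutoff and 60 percent threshold are all invented, and the "profile" here is simply the set of traits shared by most fast buyers.

```python
from collections import Counter

# Hypothetical visitor records. Every field name and value is invented for
# illustration; minutes_to_buy is None when the visitor never bought.
visitors = [
    {"id": "a", "traits": {"returning", "mobile", "newsletter"}, "minutes_to_buy": 4},
    {"id": "b", "traits": {"returning", "newsletter"}, "minutes_to_buy": 6},
    {"id": "c", "traits": {"mobile", "newsletter"}, "minutes_to_buy": 9},
    {"id": "d", "traits": {"desktop"}, "minutes_to_buy": 55},
    {"id": "e", "traits": {"returning", "newsletter"}, "minutes_to_buy": None},
    {"id": "f", "traits": {"desktop"}, "minutes_to_buy": None},
]


def build_profile(records, max_minutes=10, min_share=0.6):
    """Return the traits shared by at least min_share of the fastest buyers."""
    fast = [
        r for r in records
        if r["minutes_to_buy"] is not None and r["minutes_to_buy"] <= max_minutes
    ]
    if not fast:
        return set()
    counts = Counter(trait for r in fast for trait in r["traits"])
    return {trait for trait, n in counts.items() if n / len(fast) >= min_share}


def match_score(record, profile):
    """Fraction of the profile's traits that this record also has."""
    if not profile:
        return 0.0
    return len(record["traits"] & profile) / len(profile)


profile = build_profile(visitors)

# Rank visitors who have not bought yet by how closely they fit the profile.
prospects = sorted(
    (r for r in visitors if r["minutes_to_buy"] is None),
    key=lambda r: match_score(r, profile),
    reverse=True,
)
for r in prospects:
    print(r["id"], round(match_score(r, profile), 2))
```

With only three fast buyers to learn from, a profile like this rests on thin evidence, so a high score is a lead to check and not a finding.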
Similar techniques were used to develop the Terrorist Screening Center's watch list of 750,000 individuals and create DHS' Automated Targeting System, which produces risk scores for cargo and airline passengers entering the United States. Both programs have been criticized for inadequate privacy protections, lack of transparency and high error rates.
Within weeks, DHS intends to introduce its long-delayed Secure Flight program, through which it will assume full responsibility for checking airline passengers' names against the terrorist watch list. Airlines conduct those checks now.
There have also been highly publicized flops.
Congress rejected the Pentagon's Terrorism Information Awareness program in 2003. And the DHS-financed Multistate Anti-Terrorism Information Exchange, which offered search and data-mining capabilities to local law enforcement agencies, was terminated in 2005 because of privacy fears.
More recently, DHS' Analysis, Dissemination, Visualization, Insight and Semantic Enhancement program was suspended in August because of privacy concerns, according to a Government Accountability Office report.
"I have a feeling data mining has probably lost its luster," said Jim Harper, director of information policy studies at the Cato Institute, a Washington think tank.
Data mining for commercial purposes is booming, but when the technology is used to target terrorists, there is too little data to accurately identify patterns, which could result in accusations against innocent people, he added.
Funding continues
Even so, House appropriators approved $12 million in fiscal 2008 for the FBI's National Security Branch Analysis Center, which will have 36 staff positions. The money will support advanced analytical techniques, technologies and data resources for terrorist tracking.
However, Rep. Brad Miller (D-N.C.), chairman of the House Science and Technology Committee's Investigations and Oversight Subcommittee, and Rep. F. James Sensenbrenner Jr. (R-Wis.), the subcommittee's ranking member, asked GAO to evaluate whether the FBI can properly manage the center's proposed 6 billion records.
DHS has 12 data-mining programs on the books, nine of which are active, according to a survey released in August 2006 by DHS Inspector General Richard Skinner.
To address privacy concerns, DHS could use newer technologies that allow searchable data to remain at its original location, Mena said. Doing so would help protect privacy by avoiding situations in which data is collected for one purpose but eventually used for many other purposes, he said.
Newer data-mining techniques are also more effective, Mena said. "You can buy more power in commercial products off the shelf," he said. "DHS is using 20-year-old technologies."
Sergei Ananyan, president of Megaputer Intelligence Inc., of Bloomington, Ind., said federal agencies, including DHS, are using data mining to achieve goals such as evaluating employee training and improving safety records.
DHS could benefit from data-mining software that analyzes text on Web sites, he said. In a test project, that tool helped a law enforcement agency identify connections and links among various crime groups, but it is difficult to make such a program transparent without revealing too much information, he added.
"As for counterterrorism, the techniques exist today that, if they are put to good use, the results would be quite good," Ananyan said. "Can it protect privacy? It's not a technical question but a political decision."
Despite progress on many data-mining fronts, concerns about privacy and civil rights are not likely to go away.
"Data mining is a very sensitive issue," said Michael Daconta, an independent consultant who was formerly metadata program manager at DHS. "It is a powerful tool, but it has lots of implications, and people get nervous."
Staff writer Alice Lipowicz can be reached at alipowicz@1105govinfo.com.



