Without doubt, Big Data is on the rise. President Obama authorized $200 million in March to be spent on Big Data research for the federal government; the CIO of Walmart posted a video on YouTube just this month in which she proclaimed that “retail essentially is Big Data.” Any number of surveys find that executives are collecting more data and planning to spend more so they can analyze it.

Still, none of that means Big Data will be easy to embrace.

The reality is that Corporate America is mis-structured in all sorts of ways to take full advantage of what Big Data has to offer. Current methods of classifying and storing data, for example, can make it difficult for analytical software to find and extract the precise information it needs. Many employees have neither the skill nor the time to do Big Data analysis. The workflow processes companies use today might not even generate data you'd want to capture and put to use tomorrow.

The net result: Few companies are finding diamonds in their piles of data. More common are the sentiments expressed in a survey of 300 business executives conducted by Oracle. Respondents said they are missing out on opportunities that amount to an average 14 percent of annual revenue because they cannot leverage the data they have.

“You can see Big Data; you can feel it, you can taste it,” says Navin Ganeshan, vice president of products and technology for Centrifuge, maker of an analytics tool that creates visualizations out of large amounts of data. “But companies have a lot of trouble figuring out how to systematize it.”

The promise of Big Data is that it can combine multiple, disparate data sets to yield answers that could never come from studying separate piles of data alone. For example, you might uncover a payroll fraud by matching employee vacation schedules in the HR department with key-card access records in the facilities department, to detect someone entering the premises when he shouldn't be there.
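To make the vacation and key-card example concrete, here is a minimal sketch of that kind of cross-check. The employee IDs, dates, and record layouts are hypothetical stand-ins for whatever the HR and facilities systems actually export; a real match would need cleaner identifiers and far more data.

```python
from datetime import date

# Hypothetical HR export: employee ID -> list of (start, end) vacation ranges, inclusive.
vacations = {
    "E100": [(date(2012, 8, 6), date(2012, 8, 10))],
    "E200": [(date(2012, 8, 13), date(2012, 8, 17))],
}

# Hypothetical facilities export: (employee ID, date of key-card swipe).
badge_swipes = [
    ("E100", date(2012, 8, 3)),
    ("E100", date(2012, 8, 8)),   # falls inside E100's vacation
    ("E200", date(2012, 8, 20)),
    ("E300", date(2012, 8, 8)),   # no vacation on record for E300
]


def swipes_during_vacation(vacations, badge_swipes):
    """Return swipes recorded on a day the employee was scheduled to be away."""
    flagged = []
    for employee_id, swipe_date in badge_swipes:
        for start, end in vacations.get(employee_id, []):
            if start <= swipe_date <= end:
                flagged.append((employee_id, swipe_date))
                break  # one matching vacation range is enough to flag the swipe
    return flagged


flagged = swipes_during_vacation(vacations, badge_swipes)
print(flagged)
```

A flagged swipe is only a lead, not proof of fraud: the employee may simply have stopped by the office, which is why such matches still need a human to follow up.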

But even in that simple case, such data typically exists in silos that only a corporate IT department could piece together correctly. So now someone would need to tell the IT department to do so (that would be you, the compliance executive), and the IT department would need the time and manpower to spare (which it never does). And the prospect of incorporating useful data outside the company's ownership, like Twitter or Facebook posts? Dream on.

Incomplete access to data is the top obstacle to getting crisp analysis, according to a recent survey by Capgemini. “There is a tsunami of pent-up need within most companies,” says Goutham Belliappa, a principal at Capgemini, and “the challenges of data integration are not going to go away, as the [group] that manages the data will have more and more needs in the future.”

Now, compliance officers do stand a better chance than most in a company of getting their hands on multiple sets of data. But short of running an internal investigation (which has a wonderful way of driving cooperation), convincing others to provide wanted data can still be a battle.

“It totally makes sense to auto-generate reports, pulling in information from different databases, and then just looking at anomalies to help identify negative trends,” says Neil Frieser, vice president of internal audit for Frontier Communications. “But sometimes operations management doesn't want to be dragged along for the ride.”

The pushback, Frieser says, is largely the result of the extra time managers need to structure meaningful comparisons of data and then refine those comparisons to eliminate false positives, not to mention the time to further investigate any legitimate discrepancies that emerge.

In a recent study titled “The Future of Big Data,” the Pew Internet & American Life Project asked more than 1,000 people involved in technology how they imagine Big Data will evolve from now through 2020. The comments, all free-response answers, varied widely, but data availability and comparability were recurring themes.

“Big Data will not be so big,” said Jeff Eisenach, managing director for consulting firm Navigant Economics, because “most data will remain proprietary, or reside in incompatible formats and inaccessible databases where it cannot be used in real time.”

The People Problem

Even if all the data were perfectly organized and open, however, another reality is that companies will need actual human beings to use it—the fabled “technology worker who understands the business,” someone with strong statistical skills and a deep understanding of the company's business. Such persons are currently about as common as unicorns or the Chicago Cubs in the World Series.

Indeed, in the Capgemini survey, lack of skilled labor was the second-biggest obstacle cited by respondents. Without people to analyze the data intelligently, reaching useful conclusions to drive the business is nearly impossible, various respondents said.

ABOUT THIS SERIES

Compliance Week's exclusive four-part series on “Big Data” is examining the growing volume of information that companies are capturing and the tools they are building to harness mass volumes of diverse data at speeds once inconceivable. We'll look at ways it can be used to improve risk management, audit, and compliance, and the compliance officer's role in this landmark business transformation.

Part 1: Unlocking the Potential of Information, July 17
Part 2: Starting Small, Scaling Up, July 31
Part 3: Big Data Playing a Bigger Role in Fraud-Spotting, Aug. 14
Part 4: Big Data: For All Its Promise, Obstacles Remain Ahead, Aug. 28

Even worse: Without the right people, decisions that do get automated are more likely to lead to disaster. As Michael Knorr, head of data and integration services for Citigroup, commented in Capgemini's report: The more money that is at stake in a decision, the more important people are. For example, automated loan decisions can work well in the consumer market, where errors are relatively easy to correct with a phone call. In the commercial world, however, a loan that is erroneously rejected for a time-sensitive need like a shipment of cargo can lead to large losses.

The good news is that Big Data is improving on numerous fronts, with the next generation of tools aiming to make information attainable for dummies—or at least the ordinary business executive.

For starters, more efficient storage of data means more efficient retrieval, and more analysis in real time, says Neil McGovern, senior director of financial services product management at SAP. McGovern notes that data warehouses themselves have made great strides in the last 10 years, thanks to a number of advances including column databases that can organize data fields in relation to each other, and can also recode certain types of numbers such as zip codes to make them smaller. Nerdy? Yes, but the advances are still critical.
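For readers who want to see what those two ideas look like, the sketch below stores a handful of made-up customer records by column rather than by row, and then recodes the zip code column with dictionary encoding, one common way of shrinking repeated values. It is an illustration under those assumptions, not a description of how SAP's or any other vendor's product works.

```python
# Row-oriented layout: each record keeps all of its fields together.
# The records are invented for illustration.
rows = [
    {"customer": "A", "zip": "10001", "balance": 250.0},
    {"customer": "B", "zip": "60614", "balance": 75.5},
    {"customer": "C", "zip": "10001", "balance": 310.0},
    {"customer": "D", "zip": "10001", "balance": 42.0},
]


def to_columns(rows):
    """Pivot row records into one list per field (a column-oriented layout)."""
    return {field: [row[field] for row in rows] for field in rows[0]}


def dictionary_encode(values):
    """Replace each value with a small integer code plus a lookup table."""
    lookup = []   # code -> original value
    index = {}    # original value -> code
    codes = []
    for value in values:
        if value not in index:
            index[value] = len(lookup)
            lookup.append(value)
        codes.append(index[value])
    return lookup, codes


columns = to_columns(rows)
zip_lookup, zip_codes = dictionary_encode(columns["zip"])

# An aggregate over one field touches only that column, not every record.
total_balance = sum(columns["balance"])

# Decoding restores the original zip codes exactly.
decoded_zips = [zip_lookup[code] for code in zip_codes]

print(zip_lookup, zip_codes, total_balance)
```

The savings are trivial with four records, but the same recoding applied to millions of rows with a few thousand distinct zip codes is where the smaller footprint starts to matter.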

The next step, McGovern says, is to make analysis fast enough that programs can find and read data at its underlying source rather than in the data warehouse, which saves the step of loading it there.

SAP is among the many software heavy-hitters trying to achieve such breakthroughs, but a flock of startups are dedicated to Big Data as well. Some are focused on data visualization, such as Clearstory, backed by Google Ventures; others, including Centrifuge, aim to make data integration a non-issue by integrating pieces of data as they're created.

Such new tools could even help organizations already on the path to a Big Data strategy. Investment banks, for example, have long been using analytics to track possible breaches of the firewall meant to exist between research analysts and traders. “They've already built really sophisticated systems to flag any events that might be a firewall cross, such as a trader having a phone conversation with someone on other side of wall or two individuals being in same conference room at same time,” Ganeshan says.

But the price for that effectiveness? Compliance groups at such banks now get 7,000 to 10,000 alerts per day, an overwhelming number. Centrifuge tries to solve that problem by pulling in even more data, such as a trader's trading history, or history of conversations with particular people, to give a better sense of whether those encounters are illicit or acceptable. “Our focus is completely on enabling normal business users,” who are not necessarily tech experts, Ganeshan says.
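The general idea of using extra context to sort a flood of alerts can be sketched in a few lines. The example below is purely illustrative: the alert records, the history of previously cleared contacts, and the ranking rule are all assumptions made for this sketch, and it does not represent how Centrifuge's product scores alerts.

```python
from collections import Counter

# Hypothetical raw alerts: (alert ID, trader, research analyst).
alerts = [
    ("A1", "trader_1", "analyst_9"),
    ("A2", "trader_2", "analyst_4"),
    ("A3", "trader_1", "analyst_4"),
]

# Hypothetical context: past contacts that compliance already reviewed and cleared.
cleared_history = [
    ("trader_1", "analyst_9"),
    ("trader_1", "analyst_9"),
    ("trader_1", "analyst_9"),
    ("trader_2", "analyst_4"),
]


def prioritize(alerts, cleared_history):
    """Order alerts so pairs with the fewest cleared prior contacts come first."""
    prior = Counter(cleared_history)
    enriched = [
        {
            "alert_id": alert_id,
            "trader": trader,
            "analyst": analyst,
            "prior_cleared": prior[(trader, analyst)],  # 0 if never seen before
        }
        for alert_id, trader, analyst in alerts
    ]
    # Assumed rule for this sketch: unfamiliar pairings deserve a look first.
    return sorted(enriched, key=lambda alert: alert["prior_cleared"])


ranked = prioritize(alerts, cleared_history)
for alert in ranked:
    print(alert["alert_id"], alert["prior_cleared"])
```

Nothing is discarded here; the added history only changes the order in which a reviewer sees the alerts, which is a cautious way to cut through thousands of them without hiding any.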

For most businesses, however, the best Big Data strategy right now is to proceed cautiously. “There is a tipping point when the value in Big Data exceeds the cost of obtaining that data,” says SAP's McGovern. And that tipping point will arrive sooner or later, as the cost of tools comes down and the value becomes more apparent.