Just how much data are companies generating these days? Well, the numbers are measured in exabytes. One exabyte is a billion gigabytes: a 1 followed by 18 zeros, in bytes.

From the dawn of civilization until 2003, humankind generated roughly five exabytes of data. These days, according to Google Chairman Eric Schmidt, we produce five exabytes every two days, and the pace is accelerating. To put that number in perspective, one exabyte is equivalent to 4,000 times the information stored by the U.S. Library of Congress.

In other words, almost all organizations are generating mountains of data that they then have to store and manage, a trend that's led to the term “Big Data.” While it's difficult to find a universally accepted definition of Big Data, this one from research firm Gartner gets at the challenges the emerging technology poses: “Big data is high-volume, high-velocity, and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.”

The convergence of several factors is behind the exponentially expanding volumes of data. Technology advances, of course, have played a significant role. Data processing and storage costs continue to decline, while analytical tools have become far more sophisticated.

Stockpiling data doesn't do much good, though, unless companies can harness its power. “A number of companies have demonstrated the value of data,” says Lothar Determann, a lawyer with Baker & McKenzie and author of the book “Determann's Field Guide to International Data Privacy Law Compliance.” For instance, companies that mine the information contained within their customer loyalty programs often find ways to better reach these customers and boost their top lines.

While data can provide value, generating and retaining large volumes of it can also create many challenges for compliance officers, who need to work with their colleagues across the company to manage, store, and dispose of data securely, all while complying with a patchwork of state and federal regulations.

This responsibility is complicated by the fact that data is generated by many people and flows into and out of an organization at multiple points, says Michael Rappa, professor and executive director with the Institute for Advanced Analytics at North Carolina State University. He provides an example from academia: When a prospective student applies to most colleges, an established data management process is activated to protect the personal information contained in the application.

That same process, however, might not be triggered when an applicant decides to e-mail a professor his or her transcript. “Then it's outside the compliance rules,” Rappa says. While most professors would treat the data carefully, the lack of an established process can expose organizations to compliance risks.

Compliance must balance the needs of data users inside the company with data security and regulatory risks. “It's the typical dynamic, where the data user community within a company, such as marketing, wants more data and no constraints,” Determann says. “Compliance and legal would prefer less personal information wherever possible.” The goal is to manage these competing interests and determine which data has enough value that the benefits of retaining it outweigh the potential liabilities.

Getting Past Resistance

To effectively manage their organizations' data streams, compliance officers need to first work through their colleagues' resistance and lack of understanding. Many departments may be nervous about sharing their data practices—or lack thereof—with compliance, worried that they will be singled out for having mishandled their responsibility. That's especially the case when the compliance department historically has been seen as the “corporate cop,” says Michael Rasmussen, chief GRC pundit with research firm GRC 20/20 Research.

ABOUT THIS SERIES

Compliance Week's six-part series, “The Lifecycle of Information Governance,” sponsored by HP Autonomy, will examine all the elements of handling information properly, from creation to storage to destruction, and how compliance departments should address each element.

Part 1: Crafting an Effective Data Security Policy, Feb. 12

Part 2: Catching and Managing New Data, Feb. 20

Part 3: Get Data Classification Right First

Part 4: Protecting Data From Inside & Outside Threats, March 12

Parts 5 and 6: To Be Announced

Complicating this is the fact that employees outside the compliance, legal, or data security areas may lack an understanding of the risks data can pose. “Companies are collecting and amassing data, in many cases without a clear vision of why or what to do with it, nor the liabilities or opportunities it presents,” Determann says.

Politics also plays a role in making it difficult to break down a siloed approach to data management. “People want to protect their turf,” Rasmussen adds.

To gain buy-in, compliance also needs to let employees know how solid data management practices can make their jobs easier, says Jesse Wilkins, director of research and development with AIIM, a global group for information professionals. “Answer the question, ‘What's in it for me?’” For example, proper data management should make it easier and more efficient for workers to access updated, accurate information.

Training also can help employees in other areas gain an understanding of the regulations with which the company has to comply, Determann says.

While compliance wants to work effectively with employees, it also may need to ask tough questions. Employees should provide enough information that compliance can get an idea of the benefits and risks of the data they're amassing, Determann says. Before anyone in the company develops a new application or database, he or she should answer a few key questions: Why is the data being created in the first place? What will the unit do with the information? What promises are being made to others regarding the way in which the data will be used and protected? Gathering this information through a brief questionnaire can help to highlight emerging risk issues, he says.


McKinsey Global Institute studied big data in five domains—healthcare in the United States, the public sector in Europe, retail in the United States, and manufacturing and personal-location data globally. The research offers seven key insights.

1. Data have swept into every industry and business function and are now an important factor of production, alongside labor and capital. We estimate that, by 2009, nearly all sectors in the US economy had at least an average of 200 terabytes of stored data (twice the size of US retailer Wal-Mart's data warehouse in 1999) per company with more than 1,000 employees.

2. There are five broad ways in which using big data can create value. First, big data can unlock significant value by making information transparent and usable at much higher frequency. Second, as organizations create and store more transactional data in digital form, they can collect more accurate and detailed performance information on everything from product inventories to sick days, and therefore expose variability and boost performance. Leading companies are using data collection and analysis to conduct controlled experiments to make better management decisions; others are using data for basic low-frequency forecasting to high-frequency nowcasting to adjust their business levers just in time. Third, big data allows ever-narrower segmentation of customers and therefore much more precisely tailored products or services. Fourth, sophisticated analytics can substantially improve decision-making. Finally, big data can be used to improve the development of the next generation of products and services. For instance, manufacturers are using data obtained from sensors embedded in products to create innovative after-sales service offerings such as proactive maintenance (preventive measures that take place before a failure occurs or is even noticed).

3. The use of big data will become a key basis of competition and growth for individual firms. From the standpoint of competitiveness and the potential capture of value, all companies need to take big data seriously. In most industries, established competitors and new entrants alike will leverage data-driven strategies to innovate, compete, and capture value from deep and up-to-real-time information. Indeed, we found early examples of such use of data in every sector we examined.

4. The use of big data will underpin new waves of productivity growth and consumer surplus. For example, we estimate that a retailer using big data to the full has the potential to increase its operating margin by more than 60 percent. Big data offers considerable benefits to consumers as well as to companies and organizations. For instance, services enabled by personal-location data can allow consumers to capture $600 billion in economic surplus.

5. While the use of big data will matter across sectors, some sectors are set for greater gains. We compared the historical productivity of sectors in the United States with the potential of these sectors to capture value from big data (using an index that combines several quantitative metrics), and found that the opportunities and challenges vary from sector to sector. The computer and electronic products and information sectors, as well as finance and insurance, and government are poised to gain substantially from the use of big data.

6. There will be a shortage of talent necessary for organizations to take advantage of big data. By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.

7. Several issues will have to be addressed to capture the full potential of big data. Policies related to privacy, security, intellectual property, and even liability will need to be addressed in a big data world. Organizations need not only to put the right talent and technology in place but also structure workflows and incentives to optimize the use of big data. Access to data is critical—companies will increasingly need to integrate information from multiple data sources, often from third parties, and the incentives have to be in place to enable this.

Source: McKinsey Global Institute.

Data Ownership

Another thorny question is determining just who within an organization “owns” the data. Employees in other areas of an organization may assume that compliance is the owner—particularly when a security breach occurs. “Too often, compliance is the scapegoat,” Rasmussen says.

Many experts agree that data belongs to the people who generate it and, in some cases, decide how long to keep it. Usually, that means the business units. “Ultimately, the business units have ownership, strongly informed by compliance, and compliance's understanding of regulations and requirements,” Wilkins says.

That's especially true when a unit decides to hold on to data for longer than is required to meet regulations. A marketing department, for example, may decide to keep customer data for ten years in order to perform trend analysis, even after compliance has confirmed data older than two years can be disposed of and has informed the department of the need to continue protecting the older data.

What Technology?

Along with determining the organizational structure that a company will use to manage data streams, compliance also needs to evaluate the technology it has in place and assess data-management capabilities. For example, does the company have technology that will allow it to locate and classify data scattered across the enterprise?

Data classification can be based on the level of protection required. The data may be considered open, confidential, or for review by board members only, Rasmussen says. Or, the data could be categorized by type, such as financial records, client information, or the company's intellectual property.
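
For readers who want to see what such a scheme might look like in practice, here is a minimal Python sketch of the two classification axes described above. It is an illustration only, not a description of any vendor's product or of a method the experts quoted here recommend. The class names, the `DataAsset` record, the `group_by_protection` helper, and the sample inventory are all hypothetical; only the category labels come from the article.

```python
from dataclasses import dataclass
from enum import Enum


class Protection(Enum):
    """Protection levels mentioned in the article."""
    OPEN = "open"
    CONFIDENTIAL = "confidential"
    BOARD_ONLY = "board only"


class DataType(Enum):
    """Data types mentioned in the article."""
    FINANCIAL_RECORD = "financial record"
    CLIENT_INFORMATION = "client information"
    INTELLECTUAL_PROPERTY = "intellectual property"


@dataclass(frozen=True)
class DataAsset:
    """One entry in a hypothetical data inventory."""
    name: str            # label for the data set (made up for this example)
    owner: str           # business unit that owns the data
    protection: Protection
    data_type: DataType


def group_by_protection(assets):
    """Map each protection level to the names of the assets classified under it."""
    groups = {level: [] for level in Protection}
    for asset in assets:
        groups[asset.protection].append(asset.name)
    return groups


# Hypothetical inventory, for illustration only.
inventory = [
    DataAsset("loyalty_program_customers", "marketing",
              Protection.CONFIDENTIAL, DataType.CLIENT_INFORMATION),
    DataAsset("product_designs", "engineering",
              Protection.CONFIDENTIAL, DataType.INTELLECTUAL_PROPERTY),
    DataAsset("published_annual_report", "finance",
              Protection.OPEN, DataType.FINANCIAL_RECORD),
    DataAsset("draft_quarterly_results", "finance",
              Protection.BOARD_ONLY, DataType.FINANCIAL_RECORD),
]

if __name__ == "__main__":
    for level, names in group_by_protection(inventory).items():
        print(level.value, names)
```

Recording the owner alongside both classifications keeps the question of who is responsible for each data set attached to the data itself, which matters when a business unit, rather than compliance, decides how long to retain it.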

“Work with IT to understand if the solutions in place will make it less likely for compliance issues to take place,” Wilkins says. 

Another critical step is to regularly audit the type and amount of data the organization is collecting and the steps it's taking to manage it, Determann says. This can be accomplished through a combination of technical tools and interviews at the user level. For example, compliance can talk with human resources to determine the type of data they're keeping and the technology solutions they're using to manage it.

Ultimately, the goal is to build the concepts of data security and privacy into the organization's DNA, Determann says.

Accomplishing this becomes increasingly important as organizations generate more data, much of which can be valuable. “Information may have strong strategic value,” Rappa says. Compliance needs to work with other areas to ensure that the organization is making the best use of it, while also meeting regulatory and compliance obligations.