Forget the cloud; Big Data is the new new thing, and it could have major implications for internal audit.

More organizations are exploring what sort of business intelligence they can derive from all the information at their disposal. Although the application of Big Data is in its infancy, internal auditors and compliance professionals are paying close attention to how it is evolving and, more critically, how they can put it to work. The amount of information companies are now capturing and its potential value cannot be ignored.

The emergence of Big Data is based on three developments: an explosion in the amount of data available, a dramatic reduction in the cost of storing that data, and a dramatic improvement in the tools available to analyze it. For example, there have been tremendous improvements in the ability to aggregate data and big advances in predictive analytics, all of which present a tremendous opportunity to the internal auditor.

But what exactly is “Big Data”? Big Data is not just about size; it is about the analytics of that data. Here are some basics:

Big Data encompasses unstructured in addition to structured data. This includes non-numerical data like e-mails, audio, video, images, free text, and social media content. There are estimates that 80 percent of all data is unstructured.

Traditional data systems and data warehouses don't handle Big Data very well because they can't handle the variety of data; data is becoming less structured, in part because it evolves very quickly. The ability to converge multiple data sources, structured and unstructured, is new.

Big Data analytics means that unstructured data can now be mined. The ability to produce new types of information (and insights) in real time is a powerful advancement.

Some commentators go so far as to intimate that Big Data offers the promise of a style of computing that more closely mimics the functioning of the human mind: taking in data from many different sources, forming thoughts, and making close-in-time decisions. We're not there yet, but the potential is real.

The Big Data approach represents a substantial shift in how data is viewed and handled. Instead of painstakingly creating a clean subset of data to place in a warehouse, where it can be queried in only a limited number of predetermined ways, Big Data collects all the sources of records and information an organization generates, and then allows business managers and analysts to worry about how to use the data later.

Apply What You Know

While companies are just starting to consider how to unlock the potential of Big Data, data analytics itself is not new to the audit profession. Auditors can start with a basic understanding of data warehousing and add knowledge of how Big Data capabilities can supply what the classic data warehouse doesn't offer.

Data warehousing works with a limited number of data sources. Big Data analytics has the power to combine disparate sources to deliver information that was previously unattainable. The classic data warehouse user sets up queries and gets results anywhere from a day to a week later, whereas the goal for many Big Data analytics processes is to deliver results to users in real time.

One of the keys to extracting useful information from unstructured data is the creation of a semantic data model that sits on top of the data and helps you make sense of it. Reference information can come from various sources, and different individuals may call the same thing by several different names. Semantic technology has the ability to infer that those things are in fact the same thing and then group them together. For example, someone might refer to Hewlett Packard in the data as ‘HP’ or ‘HP Corporation’ or some other variation. These all denote the same company, and by recording that equivalence within the semantic layer, you indicate that they are the same.
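The grouping idea can be illustrated with a minimal Python sketch. It is a deliberate simplification: a real semantic layer infers equivalences, whereas this example relies on a hand-built alias table. The table contents and the names `ALIASES`, `canonical_name`, and `group_by_entity` are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical alias table: normalized variant -> canonical entity name.
# A real semantic layer would infer these links rather than hard-code them.
ALIASES = {
    "hp": "Hewlett Packard",
    "hp corporation": "Hewlett Packard",
    "hewlett packard": "Hewlett Packard",
}


def canonical_name(raw: str) -> str:
    """Map a raw mention to its canonical name, or return it trimmed if unknown."""
    # Lowercase, then collapse punctuation and whitespace into single spaces.
    key = re.sub(r"[^a-z0-9]+", " ", raw.lower()).strip()
    return ALIASES.get(key, raw.strip())


def group_by_entity(mentions):
    """Group raw mentions under the canonical entity they refer to."""
    groups = defaultdict(list)
    for mention in mentions:
        groups[canonical_name(mention)].append(mention)
    return dict(groups)


if __name__ == "__main__":
    print(group_by_entity(["HP", "HP Corporation", "Hewlett-Packard", "Acme"]))
```

Once mentions share a canonical name, records from different sources can be joined on that name instead of on the inconsistent raw text.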

Internal auditors and compliance professionals are already aware of the challenges of working with company data and that for Big Data to work the following elements are essential:

Companies must be collecting all the data they're generating and tracking it somehow so they know they have it;

Multiple parties within the organization must be willing to share their data for others to see; and

Managers must be asking the right questions, so that when they examine all those troves of data, the right answer emerges.

Auditors strive to help business units develop monitoring reports, pulling information from different databases and looking at anomalies to help identify negative trends. They are familiar with the challenges involved: the time required to structure a meaningful comparison of data, the effort needed to refine the comparisons to eliminate false positives, and the time needed to investigate the legitimate discrepancies that emerge.

Big Data could essentially provide a vehicle for broader thinking about risks. With an understanding of the data that allows them to analyze those risks, auditors can better define which risks concern the organization and which policies and processes to monitor for compliance. They then look across both structured and unstructured data to select the items most likely to drive the risk or at least yield some insights about it.

Using analytics to help spot fraud, for example, is not new to the auditor. For some time, internal auditors have used software tools to manage large quantities of data, typically structured and numerical data. Now, Big Data tools can add non-numerical data, such as e-mail texts, employee files, and audio from hotline calls, to the mix, vastly expanding the reach of such monitoring. With Big Data capabilities, fraud examiners can parse larger and more varied types of data to identify meaningful anomalies. Predictive models can be developed to determine which anomalies and outliers are most likely to yield useful results.
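As a rough illustration of what flagging anomalies in structured, numerical data can look like, the Python sketch below screens a list of amounts with the modified z-score (based on the median and the median absolute deviation). This is a simple statistical screen, not a predictive model, and the function name `flag_outliers`, the sample amounts, and the 3.5 cutoff are assumptions made for the example rather than anything prescribed by the tools discussed here.

```python
from statistics import median


def flag_outliers(amounts, threshold=3.5):
    """Return indexes of values whose modified z-score exceeds the threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a few extreme values do not distort the baseline.
    """
    med = median(amounts)
    mad = median([abs(x - med) for x in amounts])
    if mad == 0:
        # At least half the values equal the median; the score is undefined.
        return []
    return [
        i for i, x in enumerate(amounts)
        if 0.6745 * abs(x - med) / mad > threshold
    ]


if __name__ == "__main__":
    # Hypothetical expense claims; one is far out of line with the rest.
    claims = [120, 130, 125, 118, 122, 5000, 127]
    print(flag_outliers(claims))
```

A flagged item is only a candidate for review; it still takes an auditor's judgment to decide whether it is a false positive or a discrepancy worth investigating.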

Privacy and Security

Big Data doesn't come without some caveats. Companies that collect and leverage Big Data may find that they also have “toxic data” on their hands. For example, imagine a wireless company that is collecting machine data (who's logged onto which towers, how long they're online, how much data they're using, whether they're moving or staying still) that can be used to provide insight into user behavior. There has already been negative publicity about a company that gathered additional user-generated data in the collection process, including passwords, credit card numbers, and Social Security numbers, which generated a firestorm of privacy concerns.

The potential correlations that can be drawn are seemingly limitless. If you collect enough data, you can start to correlate it so that those with access to it have a very complete picture of customers, users, or other individuals. It's not so much the individual data points as the ability to piece them together into a profile of a specific person. To a marketer or retailer, all of that information might be very useful, but it also represents a potentially significant liability on the collector's part if the data is used inappropriately or is the subject of a data breach.

A major retailer learned this recently. In attempting to influence shopping patterns, the retailer analyzed its customers' shopping habits using purchase data. The analysis suggested that when a couple is expecting a child, their shopping patterns are likely to shift. So the retailer used data analytics to determine by correlation whether particular customers were likely to be pregnant. Targeted advertisements were sent to those people as part of an effort to get them to come into the store.

That was fine until an irate father discovered, because of the marketing effort, that his teenage daughter was pregnant, something she had chosen not to disclose to him.

So how far is too far when using Big Data tools to assemble customer profiles and to provide personalized services or marketing? Just because a company possesses massive amounts of data doesn't necessarily mean that it needs to, or should, make use of all of it.

Internal audit and compliance, in their assurance activities, can help an organization strike the balance between useful targeting and the privacy-invading or otherwise inappropriate uses that Big Data can motivate. Organizations need to be aware of and sensitive to privacy restrictions, and to be careful about the aggregation and dissemination of this information.

Talent Management

Along with the organization itself, chief audit executives should be mindful that finding the right talent to analyze Big Data will be a significant challenge. Currently, most companies have a shortage of employees with strong statistical skills and a deep understanding of the company's business. Organizations will have to focus on data science, and on hiring statistical modelers, text-mining professionals, and specialists knowledgeable in sentiment analysis.

Without the right people, decisions based on automated reports and analysis from Big Data can prove faulty. Some believe that the talent gap has more to do with collaboration and bringing varying disciplines together. Implementation of Big Data is expected to build competence as the need for expertise is recognized. Internal audit shops should start identifying the skill sets and competencies they can recruit to supplement their current data analysts.

Internal auditors can proceed cautiously as the power of Big Data is still in its early stages. However, companies are eyeing the tipping point when the value in Big Data exceeds the cost of obtaining that data. And that tipping point will arrive as the cost of tools comes down and the value becomes more apparent—auditors and other assurance professionals need to be ready.