As companies continue to advance Big Data projects, they are realizing that sophisticated data analytics that combine structured and unstructured data might give them their best chance yet to pinpoint fraud, waste, and abuse by employees.

The hybrid approach “builds a better mousetrap,” said Samir Hans, an analytics principal with Deloitte's financial advisory services. “Just using structured data puts the accuracy of your model at about 70 percent,” he said. “Adding additional data points from e-mails or tweets—people actually do brag about what they do on social media—you can increase your model to maybe about 80 percent, with fewer false positives.”

Several providers are releasing promising tools that apply Big Data techniques to sources such as social media sites. “Five years ago there were not the specialized tools focused on forensic collection from social media, or even the ability to aggregate across social media providers,” said Johnny Lee, managing director of Grant Thornton's forensic technology services.

Grant Thornton has collaborated with X1, which provides social media discovery and Web collection tools, on numerous investigations. Lee points to Equal Employment Opportunity Commission cases in which employers made their case using information obtained from Facebook: “These generally have to do with allegations against an employer where harm is alleged and the inability to do the job, when evidence, such as pictures on Facebook of the alleged victim skiing, clearly demonstrates that quality of life was not impugned,” says Lee. “And courts have been generally lenient about allowing Facebook claims.”

Social media discovery is useful in detecting classic fraudster behaviors, said Cynthia Hetherington, principal of an intelligence, security, and investigations consultancy, who previously led Aon Consulting's Corporate Strategic Intelligence group. It can, for example, indicate that an employee keeps coming to work when they otherwise wouldn't because they don't want their scamming to be discovered. “Fraudsters show up to work every day, since they feel the need to cover. But they'll be on social networks talking about being terribly ill and going to work anyway, or that the rest of the family is going away but they aren't.” She said these types of posts can raise red flags for fraud.

Fraudsters also reveal intent through their word choices, however unwittingly. “No one says ‘I'm going to defraud,' but might say ‘I'm gonna get one over on these guys,'” said Hetherington, while an embezzler who covers his tracks at work may brag on Facebook that he “made a killing on the stock market” or “came into some cash.”

Marketers use text analytics to gauge consumer sentiment toward a product; compliance officers can use it to spot suspicious behavior. One of the more mature tools for text analytics is computer-assisted review (CAR), used chiefly by corporate counsel to code documents for e-discovery and disclosure. CAR users set the parameters, which can include communications between two specific individuals, between employees in specific functions (such as accounts payable) and vendors, or around word associations. It can also spot behavior around a document, such as someone forwarding sensitive information to a personal e-mail account. The software “trains” on a subset of documents, then scores the remaining documents with impressive accuracy and speed. In the Department of Justice investigation of Verizon's acquisition of MCI, for example, CAR reviewed 1.6 million documents in a week, with greater accuracy than a team of lawyers working 16-hour days for four weeks.
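
To make the "train on a coded subset, then score the rest" workflow concrete, here is a minimal sketch of a CAR-style classifier. It assumes Python and scikit-learn rather than any of the commercial platforms discussed here, and the documents, labels, and scores are hypothetical.

```python
# Minimal sketch of a CAR-style "train on a coded subset, score the rest" workflow.
# Uses scikit-learn for illustration; real e-discovery platforms use their own engines.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical reviewer-coded seed set: 1 = responsive/suspicious, 0 = not.
seed_docs = [
    "please route this invoice through the new vendor account",
    "lunch menu for the quarterly all-hands meeting",
    "forwarding the pricing sheet to my personal gmail before the audit",
    "reminder: submit expense reports by Friday",
]
seed_labels = [1, 0, 1, 0]

# Train a simple text classifier on the coded subset.
vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=1)
X_seed = vectorizer.fit_transform(seed_docs)
model = LogisticRegression().fit(X_seed, seed_labels)

# Score the remaining corpus and surface the highest-risk documents for human review.
corpus = [
    "can we keep this payment off the books until next quarter",
    "agenda for tomorrow's training session",
]
scores = model.predict_proba(vectorizer.transform(corpus))[:, 1]
for doc, score in sorted(zip(corpus, scores), key=lambda p: -p[1]):
    print(f"{score:.2f}  {doc}")
```

In an actual review, the scored documents would go back to human reviewers, and the model would be retrained as coding decisions accumulate.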

Companies can also trace and search instant messaging and text messaging—presuming the computer or mobile device belongs to the company. That is not a given in the bring-your-own-device (BYOD) environment. Some companies are requiring employees who use their own devices for work to sign an agreement to let the company search them if suspicion of wrongdoing arises.

Recognizing Patterned Behavior

Text and keyword analytics are just one level of scrutiny; deviation from normal behavior is another. “Everyone works in a pattern that can be monitored over time—they may keep e-mails forever or delete them once a week,” said Hans, and a sudden deviation can signal a problem.

“Looking for changes in communications patterns is becoming more common,” says Doug Clare, vice president of fraud solutions at data analytics and credit-score provider FICO. “If a couple of traders are collaborating to move a market in a certain direction, you may find that they very frequently communicate by e-mail, and all of a sudden they communicate by phone because voice is not captured.” One of FICO's claims to fame is its profile-based analytics which can detect deviations from a credit card user's normal behavior. FICO has recently begun testing the same capability to detect insider fraud.
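
As a rough illustration of that idea, and not of FICO's actual models, the sketch below flags a week in which e-mail volume between two colleagues collapses relative to their historical baseline; the counts and the z-score threshold are hypothetical.

```python
# Sketch: flag a week whose e-mail volume between two employees deviates sharply
# from their historical baseline (hypothetical counts; real profile-based
# analytics such as FICO's use far richer behavioral features).
from statistics import mean, stdev

weekly_email_counts = [34, 29, 41, 37, 33, 38, 3]  # last value: sudden shift to phone-only contact

baseline, current = weekly_email_counts[:-1], weekly_email_counts[-1]
mu, sigma = mean(baseline), stdev(baseline)
z = (current - mu) / sigma

if abs(z) > 3:  # illustrative threshold, not a vendor default
    print(f"Alert: communication volume z-score {z:.1f} deviates from baseline of {mu:.0f}/week")
```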

Sometimes it is not deviant behavior, but a consistent pattern that raises the alarm, says Clare. “You might have call center notes where the collectors are fraud agents letting certain customers off the hook, or turning a blind eye to certain customers or categories of customers,” perhaps involving a specific third-party provider like an auto body shop.

The hybrid structured-unstructured approach is also useful for recognizing collusive relationships. FICO recently acquired Infoglide, which specializes in social network analysis; in this context, social networks comprise the more informal relationships and interconnections inside or outside an organization. The technology mines multiple structured and unstructured inputs for linkages and is used by both the Department of Homeland Security and the Transportation Security Administration.
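
A toy version of that kind of link analysis, and emphatically not Infoglide's algorithms, might simply fuse employee-vendor links drawn from structured payment records and unstructured e-mail metadata, then flag pairs connected through more than one channel. All names and mappings below are made up.

```python
# Sketch: fuse links from a structured source (payment approvals) and an
# unstructured one (e-mail metadata) and flag employee-vendor pairs that
# appear in both. Hypothetical data throughout.
from collections import defaultdict

payment_approvals = [("alice", "Acme Body Shop"), ("bob", "Acme Body Shop")]
email_contacts = [("alice", "owner@acmebodyshop.example"), ("carol", "sales@widgets.example")]
email_domain_to_vendor = {"acmebodyshop.example": "Acme Body Shop"}  # assumed mapping

links = defaultdict(set)
for employee, vendor in payment_approvals:
    links[(employee, vendor)].add("payments")
for employee, address in email_contacts:
    vendor = email_domain_to_vendor.get(address.split("@")[1])
    if vendor:
        links[(employee, vendor)].add("e-mail")

# Pairs connected through more than one channel may merit a closer look.
for (employee, vendor), sources in links.items():
    if len(sources) > 1:
        print(f"Review: {employee} <-> {vendor} linked via {sorted(sources)}")
```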

Oftentimes the “smoking gun” in a kickback case is a year-old e-mail or text message, and “the unstructured data analytics is arguably more important than the data mining of the structured data,” says Stephanie Giammarco, a partner and Forensic Technology Services practice leader at BDO Consulting. “They tell a story together.” But, Giammarco adds, the technology never stands alone. “The technology is imperative in fraud investigations, but it doesn't stand alone—it's still driven by the people behind it,” she says, be they programming the technology or responding to a red flag.

An enterprise accustomed to purchasing all of its applications from SAP or Oracle will find itself cobbling together a security system from numerous vendors. Tools exist for distinct sources (e-mail and social media, for example) and for industry verticals, such as pharmaceuticals or financial services.

USING BIG DATA ANALYTICS

The following is excerpted from a report by EMC Consulting on using Big Data to detect fraud.

EMC Consulting integrates solutions from Greenplum and RSA with VMware technology to provide one of the most advanced fraud detection solutions available in the marketplace. This solution detects threats by first creating a baseline against which data can be evaluated. The solution manages, assesses, and analyzes a data pool of potentially billions of daily incidents to update existing fraud and waste analytics models, and to create new ones based upon unusual patterns found in the data. These results are then used to define the normal patterns of network behavior. The second step is to deploy the new models to detect anomalies in the traffic that are indicative of fraud or waste.

STEP 1: DEVELOP A “NORMAL” BASELINE

The EMC Consulting solution uses the Greenplum Unified Analytics Platform to ingest the petabytes of structured and unstructured data that comprise your normal network traffic. Our library of network analytics fraud algorithms and our team of skilled data scientists refine these algorithms to your specific needs. The result is the ability to identify a clear pattern of normal activity and to flag activities that fall outside these normal parameters.
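
For illustration only, the following sketch shows the general shape of such a baseline: per-host summary statistics computed from flow records. It assumes Python and toy data, not the Greenplum platform or EMC's actual algorithms.

```python
# Generic illustration of Step 1: summarize "normal" per-host traffic into a
# baseline profile. The described platform works at petabyte scale; this is a toy.
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical flow records: (host, bytes transferred in one hour).
flows = [("10.0.0.5", 1200), ("10.0.0.5", 1350), ("10.0.0.5", 1100),
         ("10.0.0.9", 480), ("10.0.0.9", 530), ("10.0.0.9", 510)]

per_host = defaultdict(list)
for host, nbytes in flows:
    per_host[host].append(nbytes)

# Baseline = mean and standard deviation of hourly volume per host.
baseline = {host: (mean(v), stdev(v)) for host, v in per_host.items()}
print(baseline)
```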

STEP 2: DEPLOY ANALYTIC MODELS

The second step is to deploy the resulting analytic models (and associated business rules) to detect anomalies in the data that may be indicative of waste or fraud. We use VMware GemFire technology to stream network traffic information in real time, integrated with RSA NetWitness for evaluation of configured security definitions. RSA NetWitness enables the analysis of network traffic in near real time and uses the RSA enVision suite to report, route, and resolve incidents as they occur. Finally, we deploy a Splunk capability to add intelligence to the logging process and leverage analytical insights to route suspect activities accordingly.
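
Continuing the toy example from Step 1, a generic, non-vendor version of the scoring step might compare each incoming event against the stored baseline and emit an alert when it deviates sharply; no GemFire, NetWitness, or Splunk APIs are used, and the threshold and figures are again hypothetical.

```python
# Generic illustration of Step 2: score incoming traffic against the Step 1
# baseline and route anomalies for review.
def score_event(host, nbytes, baseline, threshold=3.0):
    """Return an alert string if the event deviates sharply from the host's baseline."""
    if host not in baseline:
        return f"Review: no baseline for {host}"
    mu, sigma = baseline[host]
    z = (nbytes - mu) / sigma if sigma else 0.0
    if abs(z) > threshold:
        return f"Alert: {host} moved {nbytes} bytes (z-score {z:.1f})"
    return None

baseline = {"10.0.0.5": (1216.7, 125.8)}  # values from the Step 1 sketch
for host, nbytes in [("10.0.0.5", 980), ("10.0.0.5", 9800)]:
    alert = score_event(host, nbytes, baseline)
    if alert:
        print(alert)
```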

Source: EMC Consulting.

Grant Thornton's Lee points to a short list of tools that can be used to fight fraud:

· Data leakage prevention (DLP) tools, which can, for example, recognize sensitive data (the formula for Coca-Cola, for instance) in a semantic string (a minimal pattern-matching sketch follows this list);

· Social media forensic collection tools such as X1 Social Discovery;

· e-Discovery software; and

· A host of due diligence tools, such as corruption index databases.
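
As referenced in the first item above, here is a minimal pattern-matching sketch of the DLP idea, scanning outbound text for markers of sensitive material. Commercial DLP products rely on document fingerprinting and semantic analysis rather than simple regexes; the code name and patterns below are hypothetical.

```python
# Toy DLP-style scan: flag outbound text containing markers of sensitive material.
import re

SENSITIVE_PATTERNS = [
    re.compile(r"\bproject\s+merchandise\s*7x\b", re.IGNORECASE),  # hypothetical code name
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),                          # SSN-like number pattern
]

def scan_outbound(message: str) -> list[str]:
    """Return the patterns matched in an outbound message."""
    return [p.pattern for p in SENSITIVE_PATTERNS if p.search(message)]

hits = scan_outbound("Attaching the Project Merchandise 7X ingredient ratios to my gmail.")
if hits:
    print("Blocked: sensitive content matched", hits)
```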

With many different providers, companies can end up with a patchwork of systems that don't communicate with each other well. “All the time our clients say to us, ‘We've been working with 15 vendors,'” said Hans. Deloitte advises starting with the pain points, then creating a roadmap. “You don't start with a solution first.”

While sophisticated technology is certainly boosting companies' ability to combat fraud, some experts say it shouldn't completely replace some older, low-tech methods, such as fraud reporting hotlines and common-sense thinking. “I think the most basic things to prevent or detect fraud really don't have much to do with technology,” said Stephen Leggett, senior vice president of FINEX North America, the Willis Group's executive risks practice.

The overwhelming majority of fraud at FINEX client companies, said Leggett, is detected through tips to old-school confidential hotlines. But hotlines work only if they are in place and employees are aware of them. No Koss Corp. employee raised the alarm when CFO Sujata Sachdeva used her office to store mountains of designer clothing; American Express did, when it discovered large transfers from corporate accounts to pay Sachdeva's credit cards. Sachdeva had embezzled nearly $20 million, essentially wiping out Koss' profit margin.

Leggett is not technology averse, but points out that “In my space, 50 percent of the losses we see involve employee-vendor fraud. The basic tools are segregation of duties, and a preapproved list of vendors, which stops an employee from creating a bogus vendor and putting invoices under it.”

So yes, enterprise-grade technology exists to use Big Data against employee fraud, waste, and abuse. But it is not plug-and-play, and it is likely impractical for small companies.