Statistical sampling is a powerful—and often misunderstood—tool that has a wide range of applications, from audit to testing product preferences to predicting the outcome of political elections.

My first exposure to the power of statistical sampling was when I was quite young and learned about the Nielsen television ratings. I wondered how accurate such ratings could be when our family didn't even have a Nielsen box in our home, so our viewing habits were completely ignored by the ratings.

Even after taking graduate courses on statistical methods and learning their mathematical foundations, I still marvel at how it is possible to examine a seemingly small sample and arrive at conclusions about the views or actions of a much larger population. The ubiquitous use of political polling is another example of the influence of sampling techniques. Just ask Nate Silver.

Statistical sampling plays an important role in the audit process, as well. The Institute of Internal Auditors practice advisory on audit sampling provides the core principles that auditors must understand in order to base conclusions and engagement results on appropriate analyses and evaluations. “Continuous auditing allows the internal auditor to test the whole population in a timely fashion, while audit sampling facilitates the selection of less than 100 percent of the population,” it notes.

The practice advisory also states that: “Audit sampling is used to provide factual evidence and a reasonable basis to draw conclusions about a population from which a sample is selected. The internal auditor should design and select an audit sample, perform audit procedures, and evaluate sample results to obtain sufficient, reliable, relevant, and useful audit evidence to achieve the engagement's objectives.”

Effective sampling can thus support the auditor in providing the reasonable assurance required during an attestation engagement. In forming an opinion or conclusion, auditors frequently do not examine *all* available information, because doing so may be impractical and valid conclusions can still be reached through audit sampling.

Auditors and compliance professionals should use sampling of various types, including judgmental sampling and statistical sampling, to support auditing and monitoring efforts. Both have a role in auditing and are techniques used by government regulators as well. A reasonable sampling strategy and appropriate sampling type will depend on the audit objective and complexity of the study.

**Types of Sampling**

Judgmental sampling is applied on the basis of knowledge of a certain problem that is being researched or reviewed. Broadly, it includes sampling by *non-statistical* methods, for instance pulling charts from a pile in a haphazard manner or picking medical records at will from a patient list. In small probe audits (used to determine if an issue even exists) and routine monitoring, haphazardly picking from a list can be perfectly reasonable. Not every sample needs to lend itself to rigorous statistical methodology. If a probe review indicates a problem exists, a bigger sample utilizing statistical methodologies may then be needed to determine the scope and nature of the matter.

Statistical sampling involves a more rigorous and mathematical approach than judgmental sampling, and it has the benefit of being defensible and rendering objective and statistically valid estimates, such as transaction overpayment extrapolations. Systematic sampling, one type of statistical sampling, is easy to understand and can be used if a group of units to sample, such as a list of transactions, can be numbered. This method selects sampling units from a frame or population according to a random start point and a fixed, periodic interval. For example, every fifth unit is selected for inspection.
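To make the "random start, fixed interval" idea concrete, here is a minimal Python sketch of systematic sampling. The function name `systematic_sample` and the list of 100 numbered transactions are hypothetical, chosen only for illustration.

```python
import random

def systematic_sample(units, interval, rng=None):
    """Select every `interval`-th unit after a random start point."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    rng = rng or random.Random()
    # The random start falls somewhere within the first interval.
    start = rng.randrange(interval)
    return units[start::interval]

# Hypothetical frame: 100 transactions numbered 1 through 100.
transactions = list(range(1, 101))
sample = systematic_sample(transactions, interval=5, rng=random.Random(42))
print(sample)
```

With an interval of five and a frame of 100 units, the sketch always returns 20 transactions spaced five apart. Only the starting position is left to chance.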

Simple random sampling, another type of statistical sampling, requires a randomizer typically available in statistical software packages. Microsoft Excel has a random number generation module, for example, in the data analysis section, and it is broadly available and familiar to many auditors and compliance professionals.
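For readers who work outside Excel, the same idea can be sketched with Python's standard `random` module. The invoice identifiers and the seed value below are made up for illustration. Recording the seed is one possible way to let a reviewer reproduce the selection later.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every unit has an equal chance."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame of 500 invoice identifiers.
invoice_ids = [f"INV-{i:04d}" for i in range(1, 501)]
picked = simple_random_sample(invoice_ids, n=30, seed=2024)
print(picked[:5])
```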

**Sampling Procedures**

Before we look at the basic steps to consider when performing audit sampling, let's first define a few key terms: In statistics and scientific-based fields, the accuracy of a measurement system is the degree of closeness of measurements of a quantity to that quantity's actual (or true) value. The precision of a measurement is the degree to which repeated measurements under unchanged conditions show the same results.

A measurement system can be accurate but not precise, precise but not accurate, neither, or both. If an experiment contains a systematic error, for example, then increasing the sample size generally increases precision but does not improve accuracy. The result would be a consistent yet inaccurate string of results from the flawed study. Eliminating the systematic error improves accuracy but does not impact precision. In designing the sampling approach, the auditor will want to avoid systematic error (am I measuring the right thing?) while achieving reasonable precision (what sample size do I need to obtain the precision I am seeking?).
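A small simulation can illustrate the distinction. This sketch assumes a true value of 100, a fixed systematic error of 5, and normally distributed random noise. All three numbers are invented for the example. It repeats a study many times at two sample sizes and compares where the sample means center and how widely they spread.

```python
import random
import statistics

TRUE_VALUE = 100.0  # hypothetical true amount
BIAS = 5.0          # hypothetical systematic error in every measurement
NOISE_SD = 10.0     # hypothetical random measurement noise

def sample_mean(n, rng):
    """Mean of n biased, noisy measurements."""
    return statistics.fmean(
        TRUE_VALUE + BIAS + rng.gauss(0, NOISE_SD) for _ in range(n)
    )

def summarize(n, repeats=500, seed=1):
    """Repeat the study and report the center and spread of the sample means."""
    rng = random.Random(seed)
    means = [sample_mean(n, rng) for _ in range(repeats)]
    return statistics.fmean(means), statistics.stdev(means)

small_center, small_spread = summarize(n=10)
large_center, large_spread = summarize(n=400)
print(f"n=10:  center={small_center:.2f}, spread={small_spread:.2f}")
print(f"n=400: center={large_center:.2f}, spread={large_spread:.2f}")
```

The larger sample gives a much tighter spread (better precision), but both runs center near 105 rather than 100. The systematic error survives any increase in sample size.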

The following are basic steps to consider when performing audit sampling:

*Identify your sampling frame*: This can be the specific population to be reviewed. For example, the audit could involve transactions for a particular service provided during a particular time period rather than the universe of all claims. The sampling frame can comprise the characteristic to be measured to determine whether the transactions were performed correctly or not. For a compliance audit, this would require applying laws, regulations, and company policy pertaining to the class of transactions being reviewed. Incorrectly designating the sampling frame can lead to a flawed sample that would be inappropriate to extrapolate to the population.

*Conduct an analytical examination of the sampling frame*: Are there distinctive features of the sampling frame to consider? For instance, are the dollar amounts of equal or similar values? Are there outliers, such as a few extremely high or extremely low values in the frame? This analysis would have bearing as to what the sample design will be.

*Identify your sample design*: The analysis of the sampling frame will help determine if you should use a simple random sample or a stratified random sample. If the dollar values of the sampling frame vary by large amounts, for example, normally a stratified sample design is preferred. If all frame units have similar dollar amounts, a simple random sample is used. For a well-designed sample methodology, take the time to look at the data, perform a frequency distribution of the sampling frame, and look at different sampling methodologies that may fit the data. A poorly designed sample will result in a poor precision of the estimate being tested.

*Determine your sample size*: A small probe sample is often taken to determine whether a problem exists or not. The results of a small probe sample are used as input into a statistical package, which requires some statistics from the probe sample and population. The sample size estimator uses the sampling frame size, the mean, and the standard deviation from the probe sample as input. The sample size estimator will produce a range of sample sizes, given the desired precision and confidence level. Assuming accuracy in the sampling frame and design, the larger the sample size, the better the precision of the estimate.

*Generate random numbers*: Once your sample size has been determined, a set of random numbers can be generated. Statistical software typically offers several options for generating random numbers based on sample design and the physical structure of the sampling frame and has the capability to output the set of random numbers to Access, Excel, or a text file. You will need to output the random numbers to the same file format as your sampling frame in order to extract the sample.
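The sample-size step can be sketched in a few lines of Python. This is not the formula used by any particular statistical package. It is a common textbook approximation for estimating a mean, with a finite population correction, and it assumes a simple random sample and a normal approximation. The probe values, the frame size of 5,000, and the function name `estimate_sample_size` are all hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def estimate_sample_size(probe_values, frame_size, rel_precision, confidence):
    """Approximate the sample size needed to estimate a mean.

    rel_precision is the desired margin of error as a fraction of the probe
    mean; confidence is the two-sided confidence level (for example 0.90).
    """
    m = mean(probe_values)
    s = stdev(probe_values)
    if m == 0:
        raise ValueError("probe mean is zero; relative precision is undefined")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = rel_precision * abs(m)
    n0 = (z * s / margin) ** 2       # size for an effectively infinite frame
    n = n0 / (1 + n0 / frame_size)   # finite population correction
    return min(frame_size, math.ceil(n))

# Hypothetical probe sample: overpayment found on each of 10 transactions.
probe = [0.0, 12.5, 0.0, 40.0, 8.0, 0.0, 25.0, 0.0, 15.0, 5.0]
FRAME_SIZE = 5000

# A range of sample sizes for different precision and confidence choices.
size_table = {
    (conf, prec): estimate_sample_size(probe, FRAME_SIZE, prec, conf)
    for conf in (0.90, 0.95)
    for prec in (0.05, 0.10)
}
for (conf, prec), n in sorted(size_table.items()):
    print(f"confidence={conf:.0%}, precision=+/-{prec:.0%}: n={n}")
```

Tighter precision or a higher confidence level pushes the required sample size up, which is the trade-off the auditor weighs against the cost of reviewing each unit.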

Keep in mind that such software removes the need to generate a source of random numbers or to understand statistical formulas, but it does not perform the steps for designing an effective sampling methodology, such as deciding on simple random sampling or stratified random sampling, nor does it create the sampling frame from which the sample will be drawn.
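The design decision itself can still be expressed simply once it has been made. The following sketch shows one possible stratified random sample that groups units by dollar amount and allocates the sample in proportion to each stratum's size. The dollar boundaries, the frame, and the function names are hypothetical, and proportional allocation is only one of several allocation schemes an auditor might choose.

```python
import random
from collections import defaultdict

def stratum_of(amount, boundaries):
    """Return the index of the first upper boundary the amount falls under."""
    for i, upper in enumerate(boundaries):
        if amount < upper:
            return i
    return len(boundaries)

def stratified_sample(frame, boundaries, total_n, seed=None):
    """Draw a simple random sample within each dollar-value stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit_id, amount in frame:
        strata[stratum_of(amount, boundaries)].append((unit_id, amount))
    sample = {}
    for key, units in sorted(strata.items()):
        # Proportional allocation, with at least one unit per non-empty stratum.
        n = max(1, round(total_n * len(units) / len(frame)))
        sample[key] = rng.sample(units, min(n, len(units)))
    return sample

# Hypothetical frame of (transaction id, dollar amount) pairs:
# many small payments, some medium ones, and a few large outliers.
frame = (
    [(i, 50.0) for i in range(900)]
    + [(900 + i, 500.0) for i in range(90)]
    + [(990 + i, 5000.0) for i in range(10)]
)
picked = stratified_sample(frame, boundaries=[100.0, 1000.0], total_n=100, seed=7)
for key, units in picked.items():
    print(f"stratum {key}: {len(units)} units sampled")
```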

**Use of Sampling Is Growing**

Statistical sampling is here to stay, and its use is expanding, particularly in the current regulatory enforcement climate. It is also the most reasonable and cost-effective way for organizations to assess the amount of payment errors in large populations.
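As a rough illustration of how a sample result is projected onto a population, the sketch below extrapolates a total overpayment from a simple random sample using a mean-per-unit estimate with a normal-approximation confidence interval. The error amounts, the frame size of 1,000, and the function name `extrapolate_total` are hypothetical, and real engagements may call for a different estimator.

```python
import math
from statistics import NormalDist, mean, stdev

def extrapolate_total(sample_errors, frame_size, confidence=0.90):
    """Project the total error for the frame, with a two-sided interval.

    Assumes a simple random sample drawn from a frame of `frame_size` units.
    """
    n = len(sample_errors)
    if n < 2 or n > frame_size:
        raise ValueError("need at least 2 sampled units and n <= frame_size")
    m = mean(sample_errors)
    s = stdev(sample_errors)
    fpc = math.sqrt(1 - n / frame_size)  # finite population correction
    se_total = frame_size * s / math.sqrt(n) * fpc
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    point = frame_size * m
    return point, point - z * se_total, point + z * se_total

# Hypothetical overpayment found on each of 10 sampled transactions.
errors = [0.0, 0.0, 10.0, 20.0, 0.0, 30.0, 0.0, 0.0, 40.0, 0.0]
point, lower, upper = extrapolate_total(errors, frame_size=1000, confidence=0.90)
print(f"estimated total overpayment: {point:,.0f} ({lower:,.0f} to {upper:,.0f})")
```

The projection applies only to the frame from which the sample was drawn. Multiplying by the size of some broader population would generalize beyond what the sample supports.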

One particular experience taught me the value of understanding the power and limitations of sampling. A customer of the company where I served as compliance officer was seeking repayment of several million dollars for invoices he believed were erroneously submitted and that he therefore overpaid. However, a close examination of this customer's methodology revealed flaws in the sampling approach. In essence, the customer was attempting to generalize to a population far broader than the sample that was utilized. Needless to say, this customer backed down when this was pointed out, and a much smaller repayment was made for the invoice errors.

When systemic issues are suspected, organizations should consider robust statistical sampling to identify the total overpayment for the population, especially when significant estimated amounts are involved. Statistical sampling is the appropriate strategy, but it must be applied correctly before any attempt to extrapolate is made.