Part 2

The Importance of Sample Size

It’s common to struggle with choosing the correct sample size for your analysis. A sample is a group of people (or data points) taken to represent the entire population on which your analysis is focused, and the sample size is the number of observations in that sample. For example, if I want to know what percentage of U.S. teenagers (ages 10-19) watched a certain TV show, it would be impractical for me to gather the data on the entire population. Instead, I would select a sample that represents this population on a smaller scale.

Finding the perfect sample size is difficult. Too large? It will cost you too much time and too many resources. Too small? That weakens the power of the study (its ability to detect an effect when there is one to be detected) and increases the margin of error. But there are ways to ease the difficulty:

  1. Take a census: A census is the collection of data about an entire population. In the example with the teenagers, the entire population of U.S. teenagers is far too large to survey, so a census is out of reach. But when the population itself is small (e.g., an estimated 1,000 data points), taking a census gives you the best chance of an accurate analysis, since nothing is left to sampling.
  2. The internet is your friend: There’s a high probability that someone else has done a similar study. The internet can highlight where others have tripped up when performing their analysis. Warren Buffett once said: “It’s good to learn from your mistakes. It’s better to learn from other people’s mistakes.” The internet can also provide insight into how to achieve an accurate analysis.
  3. Use a formula: This is a more advanced technique, but there are formulas that calculate the sample size needed for a given level of accuracy. Formulas require some knowledge of the population you are sampling, and they open the door to more errors, since each one is yet another equation to input data into. Use them cautiously, and only if you truly understand the calculation at hand. (Ex: Cochran’s Sample Size Formula)
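
To make the formula option concrete, here is a minimal sketch of Cochran’s sample size formula for estimating a proportion, n0 = z² · p · (1 − p) / e², with the optional finite population correction n = n0 / (1 + (n0 − 1) / N). The function name and the default values (z = 1.96 for roughly 95% confidence, p = 0.5 as the most conservative guess, e = 0.05 for a 5% margin of error) are assumptions chosen for illustration, not values taken from this guide.

```python
import math


def cochran_sample_size(z=1.96, p=0.5, e=0.05, population=None):
    """Estimate the sample size needed to measure a proportion.

    z: z-score for the confidence level (1.96 is roughly 95%) -- assumed default
    p: expected proportion (0.5 is the most conservative choice) -- assumed default
    e: desired margin of error as a fraction (0.05 = 5%) -- assumed default
    population: total population size, if known and finite
    """
    if not 0 < p < 1:
        raise ValueError("p must be between 0 and 1")
    if e <= 0:
        raise ValueError("e must be positive")

    # Cochran's formula for a very large (effectively infinite) population.
    n0 = (z ** 2) * p * (1 - p) / (e ** 2)

    # Finite population correction: smaller populations need fewer responses.
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)

    # Round up, because you cannot survey a fraction of a person.
    return math.ceil(n0)


print(cochran_sample_size())                 # 385
print(cochran_sample_size(population=1000))  # 278
```

Under these assumptions, a very large population needs about 385 responses, while a population of 1,000 needs about 278. Tightening the margin of error raises the required sample size quickly, which is one reason to understand each input before trusting the output.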

In investing, we repeatedly see the same kind of effect: investors draw conclusions from small samples of securities or market conditions, only to be proven wrong. Sample size neglect is a bias in which one evaluates statistical information and arrives at an erroneous conclusion after failing to consider the sample size of the data set. Small samples are more likely to show high degrees of variance, so their results can sit far from the true picture. Yet even when a sample isn’t large enough, we sometimes make the error of relying on it too heavily and drawing conclusions anyhow.
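
A small simulation can show why small samples are less trustworthy. The sketch below repeatedly draws samples from a population in which a hypothetical 30% of people watched the TV show, then measures how widely the sample estimates are spread. The 30% rate, the number of trials, the seed, and the function name are all assumptions made for illustration.

```python
import random
import statistics


def spread_of_estimates(sample_size, true_rate=0.30, trials=500, seed=0):
    """Return the standard deviation of the estimated rate across many samples.

    true_rate, trials, and seed are hypothetical values for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    estimates = []
    for _ in range(trials):
        # Each draw is one person: True if they watched the show.
        hits = sum(rng.random() < true_rate for _ in range(sample_size))
        estimates.append(hits / sample_size)
    return statistics.pstdev(estimates)


small = spread_of_estimates(10)
large = spread_of_estimates(1000)
print(f"Spread with 10 people per sample:    {small:.3f}")
print(f"Spread with 1,000 people per sample: {large:.3f}")
```

The estimates from 10-person samples should scatter roughly ten times as widely as those from 1,000-person samples, because the spread shrinks with the square root of the sample size. A single small sample can therefore land far from the true rate purely by chance.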

Further, we often see sample size neglect when investors evaluate an investment’s past performance. An investor might see the past three months of a stock chart and become excited. Or an investor might see a list of 401(k) investment options showing the 5-year trailing return for each available investment. Usually, investors select the one with the best past performance. In doing so, they’re ignoring the standard principle of investing that past returns are not indicative of future results. They’re also overlooking the fact that the past timeframe (three months, one year, five years) might be too small a sample in the world of investing, where assets that underperform for five years still might outperform over decades. See: S&P 500 vs. cash.

Said The Big Short author Michael Lewis: “The smaller the sample size, the more likely that it is unrepresentative of the wider population.”
