Anecdotes, Correlation and Causation–Basic Explanation

Thinking critically when building an argument using evidence.

When we base our actions and beliefs on careful measurement and observation of the real world, it is called empiricism. When we base our beliefs on how we imagine the world is, it is called dogma. We currently treat depression with medication and talk therapy, both of which have been carefully measured for effectiveness, an example of empiricism. Not so long ago, we drilled holes in depressed people’s heads to let the evil spirits out, an example of dogma. All your arguments, in school and out of school, should avoid dogma and embrace empiricism.

Humans look for patterns everywhere.  Often this serves us well. Our ancestors observed that winter arrived in a predictable pattern, and they prepared for it.  The problem is that sometimes we imagine patterns that aren’t there.  For example, many think they see an increase in some human behaviors during a full moon.  Empirical research has repeatedly shown this to be false.

Whether you are working on an oral presentation or a research paper, you will be choosing evidence to convince your audience that your points are true. To do that well and honestly, you will have to think empirically and understand the differences among:

  1. anecdotes,
  2. correlations and
  3. research that proves causation.

Anecdotes are stories: one thing that happened one time to one person, team, company, village, etc. For example, if your Uncle Bud took a drug called Hairinol (a fictional anti-baldness drug) and got sick to his stomach immediately afterward, you might guess that the drug caused his upset stomach; this is what scientists call anecdotal evidence that Hairinol causes stomach upset. The problem is that the upset stomach also could have been caused by a virus or by what he ate for lunch yesterday. So from a scientific point of view, anecdotes are the lowest of the low—misleading and worse than nothing. Anecdotes don’t prove that one thing causes another, and most of us get that: the vast majority of us understand easily that Uncle Bud’s stomach upset may not have been caused by the Hairinol.

In communication, however, anecdotes have value. They illustrate the point, make it memorable, and persuade the audience emotionally. It’s your responsibility to choose an anecdote that is representative, one that shows what happens most of the time according to experts and other evidence.

Correlations are harder. In elementary school, kids with bigger feet score higher on math tests—these two variables are correlated. (When looking at a large group of elementary-school students, as foot sizes increase, so do math scores.) Most people see this and conclude that the data prove that larger feet cause better math performance, or vice versa. The thing that’s actually causing the higher math scores, however, probably isn’t foot size—it’s probably age. Kids with bigger feet are older (adapted from Thomas, Silverman & Nelson, Research Methods in Physical Activity, 7th ed., p. 12). In this case, scientists would call the age of the student the third variable.

A correlation is when two variables predictably rise and fall together (or when one predictably rises as the other falls). The third variable is another thing that might have caused one or both of the two variables in the correlation to change.

The fact that foot size probably doesn’t cause higher math scores is obvious, but what about correlations that appear to make sense? In 1999 Nature published a study that found a correlation—infants whose parents left lights on where the infant slept were much more likely to develop nearsightedness. Health professionals worldwide began to advise patients that total darkness was vital to infants’ visual development. Later Nature published a follow-up study, which found that the infants’ vision problems were almost certainly caused by genetics, not the bedroom lamp. They found that nearsighted parents were more likely to leave a light on in the infant’s room, and that the nearsightedness was simply an inherited trait. You can see how a sensible person might easily think that the light caused the infants’ vision deficits (Wagner, Holly. “Night Lights Don’t Lead to Nearsightedness, Study Suggests.” The Ohio State University Research News, n.d.).

Returning to Uncle Bud and Hairinol: if researchers examine 200,000 hospital records and find that those who took Hairinol reported stomach upset 22% of the time, whereas only 3% of the patients who didn’t take Hairinol reported feeling sick, that’s a correlation. Most people think that this proves that Hairinol makes people sick, but think carefully. What if the people who took the Hairinol were significantly older or more stressed than those who didn’t? Maybe one of those third variables (age or stress level) actually caused the upset stomach, or maybe a third variable we didn’t think of caused it. Correlations are not convincing evidence that one thing causes another.

So how can we find out for sure that one thing causes another? We need a way to give Hairinol to one group and a placebo (a fake pill that does nothing) to another group containing the same percentage of old, stressed, etc. people. How do we find these two groups? This is the genius of the modern scientific method: we choose both groups randomly. The key to proving causation is that the researchers assign the experimental variable randomly. (Memorize the preceding sentence, stressing the words assign and randomly. This is the key to telling the difference between correlation and causation.) In this case, the experimental variable is taking the Hairinol, so the researchers must assign who gets it randomly. Experimenters might choose 200 people randomly and then choose 100 of those randomly to be in the group that gets the Hairinol, while the other 100 get the placebo. They will get approximately the same number of old, young, stressed, relaxed, etc. people in each group. Then, if 38% of the group that gets the Hairinol gets sick, but only 2% of the placebo group does, they can say that Hairinol causes stomach upset (although they have to repeat the experiment at least once to be sure). There is much more to the scientific method, but these basics will get you through choosing evidence and points for papers and oral presentations.
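The random-assignment step described above can be sketched in a few lines of Python (a toy simulation with invented participants, not a real trial design). The point to notice is that shuffling randomly balances the groups on age, stress, and every third variable we never thought to measure:

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

# 200 hypothetical participants, each with an age and a stress level.
participants = [
    {"id": i,
     "age": random.randint(20, 80),
     "stressed": random.random() < 0.5}
    for i in range(200)
]

# The key step: assign the experimental variable (Hairinol) RANDOMLY.
random.shuffle(participants)
drug_group = participants[:100]     # these 100 get Hairinol
placebo_group = participants[100:]  # these 100 get the placebo

# Because assignment was random, the two groups end up roughly
# balanced on age and stress, so neither can explain a difference
# in stomach-upset rates between the groups.
def average_age(group):
    return sum(p["age"] for p in group) / len(group)

print(average_age(drug_group), average_age(placebo_group))
print(sum(p["stressed"] for p in drug_group),
      sum(p["stressed"] for p in placebo_group))
```

Running this prints two similar average ages and two similar stressed-person counts, which is exactly what random assignment buys you.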

Most of the time, when you find a correlation that you want to use as evidence, you will have to word your point carefully. For example, you could say, “Children whose families eat dinner together get higher grades,” but you couldn’t say, “Choosing to eat dinner together will improve your kids’ grades.” The first point doesn’t claim that one thing causes the other, but the second does. You may use correlations as evidence. Just be honest—tell your audience that the data show a relationship between the two things, not proof that one causes the other.

Understanding the difference between correlational and causal evidence is important to your academic work, but it’s also important to our democracy. During Wisconsin Public Radio’s Ideas Network program on Monday, 3/5/2012, at 7:30 a.m. (Gene Purcell in for Joy Cardin – 120305X), State Senator Glenn Grothman shared data showing that children of unwed mothers were 20 times more likely than children of wed mothers to be sexually abused. He went on to say that having children out of wedlock places them at higher risk of being sexually abused, and he used this as evidence in support of his “single-parenthood bill,” which sought to penalize single parents. Grothman’s argument contains the basic logical error—that correlation proves causation. Assuming that Grothman cited the data accurately, the data show a correlation, and from a correlation we can’t say that being unwed causes child sexual abuse—many other things, like poverty, might be the cause. So Grothman cannot claim that his data show that unwed birth causes sexual abuse, or that his legislation is likely to reduce it. To participate effectively in our democracy, you need to be able to tell that this politician’s claim about what his data show is false.


Glossary

Dogma – Overconfident opinions not based on careful measurement of the real world. The opposite of empiricism. For example, “Look, it’s obvious; it’s just natural talent. Some kids have it and some kids don’t. I don’t have to do any research or testing. I just know.”

Empiricism – Careful measurement of the real world. For example, “Let’s take the list of every kid in the league and randomly pick 60 for the visualization exercise, and 60 to track without giving them the visualization exercise. Then we’ll compare the batting averages of the two groups.”

Experimental variable – the thing being tested. For example, if we want to see whether or not praying causes people to heal faster, praying is the experimental variable.

Anecdote – a story.  In this context, a story that claims to show that one thing causes another.  For example, “My friend Joe sat in the front of statistics class and got an A, so when people sit in front they get high grades.”

Correlation – When two variables (things) increase or decrease together consistently.  For example, whenever ice cream sales increase, so do drownings.  (These two variables are correlated even though it’s obvious that neither causes the other.)

Third variable – When two things are correlated, a third variable is another thing that might be causing both (rather than one of the two correlated variables causing the other). For example, we know that whenever ice cream sales increase, so do drownings, but both increases are probably caused by a third variable: hot weather.

Directionality problem – When two things are correlated, in addition to the possibility of a third variable causing both, we don’t know which of the two correlated variables causes the other. For example, married men live longer—a correlation. But it may not be that marriage causes men to live longer; it may be that men with longer life expectancies (healthier men) are more likely to marry.

Causation – When careful research has shown that one thing causes another. Evidence proving causation always involves researchers randomly assigning the experimental variable.

Bias – A lack of objectivity about a given situation based on one’s personal interests.  For example, we might trust Consumer Reports to tell us whether or not Fords are good quality cars, but we wouldn’t trust a Ford dealer on that subject.

Hierarchy of evidence – A list of the types of evidence that one might use to persuade someone that one thing causes another, ordered from most to least persuasive.