Perhaps the most common statistic you'll see in psychology is a correlation. Do you know how to correctly interpret correlations when you see them? This lesson covers everything you need to know.
Imagine you're reading the newspaper, and you see an article that says that a study was done on whether reading books about vampires makes children want to turn into vampires themselves. The article says that there's a correlation between reading vampire books and desire to be a vampire, and that the correlation is -1.5. The reporter concludes that vampire books should be banned, because they are causing children to turn into vampires! What do you think of this reporter's conclusion? If you understand the theory and statistics behind correlation studies, you'll know that this reporter needs to go back to school to learn about how correlations really work. That's the topic of this lesson.
A correlation is a simple statistic that describes whether there's a relationship, or association, between two variables. Correlations are probably the most common statistic used in the field of psychology, so it's important to understand how they work.
Let's start with how we might do a basic correlational study, before we get to the meaning behind the actual numbers from the statistics. In a correlational study, researchers pick two variables they think might be associated with each other. For this lesson, let's think about a student who wants to go from high school to college. The admissions office at each college will want to know what that student's high school grades were like, because they believe that high school grades can predict college grades. In other words, they believe that high school grades are associated with college grades. Why would they make this conclusion? They could make a graph showing all of the students they have accepted in the past, and this graph could show both variables.
On the y-axis, we could plot each student's high school grade point average. On the x-axis we could plot that same student's overall college grade point average. We put a dot on the graph showing where these two variables intersect. We keep going until we have a dot for every student. Each dot represents one person. This type of graph is called a scatter plot. A scatter plot shows a dot for each person of interest, where each dot represents one person's scores on the two variables of interest. Here, each dot shows one person's high school grades and college grades. You can remember the name 'scatter plot' because after we plot, or mark, the graph with each person, it looks like a bunch of dots have been scattered all over it.
After all of the dots have been plotted, we can look for the general pattern, or trend, that is a representation of most people. In other words, if you were to draw a single, straight line on this graph, where would you draw the line? It would probably be right here. This line is a quick summary of the general pattern of dots we see on the scatter plot.
Now, how does this graph relate to correlations? A correlation is simply a number that is assigned to represent this scatter plot and this line. The equation used to calculate that number is complicated, and you don't need to know it until you take a statistics class in college. For now, all you need to know is that the equation gives you a number that works like a code: once you can read the code, you know what the graph that produced it looks like. How to read the number is what we'll cover next in this lesson.
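If you're curious what that equation looks like in practice, here is a minimal sketch in Python. It assumes the standard Pearson formula, which is the usual choice for this kind of study, though the lesson doesn't name a specific one. The function name pearson_r and the six students' grades are made up purely for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How far each score sits from its own average.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Numerator: do the two sets of deviations rise and fall together?
    co_movement = sum(a * b for a, b in zip(dx, dy))
    # Denominator: rescales the result so it lands between -1 and +1.
    spread = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if spread == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return co_movement / spread

# Made-up GPAs for six students (illustration only, not real data).
high_school_gpa = [2.1, 2.8, 3.0, 3.4, 3.7, 3.9]
college_gpa = [2.0, 2.5, 3.1, 3.0, 3.6, 3.8]

r = pearson_r(high_school_gpa, college_gpa)
print(round(r, 2))  # a positive number close to +1 for these invented grades
```

Each pair of values is one dot on the scatter plot, and the single number that comes out summarizes how closely the dots follow a straight line.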
The resulting statistic you get from a correlation equation is called a correlation coefficient. There will always be two parts to a correlation coefficient. The first part is the sign, or direction, meaning whether the coefficient is a positive number or a negative number. That sign is the first part of the code you need to know.
The second part of the correlation coefficient will be a number. The number will always be between zero and one. That means that the correlation coefficient will always be somewhere between negative one and positive one, but it could be anywhere in between. Let's go over each part of the correlation coefficient and discuss what that part means.
We'll start with the sign, or direction. Unless your coefficient is exactly zero, you'll have a number that's either positive or negative. The sign of positive or negative is simply a code that indicates how the line appeared on the scatter plot. Remember our example before? We plotted high school grades and college grades, and we ended up with a line that looked like this. Notice that the line goes from the bottom left corner of the graph to the upper right corner of the graph. That means that as one of our variables went up in value, so did the other variable. In other words, if a student had a high GPA in high school, he or she is likely to also have a high GPA in college. As one variable gets higher, the other variable also gets higher.
Whenever we graph two variables that move in the same direction, the line we draw will generally go from the bottom left to the upper right of the graph. We call this a positive correlation. A positive correlation means that both variables move in the same direction - as one goes up, the other goes up, and as one goes down, the other goes down. We call this a positive correlation because when we do the equation to come up with our correlation coefficient, the result is going to be a positive number. It can be anywhere above zero, all the way up to +1.00.
The only other option for a correlation will be that it's got a negative sign in front of the coefficient. You won't be surprised to learn that we call this a negative correlation. A negative correlation means that the two variables move in the opposite direction from each other - as one goes up, the other goes down. What would that look like on a scatter plot? Where would we draw the line?
Imagine that we plotted two variables we think move in opposite directions. Let's pick college grade point average and the number of hours a student spends partying instead of studying. We might imagine that the more you party at college, the lower your grades might be. So the two variables move in opposite directions; as one goes up (that would be the partying), the other goes down (that would be the GPA).
If we made a scatter plot of several students who were already in college, we could put number of hours partying on the y-axis, and keep college grades on the x-axis. Now, the scatter plot might look like this. Where would we draw the line representing the general pattern? Here, it goes from the upper left to the bottom right. This will always be where the line goes for a negative correlation.
So, now you know what the positive or negative sign means on a correlation coefficient. It tells you whether the two variables move in the same direction or opposite directions. It also tells you the general direction of the line you would see on a scatter plot showing all of the people used to calculate the correlation.
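To see the sign at work, here is a small sketch using the partying example. The hours and grades below are invented for illustration, and pearson_r and direction are hypothetical helper names, with pearson_r assuming the standard Pearson formula.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient (standard formula)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    top = sum(a * b for a, b in zip(dx, dy))
    bottom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return top / bottom

def direction(r):
    """Read only the sign of a correlation coefficient."""
    if r > 0:
        return "positive"  # line runs from bottom left to upper right
    if r < 0:
        return "negative"  # line runs from upper left to bottom right
    return "none"

# Invented data: weekly hours spent partying and college GPA for six students.
party_hours = [2, 5, 8, 12, 15, 20]
college_gpa = [3.8, 3.6, 3.1, 3.0, 2.4, 2.2]

r_party = pearson_r(party_hours, college_gpa)
print(direction(r_party), round(r_party, 2))
```

Because the grades fall as the hours rise, the coefficient comes out below zero, which matches a line sloping from the upper left to the bottom right.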
The second part of any correlation coefficient is the number that appears after the sign. Remember that this number, ignoring the sign, will always range from zero to one. So, you might see a correlation of -0.85, or +0.14, or +0.98. You already know what the negative or positive sign means. What does the number behind it mean?
The number you see in a correlation tells you the strength of the association between the two variables of interest. In other words, are these two variables very strongly related, or not? Let's go through some examples to make this clear.
If the number you get is a perfect zero, that means that the two variables are not related to each other at all. You can imagine some variables that simply have nothing to do with each other. For example, college grades are not correlated with how tall you are, what color your eyes are, or the average number of pizza slices you eat in any given week. These variables are not correlated, meaning the correlation coefficient you would get would be zero, or in real data, very close to zero.
As a correlation moves from zero to one, it means that the relationship becomes stronger and stronger. A low correlation means that the two variables are a little bit related to each other, but not much. A high correlation, meaning one that's closer to the value of one, means that the two variables are very strongly related to each other. For example, high school grades and college grades are generally related to each other.
So, what does this look like on the scatter plot? Let's go back to a correlation of zero. If we tried to plot two variables that have nothing to do with each other, we'd end up with dots all over the graph! Basically, it would look like a bunch of random dots with no general pattern at all. Here's an example of what that might look like. If I asked you to draw a line on this graph showing the pattern, what would you do? You really couldn't. That's what a zero correlation means; there's no pattern to the dots.
As we get closer and closer to the value of one (either positive or negative), what happens is that the dots start to cluster around each other in a recognizable pattern. They start to look more and more like the line we want to draw. Here's an example of what a correlation of +0.56 looks like. You can see that we're starting to see a pattern. Here's an example of what a correlation of +0.91 looks like. We're getting close to that high value of one, so you can see that the line is pretty obvious here. If we had a perfect correlation of +1.00, it would look like this! You can see that every single dot lies perfectly on the line.
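One way to get a feel for strength is to build pretend data where the second variable is a copy of the first with more or less random "scatter" added. The sketch below does that with simulated numbers only. The helper names, the seed, and the noise levels are arbitrary choices for illustration, and pearson_r assumes the standard Pearson formula.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient (standard formula)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    top = sum(a * b for a, b in zip(dx, dy))
    bottom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return top / bottom

rng = random.Random(42)  # fixed seed so every run gives the same numbers
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]

def noisy_copy(values, noise):
    """Copy of `values` with random scatter added; more noise = looser dots."""
    return [v + noise * rng.gauss(0, 1) for v in values]

r_perfect = pearson_r(x, noisy_copy(x, 0.0))  # every dot exactly on the line
r_strong = pearson_r(x, noisy_copy(x, 0.5))   # dots hug the line
r_weak = pearson_r(x, noisy_copy(x, 2.0))     # dots spread widely around it
r_none = pearson_r(x, [rng.gauss(0, 1) for _ in range(n)])  # unrelated variable

for label, value in [("no scatter", r_perfect), ("a little scatter", r_strong),
                     ("a lot of scatter", r_weak), ("unrelated", r_none)]:
    print(label, round(value, 2))
```

The more scatter we add, the further the coefficient drops from one toward zero, which is the same story the example graphs tell.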
Remember that correlations can be negative or positive. You saw the graph of +0.56 before. A graph of -0.56 would look exactly the same, except that the slant to the dots would go the other direction, like you see here. And here's an example of what -0.91 would look like, and here's an example of -1.00. It's important to remember that a correlation of -1.00 and a correlation of +1.00 both indicate a perfect association between the two variables. Both of these correlations indicate equal strength in terms of how one variable is related to the other. The only difference is in the direction of the line.
Now that you understand both the direction (positive or negative) and the number (anywhere from zero to one), whenever you see a correlation from a study, you'll know what it means. That little coefficient tells you both whether the two variables move together or in opposite directions, and it tells you how strongly the two variables are related to each other.
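Reading the two parts of the code can be written down as a tiny helper. This is only a sketch, and describe_correlation is a hypothetical name, but it also shows why the -1.5 from the newspaper story at the start of this lesson can't be a real correlation.

```python
def describe_correlation(r):
    """Split a correlation coefficient into its direction and its strength."""
    # A coefficient can never fall outside the range -1 to +1.
    if not -1.0 <= r <= 1.0:
        raise ValueError(f"{r} is not a possible correlation coefficient")
    if r > 0:
        direction = "positive"  # variables move in the same direction
    elif r < 0:
        direction = "negative"  # variables move in opposite directions
    else:
        direction = "none"
    strength = abs(r)  # 0 = no relationship, 1 = perfect relationship
    return direction, strength

print(describe_correlation(-0.85))  # ('negative', 0.85)
print(describe_correlation(0.98))   # ('positive', 0.98)

try:
    describe_correlation(-1.5)  # the value reported in the vampire article
except ValueError as error:
    print("Impossible value:", error)
```

Notice that -0.85 describes a stronger relationship than +0.14, because strength depends only on the number and not on the sign.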
There's only one thing left to know about correlations. The biggest mistake that people make when they see a correlation between two variables is thinking that one of the variables caused the other variable. That's not necessarily true, even if the correlation is a perfect -1.00 or perfect +1.00. Why not?
Let's go back to our example of grades. We know that high school grades and college grades are, in fact, positively correlated with each other to a pretty high degree. But, does that mean that your grades in high school actually caused you to get similar grades in college? No. Instead, there are other variables that actually cause both of these variables to move in the same direction. It could be that you are very smart, and that your intelligence causes both types of grades to be high. Or, it could be that you are a very dedicated student, and that your motivation to study actually caused both types of grades to be high. This type of situation is called a third variable problem. Even though our two original variables of high school grades and college grades are correlated, both are actually being caused by some other third variable or even a fourth or fifth variable! So we can't actually say that college grades are caused by high school grades.
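The third variable problem can be imitated with pretend data. In the toy model below, a simulated "motivation" score feeds into both sets of grades, and neither grade is ever used to compute the other. Every number here is invented (the 0.4 weight, the noise size, the seed, and the unclipped grade scale are all assumptions), and pearson_r assumes the standard Pearson formula.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient (standard formula)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    top = sum(a * b for a, b in zip(dx, dy))
    bottom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return top / bottom

rng = random.Random(7)  # fixed seed so every run gives the same numbers
n = 2000

# The hidden third variable: one motivation score per simulated student.
motivation = [rng.gauss(0, 1) for _ in range(n)]

# Each grade depends on motivation plus its own random luck.
# College grades are NOT computed from high school grades.
high_school = [3.0 + 0.4 * m + 0.2 * rng.gauss(0, 1) for m in motivation]
college = [2.9 + 0.4 * m + 0.2 * rng.gauss(0, 1) for m in motivation]

r_grades = pearson_r(high_school, college)

# Subtract motivation's contribution and correlate what is left over.
hs_leftover = [g - 0.4 * m for g, m in zip(high_school, motivation)]
college_leftover = [g - 0.4 * m for g, m in zip(college, motivation)]
r_leftover = pearson_r(hs_leftover, college_leftover)

print("grades:", round(r_grades, 2))
print("after removing motivation:", round(r_leftover, 2))
```

In this toy model the two grades are strongly correlated, yet once the shared cause is taken out, the leftover parts have almost nothing to do with each other. Real students are far messier than this, so treat it as an illustration of the idea rather than a model of actual grades.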
While this concept might seem simple with the example of grades, many people make this mistake in research studies that you might hear about on TV or in the newspaper. Let's say a researcher finds out that there's a positive correlation between the number of times parents encourage their children and how well those children do in school. It's very tempting for us to conclude that encouraging children causes them to do well in school, because of the correlation. But, making that conclusion would be a mistake. We don't actually know that encouraging children causes them to do well. It could be that when children do well in school, their parents are more motivated to encourage them - so the cause actually goes in the other direction! Or, it could be that people who live in fancy neighborhoods are both more likely to encourage their children (to look good to the neighbors) and are more likely to have children who do well in school (to look good to the other students). So here, a third variable, the neighborhoods, is actually the causal factor.
The important thing to remember is that while it's possible that two variables that are correlated might have a causal relationship, we can't be sure. In order to study cause-and-effect relationships, we would have to do a true experimental study. You can learn about true experiments in a different lesson.
A common type of statistic in psychology is called a correlation. Correlations tell you if two variables are related to each other, and if so, in what way.
The sign in a correlation tells you what direction the variables move. A positive correlation means the two variables move in the same direction. A negative correlation means they move in opposite directions. The number in a correlation will always be between zero and one. The closer the number is to one (either positive or negative), the stronger the association between the two variables.
Remember that just because two variables are correlated, that doesn't necessarily mean that one causes the other.
If you can remember all of this information about correlations, you're well-prepared to understand and interpret them when you read about research or when you do your own.