# Correlation & Scatter Diagrams

In statistics, correlation measures the strength of the linear relationship between two variables. Loosely speaking, if the two variables were plotted against each other, correlation would measure how close the points come to forming a straight line.

You have probably been asked to draw a line of best fit on a scatter diagram before. Most of the time the plotted points do not lie exactly on a straight line, so there are gaps between the line and the points. If there are many large gaps, the correlation is said to be weak. If the gaps are few and small, the correlation is said to be strong. Note that strength does not indicate whether the relationship is positive or negative. The correlation is positive if the line of best fit has a positive gradient, and negative if the line of best fit has a negative gradient. Note also that correlation says nothing about how steep the line of best fit is.
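The idea of "gaps" between the points and the line can be sketched in a few lines of Python. This is only an illustration: it assumes the line of best fit is the least-squares line (the text does not say how the line is drawn), and the function names `line_of_best_fit` and `gaps` and the two small data sets are made up for the example.

```python
def line_of_best_fit(xs, ys):
    """Return (gradient, intercept) of the least-squares line (an assumed method)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    gradient = s_xy / s_xx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


def gaps(xs, ys):
    """Vertical gap between each plotted point and the line of best fit."""
    gradient, intercept = line_of_best_fit(xs, ys)
    return [y - (gradient * x + intercept) for x, y in zip(xs, ys)]


# Hypothetical data: both sets rise with x, but one hugs its line far more closely.
xs = [1, 2, 3, 4, 5]
tight = [2.1, 3.9, 6.2, 7.8, 10.1]
loose = [4.0, 1.5, 8.5, 5.0, 11.0]

for name, ys in [("tight", tight), ("loose", loose)]:
    gradient, _ = line_of_best_fit(xs, ys)
    largest_gap = max(abs(g) for g in gaps(xs, ys))
    print(name, "gradient:", round(gradient, 2), "largest gap:", round(largest_gap, 2))
```

Both lines have a positive gradient, so both correlations are positive, but the small gaps in the first set suggest a strong correlation and the large gaps in the second suggest a weaker one.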

We measure correlation using the product moment correlation coefficient, or PMCC. The PMCC is a number between -1 and 1 inclusive. If the PMCC is close to 1, this is a good indication that there is a strong positive linear relationship between the two variables. If the PMCC is close to -1, this is a good indication that the relationship between the two variables is negative and strong. If the PMCC is close to 0, this indicates that there is little or no linear relationship between the two variables, although they may still be related in a non-linear way.
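As a rough sketch of how the PMCC might be computed by hand, the standard formula is r = S_xy / sqrt(S_xx × S_yy), where S_xy, S_xx and S_yy are sums of products of deviations from the means. The notation, the function name `pmcc` and the sample data below are assumptions made for this example, not taken from the text.

```python
from math import sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical data sets chosen to show the extreme cases.
xs = [1, 2, 3, 4, 5]
steep = [10 * x for x in xs]        # exact straight line, steep
shallow = [0.1 * x for x in xs]     # exact straight line, shallow
falling = [-3 * x + 2 for x in xs]  # exact straight line, negative gradient

xs_sym = [-2, -1, 0, 1, 2]
curved = [x ** 2 for x in xs_sym]   # points lie on a curve, not a line

print("steep:  ", round(pmcc(xs, steep), 6))
print("shallow:", round(pmcc(xs, shallow), 6))
print("falling:", round(pmcc(xs, falling), 6))
print("curved: ", round(pmcc(xs_sym, curved), 6))
```

The steep and shallow lines give essentially the same value, which illustrates that the PMCC does not depend on the gradient's size, only its sign. The curved data is a reminder that a value near 0 only rules out a straight-line pattern.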

If there is a point that seems to be unusually far away from the line of best fit, this might indicate that there is an outlier in the original data.
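One possible way to spot such a point is to compare each gap with the typical size of the gaps. The sketch below is only illustrative: it assumes a least-squares line of best fit, and the rule "more than twice the root-mean-square gap" is an arbitrary threshold chosen for this example, not a standard definition of an outlier. The function name `far_points` and the data are also hypothetical.

```python
from math import sqrt


def far_points(xs, ys, factor=2.0):
    """Return the (x, y) points whose gap from the least-squares line is unusually large.

    'Unusually large' here means more than `factor` times the root-mean-square gap,
    which is an assumed rule of thumb for illustration only.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    gradient = s_xy / s_xx
    intercept = mean_y - gradient * mean_x

    # Vertical gap between each point and the line.
    gaps = [y - (gradient * x + intercept) for x, y in zip(xs, ys)]
    rms_gap = sqrt(sum(g ** 2 for g in gaps) / n)

    return [(x, y) for x, y, g in zip(xs, ys, gaps) if abs(g) > factor * rms_gap]


# Hypothetical data: y is 2x everywhere except at x = 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2, 4, 6, 8, 20, 12, 14, 16]

print(far_points(xs, ys))
```

A flagged point is only a candidate outlier. Whether it should be investigated or removed depends on the original data.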