A scatter diagram is a graphical representation of two variables, one on the X axis and the other on the Y axis; hence it is also known as an XY plot. A scatter plot depicts the relationship between the two variables and helps show whether they are correlated. A correlation is said to exist when a change in one variable is accompanied by a change in the other. We can use the correlation to predict behavior, which is very useful when one variable is easy to measure and the other is difficult to measure.
Scatter plots show large amounts of data in chart form. The more closely the points on the scatter plot cluster around a straight line, the higher the correlation between the variables and the stronger the relationship.
Correlation Coefficient (R) and Coefficient of Determination (R Squared)
The correlation coefficient is calculated from N pairs of (X, Y) values as:
R = [N * Sum(XY) - Sum(X) * Sum(Y)] / ( SQRT[N * Sum(X^2) - (Sum(X))^2] * SQRT[N * Sum(Y^2) - (Sum(Y))^2] )
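As a minimal sketch, the formula above can be written out in Python one sum at a time. The function name pearson_r and the sample data are assumptions made for illustration only; they do not come from the text.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Correlation coefficient R, built from the sums in the formula above."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)                                   # N, the number of (X, Y) pairs
    sum_x = sum(xs)                               # Sum(X)
    sum_y = sum(ys)                               # Sum(Y)
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # Sum(XY)
    sum_x2 = sum(x * x for x in xs)               # Sum(X^2)
    sum_y2 = sum(y * y for y in ys)               # Sum(Y^2)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    if denominator == 0:
        # All X values (or all Y values) are identical, so R is undefined.
        raise ValueError("R is undefined when X or Y is constant")
    return numerator / denominator


# Made-up sample data, used only to exercise the function.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))  # 0.7746
```

Note that both square roots sit in the denominator; the numerator is divided by their product.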
R is always between -1 and +1, inclusive.
The coefficient of determination is the square of R, or R squared. It is never negative and always lies between 0 and +1.
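A short sketch of that squaring step may help: because R is squared, a negative correlation and a positive correlation of the same size give the same R squared. The function name coefficient_of_determination and the sample R values are assumptions chosen for illustration.

```python
def coefficient_of_determination(r):
    """Return R squared for a correlation coefficient r in [-1, +1]."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and +1")
    return r * r


# Illustrative R values: the sign disappears once R is squared.
for r in (0.8, -0.8, 0.1, 0.0, -1.0):
    print(r, round(coefficient_of_determination(r), 4))
```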
If the straight line goes up from left to right, with higher values of X corresponding to higher values of Y, then the correlation is said to be positive. A perfect positive correlation has a value of +1. A good example of a perfect positive correlation is the case where you use X grams of flour to make Y grams of cake. As X increases, Y increases proportionally, and the correlation is equal to +1.
If the straight line goes down from left to right, with higher values of X corresponding to lower values of Y, then the correlation is said to be negative. A perfect negative correlation has a value of -1. A good example of a perfect negative correlation is the case where you have X amount of money in the bank and need Y amount of additional money to become a millionaire. As X increases, Y decreases by exactly the same amount, and the correlation is equal to -1.
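The two examples can be checked numerically. The sketch below is illustrative only: the helper correlation, the ratio of 3 grams of cake per gram of flour, and the bank balances are all made-up assumptions, and the results are compared with +1 and -1 only up to floating-point rounding.

```python
from math import sqrt


def correlation(xs, ys):
    """Correlation coefficient R computed from sums of X, Y, XY, X^2 and Y^2."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / (sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2))


# Flour and cake: assume (hypothetically) 3 grams of cake per gram of flour.
flour_grams = [100, 200, 300, 400]
cake_grams = [3 * x for x in flour_grams]
r_flour = correlation(flour_grams, cake_grams)

# Money in the bank and the amount still needed to reach one million.
in_bank = [100_000, 250_000, 500_000, 900_000]
still_needed = [1_000_000 - x for x in in_bank]
r_savings = correlation(in_bank, still_needed)

print(round(r_flour, 6))    # +1 up to floating-point rounding
print(round(r_savings, 6))  # -1 up to floating-point rounding
```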
If there is no relationship between the variables, the plot does not look like a straight line; rather, the points look “scattered” all over the XY plane. In that case the correlation value is 0 or close to 0.
The correlation value thus ranges from -1 to +1. As the correlation value gets closer to +1 or -1, such as 0.8991 or -0.9023, the relationship is said to be stronger. If the correlation value is closer to 0, such as -0.1023 or +0.122, the relationship is considered to be weak.
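The reading described above can be sketched as a small helper. The text does not give a cutoff between "strong" and "weak", so the cutoff of 0.5 below is an arbitrary assumption, as are the function name describe_correlation and its return format.

```python
def describe_correlation(r, strong_cutoff=0.5):
    """Return (direction, strength) for a correlation value r.

    strong_cutoff is a hypothetical threshold chosen for this sketch;
    the text only says values near +1 or -1 are strong and values near 0 are weak.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must be between -1 and +1")
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    strength = "strong" if abs(r) >= strong_cutoff else "weak"
    return direction, strength


# The four sample values quoted in the text.
for r in (0.8991, -0.9023, -0.1023, 0.122):
    print(r, describe_correlation(r))
```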