Correlation and the Correlation Coefficient
Correlation refers to the statistical relationship between two variables. It measures the strength and direction of the linear relationship between them. The correlation coefficient quantifies this relationship, providing a numerical value that ranges from -1 to 1.
Formula:
The correlation coefficient (often denoted by r) is calculated using the following formula:
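Pearson Correlation Coefficient (r):
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$
where:
- n is the number of observations
- x and y are the two variables
- $\sum xy$ is the sum of the products of the paired observations
- $\sum x$ and $\sum y$ are the sums of the individual observations
- $\sum x^2$ and $\sum y^2$ are the sums of the squares of the individual observations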
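The formula can be sketched in a few lines of Python. This is only a minimal illustration built from the sums defined above; the function name `pearson_r` and its error handling are assumptions made for this example, not part of the original text.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient from raw sums (illustrative helper)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    n = len(x)                                   # number of observations
    sum_x = sum(x)                               # sum of x values
    sum_y = sum(y)                               # sum of y values
    sum_xy = sum(a * b for a, b in zip(x, y))    # sum of paired products
    sum_x2 = sum(a * a for a in x)               # sum of squared x values
    sum_y2 = sum(b * b for b in y)               # sum of squared y values

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))

    # If either variable never changes, the denominator is zero and r is undefined.
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")

    return numerator / denominator
```

For instance, `pearson_r([1, 2, 3], [2, 4, 6])` evaluates to 1.0, since the second list is an exact positive multiple of the first.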
Example:
Suppose we want to determine the correlation between the hours of study and exam scores of students. Below are the hours of study (x) and corresponding exam scores (y) for a sample of 5 students:
Hours of Study (x) | Exam Scores (y) |
---|---|
3 | 75 |
5 | 85 |
7 | 90 |
4 | 80 |
6 | 88 |
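Using the formula with n = 5, the required sums are $\sum x = 25$, $\sum y = 418$, $\sum xy = 2128$, $\sum x^2 = 135$, and $\sum y^2 = 35094$, so:
$$r = \frac{5(2128) - (25)(418)}{\sqrt{\left[5(135) - 25^2\right]\left[5(35094) - 418^2\right]}} = \frac{190}{\sqrt{(50)(746)}} \approx 0.98$$
Because r is positive and close to 1, the data show a strong positive correlation between hours of study and exam scores. In general, a positive r indicates a positive correlation and a negative r indicates a negative correlation; the closer r is to 1 or -1, the stronger the correlation.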
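The arithmetic for the study-hours example can be checked with a short script. This is a rough sketch that uses the five data pairs from the table; the variable names (`hours`, `scores`, and so on) are chosen here for illustration.

```python
from math import sqrt

# Data from the table: hours of study (x) and exam scores (y)
hours = [3, 5, 7, 4, 6]
scores = [75, 85, 90, 80, 88]

n = len(hours)
sum_x = sum(hours)                                   # 25
sum_y = sum(scores)                                  # 418
sum_xy = sum(x * y for x, y in zip(hours, scores))   # 2128
sum_x2 = sum(x * x for x in hours)                   # 135
sum_y2 = sum(y * y for y in scores)                  # 35094

numerator = n * sum_xy - sum_x * sum_y               # 190
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))

r = numerator / denominator
print(round(r, 4))  # 0.9838
```

A value this close to 1 suggests that, in this small sample, exam scores rise almost linearly with hours of study.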
Characteristics:
The correlation coefficient ranges from -1 to 1. A value of 1 implies a perfect positive linear relationship, -1 implies a perfect negative linear relationship, and 0 implies no linear relationship.
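The three boundary cases can be illustrated with small made-up data sets. The sketch below is not from the original text: the helper `pearson_r` and the sample values are assumptions chosen so that each case comes out exactly. The third case also shows that r = 0 only rules out a linear relationship, because y there is fully determined by x.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson r from raw sums (illustrative helper, assumes non-constant inputs)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


x = [-2, -1, 0, 1, 2]

y_increasing = [2 * v + 1 for v in x]   # exact line with positive slope
y_decreasing = [-3 * v for v in x]      # exact line with negative slope
y_parabola = [v * v for v in x]         # depends on x, but not linearly

print(pearson_r(x, y_increasing))   # 1.0
print(pearson_r(x, y_decreasing))   # -1.0
print(pearson_r(x, y_parabola))     # 0.0
```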