# Covariance and Correlation

I couldn’t find a better explanation of this. So I had to blog it for posterity. The original article may be found here.

For background on the topic, see the related article (a different source, but a simple explanation).

Covariance and correlation are two mathematical concepts commonly used in probability and statistics. Both describe the relationship between two variables.

Covariance –

1. It measures how a pair of random variables vary together, that is, whether a change in one variable tends to be accompanied by a change in the other. It does not by itself show that one variable causes the other to change.
2. It can take any value from -infinity to +infinity. A negative value indicates a negative relationship and a positive value indicates a positive relationship.
3. It captures only the linear relationship between variables.
4. It gives the direction of the relationship between variables, but its magnitude depends on the units of the variables.

Formula
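
A standard way to write the covariance of paired samples, using the symbols defined just below, is sketched here. It assumes division by n; many texts divide by n - 1 instead to get an unbiased sample estimate.

$$
\mathrm{Cov}(x, y) = \frac{\sum_{i=1}^{n} (x_i - x')(y_i - y')}{n}
$$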

Here,
x’ and y’ = means of the x and y samples
n = total number of samples
xi and yi = the i-th individual values of x and y
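
The definitions above translate directly into a few lines of code. The following is a minimal sketch, not a reference implementation: the function name `covariance`, the made-up data, and the choice to divide by n (rather than n - 1) are all assumptions made for illustration.

```python
def covariance(xs, ys):
    """Covariance of paired samples, dividing by n (an assumed convention)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    x_mean = sum(xs) / n  # x' in the text
    y_mean = sum(ys) / n  # y' in the text
    # Average product of each pair's deviations from the two means.
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / n


# Made-up data for illustration only.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 6, 8, 10]   # rises as hours rise
errors = [10, 8, 6, 4, 2]  # falls as hours rise

cov_up = covariance(hours, score)
cov_down = covariance(hours, errors)
print(cov_up, cov_down)  # positive for the first pair, negative for the second
```

The sign of the result is the useful part: it is positive when the two variables tend to move together and negative when they move in opposite directions.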

Correlation –

1. It shows whether and how strongly pairs of variables are related to each other.
2. Correlation takes values between -1 and +1, where values close to +1 represent strong positive correlation and values close to -1 represent strong negative correlation.
3. The variables may be only indirectly related to each other; a strong correlation does not mean that one variable causes the other.
4. It gives both the direction and the strength of the relationship between variables.
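
A standard form of the (Pearson) correlation coefficient, written with the symbols defined just below, is sketched here. It is the covariance divided by the product of the two standard deviations, so any common factor of n cancels out.

$$
\mathrm{Corr}(x, y) = \frac{\sum_{i=1}^{n} (x_i - x')(y_i - y')}{\sqrt{\sum_{i=1}^{n} (x_i - x')^2 \, \sum_{i=1}^{n} (y_i - y')^2}}
$$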

Here,
x’ and y’ = means of the x and y samples
n = total number of samples
xi and yi = the i-th individual values of x and y
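
As a rough illustration of these definitions, the sketch below computes the Pearson correlation coefficient from scratch. The function name `pearson_correlation` and the sample data are made up for this example.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation of paired samples; always lies in [-1, +1]."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    n = len(xs)
    x_mean = sum(xs) / n  # x' in the text
    y_mean = sum(ys) / n  # y' in the text
    dx = [x - x_mean for x in xs]
    dy = [y - y_mean for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return numerator / denominator


# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
r_up = pearson_correlation(xs, [2, 4, 6, 8, 10])    # perfectly increasing line
r_down = pearson_correlation(xs, [10, 8, 6, 4, 2])  # perfectly decreasing line
r_noisy = pearson_correlation(xs, [2, 1, 4, 3, 5])  # increasing, with noise
print(r_up, r_down, r_noisy)
```

Unlike covariance, the magnitude is directly interpretable here: the closer the value is to +1 or -1, the closer the points lie to a straight line.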

Covariance versus Correlation –

| COVARIANCE | CORRELATION |
| --- | --- |
| A measure of how much two random variables vary together. | A statistical measure that indicates how strongly two variables are related. |
| Involves the relationship between two variables or data sets. | Can involve the relationship between multiple variables as well. |
| Lies between -infinity and +infinity. | Lies between -1 and +1. |
| An unscaled measure of correlation. | A scaled version of covariance. |
| Provides the direction of the relationship. | Provides the direction and strength of the relationship. |
| Depends on the scale of the variables. | Independent of the scale of the variables. |
| Has dimensions (the product of the two variables' units). | Dimensionless. |
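
The scale rows of this comparison can be checked numerically. The sketch below uses made-up height and weight data and hypothetical helper names (`covariance`, `correlation`), and it assumes the divide-by-n convention. It measures the same heights in metres and then in centimetres.

```python
import math


def covariance(xs, ys):
    """Covariance of paired samples, dividing by n (an assumed convention)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / n


def correlation(xs, ys):
    """Covariance scaled by both standard deviations, so the units cancel."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# Made-up data for illustration only.
height_m = [1.5, 1.6, 1.7, 1.8, 1.9]
weight_kg = [50, 58, 63, 72, 80]
height_cm = [h * 100 for h in height_m]  # same heights, different unit

cov_m = covariance(height_m, weight_kg)    # units: metre * kilogram
cov_cm = covariance(height_cm, weight_kg)  # units: centimetre * kilogram
corr_m = correlation(height_m, weight_kg)
corr_cm = correlation(height_cm, weight_kg)

print("covariance: ", cov_m, cov_cm)    # differs by a factor of about 100
print("correlation:", corr_m, corr_cm)  # agrees up to floating-point rounding
```

Changing the unit multiplies the covariance by the conversion factor, while the correlation stays the same. This is why correlation is usually the more convenient quantity for comparing relationships across differently scaled variables.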