In statistics, the Pearson product-moment correlation coefficient (r), sometimes known as the PMCC, is a measure of the correlation of two variables X and Y measured on the same object or organism, that is, a
measure of the tendency of the variables to increase or decrease together. It is defined as the sum of the products of the
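standard scores of the two measures divided by the degrees of freedom:
$$ r = \frac{\sum_{i=1}^{n} z_{x_i} z_{y_i}}{n - 1} $$
where $z_{x_i}$ and $z_{y_i}$ are the standard scores of the $i$-th pair of measurements and $n$ is the number of pairs.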
Note that this formula assumes the standard scores (Z scores) are calculated using sample standard deviations, that is, standard deviations computed with n − 1 in the denominator.
The result obtained is equivalent to dividing the covariance between the two variables by
the product of their standard deviations.
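The equivalence of the two definitions can be checked numerically. The following is a minimal sketch in plain Python; the function names and the five data pairs are made up for illustration, and every standard deviation and covariance uses n − 1 in the denominator, as assumed above.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sample_std(values):
    """Sample standard deviation with n - 1 in the denominator."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def pearson_from_z_scores(xs, ys):
    """Sum of the products of the standard scores, divided by n - 1."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = sample_std(xs), sample_std(ys)
    products = (((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

def pearson_from_covariance(xs, ys):
    """Sample covariance divided by the product of the standard deviations."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    return cov / (sample_std(xs) * sample_std(ys))

# Hypothetical measurements of X and Y on five objects.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]

r_z = pearson_from_z_scores(xs, ys)
r_cov = pearson_from_covariance(xs, ys)
print(r_z, r_cov)  # both approximately 0.8
```

For these data both routes give r of about 0.8, up to floating-point rounding.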
The coefficient ranges from −1 to 1. A value of 1 shows that a linear equation describes the relationship perfectly and
positively, with all data points lying on the same line and with Y increasing
with X. A score of −1 shows that all data points lie on a single line but that Y increases as X decreases. A
value of 0 shows that a linear model is inappropriate – that there is no linear relationship between the variables.
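The three reference values can be illustrated with a small sketch in plain Python. The helper `pearson_r` and the data are hypothetical, chosen so that the arithmetic is easy to follow. The third data set also shows that a coefficient of 0 rules out only a linear relationship: there Y is completely determined by X, yet r is 0.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of centered products (the n - 1 factors cancel)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Y increases with X along a single line: r is 1.
r_pos = pearson_r(xs, [3.0 * x + 1.0 for x in xs])

# Y decreases as X increases along a single line: r is -1.
r_neg = pearson_r(xs, [-0.5 * x + 4.0 for x in xs])

# Y = X squared on symmetric X values: no linear relationship, r is 0.
r_zero = pearson_r(xs, [x * x for x in xs])

print(r_pos, r_neg, r_zero)
```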
The Pearson coefficient is a statistic which estimates the correlation of the two given
random variables.
The linear equation that best describes the relationship between X and Y can be found by linear regression. This equation can be used to "predict" the value of one measurement from knowledge
of the other. That is, for each value of X the equation calculates a value which is the best estimate of the values of
Y corresponding to that specific value of X. We denote this predicted variable by Y′.
Any value of Y can therefore be defined as the sum of Y′ and the difference between Y and Y′:
$$ Y = Y' + (Y - Y') $$
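The variance of Y is equal to the sum of the variances of the two components of Y:
$$ s_y^2 = s_{y'}^2 + s_{y.x}^2 $$
where $s_{y'}^2$ is the variance of the predicted values Y′ and $s_{y.x}^2$ is the variance of the differences Y − Y′.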
Since the coefficient of determination implies that $s_{y.x}^2 = s_y^2 (1 - r^2)$, we can derive the identity
$$ r^2 = \frac{s_{y'}^2}{s_y^2} $$
The square of r is conventionally used as a measure of the association between X and Y. For example, if
the coefficient is 0.90, then 81% of the variance of Y can be "accounted for" by changes in X and the linear
relationship between X and Y.
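The variance decomposition behind this interpretation can be checked with a short sketch in plain Python. It assumes an ordinary least-squares fit of Y on X and n − 1 in the denominator of every variance, including the variance of the differences Y − Y′. The data and variable names are made up for illustration.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sample_var(values):
    """Sample variance with n - 1 in the denominator."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

# Hypothetical paired measurements.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]

mx, my = mean(xs), mean(ys)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
sxx = sum((x - mx) ** 2 for x in xs)
syy = sum((y - my) ** 2 for y in ys)

# Least-squares line Y' = intercept + slope * X.
slope = sxy / sxx
intercept = my - slope * mx
predicted = [intercept + slope * x for x in xs]
residuals = [y - p for y, p in zip(ys, predicted)]

r = sxy / math.sqrt(sxx * syy)
var_y = sample_var(ys)             # variance of Y
var_pred = sample_var(predicted)   # variance of Y'
var_resid = sample_var(residuals)  # variance of Y - Y'

print(var_y, var_pred + var_resid)  # the two components add up to var_y
print(r ** 2, var_pred / var_y)     # r squared is the explained share
```

For these data r is about 0.8, so roughly 64% of the variance of Y is accounted for by the fitted line, and the remaining 36% is the variance of the differences Y − Y′.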
In computer software
- The CORREL() function in many major spreadsheet packages, such as Microsoft Excel, OpenOffice.org Calc and Gnumeric, calculates Pearson's correlation coefficient. Note that versions of Excel prior to 2003 exhibited rounding errors in this function and others [1].
- The PEARSON() function in Microsoft Excel also calculates Pearson's correlation coefficient.
- In MATLAB and Minitab, corr(X) calculates Pearson's correlation coefficient along with a p-value.
- In MATLAB, Scilab, and GNU Octave, corrcoef calculates Pearson's correlation coefficient.
- In S-Plus and R, cor.test(X,Y) calculates Pearson's correlation coefficient.
- In MATLAB, R = corrcoef(X) returns a matrix R of correlation coefficients calculated from an input matrix X whose rows are observations and whose columns are variables.
- In IDL, the CORRELATE() function computes the PMCC.
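As a rough illustration of what a matrix-valued routine such as corrcoef(X) computes, the sketch below builds the matrix of pairwise Pearson coefficients from rows of observations in plain Python. The function `correlation_matrix` and the data are hypothetical. This is not the implementation used by any of the packages listed above.

```python
import math

def correlation_matrix(rows):
    """Pairwise Pearson coefficients for data given as rows of observations.

    Each row is one observation and each column is one variable.
    """
    n = len(rows)
    columns = list(zip(*rows))  # transpose: one tuple per variable
    means = [sum(col) / n for col in columns]
    centered = [[v - m for v in col] for col, m in zip(columns, means)]

    def r(a, b):
        sab = sum(p * q for p, q in zip(a, b))
        saa = sum(p * p for p in a)
        sbb = sum(q * q for q in b)
        return sab / math.sqrt(saa * sbb)

    return [[r(a, b) for b in centered] for a in centered]

# Five hypothetical observations of three variables.
data = [
    [1.0, 2.0, 5.0],
    [2.0, 1.0, 4.0],
    [3.0, 4.0, 3.0],
    [4.0, 3.0, 2.0],
    [5.0, 5.0, 1.0],
]

R = correlation_matrix(data)
for row in R:
    print(row)
```

The result is symmetric with ones on the diagonal. Here the third variable is a decreasing linear function of the first, so the corresponding entry is −1.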
This entry is from Wikipedia, the leading user-contributed encyclopedia.