Coefficient of Correlation
The correlation coefficient is a measure of the linear dependence between two variables X and Y, giving a value between +1 and −1.
Learning Objectives

Compute Pearson's product-moment correlation coefficient.

List the mathematical properties of the correlation coefficient.
Key Points

The correlation coefficient was developed by Karl Pearson from a related idea introduced by Francis Galton in the 1880s.

Pearson's correlation coefficient between two variables is defined as the covariance of the two variables divided by the product of their standard deviations.

Pearson's correlation coefficient when applied to a sample is commonly represented by the letter r.

The size of the correlation r indicates the strength of the linear relationship between x and y.

Values of r close to −1 or to +1 indicate a stronger linear relationship between x and y.
Terms

correlation
One of the several measures of the linear statistical relationship between two random variables, indicating both the strength and direction of the relationship.

covariance
A measure of how much two random variables change together.
Full Text
The most common coefficient of correlation is known as the Pearson product-moment correlation coefficient, or Pearson's r. It is a measure of the linear correlation (dependence) between two variables X and Y, giving a value between +1 and −1. It is widely used in the sciences as a measure of the strength of linear dependence between two variables. It was developed by Karl Pearson from a related idea introduced by Francis Galton in the 1880s.
Pearson's correlation coefficient between two variables is defined as the covariance of the two variables divided by the product of their standard deviations. The form of the definition involves a "product moment", that is, the mean (the first moment about the origin) of the product of the mean-adjusted random variables; hence the modifier product-moment in the name.
Pearson's correlation coefficient when applied to a population is commonly represented by the Greek letter ρ (rho) and may be referred to as the population correlation coefficient or the population Pearson correlation coefficient.
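Pearson's correlation coefficient when applied to a sample is commonly represented by the letter r and may be referred to as the sample correlation coefficient or the sample Pearson correlation coefficient. For a sample of n paired observations (X_i, Y_i), the formula for r is as follows:
$$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$
where $\bar{X}$ and $\bar{Y}$ are the sample means.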
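An equivalent expression gives the correlation coefficient as the mean of the products of the standard scores. Based on a sample of n paired data points (X_i, Y_i), the sample Pearson correlation coefficient is
$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{X_i - \bar{X}}{s_X}\right)\left(\frac{Y_i - \bar{Y}}{s_Y}\right)$$
where $(X_i - \bar{X})/s_X$ is the standard score of X_i, $\bar{X}$ is the sample mean, and $s_X$ is the sample standard deviation (and likewise for Y).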
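The two equivalent descriptions above (covariance scaled by the standard deviations, and averaged products of standard scores) can be checked against each other numerically. The following is a minimal sketch in plain Python; the function names `pearson_r` and `pearson_r_from_z` and the five data points are hypothetical, chosen only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson r: sum of co-deviations over the product of root sums of squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def pearson_r_from_z(xs, ys):
    """Same quantity: sum of products of standard scores, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divisor n - 1).
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    products = (((x - mean_x) / s_x) * ((y - mean_y) / s_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

# Hypothetical illustration data, not taken from the text.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

r_def = pearson_r(xs, ys)
r_z = pearson_r_from_z(xs, ys)
print(r_def, r_z)
```

Both functions should agree up to floating-point rounding, since the factors of n − 1 in the covariance and in the two standard deviations cancel.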
Mathematical Properties
 The value of r is always between −1 and +1: −1 ≤ r ≤ 1.
 The size of the correlation r indicates the strength of the linear relationship between x and y. Values of r close to −1 or to +1 indicate a stronger linear relationship between x and y.
 If r = 0, there is no linear relationship between x and y (no linear correlation), although a nonlinear relationship may still exist.
 A positive value of r means that when x increases, y tends to increase and when x decreases, y tends to decrease (positive correlation).
 A negative value of r means that when x increases, y tends to decrease and when x decreases, y tends to increase (negative correlation).
 If r = 1, there is perfect positive correlation. If r = −1, there is perfect negative correlation. In both these cases, all of the original data points lie on a straight line. Of course, in the real world, this will not generally happen.
 The Pearson correlation coefficient is symmetric: the correlation between x and y equals the correlation between y and x.
Another key mathematical property of the Pearson correlation coefficient is that it is invariant to separate changes in location and scale in the two variables. That is, we may transform X to a + bX and transform Y to c + dY, where a, b, c, and d are constants with b, d > 0, without changing the correlation coefficient. (If exactly one of b and d is negative, the magnitude is unchanged but the sign of the coefficient flips.) This fact holds for both the population and sample Pearson correlation coefficients.
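The properties listed above can be spot-checked numerically. The sketch below uses a hypothetical helper `pearson_r` and made-up data. It checks the range, symmetry, invariance under a + bX and c + dY with positive scale factors, the sign flip when a single scale factor is negative, and the perfect-correlation cases where all points lie on a line.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustration data.
xs = [2.0, 4.0, 5.0, 7.0, 9.0]
ys = [1.0, 3.0, 2.0, 6.0, 8.0]
r = pearson_r(xs, ys)

# Symmetry: swapping the roles of x and y leaves r unchanged.
r_swapped = pearson_r(ys, xs)

# Location/scale invariance with positive scale factors (b = 3, d = 0.5).
r_shifted = pearson_r([10.0 + 3.0 * x for x in xs], [-4.0 + 0.5 * y for y in ys])

# A single negative scale factor flips the sign but keeps the magnitude.
r_flipped = pearson_r([-2.0 * x for x in xs], ys)

# Points lying exactly on a line give r = +1 (rising) or r = -1 (falling).
r_line_up = pearson_r(xs, [1.0 + 2.0 * x for x in xs])
r_line_down = pearson_r(xs, [1.0 - 2.0 * x for x in xs])

print(r, r_swapped, r_shifted, r_flipped, r_line_up, r_line_down)
```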
Example
Consider an example data set of paired scores on a third exam (x) and on a final exam (y).
To find the correlation of these data we need the summary statistics: the means, the standard deviations, the sample size, and the sum of the products of x and y.
To find Σ(xy), multiply the x and y in each ordered pair together, then sum these products. For this problem, Σ(xy) = 122,500. The remaining statistics are the mean of x, the mean of y, the standard deviation of x, and the standard deviation of y:
x̄ = 69.1818, ȳ = 160.4545, s_x = 2.85721, s_y = 20.8008, Σ(xy) = 122,500
Put the summary statistics into the correlation coefficient formula and solve for r, the correlation coefficient.
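One way to carry out this substitution is sketched below. It relies on the identity Σ(x − x̄)(y − ȳ) = Σ(xy) − n·x̄·ȳ, so that r = (Σ(xy) − n·x̄·ȳ) / ((n − 1)·s_x·s_y) when s_x and s_y are sample standard deviations. The function name `r_from_summary` is hypothetical. The sample size n = 11 is an assumption: the text lists the other statistics but does not state n.

```python
def r_from_summary(n, mean_x, mean_y, s_x, s_y, sum_xy):
    """Pearson r from summary statistics.

    Assumes s_x and s_y are sample standard deviations (divisor n - 1).
    """
    co_deviation_sum = sum_xy - n * mean_x * mean_y
    return co_deviation_sum / ((n - 1) * s_x * s_y)

r = r_from_summary(
    n=11,              # ASSUMPTION: sample size is not stated in the text
    mean_x=69.1818,
    mean_y=160.4545,
    s_x=2.85721,
    s_y=20.8008,
    sum_xy=122_500,
)
print(round(r, 3))
```

Under that assumption the result is roughly 0.66, a moderately strong positive linear relationship. Because the numerator is a small difference between two large numbers, the rounded means limit the precision of the last digits.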
Cite This Source
Source: Boundless. “Coefficient of Correlation.” Boundless Statistics. Boundless, 28 May. 2015. Retrieved 28 May. 2015 from https://www.example.com/statistics/textbooks/boundlessstatisticstextbook/correlationandregression11/correlation44/coefficientofcorrelation2082660/