Correlation coefficient

In many cases, the independent variable $x$ and the dependent variable $y$ are both continuous (taking on real values). This enables another important measure, the Pearson correlation coefficient (or Pearson's r), which estimates the amount of linear dependency between the two variables. For each subject $i$, the treatment (or level) $x[i]$ is applied and the response is $y[i]$. Note that in this case, there are no groups (or every subject is a unique group). Also, any treatment could potentially be applied to any subject; the index $i$ only denotes the particular subject.

The r-value is calculated as the estimated covariance between $x$ and $y$, when treated as random variables, normalized by their estimated standard deviations:

$\displaystyle r = \frac{\displaystyle \sum_{i=1}^n (x[i] - \hat{\mu}_x) (y[i] - \hat{\mu}_y)}{\sqrt{\displaystyle \sum_{i=1}^n (x[i] - \hat{\mu}_x)^2} \sqrt{\displaystyle \sum_{i=1}^n (y[i] - \hat{\mu}_y)^2}} ,$

(12.7)

in which $\hat{\mu}_x$ and $\hat{\mu}_y$ are the averages of $x[i]$ and $y[i]$, respectively, for the set of all subjects. Up to a normalization factor that also appears in the numerator and therefore cancels, the denominator is just the product of the estimated standard deviations: $\hat{\sigma}_x \hat{\sigma}_y$.
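As a minimal sketch of how (12.7) could be evaluated, the following Python function computes the numerator and the two square-root factors directly from paired lists. The function name `pearson_r` and the sample data are illustrative assumptions, not part of the original text.

```python
import math


def pearson_r(x, y):
    """Estimate Pearson's r for paired samples x[i], y[i] (hypothetical helper)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mu_x = sum(x) / n  # average treatment
    mu_y = sum(y) / n  # average response
    # Numerator: sum of products of deviations from the averages.
    num = sum((xi - mu_x) * (yi - mu_y) for xi, yi in zip(x, y))
    # Denominator: product of the square roots of the summed squared deviations.
    sxx = sum((xi - mu_x) ** 2 for xi in x)
    syy = sum((yi - mu_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return num / (math.sqrt(sxx) * math.sqrt(syy))


# Made-up data in which the response is exactly twice the treatment.
treatment = [1.0, 2.0, 3.0, 4.0]
response = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(treatment, response)
print(r)  # very close to 1 for perfectly linear, increasing data
```

Raising an error for a constant variable is a design choice made here; the formula itself simply has a zero denominator in that case.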

The possible r-values range between $-1$ and $1$. Three qualitatively different outcomes can occur:

$r > 0$: This means that $x$ and $y$ are positively correlated. As $x$ increases, $y$ tends to increase. A larger value of $r$ implies a stronger effect.
$r = 0$: This means that $x$ and $y$ are uncorrelated, which is theoretically equivalent to a null hypothesis.
$r < 0$: This means that $x$ and $y$ are negatively correlated. As $x$ increases, $y$ tends to decrease. A smaller value of $r$ (closer to $-1$) implies a stronger effect.
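The three outcomes can be illustrated with small hand-constructed data sets. This is only a sketch: the numbers are invented for illustration, and `pearson_r` is a hypothetical helper that evaluates the r-value formula.

```python
import math


def pearson_r(x, y):
    """Hypothetical helper: r-value of paired samples."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    num = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mu_x) ** 2 for a in x)) * math.sqrt(
        sum((b - mu_y) ** 2 for b in y)
    )
    return num / den


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_increasing = [2.1, 3.9, 6.2, 8.1, 9.8]  # y tends to increase with x
y_decreasing = [9.8, 8.1, 6.2, 3.9, 2.1]  # y tends to decrease with x
y_no_trend = [3.0, 1.0, 4.0, 1.0, 3.0]    # built so the deviation products cancel

r_pos = pearson_r(x, y_increasing)
r_neg = pearson_r(x, y_decreasing)
r_zero = pearson_r(x, y_no_trend)

print(f"increasing: r = {r_pos:+.3f}")   # positive
print(f"decreasing: r = {r_neg:+.3f}")   # negative
print(f"no trend:   r = {r_zero:+.3f}")  # zero up to rounding error
```

The third data set reaches zero only because it was constructed symmetrically around the middle subject.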

In practice, it is highly unlikely to obtain $r = 0$ from experimental data; therefore, the absolute value $\vert r\vert$ gives an important indication of the likelihood that $y$ depends on $x$. The theoretical equivalence to the null hypothesis ($r = 0$) would happen only as the number of subjects tends to infinity.
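The effect of the number of subjects can be explored with a small simulation. The sketch below draws treatments and responses independently, so the null hypothesis holds by construction, and computes the r-value for several sample sizes. The Gaussian distributions, the seed, and the sample sizes are arbitrary assumptions chosen for illustration.

```python
import math
import random


def pearson_r(x, y):
    """Hypothetical helper: r-value of paired samples."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    num = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mu_x) ** 2 for a in x)) * math.sqrt(
        sum((b - mu_y) ** 2 for b in y)
    )
    return num / den


rng = random.Random(0)  # fixed seed (arbitrary) so that runs are repeatable


def r_for_independent_samples(n):
    """Draw n independent (x, y) pairs and return their r-value."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return pearson_r(x, y)


results = {n: r_for_independent_samples(n) for n in (10, 100, 10000)}
for n, r in results.items():
    print(f"n = {n:6d}   r = {r:+.4f}")
```

For independent data, the computed r-value is generally nonzero, and its typical magnitude shrinks roughly in proportion to $1/\sqrt{n}$ as the number of subjects $n$ grows. Any single run may deviate from that trend.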

Steven M LaValle 2020-01-06