*Notes to a video lecture on http://www.unizor.com*

__Statistical Correlation - Problems 2__

Let's construct an example that shows how dependency and *correlation* are related.

We know that *independent* variables have zero correlation, while *linearly dependent* variables have a correlation of 1 in absolute value.

Is the converse true?

*Problem A*

Construct sample values of an experiment with two dependent random variables whose sample correlation equals zero.

*Solution*

To construct such an example, it is sufficient to come up with sample values of random variables *S* and *T* that, on one hand, produce zero correlation but, on the other hand, violate some characteristic property of independent variables. For example, they may violate the equality between the conditional and unconditional probabilities of *S* taking some value given that *T* took some value.

Let's simplify this to a minimum and assume that we have only two possible observations of *S* (*a, b*) and three possible observations of *T* (*c, d, x*). The possible combinations of their values are: *(a,c), (a,d), (a,x), (b,c), (b,d), (b,x)*.

Assume that in 100 conducted experiments the combination *(a,c)* never occurred, *(a,d)* occurred 50 times, *(a,x)* never, *(b,c)* 25 times, *(b,d)* never, and *(b,x)* 25 times.

It means that, if *S=a* (with frequency *50/100=0.5*), *T* unconditionally takes the value *d*. If, however, *S=b* (also with frequency *50/100=0.5*), *T* can take either the value *c* (with frequency *25/100=0.25*) or *x* (also with frequency *25/100=0.25*).

Values of random variables *S* and *T* and the numbers of times they occur are in the table below.

|      | T=c | T=d | T=x | Σ(S) |
|------|-----|-----|-----|------|
| S=a  | 0   | 50  | 0   | 50   |
| S=b  | 25  | 0   | 25  | 50   |
| Σ(T) | 25  | 50  | 25  | 100  |
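The dependence of *S* and *T* can be read off the table directly by comparing conditional and unconditional frequencies. A minimal sketch (the value labels and variable names are illustrative, and observed frequencies stand in for probabilities):

```python
# Joint frequency table from above; keys are (S, T) value labels.
counts = {("a", "c"): 0, ("a", "d"): 50, ("a", "x"): 0,
          ("b", "c"): 25, ("b", "d"): 0, ("b", "x"): 25}
n = sum(counts.values())  # 100 experiments

# Unconditional frequency of T=d.
p_d = sum(v for (s, t), v in counts.items() if t == "d") / n  # 0.5

# Conditional frequency of T=d given S=a.
n_a = sum(v for (s, t), v in counts.items() if s == "a")  # 50
p_d_given_a = counts[("a", "d")] / n_a                    # 1.0

# The conditional frequency (1.0) differs from the unconditional one (0.5),
# so S and T are dependent.
print(p_d, p_d_given_a)  # 0.5 1.0
```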

To satisfy the requirement of zero correlation, we have to make sure that their *covariance* is zero, that is, **E**(S·T) = **E**(S)·**E**(T), where

**E**(S·T) = (50ad+25bc+25bx)/100

**E**(S)·**E**(T) = [(50a+50b)/100]·[(25c+50d+25x)/100]

To simplify it even further, let's assign some concrete values to the variables *a, b, c, d* and find the value of *x* from the equation above. Set *a=2, b=4, c=8, d=16*.

The new values of random variables *S* and *T* and the numbers of times they occur are in this new table below.

|      | T=8 | T=16 | T=x | Σ(S) |
|------|-----|------|-----|------|
| S=2  | 0   | 50   | 0   | 50   |
| S=4  | 25  | 0    | 25  | 50   |
| Σ(T) | 25  | 50   | 25  | 100  |
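With these concrete numbers, both expectations become linear functions of *x* and can be checked numerically. A small sketch (the helper function is not part of the lecture):

```python
# Each cell of the table above: (count, S value, T value).
def expectations(x):
    cells = [(0, 2, 8), (50, 2, 16), (0, 2, x),
             (25, 4, 8), (0, 4, 16), (25, 4, x)]
    n = sum(c for c, s, t in cells)                 # 100 experiments
    e_st = sum(c * s * t for c, s, t in cells) / n  # E(S*T) = 24 + x
    e_s = sum(c * s for c, s, t in cells) / n       # E(S) = 3
    e_t = sum(c * t for c, s, t in cells) / n       # E(T) = 10 + 0.25*x
    return e_st, e_s * e_t                          # E(S*T), E(S)*E(T)

print(expectations(0))   # (24.0, 30.0) -- matches 24+x and 30+0.75x at x=0
print(expectations(24))  # (48.0, 48.0) -- the two sides meet at x=24
```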

Then our equation would look like this:

**E**(S·T) = (50·2·16+25·4·8+25·4·x)/100 = 24+x

**E**(S)·**E**(T) = [(50·2+50·4)/100]·[(25·8+50·16+25x)/100] = 30+0.75x

Solving the equation 24+x = 30+0.75x leads to the following value for the unknown *x*:

*x=24*

So, for *x=24* the covariance of our *dependent* random variables *S* and *T* equals zero.

This proves that, while *independence* implies *covariance = 0*, the converse is not true: there are *dependent* random variables with *covariance = 0*.
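The whole construction can be verified end to end. A sketch that expands the table into 100 paired observations with *x=24* and computes the sample covariance directly (dividing by *n*, consistent with the frequency-based expectations above):

```python
# Expand the table into 100 paired observations (S_i, T_i), with x = 24:
# (2,16) occurs 50 times, (4,8) 25 times, (4,24) 25 times.
pairs = [(2, 16)] * 50 + [(4, 8)] * 25 + [(4, 24)] * 25
n = len(pairs)

mean_s = sum(s for s, t in pairs) / n  # 3.0
mean_t = sum(t for s, t in pairs) / n  # 16.0

# Sample covariance: average product of deviations from the means.
cov = sum((s - mean_s) * (t - mean_t) for s, t in pairs) / n

print(cov)  # 0.0 -- zero covariance, hence zero correlation
```

Yet the variables are clearly dependent: whenever *S=2*, *T* is always 16.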