Tuesday, June 28, 2016

Unizor - Linear Regression - Problem 1

Notes to a video lecture on http://www.unizor.com

Linear Regression - Problem 1

Consider a linear regression model described in the previous lecture:
Y = a·X + b + ε
where independent variable X is represented by sample data
x1, x2, ..., xn
and observed values of dependent variable Y are
y1, y2, ..., yn.

We have introduced two averages of the sample data:
Ave(x)=(x1+x2+...+xn)/n = U
Ave(y)=(y1+y2+...+yn)/n = V

Using new variables
Xk = xk − U and Yk = yk − V
(where index k is from 1 to n)
we found the best possible value for the coefficient a in the linear regression formula:
a = (ΣXk·Yk) / (ΣXk²)
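
As a small illustration of the formula above, here is a minimal Python sketch that computes a from centered data. The function name slope_centered and the sample numbers are hypothetical, chosen only for this example; the lecture itself does not prescribe any code.

```python
def slope_centered(xs, ys):
    """Slope a = sum(Xk*Yk) / sum(Xk^2), where Xk = xk - U and Yk = yk - V.

    Hypothetical helper for illustration; assumes xs and ys have the same
    nonzero length and that the xs are not all equal.
    """
    n = len(xs)
    u = sum(xs) / n  # U = Ave(x)
    v = sum(ys) / n  # V = Ave(y)
    centered_x = [x - u for x in xs]  # Xk
    centered_y = [y - v for y in ys]  # Yk
    numerator = sum(cx * cy for cx, cy in zip(centered_x, centered_y))
    denominator = sum(cx * cx for cx in centered_x)
    return numerator / denominator


# Made-up sample lying exactly on the line y = 2x + 1
sample_x = [1, 2, 3, 4]
sample_y = [3, 5, 7, 9]
print(slope_centered(sample_x, sample_y))
```

Since the made-up points lie exactly on a straight line, the computed value should match that line's slope.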

Problem

Using the sample averaging function Ave() applied as
Ave(x)=(x1+x2+...+xn)/n
Ave(y)=(y1+y2+...+yn)/n
Ave(xy)=(x1y1+...+xnyn)/n
Ave(x²)=(x1²+...+xn²)/n
prove that the coefficient a in the linear regression formula, expressed in terms of the original sample data xk and yk, is
a = [Ave(xy)−Ave(x)Ave(y)] / [Ave(x²)−Ave²(x)]
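
Before attempting the proof, it may help to check numerically that the two expressions for a agree. The Python sketch below does that on a made-up sample; the names ave, slope_centered and slope_from_averages are hypothetical. Agreement on one data set is only a sanity check and does not replace the proof.

```python
import math


def ave(values):
    """Sample average Ave() of a list of numbers."""
    return sum(values) / len(values)


def slope_centered(xs, ys):
    """a = sum(Xk*Yk) / sum(Xk^2) with Xk = xk - Ave(x), Yk = yk - Ave(y)."""
    u, v = ave(xs), ave(ys)
    numerator = sum((x - u) * (y - v) for x, y in zip(xs, ys))
    denominator = sum((x - u) ** 2 for x in xs)
    return numerator / denominator


def slope_from_averages(xs, ys):
    """a = [Ave(xy) - Ave(x)Ave(y)] / [Ave(x^2) - Ave(x)^2]."""
    ave_xy = ave([x * y for x, y in zip(xs, ys)])
    ave_x2 = ave([x * x for x in xs])
    return (ave_xy - ave(xs) * ave(ys)) / (ave_x2 - ave(xs) ** 2)


# Made-up sample data, not taken from the lecture
sample_x = [1, 2, 4, 7]
sample_y = [2, 3, 7, 10]

a_centered = slope_centered(sample_x, sample_y)
a_averages = slope_from_averages(sample_x, sample_y)
print(a_centered, a_averages)
print(math.isclose(a_centered, a_averages))
```

The comparison uses math.isclose rather than == because the two formulas round differently in floating-point arithmetic even though they are algebraically equal.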