Wednesday, March 23, 2016

Unizor - Statistical Distribution - Task A - Quality

Unizor - Creative Minds through Art of Mathematics - Math4Teens

Notes to a video lecture on

Statistical Distribution
Task A - Quality

Task A. The values our random variable ξ takes are discrete and theoretically known - X1, X2... XK - but the probabilities of taking these values - p1, p2... pK - are unknown, and we need to evaluate them.
Examples: rolling a standard cubical die (6 known outcomes), a Bernoulli random variable (only two values - 1 and 0).

Our task is to evaluate the unknown probabilities of ξ to take different values based on experimental results with this random variable.

Let's recall the approach suggested in a general lecture on statistical distribution.
Assume that, as the result of N experiments, random variable ξ (that could theoretically take values X1, X2... XK) took value Xj in νj experiments, where j∈[1,K].
Obviously, ν1+ν2+...+νK=N.
Also notice that all νj are random variables, sum of which is constant (non-random) N.

Our best approximation for the probability pj of ξ to take value Xj is random variable νj/N - the empirical frequency of occurrence of this event.
So, our statistical distribution of random variable ξ looks like
pj = Prob{ξ=Xj} ≅ νj/N
where j∈[1,K].
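The approximation above can be illustrated with a small simulation. The sketch below (the fair die, seed and sample size N are illustrative assumptions, not part of the lecture) rolls a die N times, counts νj for each face Xj, and reports the empirical frequencies νj/N:

```python
import random

# Illustrative experiment: roll a fair six-sided die N times and
# approximate each probability p_j = 1/6 by the empirical frequency nu_j/N.
random.seed(0)          # fixed seed so the run is reproducible
N = 60000               # number of experiments (assumed for the demo)

counts = {x: 0 for x in range(1, 7)}   # nu_j for each face X_j
for _ in range(N):
    counts[random.randint(1, 6)] += 1

# nu_j/N is our statistical approximation of p_j
frequencies = {x: counts[x] / N for x in counts}

# The frequencies sum to 1 because nu_1 + nu_2 + ... + nu_K = N.
for x in range(1, 7):
    print(f"Prob{{xi = {x}}} ~ {frequencies[x]:.4f}")
```

With N this large, each frequency comes out close to the theoretical 1/6 ≅ 0.1667, and the frequencies sum exactly to 1.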

The quality of this approximation should be better with larger number of experiments and we will attempt to evaluate this quality.

Let's concentrate on one particular probability p1=Prob{ξ=X1} and its empirical approximation with a random variable ν1/N. The other probabilities will be similar.

Consider a new Bernoulli random variable β that takes the value 1 if variable ξ takes value X1 and 0 otherwise.
Obviously, as we know from the properties of Bernoulli random variables, its expectation is E(β)=p1 and its variance is Var(β)=p1·(1−p1).
Out of N experiments value β=1 occurred ν1 times and value β=0 occurred N−ν1 times.
Therefore, we can say that random variable ν1 (that we want to use to evaluate probability p1 of ξ to take value X1) equals to β1+β2+...+βN, where all βj are independent identically distributed random variables, distributed exactly as β (that is, they are equal to 1 if ξ=X1 and 0 otherwise).
Thus, we have reduced our problem - evaluating the quality of the approximation of the unknown probability p1 with the empirical frequency ν1/N - to an already researched problem: evaluating the probability of a Bernoulli random variable β to take value 1, based on multiple experiments with this random variable producing results
β1, β2,...βN.

As we know, the above mentioned sum β1+β2+...+βN
has expectation N·p1 and variance N·p1·(1−p1).
The empirical frequency
ν1/N = (β1+β2+...+βN)/N
(a sample average of the results of experiments) has a distribution close to Normal (by the Central Limit Theorem), with expectation p1 (and, therefore, serves as an unbiased approximation of p1) and variance p1·(1−p1)/N. It is this variance that we want to evaluate to determine the quality of the approximation.

The quality of approximation of p1 with ν1/N is measured by its standard deviation from expected value.
A crude evaluation of this standard deviation, which depends only on the number of experiments N, equals, as we know from the properties of Bernoulli random variables, σmax=1/(2√N). So, with 95% certainty, we can say that the approximation of probability p1 with empirical frequency ν1/N
is an unbiased evaluation with margin of error not exceeding 2σmax=1/√N.
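The crude bound is easy to compute; the sketch below (the sample sizes are illustrative) shows it and the fact that quadrupling the number of experiments halves the margin of error:

```python
import math

# Crude 95% margin of error for approximating p_1 by nu_1/N.
# Since p*(1-p) <= 1/4 for any p, the standard deviation of nu_1/N
# is at most sigma_max = 1/(2*sqrt(N)), giving a margin 2*sigma_max = 1/sqrt(N).
def crude_margin(N):
    sigma_max = 1 / (2 * math.sqrt(N))
    return 2 * sigma_max   # equals 1/sqrt(N)

print(crude_margin(100))   # prints 0.1
print(crude_margin(400))   # prints 0.05 - four times the experiments, half the margin
```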

More precise (albeit, slightly less certain) evaluation of the quality of our approximation of p1 with ν1/N can be obtained using the sample variance of random variable ν1/N.
First, let's calculate the sample variance of β - the average of squared deviations of its values βj from their sample average
Σ(βj)/N = ν1/N.
Out of N experiments β1, β2,...βN the value β=1 occurred ν1 times and the value β=0 occurred N−ν1 times. Therefore, the sample variance of β equals:
s² = [ν1·(1−ν1/N)²+
+(N−ν1)·(0−ν1/N)²]/(N−1) =
= [ν1(N−ν1)]/[N(N−1)]

From this we can calculate a sample variance of our estimate ν1/N of probability p1 by dividing by N:
σ² = s²/N = [ν1(N−ν1)]/[N²(N−1)]

Extending this from the evaluation of probability p1 to any pj, with 95% certainty we can say that νj/N can be used as an approximation of probability pj with margin of error
2σ = 2√{[ν(N−ν)]/[N²(N−1)]}
In the above formula νj is written as ν for brevity, since the formula applies to any νj.
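The precise margin can be compared with the crude bound 1/√N numerically. In the sketch below (the counts ν and sample size N are illustrative), the two margins nearly coincide when ν/N is near 1/2, but the precise one is much tighter for a rare value:

```python
import math

# Sharper 95% margin of error based on the sample variance: given that a
# value occurred nu times in N experiments, p ~ nu/N with margin
# 2*sigma = 2*sqrt(nu*(N - nu) / (N**2 * (N - 1))).
def precise_margin(nu, N):
    return 2 * math.sqrt(nu * (N - nu) / (N**2 * (N - 1)))

N = 10000
print(precise_margin(5000, N))  # close to the crude bound 1/sqrt(N) = 0.01
print(precise_margin(100, N))   # much smaller: the crude bound is very loose here
```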
