Thursday, February 4, 2016

Unizor - Bernoulli Statistics - Solution

Unizor - Creative Minds through Art of Mathematics - Math4Teens

Notes to a video lecture on

Bernoulli Statistics - Solution

Obviously, one series of N experiments (Bernoulli trials in our case) produces a set of results that differs from another series of N experiments.
Our goal is to evaluate the probabilistic characteristics of a Bernoulli random variable ξ based on a series of experiments with it. But, since different series produce different results, we must consider the results of our series of N experiments as a set of independent random variables, identically distributed as ξ, that we denote as ξ1, ξ2,...,ξN.

The only probabilistic characteristic of a Bernoulli random variable ξ that we are interested in is the probability P that ξ takes the value 1, since all other properties (like the probability of taking the value 0, or the variance) follow from it.
Since E(ξ)=P, we will attempt to approximate it with the random variable
η = (ξ1+ξ2+...+ξN) / N
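Concretely, η is just the average of the observed outcomes. A minimal sketch in Python, using a small hypothetical sample of N=10 outcomes (the values are made up for illustration):

```python
# Hypothetical outcomes of N = 10 Bernoulli trials (illustrative values only)
sample = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]

N = len(sample)
eta = sum(sample) / N  # eta = (xi_1 + xi_2 + ... + xi_N) / N
print(eta)             # 0.6 - the empirical frequency of the outcome 1
```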

Let's define the concept of a good approximation of a constant P by a value of a random variable η. Since η is a random variable, it can take many values with different probabilities. It makes sense to say that η is a good approximation of P if the values η can take are relatively close to P, with values closer to P being more probable and values farther from P less probable.

A good measure of this quality of approximation η≅P might be the mathematical expectation E(η) (which we hope should be very close or even equal to P) and the variance E{[η−E(η)]²} - a measure of how closely the values of η lie around its mathematical expectation.

E(ξ) = 1·P + 0·(1−P) = P
Var(ξ) = E[ξ − E(ξ)]² = P·(1−P)
Based on these, we could calculate the mathematical expectation and variance of η:
E(η) = N·E(ξ)/N = P
Var(η) = N·Var(ξ)/N² = Var(ξ)/N = P(1−P)/N
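These formulas can be checked numerically by enumerating the full distribution of the number of successes in N trials (a sketch with illustrative values P=0.3 and N=20, not taken from the lecture):

```python
from math import comb

P, N = 0.3, 20

# Probability that exactly k of the N trials produce 1 (Binomial distribution)
probs = [comb(N, k) * P**k * (1 - P)**(N - k) for k in range(N + 1)]

# Expectation and variance of eta = k/N over this distribution
E = sum(p * (k / N) for k, p in enumerate(probs))
Var = sum(p * (k / N - E)**2 for k, p in enumerate(probs))

print(abs(E - P) < 1e-12)                  # True: E(eta) = P
print(abs(Var - P * (1 - P) / N) < 1e-12)  # True: Var(eta) = P(1-P)/N
```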

Although we don't know the probability P=Prob{ξ=1} and, consequently, don't know Var(η), we can evaluate its range.
Since Var(ξ)=P(1−P), we can analyze the quadratic polynomial y=x(1−x) on the segment [0,1] and find its maximum.
The maximum value of y, attained at x=1/2, is 1/4.

Var(η) = Var(ξ)/N ≤ 1/(4N)
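The bound x(1−x) ≤ 1/4 can be confirmed by scanning the polynomial on a grid over [0,1] (a quick numeric check, not a proof):

```python
# Evaluate y = x(1 - x) on a grid of 1001 points over [0, 1]
ys = [(i / 1000) * (1 - i / 1000) for i in range(1001)]

print(max(ys))            # 0.25, the maximum value of y
print(ys.index(max(ys)))  # 500, i.e. the maximum is attained at x = 1/2
```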

Let's understand what kind of random variable η is and what its distribution is.
Clearly, η = B/N, where B is a random variable with the Binomial distribution, so η can take values from 0 to 1:
0/N=0, 1/N, 2/N,...,N/N=1
(see the corresponding lecture in this course by following the links Probability - Binary Distributions - Binomial).
This is a rather complicated distribution, and finding an interval Δ around its expectation that corresponds to a given level of certainty p is not easy. Instead, we will take a shortcut that approximates this distribution with the Normal distribution along the bell curve.

As we know from the Central Limit Theorem (see Probability - Normal Distribution in this course), given certain rather liberal conditions, the average of a large number of random variables behaves approximately like a normal random variable with the same expectation and variance.
Assuming the number of experiments N is large enough to justify this approximation, let's use this property and approximate the distribution of our random variable η with a normal random variable that has mathematical expectation P and variance 1/(4N).
The variance of η is actually less than or equal to 1/(4N), so our estimate for Δ might be a little wider than necessary, which works in our favor: the actual precision of our estimate of the probability P will be slightly better than calculated from the properties of the normal random variable.
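The quality of this approach can be checked against the exact Binomial distribution. A sketch with illustrative values P=0.5 and N=100 (chosen for this example, not from the lecture), computing the exact probability that η falls within 2σ of P:

```python
from math import comb, sqrt

P, N = 0.5, 100
sigma = 1 / (2 * sqrt(N))  # worst-case standard deviation of eta

# Exact probability that |eta - P| <= 2*sigma, summed over the Binomial distribution
lo, hi = P - 2 * sigma, P + 2 * sigma
prob = sum(comb(N, k) * P**k * (1 - P)**(N - k)
           for k in range(N + 1)
           if lo <= k / N <= hi)

print(prob > 0.9545)  # True: the actual certainty exceeds the normal-based level
```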

Let's use "sigma limits" for a normal variable with expectation P and variance 1/(4N), whose standard deviation therefore equals σ=1/(2√N), to determine the margin of error for three typical cases:
(a) for p≅0.6825 the margin of error is not greater than σ=1/(2√N)
(b) for p≅0.9545 the margin of error is not greater than 2σ=1/√N
(c) for p≅0.9973 the margin of error is not greater than 3σ=3/(2√N)
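These three cases can be tabulated for any given N. A sketch with an illustrative N=400:

```python
from math import sqrt

N = 400
sigma = 1 / (2 * sqrt(N))  # worst-case standard deviation: 1/(2*sqrt(400)) = 0.025

# The three typical certainty levels and their sigma multipliers
for mult, certainty in [(1, 0.6825), (2, 0.9545), (3, 0.9973)]:
    print(f"certainty {certainty}: margin of error <= {mult * sigma}")
```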

Using the above calculations, for a given number of experiments N and one of three typical certainty levels, we found the margin of error.
Alternatively, if the certainty level and margin of error are given, we can easily solve the above equations for N - the minimum number of experiments required to achieve the required margin of error and certainty level.
Finally, we can determine the certainty level p for a given number of experiments and margin of error.

As an example, with N=100 and p=0.9545 (the "two sigma" case) the margin of error does not exceed 2σ=1/√100=1/10. That means that the empirical frequency of ξ=1, obtained as a result of 100 experiments, deviates from the real probability Prob{ξ=1} by no more than 1/10 with a certainty level of 0.9545.
With 10,000 experiments our precision is greater: the margin of error does not exceed 1/√10000=0.01 with the same certainty.
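Solving the "two sigma" relation (margin = 1/√N) for N gives N = 1/margin², which reproduces both examples above (a sketch; the helper name is made up for illustration):

```python
from math import ceil

def required_n(margin):
    # "Two sigma" case (certainty ~0.9545): margin = 1/sqrt(N), so N = 1/margin^2
    return ceil((1 / margin) ** 2)

print(required_n(0.1))   # 100 experiments for a margin of error of 1/10
print(required_n(0.01))  # 10000 experiments for a margin of error of 0.01
```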
