## Wednesday, October 29, 2014

### Unizor - Probability - Geometric Distribution - Properties

Definition

Recall the definition of the Geometric distribution of probabilities.
Assume that we conduct a sequence of independent random experiments - Bernoulli trials with the probability of SUCCESS p - with the goal to reach the first SUCCESS. The number of trials to achieve this goal is, obviously, a random variable. The distribution of probabilities of this random variable is called Geometric.

Formula for the Distribution of Probabilities

Recall from a previous lecture the formula for the probability that a Geometrically distributed random variable γ[p] takes the value K:
P(γ[p]=K) = (1−p)^(K−1)·p
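As a quick numeric check, here is a minimal Python sketch of this formula (the function name geometric_pmf is ours, introduced for illustration):

```python
def geometric_pmf(k, p):
    """Probability that the first SUCCESS arrives on trial k:
    (k-1) FAILUREs followed by one SUCCESS."""
    return (1 - p) ** (k - 1) * p

# With p = 0.5, the first SUCCESS on trial 3 requires two
# FAILUREs and one SUCCESS: 0.5 * 0.5 * 0.5 = 0.125.
print(geometric_pmf(3, 0.5))  # → 0.125
```

Summing these probabilities over all K from 1 upward yields 1, as any distribution of probabilities must.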

Graphical Representation

Our random variable can take any integer value from 1 to infinity with the probability expressed by the above formula.
The graphical representation of this distribution of probabilities consists of a sequence of rectangles with bases [0,1], [1,2], etc., where the height of the Kth rectangle equals (1−p)^(K−1)·p.
This resembles a staircase whose steps gradually decrease in height, from p for the first step down toward 0 as we move farther and farther from the beginning.

Expectation (Mean)

The expectation of a random variable that takes values x1, x2, etc. with probabilities p1, p2, etc. is a weighted average of its values with probabilities as weights:
E = x1·p1+x2·p2+...

In our case, since the random variable can take any integer value from 1 to infinity with the probabilities given by the formula above, its expectation equals
E(γ[p]) =
1·(1−p)^0·p +
2·(1−p)^1·p +
3·(1−p)^2·p +
4·(1−p)^3·p + ...
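Before deriving the closed form, we can check this series numerically with a short Python sketch (the helper name geometric_mean_series is ours):

```python
def geometric_mean_series(p, terms=10_000):
    """Partial sum of the expectation series:
    sum over k of k * (1-p)^(k-1) * p."""
    return sum(k * (1 - p) ** (k - 1) * p for k in range(1, terms + 1))

# For p = 0.25 the partial sums settle near 4.
print(geometric_mean_series(0.25))
```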

To calculate the value of this expression, multiply both sides of the equation by the factor (1−p):
(1−p)·E(γ[p]) =
1·(1−p)^1·p +
2·(1−p)^2·p +
3·(1−p)^3·p + ...
and subtract the result from the original sum:
E(γ[p]) − (1−p)·E(γ[p]) =
(1−p)^0·p +
(1−p)^1·p +
(1−p)^2·p +...

On the left of this equation we have p·E(γ[p]).
On the right we have a geometric series that converges to p/[1−(1−p)] = p/p = 1.

Therefore, p·E(γ[p]) = 1
And the mean value (expectation) of γ[p] is
E(γ[p]) = 1/p
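The result E(γ[p]) = 1/p can also be checked by simulation. This Python sketch (function names are ours) runs Bernoulli trials until the first SUCCESS and averages the trial counts over many runs:

```python
import random

def trials_to_success(p, rng):
    """Run Bernoulli trials until the first SUCCESS; return how many it took."""
    count = 1
    while rng.random() >= p:  # FAILURE with probability 1-p
        count += 1
    return count

rng = random.Random(2014)  # fixed seed for reproducibility
p = 0.2
runs = 100_000
average = sum(trials_to_success(p, rng) for _ in range(runs)) / runs
print(average)  # should be close to 1/p = 5
```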

This value of the expectation is intuitively correct: when the probability of SUCCESS is greater, we expect to reach it sooner, in fewer trials on average. If the probability of SUCCESS equals 1, so that FAILURE is impossible, we expect SUCCESS on the very first trial. Finally, as the probability of SUCCESS diminishes, we expect to reach it later and later, after more and more trials on average.

Variance, Standard Deviation

The variance of a random variable that takes values x1, x2, etc. with probabilities p1, p2, etc. is a weighted average of the squared deviations of its values from its expectation E:
Var = (x1−E)^2·p1+(x2−E)^2·p2+...

In our case, since the random variable can take any integer value from 1 to infinity with the probabilities given by the formula above, and its expectation equals 1/p, the variance equals
Var(γ[p]) =
(1−1/p)^2·(1−p)^0·p +
(2−1/p)^2·(1−p)^1·p +
(3−1/p)^2·(1−p)^2·p +
(4−1/p)^2·(1−p)^3·p + ...

Reducing this complex expression to a short closed form involves lengthy calculations, which mathematicians have carried out and documented many times. The idea is similar to the one we used for the mean: multiply the sum by the common ratio (1−p) of the geometric progression and subtract the result from the original sum. The final formula is:
Var(γ[p]) = (1−p)/(p^2)
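As with the mean, the closed form can be checked against a partial sum of the variance series in Python (the helper name geometric_variance_series is ours):

```python
def geometric_variance_series(p, terms=10_000):
    """Partial sum of (k - 1/p)^2 * (1-p)^(k-1) * p over k."""
    mu = 1 / p  # the expectation derived above
    return sum((k - mu) ** 2 * (1 - p) ** (k - 1) * p
               for k in range(1, terms + 1))

# Closed form (1-p)/p^2: for p = 0.5 this gives 0.5 / 0.25 = 2.
print(geometric_variance_series(0.5))
```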

For probability values between 0 and 1 this expression is always positive, and it equals 0 when the probability of SUCCESS equals 1 (as it should, since in that case SUCCESS always occurs on the first trial, with no deviation). As the probability of SUCCESS diminishes toward 0, the variance grows to infinity (as it should, since on average it takes more and more trials to reach SUCCESS).

As for the standard deviation, it is the square root of the variance:
σ(γ[p]) = [√(1−p)]/p
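A one-line Python check of this relationship (the function name is ours):

```python
from math import sqrt

def geometric_sigma(p):
    """Standard deviation of the Geometric distribution: sqrt(1-p)/p."""
    return sqrt(1 - p) / p

# For p = 0.5: sqrt(0.5) / 0.5 = sqrt(2) ≈ 1.414...
print(geometric_sigma(0.5))
```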