Tuesday, October 21, 2014

Unizor - Probability - Binary Distributions





A binary probability distribution is a distribution related to random experiments with just two outcomes.

In this lecture we will consider two binary distributions:
Bernoulli distribution and Binomial distribution.

Bernoulli Distribution

This is a distribution on a sample space that contains only two elementary events, called SUCCESS and FAILURE. A probability p (0≤p≤1) is assigned to one of them and the probability q=1−p is assigned to the other.

Usually, we will not deal with this sample space or its elementary events directly. Instead, we assume that there is a random variable ξ, defined as a numeric function on this sample space, that takes the value of 1 on one elementary event - ξ(SUCCESS)=1 - and the value of 0 on the other - ξ(FAILURE)=0, with probabilities, correspondingly, p and q=1−p.
Symbolically,
P(ξ=1) = p
P(ξ=0) = q = 1−p

We can describe this differently, using the random variable ξ we defined above that takes two values 1 and 0, correspondingly on SUCCESS and FAILURE. Assume we repeat our experiment with two outcomes again and again, and the result of the j-th experiment is Ej. Then ξ(Ej)=1 if Ej=SUCCESS and ξ(Ej)=0 if Ej=FAILURE.
Then, if we conduct N experiments, the sum of all ξ(Ej), where j runs from 1 to N, symbolically expressed as Σ{j∈[1,N]} ξ(Ej), is the number of times our experiment ended in SUCCESS.
Therefore, the ratio of the number of SUCCESS outcomes to the total number of experiments equals
[Σξ(Ej)] / N
Since the limit of this ratio, as the number of experiments increases to infinity, is the definition of the measure of probability of the outcome SUCCESS, we can write the following scary looking equality that symbolically states what we talked about when defining the Bernoulli distribution:
lim(N→∞){[Σξ(Ej)] / N} = p

A possible interpretation of the above equality that involves the limit might be that, with a large number of experiments N, the number of SUCCESS outcomes is approximately equal to p·N.
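
To make this interpretation concrete, here is a minimal simulation sketch in Python. The function name success_frequency, the fixed seed and the sample value p=0.3 are illustrative assumptions, not part of the lecture. The sketch runs N simulated Bernoulli experiments and returns the ratio [Σξ(Ej)] / N, which should settle near p as N grows.

```python
import random


def success_frequency(p, n, seed=0):
    """Run n simulated Bernoulli experiments and return the fraction of SUCCESS outcomes."""
    rng = random.Random(seed)  # fixed seed (an assumption) so that runs are repeatable
    # xi(Ej) is 1 on SUCCESS (probability p) and 0 on FAILURE (probability 1 - p)
    successes = sum(1 if rng.random() < p else 0 for _ in range(n))
    return successes / n


if __name__ == "__main__":
    p = 0.3  # hypothetical probability of SUCCESS
    for n in (10, 1_000, 100_000):
        print(n, success_frequency(p, n))
```

The printed ratios depend on the seed, but for large N they are expected to stay close to p. This is only an empirical illustration of the limit, not a proof of it.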

A simple example of a Bernoulli distribution is a coin toss. With an ideal coin, heads and tails are equally likely to come up, therefore their probabilities are 1/2 each:
P(HEADS) = P(TAILS) = 1/2
If we associate a random variable ξ with this random experiment and set ξ(HEADS)=1 and ξ(TAILS)=0, we obtain a classic example of a Bernoulli random variable ξ:
P(ξ=1) = p = 1/2 and
P(ξ=0) = q = 1−p = 1/2

Binomial Distribution

Consider N independent Bernoulli random experiments with results SUCCESS or FAILURE and the same probability of SUCCESS in each one. The number of SUCCESSes among the results of this combined experiment is a random variable. It can take values from 0 to N with different probabilities. The distribution of this random variable is called Binomial.

We are interested in the quantitative characteristics of this distribution. More precisely, we would like to calculate the probability of having exactly K SUCCESSes out of N independent Bernoulli experiments with the probability of SUCCESS equal to p in each one, where K can be any integer from 0 to N.

Using the language of Bernoulli random variables, our task can be formulated differently.
Let ξi be a Bernoulli random variable that describes the i-th Bernoulli experiment, that is, it equals 1 with probability p and equals 0 with probability q=1−p. Then the sum of N such random variables is exactly the number of SUCCESSes in the N Bernoulli experiments we are talking about.
So, the random variable
η = Σ{i∈[1,N]} ξi ,
where all ξi are independent Bernoulli random variables,
has Binomial distribution.
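
The construction of η as a sum of Bernoulli variables can be sketched in Python as follows. The names sample_eta and empirical_distribution, the seed and the number of trials are assumptions made for this illustration only. Each call to sample_eta produces one value of η, and repeating it many times gives observed frequencies that approximate P(η=K).

```python
import random
from collections import Counter


def sample_eta(n, p, rng):
    """One value of eta: the sum of n independent Bernoulli variables with P(1) = p."""
    return sum(1 if rng.random() < p else 0 for _ in range(n))


def empirical_distribution(n, p, trials, seed=0):
    """Estimate P(eta = K) for K = 0..n as observed frequencies over many trials."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable runs
    counts = Counter(sample_eta(n, p, rng) for _ in range(trials))
    return [counts[k] / trials for k in range(n + 1)]


if __name__ == "__main__":
    # Four coin tosses repeated 20000 times; index K holds the frequency of eta = K.
    print(empirical_distribution(4, 0.5, 20_000))
```

The frequencies always add up to 1, and each of them only approximates the corresponding probability.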

Let's now calculate the probabilities of our Binomial random variable η taking different values, that is, let's determine the quantitative characteristics of this distribution.

We are interested in determining the value of P(η=K) for all K from 0 to N.

For a sum η of N independent Bernoulli variables ξi to be equal to K, exactly K out of N of these Bernoulli variables must be equal to 1 and the other N−K variables must be equal to 0.
From combinatorics we know that we can choose K elements out of N in
C(N,K) = (N!)/[(K!)·(N−K)!]
ways.
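Once chosen, these K random variables must all be equal to 1, which, because they are independent, happens with probability p^K, and the other N−K variables must all be equal to 0, which happens with probability q^(N−K). So each particular choice occurs with probability p^K · q^(N−K). Different choices correspond to mutually exclusive events, so their probabilities add up.
That determines the probability of a Binomial random variable to have a value of K: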
P(η=K) = C(N,K) · p^K · q^(N−K)
The only two parameters of this distribution are the number N of Bernoulli random variables participating in the Binomial distribution (that is, the number of Bernoulli random experiments whose results we observe) and the probability of SUCCESS p for each such Bernoulli experiment.
The probability q is not a new parameter since q=1−p, and the formula above can be written as
P(η=K) = C(N,K) · p^K · (1−p)^(N−K)
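
As a quick worked example, for N=3 tosses of an ideal coin (p=1/2) the probability of exactly K=2 heads is C(3,2) · (1/2)^2 · (1/2)^1 = 3 · 1/8 = 3/8. The Python sketch below evaluates the same formula and, as a sanity check, compares it with a brute-force sum over all sequences of N zeros and ones. The function names binomial_pmf and binomial_pmf_by_enumeration are assumptions introduced for this illustration.

```python
from itertools import product
from math import comb, isclose


def binomial_pmf(n, k, p):
    """P(eta = K) = C(N,K) * p^K * (1-p)^(N-K)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_pmf_by_enumeration(n, k, p):
    """Add up the probabilities of all 0/1 sequences of length n with exactly k ones."""
    total = 0.0
    for outcome in product((0, 1), repeat=n):
        if sum(outcome) == k:
            # each such sequence has probability p^K * (1-p)^(N-K) by independence
            total += p**k * (1 - p) ** (n - k)
    return total


if __name__ == "__main__":
    print(binomial_pmf(3, 2, 0.5))  # the worked example above, 3/8
    # the probabilities over K = 0..N should add up to 1
    print(isclose(sum(binomial_pmf(10, k, 0.3) for k in range(11)), 1.0))
```

The enumeration visits all 2^N sequences, so it is practical only for small N. It is included here only to check the counting argument behind C(N,K).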
