Wednesday, March 9, 2016

Unizor - Probability - Cumulative Distribution Function





Unizor - Creative Minds through Art of Mathematics - Math4Teens

Notes to a video lecture on http://www.unizor.com

Cumulative Distribution Function

Let's consider the case of a continuously distributed random variable ξ.
For instance, ξ is the result of measuring the weight of a tennis ball (assuming we can measure it with absolute precision).
As we noted in the previous lecture, where we introduced the concept of a continuous distribution, the probability that this weight ξ equals any exact real number is zero. However, the probability that it takes a value in some range, say from 55g to 58g, is not zero.

This property of continuous distributions is fundamentally different from the corresponding property of discrete distributions, where the probability that a random variable takes each of its possible values is non-zero.

Our task is to describe the continuous distribution of a random variable, but a similar approach, listing all values with their corresponding probabilities, would not work, because there are infinitely many values and each specific value occurs with probability zero.

Recall that probability is in many ways similar to a measure such as length. For instance, the length of any individual point is zero, but the length of any segment is a positive number.

Returning to random variables with a continuous distribution and using the above analogy, assume for definiteness that our random variable ξ (the weight of a tennis ball) can take values in the range from 50 to 60, and that we would like to describe the probabilities of all intervals of the weight. A function of a numerical argument x (the weight) that represents all this knowledge is the probability that our random variable ξ takes a value less than x:
Fξ(x) = Prob{ξ < x}

Now, if we want to know the probability that the weight is between w1 and w2 (where w1 is less than w2), we can calculate it as
P[w1,w2] = Fξ(w2)−Fξ(w1).
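
The reasoning behind this formula can be spelled out in one short derivation. The sketch below uses only the definition of Fξ given above; the subscripts w_1 and w_2 are the same w1 and w2 as in the text.

```latex
\begin{aligned}
\{\xi < w_2\} &= \{\xi < w_1\} \cup \{w_1 \le \xi < w_2\}
  \quad \text{(two disjoint events, } w_1 < w_2\text{)} \\
F_\xi(w_2) &= F_\xi(w_1) + \mathrm{Prob}\{w_1 \le \xi < w_2\} \\
\mathrm{Prob}\{w_1 \le \xi < w_2\} &= F_\xi(w_2) - F_\xi(w_1)
\end{aligned}
```

For a continuous distribution each endpoint has probability zero, so it makes no difference whether w1 and w2 themselves are included in the interval.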

Since all values of ξ are concentrated between 50 and 60, this function equals 0 for all x less than or equal to 50 and equals 1 for all x greater than or equal to 60.
The function Fξ(x) is monotonically non-decreasing: increasing x can only enlarge the event {ξ < x}, so its probability cannot go down. As x increases from 50 to 60, Fξ(x) increases from 0 to 1.
This function is called the distribution function (or cumulative distribution function) of our random variable ξ.

This distribution function fully defines the probabilities of all intervals of values of our random variable ξ. If the probability that the weight of a tennis ball falls in any interval between a and b is proportional to the width of that interval (a so-called uniform distribution of probabilities), the graph of the corresponding distribution function is flat at 0 to the left of a, rises along a straight line from 0 at x=a to 1 at x=b, and stays flat at 1 to the right of b

(where a is the minimum weight of a tennis ball, which we assume is 50 grams, and b is its maximum weight, 60 grams).
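
To make the uniform case concrete, here is a minimal Python sketch. The function names `uniform_cdf` and `interval_probability` are our own choices for illustration, and the limits 50 and 60 are the ones assumed in the text.

```python
def uniform_cdf(x, a=50.0, b=60.0):
    """F(x) = Prob{xi < x} for a weight assumed to be uniform on [a, b]."""
    if x <= a:
        return 0.0          # no values lie below the minimum weight
    if x >= b:
        return 1.0          # all values lie below anything past the maximum
    return (x - a) / (b - a)  # straight line from 0 at a to 1 at b


def interval_probability(cdf, w1, w2):
    """Probability that the weight falls between w1 and w2 (w1 < w2)."""
    return cdf(w2) - cdf(w1)


# Probability that the weight is between 55g and 58g.
p = interval_probability(uniform_cdf, 55.0, 58.0)
print(round(p, 3))  # prints 0.3
```

The interval from 55 to 58 is 3 units wide out of a total range of 10, so under the uniform assumption its probability is 3/10.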

In case the weight is distributed non-uniformly from a to b, the distribution function would grow faster within more probable intervals to accommodate greater probability concentrated in these areas.
Here is an example:
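
As a numerical illustration of a non-uniform case, the sketch below assumes a triangular distribution on the range from 50 to 60 with its peak at 55. This particular shape is our assumption, chosen only because its distribution function has a simple closed form; the name `triangular_cdf` is hypothetical as well.

```python
def triangular_cdf(x, a=50.0, c=55.0, b=60.0):
    """F(x) = Prob{xi < x} for an assumed triangular distribution on [a, b]
    whose density peaks at c."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= c:
        return (x - a) ** 2 / ((b - a) * (c - a))
    return 1.0 - (b - x) ** 2 / ((b - a) * (b - c))


# Two intervals of the same width (2 grams): one at the edge, one around the peak.
edge = triangular_cdf(52.0) - triangular_cdf(50.0)
centre = triangular_cdf(56.0) - triangular_cdf(54.0)
print(round(edge, 2), round(centre, 2))  # prints 0.08 0.36
```

Both intervals have the same width, but the distribution function climbs much more steeply around the peak, so the central interval carries a far larger share of the probability.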


We can construct a distribution function for a discrete random variable as well. Consider the probability mass function of this variable. As the argument x moves from the variable's minimum value to its maximum, the distribution function remains constant between the points where mass is concentrated and jumps up every time x passes the next such point, adding its mass to the cumulative probability.
Graphically, it is a staircase (a step function) that rises from 0 to 1:

Here the discrete random variable X takes values x1, x2, etc. with probabilities PX(x1), PX(x2), etc.
Every time the argument x passes the next value xi that the random variable X can take with probability PX(xi), the distribution function grows by that probability, reaching 1 once x has passed the maximum value and staying equal to 1 from then on.
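
The staircase behaviour can be sketched in a few lines of Python. The helper name `discrete_cdf` and the fair-die example are our own illustrative choices. The code follows the definition used in these notes, Prob{X < x} with a strict inequality; many textbooks use "less than or equal" instead, which only changes the value exactly at the jump points.

```python
def discrete_cdf(values, probs):
    """Build F with F(x) = Prob{X < x}: the sum of P(x_i) over all x_i < x."""
    pairs = sorted(zip(values, probs))

    def F(x):
        # Add up the mass of every point that x has already passed.
        return sum((p for v, p in pairs if v < x), 0.0)

    return F


# Hypothetical example: a fair six-sided die, each face with probability 1/6.
F = discrete_cdf([1, 2, 3, 4, 5, 6], [1 / 6] * 6)

for x in (1, 1.5, 3.5, 6, 6.5):
    print(x, round(F(x), 3))
```

Between two neighbouring values (for example anywhere from just above 3 up to 4) the function stays constant, and it gains 1/6 each time x passes another face of the die.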

As we see, the distribution function is, in a way, a universal tool: it applies to both discrete and continuous distributions and is sufficient to recreate the full picture of a probability distribution, that is, to determine the probability of any event associated with the random variable it belongs to.
