Friday, November 21, 2014

Unizor - Probability - Normal Distribution - Sigma Limits





As we know, the normal distribution of a random variable (or the distribution of probabilities for a normal random variable) is defined by two parameters:
expectation (or mean) μ and
standard deviation σ.

The expectation defines the center of a bell curve that represents the distribution of probabilities.
The standard deviation defines the steepness of this curve around its center - a smaller σ corresponds to a narrower and steeper central part of the curve, which means that values close to μ are more probable.

Our task is to evaluate the probability that a normal variable takes a value within a certain interval around its mean μ, based on the value of its standard deviation σ.

Suppose these two parameters, μ and σ, are given and, based on their values, we have constructed the bell curve that represents the distribution of probabilities of a normal variable with these parameters. Let's choose some positive constant d and mark three points, A, M and B, on the X-axis with coordinates μ−d, μ and μ+d, respectively.
Point M(μ) is at the center of our bell curve.
Points A(μ−d) and B(μ+d) lie on opposite sides of the center, each at the same distance d from it.

The area under the entire bell curve equals 1 and represents the probability that our normal random variable takes any value at all.
The area under the bell curve bounded by point A on the left and point B on the right represents the probability that our random variable takes a value in the interval AB.
We have specifically chosen points A and B symmetric about the midpoint M because the bell curve itself is symmetric about M.
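The area between A and B can be computed numerically. Below is a minimal sketch in Python, assuming the standard expression of the normal cumulative distribution function through the error function; the function names and the sample values of μ, σ and d are hypothetical and chosen only for illustration.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal variable with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def interval_probability(mu, sigma, d):
    """Area under the bell curve between A = mu - d and B = mu + d."""
    return normal_cdf(mu + d, mu, sigma) - normal_cdf(mu - d, mu, sigma)

# Hypothetical example values, not taken from the lecture.
p = interval_probability(mu=10.0, sigma=2.0, d=3.0)
print(round(p, 4))
```

Moving the center μ while keeping σ and d fixed should leave the result unchanged, since only the position of the curve changes, not its shape.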

Clearly, the wider the interval AB, the greater the probability that our random variable takes a value within it. Since the area under the bell curve between points A and B around the center M depends only on the width of the interval (defined by the constant d) and the steepness of the curve, let's measure the width in units of the same parameter that defines the steepness, the standard deviation σ. This will allow us to evaluate the probability that a normal random variable takes a value within a certain interval around its mean using only one parameter - its standard deviation σ.
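One way to see why measuring the width in units of σ is enough is the following short derivation. It assumes the usual density formula of the normal distribution, which is not written out in this lecture, and introduces the substitution variable z only for this illustration.

```latex
P(\mu - d \le X \le \mu + d)
  = \int_{\mu-d}^{\mu+d} \frac{1}{\sigma\sqrt{2\pi}}\,
      e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx
  = \int_{-d/\sigma}^{d/\sigma} \frac{1}{\sqrt{2\pi}}\,
      e^{-\frac{z^2}{2}}\,dz ,
  \qquad z = \frac{x-\mu}{\sigma}
```

After the substitution, the right-hand side contains μ, σ and d only through the ratio d/σ, so choosing d as a multiple of σ gives a probability that is the same for every normal variable.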

Traditionally, three different intervals around the mean value μ are considered when evaluating the values of a normal random variable:
d=σ, d=2σ and d=3σ.
Let's quantify them all.

1. For a normal random variable with mean μ and standard deviation σ, the probability of having a value in the interval [μ−σ, μ+σ] (the narrowest of the three) is approximately 0.6827.

2. For a normal random variable with mean μ and standard deviation σ, the probability of having a value in the interval [μ−2σ, μ+2σ] (a wider interval) is approximately 0.9545.

3. For a normal random variable with mean μ and standard deviation σ, the probability of having a value in the interval [μ−3σ, μ+3σ] (the widest of the three) is approximately 0.9973.
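The three values above can be checked numerically. The sketch below assumes the known identity that, for a normal variable, the probability of falling within k standard deviations of the mean equals erf(k/√2); the function name is hypothetical.

```python
import math

def k_sigma_probability(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"{k} sigma: {k_sigma_probability(k):.4f}")
# 1 sigma: 0.6827
# 2 sigma: 0.9545
# 3 sigma: 0.9973
```

Note that neither μ nor σ appears in the computation: only the number k of standard deviations matters.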

As you see, the value of a normal variable can be predicted with the greatest probability when we choose the widest of the three intervals mentioned - the 3σ-interval around its mean. The value will fall into this interval with a very high probability, about 0.9973.

The narrower 2σ-interval still gives a relatively high probability, about 0.9545, that the value of our random variable falls into it.

For the narrowest σ-interval this probability is about 0.6827, not much higher than 0.5, which makes the prediction that the value of our random variable falls into it not very reliable.
