Wednesday, October 8, 2014

Unizor - Probability - Continuous Distribution





Let's consider a different kind of random experiment: one that is modeled by an infinite and uncountable set of elementary events and, correspondingly, an uncountable number of values of a random variable defined on them.

For example, we measure the weights of tennis balls manufactured by a particular plant. In theory (and we are talking about a mathematical model, not the real world), this weight can be any real number within certain limits, say from 50 to 60 grams. Let's assume that we can measure the weight absolutely precisely. If we repeat the experiment again and again, count the times the weight exactly equals, say, 55 grams, and take the ratio of this count to the total number of experiments, the ratio will get closer and closer to zero. The same is true for any other specific weight.

So, any particular value of our random variable has a probability of zero, yet the set of these values consists of all real numbers from 50 to 60 - an uncountable infinity. This presents a mathematical challenge in operating with particular values of this random variable.

To overcome this challenge, instead of considering individual values of our random variable, we should consider intervals.
In our example of the weight of a tennis ball, we can talk about the probability that this weight lies in the interval from, say, 54 to 56 grams. This probability will be greater than zero. The wider our interval - the larger the probability. At the extreme, for the interval from 50 to 60 grams, the probability equals 1 because all tennis balls are manufactured with a weight in this interval.
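The contrast between exact values and intervals can be illustrated with a small simulation. This is only a sketch: the text does not say how the weights are distributed, so the uniform distribution on [50, 60] grams, the sample size and the helper name interval_frequency are all assumptions made for illustration.

```python
import random

def interval_frequency(samples, low, high):
    """Fraction of samples that fall inside the closed interval [low, high]."""
    hits = sum(1 for w in samples if low <= w <= high)
    return hits / len(samples)

# Assumption (not from the text): weights are uniform on [50, 60] grams.
rng = random.Random(2014)
weights = [rng.uniform(50.0, 60.0) for _ in range(100_000)]

wide = interval_frequency(weights, 54.0, 56.0)    # interval from 54 to 56 grams
narrow = interval_frequency(weights, 54.9, 55.1)  # much narrower interval around 55
exact = interval_frequency(weights, 55.0, 55.0)   # weight of exactly 55 grams
whole = interval_frequency(weights, 50.0, 60.0)   # the entire range of weights

print("54 to 56 grams:    ", wide)
print("54.9 to 55.1 grams:", narrow)
print("exactly 55 grams:  ", exact)
print("50 to 60 grams:    ", whole)
```

Under the uniform assumption the frequency for 54 to 56 grams should come out near 0.2, the narrower interval near 0.02, and the entire range gives exactly 1, while hitting exactly 55 grams practically never happens.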

The probability that our random variable takes any particular exact value equals zero, but it has a non-zero probability of taking a value within some interval, with different probabilities for different intervals. Such random variables are called continuous.
For such a random variable ξ we can say that the probability of ξ taking a value in the interval [a,b] equals p (which depends on a and b). Usually, all possible values of a continuous random variable constitute a finite or infinite continuous interval. The probability that the random variable takes a value within this interval equals 1, and the probability that it takes a value in some narrower interval does not exceed 1 and is usually smaller.

Example - Sharp Shooting Competition

Sharpshooters are shooting at a target, and the random variable we are interested in is the distance from the target's center to the point where a bullet hits the target.
For a particular sharpshooter, assuming that his skill level is constant, that he does not get tired and that he never misses the center by more than 0.5 meters, the continuous distribution of this random variable is defined on the range of values from 0 (when he hits the exact center of the target) to the maximum deviation of his bullet from the center, which we assumed to be 0.5 meters.

For any exact value of our random variable, say, 0.2768 meters, the probability of having this value is 0. This becomes clear if we recall that the probability is the limit of the ratio of the number of occurrences of a particular event to the total number of experiments. As the number of shots fired by a sharpshooter grows to infinity, the ratio of the number of shots that land exactly 0.2768 meters from the center (or at any other exact distance) to the total number of shots tends to zero.

A continuous distribution of probabilities can be represented graphically, though only approximately, in a way similar to how we presented discrete distributions. First of all, we break the entire range of values of our random variable (the distance from the center of the target) into 5 smaller intervals, each 0.1 meters wide, and mark their endpoints on the X-axis: 0, 0.1, 0.2, 0.3, 0.4 and 0.5. On each interval between these points we construct a rectangle whose height equals the probability of our random variable falling within this interval. So, for a random variable ξ (results of the first shooter) the rectangles might be:
base [0,0.1] - height 0.6
base [0.1,0.2] - height 0.2
base [0.2,0.3] - height 0.1
base [0.3,0.4] - height 0.07
base [0.4,0.5] - height 0.03
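As a quick check on these numbers, the heights can be added up: they should total 1, and the sum over several neighboring intervals gives the probability of landing in their union. The sketch below simply stores the values listed above; the names edges, heights and probability_up_to are chosen here for illustration.

```python
import math

# Interval endpoints (meters) and rectangle heights, copied from the list above.
edges = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
heights = [0.6, 0.2, 0.1, 0.07, 0.03]

def probability_up_to(interval_count):
    """Probability of landing in the first `interval_count` intervals,
    obtained by adding the heights of their rectangles."""
    return sum(heights[:interval_count])

total = sum(heights)
within_20_cm = probability_up_to(2)  # intervals [0,0.1] and [0.1,0.2]

# Floating-point sums may differ from the exact value in the last digits,
# so the comparison uses a tolerance.
assert math.isclose(total, 1.0)
print("total probability:", total)
print("within", edges[2], "meters of the center:", within_20_cm)
```

Neighboring intervals share an endpoint, but that does not cause double counting: any single exact distance has probability zero, so the shared point contributes nothing to either sum.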

Obviously, the more intervals we break the entire range of values of a random variable into, the more precisely we can characterize the continuous distribution.
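The effect of using more intervals can be sketched by simulating shots and building the rectangles twice, first with 5 intervals and then with 10. The shooter model here is purely hypothetical: distances are drawn from a triangular distribution on [0, 0.5] meters with its peak at the center, which is an assumption for illustration and does not reproduce the heights listed above. The function name histogram is likewise chosen here.

```python
import random

def histogram(samples, low, high, bins):
    """Split [low, high] into `bins` equal intervals and return, for each,
    the fraction of samples that fall into it."""
    width = (high - low) / bins
    counts = [0] * bins
    for x in samples:
        # A sample equal to `high` is assigned to the last interval.
        index = min(int((x - low) / width), bins - 1)
        counts[index] += 1
    return [c / len(samples) for c in counts]

# Hypothetical shooter: distances from the center follow a triangular
# distribution on [0, 0.5] meters with the mode at 0 (an assumed model).
rng = random.Random(8)
distances = [rng.triangular(0.0, 0.5, 0.0) for _ in range(50_000)]

coarse = histogram(distances, 0.0, 0.5, 5)   # intervals 0.1 meters wide
fine = histogram(distances, 0.0, 0.5, 10)    # intervals 0.05 meters wide

for label, fractions in (("5 intervals: ", coarse), ("10 intervals:", fine)):
    print(label, [round(f, 3) for f in fractions])
```

In both cases the heights add up to 1. The finer breakdown shows how the probability is spread inside each of the original 0.1-meter intervals, which is detail the coarser picture hides.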
