Tuesday, January 26, 2016

Unizor - Statistics - Stability





Unizor - Creative Minds through Art of Mathematics - Math4Teens

Notes to a video lecture on http://www.unizor.com

Stability of Statistics

As we noted, the purpose of Mathematical Statistics is to use past observations to evaluate the probabilities of certain events, so that we can predict their occurrence in the future.
Is it really possible?

Yes, sometimes. But certain conditions must be satisfied for this endeavor to succeed.

The first and most important condition for applying a statistical approach to predict the future is stability.
Imagine you want to evaluate the chances of getting two aces from a dealer in a casino. You are not familiar with combinatorics and cannot calculate this probability, so you decide to evaluate your chances statistically. You sit at the table and ask the dealer to give you two cards. You either get two aces or you don't, and then you repeat the process. Eventually, after the whole stack of cards is exhausted (say, 5 full decks of 52 cards each, that is 260 cards or 130 pairs), you have the final count of how many times you got two aces. Say, 3 times out of 130.
Would you say that the statistically evaluated probability of getting a pair of aces is somewhere around 3/130 (that is, 2.3%)?
Absolutely NOT!
And the main reason is that the conditions of our random experiment are changing. We started with a set of 5 full decks of cards shuffled together. After the top two cards are dealt, a different set of cards remains for the next experiment. Every time we pick two cards, the deck changes, and so do the probabilities associated with getting two aces.
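
To make the changing conditions concrete, here is a minimal Python sketch. It assumes each of the 5 decks has the usual 4 aces (20 aces in 260 cards); the helper name p_two_aces and the two "after the first deal" scenarios are illustrative choices, not part of the lecture.

```python
from fractions import Fraction
from math import comb


def p_two_aces(aces_left: int, cards_left: int) -> Fraction:
    """Probability that the next two cards dealt are both aces."""
    if cards_left < 2:
        raise ValueError("need at least two cards to deal a pair")
    # Favorable pairs of aces divided by all possible pairs of cards.
    return Fraction(comb(aces_left, 2), comb(cards_left, 2))


# Fresh stack: 5 decks of 52 cards, assumed 4 aces per deck.
fresh = p_two_aces(aces_left=20, cards_left=260)

# Assumed scenario A: the first pair dealt was two aces.
after_two_aces = p_two_aces(aces_left=18, cards_left=258)

# Assumed scenario B: the first pair contained no aces.
after_no_aces = p_two_aces(aces_left=20, cards_left=258)

for label, p in [
    ("fresh stack", fresh),
    ("after a pair of aces", after_two_aces),
    ("after a pair without aces", after_no_aces),
]:
    print(f"{label}: {p} = {float(p):.4%}")
```

Under these assumptions the three probabilities come out at roughly 0.56%, 0.46% and 0.57%: already the second deal is a different random experiment from the first one.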

Consider another random experiment. You would like to predict tomorrow's temperature. To do this, you observe the temperature for an entire year and get the statistical distribution of temperatures. For instance, out of 365 observations you had 50 days with a temperature from 0 to 10 degrees, 100 days from 10 to 20 degrees, 150 days from 20 to 30 degrees and 65 days above 30 degrees.
Does it mean that the distribution of probabilities of the temperature is as follows?
0-10: 50/365 = 13.7%
10-20: 100/365 = 27.4%
20-30: 150/365 = 41.1%
above 30: 65/365 = 17.8%
Definitely NOT!
The main reason is that the Earth orbits the Sun during the year with its axis tilted, which is why in the Northern Hemisphere it's cold in winter and hot in summer. In summer the temperature will be above 20 degrees with a very high probability, and in winter it will be below 20 degrees most of the time. So, the conditions of our experiment are changing and, therefore, the results are not applicable for predicting the future.
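
The same point can be sketched in a few lines of Python. The yearly counts are the ones from the example above; the split of those counts into a "winter" half and a "summer" half is entirely hypothetical, invented only to illustrate how the pooled frequencies can hide two very different distributions.

```python
bins = ["0-10", "10-20", "20-30", "above 30"]

# Yearly counts taken from the example in the text (365 days).
pooled_counts = [50, 100, 150, 65]

# HYPOTHETICAL split of the same 365 days into two half-years.
# These numbers are invented for illustration, not observed data.
winter_counts = [50, 90, 40, 2]
summer_counts = [0, 10, 110, 63]


def relative_frequencies(counts):
    """Turn a list of counts into a list of relative frequencies."""
    total = sum(counts)
    return [c / total for c in counts]


# The invented halves must add up to the yearly counts.
assert [w + s for w, s in zip(winter_counts, summer_counts)] == pooled_counts

pooled = relative_frequencies(pooled_counts)
winter = relative_frequencies(winter_counts)
summer = relative_frequencies(summer_counts)

print(f"{'range':>9} {'year':>7} {'winter':>7} {'summer':>7}")
for name, p, w, s in zip(bins, pooled, winter, summer):
    print(f"{name:>9} {p:7.1%} {w:7.1%} {s:7.1%}")
```

With this made-up split, the yearly column describes neither half of the year well, so it would be a poor guide to tomorrow's temperature in either season.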

In summary, if we want to use past results of experiments to evaluate the distribution of probabilities of some random variable, all these experiments must be conducted under identical conditions to ensure that we deal with exactly the same distribution of probabilities at all times.
If this condition is not satisfied, the validity of our conclusions about the distribution of probabilities of our random variable is quite questionable.
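
For contrast, here is a rough simulation of an experiment that does satisfy the stability condition. It assumes a variation of the card example in which the full stack of 260 cards (with an assumed 20 aces) is reshuffled before every deal, so each trial runs under identical conditions. The function name, the number of trials and the seed are arbitrary choices made for this sketch.

```python
import random
from fractions import Fraction
from math import comb

STACK_SIZE = 260  # 5 decks of 52 cards
ACES = 20         # assumed 4 aces per deck

# Exact probability of two aces from the full stack, for comparison.
TRUE_P = Fraction(comb(ACES, 2), comb(STACK_SIZE, 2))


def estimate_with_reshuffle(trials: int, seed: int = 0) -> float:
    """Frequency of two aces when every pair comes from the full stack."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Cards are numbered 0..259; numbers 0..19 stand for the aces.
        # Sampling from the full range models a complete reshuffle.
        first, second = rng.sample(range(STACK_SIZE), 2)
        if first < ACES and second < ACES:
            hits += 1
    return hits / trials


estimate = estimate_with_reshuffle(100_000)
print(f"exact probability: {float(TRUE_P):.4%}")
print(f"observed frequency: {estimate:.4%}")
```

Because every trial here draws from the same full stack, the observed frequency is an estimate of one fixed probability, which is exactly what the stability condition asks for.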
