INTRODUCTION TO STATISTICS AND ECONOMETRICS
PROBABILITY
In this chapter we shall define probability mathematically and learn how to calculate the probability of a complex event when the probabilities of simple events are given. For example, what is the probability that a head comes up twice in a row when we toss an unbiased coin? We shall learn that the answer is 1/4. As a more complicated example, what is the probability that a student will be accepted by at least one graduate school if she applies to ten schools for each of which the probability of acceptance is 0.1? The answer is $1 - 0.9^{10} \approx 0.65$. (The answer is derived under the assumption that the ten schools make independent decisions.) Or what is the probability a person will win a game in tennis if the probability of his winning a point is p? The answer is
$p^4 \{1 + 4(1 - p) + 10(1 - p)^2 + 20p(1 - p)^3 / [1 - 2p(1 - p)]\}$.
For example, if p = 0.51, the formula gives 0.525.
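The three answers quoted above can be checked numerically. The following is a minimal sketch, not part of the original text: the function names are hypothetical, the tennis formula is read as $p^4 \{1 + 4(1 - p) + 10(1 - p)^2 + 20p(1 - p)^3 / [1 - 2p(1 - p)]\}$, and points in the tennis game are assumed to be won independently of one another.

```python
from fractions import Fraction


def two_heads() -> Fraction:
    """Probability of two heads in two independent tosses of an unbiased coin."""
    return Fraction(1, 2) ** 2


def at_least_one_acceptance(p_accept: float, n_schools: int) -> float:
    """Probability of at least one acceptance, assuming independent decisions.

    Computed as one minus the probability that every school rejects.
    """
    return 1.0 - (1.0 - p_accept) ** n_schools


def win_game(p: float) -> float:
    """Probability of winning a tennis game when each point is won with probability p.

    Assumes points are independent (an assumption made for this sketch).
    """
    q = 1.0 - p
    return p**4 * (1 + 4 * q + 10 * q**2 + 20 * p * q**3 / (1 - 2 * p * q))


print(two_heads())                                 # 1/4
print(round(at_least_one_acceptance(0.1, 10), 2))  # 0.65
print(round(win_game(0.51), 3))                    # 0.525
```

As a sanity check on the formula, `win_game(0.5)` returns 0.5, as symmetry between the two players requires.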
In these calculations we have not engaged in any statistical inference. Probability is a subject which can be studied independently of statistics; it forms the foundation of statistics.
Definitions of a few commonly used terms follow. These terms inevitably remain vague until they are illustrated; see Examples 2.2.1 and 2.2.2.
Sample space. The set of all the possible outcomes of an experiment.
Event. A subset of the sample space.
Simple event. An event which cannot be a union of other events.
Composite event. An event which is not a simple event.
EXAMPLE 2.2.1
Experiment: Tossing a coin twice.
Sample space: {HH, HT, TH, TT}.
The event that a head occurs at least once: $HH \cup HT \cup TH$.
EXAMPLE 2.2.2
Experiment: Reading the temperature (in degrees Fahrenheit) at Stanford at noon on October 1.
Sample space: Real interval (0, 100).
Events of interest are intervals contained in the above interval.
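The definitions can be made concrete for Example 2.2.1 by representing outcomes as strings and events as Python sets. This is an illustrative sketch only; the variable names are hypothetical and not taken from the text.

```python
from itertools import product

# Sample space for tossing a coin twice: every ordered pair of H and T.
sample_space = {"".join(tosses) for tosses in product("HT", repeat=2)}

# Simple events are the one-element subsets of the sample space.
simple_events = [{outcome} for outcome in sorted(sample_space)]

# Composite event: a head occurs at least once.
at_least_one_head = {outcome for outcome in sample_space if "H" in outcome}

# The composite event is the union of three simple events,
# and, like every event, it is a subset of the sample space.
assert at_least_one_head == {"HH"} | {"HT"} | {"TH"}
assert at_least_one_head <= sample_space

print(sorted(sample_space))       # ['HH', 'HT', 'TH', 'TT']
print(sorted(at_least_one_head))  # ['HH', 'HT', 'TH']
```

A continuous sample space such as the interval in Example 2.2.2 cannot be enumerated this way, which is why events there are described as intervals.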
A probability is a nonnegative number we assign to every event. The axioms of probability are the rules we agree to follow when we assign probabilities.
Axioms of Probability
(1) $P(A) \geq 0$ for any event A.
(2) P(S) = 1, where S is the sample space.
(3) If $\{A_i\}$, $i = 1, 2, \ldots$, are mutually exclusive (that is, $A_i \cap A_j = \emptyset$ for all $i \neq j$), then $P(A_1 \cup A_2 \cup \ldots) = P(A_1) + P(A_2) + \ldots$.
The first two rules are reasonable and consistent with the everyday use of the word probability. The third rule is consistent with the frequency interpretation of probability, for relative frequency follows the same rule. If, at the roll of a die, A is the event that the die shows 1 and B the event that it shows 2, the relative frequency of $A \cup B$ (either 1 or 2) is clearly the sum of the relative frequencies of A and B. We want probability to follow the same rule.
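The additivity of relative frequencies for the die example can be seen in a small simulation. The sketch below is illustrative only: the function name, the number of rolls, and the seed are arbitrary choices, and a fair die is assumed. Exact fractions are used so that the equality is not blurred by floating-point rounding.

```python
import random
from fractions import Fraction


def relative_frequencies(n_rolls: int, seed: int = 0):
    """Roll a simulated fair die n_rolls times.

    Returns the relative frequencies of A (die shows 1), B (die shows 2),
    and A union B (die shows 1 or 2) as exact fractions.
    """
    rng = random.Random(seed)
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
    freq_a = Fraction(rolls.count(1), n_rolls)
    freq_b = Fraction(rolls.count(2), n_rolls)
    freq_a_or_b = Fraction(sum(1 for r in rolls if r in (1, 2)), n_rolls)
    return freq_a, freq_b, freq_a_or_b


freq_a, freq_b, freq_a_or_b = relative_frequencies(6000)

# A and B are mutually exclusive, so the frequencies add exactly.
assert freq_a + freq_b == freq_a_or_b
print(float(freq_a), float(freq_b), float(freq_a_or_b))
```

The equality holds for any sequence of rolls, not only on average, because no single roll can show both 1 and 2.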
When the sample space is discrete, as in Example 2.2.1, it is possible to assign probability to every event (that is, every possible subset of the sample space) in a way which is consistent with the probability axioms. When the sample space is continuous, however, as in Example 2.2.2, it is not possible to do so. In such a case we restrict our attention to a smaller class of events to which we can assign probabilities in a manner consistent with the axioms. For example, the class of all the intervals contained in (0, 100) and their unions satisfies the condition. In the subsequent discussion we shall implicitly be dealing with such a class. The reader who wishes to study this problem is advised to consult a book on the theory of probability, such as Chung (1974).
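For the discrete sample space of Example 2.2.1 the claim can be verified by brute force. The sketch below enumerates all 16 subsets and checks the three axioms, under the assumption (made here for illustration) that the four outcomes are equally likely. On a finite sample space the third axiom reduces to additivity over pairs of disjoint events, which is what the loop checks. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import chain, combinations

outcomes = ("HH", "HT", "TH", "TT")


def all_events(outcomes):
    """Return every subset of the sample space, including the empty set."""
    subsets = chain.from_iterable(
        combinations(outcomes, size) for size in range(len(outcomes) + 1)
    )
    return [frozenset(subset) for subset in subsets]


def prob(event):
    """Probability of an event, assuming equally likely outcomes."""
    return Fraction(len(event), len(outcomes))


events = all_events(outcomes)
assert len(events) == 16  # 2**4 subsets

# Axiom 1: probabilities are nonnegative.
assert all(prob(event) >= 0 for event in events)

# Axiom 2: the sample space has probability one.
assert prob(frozenset(outcomes)) == 1

# Axiom 3 (finite form): additivity over mutually exclusive events.
for a in events:
    for b in events:
        if not (a & b):
            assert prob(a | b) == prob(a) + prob(b)

print(prob(frozenset({"HH", "HT", "TH"})))  # 3/4
```

No analogous enumeration exists for the continuous case, which is why attention is restricted there to a smaller class of events such as intervals and their unions.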