Kolmogorov Probability For Dummies
The Axiomatic Method
The axiomatic method means you don't comment on what things are, but on how they behave; more formally, on what set of axioms they satisfy. The Kolmogorov formalization (I'm actually working from the probability manuals of Barry James and Shiryaev) is therefore valid for any interpretation of probability (frequentist, subjective, subjective-objective à la Keynes, Bayesian) that satisfies the axioms for a probability measure over a sigma-algebra.
Yes, Virginia, all this holds true for the degree-of-belief-in-hypothesis/generalized-inductive-method view of probability.
Sigma-algebras (in some places called "sigma-fields" or even "tribes")
A sigma-algebra over a set S is a family A of subsets of S (representing random events) satisfying the following axioms:
A1. S belongs to A
A2. If a belongs to A, then c(a)=S-a belongs to A
A3. If a1,a2,... all belong to A, then the union of those (countably many) sets belongs to A.
From axioms A2 and A3 (and De Morgan's laws) it's easy to prove that a countable intersection of sets belonging to A also belongs to A.
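For a finite set S the axioms can be checked by brute force. The sketch below is only an illustration: the function name `is_sigma_algebra` and the example families are made up here, and it relies on the fact that a finite family closed under pairwise unions is automatically closed under countable unions.

```python
from itertools import combinations

def is_sigma_algebra(S, A):
    """Check axioms A1-A3 for a family A of subsets of a FINITE set S."""
    S = frozenset(S)
    A = {frozenset(a) for a in A}
    if S not in A:                        # A1: S belongs to A
        return False
    if any(S - a not in A for a in A):    # A2: closed under complement
        return False
    # A3: for a finite family, pairwise unions are enough
    return all(a | b in A for a, b in combinations(A, 2))

S = {1, 2, 3, 4}
good = [set(), {1, 2}, {3, 4}, S]   # hypothetical example family
bad = [set(), {1}, S]               # the complement of {1} is missing

print(is_sigma_algebra(S, good))  # True
print(is_sigma_algebra(S, bad))   # False
```

In the `good` family the intersection of {1, 2} and {3, 4} is the empty set, which is also in the family, as the remark on intersections predicts.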
Probability
A probability P assigns to every a in A a number P(a), satisfying the following axioms:
P1: P(a)>=0 for any a
P2: P(S) = 1
P3: If a1,a2,... are pairwise disjoint (no two of them intersect), then P(union of a1,a2,...) = P(a1)+P(a2)+...
A probability satisfying those axioms has the following provable properties:
P4: Continuity at the empty set, i.e. if a sequence of sets a1,a2,...,an,... converges to the empty set as n goes to infinity, then P(an) converges to 0.
P5: P(c(a)) = 1-P(a)
P6: 0<=P(a)<=1
P7: if a1 is contained in a2, then P(a1)<=P(a2)
P8: P(union of a1,a2...) <= P(a1)+P(a2)+...
P9: Continuity: If a sequence of sets a1,a2,...,an,... converges to a set a as n goes to infinity, then P(an) converges to P(a) as n goes to infinity.
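Properties P5 to P8 can be sanity-checked on a small finite example. The sketch below assumes a fair six-sided die with all subsets as events (an example chosen here, not part of the theory above); the continuity properties only become interesting for infinite sequences, so they are left out.

```python
from fractions import Fraction
from itertools import chain, combinations

S = frozenset({1, 2, 3, 4, 5, 6})   # assumed example: a fair die

def P(a):
    """Uniform probability on S: favourable outcomes over all outcomes."""
    return Fraction(len(frozenset(a) & S), len(S))

def powerset(s):
    """All subsets of s, used here as the sigma-algebra of events."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(c) for c in subsets]

events = powerset(S)

p5 = all(P(S - a) == 1 - P(a) for a in events)            # complement rule
p6 = all(0 <= P(a) <= 1 for a in events)                  # bounds
p7 = all(P(a1) <= P(a2)                                   # monotonicity
         for a1 in events for a2 in events if a1 <= a2)
p8 = all(P(a1 | a2) <= P(a1) + P(a2)                      # subadditivity
         for a1 in events for a2 in events)

print(p5, p6, p7, p8)  # True True True True
```

Passing on one example is of course not a proof; the properties follow from P1 to P3 in general.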
Probabilistic model
A probabilistic model for a random experiment consists of a nonempty set of possible results, a sigma-algebra over that set and a probability over that sigma-algebra.
Conditional probability
P(a|b)=P(a and b)/P(b), for a and b in a sigma-algebra A and P(b)>0.
The best way of thinking about conditional probability is thinking of it as a rescaling. Imagine a new model where b plays the role of the whole set S: only the part of a inside b still counts, and dividing by P(b) makes the total probability equal to 1 again. Then, when you know b happens, what is the probability of a happening?
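The rescaling view can be made concrete on a finite example. The sketch below assumes a fair die (again an illustrative choice); the `given` parameter is a name introduced here. It computes P(a|b) once by shrinking the sample space to b and once by the defining ratio, and the two agree.

```python
from fractions import Fraction

S = frozenset(range(1, 7))   # assumed example: a fair die

def P(a, given=S):
    """P(a | given), counting equally likely outcomes inside `given`.

    With the default given=S this is the plain probability P(a).
    """
    a, given = frozenset(a), frozenset(given)
    if not given:
        raise ValueError("the conditioning event must have P(b) > 0")
    return Fraction(len(a & given), len(given))

even = {2, 4, 6}
greater_than_3 = {4, 5, 6}

# Rescaling: treat greater_than_3 as the new "whole set".
p_rescaled = P(even, given=greater_than_3)

# Definition: P(a and b) / P(b).
p_ratio = P(even & greater_than_3) / P(greater_than_3)

print(p_rescaled, p_ratio)  # 2/3 2/3
```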
The composite probability theorem
P(a and b) = P(a) * P(b|a) = P(b)*P(a|b), for P(a)>0 and P(b)>0
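As a check of the product rule for the probability that both a and b happen, the sketch below enumerates all ordered draws of two cards without replacement from a standard 52-card deck (an assumed example) and compares the direct count of "both cards are aces" with P(a) * P(b|a).

```python
from fractions import Fraction
from itertools import permutations

# Assumed example: rank 1 stands for the ace.
deck = [(rank, suit) for rank in range(1, 14) for suit in "CDHS"]
draws = list(permutations(deck, 2))   # ordered pairs, no replacement

first_ace = [d for d in draws if d[0][0] == 1]          # event a
both_aces = [d for d in first_ace if d[1][0] == 1]      # event a and b

p_a = Fraction(len(first_ace), len(draws))              # P(a)
p_b_given_a = Fraction(len(both_aces), len(first_ace))  # P(b|a), counted inside a
p_both = Fraction(len(both_aces), len(draws))           # P(a and b), counted directly

print(p_a, p_b_given_a, p_both)  # 1/13 1/17 1/221
```

Here 1/13 * 1/17 = 1/221, matching the direct count.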
The total probability theorem
A partition of a set is a family of pairwise disjoint (no intersection) sets whose union is equal to that set. If a1, a2, ..., an is a partition of S with every P(ai)>0, then for any event b,
P(b) = \sum P(ai) * P(b|ai)
For two disjoint sets a1, a2 covering b,
P(b) = P(a1)*P(b|a1)+P(a2)*P(b|a2)
In other words, the probability of b is a weighted average of the conditional probabilities, with the probabilities of the conditioning events as weights.
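The weighted-average reading translates directly into code. The sketch below assumes that a1 and a2 partition the whole set S (so their probabilities add up to 1), and the numbers are invented for illustration: two machines produce 3/5 and 2/5 of the items and have defect rates 1/100 and 1/20, and b is "the item is defective".

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """P(b) = sum of P(ai) * P(b|ai) over a partition a1, ..., an of S."""
    if sum(priors) != 1:
        raise ValueError("the P(ai) of a partition of S must add up to 1")
    return sum(p * l for p, l in zip(priors, likelihoods))

# Made-up numbers, for illustration only.
priors = [Fraction(3, 5), Fraction(2, 5)]          # P(a1), P(a2)
likelihoods = [Fraction(1, 100), Fraction(1, 20)]  # P(b|a1), P(b|a2)

p_b = total_probability(priors, likelihoods)
print(p_b)  # 13/500
```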
Bayes' formula
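Since P(a|b)=P(a and b)/P(b), with P(a and b)=P(a)*P(b|a) in the numerator and the total probability theorem in the denominator,
P(aj|b) = P(aj and b)/P(b) = P(aj)*P(b|aj)/(\sum P(ai)*P(b|ai)) for a partition ai of S.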
For two disjoint sets a1 and a2 covering b,
P(a1|b) = P(a1)*P(b|a1)/{P(a1)*P(b|a1)+P(a2)*P(b|a2)}
P(a2|b) = P(a2)*P(b|a2)/{P(a1)*P(b|a1)+P(a2)*P(b|a2)}
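A small numeric sketch of the two-set case. The function name `bayes_posterior` and the numbers are made up for illustration: a1 and a2 are two machines producing 3/5 and 2/5 of the items, with defect rates 1/100 and 1/20, and b is "the item is defective". The result is the probability that a defective item came from each machine.

```python
from fractions import Fraction

def bayes_posterior(priors, likelihoods):
    """Return the list of P(ai|b), given the P(ai) and the P(b|ai)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(ai and b)
    p_b = sum(joint)                                      # total probability
    if p_b == 0:
        raise ValueError("P(b) must be positive")
    return [j / p_b for j in joint]

# Made-up numbers, for illustration only.
priors = [Fraction(3, 5), Fraction(2, 5)]          # P(a1), P(a2)
likelihoods = [Fraction(1, 100), Fraction(1, 20)]  # P(b|a1), P(b|a2)

posterior = bayes_posterior(priors, likelihoods)
print(posterior)  # [Fraction(3, 13), Fraction(10, 13)]
```

Although machine a1 makes most of the items, a defective item is more likely to come from a2, because its defect rate is five times higher.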
Independence
Two events a, b are independent if
P(a and b) = P(a) * P(b)
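The definition is easy to test on a finite example. The sketch below assumes a fair die (an illustrative choice) and checks one independent pair and one dependent pair of events.

```python
from fractions import Fraction

S = frozenset(range(1, 7))   # assumed example: a fair die

def P(a):
    """Uniform probability on S."""
    return Fraction(len(frozenset(a) & S), len(S))

def independent(a, b):
    """True when P(a and b) == P(a) * P(b)."""
    a, b = frozenset(a), frozenset(b)
    return P(a & b) == P(a) * P(b)

even = {2, 4, 6}
low = {1, 2}
high = {4, 5, 6}

print(independent(even, low))   # True:  1/6 == 1/2 * 1/3
print(independent(even, high))  # False: 1/3 != 1/2 * 1/2
```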