What is probability?
Sure, I’m an engineering major, I know how to work with probability. I know about expected values and joint CDFs and autocovariances and Markov chains, and all these things tie into each other in ways that make perfect sense to me. I know they can be used for practical purposes and yield real results. But what does it all mean?
Suppose I have a perfectly balanced six-sided die. The probability that I’ll roll, say, a three with this die is exactly one in six.
What I’ve been taught this means is that if I roll the die 6000 times, a three will come up approximately 1000 times. If I roll it 60,000 times it will come up approximately 10,000 times, and so on, with the (relative) error bars coded into the word “approximately” shrinking and shrinking as my number of rolls increases; eventually, as my number of rolls approaches infinity, the proportion of threes rolled approaches exactly one-sixth.
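To make that “approximately” concrete, here is a rough simulation sketch. It assumes Python's pseudo-random generator is a fair stand-in for a perfectly balanced die; the function name `proportion_of_threes` and the seed are made up for illustration.

```python
import random

def proportion_of_threes(n_rolls, seed=0):
    """Simulate n_rolls of a fair six-sided die; return the fraction that were threes."""
    rng = random.Random(seed)  # seeded so a rerun gives the same numbers
    threes = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 3)
    return threes / n_rolls

# Watch the observed proportion as the number of rolls grows.
for n in (600, 6_000, 60_000, 600_000):
    p_hat = proportion_of_threes(n)
    print(f"{n:>7} rolls: {p_hat:.4f} (off from 1/6 by {abs(p_hat - 1/6):.4f})")
```

The gap from one-sixth tends to shrink as the roll count grows, but no finite run is guaranteed to land on it exactly.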
There are serious problems with this definition, though – the most obvious one being that I don’t really have time to roll the die an infinite number of times. Suppose I settle for 6000 rolls and roll 991 threes. I write down that the chance of rolling a three is 991/6000. The next day, I roll the die another 6000 times and roll 1012 threes. I write down that the chance of rolling a three has mysteriously increased to 1012/6000. I’m wrong both times, because this is in fact a perfectly balanced die, whose probability of rolling a three is exactly one in six: 1000/6000. But if I don’t have any idea what to expect from this die and try to figure out its probability distribution from scratch, what else can I do but roll it a lot, and count how often it lands on each number?
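For a sense of scale on those two counts: if the die really is fair and the rolls are independent (both assumptions here, not something the measurements can prove), the number of threes in n rolls follows a binomial distribution with mean n/6 and standard deviation sqrt(n · 1/6 · 5/6). The sketch below just evaluates that formula; the helper name `three_count_spread` is hypothetical.

```python
import math

def three_count_spread(n_rolls, p=1/6):
    """Mean and standard deviation of the number of threes in n_rolls,
    assuming independent rolls with a fixed probability p of a three."""
    mean = n_rolls * p
    std = math.sqrt(n_rolls * p * (1 - p))
    return mean, std

mean, std = three_count_spread(6000)
print(f"expected threes: {mean:.0f}, standard deviation: {std:.1f}")

# How far the two days' counts sit from the expected 1000, in standard deviations.
for observed in (991, 1012):
    z = (observed - mean) / std
    print(f"{observed} threes is {z:+.2f} standard deviations from the mean")
```

With a standard deviation of roughly 29, both 991 and 1012 are well within the ordinary scatter of a fair die, which is exactly why neither day's tally can pin the probability down.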
Another problem is that a parameter defined as “the limit of x as y approaches infinity” doesn’t seem to tell you anything useful about the outcome of a single die roll. Take another die, which is so absurdly unbalanced that the probability of rolling a three is one in 10 million. If you’re about to roll the die once, what does this tell you? You either roll a three, or you don’t – whether the probability is one in six or one in 10 million. There’s no way to be sure beforehand.
And yet people will happily step into a plane with a one-in-10-million chance per flight of falling out of the sky and killing everyone on board, but they won’t come near a plane with a one-in-six chance. Clearly, probability means something.
Right now, as I’m typing this, I’m trusting my life to the very low probability that the floor will collapse under me and drop me into the neighbours’ living room in a neck-breaking sort of way. When I go to sleep (which I should have done a long time ago) I’ll be trusting my life to the very low probability that something in my room will randomly catch fire during the night and burn the building down. Every day, when I cycle to class, I trust my life to the very low probability that any driver around me is a murderous psycho and/or on heroin.
And it goes further. Engineers make cost-probability tradeoffs all the time: if a system (like a plane, or a train, or a nuclear power plant) has a potential for catastrophic failure, how high do we allow the probability of such a failure to be? How much money are we willing to spend to drive it down by a factor of z? Depending on the career choices I make, I might have to make tradeoffs like that at some point.
So we have this weird parameter whose very definition is vulnerable to measurement error, and which intuitively sounds like it doesn’t tell us anything more than “X might happen or it might not” – and yet we trust our lives to it and hang price tags on it and feed it into all kinds of complicated mathematical models as if it were as meaningful and quantifiable as the number of apples in a basket.
Now I’ve been told that the definition above is called “frequentist”, and there’s another view called “Bayesian” which interprets probability as the certainty with which a belief (e.g. “this plane will not fall out of the sky and kill everyone on board today”) is held – but on the face of it, that makes even less sense. If a friend of mine takes the perfectly balanced die and, by freak accident, gets 1200 threes out of 6000 rolls, does that mean the die’s probability of rolling a three changes to one in five whenever he’s looking at it? If a devout Christian is 100% certain that God exists, does that prove the existence of God beyond a shadow of a doubt? How do you quantify an abstract psychological concept like “certainty”, anyway?
Me, I’m just about ready to start believing that probability is some kind of mystical property of the die itself – or maybe that there really is a Random Number God.