Communication, Codes and Cyphers
Information Theory
Entropy: Measuring Uncertainty

We want to come up with some means of quantifying uncertainty. Before we can do this, we should try to work out what properties uncertainty should have. Hopefully the following propositions make some sort of heuristic sense.
We'll eventually come up with a mathematical quantity which satisfies these three conditions, plus a number of others. To justify it, however, we will first go through a heuristic argument.

An interesting, if somewhat unusual, way to consider uncertainty is that it is the average amount of surprise that you have upon learning the result. So what do we mean by surprise?

Surprise is clearly related to the probability that an event happens. Having a head come up when tossing a coin is less surprising than having the number you chose on a roulette wheel win, since the first is more probable than the second. And if your number came up on two consecutive spins of a roulette wheel, you'd be twice as surprised.

Mathematically, then, surprise is a function $S(p)$ of the probability $p$ of an event. It should also be additive, in the sense that if an event with probability $p$ happens and, independently, an event with probability $q$ happens, then your total surprise would be $S(p) + S(q)$, the sum of the two surprises. Since the probability of these two independent events both happening is $pq$, we have:

$$S(pq) = S(p) + S(q)$$

Also, if something is certain (i.e. it has probability 1), then there is no surprise, so

$$S(1) = 0$$

Mathematically, we can show that a function $S$ with these sorts of properties must be related to a logarithm. This is reasonable, since logarithms have the same sort of properties:

$$\log_b pq = \log_b p + \log_b q$$

It turns out that the best measure of surprise is:

$$S(p) = -\log_2 p$$

Probabilities are always numbers from 0 to 1, and the logarithms of these numbers are negative (or zero, when $p = 1$). Since we'd really like surprise to be a positive quantity, we take the negative of the logarithm. We choose base 2 for the logarithm because it is the most convenient for calculation.
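To make the properties of surprise concrete, here is a minimal Python sketch of $S(p) = -\log_2 p$. The function name `surprise` is ours, and the roulette wheel is assumed to have 38 equally likely pockets; the notes do not say which kind of wheel is meant, so treat that number as an illustration only.

```python
import math


def surprise(p: float) -> float:
    """Surprise S(p) = -log2(p) of an event with probability p."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    # log2(1/p) equals -log2(p), and gives a clean 0.0 when p == 1.
    return math.log2(1 / p)


coin = surprise(1 / 2)              # a head on a fair coin
roulette = surprise(1 / 38)         # assumption: 38 equally likely pockets
twice = surprise((1 / 38) ** 2)     # your number on two independent spins

print(coin)                         # 1.0
print(roulette > coin)              # True: the rarer event is more surprising
print(math.isclose(twice, 2 * roulette))  # True: S(pq) = S(p) + S(q)
print(surprise(1))                  # 0.0: a certain event carries no surprise
```

The checks mirror the three properties in the text: rarer events are more surprising, independent events add their surprises, and a certain event has surprise zero.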
If uncertainty is the average amount of surprise, and the possible outcomes have probabilities $p_1, p_2, \ldots, p_n$, then the average surprise is:

$$p_1 S(p_1) + p_2 S(p_2) + \cdots + p_n S(p_n)$$

or

$$-p_1 \log_2 p_1 - p_2 \log_2 p_2 - \cdots - p_n \log_2 p_n.$$
We call this quantity the entropy.
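The average-surprise formula above can be sketched in a few lines of Python. The function name `entropy` and the input check are our own choices. Skipping outcomes with probability 0 is the usual convention, since $p \log_2 p$ tends to 0 as $p$ tends to 0, but it is an assumption here and is not stated in these notes.

```python
import math


def entropy(probs):
    """Average surprise -sum(p * log2(p)) over a list of outcome probabilities."""
    if any(p < 0 for p in probs) or not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must be non-negative and sum to 1")
    # Assumed convention: outcomes with p == 0 contribute nothing to the sum.
    return sum(p * math.log2(1 / p) for p in probs if p > 0)


print(entropy([0.5, 0.5]))     # 1.0  (a fair coin)
print(entropy([0.25] * 4))     # 2.0  (four equally likely outcomes)
print(entropy([1.0]))          # 0.0  (a certain outcome: no uncertainty)
print(entropy([0.9, 0.1]))     # a biased coin: less uncertain than a fair one
```

A fair coin gives exactly 1 and four equally likely outcomes give exactly 2, while the biased coin comes out below 1, which matches the intuition that a more predictable experiment is less uncertain.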