Entropy
The core idea of information theory is that the informational value of a communicated message depends on the degree to which the content of the message is surprising. If a highly likely event occurs, the message carries little information. On the other hand, if a highly unlikely event occurs, the message is much more informative.
The information content, also called the surprise or self-information, of an event $E$ is a function that increases as the probability $p(E)$ of the event decreases.
Definition
The information, or surprise, of an event $E$ with probability $p(E)$ is defined by
$$I(E) = -\log p(E),$$
or equivalently,
$$I(E) = \log \frac{1}{p(E)}.$$
The logarithm gives 0 surprise when the probability of the event is 1. In fact, $\log$ is, up to the choice of base, the only function that satisfies a specific set of conditions for information theory.
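As a rough illustration of the definition above, the following sketch computes the surprise $-\log p(E)$ of an event from its probability. The function name `self_information` and the choice of base 2 (so the result is in bits) are assumptions made for this example; the text does not fix a base.

```python
import math


def self_information(p: float) -> float:
    """Surprise of an event with probability p, in bits (base 2 is an assumption)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in the interval (0, 1]")
    # log2(1 / p) is the same quantity as -log2(p).
    return math.log2(1.0 / p)


if __name__ == "__main__":
    # A certain event carries no surprise; rarer events carry more.
    for p in (1.0, 0.5, 0.01):
        print(p, self_information(p))
```

A probability of 0 is rejected here because the surprise diverges as the probability approaches 0.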
Definition (Entropy)
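The entropy of a random variable $X$ with distribution $p$, denoted $H(X)$, is a measure of its uncertainty. The entropy of a discrete random variable $X$, which takes values in the set $\mathcal{X}$ and is distributed according to $p \colon \mathcal{X} \to [0, 1]$ such that $p(x) = \mathbb{P}[X = x]$, is
$$H(X) = \mathbb{E}[I(X)] = \mathbb{E}[-\log p(X)].$$
Note that $I(X)$ is itself a random variable. The entropy can be explicitly written as
$$H(X) = -\sum_{x \in \mathcal{X}} p(x) \log p(x).$$
For continuous random variables, with probability density function $f(x)$, the differential entropy (or continuous entropy) is given by
$$h(X) = -\int_{\mathcal{X}} f(x) \log f(x) \, dx.$$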
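For the discrete case, entropy is the expected surprise, $-\sum_x p(x) \log p(x)$. A minimal sketch of that sum follows, assuming the distribution is given as a list of probabilities and that the logarithm is taken in base 2; the name `entropy` and the convention that zero-probability outcomes contribute nothing are assumptions of this example.

```python
import math
from typing import Sequence


def entropy(probs: Sequence[float]) -> float:
    """Entropy in bits of a discrete distribution given as probabilities."""
    if any(p < 0.0 for p in probs) or not math.isclose(sum(probs), 1.0):
        raise ValueError("probs must be non-negative and sum to 1")
    # Outcomes with p == 0 are skipped (convention: 0 * log 0 = 0).
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0.0)


if __name__ == "__main__":
    print(entropy([0.5, 0.5]))   # fair coin
    print(entropy([0.9, 0.1]))   # biased coin, less uncertain
    print(entropy([1.0, 0.0]))   # deterministic outcome
```

The fair coin is the most uncertain two-outcome distribution, so its entropy is the largest of the three.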
Cross-Entropy
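The cross-entropy between two probability distributions $p$ and $q$ over the same set of events measures the average number of bits needed to identify an event drawn from the set when the coding scheme is optimized for $q$ rather than for the true distribution $p$.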
Definition (Cross-Entropy)
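The cross-entropy of the distribution $q$ relative to a distribution $p$ over a given set is defined as
$$H(p, q) = -\mathbb{E}_p[\log q].$$
For discrete distributions $p$ and $q$ with the same support $\mathcal{X}$,
$$H(p, q) = -\sum_{x \in \mathcal{X}} p(x) \log q(x).$$
The situation for continuous distributions is analogous:
$$H(p, q) = -\int_{\mathcal{X}} P(x) \log Q(x) \, dx,$$
where $P$ and $Q$ are the probability density functions of $p$ and $q$ respectively.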
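The discrete cross-entropy, $-\sum_x p(x) \log q(x)$, can be sketched in the same way. The code below assumes both distributions are given as equal-length lists of probabilities over the same outcomes and uses base 2; the name `cross_entropy` and the handling of zero probabilities are choices made for this example only.

```python
import math
from typing import Sequence


def cross_entropy(p: Sequence[float], q: Sequence[float]) -> float:
    """Cross-entropy in bits of q relative to p, both over the same outcomes."""
    if len(p) != len(q):
        raise ValueError("p and q must list the same outcomes")
    total = 0.0
    for p_x, q_x in zip(p, q):
        if p_x == 0.0:
            continue  # outcomes that never occur under p contribute nothing
        if q_x == 0.0:
            return math.inf  # q rules out an outcome that p allows
        total -= p_x * math.log2(q_x)
    return total


if __name__ == "__main__":
    p = [0.5, 0.5]
    print(cross_entropy(p, p))            # equals the entropy of p
    print(cross_entropy(p, [0.9, 0.1]))   # larger: q is a poor model of p
```

When $q$ equals $p$ the cross-entropy reduces to the entropy of $p$; any other $q$ gives a value at least as large.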