Examples
1. RSVP: yes or no?
   When you reply, you reduce my uncertainty by 1/2 (two possible answers
become one). This requires only 1 bit, the minimal amount of information.
2. A 32-icon language.
   When the destination receives/selects one icon, the uncertainty is reduced
by 31/32. This requires 5 bits (log2(32) = 5), five times as much information
as the RSVP. So selecting (or giving a command with) a single character/icon
in a 32-icon language reduces uncertainty more (provides more information)
than selecting a character in a 2-character language.
These examples assume that each ‘choice’ is independent of the others.
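
The arithmetic in the two examples can be checked with a short Python sketch. It assumes every option is equally likely and chosen independently; the function names bits_needed and uncertainty_removed are made up for this illustration and do not come from the notes.

```python
import math


def bits_needed(num_choices: int) -> float:
    """Bits needed to single out one of num_choices equally likely options."""
    if num_choices < 1:
        raise ValueError("num_choices must be at least 1")
    return math.log2(num_choices)


def uncertainty_removed(num_choices: int) -> float:
    """Fraction of the alternatives ruled out once one option is selected."""
    if num_choices < 1:
        raise ValueError("num_choices must be at least 1")
    return (num_choices - 1) / num_choices


rsvp_bits = bits_needed(2)    # yes/no reply
icon_bits = bits_needed(32)   # one icon out of 32

print(rsvp_bits, uncertainty_removed(2))    # 1.0 0.5
print(icon_bits, uncertainty_removed(32))   # 5.0 0.96875
```

The ratio icon_bits / rsvp_bits is 5, which is the "five times as much information" in example 2.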
For more typical settings, conditional probability arises (e.g., if the
receiver has received a ‘Q’ in an English-word message, the next
letter carries 0 information since it does not reduce any uncertainty: we
are sure it will be a ‘U’). This gives rise to coding theory.
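
The ‘Q’ example can be made concrete with a small Python sketch. It measures the information in an outcome as log2(1/p), where p is the probability of that outcome given what has already been received. The function name surprisal_bits is made up for this illustration, and the probabilities are assumptions: P(‘U’ after ‘Q’) = 1 is the idealization used in the text, and the uniform 26-letter case is included only for contrast.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Information, in bits, carried by an outcome with the given probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return math.log2(1.0 / probability)


# Assumed for illustration: after 'Q', the next letter is certainly 'U'.
p_u_given_q = 1.0
print(surprisal_bits(p_u_given_q))   # 0.0 bits: no uncertainty is removed

# Contrast (also an assumption): a letter drawn uniformly from 26 letters.
p_uniform_letter = 1.0 / 26
print(surprisal_bits(p_uniform_letter))   # about 4.7 bits
```

A code that exploits this would spend no bits on the ‘U’ after a ‘Q’, which is the kind of saving coding theory looks for.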