First, we want the average length L of the message sent to be as small as we can make it (to save the use of facilities). Second, it must be a statistical theory, since we cannot know the messages which are to be sent, but we can know some of the statistics by using past messages plus the assumption that the future will probably be like the past. For the simplest theory, which is all we can discuss here, we will need the probabilities of the individual symbols occurring in a message. How to get these is not part of the theory, but they can be obtained by inspection of past experience, or imaginative
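As a minimal sketch of the two ideas above, the following Python estimates symbol probabilities from past messages and computes the average length L as the probability-weighted sum of codeword lengths. The function names, the sample messages, and the code table are all hypothetical and chosen only for illustration; none of them come from the text.

```python
from collections import Counter


def symbol_probabilities(past_messages):
    """Estimate each symbol's probability from its relative frequency
    in past messages (assumes the future resembles the past)."""
    counts = Counter()
    for message in past_messages:
        counts.update(message)  # counts individual characters
    total = sum(counts.values())
    return {symbol: count / total for symbol, count in counts.items()}


def average_length(probabilities, code):
    """Average codeword length L = sum of p(symbol) * len(codeword)."""
    return sum(p * len(code[symbol]) for symbol, p in probabilities.items())


# Hypothetical past messages over the alphabet {a, b, c}.
past_messages = ["aab", "abac", "aa"]
probabilities = symbol_probabilities(past_messages)

# Hypothetical binary code table: the most frequent symbol gets the
# shortest codeword.
code = {"a": "0", "b": "10", "c": "11"}

L = average_length(probabilities, code)
print(f"L = {L:.4f} code digits per symbol")
```

With these sample messages the symbol `a` occurs 6 times out of 9, so giving it the one-digit codeword keeps L below the 2 digits per symbol that a fixed-length code for three symbols would need.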

