Information has historically been understood and measured in a wide variety of ways. Usually these models are discipline-specific or limited in scope. For example, physicists speak of information in thermodynamic terms, while epistemologists describe information as something that occurs within the context of higher-level human cognitive processes. We briefly present here a number of different ideas about information, concluding with a discipline-independent definition of information that can be used to provide a basis for a general definition of communication.
The most widely understood notion of information for English-speaking children may be seen in Cookie Monster's definition of information as ``news or facts about something." Other definitions tend to be more explicitly human-centered, such as ``information is knowledge." Similarly, Dretske [Dre81] views information as something that brings us to certainty, a definition that annoys those who use probability theory and assume that one can never be absolutely certain about anything. More formal than this is Bar-Hillel's model of information as what is excluded by a statement [BHC53,BH55]. Bar-Hillel's definition provides a bridge between formal and rigorous definitions of information and the idea of meaning. Based upon earlier thermodynamic models of entropy, Brillouin [Bri56] suggests that ``information is a function of the ratio of the number of possible answers before and after...." and that anything that reduces the size of the set of answers provides information. Shannon's model of information and communication, with which we assume the reader is familiar, measures information as the logarithm of the inverse of the probability of a signal [Sha93b,Rit86,Rit91,Los90]. Most of the measures of information produce numbers similar to those produced by the familiar Shannon model, with the amount of information increasing as the probability of an event decreases.
Several measures of complexity and information explicitly address processes. For example, several scholars, including Kolmogorov [Kol65], Solomonoff [Sol64a,Sol64b], and Chaitin [Cha77], have separately developed measures of the complexity or information inherent in a process as the smallest algorithm that produces the output produced by the process in question. The Minimal Description (MD) of the process, with all redundancy and extraneous material removed, captures the essential nature of the process. Slightly different from this is the Minimum Description Length (MDL), the length of the smallest description for a process, proposed by Wallace [WF87,WB68] and Rissanen [Ris89].
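Kolmogorov complexity is not computable in general, but the length of a compressed representation is a commonly used upper-bound proxy for it. The sketch below, which is our illustration rather than anything proposed by the authors cited above, uses the standard `zlib` compressor to show that a highly regular sequence admits a much shorter description than typical pseudo-random data of the same length:

```python
import random
import zlib


def compressed_length(data: bytes) -> int:
    """Compressed size in bytes: a rough upper bound on description length."""
    return len(zlib.compress(data, 9))


# A highly regular sequence admits a short description...
regular = b"ab" * 500

# ...while typical pseudo-random bytes of the same length do not.
random.seed(0)
noisy = bytes(random.randrange(256) for _ in range(1000))

assert compressed_length(regular) < compressed_length(noisy)
```

The compressor here plays the role of a (far from optimal) minimal-description finder: whatever redundancy it can detect and remove is, in this approximate sense, not part of the essential information in the sequence.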
One measure of information, seemingly more popular in the U.K. than in North America, measures information as inversely related to the variance of a variable [Fis25,Mac69]. As we learn more about a variable or an event (e.g. the average height of communication scholars) the variance of our estimate decreases and the information increases.
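Under this inverse-variance view, gathering more observations of a quantity increases the information we hold about it. A minimal sketch (the height values are purely illustrative) shows the variance of the sample mean shrinking as 1/n for i.i.d. samples, so the corresponding information grows with n:

```python
import statistics

# Hypothetical heights in cm; the values are illustrative only.
heights = [158, 162, 165, 170, 171, 173, 175, 178, 180, 185]

pop_var = statistics.pvariance(heights)

# For i.i.d. samples, Var(sample mean) = population variance / n,
# so the inverse-variance "information" about the mean grows with n.
for n in (2, 5, 10):
    var_of_mean = pop_var / n
    information = 1.0 / var_of_mean
    print(n, round(var_of_mean, 2), round(information, 4))
```

Doubling the sample size halves the variance of the estimate and doubles the information, which matches the intuition that more measurements tell us more about the average height.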
There is a large gap between what is provided by a measure and what is provided by a definition, and many of the measures developed in the ``hardest" of the sciences do not explicitly provide an associated definition. Information scientists, for example, may propose a model of an information related process and then measure a characteristic of the model without necessarily providing a clear definition of their terms. Definitions help clarify the essence of a phenomenon, while a measure may capture what is important in some respect about the phenomenon. A measure without a definition is still useful; however, a definition would improve the understanding of the context for most measures. We propose a definition of information, followed by a definition of communication, but one may also think of either more generally as a model allowing us to measure, describe, and predict characteristics and relationships.
Information and communication may be defined in relationship to the processes that move the information from the beginning, or entry of the information into the channel, to its exit from the channel. A process is an action-medium that moves an input presented to the process to the output, possibly assigning new values to one or more output variables. We use the term process because of Church's thesis, widely believed to be correct [KMA82], which suggests that processes, functions, and algorithms all describe the same phenomenon; for most purposes these models of computation are equivalent, and a phenomenon described with one of them may be described just as effectively with any of the others.
Information may be defined [Los97] as
the values for all variables in the output of any process. This information is about either the process, or the inputs to the process (the context of the process), or both.
A process is a set of related components that produce or change something. When studying an information system, there may be one large process or several smaller processes. The scope of the process is not important to our definition of information, or later to our definition of communication. However, the choice of how to view a process will affect how one may study or understand the process. For example, the reader may view the entire process that takes the author's thoughts and results in the thoughts being understood by the reader as a communication process, or one may view numerous small steps in the writing, publishing, and reading processes as linked processes. Whether viewed as one large process or several smaller processes, the information presented at the end of the entire large process or at the end of the set of smaller processes is the same.
The amount of information produced by a process may be measured, as with Shannon's model, as the logarithm [Har28] of the inverse of the probability of the state of nature found at the output of the process. This may be used for the values of both discrete and continuous output variables. An output variable whose value has an associated probability of 1/4, for example, carries $\log_2(1/(1/4)) = 2$ bits of information.
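The computation in this paragraph can be sketched directly; the function below simply evaluates $\log_2(1/p)$ for an outcome of probability $p$:

```python
import math


def bits_of_information(p: float) -> float:
    """Information carried by an outcome of probability p: log2(1/p)."""
    return math.log2(1.0 / p)


print(bits_of_information(0.25))  # 2.0 bits, matching the example above
print(bits_of_information(0.5))   # 1.0 bit
```

Note that the measure depends only on the probability of the output state, not on what the process computes, which is what lets it apply uniformly to discrete and (via densities) continuous output variables.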
We refer to a process that produces output from a context (defined as the set of values presented to the input of a process) as an information channel. The channel is that set of components that implements the functionality inherent in the process. A channel might be implemented by a computer and its program, or it might be a mechanical device such as a pantograph that reproduces an input movement at the output with a pre-specified degree of magnification.
Processes may consistently produce the same result when given the same input and context. These deterministic processes are what most people mean when they refer to a process in an unqualified manner. Given the same input, two identical probabilistic processes may produce different results. The output of a probabilistic process may be described by a probabilistic distribution or density function. Observers of such a process may understand the randomness as coming from two sources: inherently random processes and error-producing noise.
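The deterministic/probabilistic distinction can be illustrated with a minimal sketch (the particular functions and noise level are our own hypothetical choices, not drawn from the text):

```python
import random


def deterministic_process(x: float) -> float:
    # The same input always yields the same output.
    return 2 * x + 1


def probabilistic_process(x: float, rng: random.Random) -> float:
    # The same input may yield different outputs; the output is described
    # by a distribution (here, the deterministic value plus Gaussian noise).
    return 2 * x + 1 + rng.gauss(0.0, 0.5)


assert deterministic_process(3) == deterministic_process(3)

rng = random.Random(0)
a = probabilistic_process(3, rng)
b = probabilistic_process(3, rng)
# a and b will almost surely differ, though both cluster around 7.
```

An observer who sees only the outputs cannot tell whether the spread around 7 comes from inherent randomness in the process or from noise corrupting an underlying deterministic process, which is the ambiguity noted above.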
The process based definition of information differs in several respects from Shannon's model of information, to the extent that Shannon's ideas about information may be separated from those he had about communication. Shannon views communication systems primarily as transmitting symbols between a source and a destination. This is very different from the process model proposed above which produces output from input, with no explicit attempt being made to encode or represent something symbolically, or to be message based. Shannon notes that his theory is ``quite different from classical communication engineering theory which deals with the devices employed but not with that which is communicated" [Sha93a, p. 212]. Shannon's focus clearly was on the message.
Unlike Shannon's model, the process model describes information as the outcome of any process, not just an encoding process in a symbol based communication system. The process model of information, for example, can be used to describe an addition process that accepts two inputs and produces the sum (information) at the output. This output is informative about the additive process and the inputs. This notion of aboutness is similar to Devlin's notion of a constraint [Dev91]. While the addition process could be interpreted as encoding the input, that is certainly not a very natural interpretation and is certainly not a required interpretation if we wish to understand arithmetic sums. Similarly, the process model of information measures the information at the output as proportional to the number of possible states and their relative frequency in the output.
Each process accepts inputs or environmental characteristics and produces something (or nothing), given these input values. The function $f(x)$ denotes the processing (or encoding) of $x$ by the function $f()$. A function $f()$, when combined with its functional inverse $f^{-1}()$, accepts a variable $x$ and produces as output the same $x$ that was presented to the process at the input; i.e., $f^{-1}(f(x)) = x$. For example, the square and the square-root functions are well known functions, each being the inverse of the other for positive numbers. Thus, $\sqrt{x^2} = x$ and $(\sqrt{x})^2 = x$ for positive $x$.
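The square/square-root pair can be written out as a minimal sketch, checking both compositions for a positive input:

```python
import math


def square(x: float) -> float:
    return x * x


def square_root(x: float) -> float:
    return math.sqrt(x)


# For positive x, each function inverts the other.
x = 4.0
assert square_root(square(x)) == x   # sqrt(x^2) = x
assert square(square_root(x)) == x   # (sqrt(x))^2 = x
```

The restriction to positive numbers matters: for negative x, squaring discards the sign, so the square root recovers only the magnitude, not the original input.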
Consider again the addition function and its inverse. An addition function that accepts two inputs and produces an output, the sum, discards information that was present at the input (the output of other processes). One cannot work backwards from only the sum to the original two numbers (the reader is encouraged to consider what the original pair of numbers might have been if the sum is 3).
Consider a function which produces both the sum of two presented numbers and the difference between the two numbers. This function has not discarded any information in producing the sum and the difference, in that one can work backward to produce the original numbers. Consider a sum of 3 and a difference of 1; one can solve $x+y=3$ and $x-y=1$ to produce $x=2$ and $y=1$. Given a lossless process, in which no information is lost during processing, there is always an inverse process that can recreate the input to the first process.
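The contrast between the lossy and lossless versions of addition can be sketched as follows; the general inverse solves the pair of equations above as $x=(s+d)/2$, $y=(s-d)/2$:

```python
def lossy(x: float, y: float) -> float:
    # Discards information: many input pairs map to the same sum.
    return x + y


def lossless(x: float, y: float) -> tuple:
    # Keeps enough information to recover the inputs.
    return (x + y, x - y)


def inverse(s: float, d: float) -> tuple:
    # Solve x + y = s and x - y = d for the original inputs.
    return ((s + d) / 2, (s - d) / 2)


assert inverse(*lossless(2, 1)) == (2.0, 1.0)
# lossy(2, 1) == lossy(0, 3) == 3: the sum alone cannot be inverted.
```

The lossless process works because its two outputs jointly pin down a unique input pair, whereas the single sum is consistent with infinitely many pairs.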
Is this precise model of information too narrow to be of much practical use by scholars? Saracevic and Kantor [SK97] suggest that such normative approaches, including ``formal and rigorous models involving uncertainty...." (p. 532) take ``the narrowest view of information." We disagree, believing that the above rigorous and general model represents a very broad view of information, with other models of information being capable of being shown to be special cases of this model. Below, we present what we similarly consider to be a general, yet rigorous, definition of communication.