Sunday, September 11, 2011

The Essence of Information

In the last blog post we came to the conclusion that information is real -- yet elusive, when we try to understand what it is made of. There appears to be a layer of reality in which we live that is not matter or energy but is something different. This layer is very familiar to us. Yet, this layer is not formally recognized by science even though science has been bumping into it since the discovery of the idea of entropy in the 19th century. Today, in the 21st century, we have yet to frame a model of reality that includes this elusive entity we know as information. This should not be surprising. Our science today can still only deal with things that are matter or energy -- and information, as we saw, really does not qualify as either.
So, the question before us is: what is the stuff of information? Can we explain it in terms of other things we do understand? That is what we will tackle in this blog post.

Some Guidance from History
In general, we say we understand something if we are able to explain it in terms of more elementary constituents. As an example, until almost the 20th century, the atom was considered to be the smallest unit of matter. Dalton’s Atomic Theory envisioned the atom as a small, indivisible, indestructible object, rather like a billiard ball. It was a fundamental constituent of material things that could not be divided into anything more elementary. Scientists did not really have a clue about what it was, or whether it really existed. At best, it was a useful model of how things were at the very smallest of scales. Only by the late 19th century did scientists begin to suspect that the atom was made up of even more fundamental constituents. Today, not only do we know that the atom exists, we have a clear understanding of the more elementary particles and forces that constitute the atom. We now understand atomic behavior well enough to make precise predictions about how atoms will behave under a given set of circumstances. This is not to say we have a complete understanding. Yet, now we can say that we truly understand the atom. As proof, we are able to manipulate events on a sub-atomic level, as we do for instance in a nuclear reactor. We would not be able to do that if we did not truly understand the atom.
For us the question is: can we try to understand information in a similar way? Can we inquire into the physical nature of information and try to understand it in the same way scientists have understood the atom or gravity? At first, this does not appear to be a legitimate question. Information has an elusive, non-material quality about it. Information seems to be something purely in the realm of the mind with no real physical existence unlike the material stuff around us. It almost seems like trying to understand in a physical way what beauty, or love, or emotions are.
However, as we saw earlier, a very simple examination of information suggests it is real and can exist independent of a mind. Further, we know that information can have real physical effects in the world simply by looking at any artifact around us. Consider a chair built to a certain design. At first the design only existed on paper. Then, when used to build the chair, this design had a very physical and measurable effect not only on the wood and the other materials, but also the builder. So, information has to be real if it is to have an effect on material things.
Fortunately, history has examples of seemingly elusive entities like information being corralled by diligent scientists. Consider gravity. Before Newton, as today, people knew instinctively how gravity worked and how to work with gravity. Yet, people really did not know what gravity was or how it behaved in a quantitative way. Then along came Newton and connected the dots: he connected the movement of the Earth around the sun with the movement of the moon around the Earth and with the reason the apple falls to the ground, and showed that everything is attracted to everything else by the same “force” called gravity. He then went on to work out a simple equation that described the gravitational force between two objects: the force is proportional to the product of the two masses and inversely proportional to the square of the distance between them. Since then the law of gravitation has been found to apply to even the largest things in the universe, like clusters of galaxies. In this way, something nebulous but familiar became very well defined from a mathematical and physical perspective.
Yet, we still do not know what gravity is – only how to describe its behavior. Newton himself said “I have not been able to discover the cause of those properties of gravity from phenomena, and I frame no hypotheses.”
Our current understanding of information is similar to how gravity was understood in pre-Newton times: we understand neither what it is nor how it works. As an example, we do not understand how information is created, or how it is recognized as information by the mind. But we are immersed in it and are born with an innate ability to wield it. It would be incorrect to say we know nothing about how information works, because an entire field of mathematics called Information Theory has developed, and it is the basis of the communications systems we enjoy today, like cellular phones and cable TV. So clearly we are able to use information. However, this again is like pre-Newton times, when people had developed enough working knowledge of gravity to build bridges and machines of war, like the catapult which flung rocks during a siege, even though they did not understand gravity as we do today.

Seeking the Essence

This analogy to gravity is encouraging in that it gives us hope that we might find a physical basis for information as well.
We can begin this exploration by trying to understand what information is in the hope that it will someday lead us to understand how it operates. We can begin by asking about the essence of information. In other words: what is that core property of information that makes it what it is? What is that core property that, if information did not possess it, information would no longer be information?
To help illustrate what is meant by the word essence, consider the paper document again – the deed to the house you may own. It bears information about the house, the location, your name and a myriad other details to establish that you own the house described in the deed. Without this information, the document would no longer be a deed to your house. It would no longer be a deed. It would no longer be a document.  So we can say that the essence of the document is the information it bears. Without the information it would no longer be a document.
So, again, what is the essence of information?
Consider the act of creating information. Consider a painter who uses the medium of paint and canvas to give expression to his ideas. Note that the painter does not create the medium: he does not create the paint or the canvas. He selects the paint, selects the mixture of paints, selects the locations on which to put that paint and makes a series of choices that ultimately result in the painting. The choices are what the painter makes and has control over. With these choices he creates something new that did not exist. The painting is very real and can be sold for real money. The painting is not the paint or the canvas but is the record of the decisions that the painter made to create the unique painting. The painting is new information that is created and is ultimately a visual record of the painter’s choices.
It could be stated that the painting, in essence, is a record of choices. It is not the paint or any other material but purely the choices made by the painter. If so, can we generalize this and state that all information is nothing more, and nothing less, than a record of choices made by the creator of the information?

Quantifying Information – The Bit

It is rather surprising that, even though the nature of information is not understood, we know how to quantify it. We actually know how to measure information. In our quest, this gives us a great advantage. Without the ability to quantify information, we would have been even further away from understanding what information is. Not only can we quantify it, but we have mathematical models to describe how to transmit it over different kinds of information channels like wires, or cables or over the air.
Information theory was developed by Claude Shannon at Bell Labs, then part of AT&T, and now forms the basis for how our global telecommunication works. Shannon established the mathematical framework of what we know as information theory, which is used widely in communication and computer science. In this theory he uses the bit as the basic unit of information. This measure of information does not say anything about the content. This is the clarifying and interesting thing about the way Shannon treated information: that information can be dealt with entirely decoupled or abstracted from the content of the information. In today’s terms, this idea of information is better conveyed by the term data.
We instinctively tend to confuse information about something with the something. But these are two very distinct entities. As an illustration, imagine a crystal drinking goblet. The goblet and the information about the goblet are two different but related things. Information about the goblet can be shared among people, can be discussed, can be stored on a hard drive, and can be remembered. The crystal glass does not lend itself to this kind of manipulation.
It is this abstract entity that the bit measures. It does not matter if the information is true or false; it does not matter if we understand the information. The information can be treated as a separate entity from the topic that the information is about.  When Shannon was formulating his mathematical models for the transmission of information, the content of the information did not matter. Shannon wanted to have a mathematical way of handling information that was general and independent of the content. And he used the bit to help him arrive at a quantity that he could use to model how information is transmitted. In our quest, we will similarly delve into the nature of information without being concerned about the content.
In simplest terms, the bit is the smallest unit of data. A disk drive in a computer stores data. Generally, disk drives have a fixed maximum capacity to hold a certain amount of data. This data is stored in terms of bits or bytes, where 8 bits make 1 byte. A bit represents a decision. A bit can have only one of two values: 0 or 1. It allows us to represent a choice – a choice between two equally probable options. We make these kinds of choices all the time. Consider the case when you are asked a question that calls for a simple YES or NO answer. It could be other simple choices like UP or DOWN, LEFT or RIGHT, or 1 or 0. The point is that the answer communicates a decision between two options. The minimum number of options is two – hence the bit represents the smallest possible unit of data.
So the bit (short for binary digit) is the amount of information that is generated when a choice is made between two equally probable options. It is the answer that is given when a yes-or-no question is posed.
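To make this measure concrete, here is a minimal Python sketch of how many bits a single choice generates. It uses the standard base-2 logarithm measure from information theory; the function names are our own, chosen purely for illustration, and do not come from Shannon's paper.

```python
import math


def information_bits(num_options: int) -> float:
    """Bits generated by one choice among equally probable options."""
    return math.log2(num_options)


def entropy_bits(probabilities) -> float:
    """Average bits per choice when the options have the given probabilities."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


yes_no = information_bits(2)        # a single yes-or-no answer
fair = entropy_bits([0.5, 0.5])     # two equally probable options
biased = entropy_bits([0.9, 0.1])   # a lopsided choice

print(yes_no, fair, biased)
```

A yes-or-no answer between equally probable options comes out as exactly one bit. When one option is far more likely than the other, the same measure gives less than one bit, which is why the definition above specifies equally probable options.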
This was also the view of the bit when the concept first emerged. It can be argued that the invention of the idea of the bit ranks up there with the invention of the concept of zero. Shannon did not originate the concept of the bit, but he introduced the idea to a wide audience in his famous 1948 paper, "A Mathematical Theory of Communication", published in The Bell System Technical Journal. In it he acknowledges that the name bit was suggested by John W. Tukey. The concept itself was first clearly articulated by Leo Szilard, when he was trying to provide an answer to a famous problem commonly referred to as Maxwell’s Demon. In his 1929 paper, "On the Decrease of Entropy in a Thermodynamic System by the Intervention of Intelligent Beings", published in Zeitschrift für Physik, Szilard proposed a thought experiment in which a measurement must be taken to determine which half of a container a gas molecule is in. He defined a quantity y that could take one of two values, +1 or -1, depending on which half of the cylinder held the molecule. Without going into the details of the experiment, the number y was the result of a binary decision. The point here is that the idea of the bit and the idea of the decision are tied together.
Any information that you can conceive of can be represented as a string of bits – conventionally a string of 1s and 0s. All the information on your computer hard drive is stored as bits. All the photos, the documents, the programs – every piece of information can be broken down and represented as a string of bits. For example, the word C-A-T can be represented as 01000011-01000001-01010100 using the ASCII code, which, as commonly stored with one byte per character, assigns an 8-bit string of ones and zeroes to each of the English letters, numerals and special characters.
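The C-A-T example can be checked with a few lines of Python. This is only a sketch: the helper names `to_bit_string` and `from_bit_string` are made up for this post, and the hyphen between letters is there for readability, just as in the text.

```python
def to_bit_string(text: str) -> str:
    """Encode ASCII text as 8-bit groups separated by hyphens."""
    return "-".join(format(byte, "08b") for byte in text.encode("ascii"))


def from_bit_string(bits: str) -> str:
    """Decode hyphen-separated 8-bit groups back into ASCII text."""
    return bytes(int(group, 2) for group in bits.split("-")).decode("ascii")


encoded = to_bit_string("CAT")
print(encoded)                   # 01000011-01000001-01010100
print(from_bit_string(encoded))  # CAT
```

Going in both directions shows that nothing is lost: the string of bits and the word carry the same information.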
If all information can be represented as a string of bits, and each bit represents a decision, then information can be viewed as, and even defined as, a string of decisions or a record of decisions; or a record of choices.
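One way to see a string of bits as a record of decisions is the familiar guessing game in which each yes-or-no answer halves the remaining possibilities. The sketch below is our own illustration, not something taken from Shannon or Szilard; the function names and the choice of eight options are assumptions made for the example.

```python
def record_of_decisions(target: int, num_options: int) -> list:
    """Pick `target` out of range(num_options) by repeated halving.

    Each step answers one yes-or-no question, "is it in the upper half?",
    and the answers are kept as a list of 1s and 0s.
    """
    low, high = 0, num_options
    decisions = []
    while high - low > 1:
        middle = (low + high) // 2
        if target >= middle:
            decisions.append(1)  # yes: keep the upper half
            low = middle
        else:
            decisions.append(0)  # no: keep the lower half
            high = middle
    return decisions


def replay(decisions: list, num_options: int) -> int:
    """Recover the chosen item from nothing but the recorded decisions."""
    low, high = 0, num_options
    for decision in decisions:
        middle = (low + high) // 2
        if decision:
            low = middle
        else:
            high = middle
    return low


choices = record_of_decisions(5, 8)
print(choices)             # [1, 0, 1]
print(replay(choices, 8))  # 5
```

Three decisions single out one item among eight, and the recorded answers 1, 0, 1 are the number five written in binary. The record of decisions and the information are, in this small case, the same thing.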
As we saw earlier, a painting is not the paint or the canvas, but is the record of the decisions that the painter made to create the unique painting. The painting is new information that is created and is ultimately a visual record of the painter’s choices. In other words, this new information in the painting is nothing more than a record of choices: the paint colors, brushes and brush strokes chosen by the painter. More generally, the essence of a certain quantity of information is the set of decisions that make up the information.
This then raises the question: what is a decision? We will get into that a little later. For now let us stay at this level and examine the idea of information from here before taking a deeper dive.
In trying to understand the nature of information, we also have a new definition of information: information is a record of decisions.
Before we proceed further, it is useful to pause and ask what the word record means in this context. The word has multiple related meanings, and already includes the idea of information in it, which risks making our definition less clear and more ambiguous. Merriam-Webster defines record as “something that recalls or relates past events”. This is the sense in which the term is used here – the past events in this case being the decisions taken to create the information.
This view of information does not seem to be very interesting – a string of bits seems to be extremely drab and dull. The information that we are interested in is at a higher level – like books, and movies and photos. Does this view of what information is possibly apply to those examples? For example, can a movie be considered a record of decisions?

A Test Drive – over Known Roads

Let’s take our definition for a test drive over known roads – over examples of information that we care about. This is a valuable exercise to establish its applicability. While it applies to the notion of computer bits, we need to understand if it applies in a broader sense. You will see that it is not only consistent but, as a good definition often does, also takes us to a deeper level of understanding and brings new things under its umbrella that were hard to define.
Consider a simple example of information. My wife Sarah calls me on my phone and informs me that it is raining where she works. She might say something like, “George, it is raining here”. Clearly, this represents some information passed from Sarah to me. The question is: does this information represent a record of decisions?
During the phone call, this sentence that Sarah spoke was converted from a string of sounds to a string of bits by the phone system. So, right away we see that at a trivial level, this does comply with our definition of information.
Moving up a level from bits to words, Sarah decided to string together the particular choice of words and decided to give it voice. Taking it up another level, Sarah decided to convey the idea that it was raining. She decided to convey it to me, her husband. At an even higher level, she decided to pick up her mobile phone and talk to me – conveying and confirming the warm relationship with me, her husband. Everything about the phone call represents layers of decisions. Each of these levels, from the decision to connect with her husband, down to the level of bits – all represents decisions by Sarah and by the phone system. That simple sentence over the phone is actually a multi-layer record of decisions.
It will of course be noticed that not all the decisions were made by Sarah. The choice and order of words were governed to some extent by the English language. The rules of grammar were decisions taken by other humans. So, Sarah was wielding units of information of far higher complexity than the bit. But this does not in any way take away from the idea that information is made up of decisions. The main point that emerges from looking at this example is that not only does information consist of decisions at the bit level – it is decisions all the way up.
The above example is fairly general and covers all information that is recorded as words. However, information is not restricted to words alone. What about alternate forms like images and sounds which do not represent words? Again, we ask the question: does our definition capture these alternate forms of information? As an illustration, consider the case where my wife Sarah decides to take a picture of our children while I am away on a business trip. She decides to email it to me. That evening, I sit in a far-away hotel room, look at the picture on my laptop, and think warmly about my family. Clearly, this photograph represents information. The question is: does this information represent a record of decisions?
The photograph that glows on my laptop screen is actually made of pixels, millions of them, each a little square of a color at a certain brightness level. These pixels are so small that the eye does not see them as separate squares of light; instead they blend into one smooth image. Typically, each pixel is stored in the computer’s memory using 24 bits of data. So right away, at the primitive level, the picture is a string of bits – each bit representing a decision taken by Sarah's digital camera when converting the light projected by the lens onto the image sensor. So, this photograph is a record of decisions at the bit level. Again, as before, this insight is not very illuminating or satisfying.
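For readers who like to see the arithmetic, here is a rough Python sketch of a 24-bit pixel. It assumes the common layout of 8 bits each for red, green and blue, and a hypothetical 4000 by 3000 pixel photo stored without compression; real cameras and file formats vary, so treat the numbers as illustrative only.

```python
def pack_rgb(red: int, green: int, blue: int) -> int:
    """Pack three 8-bit color values (0 to 255) into one 24-bit number."""
    return (red << 16) | (green << 8) | blue


def image_bits(width: int, height: int, bits_per_pixel: int = 24) -> int:
    """Total bits needed to store an uncompressed image."""
    return width * height * bits_per_pixel


orange = pack_rgb(255, 128, 0)        # one orange pixel
orange_bits = format(orange, "024b")  # its 24 binary decisions
total_bits = image_bits(4000, 3000)   # hypothetical 12-megapixel photo

print(orange_bits)  # 111111111000000000000000
print(total_bits)   # 288000000
```

Under these assumptions, a single photograph is a record of 288 million binary decisions made by the camera before Sarah's own decisions are even considered.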
So, we move a level higher from the decisions the camera took, to the decisions Sarah took. Sarah decided to take the picture by pressing the shutter on the camera. Before she pressed the shutter, she decided on the location and composition of the photo – essentially she decided on the content she wanted to capture. She then decided to email that picture to me. At an even higher level she decided to express her love for me by sending me a picture. As before, each of these levels, from the decision to express love to her husband down to the level of bits – all represents decisions by Sarah and by the information transfer chain formed by camera → email → computer systems. Again, that simple photograph by email is a multi-layer record of decisions.
The example of the digital image can be extended to movies and videos. By a similar argument we can show that music can also be considered as a record of decisions. We see that the new definition for information – a record of decisions – holds up to common examples of what we consider information.

Information in Nature

So far, the reader may concede that the definition appears to cover those examples of information which are intentionally created by the human mind. But what do we make of information that is discovered, like that found in nature: the arrangement of pebbles in the bed of a stream, or the patterns created by leaves being blown around? Surely, you might say, no decisions were taken, yet there is information there. Science is all about discovering the order underlying nature. Surely, this definition could not cover those cases. Scientific discoveries in general can be viewed as our understanding of the underlying order that is present in nature. How can this type of information be termed a record of decisions when there was no one taking decisions? Setting aside biology for the moment, most scientific investigations are about things that are “natural”, with no obvious evidence of involvement of any type of mind. Examples are the study of the earth, the atmosphere or metals.
There is a subtle point to be grasped here. As mentioned earlier, the information about something is distinct from the “something”. The scientific discovery represents knowledge we have gained that did not exist before. However, the phenomenon that we discovered pre-existed our knowledge. So, the scientific discovery is the information we draw from the phenomenon that was discovered.
To illustrate this, consider a hypothetical case where a geologist happens upon a new type of rock. As soon as he discovers the rock and realizes it is new, he has already started creating information about the rock. The decision to declare it new, the classification and the naming of the rock are all examples of new information that the excited geologist creates. This new information is a record of decisions taken by the geologist. As new questions are raised and answered, new information is created which is a record of the decisions of that entire process. Ultimately, science is about classification and trying to fit the new into an existing framework. This means selecting between existing classes or creating new ones. All these can ultimately be described as decisions.
The key is that our observing things in nature and creating any kind of sense around the observation is the act of creating information – creating it by classifying according to known categories. Classification is nothing more than taking decisions about the buckets to place the new discovery in. Until these decisions are taken, there is no information about the phenomenon discovered.
This even holds for discoveries that are about long lost information. Consider archeologists who discover writings, or a treasure map. This is discovery of information that was generated by another mind. The discovery is still new information about the ancient information. The scientists may not be able to understand the ancient writing or decipher the ancient drawings. Most likely they will not know the identity of the person who created the information. Nevertheless, they are not in doubt that it is information generated by humans in the past.

Information and Life

This brings us to information we find in nature that is unmistakably information yet could not be human in origin – the DNA molecule. It is found in nearly every cell of our body. It is best described as a library of recipes for building all the proteins needed by the organism. These recipes are written in a language with only four letters, unlike English with its twenty-six letters. Each letter is represented by a small molecular unit called a nucleotide. Essentially, DNA is a long molecule that is a string of nucleotides. Our DNA is about three billion letters long. It is nothing short of information – it is detailed instructions on how the cell can assemble proteins.
How do we know it is information? Well, DNA is a language with 4 symbols. Each symbol is known and, to a first approximation, each symbol is equally likely. Local molecular forces do not favor any one symbol over the other. We do not understand the language, but we know it is a language understood by the machinery in our bodies. There is nothing else like it in nature. The closest thing we can compare it to is language itself.
When the Watson-Crick team was unraveling the mysteries of DNA, it was the scientists who were creating the information about the DNA. But the information stored in the DNA was not created by the scientists. In some ways the discovery of DNA is similar to the discovery of a library of a long lost civilization. The information about the library was created by the archeologists; the information stored in the library was created by unknown, long-lost minds.
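The bit lets us put a rough number on this. The Python sketch below assumes, as the text does, that the four letters are equally likely, so each letter amounts to two binary decisions. The particular two-bit code assigned to each letter is arbitrary and chosen only for illustration; it is not a biological convention.

```python
import math

# Arbitrary illustrative code: any assignment of the four 2-bit patterns would do.
NUCLEOTIDE_CODE = {"A": "00", "C": "01", "G": "10", "T": "11"}


def dna_to_bits(sequence: str) -> str:
    """Write a DNA sequence as a string of bits, two per letter."""
    return "".join(NUCLEOTIDE_CODE[letter] for letter in sequence)


bits_per_letter = math.log2(len(NUCLEOTIDE_CODE))  # 4 equally likely symbols
genome_letters = 3_000_000_000                     # "about three billion letters"
genome_bits = int(bits_per_letter * genome_letters)
genome_megabytes = genome_bits / 8 / 1_000_000

print(dna_to_bits("GATTACA"))  # 10001111000100
print(genome_bits)             # 6000000000
print(genome_megabytes)        # 750.0
```

On these assumptions, three billion letters come to about six billion bits, or roughly 750 megabytes: a record of some six billion binary decisions, whoever or whatever took them.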
So does the definition “a record of decisions” hold for DNA? The word decision seems to unambiguously connect information to a mind. We are used to thinking about information as something that is independent of the mind. But, as we have seen, the word decision is integral to our definition of information. For decisions to occur, a decision-making entity must exist – at least as far as our experience goes. Decision-making entities are associated either with a mind or with a computer. But minds created computers to execute decisions. Essentially, computers are devices that bottle decisions. Based on our collective experience then, we have to assert that for decisions to exist there must be a mind.
The problem is that, unlike earlier examples of information that conformed to the definition, we cannot immediately identify a decision-maker. In effect this definition appears to smuggle in aliens or a God. To set this aside as an obstacle, first let me point out that not knowing the mind that created information does not in any way cause the information to cease being information. When we read an anonymous piece of text, it still remains a piece of text; it is still information even if we cannot identify the mind that generated the text. Second, DNA calls for a decision-making entity that is not human; it does not automatically demand aliens or a deity. Francis Crick, the co-discoverer of the structure of DNA, when faced with this challenge, proposed that life on Earth might have been seeded by an extra-terrestrial intelligence (Crick). But this is not the only avenue. Natural selection, which is at the heart of the theory of evolution, by its very name suggests a process that could produce decisions.
If you, dear reader, are someone who is ready to dismiss this definition because accepting it will force you to believe in aliens or a God, allow me to point out that it will not force you into an intellectual corner. The definition appears to challenge the evolutionary view that nature did all the creating of life. But it does not. What the definition actually does is create a new scientific question: can nature take decisions? Or, using our definition, we can reword that question as: can nature create information? As Paul Davies and other thinkers point out, the real problem to be solved when investigating the origin of life is the origin of biological information (Davies, 2000). So, not only does this definition hold, but it is not in conflict with current thinking around a natural origin of life.
To summarize our test drive, we have seen that our new definition does apply to all known types of information. Yet, there is still something missing: the definition does not seem to capture the notion of the content that is associated with the word.

Taking the Definition for a Test Drive – over a New Road

Our new definition not only holds for known examples of what we consider information. It also helps clarify some distinctions that were harder to articulate without this definition.
Consider our crystal drinking glass again. Starting from humble beginnings as sand, it ends up, in the hands of a glass craftsman, as a thing of beauty. If I were to place an equivalent pile of sand next to the glass, the difference would be painfully obvious. But if we stop and try to articulate the essential difference between the pile and the glass, we would be hard pressed to give voice to what our mind perceives so simply. Everything is different: the color, the shape, the texture, the beauty, the utility, the value. Yet at the same time, at a material level, they are very similar – both are made largely of the same stuff, silicon dioxide.
Our definition allows us to clearly articulate the difference: the crystal cup has information in it that the pile of sand did not have. Putting it another way: sand + information = the cup. The cup represents a record of decisions taken by the glass craftsman when he or she gently shaped the molten sand to his/her will.

The Essence… and a Step into the Unknown

We are now able to articulate what the essence of information is. To remind the reader, the essence of information is that core property that makes information what it is. It is that core property that, if information did not possess it, information would no longer be information.
Our definition tightly relates information to the decision. Take away the decision and there is no information. Clearly, the decision is a key part of the essence of information. We can now take a bolder step and assert that the essence of information is nothing but decisions.
This seems anti-climactic to say the least. The idea that information at its core is nothing but decisions is not really very intuitive or illuminating. After all, we do not really know what a decision is. Sure, we understand decisions in terms of logic and reason. But we do not understand them from a physical point of view. A decision, like information, does not seem to be made of matter or energy. It is not clear how our physical laws govern decisions – if they govern them at all. Further, a decision seems to actually muddy the waters. It is commonly associated with a mind – another debatable and mysterious entity. So instead of providing clarity, this step seems to only pull in a new set of vaguely understood entities.
We now have an important choice before us. We can seek comfortable answers that explain information in terms of matter and energy. Or, we can tentatively step out beyond matter and energy and recognize that maybe a decision is a new beast – that information is a new and little understood entity that we need to understand, and that we may not be able to understand it purely in terms of matter and energy. This, of course, immediately threatens to open science up to a whole new set of ideas that may undermine well established beliefs about how the world works. However unsettling this may seem, it has to be explored with honesty lest we miss out on new advances that we cannot even dream about today.
We seem to stand at the edge of a precipice, with the comfortable sciences we know today behind us. In front of us is the murky vastness of the unknown. For some perspective on this choice, consider that we have been at the edge of a similar precipice before in the history of science. When the idea of energy was first postulated, it was considered as giving in to supernatural ideas, even occult ideas. The scientists who first proposed it were considered to be playing outside the acceptable bounds of science. Yet, over time, the concept of energy was tested, accepted and integrated into mainstream thinking. Today, science would be inconceivable without the concept of energy. The advances we have made in our understanding of the universe and in technology could not have been dreamed of when the concept of energy was first proposed.
I am firmly of the opinion that we stand again at the edge of such a precipice. Behind us lie the comfortable sciences related to matter and energy – and out ahead of us, in the fog, lie new ideas waiting to be discovered. This step – away from matter and energy and into the unknown – is important if we are to find new perspectives on nature. I invite you to join me as we venture out into the new unknown.
The first step on this journey is to examine what we have just found – the idea that the humble decision may be a fascinating new object waiting to be examined in this new light.

© 2011 George Valliath