The Information

It’s somewhat ironic that, despite my Computer Science degree, I didn’t hear a lot about Alan Turing until after college, reading Cryptonomicon. I’d heard the name, along with the other big names in computer science: Dijkstra, Knuth, Kernighan and Ritchie, along with software deities like Gates and Jobs and Torvalds. Learning computer science was disconnected from history, much like math. You’d only hear a guy’s name if he was attached to a proof or an algorithm.

Much like math, there was an acknowledged naiveté about the history, as though it didn’t matter. Never mind the messy past where this stuff had been hashed out; now we had the proofs, the parts that were useful. Here’s to the future.


I hadn’t heard the name Claude Shannon until picking up The Information, by James Gleick. Shannon was pretty much the American counterpart to Turing, laying the foundations of information theory during World War II. Instead of cracking codes, he was trying to optimize the trajectories of anti-aircraft guns, or improve the fidelity of private communiqués between the Allies. At Bell Labs after the war, he put out unassuming papers in the company journal that laid the groundwork for the entire information age.

But before we get to Shannon, Gleick starts with a network of African villages that signal via complex drum rhythms. Using a set of two drums with differing timbres, they could rap out a message to a distant village – “come back home”, or “the raiders are attacking from the west”. The concept was similar to Morse code: using the repetition of simple discrete signals to build up a more complex message.

The book continues a journey through time, bouncing off different cultures that have utilized information, classified it into dictionaries, languages, character sets. One of the more fascinating was the “telegraph” during the age of Napoleon, where secret military orders were conveyed across the French countryside via a system of elaborate windmill arms.

He touches on Charles Babbage and Ada Lovelace, Victorian savants who very well could have ushered in a steampunk information age if given enough time and money. And then Shannon and Turing and all the others who, through the crucible of a world war, crafted the technology that now dominates our world.

One of the first big concepts was that pretty much anything (text, sound, pictures, concepts, etc.) could be reduced to a long chain of binary numbers.
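To make that concrete, here is a minimal Python sketch that turns a piece of text into a chain of 1s and 0s and back again. The function names are my own, and UTF-8 is just one assumed encoding among many; nothing here comes from the book.

```python
def text_to_bits(text: str) -> str:
    """Encode text as UTF-8 bytes, then write each byte as 8 binary digits."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_text(bits: str) -> str:
    """Reverse the process: read 8 digits at a time back into bytes."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


bits = text_to_bits("Hi")
print(bits)                # 0100100001101001
print(bits_to_text(bits))  # Hi
```

Sound and pictures work the same way in principle, just with a different agreed-upon scheme for what each group of bits means.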

Turing discovered that a long chain of binary numbers could act as a code, and in turn compute other long chains of binary numbers. This was a Universal Turing Machine (UTM).
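A rough way to see the "code is just data" idea is a tiny Turing machine simulator in Python. This is a sketch under my own assumptions, not Turing's construction and not a true universal machine: the rule table `INCREMENT`, the state names, and the `_` blank symbol are all made up for illustration. The point is only that the "program" is a plain table handed to a generic interpreter.

```python
def run_turing_machine(rules, tape, state="start", blank="_", max_steps=10_000):
    """Run a rule table over a tape until the machine reaches the 'halt' state."""
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    head = 0
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step limit reached; the machine may never halt")
        symbol = cells.get(head, blank)
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += {"L": -1, "R": 1, "N": 0}[move]
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)


# Hypothetical rule table: add 1 to a binary number written on the tape.
# (state, symbol read) -> (symbol to write, head move, next state)
INCREMENT = {
    ("start", "0"): ("0", "R", "start"),  # scan right to the end of the number
    ("start", "1"): ("1", "R", "start"),
    ("start", "_"): ("_", "L", "carry"),  # hit the blank, turn around
    ("carry", "1"): ("0", "L", "carry"),  # 1 + 1 = 0, carry the 1
    ("carry", "0"): ("1", "N", "halt"),   # absorb the carry and stop
    ("carry", "_"): ("1", "N", "halt"),   # ran off the left edge: new digit
}

print(run_turing_machine(INCREMENT, "1011"))  # 1100
```

Swap in a different table and the same interpreter computes something else. The `max_steps` guard is there because, in general, you can't know ahead of time whether a given table will ever halt.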

Similarly, Shannon discovered a formula to describe any long chain of numbers, for instance one received over a wire. The amount of information you receive is a function of how well you can predict the next value. Shannon called this Informational Entropy.

If you flip a coin, each flip is random, so you have no way to predict the next one: “HHTTHTTHTHHTH”. A random sequence like this carries the maximum entropy, 1 bit per flip. By contrast, something like English comes in at only about 0.6, because you can often predict the next letter in the sequence: “Ths cn be prvn by th fllwng sntnce”.
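Here is a small Python sketch of the entropy calculation, assuming the simplest possible model: each symbol is counted on its own, with no memory of what came before. The function name is mine. Because it ignores context, it won't reproduce the English figure quoted above, which depends on how well earlier letters predict later ones; it only shows how predictability pulls the number down.

```python
from collections import Counter
from math import log2


def entropy_bits_per_symbol(message: str) -> float:
    """Shannon entropy of the symbol frequencies in a message, in bits per symbol."""
    counts = Counter(message)
    total = len(message)
    # Each symbol with probability p contributes p * log2(1 / p).
    return sum((n / total) * log2(total / n) for n in counts.values())


print(entropy_bits_per_symbol("HTHTHTHT"))       # 1.0, a perfectly balanced coin
print(entropy_bits_per_symbol("HHHHHHHH"))       # 0.0, no surprise at all
print(entropy_bits_per_symbol("HHTTHTTHTHHTH"))  # just under 1
```

The last sequence lands slightly below 1 only because a short run of flips rarely splits exactly evenly between heads and tails.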

This laid the foundation for things like compression and encryption – trading one set of numbers for another (via an algorithm) and yet maintaining the same amount of information.
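A quick way to watch this happen is with Python's standard `zlib` module. This is only an illustrative sketch: the sample message and the `compressed_ratio` helper are my own, and zlib is a modern general-purpose compressor, not anything Shannon wrote.

```python
import random
import zlib


def compressed_ratio(data: bytes) -> float:
    """Size after compression divided by size before (smaller means more predictable)."""
    return len(zlib.compress(data, level=9)) / len(data)


# A very predictable message: the same phrase over and over.
predictable = b"come back home " * 200

# Noise of the same length, from a seeded generator so the run is repeatable.
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(len(predictable)))

# Compression trades one chain of bytes for another without losing anything.
assert zlib.decompress(zlib.compress(predictable)) == predictable

print(f"predictable: {compressed_ratio(predictable):.3f}")
print(f"noisy:       {compressed_ratio(noisy):.3f}")
```

The repetitive message shrinks to a small fraction of its size, while the noise barely budges (it can even grow by a few bytes). There is nothing left to predict, so there is nothing left to squeeze out.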

The fascinating thing was that the mathematical formulas for Information Entropy and Thermodynamic Entropy were the same.
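For reference, these are the two formulas as they are usually written today. The notation is the standard textbook one rather than anything quoted from the book: p_i is the probability of the i-th symbol (or the i-th microstate of the physical system), and k_B is Boltzmann's constant.

```latex
% Shannon's information entropy, in bits
H = -\sum_i p_i \log_2 p_i

% Gibbs's form of thermodynamic entropy
S = -k_B \sum_i p_i \ln p_i
```

Strictly speaking they match up to a constant factor: the physical version carries k_B and uses the natural logarithm, which only changes the units.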


Thermodynamic Entropy is basically a measure of how much of a system’s energy is no longer available to do “work”. The second law of thermodynamics says that the entropy of a closed system always tends to increase (hence, the ability to do work decreases).

Informational Entropy works the same way: as the amount of “noise” in the message increases, so too does the entropy. A random message looks just like noise, and is at maximum entropy.

Into this whole thing steps Kurt Gödel, who basically throws a wrench in the gears of mathematics, science, and information theory. He says: “There is no perfect system. No matter what, any system will break down, usually when attempting to describe itself.” In physics, this was mirrored by the Uncertainty Principle of quantum mechanics. In mathematics, the incompleteness theorems. In computer science, the halting problem (no program running on a Universal Turing Machine can predict, for every possible program, whether it will complete or run forever).

Given all these parallels between the operation of the fundamental particles (the physical world) and bits (the information world), some have theorized they are one and the same. The universe is an information system, or a Universal Turing Machine (or perhaps we are simulations in a higher-order Turing machine, etc.).

Just as algorithms “optimize” programs on a UTM tape, what if atoms, molecules, proteins, and life itself are simply algorithmic optimizations of the information system of the universe? On and on up the chain: enzymes -> multicelled organisms -> conscious humans -> social structures and institutions -> abstract concepts like science and math -> theories of information, bending back upon itself in endless recursion.

It’s easy to take information theory for granted, stick it in a box of things that have given us some niceties of modern life: the internet and iPhones and telecommuting. But the idea that sits behind it is pretty fundamental, and isn’t limited to circuit boards or even electricity. Like Dijkstra said: “Computer Science has as much to do with Computers as Astronomy has to do with Telescopes”.

Gleick’s book is excellent: an entertaining journey through history, illustrated with colorful characters, the oddities of dead technologies, and complex topics explained lucidly. This book should be required reading for any citizen of the digital age.
