In 1948, Claude Shannon published "A Mathematical Theory of Communication" and unwittingly launched a thousand academic ships sailing into territories he never intended to explore. His elegant theory—focused on the reliable transmission of symbols across noisy channels—would become one of the most consequential frameworks of the 20th century.
But Shannon himself soon noticed something troubling: his information theory was being stretched far beyond its intended boundaries, applied to domains where it simply didn't fit. By 1956, he felt compelled to write "The Bandwagon," a brief paper warning the scientific community against overextending his work.

Nearly seven decades later, we're still not listening.
What is Shannon Information, Really?
Shannon's information theory isn't about meaning—it's about messages. At its core, it offers a mathematical framework for quantifying, encoding, and transmitting information reliably from one point to another.

The components are elegant in their simplicity:
Information Source: Produces a message or sequence of messages
Transmitter: Encodes the message into a signal suitable for transmission
Channel: The medium through which the signal is sent
Noise Source: Introduces distortions or errors
Receiver: Decodes the signal back into a message
Destination: Where the message is delivered
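The pipeline above can be sketched in a few lines of code. The toy below runs bits through a channel that flips each one with some probability, using a simple repetition code as the encoder. This is a minimal illustration of Shannon's model, not any particular real coding scheme; the function names and parameters are illustrative only.

```python
import random

def transmit(bits, flip_prob=0.1, repeat=3, seed=42):
    """Send bits through a noisy channel, protected by a repetition code."""
    rng = random.Random(seed)
    # Transmitter: encode each bit by repeating it
    encoded = [b for b in bits for _ in range(repeat)]
    # Channel + noise source: each transmitted bit may flip
    received = [b ^ (rng.random() < flip_prob) for b in encoded]
    # Receiver: decode by majority vote over each repeat-group
    decoded = [int(sum(received[i:i + repeat]) * 2 > repeat)
               for i in range(0, len(received), repeat)]
    return decoded

message = [1, 0, 1, 1, 0, 0, 1, 0]
print(transmit(message))  # majority voting corrects most single-bit flips
```

Note that nothing in this sketch knows or cares what the bits mean; reliable delivery of symbols is the entire job, which is precisely Shannon's point.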
Shannon gave us ways to measure information content (entropy), channel capacity, and optimal encoding schemes that minimize errors. His work made the digital world possible—from the internet to mobile phones to data compression.
But crucially, Shannon information is syntactic, not semantic. It concerns itself with the reliable delivery of symbols, not what those symbols mean.
Shannon's Warning: The Bandwagon Paper
In "The Bandwagon," Shannon wrote with remarkable prescience:
"It has perhaps been inevitable that some of the applications of information theory have been taken too literally by enthusiasts... The establishing of such applications is not a trivial matter of translating words to a new domain, but rather the slow tedious process of hypothesis and experimental verification."
Shannon recognized that his theory—developed for engineering communication systems—was suddenly being pressed into service to explain everything from psychology to economics, linguistic meaning to biological systems.
He wasn't saying these domains couldn't benefit from information-theoretic concepts, but rather that direct applications without domain-specific adaptation would lead to fundamental misunderstandings. And he was right.
A Few Domains Where Shannon Information Overreaches
Linguistics and Meaning
Linguists quickly seized upon information theory, seeing an opportunity to formalize communication. But Shannon's entropy measures word predictability, not semantic content. A sentence can be highly predictable ("Have a nice...day") while conveying little useful information, or unpredictable but meaningful.
When we stretch Shannon information to explain semantics, we're trying to measure meaning with a ruler designed for bits. The fundamental error is assuming that unpredictability equals information richness. Yet a random string of characters has maximum Shannon entropy while being semantically empty.
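This mismatch is easy to demonstrate with Shannon's entropy formula, H = -Σ p_i log2 p_i, applied to character frequencies. In the toy comparison below (the example strings are my own), a repetitive English phrase scores lower than near-uniform gibberish, even though only the former means anything:

```python
from collections import Counter
from math import log2

def char_entropy(text):
    """Empirical Shannon entropy (bits per character) of a string."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * log2(c / n) for c in counts.values())

sentence = "have a nice day have a nice day"
gibberish = "qxjzvkwpbfmg ydhtrlcnsoaeiu qzx"
print(char_entropy(sentence))   # repetitive English: lower entropy
print(char_entropy(gibberish))  # near-uniform characters: higher entropy, no meaning
```

Entropy rewards unpredictability, so the meaningless string wins; any semantic theory built directly on this measure inherits that inversion.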
Noam Chomsky recognized this limitation early, noting that "colorless green ideas sleep furiously" is syntactically correct but semantically nonsensical—a distinction Shannon's mathematics simply isn't equipped to capture.
Economic Information Flow
Economists love mathematizing human behavior, and Shannon's equations offered tantalizing possibilities. Markets have been recast as information processing systems where prices serve as signals.
But economic information isn't just transmitted—it's interpreted, valued differently by different actors, and embedded in complex social contexts. A stock price movement means different things to a day trader, a pension fund manager, and a company CEO.
Friedrich Hayek anticipated this distinction in his work on knowledge in society. He understood that economic knowledge is distributed, contextual, and often tacit—qualities that Shannon's framework, focused on centralized transmission of explicit symbols, struggles to accommodate.
When financial models treat economic information as Shannon information, they often miss the human elements of interpretation and valuation that drive actual market behavior. The 2008 financial crisis demonstrated how catastrophically these models can fail when confronted with human panic, institutional trust breakdowns, and information cascades.
Cognition and Consciousness
Perhaps nowhere has Shannon information been more ambitiously (and problematically) extended than in theories of mind. The computational theory of mind, drawing heavily on information processing analogies, has dominated cognitive science for decades.
The brain-as-computer metaphor leans heavily on Shannon's framework, suggesting cognition is fundamentally about information processing in the Shannon sense. This perspective inevitably leads to imposing Von Neumann architectures onto neural systems—assuming cognition must be discretely stored, sequentially accessed, and processed by separate computational components.
Consider Gallistel and King's argument that "the brain must in some sense be capable of storing numbers"—an assertion that lived experiences must be encoded in molecular structures within neurons, addressable like computer memory. On this view, if animals can count or remember quantities, those numbers must be encoded in Shannon-like information structures within the brain, ignoring the possibility that numerical concepts might emerge from distributed properties of neural networks without explicit digital storage.
Human thought involves integration, interpretation, and the generation of meaning in ways that transcend binary encoding and transmission—and may operate through principles entirely different from Von Neumann computational models.
When we encounter a face, we don't just process visual data—we experience recognition, emotional response, social context, and memory activation simultaneously. These aren't just additional "bits" being processed, but qualitatively different modes of knowing.
Phenomenological consciousness—what it feels like to experience something—remains stubbornly resistant to reduction to Shannon information processes. As philosopher David Chalmers noted in formulating the "hard problem" of consciousness, even perfect information processing models might never explain why experience exists at all.
… and many more. And now:
Agency & AI: The New Frontier of Misapplication
As we develop increasingly sophisticated AI systems and agent-based models, we face a new chapter in the overextension of Shannon information. Agency—the capacity to act independently toward goals—is being reduced to information processing capabilities.
Current frameworks for "agentic AI" often implicitly assume that sufficient information processing capabilities will naturally produce true agency. But agency involves more than processing information—it requires having a perspective, valuing certain outcomes over others, and possessing some form of intrinsic motivation.
When we collapse agency into Shannon information processing, we risk creating systems that simulate rather than possess genuine autonomy. This category error could have profound implications as we integrate AI systems more deeply into society.
A large language model processing terabytes of text hasn't necessarily developed agency just because it can effectively predict and generate human-like responses. Information processing capacity is necessary but not sufficient for true agency—a distinction Shannon's framework, with its focus on transmission rather than meaning or purpose, doesn't help us navigate.
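The gap between prediction and agency is visible even at toy scale. The bigram sampler below (a deliberately crude sketch of my own, nothing like a real language model) generates statistically plausible continuations purely from symbol counts, with no goals or perspective anywhere in the code:

```python
from collections import Counter, defaultdict
import random

def train_bigrams(text):
    """Count, for each character, which characters follow it."""
    model = defaultdict(Counter)
    for a, b in zip(text, text[1:]):
        model[a][b] += 1
    return model

def generate(model, start, length, seed=0):
    """Sample a continuation one character at a time from bigram counts."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        nexts = model.get(out[-1])
        if not nexts:
            break
        chars, weights = zip(*nexts.items())
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)

model = train_bigrams("the cat sat on the mat and the cat ran")
print(generate(model, "t", 20))  # statistically plausible, but nothing is "meant"
```

Scaling this up by many orders of magnitude changes the quality of the predictions, but the question of whether it changes their kind—whether agency ever enters the picture—is exactly what Shannon's framework cannot answer.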
The Von Neumann Trap: When Shannon Meets Architecture
When we misapply Shannon information theory across domains, we often commit a second, more subtle error: we impose Von Neumann computer architectures onto systems that may operate on entirely different principles.
The Von Neumann architecture—with its strict separation between processing unit, memory storage, and transmission channels—maps cleanly onto Shannon's model of information transmission. This alignment has led many researchers to unconsciously assume that any system processing "information" must have analogous structures.
In neuroscience, this manifests as models that insist brain memory must work like computer memory—explicitly encoded, addressable, and stored in specific locations (molecules, cells, or circuits). Randall Gallistel's insistence that memory must be stored in "intracellular molecular" structures exemplifies this thinking. If the brain processes Shannon information, the reasoning goes, it must need Von Neumann-like structures to do so.
In economics, we see it in models that treat markets as computational devices with separate information storage and processing components. In ecology, researchers sometimes impose information processing hierarchies onto ecosystem dynamics that may actually operate through distributed, non-hierarchical interactions.
The pattern is consistent: Shannon information leads to Von Neumann architectures, which in turn constrain our understanding of complex systems. The issue isn't just that we're applying the wrong information concept—it's that we're imposing an entire computational paradigm that brings with it assumptions about how information must be stored, accessed, and processed.
Beyond the Bandwagon
Shannon's mathematics brilliantly solved the problems he set out to address—reliable communication across noisy channels. The digital world we inhabit is a testament to the power of his insights.
But as we continue exploring domains where information takes forms Shannon never contemplated—embodied, embedded, enactive, extended, or emergent—we need theoretical frameworks that account for these unique properties.
Information in biological systems isn't just transmitted; it's grown, evolved, and physically instantiated. Information in social systems isn't just exchanged; it's negotiated, contested, and culturally shaped. Information in conscious systems isn't just processed; it's experienced, valued, and integrated into a subjective perspective.
Shannon himself would likely be the first to recognize these limitations. After all, he was the one who tried to slow the bandwagon almost 70 years ago. Perhaps it's time we finally listened.
Rather than stretching Shannon's theory beyond recognition, we might need fundamentally new theories of semantic information, biological information, social information, and phenomenological information—each with mathematics and frameworks suited to their unique properties.
The challenge isn't applying Shannon information more broadly—it's developing entirely new information theories for domains where Shannon's elegant framework was never designed to go.