Wittgenstein’s Legacy:
Probability, Logic, and Language
in the Age of AI
Elizabeth Rohwer
erohwer@san.rr.com
On April 27, 1951, two days before his death, Wittgenstein was investigating a topic that had interested him during the last year and a half of his life. “We might speak of fundamental principles of human enquiry,” begins the last entry in his diary, and as an illustration of what these principles might be, he presents us with one of his hallmark philosophical puzzles. Suppose, he says, I fly to a desert island where people have never seen airplanes and “have only indefinite information, or none at all, about the possibility of flying.” How do I explain to them the way I got there? A collection of his reflections on the theme was posthumously published in a book entitled On Certainty.
Paradoxically, the opposite idea, the notion of uncertainty, proved easier to pin down and measure numerically. E. T. Jaynes, an American theoretical physicist, clarified the concept in 1956, five years after Wittgenstein’s death, in his seminal paper “How Does the Brain Do Plausible Reasoning?” Jaynes’ contribution was theoretically sound and is now considered an important breakthrough whose repercussions have led to the advancement of modern AI. However, his idea that logical reasoning can be formally expressed using probabilities sounded so far-fetched that the physics journal to which he submitted the paper refused to publish it. In it, Jaynes proves that the Maximum Entropy law (MaxEnt), which was discovered by Boltzmann in the late 19th century to quantify the Second Law of thermodynamics, is applicable to another domain of physics, namely the field of information exchange.
In 1948, Claude Shannon formulated the mathematical theory of communication, defining communication as the transmission of information from a source to a receiver. Notably, when the means of communication is natural language, the transmission follows the natural direction of time, which inevitably runs from a speaker to a listener. Shannon used probability to quantify the loss of information in the communication channel and came up with the term information entropy, as his measure, surprisingly, differed from Boltzmann's entropy just by a minus sign.
Jaynes discovered that the mathematical formalism underpinning the correct interpretation of probability is applicable to both fields: thermodynamics and information exchange, and demonstrated that both Boltzmann's and Shannon's entropy measures in fact quantify the same concept: uncertainty. This finding explains Wittgenstein's philosophical puzzle. In his traveler's metaphorical struggle to convey information about airplanes to people on a remote island who had only witnessed birds flying, Wittgenstein illustrates the challenges of information transfer in situations where the source and receiver (speaker and listener) have little in common. His story highlights the inherently probabilistic nature of language communication, which often involves varying degrees of information loss and is thus plagued by uncertainty.
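The claim that entropy quantifies uncertainty can be made concrete with a small numerical sketch. The Python below is an illustration only, not taken from Jaynes or Shannon: the function name `shannon_entropy` and the three example distributions are assumptions chosen for this essay. It computes Shannon's measure H = sum of p * log(1/p) and shows that a fully certain outcome gives zero entropy, while four equally likely outcomes give the maximum for that case, 2 bits. The last lines use the standard textbook relation that, for W equally likely microstates, Boltzmann's S = k ln W equals k times the Shannon entropy measured in natural units.

```python
import math

# Boltzmann constant in J/K (exact value in the 2019 SI definition).
K_B = 1.380649e-23


def shannon_entropy(probs, base=2.0):
    """Return H = sum(p * log(1/p)) for a discrete distribution.

    Hypothetical helper for illustration. With base=2 the result is in
    bits; with base=math.e it is in nats. Zero-probability outcomes
    contribute nothing to the sum.
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    nats = sum(p * math.log(1.0 / p) for p in probs if p > 0)
    return nats / math.log(base)


# Three illustrative distributions over four possible messages.
certain = [1.0, 0.0, 0.0, 0.0]      # the listener already knows the message
skewed = [0.7, 0.1, 0.1, 0.1]       # one message is much more likely
uniform = [0.25, 0.25, 0.25, 0.25]  # nothing is known: maximum uncertainty

for name, dist in [("certain", certain), ("skewed", skewed), ("uniform", uniform)]:
    print(f"{name}: {shannon_entropy(dist):.3f} bits")

# For W equally likely microstates, H in nats equals ln W,
# so Boltzmann's S = k ln W is k times the Shannon entropy.
W = 8
h_nats = shannon_entropy([1.0 / W] * W, base=math.e)
print(f"S for W={W} microstates: {K_B * h_nats:.3e} J/K")
```

The ordering of the three results, from certain to skewed to uniform, is the point of the sketch: the less the receiver can anticipate, the higher the entropy.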
The challenges of understanding uncertainty came to the fore when Boltzmann, in a stroke of genius, applied the mathematical framework of probability theory to the study of thermodynamics. His entropy formula, now immortalized on his tombstone, proved to be effective in practice. However, at the time, its underlying rationale eluded comprehension, even for Boltzmann himself. This lack of understanding was so profound that in 1906, the year in which young Wittgenstein had planned to study under him, Boltzmann tragically took his own life. He could not withstand the criticisms of his peers, whose views were steeped in the classical laws of Newtonian mechanics, which are deterministic and time-reversible but were not applicable to the phenomena of thermodynamics.
Boltzmann’s MaxEnt law continued to intrigue the scientific community when, a decade later, Wittgenstein embarked on the creation of his enigmatic philosophical masterpiece, Tractatus Logico-Philosophicus, the only philosophical book he published during his lifetime. Though an inspiration to many, the book to this day lacks a comprehensive, logically consistent, and generally agreed-upon interpretation. I believe that is because Wittgenstein was ahead of his time.
He was trained as an aeronautical engineer and held a patent for the design of an advanced turbo-propeller. This highly specialized, technical background suggests familiarity with Boltzmann’s statistical approach to thermodynamics. I believe that Wittgenstein had the correct intuition about the general applicability of Boltzmann’s MaxEnt law and was the first to apply it to a different domain, in his philosophical investigations into the nature of logic and language. How else can one explain why he devoted the longest section of his book—section five—to the use of probability in the formal analysis of logic and language?
What elevates Boltzmann’s MaxEnt law to a fundamental principle in physics is that the computational framework introduced by Boltzmann takes into account the direction of time. Unlike Newton's classical laws, this probabilistic framework behaves differently when applied to the forward or backward directions of time. Since speech is an irreversible process (what is said cannot be unsaid) and inevitably runs from a speaker to a listener, i.e., follows the direction of time, Wittgenstein’s insightful application of probability to the realm of language communication stands as a pioneering achievement.
Deciphering the Tractatus has been challenging, as it took over a century for the scientific community to adopt the probabilistic perspective. The new direction in understanding probability has been a result of the research efforts of numerous physicists and mathematicians. The list starts with Frank Ramsey, the mathematician who translated the Tractatus into English under Wittgenstein’s supervision and benefitted from his explanations in direct discussions with him. Ramsey’s insights into key concepts have contributed to the advancement of the mathematical theory of probability. These include the clarification of the notion of inverse probability and the equation of the concept of subjective probability with expected utility.
Jaynes’ research dotted the i’s and crossed the t’s of the modern interpretation of probability. It came to a close in 2003 with the publication of his posthumous book “Probability Theory: The Logic of Science,” whose topic is the optimal processing of incomplete information. Jaynes aimed to clarify the normative function of logic, the emergence of meaning from the statistics of natural language use, by treating probability as an extension of Aristotelian logic, which he calls plausible logic. This approach elucidates how people form groups united by their subjective understanding of word meaning, as epitomized by Wittgenstein’s example of an encounter between societies with different concepts of “flying.”
This new way of understanding and working with probabilities is one of the contributing factors to the success of modern AI. It has led to the development of an arsenal of computational techniques, culminating in today’s Large Language Models, which are able to operate as human-like agents that communicate with us in natural language and perform complex reasoning tasks. Recent experiments with artificial communities of communicating Large Language Models have gone so far as to demonstrate the spontaneous formation of groups differentiated by their subjective understanding of language, vividly illustrating Wittgenstein’s intuitions.
“If someone is merely ahead of his time, it will catch him up someday,” Wittgenstein rightly predicted.