History of artificial intelligence
Based on Wikipedia: History of artificial intelligence
The Dream That Took Three Thousand Years
In the summer of 1956, ten scientists gathered at Dartmouth College with an audacious claim: every aspect of human intelligence could be precisely described and then replicated in a machine. They gave themselves two months. They expected to make substantial progress.
They were wildly wrong about the timeline. But they were right about almost everything else.
What those researchers couldn't have known—what nobody could have predicted—was that they were continuing a project humanity had been working on for millennia. Long before anyone wrote a line of code, before the first transistor was etched into silicon, people had been dreaming of artificial minds. The question of whether we could build thinking machines is as old as civilization itself.
Gods and Golems: The Mechanical Dreams of Antiquity
The Greeks imagined Talos, a giant made entirely of bronze, patrolling the shores of Crete to protect the island from invaders. This wasn't just mythology—it was a thought experiment. What would it mean for something made of metal to think, to decide, to act with purpose?
During the Islamic Golden Age, the alchemist Jabir ibn Hayyan pursued what was called Takwin—the artificial creation of life. Whether he meant this literally or metaphorically remains debated, but the ambition itself is revealing. The desire to understand intelligence by recreating it runs deep in human history.
Jewish folklore gave us the Golem, a being sculpted from clay and animated by inscribing one of God's names on paper and placing it in the creature's mouth. The word itself comes from a Hebrew term meaning "shapeless mass." Here was an early intuition about artificial intelligence that modern researchers would eventually confirm: the raw material matters less than the pattern, the information, the word.
By the sixteenth century, the Swiss alchemist Paracelsus had written detailed instructions for creating a homunculus—a miniature artificial human. His recipe was fanciful, involving horse manure and human blood, but the underlying question was serious: Could the processes that create natural intelligence be understood and reproduced?
The Logic Machine
While alchemists searched for the spark of artificial life in chemistry, philosophers were pursuing a different path entirely. They wanted to understand thought itself—to reduce reasoning to rules so precise that even a machine could follow them.
This project began in earnest with the Spanish philosopher Ramon Llull in the thirteenth century. Llull built what he called "logical machines"—physical devices with rotating disks that could combine concepts mechanically. His idea was that if you could identify basic, undeniable truths and simple logical operations, you could generate all possible knowledge through systematic combination.
It sounds almost naive. But Llull had grasped something essential: reasoning might be mechanical. Not in the sense of being soulless or unimaginative, but in the sense of following patterns that could be specified exactly.
Four centuries later, Gottfried Wilhelm Leibniz picked up where Llull left off. Along with Thomas Hobbes and René Descartes, Leibniz explored whether all rational thought could be made as systematic as algebra. Hobbes put it memorably in his masterwork Leviathan: "For reason is nothing but reckoning, that is adding and subtracting."
Leibniz went further. He imagined something he called the characteristica universalis—a universal language of reasoning that would reduce every argument to calculation. He wrote:
"There would be no more need of disputation between two philosophers than between two accountants. For it would suffice to take their pencils in hand, down to their slates, and to say to each other: Let us calculate."
This was an extraordinary vision. Leibniz was suggesting that disagreements about truth could be resolved the same way we resolve disagreements about arithmetic—by checking the work. No rhetoric, no persuasion, no appeals to authority. Just calculation.
He couldn't build it. The mathematics didn't exist yet. But he had described, with remarkable precision, what a computer program would eventually be.
The Mathematics of Mind
The nineteenth and early twentieth centuries finally delivered the mathematical foundations Leibniz had lacked.
George Boole showed that logical reasoning could be expressed as algebra. His 1854 work, titled with characteristic ambition The Laws of Thought, demonstrated that the operations of the mind—at least the logical ones—could be captured in equations. True and false became 1 and 0. "And" became multiplication. "Or" became addition. Suddenly, thinking looked like arithmetic.
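To see the point in modern terms, here is a small illustrative sketch in Python (not Boole's own notation) that treats true and false as 1 and 0 and writes the logical operations as ordinary arithmetic; "or" needs a small correction term so that the result stays within 0 and 1:

```python
# Illustrative sketch: Boolean logic as arithmetic over {0, 1}.
# AND is multiplication; OR is written as x + y - x*y so that 1 OR 1 is still 1;
# NOT is 1 - x.

def AND(x, y):
    return x * y

def OR(x, y):
    return x + y - x * y

def NOT(x):
    return 1 - x

for x in (0, 1):
    for y in (0, 1):
        print(x, y, "AND:", AND(x, y), "OR:", OR(x, y))
```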
Gottlob Frege developed a formal system for expressing mathematical proofs with unprecedented rigor. Bertrand Russell and Alfred North Whitehead extended this work in their monumental Principia Mathematica, attempting to derive all of mathematics from pure logic. The project eventually foundered—Kurt Gödel proved that any consistent formal system powerful enough to express arithmetic would contain true statements it couldn't prove—but it established something equally important: the limits of formal reasoning could themselves be precisely defined.
Then came Alan Turing.
In 1936, Turing published a paper that was ostensibly about a technical problem in mathematical logic. But hidden within it was something much more profound: a complete theory of computation. Turing imagined an abstract machine—now called a Turing machine—consisting of nothing more than an infinite tape divided into squares, a head that could read and write symbols, and a set of rules for what to do next.
This simple device, Turing proved, could compute anything that could be computed at all. Every spreadsheet, every video game, every neural network, every large language model—all of them are, mathematically speaking, just elaborate Turing machines. The key insight was that symbol manipulation following rules was all you needed. If thinking was a kind of computation, then thinking could be mechanized.
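The idea is concrete enough to sketch directly. The Python below is a minimal, illustrative simulator—just a tape, a head, and a rule table; the rules shown are invented for the example and simply flip every bit, halting at the first blank:

```python
# A minimal, illustrative Turing machine simulator: a tape of symbols, a head
# position, a current state, and a rule table mapping (state, symbol) to
# (next state, symbol to write, direction to move).

def run(tape, rules, state="start", blank="_"):
    tape = list(tape)
    head = 0
    while state != "halt":
        if head == len(tape):          # extend the tape with blanks as needed
            tape.append(blank)
        symbol = tape[head]
        state, write, move = rules[(state, symbol)]
        tape[head] = write
        head += 1 if move == "R" else -1
    return "".join(tape)

# Example rule table (invented for illustration): flip every bit, halt at blank.
flip_bits = {
    ("start", "0"): ("start", "1", "R"),
    ("start", "1"): ("start", "0", "R"),
    ("start", "_"): ("halt", "_", "R"),
}

print(run("01101", flip_bits))  # prints "10010_"
```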
The Electric Brain
While mathematicians were proving that computation was universal, biologists were discovering that the brain was computational.
In the eighteenth and nineteenth centuries, researchers like Luigi Galvani demonstrated that nerves carried electrical signals. Hermann von Helmholtz measured the speed of these signals—not instantaneous, as people had assumed, but a measurable fraction of the speed of sound. The brain was not some mystical organ operating by unknown principles. It was a network of electrical impulses.
By the 1840s, the physician Robert Bentley Todd had proposed that the brain functioned as an electrical network. And by the early twentieth century, Santiago Ramón y Cajal had established the neuron doctrine: the brain was built from discrete cells called neurons, connected together in vast networks. Ramón y Cajal was staggered by the implications: "The truly amazing conclusion is that a collection of simple cells can lead to thought, action, and consciousness."
Simple cells. That was the key phrase. Each neuron was doing something relatively basic—receiving signals, processing them, sending signals onward. The magic emerged from the connections, the network, the pattern.
In 1943, Warren McCulloch and Walter Pitts published a paper that fused these insights with Turing's mathematical framework. They showed that networks of simplified artificial neurons—each one essentially a switch that could be either on or off—could perform logical operations. This was the first formal model of what we now call neural networks.
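The model itself takes only a few lines to reproduce. Here is a minimal, illustrative McCulloch-Pitts unit in Python—binary inputs, fixed weights, and a threshold—with the weights chosen by hand to mimic simple logic gates:

```python
# Illustrative McCulloch-Pitts unit: it "fires" (outputs 1) when the weighted
# sum of its binary inputs reaches a threshold. The weights and thresholds
# below are hand-picked to reproduce logic gates.

def unit(inputs, weights, threshold):
    return int(sum(w * x for w, x in zip(weights, inputs)) >= threshold)

def AND(x, y):
    return unit([x, y], weights=[1, 1], threshold=2)

def OR(x, y):
    return unit([x, y], weights=[1, 1], threshold=1)

def NOT(x):
    return unit([x], weights=[-1], threshold=0)

for x in (0, 1):
    for y in (0, 1):
        print(x, y, "AND:", AND(x, y), "OR:", OR(x, y), "NOT x:", NOT(x))
```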
The paper directly influenced a young graduate student named Marvin Minsky. In 1951, Minsky and his colleague Dean Edmonds built the first neural network machine, called SNARC. It used 3,000 vacuum tubes and an automatic pilot mechanism from a B-24 bomber to simulate 40 neurons. The machine could learn. Minsky would go on to become one of the founding fathers of artificial intelligence.
The Second World War and the Birth of the Computer
The theoretical groundwork was in place. The biological inspiration was clear. What was missing was the machine itself.
World War Two provided the impetus and the funding to build it.
Across the combatant nations, engineers raced to construct machines that could perform calculations faster than any human. Konrad Zuse built his Z3 in Nazi Germany. Tommy Flowers created Colossus in Britain to break German codes. In the United States, ENIAC—the Electronic Numerical Integrator and Computer—filled an entire room with 18,000 vacuum tubes.
These machines were built for military calculations: ballistic tables, cryptanalysis, logistics. But a handful of scientists immediately saw their broader potential. A machine that could manipulate numbers could manipulate any symbols. And if thinking was symbol manipulation—as the logicians had argued—then these machines might be able to think.
The ideas converged from multiple directions. Norbert Wiener's cybernetics described how systems could regulate themselves through feedback loops. Claude Shannon's information theory showed that any information could be encoded in binary digits—bits. Turing's computation theory proved that these bits could implement any computable function.
The electronic brain was no longer a metaphor. It was an engineering project.
Turing's Test
In 1950, Alan Turing published a paper that would define the field before it even had a name.
"Computing Machinery and Intelligence" opened with a deceptively simple question: Can machines think? Turing immediately acknowledged the problem: "thinking" is difficult to define. Rather than get mired in philosophical debates, he proposed a practical test.
Imagine a human interrogator communicating via text with two hidden parties—one human, one machine. If the interrogator cannot reliably distinguish which is which, Turing argued, we should accept that the machine is thinking. Not because we've proven the machine has consciousness, but because we've demonstrated that its behavior is indistinguishable from behavior we already accept as thinking.
This was a characteristically elegant move. Turing sidestepped millennia of philosophical debate about the nature of mind and replaced it with an empirical question: Can you tell the difference? If not, the philosophical distinction doesn't matter.
The Turing test has been criticized on many grounds. But it served a crucial purpose in 1950: it made the question of machine intelligence seem tractable. Not something for metaphysicians to debate endlessly, but something engineers could actually work toward.
The Birth of a Field
By the mid-1950s, the pieces were in place: the theoretical frameworks, the first computers, and a growing conviction that machine intelligence was possible. What was needed was a name and a research program.
In 1955, Allen Newell and Herbert Simon—working with programmer J.C. Shaw—created a program called the Logic Theorist. It could prove mathematical theorems from Russell and Whitehead's Principia Mathematica. Not by brute-force searching through possibilities, but by using heuristics—rules of thumb—to guide its search, much as a human mathematician would.
The Logic Theorist eventually proved 38 of the first 52 theorems in Principia. For some theorems, it found proofs more elegant than the ones Russell and Whitehead had published. Simon, who would later win the Nobel Prize in Economics, declared that they had "solved the venerable mind-body problem, explaining how a system composed of matter can have the properties of mind."
This was overconfident. But it was also thrilling. A machine had done something that, by any reasonable standard, required intelligence.
The following summer, John McCarthy and Marvin Minsky, together with Claude Shannon and Nathaniel Rochester, organized a workshop at Dartmouth College. Their proposal to the Rockefeller Foundation made a bold claim: "Every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it."
The workshop ran for two months. The attendees included nearly everyone who would shape AI research for the next generation: Newell, Simon, Shannon, Samuel, Selfridge, Solomonoff, and others. They debated everything from machine learning to natural language to the nature of creativity.
Most importantly, they gave their endeavor a name. John McCarthy proposed "Artificial Intelligence." The term stuck.
The Golden Years
The decade following Dartmouth was intoxicating. Researchers made breakthrough after breakthrough, and it seemed like human-level artificial intelligence might be just around the corner.
Arthur Samuel built a checkers program that learned from experience, eventually becoming good enough to defeat skilled amateurs. This was among the first demonstrations of what we now call machine learning. Frank Rosenblatt invented the perceptron, a type of neural network that could learn to classify patterns. The media breathlessly covered each advance.
Funding poured in. The U.S. government, eager for any technological edge in the Cold War, provided millions of dollars for AI research. Simon predicted in 1965 that "machines will be capable, within twenty years, of doing any work a man can do." Minsky agreed, predicting that the problem of creating artificial intelligence would be substantially solved within a generation.
This optimism was based on genuine achievements. But it dramatically underestimated the difficulty of what remained.
The First Winter
By the early 1970s, the limitations were becoming clear.
The programs that seemed so impressive were actually quite brittle. They worked well on carefully chosen problems but failed catastrophically when faced with anything unexpected. A chess program couldn't play checkers. A theorem prover couldn't understand a simple story. Each success was narrow, unable to generalize.
The British mathematician James Lighthill was commissioned to assess the state of AI research. His 1973 report was devastating. He argued that AI had failed to achieve its ambitious goals and was unlikely to do so. The "combinatorial explosion"—the way the number of possibilities grew exponentially with problem complexity—seemed insurmountable.
In the United States, too, Congress and the funding agencies grew frustrated by the lack of progress, and DARPA cut its support for undirected AI research dramatically. What researchers would later call the "AI winter" had begun.
Projects were cancelled. Researchers moved to other fields or rebranded their work to avoid the stigma of AI. The vision of thinking machines that had seemed so close receded into an indefinite future.
Expert Systems and the Second Boom
But AI didn't die. It transformed.
In the early 1980s, a new approach called expert systems achieved commercial success. Rather than trying to create general intelligence, expert systems focused on narrow domains. They encoded the knowledge of human experts in specific fields—medical diagnosis, chemical analysis, financial planning—as explicit rules.
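The basic pattern is easy to sketch. The toy example below (the rules are invented for illustration, not taken from any real system) stores knowledge as if-then rules and applies them repeatedly until no new conclusions appear—a crude version of the forward chaining many expert systems used:

```python
# Toy illustration of an expert system: explicit if-then rules plus a simple
# forward-chaining inference loop. The rules are invented for this example.

rules = [
    ({"fever", "cough"}, "possible flu"),
    ({"possible flu", "muscle aches"}, "recommend rest and fluids"),
    ({"rash", "itching"}, "possible allergy"),
]

def infer(facts):
    """Keep firing rules whose conditions are satisfied until nothing new is added."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(infer({"fever", "cough", "muscle aches"}))
```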
The Japanese government announced a massive initiative called the Fifth Generation Computer Project, aiming to create intelligent computers that could understand natural language. This spurred a reaction in the United States and Europe. Suddenly, AI was a matter of national competitiveness.
Investment flooded back. By the late 1980s, the AI industry was generating over a billion dollars in annual revenue. Companies built specialized hardware for AI applications. Consultants promised that expert systems would transform every industry.
But expert systems had their own limitations. Extracting knowledge from human experts was painstakingly slow. The rules were brittle—they broke when faced with situations the experts hadn't anticipated. And maintaining these systems as knowledge changed was enormously expensive.
By the early 1990s, the bubble had burst again. The Japanese Fifth Generation project failed to deliver on its promises. Companies that had invested heavily in AI pulled back. The second AI winter had arrived.
Machine Learning's Quiet Revolution
Through the boom and bust cycles, one approach continued to develop: machine learning.
The idea was simple but powerful. Rather than programming a computer with explicit rules, you would show it examples and let it learn the patterns. A spam filter didn't need rules about what made an email spam—it just needed to see thousands of examples of spam and legitimate mail, and it would figure out the patterns itself.
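The contrast with hand-written rules shows up even in miniature. The sketch below uses a tiny invented "dataset" purely for illustration: it counts which words appear in spam versus legitimate mail and scores a new message by which class its words better match—a crude cousin of the statistical filters that were actually deployed:

```python
# Illustrative learning-from-examples: no hand-written spam rules, just word
# counts gathered from labeled examples. The "dataset" is invented and tiny.
from collections import Counter

spam_examples = ["win money now", "claim your free prize now"]
ham_examples = ["lunch at noon tomorrow", "notes from the meeting"]

spam_counts = Counter(w for msg in spam_examples for w in msg.split())
ham_counts = Counter(w for msg in ham_examples for w in msg.split())

def score(message):
    """Positive score leans spam; negative leans legitimate."""
    return sum(spam_counts[w] - ham_counts[w] for w in message.split())

print(score("free prize"))      # positive: looks like spam
print(score("meeting notes"))   # negative: looks legitimate
```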
Three factors came together in the early 2000s to make machine learning practical at scale.
First, computer hardware had become exponentially more powerful. Moore's Law—the observation that the number of transistors on a chip doubled roughly every two years—had held for decades. Operations that would have taken years in the 1970s could now be done in seconds.
Second, the internet had created vast datasets. Companies like Google and Facebook accumulated billions of data points about human behavior. Every search query, every click, every purchase was potential training data for machine learning systems.
Third, researchers had developed more sophisticated mathematical techniques. Support vector machines, random forests, and improved neural network architectures made it possible to learn more complex patterns from data.
Machine learning began solving problems that had resisted decades of traditional AI research. Speech recognition improved dramatically. Image classification became increasingly accurate. Recommendation systems learned to predict what you might want to watch, read, or buy.
This was happening largely without fanfare. The word "AI" still carried the stigma of overpromising. Researchers and companies preferred terms like "machine learning," "data mining," or "predictive analytics." But they were building artificial intelligence, whether they used the name or not.
The Deep Learning Breakthrough
Neural networks—the approach inspired by the brain's architecture that Pitts, McCulloch, and Minsky had pioneered in the 1940s and 50s—had gone through their own cycles of enthusiasm and disillusionment. By the 2000s, most researchers had written them off.
A small group of true believers, including Geoffrey Hinton, Yann LeCun, and Yoshua Bengio, kept working on them anyway.
Their persistence paid off. In 2012, a neural network called AlexNet won the ImageNet competition—an annual contest in image recognition—by a stunning margin. It wasn't slightly better than the competition. It was dramatically better, reducing the error rate by more than ten percentage points.
The key was depth. Previous neural networks had used only a few layers of artificial neurons. AlexNet used eight layers—still shallow by today's standards, but deep enough to learn hierarchical representations. Early layers learned to detect edges. Later layers combined edges into shapes. The deepest layers recognized objects.
This was "deep learning," and it changed everything.
Suddenly, problems that had seemed intractable began falling. Speech recognition improved to near-human levels. Machine translation became genuinely useful. Image recognition exceeded human performance on some benchmarks. Self-driving cars began to seem possible.
The tech industry noticed. Google, Facebook, Microsoft, Amazon, and others invested billions in AI research. The field that had been left for dead was now the hottest area in technology.
The Transformer and the Age of Language Models
In 2017, a team at Google published a paper with a deceptively simple title: "Attention Is All You Need." It introduced a new neural network architecture called the transformer.
Previous approaches to language processing had handled text sequentially, one word at a time. The transformer could process entire sequences in parallel, using a mechanism called "attention" to weigh the importance of different words in relation to each other. This made it faster to train and better at capturing long-range dependencies in language.
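The core computation is compact enough to write down. The sketch below implements scaled dot-product attention, the building block the paper is named for: each position's output is a weighted average of every position's value vector, with the weights computed from how well queries match keys. The shapes and random values here are illustrative, and the full transformer adds multiple heads, learned projections, and several other components:

```python
# Illustrative scaled dot-product attention: output = softmax(Q K^T / sqrt(d)) V.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)       # how strongly each position attends to each other one
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted average of the value vectors

rng = np.random.default_rng(0)
seq_len, d = 5, 8                       # five tokens, eight-dimensional vectors
Q, K, V = (rng.normal(size=(seq_len, d)) for _ in range(3))
print(attention(Q, K, V).shape)         # (5, 8): one output vector per token
```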
The results were immediate and dramatic. Transformers quickly became the dominant architecture for natural language processing. But the real revolution came when researchers began making them larger.
Much, much larger.
OpenAI's GPT-2, released in 2019, had 1.5 billion parameters—the numerical values that determine how a neural network behaves. GPT-3, released the following year, had 175 billion. Each generation became more capable in ways that surprised even their creators.
These "large language models" could write essays, answer questions, translate languages, write code, and engage in conversations that felt startlingly human. They weren't just pattern matching in any simple sense. They seemed to understand context, to reason, to be creative.
In November 2022, OpenAI released ChatGPT, making these capabilities available to the general public through a simple chat interface. Within five days, it had a million users. Within two months, it had a hundred million—at the time, the fastest-growing consumer application in history.
The Current Moment
We are now living through what may be the most consequential period in the history of artificial intelligence.
Investment has reached unprecedented levels. Companies and governments are pouring hundreds of billions of dollars into AI development. The race to build more capable systems has become a matter of national strategy for major powers.
Large language models have been integrated into search engines, productivity software, creative tools, and countless other applications. They can write, summarize, analyze, and generate content in ways that were science fiction just a few years ago.
But this moment is also marked by uncertainty and concern.
The capabilities of these systems have grown faster than our understanding of how they work. We can observe what they do, but we often can't explain why they do it. They exhibit knowledge, attention, and creativity—but whether they truly "understand" anything remains deeply contested.
Questions about safety, bias, misinformation, and displacement of human workers have moved from academic discussions to urgent policy debates. The same technology that can tutor a student or accelerate scientific research might also be used to generate propaganda or automate surveillance.
Meanwhile, some researchers worry about scenarios that once seemed like science fiction. If AI systems continue to become more capable, could they eventually exceed human intelligence in all domains? What would that mean for humanity's future?
Three Thousand Years of Questions
The questions we're asking today would be recognizable to Ramon Llull, to Leibniz, to Turing. Can thinking be mechanized? What does it mean for something to understand? How would we know if we had created a mind?
We're closer to answering these questions than anyone in history. We have machines that can do things previously thought to require human intelligence. We have mathematical frameworks for reasoning about computation and learning. We have empirical evidence about what architectures and training methods produce capable systems.
But we're also further from answers in some ways. The more capable our AI systems become, the more mysterious they seem. The simple logic gates of early AI have been replaced by neural networks with billions of parameters whose behavior emerges from their training in ways we don't fully understand.
The dream that started with bronze giants and clay golems, that passed through logical machines and electronic brains, has arrived at a strange destination: systems that exceed human performance on many tasks while remaining fundamentally different from human minds. Not the artificial humans that fiction imagined, but something new—something we're still learning to understand.
The researchers at Dartmouth in 1956 thought they might solve artificial intelligence in a summer. Nearly seventy years later, we're still working on it. But we've come far enough to glimpse just how far there is to go.