Genius Makers: The Mavericks Who Brought A.I. to Google, Facebook, and the World
7%
As Rosenblatt explained it, the machine had learned this skill on its own, thanks to a mathematical system modeled on the human brain. He called it a Perceptron. In the future, he said, this system would learn to recognize printed letters, handwritten words, spoken commands, and even people’s faces, before calling out their names. It would translate one language into another. And in theory, he added, it could clone itself on an assembly line, explore distant planets, and cross the line from computation into sentience.
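The learning rule behind Rosenblatt’s machine is simple enough to sketch in a few lines. Below is a minimal rendering in Python with NumPy; the task (logical AND), learning rate, and epoch count are illustrative, not drawn from the book.

    # Rosenblatt-style perceptron: a single layer of weights, adjusted
    # only when the machine guesses wrong. Data and settings are illustrative.
    import numpy as np

    def train_perceptron(X, y, epochs=10, lr=0.1):
        w = np.zeros(X.shape[1])  # weights start at zero
        b = 0.0                   # bias term
        for _ in range(epochs):
            for x, target in zip(X, y):
                prediction = 1 if np.dot(w, x) + b > 0 else -1
                if prediction != target:      # learn only from mistakes
                    w += lr * target * x      # nudge weights toward the target
                    b += lr * target
        return w, b

    # Learn logical AND, a linearly separable task the rule can master
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([-1, -1, -1, 1])
    w, b = train_perceptron(X, y)

Because the rule adjusts the weights only when a guess is wrong, no programmer has to set them by hand, which is what Rosenblatt meant when he said the machine learned the skill on its own.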
7%
He attended Bronx Science, the elite public high school that eventually produced eight Nobel laureates, six Pulitzer Prize winners, eight National Medal of Science winners, and three recipients of the Turing Award, the world’s top computer science prize.
Krishna Chaitanya Venkata
The same network effect here too
8%
When the reporter asked if there was anything the Perceptron was not capable of, Rosenblatt threw up his hands. “Love. Hope. Despair. Human nature, in short,” he said. “If we don’t understand the human sex drive, how should we expect a machine to?”
8%
A Dartmouth professor named John McCarthy had urged the wider academic community to explore an area of research he called “automata studies,” but that didn’t mean much to anyone else. So he recast it as “artificial intelligence,” and that summer, he organized a conference alongside several like-minded academics and other researchers. The agenda at the Dartmouth Summer Research Project on Artificial Intelligence included “neuron nets,” but also “automatic computers,” “abstractions,” and “self-improvement.” Those who attended the conference would lead the movement into the 1960s, most …
9%
But when Munson finished the lecture and took questions from the floor, Minsky made himself heard. “How can an intelligent young man like you,” he asked, “waste your time with something like this?”
10%
Although neural networks had fallen from favor in the wake of Minsky’s book on the Perceptron, Hinton, then a computer science professor at Carnegie Mellon University in Pittsburgh, had kept the faith, building the Boltzmann Machine in collaboration with a researcher named Terry Sejnowski, a neuroscientist at Johns Hopkins in Baltimore. They were part of what one contemporary later called “the neural network underground.” The rest of the AI movement was focused on symbolic methods, including the Cyc project under way in Texas. Hinton and Sejnowski, in contrast, believed the future still lay in …
11%
The year Hinton entered the University of Edinburgh, 1971, the British government commissioned a study on the progress of artificial intelligence. It proved to be damning. “Most workers in AI research and in related fields confess to a pronounced feeling of disappointment in what has been achieved in the past twenty-five years,” the report said. “In no part of the field have the discoveries made so far produced the major impact that was then promised.” So the government cut funding across the field, ushering in what researchers would later call an “AI winter.”
11%
By the time Hinton was finishing his thesis, his research was on the fringes of a shrinking field. Then his father died. “The old bastard died before I was successful,” Hinton says. “Not only that, he got a cancer with a high genetic linkage. The last thing he did was increase my chances of dying.”
12%
Rumelhart had set himself a very particular, but central, challenge. One of the great problems with building a multilayered neural network was that it was very difficult to determine the relative importance (“the weight”) of each neuron to the calculation as a whole. With a single-layer network, like the Perceptron, this was at least doable: The system could automatically set its own weights across its single layer of neurons. But with a multilayered network, such an approach simply didn’t work. The relationships between the neurons were too expansive and too complex. Changing the weight of …
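The solution Rumelhart eventually published with Hinton and Ronald Williams was backpropagation: run the network forward, then use the chain rule to push the error backward through the layers, so that every weight receives its own gradient. A minimal sketch on a tiny two-layer network follows; the shapes, data, and learning rate are illustrative.

    # Backpropagation on a two-layer network: the chain rule assigns each
    # weight its share of the blame for the output error. Illustrative data.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(8, 3))          # 8 examples, 3 inputs each
    y = rng.normal(size=(8, 1))          # regression targets
    W1 = rng.normal(size=(3, 4)) * 0.1   # input -> hidden weights
    W2 = rng.normal(size=(4, 1)) * 0.1   # hidden -> output weights
    lr = 0.1

    for step in range(100):
        h = np.tanh(X @ W1)                  # forward pass: hidden layer
        out = h @ W2                         # forward pass: output
        d_out = 2 * (out - y) / len(X)       # gradient of the squared error
        dW2 = h.T @ d_out                    # blame for the output weights
        d_h = (d_out @ W2.T) * (1 - h ** 2)  # error pushed back through tanh
        dW1 = X.T @ d_h                      # blame for the hidden weights
        W1 -= lr * dW1                       # nudge every weight downhill
        W2 -= lr * dW2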
12%
Years later, when asked to explain the Boltzmann Machine for the benefit of an ordinary person who knew little about math or science, Hinton declined to do so. This, he said, would be like Richard Feynman, the Nobel Prize–winning physicist, explaining his work in quantum electrodynamics. When anyone asked Feynman to explain the work that won him the Nobel Prize in terms the layperson could understand, he, too, would decline. “If I could explain it to the average person,” he would say, “it wouldn’t have been worth the Nobel Prize.” The Boltzmann Machine was certainly hard to explain, in part …
13%
Early one Sunday morning in 1991, ALVINN drove itself from Pittsburgh to Erie, Pennsylvania, at nearly sixty miles an hour. Two decades after Minsky and Papert published their book on the Perceptron, it did the kind of thing they said a neural network couldn’t do.
15%
His breakthrough was a variation on the neural network modeled on the visual cortex, the part of the brain that handles sight. Inspired by the work of a Japanese computer scientist named Kunihiko Fukushima, he called this a “convolutional neural network.” Just as different sections of the visual cortex process different sections of the light captured by your eyes, a convolutional neural network cut an image into squares and analyzed each one separately, finding small patterns in these squares and building them into larger patterns as information moved through its web of (faux) neurons. It was …
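The core operation is compact: one small filter slides across the image, scoring every patch for the same local pattern, and stacking such layers builds the larger patterns described above. A minimal Python/NumPy sketch, with an illustrative three-by-three edge filter:

    # One convolution: slide a small filter over the image, scoring each
    # square for a local pattern. Image and filter here are illustrative.
    import numpy as np

    def convolve2d(image, kernel):
        kh, kw = kernel.shape
        out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                patch = image[i:i + kh, j:j + kw]   # one small square
                out[i, j] = np.sum(patch * kernel)  # pattern strength here
        return out

    image = np.random.rand(28, 28)                  # stand-in for a digit scan
    vertical_edge = np.array([[1, 0, -1],
                              [1, 0, -1],
                              [1, 0, -1]])
    feature_map = convolve2d(image, vertical_edge)  # 26x26 map of matches

In a full network the filters are learned rather than hand-written, and later layers convolve over the feature maps of earlier ones, which is how small patterns combine into larger ones.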
15%
LeCun’s team built a chip for this one particular task. That meant it could handle the task at speeds well beyond the standard processors of the day: about 4 billion operations a second. This fundamental concept—silicon built specifically for neural networks—would remake the worldwide chip industry, though that moment was still two decades away.
15%
As the months passed, another chill settled over the wider world of connectionist research. Pomerleau’s truck could drive itself. Sejnowski’s NETtalk could learn to read aloud. And LeCun’s bank scanner could read handwritten checks. But it was clear the truck couldn’t deal with anything more than private roads and the straight lines of a highway. NETtalk could be dismissed as a party trick. And there were other ways of reading checks. LeCun’s convolutional neural networks didn’t work when analyzing more complex images, like photos of dogs, cats, and cars. It wasn’t clear if they ever would.
15%
during a lecture on artificial intelligence, a Stanford University computer science professor named Andrew Ng described neural networks to a roomful of graduate students. Then he added a caveat. “Yann LeCun,” he said, “is the only one who can actually get them to work.” But even LeCun was unsure of the future.
15%
Neural networks did need more computing power, but no one realized just how much they needed. As Geoff Hinton later put it: “No one ever thought to ask: ‘Suppose we need a million times more?’”
16%
The world’s natural language researchers soon overhauled their approach, embracing the kind of statistical models unveiled that afternoon at the lab outside Seattle. This was just one of many mathematical methods that spread across the larger community of AI researchers in the 1990s and on into the 2000s, with names like “random forests,” “boosted trees,” and “support vector machines.” Researchers applied some to natural language understanding, others to speech recognition and image recognition. As the progress of neural networks stagnated, many of these other methods matured and improved and …
16%
When submitting papers to conferences and journals, hoping to improve their chances of success, some researchers would replace the words “neural network” with very different language, like “function approximation” or “nonlinear regression.” Yann LeCun removed the word “neural” from the name of his most important invention. “Convolutional neural networks” became “convolutional networks.”
16%
Inside the Dalle Molle Institute for Artificial Intelligence Research, Schmidhuber and one of his students developed what they described as a neural network with short-term memory. It could “remember” data it had recently analyzed and, using this recall, improve its analysis each step along the way. They called it an LSTM, for Long Short-Term Memory. It didn’t actually do much, but Schmidhuber believed this kind of technology would deliver intelligence in the years to come.
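In rough outline, each LSTM step passes a memory cell forward through time, and learned “gates” decide how much of the old memory to keep, how much new information to store, and how much to reveal as output. A minimal single-step sketch in Python/NumPy, with illustrative sizes and random weights (bias terms omitted for brevity):

    # One LSTM step: gates control what the network forgets, stores, and emits.
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def lstm_step(x, h, c, W):
        z = np.concatenate([x, h])       # current input plus previous output
        f = sigmoid(W["f"] @ z)          # forget gate: keep how much old memory?
        i = sigmoid(W["i"] @ z)          # input gate: store how much new info?
        o = sigmoid(W["o"] @ z)          # output gate: reveal how much memory?
        g = np.tanh(W["g"] @ z)          # candidate memory content
        c_new = f * c + i * g            # memory carried to the next step
        h_new = o * np.tanh(c_new)       # the step's output
        return h_new, c_new

    n_in, n_hid = 4, 8
    rng = np.random.default_rng(0)
    W = {k: rng.normal(size=(n_hid, n_in + n_hid)) * 0.1 for k in "fiog"}
    h, c = np.zeros(n_hid), np.zeros(n_hid)
    for x in rng.normal(size=(5, n_in)):  # walk through a short sequence
        h, c = lstm_step(x, h, c, W)

The cell state is what lets the network “remember” what it saw earlier in a sequence and use that recall at each later step.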
17%
A researcher named Aapo Hyvärinen once published an academic paper with an acknowledgment that summed up both Hinton’s sense of humor and his belief in ideas over mathematics: “The basic idea in this paper was developed in discussions with Geoffrey Hinton, who, however, preferred not to be a coauthor because the paper contained too many equations.”
17%
Another student, George Dahl, noticed a similar effect across the wider world of machine learning research. Every time he identified an important research paper—or an important researcher—there was a direct connection to Hinton. “I don’t know whether Geoff picks people that end up being successful or he somehow makes them successful. Having experienced it, I think it’s the latter,” Dahl says.
17%
When Hinton gave a lecture at the annual NIPS conference, then held in Vancouver, on his sixtieth birthday, the phrase “deep learning” appeared in the title for the first time. It was a cunning piece of rebranding. The term referred to the multiple layers of a neural network, and there was nothing new about the idea. But it was an evocative name, designed to galvanize research in an area that had once again fallen from favor. He knew the name was a good one when, in the middle of the lecture, he said everyone else was doing “shallow learning,” and his audience let out a laugh. In the long term, it …
18%
“Are you the devil?” Sejnowski asked. Minsky waved the question aside, explaining the many limitations of neural networks and pointing out, rightly, that they had never done what they were supposed to do. So Sejnowski asked again: “Are you the devil?” Exasperated, Minsky finally answered: “Yes, I am.”
19%
This was no more than a two-person project, but they had trouble getting down to work. Hinton needed a password to log in to Microsoft’s computer network, and the only way to get a password was over a company phone, which required its own password. They sent countless email messages trying to get a password for a phone, and when that didn’t work, Deng walked Hinton up to the tech-support desk on the fourth floor. Microsoft had a special rule that allowed a temporary network password if someone was visiting for only a day, and the woman sitting at the desk gave them one. But when Hinton asked …
Krishna Chaitanya Venkata
Haha big corp bureaucracy
19%
In Toronto, Hinton made use of a very particular kind of computer chip called a GPU, or graphics processing unit. Silicon Valley chip makers like Nvidia originally designed these chips as a way of quickly rendering graphics for popular video games like Halo and Grand Theft Auto, but somewhere along the way, deep learning researchers realized GPUs were equally adept at running the math that underpinned neural networks.
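The math in question is mostly very large matrix multiplications, exactly the kind of uniform, parallel arithmetic a graphics chip spreads across thousands of cores at once. A minimal sketch, assuming PyTorch and a CUDA-capable GPU are available; the matrix sizes are illustrative:

    # The same layer-sized multiplication on CPU and GPU. On typical
    # hardware, the GPU version runs orders of magnitude faster.
    import torch

    a = torch.randn(4096, 4096)
    b = torch.randn(4096, 4096)

    c_cpu = a @ b                    # one layer's worth of math on the CPU

    if torch.cuda.is_available():    # same math on the graphics chip
        c_gpu = a.cuda() @ b.cuda()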
19%
It was an inflection point in the long history of artificial intelligence. In a matter of months, a professor and his two graduate students matched a system that one of the world’s largest companies had worked on for more than a decade. “He is a genius,” Deng says. “He knows how to create one impact after another.”
21%
Though he had once told a roomful of students that Yann LeCun was the only person on Earth who could coax something useful from a neural network, he moved with the tide as he saw it turn. “He was one of the few people doing other work who switched over to neural nets because he realized what was happening,” Hinton says.
21%
As much as he was shaped by the work of Geoff Hinton, he was also the product of a 2004 book titled On Intelligence, written by a Silicon Valley engineer, entrepreneur, and self-taught neuroscientist named Jeff Hawkins.
21%
In the days that followed his Japanese lunch with Larry Page, as he typed up a formal pitch for the Google founder, this became a pillar of his proposal. He told Page that deep learning would not only provide image recognition and machine translation and natural language understanding, but would also push machines toward true intelligence. Before the year was out, the project was approved. It was called Project Marvin, in a nod to Marvin Minsky. Any irony was unintended.
21%
Ng also met with the heads of Google’s image search and video search services, and they turned him down, too. He didn’t really find a collaborator until he and Jeff Dean walked into the same microkitchen, the very Googley term for the communal spaces spread across its campus where its employees could find snacks, drinks, utensils, microwave ovens, and maybe even a little conversation. Dean was a Google legend.
22%
Google’s early success is often attributed to PageRank, the search algorithm developed by Larry Page while he and his cofounder, Sergey Brin, were graduate students at Stanford. But the slim, square-jawed, classically handsome Dean, who spoke with a polite shyness and a slight lisp, was just as important to the company’s rapid rise—if not more so. He and a handful of other engineers built the sweeping software systems that underpinned the Google search engine, systems that ran across thousands of computer servers and multiple data centers, allowing PageRank to instantly serve millions of …
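PageRank itself is compact enough to sketch: a page’s rank is the damped sum of the rank passed along by the pages that link to it, computed by repeated iteration. A minimal Python/NumPy version on an illustrative four-page web; Dean’s contribution was the machinery that ran this kind of computation, and everything around it, across the whole web:

    # PageRank by power iteration on a toy web. links[i] lists the pages
    # that page i links out to; every page here has at least one out-link.
    import numpy as np

    def pagerank(links, damping=0.85, iters=50):
        n = len(links)
        rank = np.full(n, 1.0 / n)
        for _ in range(iters):
            new_rank = np.full(n, (1.0 - damping) / n)   # random-jump share
            for page, outlinks in enumerate(links):
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share            # rank flows along links
            rank = new_rank
        return rank

    links = [[1, 2], [2], [0], [0, 2]]   # a four-page web, purely illustrative
    print(pagerank(links))               # heavily linked-to pages rank highest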
22%
But when Google approached him in the spring of 2012, he had no interest in leaving the University of Toronto. He was a sixty-four-year-old tenured professor overseeing a long line of graduate students and postdocs. So he merely agreed to spend the summer at the new lab. Owing to the idiosyncrasies of Google’s employment rules, the company brought him in as an intern, alongside dozens of college students on summer break. He felt like an oddity during orientation week, when he seemed to be the only one who didn’t know that an LDAP was a way of logging in to the Google computer network.
23%
During Hinton’s summer internship, one project ran into a cap Google had placed on available computing power. So the researchers told Jeff Dean, and he ordered up another $2 million worth. He had built the Google infrastructure, and that meant he could use it as he saw fit. “He created a kind of canopy where the Brain team could operate, and we didn’t have to worry about anything else,” Hinton says. “If you needed something, you asked Jeff and he got it.” What was odd about Dean, Hinton thought, was that unlike most people so intelligent and so powerful, he wasn’t driven by ego. He was always …
23%
When Jeff Dean and his team unveiled their methods at one of the big AI conferences, Ian Goodfellow, still a student at the University of Montreal, stood up from his seat in the audience and chided them for not using GPUs—though he would soon regret that he had so blithely and so publicly criticized Jeff Dean. “I had no idea who he was,” Goodfellow says. “Now I kind of worship him.”
23%
When the door opened, he immediately asked, in his clipped Eastern European accent, if he could join Hinton’s deep learning lab. “Why don’t you make an appointment and we could talk about it?” Hinton said. “Okay,” Sutskever said. “How about now?” So Hinton invited him in. Sutskever was a mathematics student, and in those few minutes, he seemed like a sharp one. Hinton gave him a copy of the backpropagation paper—the paper that had finally revealed the potential of deep neural networks twenty-five years earlier—and told him to come back once he’d read it. Sutskever returned a few days later. “I …
24%
The University of Toronto, Hinton liked to say, didn’t even have to pay for the electricity. Each week, Krizhevsky would start the training, and with each passing hour, on the computer screen in his bedroom, he could watch its progress—a black screen filled with white numbers counting upward. At the end of the week, he would test the system on a new set of images. It would fall short of the goal, so then he would hone the GPU code and adjust the weights of the neurons and train for another week. And another. And another. Each week, too, Hinton would oversee a gathering of the students in his …
24%
Hinton and his students had used a modified version of LeCun’s creation from the late 1980s: the convolutional neural network. But for some students in LeCun’s lab, it was also a disappointment. After Hinton and his students published the AlexNet paper, LeCun’s students felt a deep sense of regret descend on their own lab—a sense that after thirty years of struggle, they had stumbled at the last hurdle. “Toronto students are faster than NYU students,” LeCun told Efros and Malik as they discussed the paper later that night.
24%
“It was just asking too much to believe that if you started with random weights and you had lots of data and you followed the gradient, you would create all these wonderful representations. Forget it. Wishful thinking.”
24%
The AlexNet paper would become one of the most influential papers in the history of computer science, with over sixty thousand citations from other scientists.
24%
AlexNet was a turning point not just for deep learning but for the worldwide technology industry. It showed that neural networks could succeed in multiple areas—not just speech recognition—and that GPUs were essential to this success. It shifted both the software and hardware markets. Baidu recognized the importance, after the deep learning researcher Kai Yu explained the moment to the CEO, Robin Li. So did Microsoft, after Li Deng won the support of an executive vice president named Qi Lu. And so did Google.
24%
It was at this pivotal moment that Hinton created DNNresearch, the company they would auction off in a Lake Tahoe hotel room that December for $44 million. When it came time to split the proceeds, the plan had always been to divide the money equally among the three of them. But at one point, the two graduate students told Hinton he deserved a larger share: 40 percent. “You’re giving away an awful lot of money,” he told them. “Go back to your rooms and sleep on it.” When they returned the next morning, they insisted he take the larger share. “It tells you what kind of people they are,” Hinton …
25%
Deep learning was about to change the industry, Page told his lieutenants, and Google needed to get there first. “Let’s really go big,” he said. Eustace was the only one in the room who really knew what he was talking about. “They all stepped back,” Eustace remembers. “I didn’t.” In that moment, Page gave Eustace free rein to secure any and all of the leading researchers in what was still a tiny field, potentially hundreds of new hires. He had already landed Hinton, Sutskever, and Krizhevsky from the University of Toronto. Now, in the last days of December 2013, he was flying to London in …
25%
Eustace was not just an engineer. A trim, straight-backed man who wore rimless glasses, he was also a pilot, skydiver, and all-around thrill seeker who choreographed each new thrill with the same cold rationality he applied to building a computer chip. He would soon set a world record when he donned a pressure suit and jumped from a balloon floating in the stratosphere, twenty-five miles above the Earth. Just recently, he and several other skydivers had parachuted from a Gulfstream jet—something no one else had ever done—and this gave him an idea. Before any of them could make the leap, …
25%
“He’s got three things,” Hinton says. “He’s very bright, he’s very competitive, and he’s very good at social interactions. That’s a dangerous combination.”
26%
The diary seemed to look beyond Elixir, toward his next venture. The first entry began with him sitting in a plush chair at home, listening to the musical score from Blade Runner (track twelve, “Tears in Rain,” on continuous repeat). Much as Stanley Kubrick inspired a young Yann LeCun in the late ’60s, Ridley Scott had captured the imagination of a young Hassabis in the early ’80s with this latter-day sci-fi classic, in which a scientist and his imperious corporation build machines that behave like humans.
26%
Superintelligence, he had said in his thesis, could bring unprecedented wealth and opportunity—or lead to a “nightmare scenario” that threatened the very existence of humankind. Even if there was only a tiny possibility of building superintelligence, he believed, researchers had to consider the consequences. “If one accepts that the impact of truly intelligent machines is likely to be profound, and that there is at least a small probability of this happening in the foreseeable future, it is only prudent to try to prepare for this in advance. If we wait until it seems very likely that …
27%
They approached DeepMind’s most important investor even before the company was founded. In recent years, Legg had joined an annual gathering of futurists called the Singularity Summit. “The Singularity” is the (theoretical) moment when technology improves to the point where it can no longer be controlled by the human race. The founders of this tiny conference belonged to an eclectic group of fringe academics, entrepreneurs, and hangers-on who believed this moment was on the way. They were intent on exploring not only artificial intelligence but life-extension technologies and stem-cell …
27%
DeepMind snowballed from there. Hassabis and Legg enlisted both Hinton and LeCun as technical advisors, and the start-up quickly hired many of the field’s up-and-coming researchers, including Vlad Mnih, who studied with Hinton in Toronto; a Turkish-born researcher, Koray Kavukcuoglu, who worked under LeCun in New York; and Alex Graves, who studied under Jürgen Schmidhuber in Switzerland before a postdoc with Hinton. As they’d told Peter Thiel, the starting point was games.
28%
In January, Google announced it was acquiring DeepMind, a fifty-person company, for $650 million. It was another photo finish. Facebook had also bid for the London lab, offering each DeepMind founder twice as much money as they made from the sale to Google.
29%
This is how the tech industry works. The largest companies are locked in a never-ending race toward the next transformative technology, whatever that might be. Each is intent on getting there first, and if someone beats them to it, then they are under even more pressure to get there, too, without delay.