Kindle Notes & Highlights
by Parmy Olson
Read between March 23 and March 28, 2025
At the time of writing, Google’s parent company, Alphabet Inc., had a market capitalization of $1.8 trillion. In 2020, Apple became the first publicly traded US company to hit a $2 trillion valuation, while Amazon’s and Microsoft’s market values were hovering at around $1.7 trillion and an astonishing $3 trillion, respectively. Before Apple first became a trillion-dollar company in 2018, no company had ever been so big. Yet there’s one thing that nearly all the world’s most valuable companies have in common: they are tech firms. In fact, the companies that we might normally think of as being…
Even the market dominance of tech giants is unparalleled. Before regulators broke it up in 1911, Standard Oil controlled about 90 percent of the oil business in the United States. Today, Google controls about 92 percent of the search engine market—globally. Roughly one billion people around the world run a search on Google each day. More than two billion check Facebook. And about 1.5 billion people in the world have an iPhone. No government or empire in history has touched so many people at once.
Firms like Facebook and Google use that data to conduct hypertargeted advertising, displaying ads that pique a person’s interests and fuel sophisticated recommendation algorithms. That software powers the “feeds” that people thumb through every day, making sure the content that pops up is most likely to keep them continuously scrolling. The companies are incentivized to keep us as addicted as possible to their platforms, since that generates more ad dollars. But the adverse effects are plentiful.
The other way these companies became so enormous was network effects, a seemingly magical phenomenon that every start-up founder craves. The basic idea of network effects is that the more users and customers a company has, the better its algorithms become, making it increasingly difficult for competitors to catch up and further entrenching its grip on the market.
This wouldn’t be the first time large companies had distracted the public while their businesses swelled. In the early 1970s, the plastic industry, backed by oil companies, began to promote the idea of recycling as a solution to the growing problem of plastic waste. Keep America Beautiful, for instance, was an organization founded in 1953 that ran public service campaigns encouraging consumers to recycle, and was funded in part by drinks and packaging firms. Its famous “Crying Indian” ad aired on Earth Day in 1971 and encouraged people to recycle their bottles and newspapers to help prevent
…
Recycling is not a bad thing per se. But by promoting the practice, the industry could argue that plastics weren’t inherently bad so long as they were recycled properly, which shifted the perception of responsibility from producers to consumers. Plastics companies knew that recycling on a large scale was expensive and often inef…
As she clicked through her slides, she explained that AI systems could combine their ability to recognize cars with their ability to make predictions to forecast things like voting patterns or household income. One venture capitalist there, a Tesla investor and friend of Elon Musk named Steve Jurvetson, was stunned, but not for the reasons Gebru was hoping. Think about how powerful this kind of data made Google, and the kinds of insights it could draw about different neighborhoods or towns. He was so impressed that he posted photos of Gebru’s talk to Facebook.
it’s too difficult to fix these biases, arguing that modern-day AI models are so complex that even their creators don’t understand why they make certain decisions. Deep-learning models, like neural networks, are made up of millions or billions of parameters, also known as “weights,” that act as adjusters in complex mathematical functions between connected layers. Think of the layers of a neural network as being a bit like a factory with an assembly line, where each person on the line has a certain job like painting a toy car or adding the wheels. By the end of the line, you have a toy car.
…
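To make the assembly-line analogy concrete, here is a minimal sketch in Python (mine, not the book’s) of a three-layer network: each layer is just a grid of weights, and data moves through the layers the way the toy car moves down the line. All names and sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three layers of weights ("parameters"), chosen arbitrarily here.
W1 = rng.normal(size=(4, 8))   # worker 1: 4 raw inputs  -> 8 features
W2 = rng.normal(size=(8, 8))   # worker 2: refine the 8 features
W3 = rng.normal(size=(8, 2))   # worker 3: 8 features -> 2 outputs

def forward(x):
    h = np.maximum(x @ W1, 0)  # each step: multiply by weights, keep positives
    h = np.maximum(h @ W2, 0)
    return h @ W3              # end of the line: the finished "toy car"

print(forward(rng.normal(size=4)))
```

Even this toy network has 112 weights (32 + 64 + 16); the models in question have millions or billions, which is why even their creators struggle to trace any single decision.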
She’d heard rumors through the grapevine that Google was a toxic place to work, particularly for women and minorities.
In 2017, Google had about eighty thousand salaried employees. Not all of them were engineers. There were curators of the daily Google Doodle that showed up above everybody’s search bar. There were in-office chiropractors and masseuse managers, snackologists who made sure the staff were fueled between their three hot meals at the canteen, horticulturists who looked after the plants, and cleaners who wiped down the foosball tables. Google’s business model was a golden goose. That year its advertising business was generating close to $100 billion annually—a number that would more than double by
…
The problem with being so big was that if someone did invent something groundbreaking inside Google, it might struggle to see the light of day. Google’s digital ad business was sacrosanct. You didn’t mess with the algorithms that powered it unless you really had to. For all the kudos that Silicon Valley got for being the innovation capital of the world, its biggest companies weren’t all that innovative. Google’s home page had barely changed over the past decade. The iPhone was still the same old rectangular slab of metal. And nearly every new Facebook feature was a direct copy of a competitor
…
The T in ChatGPT stands for “transformer.” This has nothing to do with the alien robots that morph into eighteen-wheelers; it refers to a system that allows machines to generate humanlike text. The transformer has become critical to the new wave of generative AI that can produce realistic text, images, videos, DNA sequences, and many other kinds of data. The transformer’s invention in 2017 was about as impactful to the field of AI as the advent of smartphones was for consumers. Before smartphones, mobile phones couldn’t do much more than make calls, send texts, and play the odd game of Snake. But when
…
The chip in your home laptop probably had something like four “cores” to handle instructions, but the GPU chips used in servers to process AI systems had thousands of cores. This meant an AI model could “read” lots of words in a sentence all at once, not just in sequence. Not capitalizing on those chips was like switching off an electric saw to manually cut wood.
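A rough illustration of the difference, in Python with random stand-ins for learned word vectors (my sketch, not the book’s): reading a sentence one word at a time versus transforming every word in a single matrix operation, which is exactly the shape of work a GPU’s thousands of cores execute in parallel.

```python
import numpy as np

rng = np.random.default_rng(1)
words = "the animal did not cross the street".split()
embeddings = rng.normal(size=(len(words), 16))  # one 16-number vector per word
W = rng.normal(size=(16, 16))

# Old, sequential-style reading: one word at a time, each step
# waiting on the previous one.
state = np.zeros(16)
for vec in embeddings:
    state = np.tanh(vec @ W + state)

# Transformer-style reading: one matrix multiply transforms every
# word at once, with no step waiting on another.
transformed = embeddings @ W
```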
Shazeer had extensive experience with large language models. These were computer programs that could analyze and generate humanlike text after being trained on billions of words.
programmer on the team, was stunned to find that the system was doing something called coreference resolution. This had been a huge sticking point in the effort to make computers process language properly. It referred to the task of finding all expressions that refer to the same entity in a text. For instance, in the sentence “The animal didn’t cross the street because it was too tired,” it’s obvious to us as humans that it refers to the animal. But change the sentence to “The animal didn’t cross the street because it was too wide,” and it now refers to the street. Until then, it had been
…
to infer that kind of shift in context because doing so required some element of commonsense knowledge, built up over years of experience of how the world works and how objects interact.
“It’s a classic intelligence test [that] AI’s failed on,” Jones says. “We couldn’t get common sense into a neural network.” But when they fed those same sentences into the transformer, the researchers could see something unusual happening to its “attention head.” The attention head was like a mini-detector in their model that focused on different parts of the data it was being fed. It was the part that harnessed the power of current chips, allowing the transformer to pay attention to all the different words of a sentence at the same time, instead of one by one in sequence. When the
…
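Here is a toy version of that mini-detector (my simplification, not the researchers’ code): a single attention head scoring every word of a sentence against every other word at once. The weight matrix it produces is where a link between “it” and “animal,” or “it” and “street,” would show up.

```python
import numpy as np

def attention_head(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv               # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[1])         # relevance of each word to every other word
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax: each row sums to 1
    return weights @ V, weights                    # blend word vectors by relevance

rng = np.random.default_rng(2)
sentence_len, dim = 11, 16                         # stand-ins for a tokenized sentence
X = rng.normal(size=(sentence_len, dim))           # random word vectors for illustration
Wq, Wk, Wv = (rng.normal(size=(dim, dim)) for _ in range(3))

out, weights = attention_head(X, Wq, Wk, Wv)
# Row i of `weights` shows how word i spreads its attention across all
# eleven words simultaneously, not one by one in sequence.
```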
These engines for reasoning had the potential to supercharge AI systems, but Google was slow off the mark to do anything about them. It took several years, for instance, for Google to plug transformers into services like Google Translate or BERT, a large language model that it developed to make its search engine better at processing the nuance of human language. The transformer’s inventors couldn’t help but feel frustrated. Even a small start-up in Germany had started using the transformer to translate languages well before Google, putting the bigger company in a position where it was now
…
Google’s cautious approach was largely a product of bloat. The downside to being one of the largest companies of all time, with a monopolistic grip on the search market, is that everything moves at a snail’s pace. You’re constantly afraid of public backlash or regulatory scrutiny. Your prime concern is maintaining growth and dominance. So intent has Google been on keeping a stranglehold on the search market that it paid more than $26.3 billion in 2021 to Apple, Samsung, and others—more than a third of its net profit that year—just to preinstall its search engine on their phones, according to a
…
The transformer allowed computers to generate not just text but answers to all manner of questions. If consumers started using something like that more, they could end up going to Google less.
There was nothing unusual about Google sharing some of the foundational mechanics of an invention with the world. That was often how tech companies operated. When they “open-sourced” new techniques, they got feedback from the research community, which boosted their reputation among top engineers, making it easier to hire them. But Google underestimated how much that would cost the company. Of the eight researchers who invented the transformer, all have now left Google. Most started their own AI companies, which at the time of writing were worth more than $4 billion in aggregate. Character.ai
…
LaMDA was probably the world’s most advanced chatbot, but only a few people inside Google could use it. Google was loath to release any new technology that could end up disrupting the success of its search business. Its executives and publicity team framed that approach as being one of caution, but more than anything, the company was obsessed with maintaining its reputation and the status quo.
OpenAI was about to differentiate itself from DeepMind in another way. Ilya Sutskever, OpenAI’s star scientist, couldn’t stop thinking about what the transformer could do with language. Google was using it to better understand text. What if OpenAI used it to generate text?
When the transformer came out, he saw it at first as a crushing blow from Google. Clearly the bigger company had more expertise in AI. But after a while, it looked like Google didn’t have any big plans for its new invention, and Radford and Sutskever realized they could use the architecture to OpenAI’s advantage. They would just have to put their own spin on it. The transformer model that powered Google Translate used something called an encoder and a decoder to process words. The encoder would process the sentence coming in, perhaps in English, and the decoder would generate the output, like
…
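The encoder-decoder split is easiest to see in code. Below is a minimal sketch using PyTorch’s built-in Transformer module as a stand-in; the shapes and sizes are illustrative, not Google Translate’s actual configuration. (OpenAI’s eventual spin, famously, was to keep only the decoder half.)

```python
import torch
import torch.nn as nn

# The encoder digests the incoming sentence; the decoder generates the
# output, attending back to the encoder at every step. Toy sizes only.
model = nn.Transformer(d_model=64, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2)

src = torch.rand(12, 1, 64)  # 12 source-language token vectors (e.g. English)
tgt = torch.rand(9, 1, 64)   # the 9 target-language tokens generated so far

out = model(src, tgt)        # one vector per output position: (9, 1, 64)
print(out.shape)
```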
Sutskever had long believed that “success was guaranteed” when you scaled everything up in AI, especially with language models. The more data you had, combined with the highest possible computing power and a large and intricate model, the more capable the result would be.
He and his colleagues started working on a new language model they called a “generative pre-trained transformer,” or GPT for short.
Human workers would have to label comments like “I love this product” as positive and “It’s ok” as neutral, for instance. That method was slow and expensive. But GPT was different: it learned how language worked from a mountain of seemingly random text that had never been labeled. It didn’t have the guiding hand of those human labelers.
these different approaches as being like a new way of educating humans. For instance, suppose two groups of art students were being taught how to paint. The first group was given a book with pictures of paintings, each one labeled with captions like “sunrise,” “portrait,” or “abstract.” That’s how traditional AI models were learning from labeled data. It was a structured and precise method—like telling the art students exactly what each picture represented—but it also limited what machines could infer. They could only recall what had been labeled. The students in this first group would
…
Radford’s team realized that by exposing GPT to a vast array of language uses and nuances, the model itself could generate more creative responses in text. Once the initial training was done, they fine-tuned the new model using some labeled examples to get better at specific tasks. This two-step appr…
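As a deliberately tiny stand-in for step one, here is the pre-training idea in miniature (my sketch; GPT uses a transformer, not word counts, but the objective of predicting the next token is the same). The raw text supplies its own answers, so no human labeler is involved; step two would then adjust the pre-trained model on a small labeled set like the sentiment examples above.

```python
from collections import Counter, defaultdict

# Step 1, pre-training in miniature: learn next-word statistics from
# raw, unlabeled text. The text labels itself.
corpus = "the cat sat on the mat the cat ran".split()

next_word = defaultdict(Counter)
for here, after in zip(corpus, corpus[1:]):
    next_word[here][after] += 1

print(next_word["the"].most_common(1))  # [('cat', 2)] -- a learned prediction
```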
“I would wake up, nervous that Google was just gonna go release something much better than us,”
A company like OpenAI couldn’t train its AI models on the same laptops its staff members were working on. To process so many billions of pieces of data for training, and quickly, it needed the powerful chips found only in servers and typically rented from cloud providers like Amazon Web Services, Google Cloud, or Microsoft’s Azure. These were the companies that had endless football fields of computers enclosed in vast warehouses, whose ownership of these “cloud” computers would see them become the biggest financial winners of the AI boom.
That was the predicament OpenAI found itself in. It needed to rent more cloud computers, and it was also running out of money. “We’re just going to need to raise way more money than what we can as a [nonprofit],” Brockman told other executives. “Many billions of dollars.”
The whole thing sounded magnanimous. OpenAI was framing itself as an organization that was so highly evolved that it was putting the interests of humanity above traditional Silicon Valley pursuits like profit and even prestige.
Altman told Hoffman that he might have an answer to the problem: a strategic partnership. The term strategic partnership is a handy one that companies frequently use to cover a wide range of corporate relationships that could put them at arm’s length or on a tight leash. It could mean sharing money and technology between two firms or setting up a licensing agreement. The term was ambiguous enough to hide the true nature of an awkward corporate relationship, perhaps one with complicated financial ties or where one firm has an embarrassing amount of control over another. “Partnership” implied a
…
a strategic partnership could create the illusion of greater independence from a larger tech company, while giving him the computing power OpenAI needed.
Hoffman was a rotund, jolly man with a boyish grin, and his real value to OpenAI wasn’t cash but connections. He was so good at making friends and acquaintances that he had founded the world’s number-one professional networking site, LinkedIn.
In 2016, he’d sold the company to Microsoft for $26.2 billion, giving him a net worth of about $3.7 billion and leading him to a new career of backing start-ups as an investor with storied venture capital firm Greylock Partners.
“Maybe we should figure out something more,” Nadella told him.
“[Altman] really tries to find the thing that matters most to a person—and then figures out how to give it to them,” Greg Brockman would later tell the New York Times. “That is the algorithm he uses over and over.”
OpenAI was building AI systems that could one day lead to AGI, but along the way, as those systems became more powerful, they could make Azure a more attractive service to customers. Artificial intelligence was going to become a fundamental part of the cloud business, and cloud was on track to make up half of Microsoft’s annual sales. If Microsoft could sell some cool new AI features—like chatbots that could replace call center workers—to its corporate customers, those customers were less likely to leave for a competitor. The more features they signed up for, the harder it would be to switch.
…
Altman and Brockman would go on to say that this was never their intention and that OpenAI was genuinely concerned about how GPT-2 could be abused. But their approach to public relations was, arguably, still a form of mystique marketing with a dash of reverse psychology. Apple had done it for years with secretive product launches that would drum up excitement, and OpenAI was now being similarly secretive about how GPT-2 had come together. Some AI academics, meanwhile, found that trying to access GPT-2 was like trying to get into an exclusive nightclub. OpenAI was being more careful and selective
…
Effective altruism hit the spotlight in late 2022 when one-time crypto billionaire Sam Bankman-Fried became the movement’s most well-known supporter. But it had been around since the 2010s. The idea, which was spawned by a handful of philosophers at Oxford University and then spread like wildfire through college campuses, was to improve on traditional charity by taking a more utilitarian approach to giving. Instead of volunteering at a homeless shelter, for instance, you could help more people by working in a high-paying job at a hedge fund, making lots of money, and then
…
Sometimes effective altruists were split on the best way to do that. Some might say that you could impact more people by donating to global causes like poverty than to local US or European causes like homelessness.
Licensing technology to a large company is fundamentally no different from selling a product. It simply means selling technology to a larger customer that has more power and control than regular consumers. And so long as OpenAI’s board said it hadn’t reached AGI, it could keep licensing to Microsoft.
new technology has come with a price, from the loss of human connection and privacy to the rise of screen-time addiction, mental health problems, political polarization, and income inequality from greater automation, all powered by a handful of companies.
A typical home computer has one central processing unit, or CPU, the powerful silicon chip that’s rectangular in shape and covered in billions of tiny transistors. It’s the brain of your computer, and it usually has between four and eight cores, each of which deals with all the necessary calculations. Microsoft’s new supercomputer had 285,000 CPU cores. If a regular home computer was like a toy car, this was a tank.
When people bought a more powerful computer for playing games, those machines would typically contain a GPU, which quickly processed complex visual data to make video game images look smooth and polished. Those same chips were now also being used to train AI because they could perform so many calculations in parallel. Microsoft’s new supercomputer had ten thousand of them. And it could move data...
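A back-of-envelope comparison using the passage’s figures; the per-GPU core count below is my own assumed ballpark, since the passage doesn’t give one.

```python
# Figures from the passage, except cores_per_gpu, which is an
# assumption included only to convey scale.
home_cpu_cores = 8
supercomputer_cpu_cores = 285_000
supercomputer_gpus = 10_000
cores_per_gpu = 5_000  # assumed ballpark for a data-center GPU

print(supercomputer_cpu_cores // home_cpu_cores)  # 35625: ~35,000 home computers' worth of CPU
print(supercomputer_gpus * cores_per_gpu)         # 50000000: tens of millions of GPU cores in parallel
```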
Facebook wasn’t an option, since after the Cambridge Analytica scandal of 2018, Mark Zuckerberg’s platform had stopped other companies from accessing its user data. But Twitter was still mostly a free-for-all, and so was Reddit.