The Gen AI Bridge to the Future

Monday, December 2, 2024

In the beginning was the mainframe.

In 1945 the U.S. government built ENIAC, an acronym for Electronic Numerical Integrator and Computer, to do ballistics trajectory calculations for the military; World War 2 was nearing its conclusion, however, so ENIAC’s first major job was to do calculations that undergirded the development of the hydrogen bomb. Six years later, J. Presper Eckert and John Mauchly, who led the development of ENIAC, launched UNIVAC, the Universal Automatic Computer, for broader government and commercial applications. Early use cases included calculating the U.S. census and assisting with calculation-intensive back office operations like payroll and bookkeeping.

These were hardly computers as we know them today, but rather calculation machines that took in reams of data (via punch cards or magnetic tape) and returned results according to hardwired calculation routines; the “operating system” were the humans actually inputting the data, scheduling jobs, and giving explicit hardware instructions. Originally this instruction also happened via punch cards and magnetic tape, but later models added consoles to both provide status and also allow for register-level control; these consoles evolved into terminals, but the first versions of these terminals, like the one that was available for the original version of the IBM System/360, were used to initiate batch programs.

Any recounting of computing history usually focuses on the bottom two levels of that stack — the device and the input method — because they tend to evolve in parallel. For example, here are the three major computing paradigms to date:

These aren’t perfect delineations; the first PCs had terminal-like interfaces, and pre-iPhone smartphones used windows-icons-menus-pointer (WIMP) interaction paradigms, with built-in keyboards and styluses. In the grand scheme of things, though, the distinction is pretty clear, and, by extension, it’s pretty easy to predict what is next:

Wearables is an admittedly broad category that includes everything from smart watches to earpieces to glasses, but I think it is a cogent one: the defining characteristic of all of these devices, particularly in contrast to the three previous paradigms, is the absence of a direct mechanical input mechanism; that leaves speech, gestures, and at the most primitive level, thought.

Fortunately there is good progress being made on on all of these fronts: the quality and speed of voice interaction has increased dramatically over the last few years; camera-intermediated gestures on the Oculus and Vision Pro work well, and Meta’s Orion wristband uses electromyography (EMG) to interpret gestures without any cameras at all. Neuralink is even more incredible: an implant in the brain captures thoughts directly and translates them into actions.

These paradigms, however, do not exist in isolation. First off, mainframes still exist, and I’m typing this Article on a PC, even if you may consume it on a phone or via a wearable like a set of AirPods. What stands out to me, however, is the top level of the initial stack I illustrated above: the application layer on one paradigm provides the bridge to the next one. This, more than anything, is why generative AI is a big deal in terms of realizing the future.

Bridges to the Future

I mentioned the seminal IBM System/360 above, which was actually a family of mainframes; the first version was the Model 30, which, as I noted, did batch processing: you would load up a job using punch cards or magnetic tape and execute the job, just like you did with the ENIAC or UNIVAC. Two years later, however, IBM came out with the Model 67 and the TSS/360 operating system: now you could actually interact with a program via the terminal. This represented a new paradigm at the application layer:

It is, admittedly, a bit confusing to refer to this new paradigm at the application layer as Applications, but it is the most accurate nomenclature; what differentiated an application from a program was that while the latter was a pre-determined set of actions that ran as a job, the former could be interacted with and amended while running.

That new application layer, meanwhile, opened up the possibility for an entirely new industry to create those applications, which could run across the entire System/360 family of mainframes. New applications, in turn, drove demand for more convenient access to the computer itself. This ultimately led to the development of the personal computer (PC), which was an individual application platform:

Initial PCs operated from a terminal-like text interface, but truly exploded in popularity with the roll-out of the WIMP interface, which was invented by Xerox PARC, commercialized by Apple, and disseminated by Microsoft. The key point in terms of this Article, however, is that Applications came first: the concept created the bridge from mainframes to PCs.

PCs underwent their own transformation over their two decades of dominance, first in terms of speed and then in form factor, with the rise of laptops. The key innovation at the application layer, however, was the Internet:

The Internet differed from traditional applications by virtue of being available on every PC, facilitating communication between PCs, and by being agnostic to the actual device it was accessed on. This, in turn, provided the bridge to the next device paradigm, the smartphone, with its touch interface:

I’ve long noted that Microsoft did not miss mobile; their error was in trying to extend the PC paradigm to mobile. This not only led to a focus on the wrong interface (WIMP via stylus and built-in keyboard), but also an assumption that the application layer, which Windows dominated, would be a key differentiator.

Apple, famously, figured out the right interface for the smartphone, and built an entirely new operating system around touch. Yes, iOS is based on macOS at a low level, but it was a completely new operating system in a way that Windows Mobile was not; at the same time, because iOS was based on macOS, it was far more capable than smartphone-only alternatives like BlackBerry OS or PalmOS. The key aspect of this capability was that the iPhone could access the real Internet.

What is funny is that Steve Jobs’ initial announcement of this capability was met with much less enthusiasm than the iPhone’s other two selling points of being a widescreen iPod and a mobile phone:

Today, we’re introducing three revolutionary products of this class. The first one is a wide-screen iPod with touch controls. The second is a revolutionary mobile phone. The third is a breakthrough Internet communications device…These are not three separate devices, this is one device, and we are calling iPhone. Today, Apple is going to reinvent the phone.

I’ve watched that segment hundreds of times, and the audience’s confusion at “Internet communications device” cracks me up every time; in fact, that was the key factor in reinventing the phone, because it was the bridge that linked a device in your pocket to the world of computing writ large, via the Internet. Jobs listed the initial Internet features later on in the keynote:

Now let’s take a look at an Internet communications device, part of the iPhone. What’s this all about? Well, we’ve got some real breakthroughs here: to start off with, we’ve got rich HTML email on iPhone. The first time, really rich email on a mobile device, and it works with any IMAP or POP email service. You’ve got your favorite mail service, it’ll likely work with it, and it’s rich text email. We wanted the best web browser on our phone, not a baby browser or a WAP browser, a real browser, and we picked the best one in the world: Safari, and we have Safari running on iPhone. It is the first fully-usable HTML browser on a phone. Third, we have Google Maps. Maps, satellite images, directions, and traffic. This is unbelievable, wait until you see it. We have Widgets, starting off with weather and stocks. And, this communicates with the Internet over Edge and Wifi, and iPhone automatically detects Wifi and switches seamless to it. You don’t have to manage the network, it just does the right thing.

Notice that the Internet is not just the web; in fact, while Apple wouldn’t launch a 3rd-party App Store until the following year, it did, with the initial iPhone, launch the app paradigm which, in contrast to standalone Applications from the PC days, assumed and depended on the Internet for functionality.

The Generative AI Bridge

We already established above that the next paradigm is wearables. Wearables today, however, are very much in the pre-iPhone era. On one hand you have standalone platforms like Oculus, with its own operating system, app store, etc.; the best analogy is a video game console, which is technically a computer, but is not commonly thought of as such given its singular purpose. On the other hand, you have devices like smart watches, AirPods, and smart glasses, which are extensions of the phone; the analogy here is the iPod, which provided great functionality but was not a general computing device.

Now Apple might dispute this characterization in terms of the Vision Pro specifically, which not only has a PC-class M2 chip, along with its own visionOS operating system and apps, but can also run iPad apps. In truth, though, this makes the Vision Pro akin to Microsoft Mobile: yes, it is a capable device, but it is stuck in the wrong paradigm, i.e. the previous one that Apple dominated. Or, to put it another way, I don’t view “apps” as the bridge between mobile and wearables; apps are just the way we access the Internet on mobile, and the Internet was the old bridge, not the new one.

To think about the next bridge, it’s useful to jump forward to the future and work backwards; that jump forward is a lot easier to envision, for me anyways, thanks to my experience with Meta’s Orion AR glasses:

They’re real and they’re spectacular. pic.twitter.com/hIJZuS6taY

— Ben Thompson (@benthompson) September 25, 2024

The most impressive aspect of Orion is the resolution, which is perfect. I’m referring, of course, to the fact that you can see the real world with your actual eyes; I wrote in an Update:

The reality is that the only truly satisfactory answer to passthrough is to not need it at all. Orion has perfect field-of-view and infinite resolution because you’re looking at the real world; it’s also dramatically smaller and lighter. Moreover, this perfect fidelity actually gives more degrees of freedom in terms of delivering the AR experience: no matter how high resolution the display is, it will still be lower resolution than the world around it; I tried a version of Orion with double the resolution and, honestly, it wasn’t that different, because the magic was in having augmented reality at all, not in its resolution. I suspect the same thing applies to field of view: 70 degrees seemed massive on Orion, even though that is less than the Vision Pro’s 100 degrees, because the edge of the field of view for Orion was reality, whereas the edge for the Vision Pro is, well, nothing.

The current iteration of Orion’s software did have an Oculus-adjacent launch screen, and an Instagram prototype; it was, in my estimation, the least impressive part of the demonstration, for the same reason that I think the Vision Pro’s iPad app compatibility is a long-term limitation: it was simply taking the mobile paradigm and putting it in front of my face, and honestly, I’d rather just use my phone.

One of the most impressive demos, meanwhile, had the least UI: it was just a notification. I glanced up, saw that someone was calling me, touched my fingers together to “click” on the accept button that accompanied the notification, and was instantly talking to someone in another room while still being able to interact freely with the world around me. Of course phone calls aren’t some sort of new invention; what made the demo memorable was that I only got the UI I needed when I needed it.

This, I think, is the future: the exact UI you need — and nothing more — exactly when you need it, and at no time else. This specific example was, of course, programmed deterministically, but you can imagine a future where the glasses are smart enough to generate UI on the fly based on the context of not just your request, but also your broader surroundings and state.

This is where you start to see the bridge: what I am describing is an application of generative AI, specifically to on-demand UI interfaces. It’s also an application that you can imagine being useful on devices that already exist. A watch application, for example, would be much more usable if, instead of trying to navigate by touch like a small iPhone, it could simply show you the exact choices you need to make at a specific moment in time. Again, we get hints of that today through deterministic programming, but the ultimate application will be on-demand via generative AI.

Of course generative AI is also usable on the phone, and that is where I expect most of the exploration around generative UI to happen for now. We certainly see plenty of experimentation and rapid development of generative AI broadly, just as we saw plenty of experimentation and rapid development of the Internet on PCs. That experimentation and development was not just usable on the PC, but it also created the bridge to the smartphone; I think that generative AI is doing the same thing in terms of building a bridge to wearables that are not accessories, but general purpose computers in their own right:

This is exciting in the long-term, and bullish for Meta (and I’ve previously noted how generative AI is the key to the metaverse, as well). It’s also, clearly, well into the future. It also helps explain why Orion isn’t shipping today: it’s not just that the hardware isn’t yet in a production state, particularly from a cost perspective, but the entire application layer needs to be built out, first on today’s devices, enabling the same sort of smooth transition that the iPhone had. No, Apple didn’t have the App Store, but the iPhone was extraordinarily useful on day one, because it was an Internet Communicator.

Survey Complete

Ten years ago I wrote a post entitled The State of Consumer Technology in 2014, where I explored some of the same paradigm-shifts I detailed in this Article. This was the illustration I made then:

There is a perspective in which 2024 has been a bit of a letdown in terms of generative AI; there hasn’t been a GPT-5 level model released; the more meaningful developments have been in the vastly increased efficiency and reduction in size of GPT-4 level models, and the inference-scaling possibilities of o1. Concerns are rising that we may have hit a data wall, and that there won’t be more intelligent AI without new fundamental breakthroughs in AI architecture.

I, however, feel quite optimistic. To me the story of 2024 has been filling in those question marks in that illustration. The product overhang from the generative AI capabilities we have today are absolutely massive: there are so many new things to be built, and completely new application layer paradigms are at the top of the list. That, by extension, is the bridge that will unlock entirely new paradigms of computing. The road to the future needs to be built; it’s exciting to have the sense that the surveying is now complete.

A Chance to Build

Monday, November 18, 2024Wednesday, November 27, 2024

Semiconductors are so integral to the history of Silicon Valley that they give the region its name, and, more importantly, its culture: chips require huge amounts of up-front investment, but they have, relative to most other manufactured goods, minimal marginal costs; this economic reality helped drive the development of the venture capital model, which provided unencumbered startup capital to companies who could earn theoretically unlimited returns at scale. This model worked even better with software, which was perfectly replicable.

That history starts in 1956, when William Shockley founded the Shockley Semiconductor Laboratory to commercialize the transistor that he had helped invent at Bell Labs; he chose Mountain View to be close to his ailing mother. A year later the so-called “Traitorous Eight”, led by Robert Noyce, left and founded Fairchild Semiconductor down the road. Six years after that Fairchild Semiconductor opened a facility in Hong Kong to assemble and test semiconductors. Assembly required manually attaching wires to a semiconductor chip, a labor-intensive and monotonous task that was difficult to do economically with American wages, which ran about $2.50/hour; Hong Kong wages were a tenth of that. Four years later Texas Instruments opened a facility in Taiwan, where wages were $0.19/hour; two years after that Fairchild Semiconductor opened another facility in Singapore, where wages were $0.11/hour.

In other words, you can make the case that the classic story of Silicon Valley isn’t completely honest. Chips did have marginal costs, but that marginal cost was, within single digit years of the founding of Silicon Valley, exported to Asia.

Moreover, that exportation was done with the help of the U.S. government. In 1962 the U.S. Congress passed the Tariff Classification Act of 1962, which amended the Tariff Act of 1930 to implement new tariff schedules developed by the United States Tariff Commission; those new schedules were implemented in 1963, and included Tariff Item 807.00, which read:

Articles assembled abroad in whole or in part of products of the United States which were exported for such purpose and which have not been advanced in value or improved in condition abroad by any means other than by the act of assembly:

A duty upon the full value of the imported article, less the cost or value of such products of the United States.

The average Hong Kong worker assembled around 24 chips per hour; that meant their value add to the overall cost of the chip was just over $0.01, which means that tariffs were practically non-existent. This was by design! Chris Miller writes in Chip War:

South Vietnam would send shock waves across Asia. Foreign policy strategists perceived ethnic Chinese communities all over the region as ripe for Communist penetration, ready to fall to Communist influence like a cascade of dominoes. Malaysia’s ethnic Chinese minority formed the backbone of that country’s Communist Party, for example. Singapore’s restive working class was majority ethnic Chinese. Beijing was searching for allies—and probing for U.S. weakness…

By the end of the 1970s, American semiconductor firms employed tens of thousands of workers internationally, mostly in Korea, Taiwan, and Southeast Asia. A new international alliance emerged between Texan and Californian chipmakers, Asian autocrats, and the often ethnic-Chinese workers who staffed many of Asia’s semiconductor assembly facilities.

Semiconductors recast the economies and politics of America’s friends in the region. Cities that had been breeding grounds for political radicalism were transformed by diligent assembly line workers, happy to trade unemployment or subsistence farming for better paying jobs in factories. By the early 1980s, the electronics industry accounted for 7 percent of Singapore’s GNP and a quarter of its manufacturing jobs. Of electronics production, 60 percent was semiconductor devices, and much of the rest was goods that couldn’t work without semiconductors. In Hong Kong, electronics manufacturing created more jobs than any sector except textiles. In Malaysia, semiconductor production boomed in Penang, Kuala Lumpur, and Melaka, with new manufacturing jobs providing work for many of the 15 percent of Malaysian workers who had left farms and moved to cities between 1970 and 1980. Such vast migrations are often politically destabilizing, but Malaysia kept its unemployment rate low with many relatively well-paid electronics assembly jobs.

This is a situation that, at least in theory, should not persist indefinitely; increased demand for Asian labor should push up both the cost of that labor and also the currency of the countries where that labor is in demand, making those countries less competitive over time. The former has certainly happened: Taiwan, where I live, is one of the richest countries in the world. And yet chip-making is centered here to a greater extent than ever before, in seeming defiance of theory.

The Post-War Order

The problem with theory is usually reality; in 1944 the U.S. led the way in establishing what came to be known as the Bretton Woods System, which pegged exchange rates to the U.S. dollar. This was a boon to the devastated economies of Europe and first Japan, and then later the rest of Asia: an influx of U.S. capital rebuilt their manufacturing capability by leveraging their relatively lower cost of labor. This did raise labor costs, but thanks to the currency peg, the U.S. currency couldn’t depreciate in response, which in turn made U.S. debt much more attractive than it might have been otherwise for those manufacturing profits, which in turn helped to fund both the Vietnam War and the 1960’s expansion in social programs.

Ultimately this pressure on the U.S. dollar was too intense, leading to the dissolution of Bretton Woods in 1971 and a depreciation of the U.S. dollar relative to gold; the overall structure of the world economy, however, was set: trade was denominated in dollars — i.e. the U.S. dollar was the world’s reserve currency — which kept its value higher than economic theory would dictate. This made U.S. debt attractive, which funded deficit spending; that spending fueled the U.S. consumer market, which bought imported manufactured goods; the profits of those goods were reinvested into U.S. debt, which helped pay for the military that kept the entire system secure.

The biggest winner was the U.S. consumer. Money was cycled into the economy through an impressive and seemingly impossible array of service sector jobs and quickly spent on cheap imports. Those cheap imports were getting better too: to take chips as an example, increased automation decreased costs, and the development of software made those chips much more valuable. This applied not just to the chips directly, but everything built with and enabled by them; the actual building of electronics happened in Asia, as countries rapidly ascended the technological ladder, but the software was the province of Silicon Valley.

This is where it matters that software is truly a zero marginal cost product. R&D costs for tech companies have skyrocketed for decades, but that increase has been more than offset by the value created by software and captured by scale. Moreover, those increasing costs manifested as the highest salaries in the world for talent, the true scarce resource in technology. This meant that the most capable technologists made their way to the U.S. generally and Silicon Valley specifically to earn the most money, and, if they had the opportunity and drive, to create new companies as software ate the world.

Still, software runs on hardware, and here Asia dominates. Consider AI:

Chip design, a zero marginal cost activity, is done by Nvidia, a Silicon Valley company.
Chip manufacturing, a minimal marginal cost activity that requires massive amounts of tacit knowledge gained through experience, is done by TSMC, a Taiwanese company.
An AI system contains multiple components beyond the chip, many if not most of which are manufactured in China, or other countries in Asia.
Final assembly generally happens outside of China due to U.S. export controls; Foxconn, for example, assembles many of its systems in Mexico.
AI is deployed mostly by U.S. companies, and the vast majority of application development is done by tech companies and startups, primarily in Silicon Valley.

The fact that the U.S. is the bread in the AI sandwich is no accident: those are the parts of the value chain where marginal cost is non-existent and where the software talent has the highest leverage. Similarly, it’s no accident that the highest value add in terms of hardware happens in Asia, where expertise has been developing for fifty years. The easiest — and by extension, most low-value — aspect is assembly, which can happen anywhere labor is cheap.

All of this has happened in a world where the trend in trade was towards more openness and fewer barriers, at least in terms of facilitating this cycle. One key development was the Information Technology Agreement (ITA), a 1996 World Trade Organization agreement, which completely eliminated tariffs on IT products, including chips. The Internet, meanwhile, meant there were no barriers to the spread of software, with the notable exception of China’s Great Firewall; the end result is that while U.S. software ran on Asian hardware, it was U.S. companies that ultimately reaped the largest returns from scale.

Cars and China

Perhaps the defining characteristic of the Clinton-Bush-Obama era was the assumption that this system would continue forever; it certainly was plausible in terms of products. Consider cars: for a hundred years cars were marvelous mechanical devices with tens of thousands of finely engineered parts predicated on harnessing the power of combustion to transport people and products wherever they wished to go. Electric cars, however, are something else entirely: yes, there is still a mechanical aspect, as there must be to achieve movement in physical space, but the entire process is predicated on converting electricity to mechanical movement, and governed entirely by chips and software.

This products looks a lot more like a computer on wheels than the mechanical cars we are familiar with; it follows, then, that the ultimate structure of the car industry might end up looking something like the structure of AI: the U.S. dominates the zero marginal cost components like design and the user experience, while Asia — China specifically, given the scale and labor requirements — dominates manufacturing.

This has been Waymo’s plan; while current self-driving cars on the road are retrofitted Jaguar I-Pace sedans, the 6th-generation Waymo vehicle is manufactured by Chinese car company Geely. This car, called Zeekr, is purpose-built for transportation, but ideally it would be custom-built for Waymo’s purposes: you can imagine future fleets of self-driving cars with designs for different use cases, from individual taxis to groups to working offices to sleeper cars. The analogy here would be to personal computing devices: you can get a computer in rack form, a desktop, a laptop, or a phone; the chips and software are by-and-large the same.

Cars aren’t there yet, but they’re not far off; the relative simplicity of electric cars makes it more viable for established car manufactures to basically offer customizable platforms: that is how a company like Xiaomi can develop its own SUV. The consumer electronics company, most well-known for its smartphones, contracts with Beijing Automotive Group for manufacturing, while doing the design and technological integration. Huawei has a similar arrangement with Seres, Changan, and Chery Automobile.

Tesla, it should be noted, is a bit different: the company is extremely vertically integrated, building not only its own hardware and software but also a significant number of the components that go into its cars; this isn’t a surprise, given Tesla’s pioneering role in electrical cars (pioneers are usually vertically integrated), but it does mean that Tesla faces a significant long-term threat from the more modular Chinese approach. Given this, it’s not a surprise that Elon Musk is staking Tesla’s long-term future on autonomy, in effect doubling down on the company’s integration.

Regardless, what is notable — and ought to be a wake-up call to Silicon Valley — is the fact that the Xiaomi and Huawei cars run Chinese software. One of the under-appreciated benefits of the Great Firewall is that it created an attractive market for software developers that was not reachable from Silicon Valley; this means that while a good number of Chinese software engineers are in the U.S., there is a lot of talent in China as well, and that talent is being applied to products that can leverage Chinese manufacturing to win markets Silicon Valley thought would be theirs to sandwich forever.

Waymo’s Zeekr car, meanwhile, has a problem; from Bloomberg in May:

President Joe Biden will quadruple tariffs on Chinese electric vehicles and sharply increase levies for other key industries this week, unveiling the measures at a White House event framed as a defense of American workers, people familiar with the matter said. Biden will hike or add tariffs in the targeted sectors after nearly two years of review. The total tariff on Chinese EVs will rise to 102.5% from 27.5%, the people said, speaking on condition of anonymity ahead of the announcement. Others will double or triple in targeted industries, though the scope remains unclear.

Given this, it’s no surprise that Waymo had a new announcement in October: Waymo was partnering with Hyundai for new self-driving cars that are manufactured in America. This car is a retro-fitted IONIQ 5, which is built as a passenger car, unlike the transportation-focused Zeekr; in other words, Google is taking a step back in functionality because of government policy.

Trump’s Tariffs

Waymo may not be the only company taking a step back: newly (re-)elected President Trump’s signature economic proposal is tariffs. From the 2024 GOP Platform:

Our Trade deficit in goods has grown to over $1 Trillion Dollars a year. Republicans will support baseline Tariffs on Foreign-made goods, pass the Trump Reciprocal Trade Act, and respond to unfair Trading practices. As Tariffs on Foreign Producers go up, Taxes on American Workers, Families, and Businesses can come down.

Foreign Policy published an explainer over the weekend entitled Everything You Wanted to Know About Trump’s Tariffs But Were Afraid to Ask:

U.S. President-elect Donald Trump, the self-proclaimed “tariff man,” campaigned on the promise of ratcheting import duties as high as 60 percent against all goods from China, and perhaps 20 percent on everything from everywhere else. And he might be able to do it—including by drawing on little-remembered authorities from the 1930 Smoot-Hawley Tariff Act, the previous nadir of U.S. trade policy.

Trump’s tariff plans are cheered by most of his economic advisers, who see them as a useful tool to rebalance an import-dependent U.S. economy. Most economists fear the inflationary impacts of sharply higher taxes on U.S. consumers and businesses, as well as the deliberate drag on economic growth that comes from making everything more expensive. Other countries are mostly confused, uncertain whether Trump’s tariff talk is just bluster to secure favorable trade deals for the United States, or if they’ll be more narrowly targeted or smaller than promised. Big economies, such as China and the European Union, are preparing their reprisals, just in case.

What makes it hard for economists to model and other countries to understand is that nobody, even in Trump world, seems to know exactly why tariffs are on the table.

Sounds like the explainer needs an explainer! Or maybe the author was afraid to ask, but I digress.

The story to me seems straightforward: the big loser in the post World War 2 reconfiguration I described above was the American worker; yes, we have all of those service jobs, but what we have much less of are traditional manufacturing jobs. What happened to chips in the 1960s happened to manufacturing of all kinds over the ensuing decades. Countries like China started with labor cost advantages, and, over time, moved up learning curves that the U.S. dismantled; that is how you end up with this from Walter Isaacson in his Steve Jobs biography about a dinner with then-President Obama:

When Jobs’s turn came, he stressed the need for more trained engineers and suggested that any foreign students who earned an engineering degree in the United States should be given a visa to stay in the country. Obama said that could be done only in the context of the “Dream Act,” which would allow illegal aliens who arrived as minors and finished high school to become legal residents — something that the Republicans had blocked. Jobs found this an annoying example of how politics can lead to paralysis. “The president is very smart, but he kept explaining to us reasons why things can’t get done,” he recalled. “It infuriates me.”

Jobs went on to urge that a way be found to train more American engineers. Apple had 700,000 factory workers employed in China, he said, and that was because it needed 30,000 engineers on-site to support those workers. “You can’t find that many in America to hire,” he said. These factory engineers did not have to be PhDs or geniuses; they simply needed to have basic engineering skills for manufacturing. Tech schools, community colleges, or trade schools could train them. “If you could educate these engineers,” he said, “we could move more manufacturing plants here.” The argument made a strong impression on the president. Two or three times over the next month he told his aides, “We’ve got to find ways to train those 30,000 manufacturing engineers that Jobs told us about.”

I think that Jobs had cause-and-effect backwards: there are not 30,000 manufacturing engineers in the U.S. because there are not 30,000 manufacturing engineering jobs to be filled. That is because the structure of the world economy — choices made starting with Bretton Woods in particular, and cemented by the removal of tariffs over time — made them nonviable. Say what you will about the viability or wisdom of Trump’s tariffs, the motivation — to undo eighty years of structural changes — is pretty straightforward!

The other thing about Jobs’ answer is how ultimately self-serving it was. This is not to say it was wrong: Apple could not only not manufacture an iPhone in the U.S. because of cost, it also can’t do so because of capability; that capability is downstream of an ecosystem that has developed in Asia and a long learning curve that China has traveled and that the U.S. has abandoned. Ultimately, though, the benefit to Apple has been profound: the company has the best supply chain in the world, centered in China, that gives it the capability to build computers on an unimaginable scale with maximum quality for not that much money at all.

This benefit has extended to every tech company, whether they make their own hardware or not. Software has to run on something, whether that be servers or computer or phones; hardware is software’s most essential complement. Joel Spolsky, in his canonical post about commoditizing your complements, wrote:

Every product in the marketplace has substitutes and complements…A complement is a product that you usually buy together with another product. Gas and cars are complements. Computer hardware is a classic complement of computer operating systems. And babysitters are a complement of dinner at fine restaurants. In a small town, when the local five star restaurant has a two-for-one Valentine’s day special, the local babysitters double their rates. (Actually, the nine-year-olds get roped into early service.)…

Demand for a product increases when the price of its complements decreases. In general, a company’s strategic interest is going to be to get the price of their complements as low as possible. The lowest theoretically sustainable price would be the “commodity price” — the price that arises when you have a bunch of competitors offering indistinguishable goods…If you can run your software anywhere, that makes hardware more of a commodity. As hardware prices go down, the market expands, driving more demand for software (and leaving customers with extra money to spend on software which can now be more expensive.)…

Spolsky’s post was written in 2002, well before the rise of smartphones and, more pertinently, ad-supported software that now permeates our world. That, though, only makes his point: hardware has become so cheap and so widespread that software can be astronomically valuable even as it’s free to end users. Which, by the way, is that other part of the boon to consumers I noted above.

It’s Time to Build

A mistake many analysts make, particularly Americans, is viewing the U.S. as the only agent of change in the world; events like the Ukraine or Gaza wars are reminders that we aren’t in control of world events, and nothing would make that lesson clearer than a Chinese move on Taiwan. At the same time, we are living in a system the U.S. built, so it’s worth thinking seriously about the implications of a President with a mandate to blow the whole thing up.

The first point is perhaps the most comforting: there is a good chance that Trump makes a lot of noise and accomplishes little, at least in terms of trade and — pertinently for this blog — its impact on tech. That is arguably what happened his first term: there were China tariffs (that Apple was excluded from), and a ban on chip shipments to Huawei (that massively buoyed Apple), and TSMC committed to building N-1 fabs in Arizona. From a big picture perspective, though, today Silicon Valley is more powerful and richer than ever, and the hardware dominance of Asia generally and China specifically is larger than ever.

The reality is that uprooting the current system would take years of upheaval and political and economic pain; those who argue it is impossible are wrong, but believing it’s highly improbable is very legitimate. Indeed, it may be the case that systems can only truly be remade in the presence of an exogenous destructive force, which is to say war.

The second point, though, is that there does seem to be both more risk and opportunity than many people think. Tariffs do change things; by virtue of my location I talk to plenty of people on the ground who have been busy for years moving factories, not from China to the U.S., but to places like Thailand or Vietnam. That doesn’t really affect the trade deficit, but things that matter don’t always show up in aggregate numbers.

To that end, the risk for tech is that tariffs specifically and Trump’s approach to trade generally do more damage to the golden goose than expected. More expensive hardware ultimately constricts the market for software; tariffs in violation of agreements like the ITA give the opening for other countries to impose levies of their own, and U.S. tech companies could very well be popular targets.

The opportunity, meanwhile, is to build new kinds of manufacturing companies that can seize on a tariff-granted price advantage. These sorts of companies, perhaps to Trump’s frustration, are not likely to be employment powerhouses; the real opportunity is taking advantage of robotics and AI to make physical goods into zero marginal cost items in their own right (outside of commodities; this is what has happened to chips: assembly and testing are fully automated, which makes a U.S. buildout viable).

To take a perhaps unintuitive example, consider Amazon: the company is deeply investing in automation for its fulfillment centers, which decreases the marginal cost of picking, enabling the company to sell more items like “Everyday Essentials” that don’t cost much but are purchased frequently; it’s also no surprise that Amazon is invested in drones and self-driving car startups, to take the same costs out of delivery. It’s a long journey, to be sure, but it’s a destination that is increasingly possible to imagine.

The analogy to manufacturing is that a combination of automation and modular platforms, defined by software, is both necessary and perhaps possible to build for the first time in a long time. It won’t be an easy road — see Tesla’s struggles with automation — but there is, at a minimum, a market in national security, and perhaps arenas like self-driving cars, to build something scalable with assumptions around modularity and software-defined functionality at the core.

Again, I don’t know if this will work: the symbiotic relationship between Silicon Valley software makers and Asian hardware manufactures is one of the most potent economic combinations in history, and it may be impossible to compete with; if it’s ever going to work, though, the best opportunity — absent a war, God forbid — is probably right now.

Meta’s AI Abundance

Tuesday, October 29, 2024Sunday, November 17, 2024

This Article is available as a video essay on YouTube

Stratechery has benefited from a Meta cheat code since its inception: wait for investors to panic, the stock to drop, and write an Article that says Meta is fine — better than fine even — and sit back and watch the take be proven correct. Notable examples include 2013’s post-IPO swoon, the 2018 Stories swoon, and most recently, the 2022 TikTok/Reels swoon (if you want a bonus, I was optimistic during the 2020 COVID swoon too):

Perhaps with that in mind I wrote a cautionary note earlier this year about Meta and Reasonable Doubt: while investors were concerned about the sustainability of Meta’s spending on AI, I was worried about increasing ad prices and the lack of new formats after Stories and then Reels; the long-term future, particularly in terms of the metaverse, was just as much of a mystery as always.

Six months on and I feel the exact opposite: it seems increasingly clear to me that Meta is in fact the most well-placed company to take advantage of generative AI. Yes, investors are currently optimistic, so this isn’t my usual contrarian take — unless you consider the fact that I think Meta has the potential to be the most valuable company in the world. As evidence of that fact I’m writing today, a day before Meta’s earnings: I don’t care if they’re up or down, because the future is that bright.

Short-term: Generative AI and Digital Advertising

Generative AI is clearly a big deal, but the biggest winner so far is Nvidia, in one of the clearest examples of the picks-and-shovels ethos on which San Francisco was founded: the most money to be made is in furnishing the Forty-niners (yes, I am using a linear scale instead of the log scale above for effect):

The big question weighing on investors’ minds is when all of this GPU spend will generate a return. Tesla and xAI are dreaming of autonomy; Azure, Google Cloud, AWS, and Oracle want to undergird the next generation of AI-powered startups; and Microsoft and Salesforce are bickering about how to sell AI into the enterprise. All of these bets are somewhat speculative; what would be the most valuable in the short-term, at least in terms of justifying the massive ongoing capital expenditure necessary to create the largest models, is a guaranteed means to translate those costs into bottom-line benefit.

Meta is the best positioned to do that in the short-term, thanks to the obvious benefit of applying generative AI to advertising. Meta is already highly reliant on machine learning for its ads product: right now an advertiser can buy ads based on desired outcomes, whether that be an app install or a purchase, and leave everything else up to Meta; Meta will work across their vast troves of data in a way that is only possible using machine learning-derived algorithms to find the right targets for an ad and deliver exactly the business goals requested.

What makes this process somewhat galling for the advertiser is that the more of a black box Meta’s advertising becomes the better the advertising results, even as Meta makes more margin. The big reason for the former is the App Tracking Transparency (ATT)-driven shift in digital advertising to probabilistic models in place of deterministic ones.

It used to be that ads shown to users could be perfectly matched to conversions made in 3rd-party apps or on 3rd-party websites; Meta was better at this than everyone else, thanks to its scale and fully built-out ad infrastructure (including SDKs in apps and pixels on websites), but this was a type of targeting and conversion tracking that could be done in some fashion by other entities, whether that be smaller social networks like Snap, ad networks, or even sophisticated marketers themselves.

ATT severed that link, and Meta’s business suffered greatly; from a February post-earnings Update:

It is worth noting that while the digital ecosystem did not disappear, it absolutely did shrink: [MoffettNathanson’s Michael] Nathanson, in his Meta earnings note, explained what he was driving at with that question:

While revenues have recovered, with +22% organic growth in the fourth quarter, we think that the more important driver of the outperformance has been the company’s focus on tighter cost controls. Coming in 2023, Meta CEO Mark Zuckerberg made a New Year’s resolution, declaring 2023 the “Year of Efficiency.” By remaining laser-focused on reining in expense growth as the top line reaccelerated, Meta’s operating margins (excluding restructuring) expanded almost +1,100 bps vs last 4Q, reaching nearly 44%. Harking back to Zuckerberg’s resolution, Meta’s 2023 was, in fact, highly efficient…

Putting this in perspective, two years ago, after the warnings on the 4Q 2021 earnings call, we forecasted that Meta Family of Apps would generate $155 billion of revenues and nearly $68 billion of GAAP operating income in 2023. Fast forward to today, and last night Meta reported that Family of Apps delivered only $134.3 billion of revenues ($22 billion below our 2-year ago estimate), yet FOA operating income (adjusted for one-time expenses) was amazingly in-line with that two-year old forecast. For 2024, while we now forecast Family of Apps revenues of $151.2 billion (almost $30 billion below the forecast made on February 2, 2022), our current all-in Meta operating profit estimate of $56.8 billion is also essentially in line. In essence, Meta has emerged as a more profitable (dare we say, efficient) business.

That shrunken revenue figure is digital advertising that simply disappeared — in many cases, along with the companies that bought it — in the wake of ATT. The fact that Meta responded by becoming so much leaner, though, was critical to not just surviving ATT, but also laid the groundwork for where the company is going next.

Increased company efficiency is a reason to be bullish on Meta, but three years on, the key takeaway from ATT is that it validated my thesis that Meta is anti-fragile. From 2020’s Apple and Facebook:

This is a very different picture from Facebook, where as of Q1 2019 the top 100 advertisers made up less than 20% of the company’s ad revenue; most of the $69.7 billion the company brought in last year came from its long tail of 8 million advertisers. This focus on the long-tail, which is only possible because of Facebook’s fully automated ad-buying system, has turned out to be a tremendous asset during the coronavirus slow-down…

This explains why the news about large CPG companies boycotting Facebook is, from a financial perspective, simply not a big deal. Unilever’s $11.8 million in U.S. ad spend, to take one example, is replaced with the same automated efficiency that Facebook’s timeline ensures you never run out of content. Moreover, while Facebook loses some top-line revenue — in an auction-based system, less demand corresponds to lower prices — the companies that are the most likely to take advantage of those lower prices are those that would not exist without Facebook, like the direct-to-consumer companies trying to steal customers from massive conglomerates like Unilever.

In this way Facebook has a degree of anti-fragility that even Google lacks: so much of its business comes from the long tail of Internet-native companies that are built around Facebook from first principles, that any disruption to traditional advertisers — like the coronavirus crisis or the current boycotts — actually serves to strengthen the Facebook ecosystem at the expense of the TV-centric ecosystem of which these CPG companies are a part.

Make no mistake, a lot of these kinds of companies were killed by ATT; the ones that survived, though, emerged into a world where no one other than Meta — thanks in part to a massive GPU purchase the same month the company reached its most-recent stock market nadir — had the infrastructure to rebuild the type of ad system they depended on. This rebuild had to be probabilistic — making a best guess as to the right target, and, more confoundingly, a best guess as to conversion — which is only workable with an astronomical amount of data and an astronomical amount of infrastructure to process that data, such that advertisers could once again buy based on promised results, and have those promises met.

Now into this cauldron Meta is adding generative AI. Advertisers have long understood the importance of giving platforms like Meta multiple pieces of creative for ads; Meta’s platform will test different pieces of creative with different audiences and quickly hone in on what works, putting more money behind the best arrow. Generative AI puts this process on steroids: advertisers can provide Meta with broad parameters and brand guidelines, and let the black box not just test out a few pieces of creative, but an effectively unlimited amount. Critically, this generative AI application has a verification function: did the generated ad generate more revenue or less? That feedback function, meanwhile, is data in its own right, and can be leveraged to better target individuals in the future.

The second piece to all of this — the galling part I referenced above — is the margin question. The Department of Justice’s lawsuit against Google’s ad business explains why black boxes are so beneficial to big ad platforms:

Over time, as Google’s monopoly over the publisher ad server was secured, Google surreptitiously manipulated its Google Ads’ bids to ensure it won more high-value ad inventory on Google’s ad exchange while maintaining its own profit margins by charging much higher fees on inventory that it expected to be less competitive. In doing so, Google was able to keep both categories of inventory out of the hands of rivals by competing in ways that rivals without similar dominant positions could not. In doing so, Google preserved its own profits across the ad tech stack, to the detriment of publishers. Once again, Google engaged in overt monopoly behavior by grabbing publisher revenue and keeping it for itself. Google called this plan “Project Bernanke.”

I’m skeptical about the DOJ’s case for reasons I laid out in this Update; publishers made more money using Google’s ad server than they would have otherwise, while the advertisers, who paid more, are not locked in. The black box effect, however, is real: platforms like Google or Meta can meet an advertiser’s goals — at a price point determined by an open auction — without the advertisers knowing which ads worked and which ones didn’t, keeping the margin from the latter. The galling bit is that this works out best for everyone: these platforms are absolutely finding customers you wouldn’t get otherwise, which means advertisers earn more when the platforms earn more too, and these effects will only be supercharged with generative ads.

There’s more upside for Meta, too. Google and Amazon will benefit from generative ads, but I expect the effect will be the most powerful at the top of the funnel where Meta’s advertising operates, as opposed to the bottom-of-the-funnel search ads where Amazon and Google make most of their money. Moreover, there is that long tail I mentioned above: one of the challenges for Meta in moving from text (Feed) to images (Stories) to video (Reels) is that effective creative becomes more difficult to execute, especially if you want multiple variations. Meta has devoted a lot of resources over the years to tooling to help advertisers make effective ads, much of which will be obviated by generative AI. This, by extension, will give long tail advertisers more access to more inventory, which will increase demand and ultimately increase prices.

There is one more channel that is exclusive to Meta: text-to-message ads. These are ads where the conversion event is initiating a chat with an advertiser, an e-commerce channel that is particularly popular in Asia. The distinguishing factor in the markets where these ads are taking off is low labor costs, which AI addresses. Zuckerberg explained in a 2023 earnings call:

And then the one that I think is going to have the fastest direct business loop is going to be around helping people interact with businesses. You can imagine a world on this where over time, every business has as an AI agent that basically people can message and interact with. And it’s going to take some time to get there, right? I mean, this is going to be a long road to build that out. But I think that, that’s going to improve a lot of the interactions that people have with businesses as well as if that does work, it should alleviate one of the biggest issues that we’re currently having around messaging monetization is that in order for a person to interact with a business, it’s quite human labor-intensive for a person to be on the other side of that interaction, which is one of the reasons why we’ve seen this take off in some countries where the cost of labor is relatively low. But you can imagine in a world where every business has an AI agent, that we can see the kind of success that we’re seeing in Thailand or Vietnam with business messaging could kind of spread everywhere. And I think that’s quite exciting.

Both of these use cases — generative ads and click-to-message AI agents — are great examples as to why it makes sense for Meta to invest in its Llama models and make them open(ish): more and better AI means more and better creative and more and better agents, all of which can be monetized via advertising.

Medium-Term: The Smiling Curve and Infinite Content

Of course all of this depends on people continuing to use Meta properties, and here AI plays an important role as well. First, there is the addition of Meta AI, which makes Meta’s apps more useful. Meta AI also opens the door to a search-like product, which The Information just reported the company was working on; potential search advertising is a part of the bull case as well, although for me a relatively speculative one.

Second is the insertion of AI content into the Meta content experience, which Meta just announced it is working on. From The Verge:

If you think avoiding AI-generated images is difficult as it is, Facebook and Instagram are now going to put them directly into your feeds. At the Meta Connect event on Wednesday, the company announced that it’s testing a new feature that creates AI-generated content for you “based on your interests or current trends” — including some that incorporate your face.

When you come across an “Imagined for You” image in your feed, you’ll see options to share the image or generate a new picture in real time. One example (embedded below) shows several AI-generated images of “an enchanted realm, where magic fills the air.” But others could contain your face… which I’d imagine will be a bit creepy to stumble upon as you scroll…

In a statement to The Verge, Meta spokesperson Amanda Felix says the platform will only generate AI images of your face if you “onboarded to Meta’s Imagine yourself feature, which includes adding photos to that feature” and accepting its terms. You’ll be able to remove AI images from your feed as well.

This sounds like a company crossing the Rubicon, but in fact said crossing already happened a few years ago. Go back to 2015’s Facebook and the Feed, where I argued that Facebook was too hung up on being a social network, and concluded:

Consider Facebook’s smartest acquisition, Instagram. The photo-sharing service is valuable because it is a network, but it initially got traction because of filters. Sometimes what gets you started is only a lever to what makes you valuable. What, though, lies beyond the network? That was Facebook’s starting point, and I think the answer to what lies beyond is clear: the entire online experience of over a billion people. Will Facebook seek to protect its network — and Zuckerberg’s vision — or make a play to be the television of mobile?

It took Facebook another five years — and the competitive threat of TikTok — but the company finally did make the leap to showing you content from across the entire service, not just that which was posted by your network. The latter was an artificial limitation imposed by the company’s own self-conception of itself as a social network, when in reality it is a content network; true social networking — where you talk to people you actually know — happens in group chats:

The structure of this illustration may look familiar; it’s another manifestation of The Smiling Curve, which I first wrote about in the context of publishing:

Over time, as this cycle repeats itself and as people grow increasingly accustomed to getting most of their “news” from Facebook (or Google or Twitter), value moves to the ends, just like it did in the IT manufacturing industry or smartphone industry:

On the right you have the content aggregators, names everyone is familiar with: Google ($369.7 billion), Facebook ($209.0 billion), Twitter ($26.4 billion), Pinterest (private). They are worth by far the most of anyone in this discussion. Traditional publishers, meanwhile, are stuck in the middle…publishers (all of them, not just newspapers) don’t really have an exclusive on anything anymore. They are Acer, offering the same PC as the next guy, and watching as the lion’s share of the value goes to the folks who are actually putting the content in front of readers.

It speaks to the inevitability of the smiling curve that it has even come for Facebook (which I wrote about in 2020’s Social Networking 2.0); moving to global content and purely individualized feeds unconstrained by your network was the aforementioned Rubicon crossing. The provenance of that content is a tactical question, not a strategic one.

To that end, I’ve heard whispers that these AI content tests are going extremely well, which raises an interesting financial question. One of Meta’s great strengths is that it gets its content for free from users. There certainly are costs incurred in personalizing your feed, but this is one of the rare cases where AI content is actually more expensive. It’s possible, though, that it simply is that much better and more engaging, in part because it is perfectly customized to you.

This leads to a third medium-term AI-derived benefit that Meta will enjoy: at some point ads will be indistinguishable from content. You can already see the outlines of that given I’ve discussed both generative ads and generative content; they’re the same thing! That image that is personalized to you just might happen to include a sweater or a belt that Meta knows you probably want; simply click-to-buy.

It’s not just generative content, though: AI can figure out what is in other content, including authentic photos and videos. Suddenly every item in that influencer photo can be labeled and linked — provided the supplier bought into the black box, of course — making not just every piece of generative AI a potential ad, but every piece of content period.

The market implications of this are profound. One of the oddities of analyzing digital ad platforms is that some of the most important indicators are counterintuitive; I wrote this spring:

The most optimistic time for Meta’s advertising business is, counter-intuitively, when the price-per-ad is dropping, because that means that impressions are increasing. This means that Meta is creating new long-term revenue opportunities, even as its ads become cost competitive with more of its competitors; it’s also notable that this is the point when previous investor freak-outs have happened.

When I wrote that I was, as I noted in the introduction, feeling more cautious about Meta’s business, given that Reels is built out and the inventory opportunities of Meta AI were not immediately obvious. I realize now, though, that I was distracted by Meta AI: the real impact of AI is to make everything inventory, which is to say that the price-per-ad on Meta will approach $0 for basically forever. Would-be competitors are finding it difficult enough to compete with Meta’s userbase and resources in a probabilisitic world; to do so with basically zero price umbrella seems all-but-impossible.

The Long-term: XR and Generative UI

Notice that I am thousands of words into this Article and, like Meta Myths, haven’t even mentioned VR or AR. Meta’s AI-driven upside is independent from XR becoming the platform of the future. What is different now, though, is that the likelihood of XR mattering feels dramatically higher than it did even six months ago.

The first one is obviously Orion, which I wrote about last month. Augmented reality is definitely going to be a thing — I would buy a pair of Meta’s prototypes now if they were for sale.

Once again, however, the real enabler will be AI. In the smartphone era, user interfaces started out being pixel perfect, and have gradually evolved into being declarative interfaces that scale to different device sizes. AI, however, will enable generative UI, where you are only presented with the appropriate UI to accomplish the specific task at hand. This will be somewhat useful on phones, and much more compelling on something like a smartwatch; instead of having to craft an interface for a tiny screen, generative UIs will surface exactly what you need when you need it, and nothing else.

Where this will really make a difference is with hardware like Orion. Smartphone UI’s will be clunky and annoying in augmented reality; the magic isn’t in being pixel perfect, but rather being able to do something with zero friction. Generative UI will make this possible: you’ll only see what you need to see, and be able to interact with it via neural interfaces like the Orion neural wristband. Oh, and this applies to ads as well: everything in the world will be potential inventory.

AI will have a similarly transformative effect on VR, which I wrote about back in 2022 in DALL-E, the Metaverse, and Zero Marginal Content. That article traced the evolution of both games and user-generated content from text to images to video to 3D; the issue is that games had hit a wall, given the cost of producing compelling 3D content, and that that challenge would only be magnified by the immersive nature of VR. Generative AI, though, will solve that problem:

In the very long run this points to a metaverse vision that is much less deterministic than your typical video game, yet much richer than what is generated on social media. Imagine environments that are not drawn by artists but rather created by AI: this not only increases the possibilities, but crucially, decreases the costs.

Here once again Meta’s advantages come to the fore: not only are they leading the way in VR with the Quest line of headsets, but they are also justified in building out the infrastructure necessary to generate metaverses — advertising included — because every part of their business benefits from AI.

From Abundance to Infinity

This was all a lot of words to explain the various permutations of an obvious truth: a world of content abundance is going to benefit the biggest content Aggregator first and foremost. Of course Meta needs to execute on all of these vectors, but that is where they also benefit from being founder-led, particularly given the fact that founder seems more determined and locked in than ever.

It’s also going to cost a lot of money, both in terms of training and inference. The inference part is inescapable: Meta may have a materially higher cost of revenue in the long run. The training part, however, has some intriguing possibilities. Specifically, Meta’s AI opportunities are so large and so central to the company’s future, that there is no question that Zuckerberg will spend whatever is necessary to keep pushing Llama forward. Other companies, however, with less obvious use cases, or more dependency on third-party development that may take longer than expected to generate real revenue, may at some point start to question their infrastructure spend, and wonder if it might make more sense to simply license Llama (this is where the “ish” part of “openish” looms large). It’s definitely plausible that Meta ends up being subsidized for building the models that give the company so much upside.

Regardless, it’s good to be back on the Meta bull train, no matter what tomorrow’s earnings say about last quarter or next year. Stratechery from the beginning has been focused on the implications of abundance and the companies able to navigate it on behalf of massive user bases — the Aggregators. AI takes abundance to infinity, and Meta is the purest play of all.

I wrote a follow-up to this Article in this Daily Update.

Elon Dreams and Bitter Lessons

Tuesday, October 15, 2024Thursday, October 31, 2024

This Article is available as a video essay on YouTube

In the days after SpaceX’s awe-inspiring Starship launch-and-catch — watch the first eight minutes of this video if you haven’t yet — there was another older video floating around on X, this time of Richard Bowles, a former executive at Arianespace, the European rocket company. The event was the Singapore Satellite Industry Forum, and the year was 2013:

This morning, SpaceX came along and said, “We foresee a launch costing $7 million”. Well, ok, let’s ignore the 7, let’s say $15 million…at $15 million every operator would change their gameplane completely. Every supplier would change their gameplan completely. We wouldn’t be building satellites exactly as we are today, so a lot of these questions I think it might be interesting to go on that and say, “Where do you see your companies if you’re going to compete with a $15 million launch program.” So Richard, where do you see your company competing with a $15 million launch?”…

RB: SpaceX is an interesting phenomenon. We saw it, and you just mentioned it, I thought it was $5 million or $7 million…

Why don’t you take Arianespace instead of SpaceX first. Where would you compete with a $15 million launch?

RB: I’ve got to talk about what I’m competing with, because that then predicates exactly how we will compete when we analyze what we are competing with. Obviously we like to analyze the competition.

So today, SpaceX hasn’t launched into the geosynchronous orbit yet, they’re doing very well, their progress is going forward amazingly well, but I’m discovering in the market is that SpaceX primarily seems to be selling a dream, which is good, we should all dream, but I think a $5 million launch, or a $15 million launch, is a bit of a dream. Personally I think reusability is a dream. Recently I was at a session where I was told that there was no recovery plan because they’re not going to have any failures, so I think that’s a part of the dream.

So at the moment, I feel that we’re looking, and you’re presenting to me, how am I going to respond to a dream? My answer to respond to a dream is that first of all, you don’t wake people up, they have to wake up on their own, and then once the market has woken up to the dream and the reality, then we’ll compete with that.

But they are looking at a price which is about half yours today.

RB: It’s a dream.

Alright. Suppose that you wake up and they’re there, what would you Arianespace do.

RB: We would have to react to it. They’re not supermen, so whatever they can do we can do. We would then have to follow. But today, at the moment…it is a theoretical question at this moment in time.

I personally don’t believe it’s going to be theoretical for that much longer. They’ve done everything almost they said they would do. That’s true.

The moderator ended up winning the day; in 2020 Elon Musk said on a podcast that the “best case” for Falcon 9 launches was indeed $15 million (i.e. most cost more, but that price point had been achieved). Of course customers pay a lot more: SpaceX charges a retail price of $67 million per launch, in part because it has no competition; Arianespace retired the Ariane 5 rocket, which had a retail launch price of $178 million, in 2023. Ariane 6 had its first launch this year, but it’s not price competitive, in part because it’s not reusable. From Politico:

The idea of copying SpaceX and making Ariane partly reusable was considered and rejected. That decision haunts France’s Economy Minister Bruno Le Maire. “In 2014 there was a fork in the road, and we didn’t take the right path,” Le Maire said in 2020.

But just because it works for Elon, doesn’t make it good for Europe. Once it’s up and running, Ariane 6 should have nine launches a year — of which around four will be for institutional missions, like government reconnaissance satellites and earth observation systems. The rest will be targeted at commercial clients.

Compare that to SpaceX. Fed by a steady stream of Pentagon and industry contracts, in addition to missions for its own Starlink satellite constellation, Musk’s company carried out a record 96 launches in 2023.

“It wasn’t that we just said reusability is bullshit,” said [former head the European Space Agency Jan] Wörner of the early talks around Ariane 6 in the mid-2010s, and the consideration of building reusable stages rather than burning through fresh components each mission. “If you have 10 flights per year and you are only building one new launcher per year then from an industrial point of view that’s not going to work.”

Wörner’s statement is like Bowles in the way in which it sees the world as static; Bowles couldn’t see ahead to a world where SpaceX actually figured out how to reuse rockets by landing them on drone ships, much less the version 2 example of catching a much larger rocket that we saw this weekend. Wörner, meanwhile, can’t see backwards: the reason why SpaceX has so much more volume, both from external customers and from itself (Starlink), is because it is cheap. Cheapness creates scale, which makes things even cheaper, and the ultimate output is entirely new markets.

The SpaceX Dream

Of course Bowles was right in another way: SpaceX is a dream. It’s a dream of going to Mars, and beyond, of extending humanity’s reach beyond our home planet; Arianespace is just a business. That, though, has been their undoing. A business carefully evaluates options, and doesn’t necessarily choose the highest upside one, but rather the one with the largest expected value, a calculation that incorporates the likelihood of success — and even then most find it prudent to hedge, or build in option value.

A dreamer, though, starts with success, and works backwards. In this case, Musk explained the motivation for driving down launch costs on X:

First off, this made it imperative that SpaceX find a way to launch a massively larger rocket that is fully recoverable, and doesn’t include the weight and logistics costs of the previous approach (this weekend SpaceX caught the Super Heavy booster; the next step is catching the Starship spacecraft that sits above it). Once SpaceX can launch massively larger rockets cheaply, though, it can start to do other things, like dramatically expand Starlink capability.

The next generation Starlink satellites, which are so big that only Starship can launch them, will allow for a 10X increase in bandwidth and, with the reduced altitude, faster latency https://t.co/HLYdjjia3o

— Elon Musk (@elonmusk) October 14, 2024

Starlink won’t be the only beneficiary; the Singapore moderator had it right back in 2013: everyone will change their gameplan completely, which will mean more business for SpaceX, which will only make things cheaper, which will mean even more business. Indeed, there is a window to rocketports that don’t have anything to do with Mars, but simply facilitate drastically faster transportation here on planet earth. The transformative possibilities of scale — and the dramatic decrease in price that follows — are both real and hard to imagine.

Tesla’s Robotaxi Presentation

The Starship triumph wasn’t the only Musk-related story of the week: last Thursday Tesla held its We, Robot event where it promised to unveil its Robotaxi, and observers were considerably less impressed. From Bloomberg:

Elon Musk unveiled Tesla Inc.’s highly anticipated self-driving taxi at a flashy event that was light on specifics, sending its stock sliding as investors questioned how the carmaker will achieve its ambitious goals. The chief executive officer showed off prototypes of a slick two-door sedan called the Cybercab late Thursday, along with a van concept and an updated version of Tesla’s humanoid robot. The robotaxi — which has no steering wheel or pedals — could cost less than $30,000 and “probably” will go into production in 2026, Musk said.

The product launch, held on a movie studio lot near Los Angeles, didn’t address how Tesla will make the leap from selling advanced driver-assistance features to fully autonomous vehicles. Musk’s presentation lacked technical details and glossed over topics including regulation or whether the company will own and operate its own fleet of Cybercabs. As Jefferies analysts put it, Tesla’s robotaxi appears “toothless.”

The underwhelming event sent Tesla’s shares tumbling as much as 10% Friday in New York, the biggest intraday decline in more than two months. They were down 7.6% at 12:29 p.m., wiping out $58 billion in market value. The stock had soared almost 70% since mid-April, largely in anticipation of the event. Uber Technologies Inc. and Lyft Inc., competing ride-hailing companies whose investors had been nervously awaiting the Cybercab’s debut, each surged as much as 11% Friday. Uber’s stock hit an all-time high.

Tesla has a track record of blowing past timelines Musk has offered for all manner of future products, and has had a particularly difficult time following through on his self-driving forecasts. The CEO told investors in 2019 that Tesla would have more than 1 million robotaxis on the road by the following year. The company hasn’t deployed a single autonomous vehicle in the years since.

First off, the shockingly short presentation — 22:44 from start to “Let’s get the party started” — was indeed devoid of any details about the Robotaxi business case. Secondly, all of the criticisms of Musk’s mistaken predictions about self-driving are absolutely true. Moreover, the fact of the matter is that Tesla is now far behind the current state-of-the-art, Waymo, which is in operation in four U.S. cities and about to start up in two more. Waymo has achieved Level 4 automation, while Tesla’s are stuck at Level 2. To review the levels of automation:

Level 0: Limited features that provide warnings and momentary assistance (i.e. automatic emergency braking)
Level 1: Steering or brake/acceleration automation (i.e. cruise control or lane centering)
Level 2: Steering and brake/acceleration control, which must be constantly supervised (i.e. hands-on-wheel)
Level 3: Self-driving that only operates under pre-defined conditions, and in which the driver must take control immediately when requested
Level 4: Self-driving that only operates under pre-defined conditions, under which the driver is not expected to take control
Level 5: Self-driving under all conditions, with no expectation of driver control

Waymo has two big advantages relative to Tesla: first, its cars have a dramatically more expansive sensor suite, including camera, radar, and LiDAR; the latter is the most accurate way to measure depth, which is particularly tricky for cameras and fairly imprecise for radar. Second, any Waymo car can be taken over by a remote driver any time it encounters a problem. This doesn’t happen often — once every 17,311 miles in sunny California last year — but it is comforting to know that there is a fallback.

The challenge is that both of these advantages cost money: LiDAR is the biggest reason why the Generation 5 Waymo’s on the streets of San Francisco cost a reported $200,000; Generation 6 has fewer sensors and should be considerably cheaper, and prices will come down as Waymo scales, but this is still a barrier. Humans in data centers, meanwhile, sitting poised to take over a car that encounters trouble, are not just a cost center but also a limit on scalability. Then again, higher cost structures are its own limitation on scalability; Waymos are awesome but they will need to get an order of magnitude cheaper to change the world.

The Autonomy Dream

What was notable about Musk’s Tesla presentation is what it actually did include. Start with that last point; Musk’s focus was on that changing the world bit:

You see a lot of sci-fi movies where the future is dark and dismal. It’s not a future you want to be in. I love Bladerunner, but I don’t know if we want that future. I think we want that duster he’s wearing, but not the bleak apocalypse. We want to have a fun, exciting future that if you could look in a crystal ball and see the future, you’d be like “Yes, I wish that I could be there now”. That’s what we want.

Musk proceeded to talk about having a lounge on wheels that gave you your time back and was safer to boot, and which didn’t need ugly parking lots; the keynote slides added parks to LAX and Sofi and Dodger Stadiums:

One of the things that is really interesting is how will this affect the cities that we live in. When you drive around a city, or the car drive you around the city, you see that there’s a lot of parking lots. There’s parking lots everywhere. There are parking garages. So what would happen if you have an autonomous world is that you can now turn parking lots into parks…there’s a lot of opportunity to create greenspace in the cities that we live in.

This is certainly an attractive vision; it’s also far beyond the world of Uber and Lyft or even Waymo, which are focused on solving the world as it actually exists today. That means dealing with human drivers, which means there will be parking lots for a long time to come. Musk’s vision is a dream.

What, though, would that dream require, if it were to come true? Musk said it himself: full autonomy provided by a fleet of low cost vehicles that make it silly — or prohibitively expensive, thanks to sky-rocketing insurance — for anyone to drive themselves. That isn’t Level 4, like Waymo, it’s Level 5, and, just as importantly, it’s cheap, because cheap drives scale and scale drives change.

Tesla’s strategy for “cheap” is well-known: the company eschews LiDAR, and removed radar from new models a few years ago, claiming that it would accomplish its goals using cameras alone.¹ Setting aside the viability of this claim, the connection to the the dream is clear: a cameras-only approach enables the low cost vehicles integral to Musk’s dream. Yes, Waymo equipment costs will come down with scale, but Waymo’s current approach is both safer in the present and also more limited in bringing about the future.

What many folks seemed to miss in Musk’s presentation was his explanation as to how Tesla — and only Tesla — might get there.

The Bitter Lesson

Rich Sutton wrote one of the most important and provocative articles about AI in 2019; it’s called The Bitter Lesson:

The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin. The ultimate reason for this is Moore’s law, or rather its generalization of continued exponentially falling cost per unit of computation. Most AI research has been conducted as if the computation available to the agent were constant (in which case leveraging human knowledge would be one of the only ways to improve performance) but, over a slightly longer time than a typical research project, massively more computation inevitably becomes available. Seeking an improvement that makes a difference in the shorter term, researchers seek to leverage their human knowledge of the domain, but the only thing that matters in the long run is the leveraging of computation. These two need not run counter to each other, but in practice they tend to. Time spent on one is time not spent on the other. There are psychological commitments to investment in one approach or the other. And the human-knowledge approach tends to complicate methods in ways that make them less suited to taking advantage of general methods leveraging computation. There were many examples of AI researchers’ belated learning of this bitter lesson, and it is instructive to review some of the most prominent.

The examples Sutton goes over includes chess, where search beat deterministic programming, and Go, where unsupervised learning did the same. In both cases bringing massive amounts of compute to bear was both simpler and more effective than humans trying to encode their own shortcuts and heuristics. The same thing happened with speech recognition and computer vision: deep learning massively outperforms any sort of human-guided algorithms. Sutton notes towards the end:

One thing that should be learned from the bitter lesson is the great power of general purpose methods, of methods that continue to scale with increased computation even as the available computation becomes very great. The two methods that seem to scale arbitrarily in this way are search and learning.

It’s a brilliant observation, to which I might humbly add one additional component: while the Bitter Lesson is predicated on there being an ever-increasing amount of compute, which reliably solves once-intractable problems, one of the lessons of LLMs is that you also need an ever-increasing amount of data. Existing models are already trained on all of the data AI labs can get their hands on, including most of the Internet, YouTube transcripts, scanned books, etc.; there is much talk about creating synthetic data, both from humans and from other LLMs, to ensure that scaling laws continue. The alternative is that we hit the so-called “data wall”.

LLMs, meanwhile, are commonly thought about in terms of language — it is in the name, after all — but what they actually predict are tokens, and tokens can be anything, including driving data. Timothy Lee explained some of Waymo’s research in this area at Understanding AI:

Any self-driving system needs an ability to predict the actions of other vehicles. For example, consider this driving scene I borrowed from a Waymo research paper:

Vehicle A wants to turn left, but it needs to do it without running into cars B or D. There are a number of plausible ways for this scene to unfold. Maybe B will slow down and let A turn. Maybe B will proceed, D will slow down, and A will squeeze in between them. Maybe A will wait for both vehicles to pass before making the turn. A’s actions depend on what B and D do, and C’s actions, in turn, depend on what A does.

If you are driving any of these four vehicles, you need to be able to predict where the other vehicles are likely to be one, two, and three seconds from now. Doing this is the job of the prediction module of a self-driving stack. Its goal is to output a series of predictions that look like this:

Researchers at Waymo and elsewhere struggled to model interactions like this in a realistic way. It’s not just that each individual vehicle is affected by a complex set of factors that are difficult to translate into computer code. Each vehicle’s actions depend on the actions of other vehicles. So as the number of cars increases, the computational complexity of the problem grows exponentially.

But then Waymo discovered that transformer-based networks were a good way to solve this kind of problem.

“In driving scenarios, road users may be likened to participants in a constant dialogue, continuously exchanging a dynamic series of actions and reactions mirroring the fluidity of communication,” Waymo researchers wrote in a 2023 research paper.

Just as a language model outputs a series of tokens representing text, Waymo’s vehicle prediction model outputs a series of tokens representing vehicle trajectories—things like “maintain speed and direction,” “turn 5 degrees left,” or “slow down by 3 mph”.

Rather than trying to explicitly formulate a series of rules for vehicles to follow (like “stay in your lane” and “don’t hit other vehicles”), Waymo trained the model like an LLM. The model learned the rules of driving by trying to predict the trajectories of human-driven vehicles on real roads.

This data-driven approach allowed the model to learn subtleties of vehicle interactions that are not described in any driver manual and would be hard to capture with explicit computer code.

This is not yet a panacea. Lee notes later in his article:

One big problem Sinavski noted is that Wayve hasn’t found a vision-language model that’s “really good at spatial reasoning.” If you’re a long-time reader of Understanding AI, you might remember when I asked leading LLMs to tell the time from an analog clock or solve a maze. ChatGPT, Claude, and Gemini all failed because today’s foundation models are not good at thinking geometrically.

This seems like it would be a big downside for a model that’s supposed to drive a car. And I suspect it’s why Waymo’s perception system isn’t just one big network. Waymo still uses traditional computer code to divide the driving scene up into discrete objects and compute a numerical bounding box for each one. This kind of pre-processing gives the prediction network a head start as it reasons about what will happen next.

Another concern is that the opaque internals of LLMs make them difficult to debug. If a self-driving system makes a mistake, engineers want to be able to look under the hood and figure out what happened. That’s much easier to do in a system like Waymo’s, where some of the basic data structures (like the list of scene elements and their bounding boxes) were designed by human engineers.

But the broader point here is that self-driving companies do not face a binary choice between hand-crafted code or one big end-to-end network. The optimal self-driving architecture is likely to be a mix of different approaches. Companies will need to learn the best division of labor from trial and error.

That sounds right, but for one thing: The Bitter Lesson. To go back to Sutton:

This is a big lesson. As a field, we still have not thoroughly learned it, as we are continuing to make the same kind of mistakes. To see this, and to effectively resist it, we have to understand the appeal of these mistakes. We have to learn the bitter lesson that building in how we think we think does not work in the long run. The bitter lesson is based on the historical observations that 1) AI researchers have often tried to build knowledge into their agents, 2) this always helps in the short term, and is personally satisfying to the researcher, but 3) in the long run it plateaus and even inhibits further progress, and 4) breakthrough progress eventually arrives by an opposing approach based on scaling computation by search and learning. The eventual success is tinged with bitterness, and often incompletely digested, because it is success over a favored, human-centric approach.

If The Bitter Lesson ends up applying to true Level 5 autonomy, then Waymo is already signed up for school. A “mix of different approaches” clearly works better now, and may for the next few years, but does it get them to Level 5? And what of the data, to the extent it is essential to the The Sweet Outcome of self-taught AI? This was the part of the Tesla presentation I referenced above:

One of the reasons why the computer can be so much better than a person is that we have millions of cars that are training on driving. It’s like living millions of lives simultaneously and seeing very unusal situations that a person in their entire lifetime would not see, hopefully. With that amount of training data, it’s obviously going to be much better than what a human could be, because you can’t live a million lives. It can also see in all directions simultaneously, and it doesn’t get tired or text or any of those things, so it will naturally be 10x, 20x, 30x safer than a human for all those reasons.

I want to emphasize that the solution that we have is AI and vision. There’s no expensive equipment needed. The Model 3 and Model Y and S and X that we make today will be capable of full autonomy unsupervised. And that means that our costs of producing the vehicle is low.

Again, Musk has been over-promising and under-delivering in terms of self-driving for existing Tesla owners for years now, so the jury is very much out on whether current cars get full unsupervised autonomy. But that doesn’t change the fact that those cars do have cameras, and those cameras are capturing data and doing fine-tuning right now, at a scale that Waymo has no way of matching. This is what I think Andrej Karpathy, the former Tesla Autopilot head, was referring to in his recent appearance on the No Priors podcast:

I think people think that Waymo is ahead of Tesla, I think personally Tesla is ahead of Waymo, and I know it doesn’t look like that, but I’m still very bullish on Tesla and it’s self-driving program. I think that Tesla has a software problem, and I think Waymo has a hardware problem, is the way I put it, and I think software problems are much easier. Tesla has deployment of all these cars on earth at scale, and I think Waymo needs to get there. The moment Tesla gets to the point where they can actually deploy and it actually works I think is going to be really incredible…

I’m not sure that people are appreciating that Tesla actually does use a lot of expensive sensors, they just do it at training time. So there a bunch of cars that drive around with LiDARS, they do a bunch of stuff that doesn’t scale and they have extra sensors etc., and they do mapping and all this stuff. You’re doing it at training time and then you’re distilling that into a test-time package that is deployed to the cars and is vision only. It’s like an arbitrage on sensors and expense. And so I think it’s actually kind of a brilliant strategy that I don’t think is fully appreciated, and I think is going to work out well, because the pixels have the information, and I think the network will be capable of doing that, and yes at training time I think these sensors are really useful, but I don’t think they are as useful at test time.

Do note that Karpathy — who worked at Tesla for five years — is hardly a neutral observer, and also note that he forecasts a fully neural net approach to driving as taking ten years; that’s hardly next year, as Musk promised. That end goal, though, is Level 5, with low cost sensors and thus low cost cars, the key ingredient of realizing the dream of full autonomy and the transformation that would follow.

The Cost of Dreams

I don’t, for the record, know if the Tesla approach is going to work; my experience with both Waymo and Tesla definitely makes clear that Waymo is ahead right now (and the disengagement numbers for Tesla are multiple orders of magnitude worse). Most experts assume that LiDAR sensors are non-negotiable in particular.

The Tesla bet, though, is that Waymo’s approach ultimately doesn’t scale and isn’t generalizable to true Level 5, while starting with the dream — true autonomy — leads Tesla down a better path of relying on nothing but AI, fueled by data and fine-tuning that you can only do if you already have millions of cars on the road. That is the connection to SpaceX and what happened this weekend: if you start with the dream, then understand the cost structure necessary to achieve that dream, you force yourself down the only path possible, forgoing easier solutions that don’t scale for fantastical ones that do.

Although there does appear to be an inactive radar module in recent Models S and X; radar is comparatively cheap, and is particularly useful in low visibility situations where both cameras and LiDAR struggle ↩

Enterprise Philosophy and The First Wave of AI

Tuesday, September 24, 2024Wednesday, October 9, 2024

This Article is available as a video essay on YouTube

The popular history of technology usually starts with the personal computer, and for good reason: that was the first high tech device that most people ever used. The only thing more impressive than the sheer audacity of “A computer on every desk and in every home” as a corporate goal, is the fact that Microsoft accomplished it, with help from its longtime competitor Apple.

In fact, though, the personal computer wave was the 2nd wave of technology, particularly in terms of large enterprises. The first wave — and arguably the more important wave, in terms of economic impact — was the digitization of back-end offices. These were real jobs that existed:

These are bookkeepers and and tellers at a bank in 1908; fast forward three decades and technology had advanced:

The caption for this 1936 Getty Images photo is fascinating:

The new system of maintaining checking accounts in the National Safety Bank and Trust Company, of New York, known as the “Checkmaster,” was so well received that the bank has had to increase its staff and equipment. Instead of maintaining a mininum balance, the depositor is charged a small service charge for each entry on his statement. To date, the bank has attracted over 30,000 active accounts. Bookkeepers are shown as they post entries on “Checkmaster” accounts.

It’s striking how the first response to a process change is to come up with a business model predicated on covering new marginal costs; only later do companies tend to consider the larger picture, like how low-marginal-cost checking accounts might lead to more business for the bank overall, the volume of which can be supported thanks to said new technology.

Jump ahead another three decades and the back office of a bank looked like this:

Now the image is color and all of the workers are women, but what is perhaps surprising is that this image is, despite all of the technological change that had happened to date, particularly in terms of typewriters and calculators, not that different from the first two.

However, this 1970 picture was one of the last images of its kind: by the time this picture was taken, Bank of America, where this picture was taken, was already well on its way to transitioning all of its accounting and bookkeeping to computers; most of corporate America soon followed, with the two primary applications being accounting and enterprise resource planning. Those were jobs that had been primarily done by hand; now they were done by computers, and the hands were no longer needed.

Tech’s Two Philosophies

In 2018 I described Tech’s Two Philosophies using the four biggest consumer tech companies: Google and Facebook were on one side, and Apple and Microsoft on the other.

In Google’s view, computers help you get things done — and save you time — by doing things for you. Duplex was the most impressive example — a computer talking on the phone for you — but the general concept applied to many of Google’s other demonstrations, particularly those predicated on AI: Google Photos will not only sort and tag your photos, but now propose specific edits; Google News will find your news for you, and Maps will find you new restaurants and shops in your neighborhood. And, appropriately enough, the keynote closed with a presentation from Waymo, which will drive you…

Zuckerberg, as so often seems to be the case with Facebook, comes across as a somewhat more fervent and definitely more creepy version of Google: not only does Facebook want to do things for you, it wants to do things its chief executive explicitly says would not be done otherwise. The Messianic fervor that seems to have overtaken Zuckerberg in the last year, though, simply means that Facebook has adopted a more extreme version of the same philosophy that guides Google: computers doing things for people.

Google and Facebook’s approach made sense given their position as Aggregators, which “attract end users by virtue of their inherent usefulness.” Apple and Microsoft, on the other hand, were platforms born in an earlier age of computing:

This is technology’s second philosophy, and it is orthogonal to the other: the expectation is not that the computer does your work for you, but rather that the computer enables you to do your work better and more efficiently. And, with this philosophy, comes a different take on responsibility. Pichai, in the opening of Google’s keynote, acknowledged that “we feel a deep sense of responsibility to get this right”, but inherent in that statement is the centrality of Google generally and the direct culpability of its managers. Nadella, on the other hand, insists that responsibility lies with the tech industry collectively, and all of us who seek to leverage it individually…

This second philosophy, that computers are an aid to humans, not their replacement, is the older of the two; its greatest proponent — prophet, if you will — was Microsoft’s greatest rival, and his analogy of choice was, coincidentally enough, about transportation as well. Not a car, but a bicycle:

You can see the outlines of this philosophy in these companies’ approaches to AI. Google is the most advanced, thanks to the way it saw the obvious application of AI to its Aggregation products, particularly search and advertising. Facebook, now Meta, has made major strides over the last few years as it has overhauled its recommendation algorithms and advertising products to also be probabilistic, both in response to the rise of TikTok in terms of customer attention, and the severing of their deterministic ad product by Apple’s App Tracking Transparency initiative. In both cases their position as Aggregators compelled them to unilaterally go out and give people stuff to look at.

Apple, meanwhile, is leaning heavily into Apple Intelligence, but I think there is a reason its latest ad campaign feels a bit weird, above-and-beyond the fact it is advertising a feature that is not yet available to non-beta customers. Apple is associated with jewel-like devices that you hold in your hand and software that is accessible to normal people; asking your phone to rescue a fish funeral with a slideshow feels at odds with Steve Jobs making a movie on stage during the launch of iMovie:

That right there is a man riding a bike.

Microsoft Copilot

Microsoft is meanwhile — to the extent you count their increasingly fraught partnership with OpenAI — in the lead technically as well. Their initial product focus for AI, however, is decidedly on the side of being a tool to, as their latest motto states, “empower every person and every organization on the planet to achieve more.”

CEO Satya Nadella said in the recent pre-recorded keynote announcing Copilot Wave 2:

You can think of Copilot as the UI for AI. It helps you break down these siloes between your work artifacts, your communications, and your business processes. And we’re just getting started. In fact, with scaling laws, as AI becomes more capable and even agentic, the models themselves become more of a commodity, and all the value gets created by you how steer, ground, fine-tune these models with your business data and workflow. And how it composes with the UI layer of human to AI to human interaction becomes critical.

Today we’re announcing Wave 2 of Microsoft 365 Copilot. You’ll see us evolve Copilot in three major ways: first, it’s about bringing the web plus work plus pages together as the new AI system for knowledge work. With Pages we’ll show you how Copilot can take any information from the web or your work and turn it into a multiplayer AI-powered canvas. You can ideate with AI and collaborate with other people. It’s just magical. Just like the PC birthed office productivity tools as we know them today, and the web makes those canvases collaborative, every platform shift has changed work artifacts in fundamental ways, and Pages is the first new artifact for the AI age.

Notice that Nadella, like most pop historians (including yours truly!), is reaching out to draw a link to the personal computer, but here the relevant personal computer history casts more shadow than light onto Nadella’s analogy. The initial wave of personal computers were little more than toys, including the Commodore 64 and TRS-80, sold in your local Radio Shack; the Apple I, released in 1976, was initially sold as a bare circuit board:

A year later Apple released the Apple II; now there was a case, but you needed to bring your own TV:

Two years later and Apple II had a killer app that would presage the movement of personal computers into the workplace: VisiCalc, the first spreadsheet.

VisiCalc’s utility for business was obvious — in fact, it was conceived of by Dan Bricklin while watching a lecture at Harvard Business School. That utility, though, was not about running business critical software like accounting or ERP systems; rather, an employee with an Apple II and VisiCalc could take the initiative to model their business and understand how things worked at a view grounded in a level of calculation that was both too much for one person, yet not sufficient to hire an army of backroom employees, or, increasingly at that point, reserve time on the mainframe.

Notice, though, how this aligned with the Apple and Microsoft philosophy of building tools: tools are meant to be used, but they take volition to maximize their utility. This, I think, is a challenge when it comes to Copilot usage: even before Copilot came out employees with initiative were figuring out how to use other AI tools to do their work more effectively. The idea of Copilot is that you can have an even better AI tool — thanks to the fact it has integrated the information in the “Microsoft Graph” — and make it widely available to your workforce to make that workforce more productive.

To put it another way, the real challenge for Copilot is that it is a change management problem: it’s one thing to charge $30/month on a per-seat basis to make an amazing new way to work available to all of your employees; it’s another thing entirely — a much more difficult thing — to get all of your employees to change the way they work in order to benefit from your investment, and to make Copilot Pages the “new artifact for the AI age”, in line with the spreadsheet in the personal computer age.

Clippy and Copilot

Salesforce CEO Marc Benioff was considerably less generous towards Copilot in last week’s Dreamforce keynote. After framing machine learning as “Wave 1” of AI, Benioff said that Copilots were Wave 2, and from Microsoft’s perspective it went downhill from there:

We moved into this Copilot world, but the Copilot world has been kind of a hit-and-miss world. The Copilot world where customers have said to us “Hey, I got these Copilots but they’re not exactly performing as we want them to. We don’t see how that Copilot world is going to get us to the real vision of artificial intelligence of augmentation of productivity, of better business results that we’ve been looking for. We just don’t see Copilot as that key step for our future.” In some ways, they kind of looked at Copilot as the new Microsoft Clippy, and I get that.

The Clippy comparison was mean but not entirely unfair, particularly in the context of users who don’t know enough to operate with volition. Former Microsoft executive Steven Sinofsky explained in Hard Core Software:

Why was Microsoft going through all this and making these risky, or even edgy, products? Many seemed puzzled by this at the time. In order to understand that today, one must recognize that using a PC in the early 1990s (and before) was not just difficult, but it was also confusing, frustrating, inscrutable, and by and large entirely inaccessible to most everyone unless you had to learn how to use one for work.

Clippy was to be a replacement for the “Office guru” people consulted when they wanted to do things in Microsoft Office that they knew were possible, but were impossible to discover; Sinofsky admits that a critical error was making Clippy too helpful with simple tasks, like observing “Looks like you’re trying to write a letter” when you typed “Dear John” and hit return. Sinofsky reflected:

The journey of Clippy (in spite of our best efforts that was what the feature came to be called) was one that parallels the PC for me in so many ways. It was not simply a failed feature, or that back-handed compliment of a feature that was simply too early like so many Microsoft features. Rather Clippy represented a final attempt at trying to fix the desktop metaphor for typical or normal people so they could use a computer.

What everyone came to realize was that the PC was a generational change and that for those growing up with a PC, it was just another arbitrary and random device in life that one just used. As we would learn, kids didn’t need different software. They just needed access to a PC. Once they had a PC they would make cooler, faster, and more fun documents with Office than we were. It was kids that loved WordArt and the new graphics in Word and PowerPoint, and they used them easily and more frequently than Boomers or Gen X trying to map typewriters to what a computer could do.

It was not the complexity that was slowing people down, but the real concern that the wrong thing could undo hours of work. Kids did not have that fear (yet). We needed to worry less about dumbing the software down and more about how more complex things could get done in a way that had far less risk.

This is a critical insight when it comes to AI, Copilot, and the concept of change management: a small subset of Gen Xers and Boomers may have invented the personal computer, but for the rest of their cohort it was something they only used if they had to (resentfully), and only then the narrow set of functionality that was required to do their job. It was the later generations that grew up with the personal computer, and hardly give inserting a table or graphic into a document a second thought (if, in fact, they even know what a “document” is). For a millenial using a personal computer doesn’t take volition; it’s just a fact of life.

Again, though, computing didn’t start with the personal computer, but rather with the replacement of the back office. Or, to put it in rather more dire terms, the initial value in computing wasn’t created by helping Boomers do their job more efficiently, but rather by replacing entire swathes of them completely.

Agents and o1

Benioff implicitly agrees; the Copilot Clippy insult was a preamble to a discussion of agents:

But it was pushing us, and they were trying to say, what is the next step? And we are now really at that moment. That is why this show is our most important Dreamforce ever. There’s no question this is the most exciting Dreamforce and the most important Dreamforce. What you’re going to see at this show is technology like you have never seen before…The first time you build and deploy your first autonomous agent for your company that is going to help you to be more productive, to augment your employees, and to get these better business results, you’re going to remember that like the first time it was in y our Waymo. This is the 3rd wave of AI. It’s agents…

Agents aren’t copilots; they are replacements. They do work in place of humans — think call centers and the like, to start — and they have all of the advantages of software: always available, and scalable up-and-down with demand.

We know that workforces are overwhelmed. They’re doing these low-value tasks. They’ve got kind of a whole different thing post-pandemic. Productivity is at a different place. Capacity is at a different place…we do see that workforces are different, and we realize that 41% of the time seems to be wasted on low value and repetitive tasks, and we want to address that. The customers are expecting more: zero hold times, to be more personal and empathetic, to work with an expert all the time, to instantly schedule things. That’s our vision, our dream for these agents…

What if these workforces had no limits at all? Wow. That’s kind of a strange thought, but a big one. You start to put all of these things together, and you go, we can kind of build another kind of company. We can build a different kind of technology platforms. We can take the Salesforce technology platform that we already have, and that all of you have invested so much into, the Salesforce Platform, and we can deliver the next capability. The next capability that’s going to make our companies more productive. To make our employees more augmented. And just to deliver much better business results. That is what Agentforce is.

This Article isn’t about the viability of Agentforce; I’m somewhat skeptical, at least in the short term, for reasons I will get to in a moment. Rather, the key part is the last few sentences: Benioff isn’t talking about making employees more productive, but rather companies; the verb that applies to employees is “augmented”, which sounds much nicer than “replaced”; the ultimate goal is stated as well: business results. That right there is tech’s third philosophy: improving the bottom line for large enterprises.

Notice how well this framing applies to the mainframe wave of computing: accounting and ERP software made companies more productive and drove positive business results; the employees that were “augmented” were managers who got far more accurate reports much more quickly, while the employees who used to do that work were replaced. Critically, the decision about whether or not make this change did not depend on rank-and-file employees changing how they worked, but for executives to decide to take the plunge.

The Consumerization of IT

When Benioff founded Salesforce in 1999, he came up with a counterintuitive logo:

Of course Salesforce was software; what it was not was SOFTWARE, like that sold by his previous employer Oracle, which at that time meant painful installation and migrations that could take years, and even then would often fail. Salesforce was different: it was a cloud application that you never needed to install or update; you could simply subscribe.

Cloud-based software-as-a-service companies are the norm now, thanks in part to Benioff’s vision. And, just as Salesforce started out primarily serving — you guessed it! — sales forces, SaaS applications can focus on individual segments of a company. Indeed, one of the big trends over the last decade were SaaS applications that grew, at least in the early days, through word-of-mouth and trialing by individuals or team leaders; after all, all you needed to get started was a credit card — and if there was a freemium model, not even that!

This trend was part of a larger one, the so-called “consumerization of IT”. Douglas Neal and John Taylor, who first coined the term in 2001, wrote in a 2004 Position Paper:

Companies must treat users as consumers, encouraging employee responsibility, ownership and trust by providing choice, simplicity and service. The parent/child attitude that many IT departments have traditionally taken toward end users is now obsolete.

This is actually another way of saying what Sinofsky did: enterprise IT customers, i.e. company employees, no longer needed to be taught how to use a computer; they grew up with them, and expected computers to work the same way their consumer devices did. Moreover, the volume of consumer devices meant that innovation would now come from that side of technology, and the best way for enterprises to keep up would be to ideally adopt consumer infrastructure, and barring that, seek to be similarly easy-to-use.

It’s possible this is how AI plays out; it is what has happened to date, as large models like those built by OpenAI or Anthropic or Google or Meta are trained on publicly available data, and then are available to be fine-tuned for enterprise-specific use cases. The limitation in this approach, though, is the human one: you need employees who have the volition to use AI with the inherent problems introduced by this approach, including bad data, hallucinations, security concerns, etc. This is manageable as long as a motivated human is in the loop; what seems unlikely to me is any sort of autonomous agent actually operating in a way that makes a company more efficient without an extensive amount of oversight that ends up making the entire endeavor more expensive.

Moreover, in the case of Agentforce specifically, and other agent initiatives more generally, I am unconvinced as to how viable and scalable the infrastructure necessary to manage auto-regressive large language models will end up being. I got into some of the challenges in this Update:

The big challenge for traditional LLMs is that they are path-dependent; while they can consider the puzzle as a whole, as soon as they commit to a particular guess they are locked in, and doomed to failure. This is a fundamental weakness of what are known as “auto-regressive large language models”, which to date, is all of them.

To grossly simplify, a large language model generates a token (usually a word, or part of a word) based on all of the tokens that preceded the token being generated; the specific token is the most statistically likely next possible token derived from the model’s training (this also gets complicated, as the “temperature” of the output determines what level of randomness goes into choosing from the best possible options; a low temperature chooses the most likely next token, while a higher temperature is more “creative”). The key thing to understand, though, is that this is a serial process: once a token is generated it influences what token is generated next.

The problem with this approach is that it is possible that, in the context of something like a crossword puzzle, the token that is generated is wrong; if that token is wrong, it makes it more likely that the next token is wrong too. And, of course, even if the first token is right, the second token could be wrong anyways, influencing the third token, etc. Ever larger models can reduce the likelihood that a particular token is wrong, but the possibility always exists, which is to say that auto-regressive LLMs inevitably trend towards not just errors but compounding ones.

Note that these problems exist even with specialized prompting like insisting that the LLM “go step-by-step” or “break this problem down into component pieces”; they are still serial output machines that, once they get something wrong, are doomed to deliver an incorrect answer. At the same time, this is also fine for a lot of applications, like writing; where the problem manifests itself is with anything requiring logic or iterative reasoning. In this case, a sufficiently complex crossword puzzle suffices.

That Update was about OpenAI’s new o1 model, which I think is a step change in terms of the viability of agents; the example I used in that Update was solving a crossword puzzle, which can’t be done in one-shot — but can be done by o1.

o1 is explicitly trained on how to solve problems, and second, o1 is designed to generate multiple problem-solving streams at inference time, choose the best one, and iterate through each step in the process when it realizes it made a mistake. That’s why it got the crossword puzzle right — it just took a really long time.

o1 introduces a new vector of potential improvement: while auto-regressive LLMs scaled in quality with training set size (and thus the amount of compute necessary), o1 scales inference. This image is from OpenAI’s announcement page:

This second image is a potential problem in a Copilot paradigm: sure, a smarter model potentially makes your employees more productive, but those increases in productivity have to be balanced by both greater inference costs and more time spent waiting for the model (o1 is significantly slower than a model like 4o). However, the agent equation, where you are talking about replacing a worker, is dramatically different: there the cost umbrella is absolutely massive, because even the most expensive model is a lot cheaper, above-and-beyond the other benefits like always being available and being scalable in number.

More importantly, scaling compute is exactly what the technology industry is good at. The one common thread from Wave 1 of computing through the PC through SaaS and consumerization of IT, is that problems gated by compute are solved not via premature optimizations but via the progression of processing power. The key challenge is knowing what to scale, and I believe OpenAI has demonstrated the architecture that will benefit from exactly that.

Data and Palantir

That leaves the data piece, and while Benioff bragged about all of the data that Salesforce had, it doesn’t have everything, and what it does have is scattered across the phalanx of applications and storage layers that make up the Salesforce Platform. Indeed, Microsoft faces the same problem: while their Copilot vision includes APIs for 3rd-party “agents” — in this case, data from other companies — the reality is that an effective Agent — i.e. a worker replacement — needs access to everything in a way that it can reason over. The ability of large language models to handle unstructured data is revolutionary, but the fact remains that better data still results in better output; explicit step-by-step reasoning data, for example, is a big part of how o1 works.

To that end, the company I am most intrigued by, for what I think will be the first wave of AI, is Palantir. I didn’t fully understand the company until this 2023 interview with CTO Shyam Sankar and Head of Global Commercial Ted Mabrey; I suggest reading or listening to the whole thing, but I wanted to call out this exchange in particular:

Was there an aha moment where you have this concept — you use this phrase now at the beginning of all your financial reports, which is that you’re the operating system for enterprises. Now, obviously this is still the government era, but it’s interesting the S-1 uses that line, but it’s further down, it’s not the lead thing. Was that something that emerged later or was this that, “No, we have to be the interface for everything” idea in place from the beginning?

Shyam Sankar: I think the critical part of it was really realizing that we had built the original product presupposing that our customers had data integrated, that we could focus on the analytics that came subsequent to having your data integrated. I feel like that founding trauma was realizing that actually everyone claims that their data is integrated, but it is a complete mess and that actually the much more interesting and valuable part of our business was developing technologies that allowed us to productize data integration, instead of having it be like a five-year never ending consulting project, so that we could do the thing we actually started our business to do.

That integration looks like this illustration from the company’s webpage for Foundry, what they call “The Ontology-Powered Operating System for the Modern Enterprise”:

What is notable about this illustration is just how deeply Palantir needs to get into an enterprise’s operations to achieve its goals. This isn’t a consumery-SaaS application that your team leader puts on their credit card; it is SOFTWARE of the sort that Salesforce sought to move beyond.

If, however, you believe that AI is not just the next step in computing, but rather an entirely new paradigm, then it makes sense that enterprise solutions may be back to the future. We are already seeing that that is the case in terms of user behavior: the relationship of most employees to AI is like the relationship of most corporate employees to PCs in the 1980s; sure, they’ll use it if they have to, but they don’t want to transform how they work. That will fall on the next generation.

Executives, however, want the benefit of AI now, and I think that benefit will, like the first wave of computing, come from replacing humans, not making them more efficient. And that, by extension, will mean top-down years-long initiatives that are justified by the massive business results that will follow. That also means that go-to-market motions and business models will change: instead of reactive sales from organic growth, successful AI companies will need to go in from the top. And, instead of per-seat licenses, we may end up with something more akin to “seat-replacement” licenses (Salesforce, notably, will charge $2 per call completed by one of its agents). Services and integration teams will also make a comeback. It’s notable that this has been a consistent criticism of Palantir’s model, but I think that comes from a viewpoint colored by SaaS; the idea of years-long engagements would be much more familiar to tech executives and investors from forty years ago.

Enterprise Philosophy

Most historically-driven AI analogies usually come from the Internet, and understandably so: that was both an epochal change and also much fresher in our collective memories. My core contention here, however, is that AI truly is a new way of computing, and that means the better analogies are to computing itself. Transformers are the transistor, and mainframes are today’s models. The GUI is, arguably, still TBD.

To the extent that is right, then, the biggest opportunity is in top-down enterprise implementations. The enterprise philosophy is older than the two consumer philosophies I wrote about previously: its motivation is not the user, but the buyer, who wants to increase revenue and cut costs, and will be brutally rational about how to achieve that (including running expected value calculations on agents making mistakes). That will be the only way to justify the compute necessary to scale out agentic capabilities, and to do the years of work necessary to get data in a state where humans can be replaced. The bottom line benefits — the essence of enterprise philosophy — will compel just that.

And, by extension, we may be waiting longer than we expect for AI to take over the consumer space, at least at the scale of something like the smartphone or social media. That is good news for the MKBHDs of everything — the users with volition — but for everyone else the biggest payoff will probably be in areas like entertainment and gaming. True consumerization of AI will be left to the next generation who will have never known a world without it.

Boomer Apple

Tuesday, September 10, 2024Sunday, September 29, 2024

This Article is available as a video essay on YouTube

I was born in 1980, which technically makes me a part of Generation X; there is at least one data point, though, that makes me a millennial. In 2021 Fast Company conducted a survey about what constituted middle age; millennials defined it as 35 to 50, while Generation X said 45 to 55 (Baby Boomers said 45 to 60).

For my part, I wrote that Apple was in its middle age in 2018, when the company was 42 years old, a solidly millennial definition. My argument then was that Apple was increasingly — and rightly — behaving like an incumbent in a market that was no longer growing. That meant that future iPhone growth would come from (1) more expensive iPhones, (2) selling more devices to existing iPhone users (AirPods, Watches, etc.), and (3) accelerating the revenue it derived from services for iPhone users.

That is exactly what the company did over the ensuing few years, but 2022 brought a bit of a surprise: Apple, in the face of the worst inflation in a generation, declined to raise the price of the iPhone, which in real terms meant an iPhone Pro, nominally priced at $999, was $116 cheaper than it had been two years prior.

I thought this was a big deal at the time; I wrote in The Services iPhone:

This doesn’t make much sense for the product company Apple has always been thought to be, and doesn’t fully align with the approach I laid out in Apple’s Middle Age. It does, though, make all kinds of sense for a services company, which is focused first-and-foremost on increasing its install base. Indeed, this is the missing piece from that Update I wrote about Apple’s changing metrics. To measure its business based on users, not products, was to measure like a services company; to lower the prices of the products that lead to services revenue is to price like one.

Here’s another aspect of getting old: it’s hard to let go of the preconceptions of your youth. In yesterday’s Update I wrote that I believed that Apple would surely raise prices this year; that $999 iPhone Pro is now $177 cheaper than in 2020 in real terms, and given the fact that Apple was increasing the bill of materials on the lower end phones in particular (bumping them to 8GB RAM, and onto the latest TSMC process), surely they wouldn’t accept the hit on margins on top of the loss in value of their longstanding iPhone pricing.

	2020	2021	2022	2023	2024
2 year old iPhone	$499	$477	$441	$422	$411
1 year old iPhone	$599	$572	$530	$507	$493
New iPhone	$799	$763	$707	$676	$657
iPhone Pro	$999	$954	$883	$846	$822
iPhone Pro Max	$1,099	$1,050	$972	$1,015	$904

In fact, though, I had it right in 2022: Apple held the line on prices once again; I should have realized the company — like myself! — really is in a different stage.

The iPhone Event

The lack of a price increase for the iPhone 16 Pro made more sense when I watched Apple’s presentation; I found the updates over the iPhone 15 Pro to be pretty underwhelming. The A18 Pro chip is on TSMC’s newest 3nm process, there is a new Camera Control button, and the screen is a bit bigger with bezels that are a bit smaller; that’s really about it from a hardware perspective, although as always, Apple continues to push the envelope with computational photography. And, frankly, that’s fine: last year’s iPhone Pro 15, the first with titanium and USB-C, was for me the iPhone I had been waiting for (and most customers don’t upgrade every year, so these and other previous updates will be new features for them).

What I find much more compelling — and arguably the best deal in iPhone history — is the $799 iPhone 16 (non-Pro). The A18 chip appears to be a binned version of the A18 Pro (there is one less GPU and smaller caches), while the aforementioned bump to 8GB of RAM — necessary to run Apple Intelligence — matches the iPhone 16 Pro. There is one fewer camera, but the two-camera system that remains has been reconfigured to a much more aesthetically pleasing pill shape that has the added bonus of making it possible to record spatial video when held horizontally. The screen isn’t quite as good, and the bezels are a bit larger, but the colors are more fun. It’s a great phone, and the closest the regular iPhone has been to the Pro since the line separated in 2017.

It’s also the more important phone when it comes to Apple’s long-term addressable market. The non-pro iPhones are the ones that stay on sale for years (the iPhone 14 just received its expected price drop to $599); one potential consideration for Apple in terms of price is that it wants to move the 8GB RAM iPhone 16 down the line sooner rather than later; a price raise, if it meant keeping a 6GB RAM iPhone for sale one year longer, could be bad for Apple Intelligence.

Apple Intelligence, meanwhile, is the great hope when it comes to driving new iPhone sales: Apple shareholders are hoping for an AI-driven “supercycle”, when consumers, eager for Apple Intelligence features, update their iPhones to one of the latest models with sufficient RAM to run Apple’s new tentpole offering.

Apple’s Crossing Lines

Notice, though, that this hope itself speaks to how Apple is at a different stage in life: the big hardware changes this year are at the low end; one of the takeaways from AI is that we are well and truly into the age of software-driven differentiation. This is hardly a surprise; I wrote in 2016’s Everything as a Service:

Apple has arguably perfected the manufacturing model: most of the company’s corporate employees are employed in California in the design and marketing of iconic devices that are created in Chinese factories built and run to Apple’s exacting standards (including a substantial number of employees on site), and then transported all over the world to consumers eager for best-in-class smartphones, tablets, computers, and smartwatches.

What makes this model so effective — and so profitable — is that Apple has differentiated its otherwise commoditizable hardware with software. Software is a completely new type of good in that it is both infinitely differentiable yet infinitely copyable; this means that any piece of software is both completely unique yet has unlimited supply, leading to a theoretical price of $0. However, by combining the differentiable qualities of software with hardware that requires real assets and commodities to manufacture, Apple is able to charge an incredible premium for its products.

The results speak for themselves: this past “down” quarter saw Apple rake in $50.6 billion in revenue and $10.5 billion in profit. Over the last nine years the iPhone alone has generated $600 billion in revenue and nearly $250 billion in gross profit. It is probably the most valuable — the “best”, at least from a business perspective — manufactured product of all time.

In the eight years since then the iPhone has driven another $1.4 trillion in revenue and something around $625 billion in profit;¹ it is still tops in the company for both. It is software, though, specifically services revenue that doesn’t depend on how good the underlying hardware is, that is 2nd place in terms of revenue:

The comparison is even more striking when you look at profit (again, with the assumption that iPhones have a 45% gross profit margin):

Jason Snell already noted the impending convergence of the overall product and services profit lines at Six Colors:

But what about the bottom line? While Apple’s Services gross margin was 74% last quarter, products was only a measly 35%. (I’m kidding—35% margin on hardware is actually a really great number. It just can’t compare to Services, because hardware has some fundamental costs that services just don’t.)

Let’s look at total profits:

Last quarter, Apple made about $22 billion in profit from products and $18 billion from Services. It’s the closest those two lines have ever come to each other.

This is what was buzzing in the back of my head as I was going over all the numbers on Thursday. We’re not quite there yet, but it’s hard to imagine that there won’t be a quarter in the next year or so in which Apple reports more total profit on Services than on products.

When that happens, is Apple still a products company? Or has it crossed some invisible line?

Snell concludes his piece by imploring Apple to “remember where you came from”; hardware is what makes the whole thing work. I’m sympathetic to Snell’s case, because Apple the integrated hardware company is both an entity I admire and one that — as evidenced by my mistaken prediction yesterday — still looms large in my mind. Moreover, you can can (and should) make the case that Services revenue and profit is just an accounting distinction for what is just more hardware revenue and profit, because it is the latter that unlocks the former.

That, though, is why I broke out the iPhone specifically: I think the invisible line Snell talks about has already been crossed. Yes, next quarter’s iPhone numbers will jump back up thanks to seasonality, but this is now three straight years of Apple either favoring its more reliably growing Services business (by increasing the iPhone’s addressable market by lowering real prices) or admitting that it doesn’t have the pricing power that drove up product revenue in their middle age.

Boomer Apple

This is not, to be clear, an “Apple is Doomed” post; Apple hardware is still very good, and in lots of small ways that go beyond new buttons. I’ve been exploring the Android market over the past few weeks and there is a level of “it just works” that still differentiates the iPhone, particularly in terms of overall system integration (relative to the Pixel, which has a poor modem and problematic battery life) and worldwide reliability (relative to Samsung devices that are endlessly segmented in weird ways across markets, a hangup for me in particular).

Moreover, it’s hardly a condemnation for hardware to become “good enough”, and to decrease in price over time; that has been the path of basically all consumer electronics.

It’s impressive that Apple was innovative enough to resist gravity for as long as it did.

And, of course, there is the lock-in endemic to software: iMessage only works on iPhones,² there are all of the apps you have already purchased, and, the muscle memory of years in a particular ecosystem. There are Watches and AirPods that only work or work best with iPhones, and the deep integration with Macs. Apple’s chips remain best-of-breed, and the company’s investment in privacy may pay off in the AI era as people trust Apple to leverage their personal data on device and in their new private cloud.

That, though, is the rub: Craig Federighi, Apple’s software chief, usually doesn’t appear at the iPhone event; his domain is the Worldwide Developers Conference in June, and yet here he was in front of the camera for ten minutes.

Software, specifically AI, is what will drive differentiation going forward, and even in the best case scenario, where Apple’s AI efforts are good enough to keep people from switching to Google, the economics of software development push towards broad availability on every iPhone, not special features for people willing to pay a bit more. It’s as if the iPhone, which started out as one model for everyone before bifurcating into the Pro and regular (and SE) lines, is coming full circle to standardization; the difference, though, is its value is harvested through R&D intensive services that need to run everywhere, instead of unit profits driven by hardware differentiation.

So what stage of life is this, where you are banking on your earlier investments for your moat, and having no choice but to strive for good enough on the hot new trends that don’t quite fit your model? Maybe it’s the age where you’re worried about things like irregular heart beats, sleep apnea, or hearing loss. I’m not meaning to make fun of the health features that are an ever larger focus of these events; these are incredible applications of technology that Apple is right to brag about. Indeed, these features make me want to wear an Apple Watch or use AirPods.

But I’m old, and so enamored of the companies of my youth that I ignored my own (correct!) analysis of Apple’s shift in a desire to believe that this was still a product company that could wow me and charge me like it did in 2020. This is a services company now; the hardware is necessary, but insufficient for long-term growth. So it goes.

I am assuming 45% profit margins for the iPhone. Many iPhone models have reported gross profit margins of 50% or more, but those reports are based on the physical components of the phone, and do not include the attributed cost of software; therefore I am using 45% as a best guess to be conservative in terms of the following argument, but I would guess the true margin, including fully attributed R&D costs, is closer to 40% ↩
Although this can be worked around using services like AirMessage ↩

Intel Honesty

Tuesday, September 3, 2024Sunday, September 15, 2024

This Article is available as a video essay on YouTube

It really is a valley:

There, right in the middle of the Santa Clara Valley, formed by the Santa Cruz Mountains on the west and the Diablo Range on the east, lies the once-sleepy city of Mountain View. Mountain View was dominated by the U.S. Navy’s Moffett Field in 1955, when William Shockley, one of the inventors of the transistor at Bell Labs, returned to neighboring Palo Alto to care for his ailing mother.

Convinced that silicon was a superior material for transistors — Bell Labs was focused on germanium — Shockley, unable to hire many Bell Labs co-workers both because of the distance from New Jersey and also his abusive management style, set up the Shockley Semiconductor Laboratory in 1956 in Mountain View with a collection of young scientists. Only a year later eight of those scientists, led by Robert Noyce and Gordon Moore, fled Shockley — he really was a terrible manager — and set up Fairchild Semiconductor, a new division of Fairchild Camera and Instrument, in neighboring Sunnyvale.

It was Fairchild Semiconductor that gave the tech industry’s home the other half of its name: yes, we talk about “The Valley”, but at least when it comes to tech, we mean Silicon Valley. From TechCrunch in 2014:

As Fairchild started to grow, employees began to leave the firm to launch new spin-off businesses. Many of these firms also grew quickly, inspiring other employees still working at the company…The growth of these new companies started to reshape the region. In just 12 years, the co-founders and former employees of Fairchild generated more than 30 spin-off companies and funded many more. By 1970, chip businesses in the San Francisco area employed a total of 12,000 people…

The achievements of these companies eventually attracted attention. In 1971, a journalist named Don Hoefler wrote an article about the success of computer chip companies in the Bay Area. The firms he profiled all produced chips using silicon and were located in a large valley south of San Francisco. Hoefler put these two facts together to create a new name for the region: Silicon Valley.

Hoefler’s article and the name he coined have become quite famous, but there’s a critical part of his analysis that is often overlooked: Almost all of the silicon chip companies he profiled can be traced back to Fairchild and its co-founders.

Still, for all of the massive success downstream from Fairchild Semiconductor, none mattered more, or came to define Silicon Valley in every respect, than Intel. Arthur Rock, who had helped the so-called “Traitorous Eight” find Fairchild Camera and Instrument, funded Intel, and in the process created the compensation structure that came to define Silicon Valley. Gordon Moore wrote the roadmap for Intel — now more commonly known as Moore’s Law — which “predicted” that the number of transistors would double at a set rate, both increasing compute speed and driving down prices for that compute; “predict” is in quotes because Moore’s Law was not a physical law, but an economic one, downstream of Intel’s inexorable push for continued improvement. That, by extension, meant that Intel set the pace of innovation for all of technology, not just by making the processors for the PC — and, in an underrated wave of disruption in the early part of this century, the cloud — but also by defining the expectations of every software engineer in the entire world.

Intel’s Long Decline

Stratechery has, from the beginning, operated with a great degree of reverence for tech history; perhaps that’s why I’ve always been a part of the camp cheering for Intel to succeed. The unfortunate fact of the matter is that the need for cheerleading has been clear for as long as I have written this blog: in May 2013 I wrote that Intel needed to build out a foundry business, as the economics of their IDM business, given their mobile miss, faced long-term challenges.

Unfortunately not only did Intel not listen, but their business got a lot worse: in the late 2010’s Intel got stuck trying to move to 10nm, thanks in part to their reluctance to embrace the vastly more expensive EUV lithography process, handing the performance crown to TSMC. Meanwhile Intel’s chip design team, increasingly fat and lazy thanks to the fact they could leverage Intel’s once-industry-leading processes, had started to fall behind AMD; today AMD has both better designs and, thanks to the fact they fab their chips at TSMC, better processes. Meanwhile, the rise of hyperscalers meant there were entities that both had the scale to justify overcoming whatever software advantages Intel had, and the resources to do so; the result is that AMD has been taking data center share for years, and is on the verge of passing 50%:

[Editor’s Note: these two paragraphs are technically incorrect, in that AMD’s data center revenue includes their AI chips; the directionally point remains, but I regret the errors]

~~This chart actually understates the problem, because it only includes x86 processors;~~ in fact, those capabilities that have allowed the hyperscalers to take advantage of AMD’s increasingly superior total cost of ownership have also been devoted to building ARM-based server chips. Amazon in particular has invested heavily in its Graviton line of chips, taking advantage of ARM’s theoretically better efficiency and lower licensing fees (as compared to Intel’s margins).

Beyond that, what is especially problematic — and why Intel’s datacenter revenue is actually down year-over-year — is that an increasing amount of data center spend is going towards AI, the latest paradigm where Intel missed the boat.

[End Editor’s Note]

The story Intel — or at least its past management — wants you to believe about mobile is that they foolishly passed up the opportunity to supply Apple’s iPhone, not realizing that the volume would more than make up for the margin hit; in fact, Tony Fadell told me that while Steve Jobs wanted Intel — Apple had just switched to using Intel chips for Macs — Intel chips weren’t competitive:

For me, when it came to Intel at the time, back in the mid-2000s, they were always about, “Well, we’ll just repackage what we have on the desktop for the laptop and then we’ll repackage that again for embedding.” It reminded me of Windows saying, “I’m going to do Windows and then I’m going to do Windows Mobile and I’m going to do Windows embedded.” It was using those same cores and kernels and trying to slim them down…

The mindset at Intel was never about — when they went through that CISC-RISC duality of “Which one are we going to be?”, and they chose CISC, which was the right thing at the time, if you fast forward, they also made that decision, they threw away architectural and they went to more manufacturing. That was the time when they said “We don’t have to worry about all these different product lines to meet all these architectural needs. We’re just going to have Moore’s Law take over” and so in a way that locks you into a path and that’s why Intel, not under the Pat days but previous to the Pat days, was all driven by manufacturing capability and legal. It wasn’t driven by architectural decisions, it was like, “Here’s what we got and we’re going to spread it around and we’re going to keep reusing it”.

In fact, it does go back to the Pat days, specifically CEO Pat Gelsinger’s initial stint at Intel. He was the one that pushed CISC over RISC, arguing that Intel’s CISC software advantage, supported by the company’s superior manufacturing, would ensure that the company dominated microprocessors. And, as Fadell noted, it worked, at least in PCs and servers.

Where it didn’t work was mobile: Intel couldn’t leverage its manufacturing to make x86 competitive with ARM, particularly since the latter had a head start on software; it also didn’t work in GPUs, where Intel spent years trying to build x86-based gaming chips that — you guessed it — were meant to rely on Intel’s manufacturing prowess. GPUs, of course, are the foundation of today’s AI boom, and while Intel bought Gaudi to offer AI chips, they haven’t made a dent in the market — and oh, by the way, Gaudi chips are manufactured by TSMC.

IDM 2.0

None of this story is new; I recounted it in 2021’s Intel Problems. My solution then — written shortly after Gelsinger came back to Intel, fifteen years after being passed over for the CEO job — was that the company needed to split up.

Integrating design and manufacturing was the foundation of Intel’s moat for decades, but that integration has become a strait-jacket for both sides of the business. Intel’s designs are held back by the company’s struggles in manufacturing, while its manufacturing has an incentive problem.

The key thing to understand about chips is that design has much higher margins; Nvidia, for example, has gross margins between 60~65%, while TSMC, which makes Nvidia’s chips, has gross margins closer to 50%. Intel has, as I noted above, traditionally had margins closer to Nvidia, thanks to its integration, which is why Intel’s own chips will always be a priority for its manufacturing arm. That will mean worse service for prospective customers, and less willingness to change its manufacturing approach to both accommodate customers and incorporate best-of-breed suppliers (lowering margins even further). There is also the matter of trust: would companies that compete with Intel be willing to share their designs with their competitor, particularly if that competitor is incentivized to prioritize its own business?

The only way to fix this incentive problem is to spin off Intel’s manufacturing business. Yes, it will take time to build out the customer service components necessary to work with third parties, not to mention the huge library of IP building blocks that make working with a company like TSMC (relatively) easy. But a standalone manufacturing business will have the most powerful incentive possible to make this transformation happen: the need to survive.

Two months later and Gelsinger announced his turnaround plan: IDM 2.0. Intel would separate out its manufacturing into a separate division that would serve third parties, but still under the Intel banner. Gelsinger told me in an interview that this was the only way Intel could both be competitive in chips and keep investing in the leading edge; after all, AMD’s spin-off of Global Foundries resulted in the former floundering until they could break their purchase agreements with Global Foundries and go to TSMC, and the latter giving up on the leading edge.

Gelsinger is persuasive and optimistic, and for the last three years I’ve given him the benefit of the doubt. Suddenly, though, a split is back on the table; from Bloomberg:

Intel Corp. is working with investment bankers to help navigate the most difficult period in its 56-year history, according to people familiar with the matter. The company is discussing various scenarios, including a split of its product-design and manufacturing businesses, as well as which factory projects might potentially be scrapped, said the people, who asked not to be identified because the deliberations are private…

A potential separation or sale of Intel’s foundry division, which is aimed at manufacturing chips for outside customers, would be an about-face for Chief Executive Officer Pat Gelsinger. Gelsinger has viewed the business as key to restoring Intel’s standing among chipmakers and had hoped it would eventually compete with the likes of Taiwan Semiconductor Manufacturing Co., which pioneered the foundry industry.

As the article notes, Intel is likely to consider less drastic steps first; Reuters reported that ideas include selling businesses like its Altera programmable chip business and reducing capital expeditures, including axing a proposed foundry in Germany. The company also finally killed its dividend, and is cutting 15,000 jobs, which frankly, isn’t enough; I noted in an Update last week:

Intel ended last year with 124,800 people; to put that in context, TSMC had 76,478 employees and AMD 26,000, which is to say that the two companies combined had fewer employees than Intel while making better x86 chips, an actually competitive GPU, and oh yeah, making chips for everyone else on earth, including Apple and Nvidia. A 15,000 employee cut is both too small and too late.

The fundamental problem facing the company is encapsulated in that paragraph:

Intel doesn’t have the best manufacturing
Intel doesn’t design the best chips
Intel is out of the game in AI

Moreover, the future does not look bright; the problem with Intel’s most recent earnings call was threefold:

Intel’s is technically on pace to achieve the five nodes in four years Gelsinger promised (in truth two of those nodes were iterations), but they haven’t truly scaled any of them; the first attempt to do so, with Intel 3, destroyed their margins. This isn’t a surprise: the reason why it is hard to skip steps is not just because technology advances, but because you have to actually learn on the line how to implement new technology at scale, with sustainable yield. Go back to Intel’s 10nm failure: the company could technically make a 10nm chip, they just couldn’t do so economically; there are now open questions about Intel 3, much less next year’s promised 18A.
Intel is dramatically ramping up its Lunar Lake architecture as it is the only design the company has that is competitive with the Qualcomm ARM architecture undergirding Microsoft’s CoPilot+ PC initiative; the problem is that Lunar Lake’s tiles — including its CPU — are made by TSMC, which is both embarrassing and also terrible for margins.
The third problem is that the goal Gelsinger has been pushing for is the aforementioned 18A, yet Intel has yet to announce a truly committed at-scale partner. Yes, the company is in talks with lots of folks and claims some number of secret agreements, but at this point the foundry strategy needs real proof points; unfortunately Intel itself ramping up on TSMC, even as it loses control of its costs, isn’t exactly a selling point as to why any third-party should put their fortunes in Intel’s hands.

All that noted, my initial response to the meltdown over Intel’s earnings was to defend Gelsinger; what is happening to Intel now is downstream of mistakes that happened years before Gelsinger came back to the company. That remains true, but Gelsinger does have one fatal flaw: he still believes in Intel, and I no longer do.

Market Realities

Here is the fundamental problem facing Intel, and by extension, U.S. dreams of controlling leading edge capacity: there is no reason for Intel Foundry to exist. Apple, Nvidia, AMD, and other leading edge fabless chip companies rely on TSMC, and why wouldn’t they? TSMC invested in EUV, surpassed Intel, and are spending tens of billions of dollars a year to continue pushing forward to 2nm and beyond. Yes, TSMC priced 3nm too low, but even if the company raises prices for future nodes, as I expect them to, the relative cost matters much less than TSMC’s superior customer services and demonstrated reliability.

The kicker is that the smartest decision for Intel’s own chip unit is to — as they are with Lunar Lake — rely on TSMC’s manufacturing as well. Intel still has advantages in PCs and a dominant position in on-premises and government data centers, but the best way to leverage those remaining areas of strength is to have TSMC make their chips.

This was, for the record, why Gelsinger did have a point in keeping the company together; Intel Foundry needs volume, and the easiest way to get that volume is from Intel itself. However, that by definition is a decision that is not driven by what is best for a theoretical Intel fabless business, but rather the impetus to restore Intel’s manufacturing capability, even as that manufacturing capability is heavily incentivized to cater to Intel’s chip business at the expense of external customers.

Gelsinger’s trump card has been the fact that TSMC is based in Taiwan, which is under continuous threat from China. Indeed, Gelsinger has been quite explicit on this point; from CNA English News in 2021:

Intel CEO Pat Gelsinger said at the Fortune Brainstorm Tech summit in California on Wednesday that the United States government should support a sustainable semiconductor supply chain in the U.S., in part because “Taiwan is not a stable place”…

Asked about the comment, TSMC Chairman Mark Liu (劉德音) said, “there’s nothing that needs to be addressed. TSMC does not speak ill of other companies in the industry,” and added there were probably not many people who believed Gelsinger’s argument. Geopolitical tensions, Liu said, may have a short-term impact, but he believed Taiwan could help create a brilliant decade for the global semiconductor industry, with the best technology and the best manufacturing ecosystem.

Gelsinger made the same point to me in that interview while explaining why Intel needed to stay together:

As we look at this, to me, there is almost a global national perspective to this, in that I deeply believe the West needs a world class technology provider, and I don’t think that splitting Intel in two, that it could survive for many, many, many years till that would become the case, that you could stand that up. Remember, given cash flows, R&D streams, products that enable us to drive that, and I’m committed to go fix it, and I think we’re on a good path to go fix it since I’ve been here as well. So for those three different reasons, we chose the IDM 2.0 path, but it’s not because we didn’t look at the alternative, it’s partially because we did.

This is where everyone who is invested in American manufacturing — or perhaps more accurately, concerned about China’s threat to Taiwan — has to get brutally honest. If the U.S. government and U.S. tech companies want to have a non-Taiwan option, they are going to have to pay for it directly. Yes, the CHIPS Act passed, but while Intel is getting a lot of funds, it’s going to take a lot more — and the price of those funds needs to be a much smarter incentive structure that drives Intel apart.

My proposal back in 2021 was purchase guarantees instead of subsidies, and I am back to thinking that is the only viable path.

That is why a federal subsidy program should operate as a purchase guarantee: the U.S. will buy A amount of U.S.-produced 5nm processors for B price; C amount of U.S. produced 3nm processors for D price; E amount of U.S. produced 2nm processors for F price; etc. This will not only give the new Intel manufacturing spin-off something to strive for, but also incentivize other companies to invest; perhaps Global Foundries will get back in the game, or TSMC will build more fabs in the U.S. And, in a world of nearly free capital, perhaps there will finally be a startup willing to take the leap.

That free capital world is gone, and it’s probably not realistic for a startup to figure out how to manufacture the most complex devices humans have ever produced; the best idea at this point is a new company that has the expertise and starting position of Intel Foundry. Critically, though, it shouldn’t be at all beholden to x86 chips, have hundreds of thousands of employees, or the cultural overhang of having once led the computing world. The best we can do is purchase guarantees — on the order of hundreds of billions of dollars over the next decade — and a prayer that someone can make such an entity stand on its own.

To summarize, there is no market-based reason for Intel Foundry to exist; that’s not a market failure in a purely economic sense, but to the extent the U.S. national security apparatus sees it as a failure is the extent to which the U.S. is going to have to pay to make it happen. And, if the U.S. is going to pay up, that means giving that foundry the best possible chance to stand on its own two feet in the long run. That means actually earning business from Apple, Nvidia, AMD, and yes, even the fabless Intel company that will remain. The tech world has moved on from Intel; the only chance for U.S. leading edge manufacturing is to do the same.

I wrote a follow-up to this Article in this Daily Update.

Integration and Android

Wednesday, August 14, 2024Wednesday, September 4, 2024

This Article is available as a video essay on YouTube

The original Pixel was released in 2016 at the end of interesting, at least when it came to smartphones. Two years earlier Apple had finally knocked down every barrier limiting iPhone growth, releasing the large-screen iPhone 6 worldwide, including on the elusive China Mobile; Samsung, meanwhile, had consolidated the high-end of Android. Over the ensuing decade cameras have grown in both quantity and quality, buttons and bezels have given way to dynamic islands and hidden sensors, and, most annoyingly, Android and Apple have increasingly aped the other’s aggressive App Store policies; the only thing better than a monopoly is a duopoly that reaps the same profits without the regulatory pressure.

Integration and Smartphone Innovation

What was clear is that Apple was not, as so many had predicted in the early 2010s, doomed; it’s hard to imagine now, but the conventional wisdom when I started this site was that Apple’s focus on integration, while essential to creating the modern smartphone, would lead to ever-dwindling marketshare in the face of Android’s modular approach, which, we were assured, would bring to bear more innovation at lower prices, until developers gave up on iOS completely, relegating the iPhone to the status of the mid-1990s Macintosh — i.e. on life support.

This didn’t happen for several reasons.

First off, popular history of Windows and the Mac is totally wrong; yes, the Mac came before Windows, but critically, Windows was not a new operating system: it sat on top of DOS, which both preceded the Mac by years, and was originally sold by IBM, making it the standard for enterprise. The iPhone, on the other hand, absolutely did come out before Android, which is to say it had the head start in users and developers that the Mac never had (it’s also worth pointing out that the iPhone, in contrast to the Mac, has always been the performance leader).

Second, the iPhone is first and foremost a consumer device: this means that the user experience matters, and integration, which smooths over the seams of modularization, delivers a better user experience. This was the foundation of the argument I made for the iPhone’s long-term prospects in 2013’s What Clayton Christensen Got Wrong:

The issue I have with this analysis of vertical integration — and this is exactly what I was taught at business school — is that the only considered costs are financial. But there are other, more difficult to quantify costs. Modularization incurs costs in the design and experience of using products that cannot be overcome, yet cannot be measured. Business buyers — and the analysts who study them — simply ignore them, but consumers don’t. Some consumers inherently know and value quality, look-and-feel, and attention to detail, and are willing to pay a premium that far exceeds the financial costs of being vertically integrated.

What is notable is that the iPhone’s most successful premium competitors have been more integrated than not: while Samsung and Huawei don’t make the Android operating system, they do make a huge number of components of their phones — more than Apple — which helps them keep up in the race for new and novel features.

That there is the third point: new and novel features continue to matter, because the smartphone is the black hole of consumer electronics. Or, to use the term I coined in 2013, the smart phone is Obsoletive:

In 2006, the Nokia 1600 was the top-selling phone in the world, and the BlackBerry Pearl the best-selling smartphone. Both were only a year away from their doom, but that doom was not a cheaper, less-capable product, but in fact the exact opposite: a far more powerful, and fantastically more expensive product called the iPhone.

The problem for Nokia and BlackBerry was that their specialties — calling, messaging, and email — were simply apps: one function on a general-purpose computer. A dedicated device that only did calls, or messages, or email, was simply obsolete.

An even cursory examination of tech history makes it clear that “obsoletion” — where a cheaper, single-purpose product is replaced by a more expensive, general purpose product — is just as common as “disruption” — even more so, in fact. Just a few examples (think about it, and you’ll come up with a bunch more):

The typewriter and word processor were obsoleted by the PC

Typesetting was obsoleted by the Mac and desktop publishing

The newspaper was obsoleted by the Internet

The CD player was obsoleted by the iPod

The iPod was obsoleted by the iPhone

Smartphones and app stores have only accelerated this process, obsoleting the point-and-shoot, handheld video games, watches, calculators, maps, and many, many more.

The smartphone was, and remains, the perfect product: it is with you all the time, constantly connected, and increasingly capable and extendable. This both means that the utility-to-dollar-spent ratio is hugely positive, even for phones that cost well into the four-figures, and also continues to accentuate the advantages of integration, which makes these new capabilities possible. Google Senior Vice President of Devices & Services Rick Osterloh told me in an interview that will be posted tomorrow:

In the premium side [of the market], I think the leaders are going to end up being people with deep technical capabilities. It is the frontier space of computing in my view. And, because phones are with you all the time and they’re so heavily used, people want them to do everything. And so, there’s almost a sensational appetite for increasing capability within phones, which keeps pushing the envelope on what computing capability can you add to it to be able to accomplish the next task. And, I mean, I wouldn’t have thought a decade ago that people would ever be interested in taking continuous 4K video on this, and then being able to immediately upload it to a cloud. And, I don’t know, you wouldn’t have envisioned that necessarily.

I think now, phones are on the cusp of being able to, not only do stuff like that, but also become your wallet, become your keys, run advanced AI workloads, do stuff in the background for you. I mean, the amount of capabilities they have today is outrageous, and that’s only going to grow based on what I’m seeing now. Various times I thought maybe this work had plateaued, but that is absolutely not the case. I think they’re going to become more and more computer-like, and because they’re with you, they’ve got this place of importance that is difficult to overestimate.

In short, integration wins, at least in premium smartphones, which goes a long way in explaining why Pixel even exists: yes, Google partners closely with OEMs like Samsung, but if it ever wants to take on the iPhone, the company needs to do it all.

And yet, the Pixel hasn’t amounted to much so far: Google is up to around 5% marketshare in the U.S., but only 1% worldwide. There are various reasons this might be the case, some of which may be under Google’s control; the biggest problem, though, is the end of interesting: smartphones have gotten better over the eight years the Pixel has been in the market, but in rather boring ways; the paradigm shift that let Apple and Samsung take over the premium market happened before the Pixel ever existed. But now comes AI.

Android Primacy

Yesterday Google announced its ninth iteration of Pixel phones, and as you might expect, the focus was on AI. It is also unsurprising that the foundation of Osterloh’s pitch at the beginning of the keynote was about integration. What was notable is that the integration he focused on actually didn’t have anything to do with Pixel at all, but rather Android and Google:

We’re re-imagining the entire OS layer, putting Gemini right at the core of Android, the world’s most popular OS. You can see how we’re innovating with AI at every layer of the tech stack: from the infrastructure and the foundation models, to the OS and devices, and the apps and services you use every day. It’s a complete end-to-end experience that only Google can deliver. And I want to talk about the work we’re going to integrate it all together, with an integrated, helpful AI assistant for everyone. It changes how people interact with their mobile devices, and we’re building it right into Android.

For years, we’ve been pursuing our vision of a mobile AI assistant that you can work with like you work with a real life personal assistant, but we’ve been limited by the bounds of what existing technologies can do. So we’ve completely rebuilt the personal assistant experience around our Gemini models, creating a novel kind of computing help for the Gemini era.

The new Gemini assistant can go beyond understanding your words, to understanding your intent, so you can communicate more naturally. It can synthesize large amounts of information within seconds, and tackle complex tasks. It can draft messages for you, brainstorm with you, and give you ideas on how you can improve your work. With your permission, it can offer unparalleled personalized help, accessing relevant information across your Gmail Inbox, your Google calendar, and more. And it can reason across personal information and Google’s world knowledge, to provide just the right help and insight you need, and its only possible through advances we made in Gemini models over the last six months. It’s the biggest leap forward since we launched Google Assistant. Now we’re going to keep building responsibly, and pushing to make sure Gemini is available to everyone on every phone, and of course this starts with Android.

This may seem obvious, and in many respects it is: Google is a services company, which means it is incentivized to serve the entire world, maximizing the leverage on its costs, and the best way to reach the entire world is via Android. Of course that excludes the iPhone, but the new Gemini assistant isn’t displacing Siri anytime soon!

That, though, gets why the focus on Android is notable: one possible strategy for Google would have been to make its AI assistant efforts exclusive to Pixel, which The Information reported might happen late last year; the rumored name for the Pixel-exclusive-assistant was “Pixie”. I wrote in Google’s True Moonshot:

What, though, if the mission statement were the moonshot all along? What if “I’m Feeling Lucky” were not a whimsical button on a spartan home page, but the default way of interacting with all of the world’s information? What if an AI Assistant were so good, and so natural, that anyone with seamless access to it simply used it all the time, without thought?

That, needless to say, is probably the only thing that truly scares Apple. Yes, Android has its advantages to iOS, but they aren’t particularly meaningful to most people, and even for those that care — like me — they are not large enough to give up on iOS’s overall superior user experience. The only thing that drives meaningful shifts in platform marketshare are paradigm shifts, and while I doubt the v1 version of Pixie would be good enough to drive switching from iPhone users, there is at least a path to where it does exactly that.

Of course Pixel would need to win in the Android space first, and that would mean massively more investment by Google in go-to-market activities in particular, from opening stores to subsidizing carriers to ramping up production capacity. It would not be cheap, which is why it’s no surprise that Google hasn’t truly invested to make Pixel a meaningful player in the smartphone space.

The potential payoff, though, is astronomical: a world with Pixie everywhere means a world where Google makes real money from selling hardware, in addition to services for enterprises and schools, and cloud services that leverage Google’s infrastructure to provide the same capabilities to businesses. Moreover, it’s a world where Google is truly integrated: the company already makes the chips, in both its phones and its data centers, it makes the models, and it does it all with the largest collection of data in the world.

This path does away with the messiness of complicated relationships with OEMs and developers and the like, which I think suits the company: Google, at its core, has always been much more like Apple than Microsoft. It wants to control everything, it just needs to do it legally; that the best manifestation of AI is almost certainly dependent on a fully integrated (and thus fully seamless) experience means that the company can both control everything and, if it pulls this gambit off, serve everyone.

The problem is that the risks are massive: Google would not only be risking search revenue, it would also estrange its OEM partners, all while spending astronomical amounts of money. The attempt to be the one AI Assistant that everyone uses — and pays for — is the polar opposite of the conservative approach the company has taken to the Google Aggregator Paradox. Paying for defaults and buying off competitors is the strategy of a company seeking to protect what it has; spending on a bold assault on the most dominant company in tech is to risk it all.

I’ve referenced this piece a few times over the last year, including when Osterloh, the founding father of Pixel, took over Android as well. I said in an Update at the time:

Google has a very long ways to go to make [Google’s True Moonshot] a reality, or, frankly, to even make it a corporate goal. It will cost a lot of money, risk partnerships, and lower margins. It is, though, a massive opportunity — the maximal application of AI to Google’s business prospects — and it strikes me as a pretty big deal that, at least when it comes to the org chart, the Pixel has been elevated above Android.

In fact, though, my takeaway from yesterday’s event is the opposite: Android still matters most, and the integration Google is truly betting on is with the cloud.

The Pixel Angle

That’s not to say that Google is giving up on integration completely; President of Android Ecosystem Sameer Samat framed Google’s approach this way:

As Rick said earlier, this is where our decades of investment in AI and Google’s full tech stack make Gemini really special. Gemini can handle these kinds of complex personal queries within Google’s own secure cloud, without sending any of your personal data to a third-party AI provider you may not know or trust. And for some of the most sensitive use cases, like summarizing audio from a phone call, or suggesting a helpful reply to an encrypted text message, we’ve pioneered on-device generative AI with Gemini Nano. It’s the first time a large multimodal AI model has been optimized for a mobile device, so the data never leaves your phone. With Gemini deeply integrated with Android, we’re well on our way to rebuilding the OS with AI at the core. The new Gemini assistant brings the benefits of AI to billions around the world, while helping to keep your personal information secure and private. Android is truly leading the way towards AI for everyone.

To the extent Google is betting on Pixel, it’s that the company will deliver on this hybrid approach more effectively than anyone else in the ecosystem, and that’s not nothing: Google has ultimate control of a Pixel device, and can reap all of the benefits of integration I highlighted above.

In the end, though, Google’s real bet is that owning the information stack matters more than owning the tech stack, particularly when you have the most capable cloud infrastructure to act on it. That is, of course, the same Android strategy as always; the bet is that AI does the hard work of making it more attractive to premium customers than it has to date.

Integration, Innovation, and the DOJ

From Bloomberg:

A bid to break up Alphabet Inc.’s Google is one of the options being considered by the Justice Department after a landmark court ruling found that the company monopolized the online search market, according to people with knowledge of the deliberations.

The move would be Washington’s first push to dismantle a company for illegal monopolization since unsuccessful efforts to break up Microsoft Corp. two decades ago. Less severe options include forcing Google to share more data with competitors and measures to prevent it from gaining an unfair advantage in AI products, said the people, who asked not to be identified discussing private conversations.

Regardless, the government will likely seek a ban on the type of exclusive contracts that were at the center of its case against Google. If the Justice Department pushes ahead with a breakup plan, the most likely units for divestment are the Android operating system and Google’s web browser Chrome, said the people. Officials are also looking at trying to force a possible sale of AdWords, the platform the company uses to sell text advertising, one of the people said.

As I noted last week, I think the remedy that actually addresses the issues in this case is that ban on exclusive contracts.

One important takeaway from yesterday’s Google event, though, and the overall discussion of the importance of integration, is that I think forcing a divesture of Android in particular would be a mistake. Yes, you can envision a world where Android spurs competition amongst AI providers by being open to the highest bidder, or the best product; note, though, that is basically what Apple Intelligence is proffering. Apple is handling AI related to your personal data that is held on the device they sell, and they are really the only ones that can do so; “world knowledge” is being handled by OpenAI for now, but the company has been clear that there will be other offerings.

What Google is proposing is something different entirely: you can pick your device, but your AI will be integrated with your data primarily via the cloud; the company can pull that off because they own both Android and the cloud. It’s something different and, to the extent the DOJ is concerned with historial patterns of innovation, they should let Google’s integration be.

Friendly Google and Enemy Remedies

Tuesday, August 6, 2024Wednesday, August 21, 2024

This Article is available as a video essay on YouTube

Railroads were, in theory, an attractive business: while they cost a lot to build, once built, the marginal cost of carrying additional goods was extremely low; sure, you needed to actually run a train, which need fuel and workers and which depreciated the equipment involved, but those costs were minimal compared to the revenue that could be earned from carrying goods for customers that had to pay to gain access to the national market unlocked by said railroads.

The problem for railroad entrepreneurs in the 19th century is that they were legion: as locomotion technology advanced and became standardized, and steel production became a massive industry in its own right, fierce competition arose to tie the sprawling United States together. This was a problem for those seemingly attractive railroad economics: sure, marginal costs were low, but that meant a race to the bottom in terms of pricing (which is based on covering marginal costs, ergo, low marginal costs mean a low floor in terms of the market clearing price). It was the investors in railroads — the ones who paid the upfront cost of building the tracks in the hope of large profits on low marginal cost service, and who were often competing for (and ultimately with) government land grants and subsidies, further fueling the boom — that were left holding the bag.

This story, like so many technological revolutions, culminated in a crash, in this case the Panic of 1873. The triggering factor for the Panic of 1873 was actually currency, as the U.S. responded to a German decision to no longer mint silver coins by changing its policy of backing the U.S. dollar with both silver and gold to gold only, which dramatically tightened the money supply, leading to a steep rise in interest rates. This was a big problem for railway financiers who could no longer service their debts; their bankruptcies led to a slew of failed banks and even the temporary closure of the New York Stock Exchange, which rippled through the economy, leading to a multi-year depression and the failure of over 100 railroads within the following year.

Meanwhile, oil, then used primarily to refine kerosene for lighting, was discovered in Titusville, Pennsylvania in 1859, Bradford, Pennsylvania in 1871, and Lima, Ohio in 1885. The most efficient refineries in the world, thanks to both vertical integration and innovative product creation using waste products from the kerosene refinement process, including the novel use of gasoline as a power source, were run by John D. Rockefeller in Cleveland. Rockefeller’s efficiencies led to a massive boom in demand for kerosene lighting, that Rockefeller was determined to meet; his price advantage — driven by innovation — allowed him to undercut competitors, forcing them to sell to Rockefeller, who would then increase their efficiency, furthering the supply of cheap kerosene, which drove even more demand.

This entire process entailed moving a lot of products around in bulk: oil needed to be shipped to refineries, and kerosene to burgeoning cities through the Midwest and east coast. This was a godsend to the struggling railroad industry: instead of struggling to fill trains from small-scale and sporadic shippers, they signed long-term contracts with Standard Oil; guaranteed oil transportation covered marginal costs, freeing up the railroads to charge higher profit-making rates on those small-scale and sporadic shippers. Those contracts, in turn, gave Standard Oil a durable price advantage in terms of kerosene: having bought up the entire Ohio refining industry through a price advantage earned through innovation and efficiency, Standard Oil was now in a position to do the same to the entire country through a price advantage gained through contracts with railroad companies.

The Sherman Antitrust Act

There were, to be clear, massive consumer benefits to Rockefeller’s actions: Standard Oil, more than any other entity, brought literal light to the masses, even if the masses didn’t fully understand the ways in which they benefited from Rockefeller’s machinations; it was the people who understood the costs — particularly the small businesses and farmers of the Midwest generally, and Ohio in particular — who raised a ruckus. They were the “small-scale and sporadic shippers” I referenced above, and the fact that they had to pay far more for railroad transportation in a Standard Oil world than they had in the previous period of speculation and over-investment caught the attention of politicians, particularly Senator John Sherman of Ohio.

Senator Sherman had not previously shown a huge amount of interest in the issue of monopoly and trusts, but he did have oft-defeated presidential aspirations, and seized on the discontent with Standard Oil and the railroads to revive a bill originally authored by a Vermont Senator named George Edmunds; the relevant sections of the Sherman Antitrust Act were short and sweet and targeted squarely at Standard Oil’s contractual machinations:

Sec. 1. Every contract, combination in the form of trust or otherwise, or conspiracy, in restraint of trade or commerce among the several States, or with foreign nations, is hereby declared to be illegal. Every person who shall make any such contract or engage in any such combination or conspiracy, shall be deemed guilty of a misdemeanor, and, on conviction thereof, shall be punished by fine not exceeding five thousand dollars, or by imprisonment not exceeding one year, or by both said punishments, at the discretion of the court.

Sec. 2. Every person who shall monopolize, or attempt to monopolize, or combine or conspire with any other person or persons, to monopolize any part of the trade or commerce among the several States, or with foreign nations, shall be deemed guilty of a misdemeanor, and, on conviction thereof; shall be punished by fine not exceeding five thousand dollars, or by imprisonment not exceeding one year, or by both said punishments, in the discretion of the court.

And so we arrive at Google.

The Google Case

Yesterday, from the Wall Street Journal:

A federal judge ruled that Google engaged in illegal practices to preserve its search engine monopoly, delivering a major antitrust victory to the Justice Department in its effort to rein in Silicon Valley technology giants. Google, which performs about 90% of the world’s internet searches, exploited its market dominance to stomp out competitors, U.S. District Judge Amit P. Mehta in Washington, D.C. said in the long-awaited ruling.

“Google is a monopolist, and it has acted as one to maintain its monopoly,” Mehta wrote in his 276-page decision released Monday, in which he also faulted the company for destroying internal messages that could have been useful in the case. Mehta agreed with the central argument made by the Justice Department and 38 states and territories that Google suppressed competition by paying billions of dollars to operators of web browsers and phone manufacturers to be their default search engine. That allowed the company to maintain a dominant position in the sponsored text advertising that accompanies search results, Mehta said.

While there have been a number of antitrust laws passed by Congress, most notably the Clayton Antitrust Act of 1914 and Federal Trade Commission Act of 1914, the Google case is directly downstream of the Sherman Act, specifically Section 2, and its associated jurisprudence. Judge Mehta wrote in his 286-page opinion:

“Section 2 of the Sherman Act makes it unlawful for a firm to ‘monopolize.’” United States v. Microsoft, 253 F.3d 34, 50 (D.C. Cir. 2001) (citing 15 U.S.C. § 2). The offense of monopolization requires proof of two elements: “(1) the possession of monopoly power in the relevant market and (2) the willful acquisition or maintenance of that power as distinguished from growth or development as a consequence of a superior product, business acumen, or historic accident.” United States v. Grinnell Corp., 384 U.S. 563, 570–71 (1966).

Note that Microsoft reference: the 1990s antitrust case provides the analytical framework Mehta used in this case.

The court structures its conclusions of law consistent with Microsoft’s analytical framework. After first summarizing the principles governing market definition, infra Section II.A, the court in Section II.B addresses whether general search services is a relevant product market, and finding that it is, then evaluates in Section II.C whether Google has monopoly power in that market. In Part III, the court considers the three proposed advertiser-side markets. The court finds that Plaintiffs have established two relevant markets — search advertising and general search text advertising — but that Google possesses monopoly power only in the narrower market for general search text advertising. All parties agree that the relevant geographic market is the United States.

The court then determines whether Google has engaged in exclusionary conduct in the relevant product markets. Plaintiffs’ primary theory centers on Google’s distribution agreements with browser developers, OEMs, and carriers. The court first addresses in Part IV whether the distribution agreements are exclusive under Microsoft. Finding that they are, the court then analyzes in Parts V and VI whether the contracts have anticompetitive effects and procompetitive justifications in each market. For reasons that will become evident, the court does not reach the balancing of anticompetitive effects and procompetitive justifications. Ultimately, the court concludes that Google’s exclusive distribution agreements have contributed to Google’s maintenance of its monopoly power in two relevant markets: general search services and general search text advertising.

I find Mehta’s opinion well-written and exhaustive, but the decision is ultimately as simple as the Sherman Act: Google acquired a monopoly in search through innovation, but having achieved a monopoly, it is forbidden from extending that monopoly through the use of contractual arrangements like the default search deals it has with browser developers, device makers, and carriers. That’s it!

Aggregators and Contracts

To me this simplicity is the key to the case, and why I argued from the get-go that the Department of Justice was taking a far more rational approach to prosecuting a big tech monopoly than the FTC or European Commission had been. From a 2020 Stratechery Article entitled United States v. Google:

The problem with the vast majority of antitrust complaints about big tech generally, and online services specifically, is that Page is right [about competition only being a click away]. You may only have one choice of cable company or phone service or any number of physical goods and real-world services, but on the Internet everything is just zero marginal bits.

That, though, means there is an abundance of data, and Google helps consumers manage that abundance better than anyone. This, in turn, leads Google’s suppliers to work to make Google better — what is SEO but a collective effort by basically the entire Internet to ensure that Google’s search engine is as good as possible? — which attracts more consumers, which drives suppliers to work even harder in a virtuous cycle. Meanwhile, Google is collecting information from all of those consumers, particularly what results they click on for which searches, to continuously hone its accuracy and relevance, making the product that much better, attracting that many more end users, in another virtuous cycle:

One of the central ongoing projects of this site has been to argue that this phenomenon, which I call Aggregation Theory, is endemic to digital markets…In short, increased digitization leads to increased centralization (the opposite of what many originally assumed about the Internet). It also provides a lot of consumer benefit — again, Aggregators win by building ever better products for consumers — which is why Aggregators are broadly popular in a way that traditional monopolists are not…

The solution, to be clear, is not simply throwing one’s hands up in the air and despairing that nothing can be done. It is nearly impossible to break up an Aggregator’s virtuous cycle once it is spinning, both because there isn’t a good legal case to do so (again, consumers are benefitting!), and because the cycle itself is so strong. What regulators can do, though, is prevent Aggregators from artificially enhancing their natural advantages…

That is exactly what this case was about:

This is exactly why I am so pleased to see how narrowly focused the Justice Department’s lawsuit is: instead of trying to argue that Google should not make search results better, the Justice Department is arguing that Google, given its inherent advantages as a monopoly, should have to win on the merits of its product, not the inevitably larger size of its revenue share agreements. In other words, Google can enjoy the natural fruits of being an Aggregator, it just can’t use artificial means — in this case contracts — to extend that inherent advantage.

I laid out these principles in 2019’s A Framework for Regulating Competition on the Internet, and it was this framework that led me to support the DOJ’s case initially, and applaud Judge Mehta’s decision today.

Mehta’s decision, though, is only about liability: now comes the question of remedies, and the truly difficult questions for me and my frameworks.

Friendly Google

The reason to start this Article with railroads and Rockefeller and the history of the Sherman Antitrust Act is not simply to provide context for this case; rather, it’s important to understand that antitrust is inherently political, which is another way of saying it’s not some sort of morality play with clearly distinguishable heroes and villains. In the case of Standard Oil, the ultimate dispute was between the small business owners and farmers of the Midwest and city dwellers who could light their homes thanks to cheap kerosene. To assume that Rockefeller was nothing but a villain is to deny the ways in which his drive for efficiency created entirely new markets that resulted in large amounts of consumer welfare; moreover, there is an argument that Standard Oil actually benefited its political enemies as well, by stabilizing and standardizing the railroad industry that they ultimately resented paying for.

Indeed, there are some who argue, even today, that all of antitrust law is misguided, because like all centralized interference in markets, it fails to properly balance the costs and benefits of interference with those markets. To go back to the Standard Oil example, those who benefited from cheap kerosene were not politically motivated to defend Rockefeller, but their welfare was in fact properly weighed by the market forces that resulted in Rockefeller’s dominance. Ultimately, though, this is a theoretical argument, because politics do matter, and Sherman tapped into a deep and longstanding discomfort in American society with dominant entities like Standard Oil then, and Google today; that’s why the Sherman Antitrust Act passed by a vote of 51-1 in the Senate, and 242-0 in the House.

Then something funny happened: Standard Oil was indeed prosecuted under the Sherman Antitrust Act, and ordered to be broken up into 34 distinct companies; Rockefeller had long since retired from active management at that point, but still owned 25% of the company, and thus 25% of the post-breakup companies. Those companies, once listed, ended up being worth double what they were as Standard Oil; Rockefeller ended up richer than ever. Moreover, it was those companies, like Exxon, that ended up driving a massive increase in oil by expanding from refining to exploration all over the world.

The drivers of that paradox are evident in the consideration of remedies for Google. One possibility is a European Commission-style search engine choice screen for consumers setting up new browsers or devices: is there any doubt that the vast majority of people will choose Google, meaning Google keeps its share and gets to keep the money it gives Apple and everyone else? Another is that Google is barred from bidding for default placement, but other search engines can: that will put entities like Apple in the uncomfortable position of either setting what it considers the best search engine as the default, and making no money for doing so, or prioritizing a revenue share from an also-ran like Bing — and potentially seeing customers go to Google anyways. The iPhone maker could even go so far as to build its own search engine, and seek to profit directly from the search results driven by its distribution advantage, but that entails tremendous risk and expense on the part of the iPhone maker, and potentially favors Android.

That, though, was the point: the cleverness of Google’s strategy was their focus on making friends instead of enemies, thus motivating Apple in particular to not even try. I told Michael Nathanson and Craig Moffett when asked in a recent interview why Apple doesn’t build a search engine:

Apple already has a partnership with Google, the Google-Apple partnership is really just a beautiful manifestation of how, don’t-call-them-monopolies, can really scratch each other’s back in a favorable way, such that Google search makes up something like 17% or 20% of Apple’s profit, it’s basically pure profit for Apple and people always talk about, “When’s Apple going to make a search engine?” — the answer is never. Why would they? They get the best search engine, they get profits from that search engine without having to drop a dime of investment, they get to maintain their privacy brand and say bad things about data-driven advertising, while basically getting paid by what they claim to hate, because Google is just laundering it for them. Google meanwhile gets the scale, there’s no points of entry for potential competition, it makes total sense.

This wasn’t always Google’s approach; in the early years of the smartphone era the company had designs on Android surpassing the iPhone, and it was a whole-company effort. That, mistakenly in my view, at least from a business perspective, meant using Google’s services — specifically Google Maps — to differentiate Android, including shipping turn-by-turn directions on Android only, and demanding huge amounts of user data from Apple to maintain an inferior product for the iPhone.

Apple’s response was shocking at the time: the company would build its own Maps product, even though that meant short term embarrassment. It was also effective, as evidenced by testimony in this case. From Bloomberg last fall:

Two years after Apple Inc. dropped Google Maps as its default service on iPhones in favor of its own app, Google had regained only 40% of the mobile traffic it used to have on its mapping service, a Google executive testified in the antitrust trial against the Alphabet Inc. company. Michael Roszak, Google’s vice president for finance, said Tuesday that the company used the Apple Maps switch as “a data point” when modeling what might happen if the iPhone maker replaced Google’s search engine as the default on Apple’s Safari browser.

The lesson Google learned was that Apple’s distribution advantages mattered a lot, which by extension meant it was better to be Apple’s friend than its enemy. I wrote in an Update after that revelation:

This does raise a question I get frequently: how can I argue that Google wins by being better when it is willing to pay for default status? I articulated the answer on a recent episode of Sharp Tech, but briefly, nothing exists in a vacuum: defaults do matter, and that absolutely impacts how much better you have to be to force a switch. In this case Google took the possibility off of the table completely, and it was a pretty rational decision in my mind.

It also, without question, reduced competition in the space, which is why I always thought this was a case worth taking to court. This is in fact a case where I think even a loss is worthwhile, because I find contracts between Aggregators to be particularly problematic. Ultimately, though, my objection to this arrangement is just as much, if not more, about Apple and its power. They are the ones with the power to set the defaults, and they are the ones taking the money instead of competing; it’s hard to fault Google for being willing to pay up.

Tech companies, particularly advertising-based ones, famously generate huge amounts of consumer surplus. Yes, Google makes a lot of money showing you ads, but even at a $300+ billion run rate, the company is surely generating far more value for consumers than it is capturing. That is in itself some degree of defense for the company, I should note, much as Standard Oil brought light to every level of society; what is notable about these contractual agreements, though, is how Google has been generating surplus for everyone else in the tech industry.

Maybe this is a good thing; it’s certainly good for Mozilla, which gets around 80% of its revenue from its Google deal. It has been good for device makers, commoditized by Android, who have an opportunity for scraps of profit. It has certainly been profitable for Apple, which has seen its high-margin services revenue skyrocket, thanks in part to the ~$20 billion per year of pure profit it gets from Google without needing to make any level of commensurate investment.

Enemy Remedies

However, has it been good for Google, not just in terms of the traffic acquisition costs it pays out, but also in terms of the company’s maintenance of the drive that gave it its dominant position in the first place? It’s a lot easier to pay off your would-be competitors than it is to innovate. I’m hesitant to say that antitrust is good for its subjects, but Google does make you wonder.

Most importantly, has it been good for consumers? This is where the Apple Maps example looms large: Apple has shown it can compete with Google if it puts resources behind a project it considers core to the iPhone experience. By extension, the entire reason why Google favored Google Maps in the first place, leaving Apple no choice but to compete, is because they were seeking to advantage Android relative to the iPhone. Both competitions drove large amounts of consumer benefit that continue to persist today.

I would also note that the behavior I am calling for — more innovation and competition, not just from Google’s competitors, but Google itself — is the exact opposite of what the European Union is pushing for, which is product stasis. I think the E.U. is mistaken for the exact same reasons I think Judge Mehta is right.

There’s also the political point: I am an American, and I share the societal sense of discomfort in dominant entities that made the Sherman Antitrust Act law in the first place; yes, it’s possible this decision doesn’t mean much in the end, but it’s pushing in a direction that is worth leaning into

This is why, ultimately, I am comfortable with the implications of my framework, and why I think the answer to the remedy question is an injunction against Google making any sort of payments or revenue share for search; if you’re a monopoly you don’t get to extend your advantage with contracts, period (now do most-favored nation clauses). More broadly, we tend to think of monopolies as being mean; the problem with Aggregators is they have the temptation to be too nice. It has been very profitable to be Google’s friend; I think consumers — and Google — are better off if the company has a few more enemies.

I wrote a follow-up to this Article in this Daily Update.

Crashes and Competition

Monday, July 22, 2024Wednesday, August 7, 2024

This Article is available as a video essay on YouTube

I’ve long maintained that if the powers-that-be understood what the Internet’s impact would be, they would have never allowed it to be created. It’s hard to accuse said shadowy figures of negligence, however, given how clueless technologists were as well; look no further than an operating system like Windows.

Windows was, from the beginning, well and truly open: 3rd-party developers could do anything, including “patching the kernel”; to briefly summarize:

The “kernel” of an operating system is the core of the operating system, the function of which is to manage the actual hardware of a computer. All software running in the kernel is fully privileged, which is to say it operates without any restrictions. If software crashes in the kernel, the entire computer crashes.
Everything else on a computer runs in “user space”; user space effectively sits on top of the kernel, and is dependent on APIs provided by the operating system maker to compel software in kernel space to actually interface with the hardware. If software crashes in user space the rest of the computer is fine.

This is a drastically simplified explanation; in some operating systems there are levels of access between kernel space and user space for things like drivers (which need direct access to the hardware they are controlling, but not necessarily hardware access to the entire computer), and on the other side of things significant limitations on software in user space (apps, for example, might be “sandboxed” and unable to access other information on the computer, even if it is in user space).

The key point for purposes of this Article, though, is that Windows allowed access to both kernel space and user space; yes, the company certainly preferred that developers only operate in user space, and the company never officially supported applications that patched the kernel, but the reality is that operating in kernel space is far more powerful and so a lot of developers would do just that.

Security Companies and Kernel Access

An example of developers with a legitimate argument for access to kernel space are security companies: Windows’ openness extended beyond easy access to kernel space; the reason why sandboxing became a core security feature of newer operating systems like iOS is that not all developers were good actors: virus and malware makers on Windows in particular would leverage easy access to other programs to infiltrate computers and make them nigh on unusable at best, and exfiltrate data or use computers they took over to attack others at worse.

The goal of security software like antivirus programs or malware scanners was to catch these bad actors and eliminate them; the best way to do so was to patch the kernel and so operate at the lowest, most powerful layer of Windows, with full visibility and access to every other program running on the computer. And, to be clear, in the 2000s, when viruses and malware were at their peak, this was very much necessary — and necessary is another way of saying this was a clear business opportunity.

Two of the companies seizing this opportunity in the 2000s were Symantec and McAfee; both reacted with outrage in 2005 and 2006 when Microsoft, in the run-up to the release of Windows Vista, introduced PatchGuard. PatchGuard was aptly named: it guarded the kernel from being patched by 3rd-parties, with the goal of increasing security. This, though, was a threat to Symantec and McAfee; George Amenuk, CEO of the latter, released an open letter that stated:

Over the years, the most reliable defenders against the many, many vulnerabilities in the Microsoft operating systems have been the independent security companies such as McAfee. Yet, if Microsoft succeeds in its latest effort to hamstring these competitors, computers everywhere could be less secure. Computers are more secure today, thanks to relentless innovations by the security providers. Microsoft also has helped by allowing these companies’ products full access to system resources-this has enabled the security products to better “see” threats and deploy defenses against viruses and other attacks.

With its upcoming Vista operating system, Microsoft is embracing the flawed logic that computers will be more secure if it stops cooperating with the independent security firms. For the first time, Microsoft shut off security providers’ access to the core of its operating system – what is known as the “kernel”.

At the same time, Microsoft has firmly embedded in Vista its own Windows Security Center-a product that cannot be disabled even when the user purchases an alternative security solution. This approach results in confusion for customers and prevents genuine freedom of choice. Microsoft seems to envision a world in which one giant company not only controls the systems that drive most computers around the world but also the security that protects those computers from viruses and other online threats. Only one approach protecting us all: when it fails, it fails for 97% of the world’s desktops.

Symantec, meanwhile, went straight to E.U. regulators, making the case that Microsoft, already in trouble over its inclusion of Internet Explorer in the 90s, and Windows Media Player in the early 2000s, was unfairly limiting competition for security offerings. The E.U. agreed and Microsoft soon backed down; from Silicon.com in 2006:

Microsoft has announced it will give security software makers technology to access the kernel of 64-bit versions of Vista for security-monitoring purposes. But its security rivals remain as yet unconvinced. Redmond also said it will make it possible for security companies to disable certain parts of the Windows Security Center in Vista when a third-party security console is installed. Microsoft made both changes in response to antitrust concerns from the European Commission. Led by Symantec, the world’s largest antivirus software maker, security companies had publicly criticised Microsoft over both Vista features and also talked to European competition officials about their gripes.

Fast forward nearly two decades, and while Symantec and McAfee are still around, there is a new wave of cloud-based security companies that dominate the space, including CrowdStrike; Windows is much more secure than it used to be, but after the disastrous 2000s, a wave of regulations were imposed on companies requiring them to adhere to a host of requirements that are best met by subscribing to an all-in-one solution that checks all of the relevant boxes, and CrowdStrike fits the bill. What is the same is kernel-level access, and that brings us to last week’s disaster.

The CrowdStrike Crash

On Friday, from The Verge:

Thousands of Windows machines are experiencing a Blue Screen of Death (BSOD) issue at boot today, impacting banks, airlines, TV broadcasters, supermarkets, and many more businesses worldwide. A faulty update from cybersecurity provider CrowdStrike is knocking affected PCs and servers offline, forcing them into a recovery boot loop so machines can’t start properly. The issue is not being caused by Microsoft but by third-party CrowdStrike software that’s widely used by many businesses worldwide for managing the security of Windows PCs and servers.

On Saturday, from the CrowdStrike blog:

On July 19, 2024 at 04:09 UTC, as part of ongoing operations, CrowdStrike released a sensor configuration update to Windows systems. Sensor configuration updates are an ongoing part of the protection mechanisms of the Falcon platform. This configuration update triggered a logic error resulting in a system crash and blue screen (BSOD) on impacted systems. The sensor configuration update that caused the system crash was remediated on Friday, July 19, 2024 05:27 UTC. This issue is not the result of or related to a cyberattack.

In any massive failure there are a host of smaller errors that compound; in this case, CrowdStrike created a faulty file, failed to test it properly, and deployed it to its entire customer base in one shot, instead of rolling it out in batches. Doing something different at each one of these steps would have prevented the widespread failures that are still roiling the world (and will for some time to come, given that the fix requires individual action on every affected computer, since the computer can’t stay running long enough to run a remotely delivered fix).

The real issue, though, is more fundamental: erroneous configuration files in userspace crash a program, but they don’t crash the computer; CrowdStrike, though, doesn’t run in userspace: it runs in kernel space, which means its bugs crash the entire computer — 8 million of them, according to Microsoft. Apple and Linux were not impacted, for a very obvious reason: both have long since locked out 3rd-party software from kernel space.

Microsoft, though, despite having tried to do just that in the 2000s, has its hands tied; from the Wall Street Journal:

A Microsoft spokesman said it cannot legally wall off its operating system in the same way Apple does because of an understanding it reached with the European Commission following a complaint. In 2009, Microsoft agreed it would give makers of security software the same level of access to Windows that Microsoft gets.

I wasn’t able to find the specifics around the agreement Microsoft made with the European Commission; the company did agree to implement a browser choice screen in December 2009, along with a commitment to interoperability for its “high-share software products” including Windows. What I do know is that a complaint about kernel level access was filed by Symantec, that Microsoft was under widespread antitrust pressure by regulators, and, well, that a mistake by CrowdStrike rendered millions of computers inoperable because CrowdStrike has kernel access.

Microsoft’s Handicap

On Friday afternoon, FTC Chair Lina Khan tweeted:

1. All too often these days, a single glitch results in a system-wide outage, affecting industries from healthcare and airlines to banks and auto-dealers. Millions of people and businesses pay the price.

These incidents reveal how concentration can create fragile systems.

— Lina Khan (@linakhanFTC) July 19, 2024

This is wrong on a couple of levels, but the ways in which it is wrong are worth examining because of what they mean for security specifically and tech regulation broadly.

First, this outage was the system working as regulators intended: 99% of Windows computers were not affected, just those secured by CrowdStrike; to go back to that 2006 open letter from the McAfee CEO:

We think customers large and small are right to rely on the innovation arising from the intense competition between diverse and independent security companies. Companies like McAfee have none of the conflicts of interest deriving from ownership of the operating system. We focus purely on security. Independent security developers have proven to be the most powerful weapon in the struggle against those who prey on weak computers. Computer users around the globe recognize that the most serious threats to security exist because of inherent weaknesses in the Microsoft operating system. We believe they should demand better of Microsoft.

For starters, customers should recognize that Microsoft is being completely unrealistic if, by locking security companies out of the kernel, it thinks hackers won’t crack Vista’s kernel. In fact, they already have. What’s more, few threats actually target the kernel – they target programs or applications. Yet the unfettered access previously enjoyed by security providers has been a key part of keeping those programs and applications safe from hackers and malicious software. Total access for developers has meant better protection for customers.

That argument may be correct; the question this episode raises, though, is what is the appropriate level of abstraction to evaluate risk? The McAfee CEO’s argument is that most threats are targeting userspace, which is why security developers deserve access to kernel space to root them out; again, I think this argument is probably correct in a narrow sense — it was definitely correct in the malware-infested 2000s — but what is a bigger systemic problem, malware and viruses on functioning computers, or computers that can’t even turn on?

Second, while Khan’s tweets didn’t mention Microsoft specifically, it seems obvious that is the company she was referring to; after all, CrowdStrike, who was actually to blame, is apparently only on 1% of Windows PCs, which even by the FTC’s standards surely doesn’t count as “concentration.” In this Khan was hardly alone: the company that is taking the biggest public relations hit is Microsoft, and how could they not:

Poor Microsoft pic.twitter.com/KrbYJmjcDA

— Ben Thompson (@benthompson) July 19, 2024

Everyone around the world encountered these images everywhere, both in person and on social media:

the prison walls become visible pic.twitter.com/aOrM3O44lk

— wil michael (@wilplatypus) July 19, 2024

This tweet was a joke, but from Microsoft’s position, apt: if prison is the restriction of freedom by the authorities, well, then that is exactly how this happened, as regulators restricted Microsoft’s long-sought freedom to lock down kernel space.

To be clear, restricting access to kernel space would not have made an issue like this impossible: after all, Microsoft, by definition, will always have access to kernel space, and they could very well issue an update that crashes not just 1% of the world’s Windows computers, but all of them. This, though, raises the question of incentives: is there any company both more motivated and better equipped than Microsoft to not make this sort of mistake, given the price they are paying today for a mistake that wasn’t even their fault?

Regulating Progress

Cloudflare CEO Matthew Prince already anticipated the potential solution I am driving at, and wrote a retort on X:

Here’s the scary thing that’s likely to happen based on the facts of the day if we don’t pay attention. Microsoft, who competes with @CrowdStrike, will argue that they should lock all third-party security vendors out of their OS. “It’s the only way we can be safe,” they’ll testify before Congress.

But lest we forget, Microsoft themselves had their own eternal screw up where they potentially let a foreign actor read every customer’s email because they failed to adequately secure their session signing keys. We still have no idea how bad the implications of #EternalBlue are.

So pick your poison. Today CrowdStrike messed up and some systems got locked out. That sucks a measurable amount. On the other hand, if Microsoft runs the app and security then they mess up and you’ll probably still be able to check your email — because their incentive is to fail open — but you’ll never know who else could too. Not to mention your docs, apps, files, and everything else.

Today sucked, but better security isn’t consolidated security. It isn’t your application provider picking who your security vendor must be. It’s open competition across many providers. Because CrowdStrike had a bad day, but the solution isn’t to standardize on Microsoft.

And, if we do, then when they have a bad day it’ll make today look like a walk in the park.

Prince’s argument is ultimately an updated version of that made by the McAfee CEO, and while I agree in theory, in this specific instance I disagree in practice: Windows gave kernel access because the company didn’t know any better, but just because the company won in its market doesn’t mean decisions made decades ago must then be the norm forever.

This is a mistake that I think that regulators make regularly, particularly in Europe. Last week I wrote in the context of the European Commission’s investigation of X and blue checkmarks:

One of the critiques of European economies is how difficult it is to fire people; while the first-order intentions are obviously understandable, the critique is that companies underinvest in growth because there is so much risk attached to hiring: if you get the wrong person, or if expected growth doesn’t materialize, you are stuck. What is notable is how Europe seems to have decided on the same approach to product development: Google is expected to have 10 blue links forever, Microsoft can’t include a browser or shift the center of gravity of its business to Teams, Apple can’t use user data for Apple Intelligence, and, in this case, X is forever bound to the European Commission’s interpretation of what a blue check meant under previous ownership. Everything, once successful, must be forever frozen in time; ultimately, though, the E.U. only governs a portion of Europe, and the only ones stuck in the rapidly receding past — for better or worse! — will be the E.U.’s own citizens.

In this case, people all over the world suffered because Microsoft was never allowed to implement a shift in security that it knew was necessary two decades ago.

More broadly, regulators need to understand that everything is a trade-off. Apple is under fire for its App Store policies — which I too have been relentlessly critical of — but as I wrote in The E.U. Goes Too Far earlier this month:

Apple didn’t just create the iPhone, they also created the App Store, which, after the malware and virus muddled mess of the 2000s, rebuilt user confidence and willingness to download 3rd-party apps. This was a massive boon to developers, and shouldn’t be forgotten; more broadly, the App Store specifically and Apple’s iOS security model generally really do address real threats that can not only hurt users but, by extension, chill the market for 3rd-party developers.

I went on to explain how Apple has gone too far with this model, particularly with its policy choices in the App Store that seem to be motivated more by protecting App Store revenue than security (and why the European Commission was right to go after anti-steering policies in particular), but I included the excerpted paragraph as a reminder that these are hard questions.

What does seem clear to me is that the way to answer hard questions is to not seek to freeze technology in time but rather to consider how many regulatory obsessions — including Windows dominance — are ultimately addressed by technology getting better, not by regulators treating mistaken assumptions (like operating system openness being an unalloyed good) as unchangeable grounds for competition.

Subscriber’s Daily Update

Bridges to the Future

The Generative AI Bridge

Survey Complete

The Post-War Order

Cars and China

Trump’s Tariffs

It’s Time to Build

Short-term: Generative AI and Digital Advertising

Medium-Term: The Smiling Curve and Infinite Content

The Long-term: XR and Generative UI

From Abundance to Infinity

The SpaceX Dream

Tesla’s Robotaxi Presentation

The Autonomy Dream

The Bitter Lesson

The Cost of Dreams

Tech’s Two Philosophies

Microsoft Copilot

Clippy and Copilot

Agents and o1

The Consumerization of IT

Data and Palantir

Enterprise Philosophy

The iPhone Event

Apple’s Crossing Lines

Boomer Apple

Intel’s Long Decline

IDM 2.0

Market Realities

Integration and Smartphone Innovation

Android Primacy

The Pixel Angle

Integration, Innovation, and the DOJ

The Sherman Antitrust Act

The Google Case

Aggregators and Contracts

Friendly Google

Enemy Remedies

Security Companies and Kernel Access

The CrowdStrike Crash

Microsoft’s Handicap

Regulating Progress