Saif M. Khan and Alexander Mann
The success of modern AI techniques relies on computation on a scale unimaginable even a few years ago. What exactly are the AI chips powering the development and deployment of AI at scale and why are they essential? Saif M. Khan and Alexander Mann explain how these chips work, why they have proliferated, and why they matter.
Artificial intelligence will play an important role in national and international security in the years to come. As a result, the U.S. government is considering how to control the diffusion of AI-related information and technologies. Because general-purpose AI software, datasets, and algorithms are not effective targets for controls, the attention naturally falls on the computer hardware necessary to implement modern AI systems. The success of modern AI techniques relies on computation on a scale unimaginable even a few years ago. Training a leading AI algorithm can require a month of computing time and cost $100 million. This enormous computational power is delivered by computer chips that not only pack the maximum number of transistors—basic computational devices that can be switched between on (1) and off (0) states—but also are tailor-made to efficiently perform specific calculations required by AI systems. Such leading-edge, specialized “AI chips” are essential for cost-effectively implementing AI at scale; trying to deliver the same AI application using older AI chips or general-purpose chips can cost tens to thousands of times more. The fact that the complex supply chains needed to produce leading-edge AI chips are concentrated in the United States and a small number of allied democracies provides an opportunity for export control policies.
This report presents the above story in detail. It explains how AI chips work, why they have proliferated, and why they matter. It also shows why leading-edge chips are more cost-effective than older generations, and why chips specialized for AI are more cost-effective than general-purpose chips. As part of this story, the report surveys semiconductor industry and AI chip design trends shaping the evolution of chips in general and AI chips in particular. It also presents a consolidated discussion of technical and economic trends that result in the critical cost-effectiveness tradeoffs for AI applications.
In this paper, AI refers to cutting-edge, computationally intensive AI systems, such as deep neural networks (DNNs). DNNs are responsible for most recent AI breakthroughs, like DeepMind’s AlphaGo, which beat the world champion Go player. As suggested above, we use “AI chips” to refer to certain types of computer chips that attain high efficiency and speed for AI-specific calculations at the expense of low efficiency and speed for other calculations.1
This paper focuses on AI chips and why they are essential for the development and deployment of AI at scale. It does not cover the details of the supply chain for such AI chips or the best targets within that supply chain for export controls (CSET has published preliminary results on this topic).2 Forthcoming CSET reports will analyze the semiconductor supply chain, national competitiveness, the prospects of China’s semiconductor industry for supply chain localization, and policies the United States and its allies can pursue to maintain their advantages in the production of AI chips, and will recommend how this advantage can be used to ensure the beneficial development and adoption of AI technologies.
This report is organized as follows:
Industry Trends Favor AI Chips over General-Purpose Chips
From the 1960s until the 2010s, engineering innovations that shrink transistors doubled the number of transistors on a single computer chip roughly every two years, a phenomenon known as Moore’s Law. Computer chips became millions of times faster and more efficient during this period. (Section II.)
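As a back-of-the-envelope illustration of that compounding (our own simplification, not a calculation from the report), a strict two-year doubling over roughly five decades multiplies transistor counts by tens of millions:

```python
# Back-of-the-envelope illustration of the compounding behind Moore's Law.
# The 1970-2020 span and strict 2-year doubling period are simplifying assumptions.
doubling_period_years = 2
span_years = 2020 - 1970
doublings = span_years / doubling_period_years   # 25 doublings
growth_factor = 2 ** doublings                   # ~33.6 million
print(f"{doublings:.0f} doublings -> roughly {growth_factor:,.0f}x more transistors per chip")
```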
The transistors used in today’s state-of-the-art chips are only a few atoms wide. But creating even smaller transistors makes the engineering problems increasingly difficult, or even impossible, to solve, causing the semiconductor industry’s capital expenditures and talent costs to grow at an unsustainable rate. As a result, Moore’s Law is slowing: the time it takes to double transistor density is growing longer. The costs of continuing Moore’s Law are justified only because it enables continuing chip improvements, such as greater transistor efficiency, faster transistors, and the ability to include more specialized circuits on the same chip. (Sections III and IV.)
The economies of scale historically favoring general-purpose chips like central processing units have been upset by rising demand for specialized applications like AI and the slowing of Moore’s Law-driven CPU improvements. Accordingly, specialized AI chips are taking market share from CPUs. (Section V.)
AI Chip Basics
AI chips include graphics processing units (GPUs), field-programmable gate arrays (FPGAs), and application-specific integrated circuits (ASICs) that are specialized for AI. General-purpose chips like central processing units (CPUs) can also be used for some simpler AI tasks, but CPUs are becoming less and less useful as AI advances. (Section V(A).)
Like general-purpose CPUs, AI chips gain speed and efficiency (that is, they are able to complete more computations per unit of energy consumed) by incorporating huge numbers of smaller and smaller transistors, which run faster and consume less energy than larger transistors. But unlike CPUs, AI chips also have other, AI-optimized design features. These features dramatically accelerate the identical, predictable, independent calculations required by AI algorithms. They include executing a large number of calculations in parallel rather than sequentially, as in CPUs; calculating numbers with low precision in a way that successfully implements AI algorithms but reduces the number of transistors needed for the same calculation; speeding up memory access by, for example, storing an entire AI algorithm in a single AI chip; and using programming languages built specifically to efficiently translate AI computer code for execution on an AI chip. (Section V and Appendix B.)
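Two of these ideas, bulk parallel computation and low-precision arithmetic, can be illustrated with a short sketch. The example below uses plain NumPy with invented shapes and data; it mimics only a single DNN layer and is not drawn from the report.

```python
import numpy as np

# Illustrative only: a DNN layer is, at its core, a large matrix multiplication,
# i.e. many identical, independent multiply-accumulates that AI chips run in parallel.
rng = np.random.default_rng(0)
activations = rng.standard_normal((512, 1024)).astype(np.float32)  # a batch of layer inputs
weights = rng.standard_normal((1024, 256)).astype(np.float32)      # the layer's weights

# Full 32-bit precision, as a general-purpose CPU would typically compute it.
out_fp32 = activations @ weights

# Half precision: fewer bits per number means less memory traffic and more
# arithmetic units per unit of chip area, at an accuracy cost DNNs usually tolerate.
out_fp16 = (activations.astype(np.float16) @ weights.astype(np.float16)).astype(np.float32)

max_rel_err = np.abs(out_fp32 - out_fp16).max() / np.abs(out_fp32).max()
print(f"worst-case relative deviation from using fp16: {max_rel_err:.4f}")
```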
Different types of AI chips are useful for different tasks. GPUs are most often used for initially developing and refining AI algorithms; this process is known as “training.” FPGAs are mostly used to apply trained AI algorithms to real-world data inputs; this is often called “inference.” ASICs can be designed for either training or inference. (Section V(A).)
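For readers new to the distinction, the toy sketch below (our own minimal example in plain NumPy, not from the report) treats training as iteratively adjusting a model’s weights against labeled data and inference as applying the frozen weights to a new input.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3))                         # toy input features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.standard_normal(200)           # toy labels

# "Training": iteratively adjust weights to reduce error on labeled examples
# (a tiny linear model here; DNN training does the same thing at enormous scale).
w = np.zeros(3)
for _ in range(500):
    gradient = 2 * X.T @ (X @ w - y) / len(y)
    w -= 0.1 * gradient

# "Inference": apply the now-frozen weights to a new, unseen input.
x_new = np.array([1.0, 0.0, -2.0])
print("learned weights:", np.round(w, 2))                 # should be close to [2, -1, 0.5]
print("prediction for new input:", round(float(x_new @ w), 2))
```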
Why Cutting-Edge AI Chips Are Necessary for AI
Because of their unique features, AI chips are tens or even thousands of times faster and more efficient than CPUs for training and inference of AI algorithms. State-of-the-art AI chips are also dramatically more cost-effective than state-of-the-art CPUs as a result of their greater efficiency for AI algorithms. An AI chip a thousand times as efficient as a CPU provides an improvement equivalent to 26 years of Moore’s Law-driven CPU improvements. (Sections V(B) and VI(A) and Appendix C.)
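One way to see where a figure of that order comes from (our back-of-the-envelope reading, not the report’s exact method): a 1,000x efficiency gain corresponds to roughly ten doublings, and the number of years depends on the doubling cadence assumed. The 2.6-year cadence below is an assumption chosen to match the 26-year figure.

```python
import math

# Rough consistency check of the "26 years of Moore's Law" equivalence.
efficiency_gain = 1000
doublings = math.log2(efficiency_gain)           # ~9.97 doublings of efficiency

for years_per_doubling in (2.0, 2.6):            # classic cadence vs. an assumed slowed cadence
    years = doublings * years_per_doubling
    print(f"{doublings:.1f} doublings at {years_per_doubling} yr each ~ {years:.0f} years of CPU progress")
```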
Cutting-edge AI systems require not only AI-specific chips, but state-of-the-art AI chips. Older AI chips, built with larger, slower, and more power-hungry transistors, incur enormous energy costs that quickly balloon to unaffordable levels. As a result, using older AI chips today imposes overall costs and slowdowns at least an order of magnitude greater than using state-of-the-art AI chips. (Sections IV(B) and VI(A) and Appendix D.)
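As a purely hypothetical illustration of how those energy costs scale (every number in the sketch below is invented, not taken from the report or its appendices), consider one fixed workload run on chips whose energy efficiency differs by a factor of ten: the electricity bill grows by the same factor, before accounting for the older chips also being slower.

```python
# All numbers are assumed for illustration; none come from the report.
workload_operations = 1e23            # total operations for one fixed, large AI workload (assumed)
electricity_price_usd_per_kwh = 0.10  # assumed electricity price
JOULES_PER_KWH = 3.6e6

chips = {
    "state-of-the-art AI chip": 1e-11,   # joules per operation (assumed)
    "older-generation AI chip": 1e-10,   # assumed 10x less energy-efficient
}

for name, joules_per_op in chips.items():
    energy_kwh = workload_operations * joules_per_op / JOULES_PER_KWH
    cost_usd = energy_kwh * electricity_price_usd_per_kwh
    print(f"{name}: ~{energy_kwh:,.0f} kWh, ~${cost_usd:,.0f} in electricity")
```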
These cost and speed dynamics make it virtually impossible to develop and deploy cutting-edge AI algorithms without state-of-the-art AI chips. Even with state-of-the-art AI chips, training an AI algorithm can cost tens of millions of U.S. dollars and take weeks to complete. In fact, at top AI labs, a large portion of total spending is on AI-related computing. With general-purpose chips like CPUs or even older AI chips, this training would take substantially longer and cost orders of magnitude more, making it virtually impossible to stay at the research and deployment frontier. Performing inference with less advanced or less specialized chips can involve comparable cost overruns and take orders of magnitude longer. (Section VI(B).)
Implications for National AI Competitiveness
State-of-the-art AI chips are necessary for the cost-effective, fast development and deployment of advanced security-relevant AI systems. The United States and its allies have a competitive advantage in several semiconductor industry sectors necessary for the production of these chips. U.S. firms dominate AI chip design, including electronic design automation (EDA) software used to design chips. Chinese AI chip design firms are far behind and are dependent on U.S. EDA software to design their AI chips. U.S., Taiwanese, and South Korean firms control the large majority of chip fabrication factories (“fabs”) operating at a sufficiently advanced level to fabricate state-of-the-art AI chips, though a Chinese firm recently gained a small amount of comparable capacity. Chinese AI chip design firms nevertheless outsource manufacturing to non-Chinese fabs, which have greater capacity and exhibit greater manufacturing quality. U.S., Dutch, and Japanese firms together control the market for semiconductor manufacturing equipment (SME) used by fabs. However, these advantages could disappear, especially with China’s concerted efforts to build an advanced chip industry. Given the security importance of state-of-the-art AI chips, the United States and its allies must protect their competitive advantage in the production of these chips. Future CSET reports will analyze policies for the United States and its allies to maintain their competitive advantage and explore points of control for these countries to ensure that the development and adoption of AI technologies increases global stability and is broadly beneficial for all. (Section VII.)