Sarah Kreps
Last September, Emmett Shear, who later served briefly as interim CEO of OpenAI during an organizational shakeup, cautioned that artificial intelligence was moving too quickly: “[I]f we’re at a speed of 10 right now, a pause is reducing to 0. I think we should aim for 1-2 instead.” Shear, who is concerned about human-level artificial intelligence, did not last long as CEO. And OpenAI, like its competitors, has kept its foot on the development pedal. Pauses (like the one suggested for giant AI experiments) or Shear’s proposed slowdown are unlikely to work given the investment-driven technological arms race that has raised the payoff for first-mover advantage.
In science, technology, and business, “disruptive innovation” is a goal. But the same innovations that start in the business world often end up in the national security space. Disruptive innovations are both dual-use and double-edged. Not only can they go from civilian to military applications, but they can also present opportunities as well as dangers. There is no stopping innovation, nor should there be. But it’s important to ask if innovators can create transformative technologies while not imperiling their country’s national security. In other words, can they be more mindful scientific stewards without compromising advancement? And what can today’s disruptors learn from yesterday’s innovators?
Answering those questions requires reflecting on historical instances of disruptive technologies. It turns out that the scientists behind some of those technologies did not intend to be disruptive. Alfred Nobel is the quintessential example. Nobel was curious, a “barefoot empiricist,” the type of scientist who learned by hands-on, trial-and-error experimentation. In 1847, an Italian chemist had discovered nitroglycerin, which would later be used as a treatment for angina. Nobel began studying the chemical’s therapeutic virtues but also recognized its volatile properties and ultimately developed the explosive and detonator that countries used in ways he came to regret. The Swedish innovator nursed his remorse by creating five Nobel Prizes, one a peace prize that would be awarded in Norway, a country that had long sought to dissolve its union with Sweden.
In 1897, two years after Nobel drafted the will that would create the Peace Prize, British physicist J.J. Thomson discovered the first subatomic particle when he observed cathode rays deflected by electric and magnetic fields, implying they were composed of negatively charged particles, later named electrons. The discovery unlocked the field of subatomic particle physics that led to Bohr’s model of the atom, the discovery of isotopes in the early 20th century, quantum mechanics in the 1920s and 1930s, and fission in 1938. These steps culminated in the making of the atomic bomb, which left its lead scientist guilt-ridden. At the first detonation of the bomb, Robert Oppenheimer, director of the Los Alamos laboratory that built the first two bombs, is said to have recalled a verse from the Hindu scripture the Bhagavad Gita: “Now I am become Death, the destroyer of worlds.”
As with nitroglycerin, atomic physics had salutary applications. Nuclear medicine is crucial in diagnosing, staging, and treating cancer. But as with Nobel, Oppenheimer did not need a crystal ball to know these discoveries could also end in destruction. The scientists Lise Meitner and Joseph Rotblat found off-ramps well before the bomb’s completion. Meitner declined to join the Manhattan Project, and Rotblat left once it was clear that Germany would lose World War II without developing the bomb. Rotblat later joined Albert Einstein in arguing publicly for arms control and created an organization, the Pugwash Conferences on Science and World Affairs, that would later win a Nobel Peace Prize for its efforts on nuclear disarmament. Rotblat’s resistance did not, however, stop the atomic or hydrogen bombs from being built, or prevent an arms race in which the United States and the Soviet Union built enough nuclear weapons to kill each other many times over.
Seeking scientific stewardship. These historical examples raise vexing questions about technological imperatives. Once a technology becomes feasible, is it also inevitable, regardless of its anticipated adverse consequences? Just because something is technologically attainable, does that mean it should be attained? And most important, what would a scientific conscience look like in a world of technological imperative, as technologies grow larger than any one innovator and become part of a broader competitive ecosystem?
Surely there must be a better way than sleepwalking into harm, existential or otherwise.
Innovators can start by asking the right questions. “Deciding what not to do,” Steve Jobs said, “is as important as deciding what to do.” This means asking what should not be done or how something can be done more ethically, sustainably, transparently, and responsibly, rather than focusing entirely on whether, or how fast, it can be done technologically. One of the common features of technological buyer’s remorse is that innovators did not seem to anticipate how their creations could be misused or how political actors could capture and use their innovations in new and unplanned ways. So the first step in ethical innovation is gaming out the possible consequences and misuses of the innovation. As people in the intelligence community say, it’s important to think like a terrorist and imagine not just the garden-variety intended uses or behaviors but the vulnerabilities—and not just of the current iteration, but also of version 12.0 of the technology.
The second step toward ethical and sustainable technological advance is to design some guardrails. The challenge is to figure out what guardrails mean short of banning a technology, which is neither practical nor enforceable. Some observers might point to how a combination of the Nuclear Suppliers Group, the Non-Proliferation Treaty, the Wassenaar Arrangement on export controls, and the Coordinating Committee for Multilateral Export Controls helped limit nuclear proliferation and the diffusion of sensitive military technologies. But these arrangements took decades to conclude and were narrowly targeted at nuclear weapons and specific dual-use technologies. That success story will be hard to replicate.
Contrary to proposals floated on Capitol Hill, similar measures do not work for technologies like social media (TikTok, for example), given their widespread and easy adoption. More viable are Meta’s recent measures, which use data collected from past incidents to anticipate where and how malicious actors might use the platform to incite violence or interfere with elections. Another example is OpenAI’s prophylactic measures on electoral integrity, which entail making it harder to build applications that discourage voting, allowing users to report violations, and limiting users from building tools for personalized political persuasion. While commendable, these measures are almost self-evident after a multitude of incidents, not least the 2016 election, in which the prospect of misuse was made manifest. But they serve as examples of the type of thinking that engineers should do ex ante, or before the event.
Engineering leaders also need to anticipate the possibility of scaling and disruptive impact well beyond their intended areas and see the public sector as a partner rather than an inconvenience or adversary. This is easier said than done. The Securities and Exchange Commission, for example, has been hostile toward cryptocurrency, which makes the commission an improbable partner for crypto-related firms. But Microsoft has worked with the government to support the White House’s cybersecurity Executive Order and the Cybersecurity and Infrastructure Security Agency (CISA). Obviously, the collaboration benefits the government by incorporating Microsoft’s expertise. But Microsoft benefits by better understanding the potential security issues that might threaten its own infrastructure and the country’s.
Finally, scientists are unlikely to have all the answers. An open, transparent process is more accountable than an opaque one. Open-source principles—sharing data and findings, engaging in transparent decision-making, and encouraging participation and collaboration from a broad community—can help surface and address the possible risks that one scientist or an insular group or discipline might not identify. Given the financial stakes, innovators may have trade secrets that push them toward the intellectual-property protections of closed-source systems, but the transparency of open-source projects allows vulnerabilities to be identified and addressed through community collaboration. Mindful, open engagement across disciplines, missing during the development of dynamite and the atomic bomb, might prompt questions about what to do with today’s emerging technologies—and also what not to do.
The governance challenge. The risks of misuse are less obvious for technologies like AI than they were for nuclear weapons. All the scientists observing the Trinity test knew it was a watershed event, that humans had entered the nuclear age and that they had been responsible. Nuclear weapons had firebreaks, a clear line between conventional and nuclear, and measurements of yield in kilotons or megatons. AI developments are far more incremental. No one can agree on what artificial general intelligence (AGI)—systems that rival and potentially exceed human intelligence—would mean, and Sam Altman has admitted that AGI is “a ridiculous and meaningless term.” Google’s DeepMind tried to create a tractable framework for understanding artificial general intelligence and introduced a tiered categorization of machine intelligence, with 0 meaning a system is not AI at all (like a calculator) and 5 meaning it is superhuman, outperforming 100 percent of humans. These levels, however, do not cleanly correspond to risk. There is no inherent reason to conclude that a level 5 technology is threatening just because it can outperform all humans. If experts cannot agree on definitions and their implications for risk, then initiatives like OpenAI’s Red Teaming Network, aimed at “improving the safety of OpenAI’s models,” look like virtue signaling. Evaluating model vulnerability or safety requires an established baseline of risk from which to evaluate deviations.
Further, OpenAI’s goal of superalignment—the idea that the objectives of a superintelligent AI system align with those of humans—sounds lofty and laudable, but that term too obfuscates a fundamental question: aligned with which humans? One possibility is that the system is aligned with the platform (like OpenAI’s ChatGPT), specifically with the developers themselves and their values. The AI could instead align with its user, which is a different problem because it invites confirmation bias, taking the perceived inclination of a user’s prompt and affirming that perspective in the response, or outright sycophancy, which could perpetuate incorrect information or biases simply to match user feedback. The AI could also be aligned with society’s goals, but then there is the question of which society and whose values. El Salvador’s new government has raised fundamental debates about whether to optimize for public safety or for democracy, values that have not easily coexisted there and seem to require a choice. Translating the “answer” to these kinds of philosophical and subjective choices into code or an algorithm is not an obvious process. These are very different issues from the technical questions of whether the United States could detect Soviet nuclear tests of a particular yield, altitude, and distance.
Current debates about AI safety and scientific stewardship may over-analogize, comparing today’s diffuse, uncertain threats to technologies like nuclear weapons or dynamite, whose consequences were obvious and immediately lethal. The obvious risk posed by atomic weapons made it possible to coalesce quickly around a consensus on avoiding nuclear war, or even any nuclear use. Experts might have disagreed on the paths to take to avoid those outcomes, but they at least agreed on the goal.
The goals for AI risk mitigation are nowhere close to consensus, both because the mainstream use of AI is still fairly recent and because of the technology itself, which, despite the imaginative appeal of science fiction scenarios, is unlikely to produce hegemonic machine overlords. Preoccupation with those unlikely scenarios takes the pressure off establishing the nearer-term assessments of baseline risk needed for the stated goals of AI safety and “superalignment.” Artificial intelligence development may not culminate in a Trinity, TNT, or Skynet, but instead in the world of Alice in Wonderland, where, without knowing the destination, it is impossible to know which road to take.