29 December 2021

Understanding the Offense’s Systemwide Advantage in Cyberspace

Jason Healey

The Cybersecurity and Infrastructure Security Agency released a statement on Dec. 11 about the massive log4j vulnerability, one of the worst the internet has ever seen. This is just the latest in a string of some of the most brazen and dangerous cyberattacks of all time, all within the past year: SolarWinds, Colonial Pipeline, JBS, Kaseya and the Microsoft Exchange megahack. No wonder that cyber defenders are suffering from burnout, alert fatigue and overwork. This is not a new phenomenon. Since the very beginning of the internet, the offense often has seemed to have the advantage over defense.

But since attackers’ success is far from assured, as defenders can and do put up agile and spirited defenses, the right question is not whether the offense or defense has the advantage in cyberspace, but under what conditions (to paraphrase what Robert Jervis and I have written elsewhere).

This post accordingly starts with a unique structure of five separate frameworks for analyzing offense-defense advantage to reconcile the differing perspectives. Though each has its uses, only one, the systemwide framework, directly addresses policy issues such as managing systemic cyber risk and understanding the impact of one-to-multitude situations like log4j, which is affecting hundreds of millions of devices, and SolarWinds, in which a single intrusion directly affected 18,000 companies.

Log4j is, if anything, even worse, overwhelming defenders as an internet-wide vulnerability that was first exploited by curious hackers and criminals, and now by nation-state spies and warriors, including China, Iran, North Korea and Turkey. It is not just one-to-multitude but multitude-to-multitude.

This post also discusses the origins of systemwide cyber offense advantage, especially as documented over the past few decades in the computer security literature, and it suggests ways to measure the advantage. These metrics are among the first proposals to directly measure systemic cyber risk and whether the defense is closing the gap on the offense.

Finally, this post offers suggestions for responding to the issue of offense advantage. Policymakers can help reverse the decades-long systemwide offense advantage by mimicking attackers and focusing on one-to-multiple defenses, which give the defense the greatest advantage over attackers at the greatest scale and least cost. The overall goal must be to give the defense a systemwide advantage, to smother one-to-multitude attacks and prevent defenders from repeatedly getting overrun.

Five Frameworks to Analyze Offense-Defense Balance

Most cybersecurity is narrowly focused: protecting a specific company’s information technology enterprise or testing a particular network or piece of software to see how secure it is. Cybersecurity is far more than that, though, and includes the back and forth between adversaries during national security competition and conflict.

Reconciling these differing perspectives requires five different offense-defense frameworks, each with unique dynamics, both influencing and influenced by the others.

The five (in general order of the narrowest to largest scale) are listed in Table 1.

Table 1. Frameworks for analyzing cyber offense-defense balance.

| | Asset | Dyadic-Actor | Operational | Strategic | Systemwide |
| --- | --- | --- | --- | --- | --- |
| Measures offense-defense balance between | Specific system against a threat | Specific attacker against a specific defender | “On-net” activities between adversaries | Campaigns by a specific nation-state against another | All attackers and defenders in entirety of cyberspace |
| Unit of analysis | One-to-one to one-to-few | One-to-one | One-to-one to one-to-few | One-to-one | One-to-multitude, multitude-to-multitude, one-to-all |
| Distinguishability of offense and defense | Some similar tools and methods; distinguishable activities | Some similar tools and methods; distinguishable activities | Hard to distinguish | Highly dependent on specifics | Generally distinguishable |
| Use of framework in cybersecurity industry (size of global market) | Penetration testing (~$1.1 billion) | | | | |

The asset framework is the original computer security framework and the narrowest: the defense of a specific computer, network or piece of software (or a small and relatively bounded system). This framework is at the heart of the work done by the Ware Report of 1970, the founding document of computer security, and work at the National Security Agency in the 1980s on Trusted Computer Systems. Attackers and defenders have some similar tools and methods, but the activities of each are generally distinguishable, as there is no counteroffensive or active defense.

The dyadic-actor framework—such as the work of Rebecca Slayton or the more tongue-in-cheek Herb Lin—studies the pairing of a specific attacker against a specific defender. When the Russian military intelligence group known as Sandworm disrupted the Prykarpattyaoblenergo control center of Ukraine’s electrical grid, the relevant dyadic actors for this framing were Sandworm versus Prykarpattyaoblenergo, not Russia versus Ukraine. Again, some tools (e.g., scanning) and methods (e.g., those of red teams) are shared between the offense and defense, but their activities remain distinguishable as the vast bulk of defenders lack any meaningful counteroffensive capabilities.

The operational framework describes the back and forth of on-net activity between adversaries, often in “gray space” (the networks and systems owned by neither adversary) including persistent engagement: the large-scale, long-term sparring for advantage through reconnaissance, intrusions, campaigns and disruptions. The advantage here may not accrue to the offense or defense per se, as argued by Richard Harknett, but to whichever side operates with the most persistence, “grappling over who has the initiative.” Offense and defense can be hard to distinguish both because of this grappling and because the participants are military or intelligence forces using similar tools and methods to conduct intrusive reconnaissance, anticipatory attacks, counteroffensives and reprisals.

The strategic (or dyadic-nation state) framework is an international relations framework examining the back and forth between pairs of nation-state adversaries. Explored in the articles and books of Brandon Valeriano and his co-authors, it can yield insights not discoverable using other frameworks, which are silent on states’ national security goals.

The distinguishability of offense and defense depends on the endogenous factors of particular campaigns. The U.S. military takedown of a Russian ransomware network in the run-up to the 2020 election was more clearly cyber-defensive than the U.S.-Israeli Stuxnet attack on Iranian nuclear enrichment.

The systemwide framework examines the largest-scale dynamics, the inherent systemic cyber risk of the internet. This framework includes both top-down effects (such as insecure technical standards) and emergent and bottom-up system effects (such as the proliferation of insecure “Internet of Things” devices). It includes the full range of internet “attacks,” including not just nation-state disruptions and espionage but also cyber crime and even hacking for fun and curiosity. The tools, methods and activities of offense and defense in this framework are relatively distinct.

To put these frameworks in context, patching the log4j vulnerability on a single system is in the asset framework; the global push to patch the tens of millions of vulnerable systems is systemwide. If Russian military intelligence tries to exploit the vulnerability in a particular disruption operation, that would fall in the dyadic-actor framework, while U.S. Cyber Command’s attempts to stop it would be in the operational framework. If that disruption is an attempt to sabotage Ukrainian computers as part of, or in lieu of, a Russian invasion, then it would be in the strategic framework.

Origins of the Systemwide Framework

The computer security literature consistently has discussed systemwide offense advantage and how to reverse it. Some of that literature might be categorized as expert opinion or conventional wisdom, such as “the attacker must find but one of possibly multiple vulnerabilities in order to succeed; the security specialist must develop countermeasures for all.”

The most important findings, however, are rooted in observation of long-established patterns, modeling and scientific discoveries. These find there is no single cause for systemwide offense advantage; indeed, there are more causes than can be compiled in any single article. One of the more comprehensive reports, the 2017 report by the New York Cyber Task Force, noted a dozen but still probably missed at least as many.

Systemwide offense advantage goes back to the internet’s original sin: prioritizing availability over security. One pioneer recalled that “it’s not that we didn’t think about security. We knew that there were untrustworthy people out there, and we thought we could exclude them.” Every network and device that is using or has ever used the internet inherited the same insecurities, a lesson its founders realized by 1988—when the Morris Worm decimated the early internet. Different design decisions in that earliest era might have led to better security not just for each internet-connected network and device but for every one built since then.

By 1991, the U.S. National Research Council was trying to tackle how the advantage was “heavily to the attacker” in its Computers at Risk report. The report’s recommendations are rooted in systemwide assessments of one-to-multitude attacks, such as “the current concern about virus attacks derives … from the total lack of a countermeasure [in Microsoft and Apple operating systems]” and “had such attacks been anticipated, the means to resist them could have been intrinsic to the systems.”

The dominance of the Microsoft operating system led to a “software monoculture,” which can “create aggregated risk like nothing else,” according to another major report by expert cybersecurity practitioners. The September 2003 report warned of a different kind of systemwide vulnerability: common-mode failures, like log4j, “a design fault that causes redundant copies of the same software process to fail under identical conditions,” which is as applicable to ransomware now as it was to viruses then:

The NIMDA and Slammer worms that attacked millions of Windows-based computers were examples of such “cascade failure”—they spread from one to another computer at high rates. Why? Because these worms did not have to guess much about the target computers because nearly all computers have the same vulnerabilities.

This was not mere opinion. In simulations, cascading failures were far more likely once even a single exploitable flaw reached 43 percent prevalence, a threshold well below the prevalence of Microsoft and other systemically important software, then and now. Dan Geer, one of the report’s authors, later wrote that “only monocultures enable Internet-scale failure; all other failures are merely local tragedies.” As with all highly interconnected, tightly coupled systems, such “unacknowledged correlated risk of cyberspace” leads to very unpredictable, extremely high-consequence incidents.
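The intuition behind a prevalence threshold can be illustrated with a toy model. The sketch below is not the cited simulation and will not reproduce its 43 percent figure. It assumes a random network of 300 hosts with about four links each (both numbers are arbitrary choices), gives each host the flaw with a chosen probability, and lets a worm spread only between linked hosts that share the flaw. All function and parameter names are hypothetical.

```python
import random
from collections import deque


def outbreak_fraction(n_hosts, avg_degree, prevalence, rng):
    """Fraction of all hosts reached by a worm that only infects hosts sharing one flaw."""
    # Assumed topology: every pair of hosts is linked with the same small probability.
    p_edge = avg_degree / (n_hosts - 1)
    neighbors = [[] for _ in range(n_hosts)]
    for a in range(n_hosts):
        for b in range(a + 1, n_hosts):
            if rng.random() < p_edge:
                neighbors[a].append(b)
                neighbors[b].append(a)

    # Each host runs the flawed software with probability `prevalence`.
    vulnerable = [rng.random() < prevalence for _ in range(n_hosts)]
    candidates = [h for h in range(n_hosts) if vulnerable[h]]
    if not candidates:
        return 0.0

    # Breadth-first spread from one randomly chosen vulnerable host.
    start = rng.choice(candidates)
    infected = {start}
    queue = deque([start])
    while queue:
        host = queue.popleft()
        for peer in neighbors[host]:
            if vulnerable[peer] and peer not in infected:
                infected.add(peer)
                queue.append(peer)
    return len(infected) / n_hosts


def mean_outbreak(prevalence, trials=20, seed=0):
    """Average outbreak size over several freshly generated toy networks."""
    rng = random.Random(seed)
    sizes = [outbreak_fraction(300, 4.0, prevalence, rng) for _ in range(trials)]
    return sum(sizes) / trials


if __name__ == "__main__":
    for prevalence in (0.1, 0.25, 0.5, 0.9):
        print(prevalence, round(mean_outbreak(prevalence), 3))
```

In this toy model outbreaks stay local while the flaw is rare and become network-wide once it is common. Where the switch happens depends entirely on the assumed topology, so the crossover seen here should not be read as an estimate for the real internet.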

Later academic work discovered that the internet is a kind of network known as “scale-free,” strongly resistant to random failures but inherently vulnerable to targeted attacks against the most connected hubs. If targeted properly, “the simultaneous elimination of as few as five to 15 percent of all hubs can crash a system.”
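A small experiment makes the contrast between random failures and targeted attacks concrete. The sketch below grows a network by preferential attachment (new nodes prefer to link to already well-connected nodes, a standard way to produce a scale-free-like structure), then removes 10 percent of nodes either at random or starting with the most connected hubs. The network size, the two links per new node, the 10 percent figure and all names are illustrative assumptions, not values from the research quoted above.

```python
import random


def preferential_attachment_graph(n_nodes, links_per_node, rng):
    """Grow a graph in which new nodes prefer to link to well-connected nodes."""
    adjacency = {i: set() for i in range(n_nodes)}
    core = links_per_node + 1
    # Each node appears in `endpoints` once per link it has,
    # so a uniform draw from this list is weighted by degree.
    endpoints = []
    for a in range(core):  # start from a small fully connected core
        for b in range(a + 1, core):
            adjacency[a].add(b)
            adjacency[b].add(a)
            endpoints += [a, b]
    for new in range(core, n_nodes):
        targets = set()
        while len(targets) < links_per_node:
            targets.add(rng.choice(endpoints))
        for target in targets:
            adjacency[new].add(target)
            adjacency[target].add(new)
            endpoints += [new, target]
    return adjacency


def largest_component_share(adjacency, removed):
    """Share of the original nodes left in the biggest connected piece after removals."""
    seen = set(removed)  # removed nodes are never visited or traversed
    best = 0
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for peer in adjacency[node]:
                if peer not in seen:
                    seen.add(peer)
                    stack.append(peer)
        best = max(best, size)
    return best / len(adjacency)


def compare_failures(n_nodes=1000, links_per_node=2, fraction=0.1, seed=1):
    """Return (share surviving random loss, share surviving hub-first loss)."""
    rng = random.Random(seed)
    graph = preferential_attachment_graph(n_nodes, links_per_node, rng)
    k = int(n_nodes * fraction)
    random_loss = set(rng.sample(sorted(graph), k))
    hubs_first = sorted(graph, key=lambda node: len(graph[node]), reverse=True)
    targeted_loss = set(hubs_first[:k])
    return (largest_component_share(graph, random_loss),
            largest_component_share(graph, targeted_loss))


if __name__ == "__main__":
    random_share, targeted_share = compare_failures()
    print("largest connected share after random loss:", round(random_share, 3))
    print("largest connected share after hub-first loss:", round(targeted_share, 3))
```

Note that this sketch removes a fraction of all nodes ranked by degree, which is a simplification of "eliminating hubs"; it is meant only to show the direction of the effect, not its size on the real internet.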

Defining and Measuring Systemwide Offense-Defense Balance

The systemwide offense advantage in cyberspace has not been solved in part because it hasn’t been clearly described or measured. This section proposes several possibilities, borrowing from international relations. Exact measurements may be difficult but fortunately are not needed, as the scale and magnitude of the trends should be enough to determine the relative advantage over time between offense and defense.

The scale achievable in one-to-multitude attacks measures the rate over time at which attackers achieve scale by amortizing their investments in technology and skills over many attacks. This might be measured in several ways. Attackers may be advancing their relative advantage if more zero-day vulnerabilities are found “in the wild” and have a longer lifespan. Defenders seem to have advanced with investments in resilience: there was a measurably higher likelihood of attacks disrupting the entire internet when there were only two internet exchange points and 13 Domain Name System (DNS) root servers, compared to, respectively, more than 1,100 and 600 today. A third way to measure achievable scale is by the systemic impact of an unsuccessful defense. The potential upstream and downstream impact of a one-day attack on a systemically important company, according to a Rand study, can be many times the costs faced at the company itself—findings generally confirmed by Federal Reserve economists.

The surplus profits of cybercrime might be another useful measure. Cybercrime had historically been limited by the opportunities for revenue, that is, until the invention of cryptocurrencies revolutionized monetization at scale through ransomware. Profits ballooned: “[S]ome common criminal businesses … may routinely require nearly $3,800 a month and could return up to $1 million per month.” Further systemwide changes could amplify or dampen such excess profits.

The relative difficulty of hiding versus detection and ejection is related to two of the cybersecurity industry’s most important statistics: the mean times for defenders to detect and to eject adversaries.

The good news is that organizations are now detecting intrusions far more quickly. In 2016, nearly 70 percent of breaches were detected in “months or more” and about 25 percent in “days or less.” By 2020, those stats had flipped.

The bad news, however, is that this change is likely tied to the rise of ransomware, which announces itself: if victims don’t notice it, they won’t pay. Supporting this pessimistic reading, criminal hacking groups need substantially less time to “break out” from their initial intrusion into a company.
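As a rough illustration of how such statistics could be tracked, the sketch below computes mean time to detect, mean time to eject, and the shares of intrusions detected in “days or less” and “months or more.” The incident records are invented for the example, and the bucket boundaries (under 7 days, 30 days or more) are assumptions, since the figures quoted above come from industry reports that define their own categories.

```python
from datetime import date
from statistics import mean

# Hypothetical incident records: (intrusion began, detected, attacker ejected).
INCIDENTS = [
    (date(2020, 1, 3), date(2020, 1, 4), date(2020, 1, 10)),
    (date(2020, 2, 1), date(2020, 5, 15), date(2020, 6, 1)),
    (date(2020, 3, 10), date(2020, 3, 12), date(2020, 3, 13)),
    (date(2020, 4, 20), date(2020, 4, 20), date(2020, 4, 27)),
]


def dwell_metrics(incidents, days_cutoff=7, months_cutoff=30):
    """Summarize how long intrusions went undetected and how long ejection took."""
    # Days from the start of the intrusion until the defender noticed it.
    detect = [(found - began).days for began, found, _ in incidents]
    # Days from detection until the attacker was removed.
    eject = [(gone - found).days for _, found, gone in incidents]
    return {
        "mean_days_to_detect": mean(detect),
        "mean_days_to_eject": mean(eject),
        # Assumed bucket boundaries; real reports may draw these lines differently.
        "share_days_or_less": sum(d < days_cutoff for d in detect) / len(detect),
        "share_months_or_more": sum(d >= months_cutoff for d in detect) / len(detect),
    }


if __name__ == "__main__":
    for name, value in dwell_metrics(INCIDENTS).items():
        print(name, value)
```

Tracked year over year, a falling mean time to detect alongside a stable or rising share of ransomware incidents would be consistent with the pessimistic reading above: faster detection driven by attackers who announce themselves rather than by better defense.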

Systemwide metrics might also measure the relative difficulty of patching compared to exploiting. Using hard data, researchers recently found that:

In the time leading up to a vulnerability being patched, attackers have a solid advantage. Armed with the patch, defenders take back the momentum by remediating exposed vulnerabilities across their environment. After that initial push, defender momentum wanes, and attackers regain the long-term advantage.

Responding to Systemwide Offense Advantage

Systemwide offense advantage is a sticky problem: Most of the reasons given for offense advantage in 1991 are still germane, 30 years later, despite thousands of patents’ worth of innovation, trillions of dollars of cybersecurity spending and all of those exhausted defenders.

Policymakers should focus on reducing the risks of one-to-multitude attacks. They can do so only through solutions with leverage, giving the defense the greatest advantage over attackers at the greatest scale and least cost. A single dollar (or hour) of defensive effort is no match for the offense, as the attackers have deep pockets. To break the decades-long systemwide offense advantage, each dollar of defense must defeat hundreds or thousands or millions of dollars of offense. This happens only when defenders mimic their attackers with one-to-multiple defenses.

A key finding of the New York Cyber Task Force was that one-to-multiple defenses aren’t just possible but are being implemented, whether as innovations in policy, operations (such as Verizon’s VERIS for describing security incidents, the ATT&CK Framework and the sharing-at-scale of the Cyber Threat Alliance) or technology (such as Windows Update and end-to-end encryption, which uplift all systems and users). Implementing software bills of materials could greatly diminish the impact of global vulnerabilities like log4j. These defenses-at-scale innovations haven’t matched the pace of those for attacks-at-scale, so they must be prioritized and accelerated.

To avoid miscalculations, policymakers must incorporate systemwide offense advantage into their plans. Much of the current U.S. strategy is being built in line with international relations assessments that cyber conflict is an intelligence contest or strategic competition below the level of armed attack. While these are accurate descriptions, such findings are based on how states have behaved relatively recently, and such intent can change quickly.

Jenny Jun expects that ransomware will soon be used for coercive geopolitical attacks, a natural next step in the evolution of one-to-multitude attacks. With increasing great-power competition, some states will use their cyber capabilities not to quietly spy and steal data but to disrupt and destroy, widely and as quickly as possible.

There are also lessons for international relations scholars, who all too often depend on single-factor analysis of offense advantage, such as the undoubtedly important role of deception. The computer security literature is far more likely to discuss multiple factors, though if any rises to the top it is complexity, “the worst enemy of security.”

Complexity can help defenders if it preferentially confounds attackers: the minotaur guarding its own maze. Inherently unknowable and unmanageable, the internet is lamentably not that defense-friendly kind of complex. Even experts cannot know all of the internet’s current states or the possible permutations of future states, nor can they model it as “‘an infinite series of monsters under the bed.’” The struggle to identify and patch every piece of software that uses log4j demonstrates the limits of trying to manage such incredibly complex systems. Well-resourced organizations may succeed (the “security one-percent”) but not the hundreds or thousands more on whom those organizations depend systemwide, such as customers and their supply chain.

Although their research was constructed more narrowly (such as dyadic analysis, limited to a narrow range of online activities such as sabotage or coercion by military or intelligence forces), academics also often draw broad conclusions about systemwide cyber offense-defense balance. Illustrating the difference in perspectives, one of the most complete international relations datasets collected only 267 incidents over a decade while in a single year a leading computer security database confirmed nearly 30,000 incidents and analyzed 5,258 breaches. One-to-multitude attacks can skew findings, unless handled carefully; for example, should the Russian intrusion into SolarWinds be coded as a single campaign, an active intrusion into 110 organizations or a latent intrusion into 18,000?

Lastly, like trying to understand the dynamics of the global auto industry using a case study of Rolls-Royce, it might be misleading to draw too many conclusions from Stuxnet, as is done in several key academic articles. Singularly high-end, targeted and sophisticated, Stuxnet may not shed much light on down-market, one-to-multitude attacks.

Just because systemwide offense advantage is the internet’s original sin (having been designed for availability and not security) does not mean it should just be accepted (or exploited). There are worse things than offense advantage. If attacks escalate into a self-reinforcing spiral, the internet could pass a tipping point where there are more predators than prey, where the offense has not just the advantage but supremacy. Economic modeling I oversaw at the Atlantic Council in 2015 found that such a future could mean forgoing up to $90 trillion in cumulative global gross domestic product over 15 years.

To avoid such a future, policymakers, practitioners and academics must endeavor to end the decades-long systemwide offense advantage and, at long last, give the defenders the advantage.
