15 December 2016

Predicting the Future: Anticipating Security Events with Data Analytics

LEVI MAXEY

Is a future where people can be arrested for crimes they have not yet committed dystopian or utopian? While this scenario remains fodder for science fiction, new tools are rapidly challenging our long-held presumption of innocence until proven guilty.

In the last few decades there has been enormous growth in available electronic data, including anything from social media posts, internet browsing logs, and communications metadata—the who, when, where, and how of our digital interactions—to surveillance camera footage, biometric datasets, criminal databases, and credit card records. The result? An unprecedented pool of information at the fingertips of law enforcement and intelligence agencies.

But bulk collection efforts have simultaneously created a data deluge that human analysts cannot effectively tackle without succumbing to analysis paralysis. Only a small sliver of the troves of data collected is pertinent to security threats, raising concerns about both efficiency and privacy.

This immense access to data has, to an extent, allowed security professionals to map criminal and terrorist networks, but it has been most effective after the fact, identifying or locating those responsible through forensic analysis. For example, following the November 2015 attacks in Paris, police quickly followed a series of digital breadcrumbs, beginning with a discarded phone found near the Bataclan theater, that led investigators to where the attackers were hiding.

But how can law enforcement and intelligence services manage the immense amount of data available to become more proactive in preventing security incidents before they occur?

Law enforcement and intelligence agencies are seeking innovative predictive data analytics tools that give police officers and analysts advance warning so they can interdict threats before they materialize. Using advanced computer algorithms, these tools sift through collected data to identify the most critical pieces of information and forecast potential security events.

Robert Muggah, Research Director at the Igarapé Institute, suggests “predictive policing is one of the most widely known forecasting platforms. It is based on the expectation that crime is hyper-concentrated and contagious.” He goes on to note “the shift from describing patterns of crime to predicting them was triggered in part by the spread of powerful mapping software, data processing systems and social media.” If police officers are told when, where, and what crimes will likely be committed, they can focus traditional policing efforts in those areas, and their increased presence could, in theory, act as a criminal deterrent.
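To make the hotspot idea concrete, a minimal sketch of this kind of forecasting might weight past incidents by recency, reflecting the assumption that crime is concentrated and contagious, and aggregate them onto a map grid. The incident schema, half-life, and grid size below are illustrative assumptions, not any vendor's actual algorithm.

```python
from collections import defaultdict
from datetime import datetime

def hotspot_scores(incidents, now, cell=0.005, half_life_days=30.0):
    """Rank map grid cells by recency-weighted incident counts.

    incidents: iterable of (lat, lon, datetime) tuples, a hypothetical
    schema. Recent incidents count more, echoing the "contagion"
    assumption: each incident's weight halves every half_life_days.
    """
    scores = defaultdict(float)
    for lat, lon, when in incidents:
        age_days = (now - when).total_seconds() / 86400.0
        weight = 0.5 ** (age_days / half_life_days)
        cell_key = (round(lat / cell), round(lon / cell))  # snap to grid
        scores[cell_key] += weight
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Usage: the top-ranked cells become candidates for increased patrols.
# ranked = hotspot_scores(past_incidents, now=datetime.utcnow())
```

Real systems layer far more on top, such as seasonality, geography, and crime type, but the core logic of turning past incident data into ranked patrol areas is roughly this simple.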

While predictive policing methods are relatively public, U.S. intelligence has also begun developing its own predictive analytics. The U.S. intelligence community’s research arm, the Intelligence Advanced Research Projects Activity (IARPA), launched its Open Source Indicators (OSI) program in 2011, aiming to anticipate significant societal events, such as political and humanitarian crises, disease outbreaks, and economic instability, using publicly available data. For example, the OSI program was the first to detect the 2014 Ebola outbreak in West Africa.

Since then, the intelligence community has sought to expand its predictive datasets to include classified data. Leading this effort is Kristen Jordan, a program manager at IARPA overseeing the organization’s Mercury program. According to Jordan, the program aims to develop “methods for continuous, automated analysis of diverse, existing foreign signals intelligence (SIGINT) data to anticipate and/or detect events such as terrorist activity, civil unrest, and disease outbreaks abroad.”

“Ultimately,” Jordan notes, “we are looking for combinations of indirect signals that when taken together can provide the basis for a forecast, with an assigned probability and lead time.” Jordan goes on to clarify that “these tools have the potential to alert us to a group event, even identifying the tipping point of radicalization of a group over time. However, anticipating the activity of a lone wolf actor is a much harder problem.”
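A toy illustration of combining indirect signals is a naive-Bayes-style fusion, where each observed indicator shifts the odds of an event. The signal names and likelihood ratios here are invented for illustration; they are not Mercury's actual indicators or method.

```python
import math

# Hypothetical indicators with illustrative likelihood ratios: how much
# more likely each signal is when an event is brewing than when not.
SIGNALS = {"chatter_spike": 3.0, "travel_pattern": 2.0, "procurement": 4.0}

def forecast_probability(observed, prior=0.01):
    """Fuse observed weak signals into a single event probability."""
    log_odds = math.log(prior / (1.0 - prior))
    for name in observed:
        log_odds += math.log(SIGNALS[name])  # assumes independent signals
    return 1.0 / (1.0 + math.exp(-log_odds))

print(forecast_probability({"chatter_spike", "procurement"}))  # ~0.11
```

No single signal is alarming on its own; it is the combination that pushes the probability high enough to warrant attention, mirroring Jordan’s description of “combinations of indirect signals.”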

Jordan’s lone wolf problem is well illustrated by the case of Ahmad Khan Rahami. Rahami bought bomb-making materials like circuit boards, electric igniters, and ball bearings on eBay, linked to jihad-related videos on social media, and traveled to Pakistan for long periods of time, all of which is legal. He had also been interviewed and arrested on separate occasions. Yet law enforcement was unable to see his propensity toward terrorism until he was arrested for planting bombs in New York and New Jersey. A potential reason behind Rahami’s seemingly rapid clearance upon reentering the country after a trip to Pakistan in March 2014 can be found in a 2011 report from the Department of Homeland Security’s inspector general, which described the National Targeting Center’s data systems as cumbersome, making complete checks on incoming passengers difficult and time-consuming.

While this is a domestic example, it shows that, much as predictive analytics can augment traditional policing methods, they should be considered simply another tool in the intelligence analyst’s larger toolkit, helping analysts better focus their efforts based on forecasts. After all, computer algorithms often fail to understand the context of data: whether someone on social media is serious about an inflammatory topic or merely joking, for example.
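A crude keyword flagger shows why. In the hypothetical sketch below, a joke and a genuine threat score identically because the matcher has no notion of intent:

```python
ALARM_TERMS = {"bomb", "attack"}

def naive_flag(post: str) -> bool:
    # Flags any post containing an alarm term, with no sense of context.
    return any(term in post.lower() for term in ALARM_TERMS)

print(naive_flag("this new album is the bomb"))      # True: a joke
print(naive_flag("we attack the depot on friday"))   # True: a threat
```

Distinguishing the two still takes a human analyst.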

However, aside from the typical privacy concerns over collecting large quantities of data, predictive analytics presents new ethical challenges based on inherent assumptions involved in forecasting events. Muggah points out that “while the stated goal of predictive policing is to reduce crime rates and police bias, there are fears it could do the opposite.” For example, Muggah notes that civil liberties groups have warned “such tools exacerbate profiling and selective policing since they perpetuate racial bias under a veneer of scientific credibility.”

A major problem with predictive analytics is that the underlying data and methods are often kept secret, and therefore not subject to the external critical assessment needed to rein in potential bias.

Jordan points out that “a significant component of the Mercury program is what we call an audit trail. For every forecast that is generated, we can go back and see what features the system recognized as indicators for that warning and understand exactly what type of data or compilation of records it was derived from.”
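In code, such an audit trail amounts to storing each forecast together with its provenance. The sketch below is a guess at the shape of such a record; the field names are illustrative, not Mercury's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

@dataclass
class AuditedForecast:
    """A forecast bundled with the evidence it was derived from."""
    event_type: str
    probability: float
    lead_time_days: int
    indicators: List[str]        # features the system recognized
    source_records: List[str]    # IDs of the records behind those features
    issued_at: datetime = field(default_factory=datetime.utcnow)

alert = AuditedForecast(
    event_type="civil_unrest", probability=0.72, lead_time_days=14,
    indicators=["chatter_spike", "price_shock"],
    source_records=["rec-0183", "rec-2047"],
)
```

An auditor who questions a warning can walk from the probability back to the exact records it was derived from, which is what makes external review for bias possible.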

While the predictive analytics used by law enforcement could be opened to in-depth public scrutiny of their underlying data and methods, programs like Mercury that deal with classified material would have to rely on built-in oversight mechanisms accountable to elected officials. Regardless, “in the future,” Muggah asserts, “there should be the introduction of independent audits for both the algorithms and data, complete with compliance requirements, whistleblower options and protections.”

Levi Maxey is a cyber and technology producer at The Cipher Brief.
