9 December 2020

Applying Science and Analytics to the Exploitation of Open Source Intelligence

By Dan Gouré

Through the COVID-19 pandemic, a commonly heard phrase has been “trust the science.” The same can be said in the realm of intelligence. The more that scientific wisdom and quantitative analytic tools can be applied to the collection and analysis of raw data, the greater the likelihood that useful intelligence can be generated. This is particularly the case when it comes to exploiting the vast amounts of data available from open sources.

Publicly available information, or open source intelligence (OSINT), is accessible in all countries and from every group, even so-called hard targets such as North Korea and ISIS. When proven principles from the social and behavioral sciences are applied in combination with sophisticated computer models, artificial intelligence and machine learning, OSINT can be an extremely valuable tool supporting government and commercial decision making.

Intelligence collection is most commonly associated with spies, surveillance satellites, electronic intercepts and cyber hacking. The ability to “listen in” on the deliberations of world leaders or acquire their secret plans can be of inestimable value in peacetime, crisis or war. These capabilities are particularly useful on the battlefield, where divining the adversary’s intentions and predicting their activities can produce war-winning results.

But OSINT is another world of information, one that can be extremely useful and is increasingly being exploited for commercial and national security purposes. There are six basic types of OSINT: public media sources; the Internet; public official government data; professional and academic publications; commercial data provided by governments and corporations; and so-called “grey data,” which includes private business documents, unpublished works and technical reports, as well as patents that reside in accessible databases.

Historically, OSINT has been more art than science. There was always too much data in too many forms for more than cursory analysis. With the advent of advanced computing, mass storage and artificial intelligence, science and analytics can now be applied to OSINT.

The importance of OSINT has grown in recent decades. Even so-called hard intelligence targets, which limit their communications and otherwise maintain a high level of physical and electronic security, find it difficult to hide from open source collection. Rogue nations such as North Korea, adversarial institutions like the Russian Armed Forces’ General Staff, and terrorist groups like ISIS and Al Qaeda are generally considered hard targets. But these targets produce a surprisingly large amount of potentially useful open source information. They need open forms of communication to mobilize their populations, communicate plans and strategies, achieve buy-in from their memberships and conduct critical operations.

OSINT can supplement traditional forms of intelligence. OSINT solutions can identify patterns in otherwise incomprehensible masses of data. Quantitative modeling of open source data can improve U.S. intelligence analysis and even enhance targeting for clandestine operations. But OSINT also can provide useful information and insights in areas where other methods cannot. Properly collected, organized and analyzed, OSINT can not only complement traditional intelligence collection methods but also, in some cases, provide leading indicators.

OSINT can support the exploration of alternative U.S. policy options as well as test potential responses by other parties to prospective U.S. initiatives and actions. With the application of social science methodologies and AI, it can predict attitudes, trends and even behaviors. It also can provide guidance for influence operations.

The demand for data mining software has grown as the quantity and quality of publicly available information have exploded. A number of applications are available to both government and commercial clients. Most of these are really search engines rather than analytic or predictive tools.

One of the best-known companies in the more sophisticated field of data-driven decision making, using both classified information and OSINT, is Palantir. While its original clients were government agencies, particularly members of the U.S. Intelligence Community, Palantir has expanded into the exploitation of a wide variety of OSINT for commercial clients.

Another pioneering company in the field of OSINT is Kingfisher Systems, Inc. The company created an innovative computation platform, called VARYSS, to exploit all types of OSINT. But VARYSS is much more than just a data mining tool. VARYSS combines machine-coded aggregate data with refined issue analysis to ascertain trends and identify entities of interest. It is bringing science and analytics to bear on the challenge of exploiting open source data.

The platform has three major elements. First is the continuous autonomous collection, ingestion, and storage of open source data from around the world. Some 750,000 news items in nearly 100 languages are collected weekly and stored for later exploitation. VARYSS has 15 years of collected information in its databases. Second is a proprietary approach to the automated organization, integration and annotation of this large database. Using computer-based auto-curation algorithms, the system can generate millions of specific data points per day from unstructured OSINT. Third is applying a unique computational/analytic methodology to a specific intelligence or policy issue.
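For readers who think in code, the three elements can be pictured as a simple pipeline. What follows is only a toy sketch of that general collect, annotate, analyze pattern, not a description of how VARYSS is built: the `NewsItem` and `DataPoint` types, the function names, the substring matching and the sample headlines are all hypothetical, and the real platform's methods are proprietary and not described in this article.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class NewsItem:
    """One collected open source item (hypothetical structure)."""
    published: date
    language: str
    text: str


@dataclass(frozen=True)
class DataPoint:
    """One structured observation extracted from an item (hypothetical)."""
    week: Tuple[int, int]  # ISO (year, week number)
    entity: str


def collect(raw_items):
    """Stage 1: ingest and store items (here, simply a list in memory)."""
    return [NewsItem(*item) for item in raw_items]


def annotate(items, tracked_entities):
    """Stage 2: turn unstructured text into structured data points.

    Assumption: plain case-insensitive substring matching stands in for
    whatever auto-curation a real system would use.
    """
    points = []
    for item in items:
        iso = item.published.isocalendar()
        lowered = item.text.lower()
        for entity in tracked_entities:
            if entity.lower() in lowered:
                points.append(DataPoint((iso[0], iso[1]), entity))
    return points


def weekly_counts(points, entity):
    """Stage 3: a toy analytic that counts mentions of one entity per week."""
    return Counter(p.week for p in points if p.entity == entity)


# Invented sample data, for illustration only.
raw = [
    (date(2020, 11, 30), "en", "Ministry X announces new export rules"),
    (date(2020, 12, 1), "en", "Analysts react to Ministry X decision"),
    (date(2020, 12, 8), "en", "Ministry X delays rollout"),
]
items = collect(raw)
points = annotate(items, ["Ministry X"])
counts = weekly_counts(points, "Ministry X")
print(sorted(counts.items()))
```

A rising or falling weekly count is about the simplest trend signal one could compute. The point of the sketch is only the separation of stages: storage, structuring and analysis can each be improved independently.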

VARYSS exploits diverse and abundant sources of current and historical open source information to identify leading indicators related to significant events, strategic activities and historical trends. The collection of open source news items, technical information and other literature spanning over a decade allows a better understanding of events, including complex and dynamic systems, on a global, regional, and national level. The collected data is validated, generally through automated processes.

The VARYSS system does more than just collect and collate open source data. It combines the wisdom of the social sciences, the speed of artificial intelligence, and machine learning to identify and predict strategic changes. VARYSS incorporates unique measures of political behavior, allowing analysts to develop new insights and make predictions regarding hard political problems. One of the main ways the platform addresses strategic changes is by continuously collecting data on some 250,000 global influencers. VARYSS can be employed as a heuristic tool to assess different strategies for influencing strategic change. In a sense, VARYSS is the next step in exploiting open source and applying AI to extremely challenging social, political, and military problems.
