This article explores the extent to which predictive modeling can be used. Predictive modeling is the process of building a model from historical data to predict future behavior, using statistical and machine learning techniques. In this post, we will see how predictive modeling / data mining techniques can be used to identify terrorists.
Background
Terrorist attacks are happening in every part of the world, and governments announce new terror alerts every day. It has become a priority for every government to eradicate terrorism from its country. Some countries have developed analytics-driven software to predict or forecast terrorist attacks. The software identifies patterns in historical data and predicts terrorist activities.
Likelihood of being a Terrorist
The US National Security Agency (NSA) used a machine learning algorithm to assess each person's likelihood of being a terrorist. They used mobile network metadata from 55 million people in Pakistan to develop a model to identify terrorists.
The Australian Security Agency designed a terror alert system that gives its citizens a clearer idea of whether they should be alert or alarmed. It classifies threats into five levels: Not Expected, Possible, Probable, Expected and Certain.
Around 4,000 people have been killed by drone strikes in Pakistan since 2004. According to documents leaked to The Intercept, these drone strikes were carried out based on results from the machine learning algorithm. The disastrous consequence is that thousands of innocent people in Pakistan may have been mislabelled as terrorists by the algorithm.
Data
Target / Dependent Variable - Whether a person is a terrorist or not
Predictors / Independent Variables - 80 variables. Some of the variables are listed below -
| Travel Patterns |
| --- |
| No. of visits to terrorist states |
| Moved Permanently to terrorist states |
| Overnight Trips |
| Travel on particular day of the week |
| Regular Visits to locations of Interest |
| Travel Phrases |
| Other Predictors |
| --- |
| Low use / income calls only |
| Excessive SIM or Handset Swapping |
| Frequent Detach / Power-Down |
| Common Contacts |
| User Location |
| Pattern of Life |
| Social Network |
| Visits to airports |
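The leaked slides only name the predictors; they do not show how the raw call records were turned into model-ready features. The snippet below is just a toy sketch of what such a feature table might look like: all column names and values are hypothetical, not taken from the NSA documents.

```python
import pandas as pd

# Toy feature table loosely inspired by the predictor list above.
# Column names and values are hypothetical; the leaked slides do not
# describe the actual feature encoding.
features = pd.DataFrame({
    "visits_to_locations_of_interest": [0, 3, 1, 12],
    "overnight_trips_last_90_days":    [1, 0, 2, 7],
    "sim_swaps_last_90_days":          [0, 1, 0, 5],
    "handset_powerdowns_per_week":     [0.2, 0.5, 0.1, 4.0],
    "distinct_common_contacts":        [35, 12, 50, 4],
})
print(features.describe())
```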
Data Preparation
Number of Events: Data from just seven known terrorists.
Number of Non-Events: 100,000 users were selected at random.
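With only 7 events against 100,000 non-events, the event rate is vanishingly small. A quick calculation, assuming the 7 known terrorists were simply added to the 100,000 random users, makes the imbalance explicit:

```python
n_events = 7            # known terrorists (events)
n_non_events = 100_000  # randomly selected users (non-events)

event_rate = n_events / (n_events + n_non_events)
print(f"Event rate: {event_rate:.4%}")  # roughly 0.007%
```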
Algorithm
Random Forest was used as the machine learning algorithm. Not much detail is specified in the NSA presentation file, so it is not clear whether they used a stacking/blending ensemble learning approach.
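Since the presentation gives no training details, the snippet below is only a minimal scikit-learn sketch of fitting a random forest on such a labelled set. The data is randomly generated (and scaled down to 10,000 non-events to keep it quick), and the class_weight setting is my own assumption for handling the extreme imbalance, not something stated in the slides.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Fake, scaled-down data: 10,007 users x 80 predictors, 7 positive labels.
X = rng.normal(size=(10_007, 80))
y = np.zeros(10_007, dtype=int)
y[:7] = 1  # the 7 known terrorists (events)

# class_weight="balanced" compensates for the extreme class imbalance;
# this choice is an assumption, not something described in the slides.
model = RandomForestClassifier(n_estimators=100, class_weight="balanced",
                               random_state=0, n_jobs=-1)
model.fit(X, y)

# Score every user: predicted probability of the "terrorist" class.
scores = model.predict_proba(X)[:, 1]
print(scores[:10])
```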
Model Results
1. 50% False Negative Rate: 50% of actual terrorists were incorrectly predicted as "Non-Terrorists".
2. 0.18% False Positive Rate: 0.18% of innocent people were incorrectly predicted as "Terrorists".
A false positive rate of 0.18 percent across 55 million people would mean 99,000 innocents mislabelled as "terrorists". In marketing or credit risk models, a 0.18% false positive rate is considered an excellent score, but it is dangerous in the context of human lives. Even a 0.01% false positive rate on a population of 55 million implies 5,500 innocent people potentially being misclassified as "terrorists" and killed.
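To make the arithmetic above concrete, here is the back-of-the-envelope calculation:

```python
population = 55_000_000  # people covered by the mobile metadata

for fpr in (0.0018, 0.0001):  # 0.18% and 0.01% false positive rates
    flagged = population * fpr
    print(f"FPR of {fpr:.2%} -> about {flagged:,.0f} innocents flagged as 'terrorists'")
```

This reproduces the 99,000 and 5,500 figures quoted above.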
The highest rated target according to this machine learning model was Ahmad Zaidan, Al-Jazeera's long-time bureau chief in Islamabad.
Issue / Challenges related to this kind of model
- Event Rate: The main issue with the model is that very few events (7 terrorists) were used to train it. Machine learning algorithms require more events than classical statistical techniques.
- Unstructured Data: A huge amount of data is available, but it is unstructured.
- Collaboration between Countries: An official data-sharing security pact is required.
- Implementation: It is very dangerous to implement the model and kill someone by blindly following its results.
Several areas where we can leverage analytics to identify terrorist activities
- Identifying terrorist financing, which provides funds for terrorist activities
- Profiling people who are educated but involved in terrorist activities
- Correlating terrorist attacks with trends in geo-politics and money trails