REPORT DOCUMENTATION PAGE

Similar documents
DoD Common Access Card Information Brief. Smart Card Project Managers Group

REPORT DOCUMENTATION PAGE

REPORT DOCUMENTATION PAGE

Service Level Agreements: An Approach to Software Lifecycle Management. CDR Leonard Gaines Naval Supply Systems Command 29 January 2003

Empirically Based Analysis: The DDoS Case

Accuracy of Computed Water Surface Profiles

Multi-Modal Communication

ARINC653 AADL Annex Update

COMPUTATIONAL FLUID DYNAMICS (CFD) ANALYSIS AND DEVELOPMENT OF HALON- REPLACEMENT FIRE EXTINGUISHING SYSTEMS (PHASE II)

FUDSChem. Brian Jordan With the assistance of Deb Walker. Formerly Used Defense Site Chemistry Database. USACE-Albuquerque District.

Monte Carlo Techniques for Estimating Power in Aircraft T&E Tests. Todd Remund Dr. William Kitto EDWARDS AFB, CA. July 2011

WAITING ON MORE THAN 64 HANDLES

Running CyberCIEGE on Linux without Windows

High-Assurance Security/Safety on HPEC Systems: an Oxymoron?

Edwards Air Force Base Accelerates Flight Test Data Analysis Using MATLAB and Math Works. John Bourgeois EDWARDS AFB, CA. PRESENTED ON: 10 June 2010

73rd MORSS CD Cover Page UNCLASSIFIED DISCLOSURE FORM CD Presentation

REPORT DOCUMENTATION PAGE

Kathleen Fisher Program Manager, Information Innovation Office

Dana Sinno MIT Lincoln Laboratory 244 Wood Street Lexington, MA phone:

COUNTERING IMPROVISED EXPLOSIVE DEVICES

Architecting for Resiliency Army s Common Operating Environment (COE) SERC

Information, Decision, & Complex Networks AFOSR/RTC Overview

4. Lessons Learned in Introducing MBSE: 2009 to 2012

CENTER FOR ADVANCED ENERGY SYSTEM Rutgers University. Field Management for Industrial Assessment Centers Appointed By USDOE

73rd MORSS CD Cover Page UNCLASSIFIED DISCLOSURE FORM CD Presentation

Using Model-Theoretic Invariants for Semantic Integration. Michael Gruninger NIST / Institute for Systems Research University of Maryland

Topology Control from Bottom to Top

2013 US State of Cybercrime Survey

Concept of Operations Discussion Summary

Vision Protection Army Technology Objective (ATO) Overview for GVSET VIP Day. Sensors from Laser Weapons Date: 17 Jul 09 UNCLASSIFIED

SURVIVABILITY ENHANCED RUN-FLAT

HEC-FFA Flood Frequency Analysis

ASSESSMENT OF A BAYESIAN MODEL AND TEST VALIDATION METHOD

A Distributed Parallel Processing System for Command and Control Imagery

An Update on CORBA Performance for HPEC Algorithms. Bill Beckwith Objective Interface Systems, Inc.

Energy Security: A Global Challenge

Towards a Formal Pedigree Ontology for Level-One Sensor Fusion

Using Templates to Support Crisis Action Mission Planning

ASPECTS OF USE OF CFD FOR UAV CONFIGURATION DESIGN

Washington University

Setting the Standard for Real-Time Digital Signal Processing Pentek Seminar Series. Digital IF Standardization

ATCCIS Replication Mechanism (ARM)

Technological Advances In Emergency Management

Space and Missile Systems Center

Basing a Modeling Environment on a General Purpose Theorem Prover

Uniform Tests of File Converters Using Unit Cubes

Space and Missile Systems Center

Dr. Stuart Dickinson Dr. Donald H. Steinbrecher Naval Undersea Warfare Center, Newport, RI May 10, 2011

75th Air Base Wing. Effective Data Stewarding Measures in Support of EESOH-MIS

Data Warehousing. HCAT Program Review Park City, UT July Keith Legg,

Army Research Laboratory

MODELING AND SIMULATION OF LIQUID MOLDING PROCESSES. Pavel Simacek Center for Composite Materials University of Delaware

VICTORY VALIDATION AN INTRODUCTION AND TECHNICAL OVERVIEW

Exploring the Query Expansion Methods for Concept Based Representation

Shallow Ocean Bottom BRDF Prediction, Modeling, and Inversion via Simulation with Surface/Volume Data Derived from X-Ray Tomography

Speaker Verification Using SVM

NEW FINITE ELEMENT / MULTIBODY SYSTEM ALGORITHM FOR MODELING FLEXIBLE TRACKED VEHICLES

REPORT DOCUMENTATION PAGE Form Approved OMB NO

DoD M&S Project: Standardized Documentation for Verification, Validation, and Accreditation

Corrosion Prevention and Control Database. Bob Barbin 07 February 2011 ASETSDefense 2011

Guide to Windows 2000 Kerberos Settings

NATO-IST-124 Experimentation Instructions

Col Jaap de Die Chairman Steering Committee CEPA-11. NMSG, October

Smart Data Selection (SDS) Brief

Defense Hotline Allegations Concerning Contractor-Invoiced Travel for U.S. Army Corps of Engineers' Contracts W912DY-10-D-0014 and W912DY-10-D-0024

Dr. ing. Rune Aasgaard SINTEF Applied Mathematics PO Box 124 Blindern N-0314 Oslo NORWAY

Use of the Polarized Radiance Distribution Camera System in the RADYO Program

SINOVIA An open approach for heterogeneous ISR systems inter-operability

Shallow Ocean Bottom BRDF Prediction, Modeling, and Inversion via Simulation with Surface/Volume Data Derived from X-ray Tomography

U.S. Army Research, Development and Engineering Command (IDAS) Briefer: Jason Morse ARMED Team Leader Ground System Survivability, TARDEC

Ross Lazarus MB,BS MPH

C2-Simulation Interoperability in NATO

REPORT DOCUMENTATION PAGE

ENVIRONMENTAL MANAGEMENT SYSTEM WEB SITE (EMSWeb)

AFRL-ML-WP-TM

A Review of the 2007 Air Force Inaugural Sustainability Report

Distributed Real-Time Embedded Video Processing

Can Wing Tip Vortices Be Accurately Simulated? Ryan Termath, Science and Teacher Researcher (STAR) Program, CAL POLY SAN LUIS OBISPO

New Dimensions in Microarchitecture Harnessing 3D Integration Technologies

BUPT at TREC 2009: Entity Track

Android: Call C Functions with the Native Development Kit (NDK)

The C2 Workstation and Data Replication over Disadvantaged Tactical Communication Links

SCICEX Data Stewardship: FY2012 Report

DEVELOPMENT OF A NOVEL MICROWAVE RADAR SYSTEM USING ANGULAR CORRELATION FOR THE DETECTION OF BURIED OBJECTS IN SANDY SOILS

75 TH MORSS CD Cover Page. If you would like your presentation included in the 75 th MORSS Final Report CD it must :

Wireless Connectivity of Swarms in Presence of Obstacles

32nd Annual Precise Time and Time Interval (PTTI) Meeting. Ed Butterline Symmetricom San Jose, CA, USA. Abstract

Title: An Integrated Design Environment to Evaluate Power/Performance Tradeoffs for Sensor Network Applications 1

Web Site update. 21st HCAT Program Review Toronto, September 26, Keith Legg

Agile Modeling of Component Connections for Simulation and Design of Complex Vehicle Structures

Directed Energy Using High-Power Microwave Technology

Cyber Threat Prioritization

Using the SORASCS Prototype Web Portal

Center for Infrastructure Assurance and Security (CIAS) Joe Sanchez AIA Liaison to CIAS

Balancing Transport and Physical Layers in Wireless Ad Hoc Networks: Jointly Optimal Congestion Control and Power Control

Parallelization of a Electromagnetic Analysis Tool

Current Threat Environment

LARGE AREA, REAL TIME INSPECTION OF ROCKET MOTORS USING A NOVEL HANDHELD ULTRASOUND CAMERA

Advanced Numerical Methods for Numerical Weather Prediction

WaveNet: A Web-Based Metocean Data Access, Processing and Analysis Tool, Part 2 WIS Database

Transcription:

REPORT DOCUMENTATION PAGE Form Approved OMB NO. 0704-0188 The public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggesstions for reducing this burden, to Washington Headquarters Services, Directorate for Information Operations and Reports, 1215 Jefferson Davis Highway, Suite 1204, Arlington VA, 22202-4302. Respondents should be aware that notwithstanding any other provision of law, no person shall be subject to any oenalty for failing to comply with a collection of information if it does not display a currently valid OMB control number. PLEASE DO NOT RETURN YOUR FORM TO THE ABOVE ADDRESS. 1. REPORT DATE (DD-MM-YYYY) 28-08-2012 4. TITLE AND SUBTITLE Terrorist Activity Pattern Detection in Afghanistan:A Knowledge Discovery and Data Mining Approach for Counter-Terrorism 6. AUTHORS 2. REPORT TYPE Abstract Jose Pou, Dr. Jeff Duffany (Advisor), Dr. Alfredo Cruz, (Mentor) 5a. CONTRACT NUMBER W911NF-11-1-0174 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 206022 5d. PROJECT NUMBER 5e. TASK NUMBER 3. DATES COVERED (From - To) - 5f. WORK UNIT NUMBER 7. PERFORMING ORGANIZATION NAMES AND ADDRESSES Polytechnic University of Puerto Rico 377 Ponce De Leon Hato Rey San Juan, PR 00918-9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES) U.S. Army Research Office P.O. Box 12211 Research Triangle Park, NC 27709-2211 12. DISTRIBUTION AVAILIBILITY STATEMENT Approved for public release; distribution is unlimited. 13. SUPPLEMENTARY NOTES The views, opinions and/or findings contained in this report are those of the author(s) and should not contrued as an official Department of the Army position, policy or decision, unless so designated by other documentation. 14. ABSTRACT Data mining is primarily used by businesses. Today companies with strong consumer focus depend on data mining to determine relationships among internal and external data factors that are being registered and stored digitally. It enables them to drill down data into summary information to view detail transactional data and reveal trends that could be beneficial for an organization s business decision making. In this paper we incorporate data mining techniques using open source applications to model, evaluate and identify patterns of terrorism activity in 15. SUBJECT TERMS Data Mining, NCTC, WITS 8. PERFORMING ORGANIZATION REPORT NUMBER 10. SPONSOR/MONITOR'S ACRONYM(S) ARO 11. SPONSOR/MONITOR'S REPORT NUMBER(S) 58924-CS-REP.2 16. SECURITY CLASSIFICATION OF: a. REPORT b. ABSTRACT c. THIS PAGE UU UU UU 17. LIMITATION OF ABSTRACT UU 15. NUMBER OF PAGES 19a. NAME OF RESPONSIBLE PERSON Alfredo Cruz 19b. TELEPHONE NUMBER 787-622-8000 Standard Form 298 (Rev 8/98) Prescribed by ANSI Std. Z39.18

Report Title Terrorist Activity Pattern Detection in Afghanistan:A Knowledge Discovery and Data Mining Approach for Counter-Terrorism ABSTRACT Data mining is primarily used by businesses. Today companies with strong consumer focus depend on data mining to determine relationships among internal and external data factors that are being registered and stored digitally. It enables them to drill down data into summary information to view detail transactional data and reveal trends that could be beneficial for an organization s business decision making. In this paper we incorporate data mining techniques using open source applications to model, evaluate and identify patterns of terrorism activity in Afghanistan for counter-terrorism and to strengthen homeland security. We apply data mining techniques to real terrorism incidents data from the Worldwide Incidents Tracking System (WITS) of the National Counterterrorism Center (NCTC). The results of the study will help in the discovery of terrorist group tendencies based on specific incident factors, but will also help evaluate the war on terror in Afghanistan up to date. With these results we also look to uncover valuable information regarding terrorist hot spots to determine geographical mobilization of security forces resources in the region.

Terrorist Activity Evaluation and Pattern Detection (TAE&PD) in Afghanistan: A Knowledge Discovery and Data Mining (KDDM) Approach for Counter-Terrorism Data mining (DM) is primarily used by businesses to discover customer tendencies to guarantee future profit opportunities. In the TAE&PD project we intend to incorporate a KDDM methodology using open source applications to gather, preprocess, model, evaluate and identify patterns of terrorism activity that may prove useful to counter-terrorism and strengthen homeland security in Afghanistan. We will experiment using real terrorism incidents data from the Worldwide Incidents Tracking System (WITS) of the National Counterterrorism Center (NCTC). The project seeks to discover terrorism trends based on specific incident factors, help in the evaluation of war in Afghanistan and demonstrate a KDDM approach that could be applied (proof of concept) to national security. Project results may uncover valuable information regarding terrorist hot spots to determine geographical mobilization of security forces resources in the region. Figure 1: TAE&PD component s and data flow interaction diagram. The following is a brief status report regarding the TAE&PD project current initiatives before implementation. It is crucial beforehand to prepare data accordingly as some attributes may not be suited for DM algorithms being considered. In this project we look to demonstrate the use of tools and techniques that are applied in KDDM related fields: Databases Statistics Machine Learning Visualization

The following table presents the current scope of project pertaining to KDDM field tools and techniques evaluation status that are still being researched before development of testing environment: KDDM Field Tool(s) Technique(s) Status Databases Server 2008 R2 Data Preprocessing Cleaning Integration Transformation Reduction On going New incident attributes (Integration) are being considered from other sources that may add more value to study. Dataset contains incidents from 2004 to 2011. WITS site is offline for some time, no 2012 data can be collected for the moment. Machine Learning Server 2008 R2 Data Mining Add-ins. Clustering Analysis - Afghanistan incident type volume evaluation by country regions. In progress - Acquired book Data Mining with Server 2008 Statistics Server 2008 R2 Data Mining Add-ins. Time Series Analysis - Prediction of 2012 future incidents regarding deaths, injuries and kidnappings - Explore algorithms provided by tool. Not started - Acquired book Data Mining with Server 2008 Machine Learning R language/ Rattle Association Rules Analysis - Create rules associating incident month, week and province to type of attack. - Explore algorithms provided by tool. In progress - Acquired book Data Mining with Rattle and R ()

Visualization R language/ Rattle Graph representation of trends of incident data via the R language. - Security Forces (Police and Military) incident victim status by province or other factors. Not started - R packages ~ RODBC ~ ggplot2 ~ arules ~ RStat Table 1: TAE&PD project s current scope based on KDDM associated fields. S The table above represents an initial blue print of the project in order to establish a defined scope. After evaluating software used for data analysis the likes of R, Weka and Rapidminer it was concluded that the R language is the best fit for this project. R has the capability to be customized to the needs of any user and as far as DM use, it can be used by people with or without a programming background. R also has the advantage of a vast community of resources in comparison to other DM suites. In the months of mid May, June and beginning of July of 2012 an effort has being made to acquire material to dive in R and its KDDM capabilities. R topics, to name a few, that have being studied include: Basic Operations Function(s) Introduction to Data Structures - Arrays - Lists - Data Frames Basic Charts and Graphs R Environment Creation Rattle and Data Mining Evaluation of Sampling Strategy - Training Dataset - Validation Dataset - Testing Dataset In the coming month an effort will be made to start experimenting with Cluster Analysis method using SQL Server and Association Rules Analysis. Also experiment with the RODBC

package to be able to interact directly with the TID database in SQL Server from R in order to present the use of the ggplot2 package for visualization.