A Multilevel Secure MapReduce Framework for Cross-Domain Information Sharing in the Cloud

Similar documents
A Multilevel Secure MapReduce Framework for Cross-Domain Information Sharing in the Cloud

Service Level Agreements: An Approach to Software Lifecycle Management. CDR Leonard Gaines Naval Supply Systems Command 29 January 2003

Multi-Modal Communication

Running CyberCIEGE on Linux without Windows

DoD Common Access Card Information Brief. Smart Card Project Managers Group

FUDSChem. Brian Jordan With the assistance of Deb Walker. Formerly Used Defense Site Chemistry Database. USACE-Albuquerque District.

Information, Decision, & Complex Networks AFOSR/RTC Overview

Using Model-Theoretic Invariants for Semantic Integration. Michael Gruninger NIST / Institute for Systems Research University of Maryland

ENVIRONMENTAL MANAGEMENT SYSTEM WEB SITE (EMSWeb)

2013 US State of Cybercrime Survey

Wireless Connectivity of Swarms in Presence of Obstacles

Empirically Based Analysis: The DDoS Case

Space and Missile Systems Center

4. Lessons Learned in Introducing MBSE: 2009 to 2012

Dana Sinno MIT Lincoln Laboratory 244 Wood Street Lexington, MA phone:

Kathleen Fisher Program Manager, Information Innovation Office

Architecting for Resiliency Army s Common Operating Environment (COE) SERC

ARINC653 AADL Annex Update

Concept of Operations Discussion Summary

High-Assurance Security/Safety on HPEC Systems: an Oxymoron?

Towards a Formal Pedigree Ontology for Level-One Sensor Fusion

DoD M&S Project: Standardized Documentation for Verification, Validation, and Accreditation

An Update on CORBA Performance for HPEC Algorithms. Bill Beckwith Objective Interface Systems, Inc.

COMPUTATIONAL FLUID DYNAMICS (CFD) ANALYSIS AND DEVELOPMENT OF HALON- REPLACEMENT FIRE EXTINGUISHING SYSTEMS (PHASE II)

73rd MORSS CD Cover Page UNCLASSIFIED DISCLOSURE FORM CD Presentation

A Review of the 2007 Air Force Inaugural Sustainability Report

Topology Control from Bottom to Top

Accuracy of Computed Water Surface Profiles

A Distributed Parallel Processing System for Command and Control Imagery

Vision Protection Army Technology Objective (ATO) Overview for GVSET VIP Day. Sensors from Laser Weapons Date: 17 Jul 09 UNCLASSIFIED

Space and Missile Systems Center

NAVAL POSTGRADUATE SCHOOL

NAVAL POSTGRADUATE SCHOOL

Advanced Numerical Methods for Numerical Weather Prediction

75 TH MORSS CD Cover Page. If you would like your presentation included in the 75 th MORSS Final Report CD it must :

C2-Simulation Interoperability in NATO

75th Air Base Wing. Effective Data Stewarding Measures in Support of EESOH-MIS

Data Warehousing. HCAT Program Review Park City, UT July Keith Legg,

Technological Advances In Emergency Management

Web Site update. 21st HCAT Program Review Toronto, September 26, Keith Legg

Title: An Integrated Design Environment to Evaluate Power/Performance Tradeoffs for Sensor Network Applications 1

Introducing I 3 CON. The Information Interpretation and Integration Conference

ATCCIS Replication Mechanism (ARM)

Distributed Real-Time Embedded Video Processing

MODELING AND SIMULATION OF LIQUID MOLDING PROCESSES. Pavel Simacek Center for Composite Materials University of Delaware

The C2 Workstation and Data Replication over Disadvantaged Tactical Communication Links

Washington University

Headquarters U.S. Air Force. EMS Play-by-Play: Using Air Force Playbooks to Standardize EMS

Advanced Numerical Methods for Numerical Weather Prediction

Balancing Transport and Physical Layers in Wireless Ad Hoc Networks: Jointly Optimal Congestion Control and Power Control

73rd MORSS CD Cover Page UNCLASSIFIED DISCLOSURE FORM CD Presentation

ASSESSMENT OF A BAYESIAN MODEL AND TEST VALIDATION METHOD

Corrosion Prevention and Control Database. Bob Barbin 07 February 2011 ASETSDefense 2011

QuanTM Architecture for Web Services

Use of the Polarized Radiance Distribution Camera System in the RADYO Program

COTS Multicore Processors in Avionics Systems: Challenges and Solutions

Setting the Standard for Real-Time Digital Signal Processing Pentek Seminar Series. Digital IF Standardization

SURVIVABILITY ENHANCED RUN-FLAT

Lessons Learned in Adapting a Software System to a Micro Computer

Defense Hotline Allegations Concerning Contractor-Invoiced Travel for U.S. Army Corps of Engineers' Contracts W912DY-10-D-0014 and W912DY-10-D-0024

Using Templates to Support Crisis Action Mission Planning

CASE STUDY: Using Field Programmable Gate Arrays in a Beowulf Cluster

Components and Considerations in Building an Insider Threat Program

LARGE AREA, REAL TIME INSPECTION OF ROCKET MOTORS USING A NOVEL HANDHELD ULTRASOUND CAMERA

Data Reorganization Interface

Army Research Laboratory

Shallow Ocean Bottom BRDF Prediction, Modeling, and Inversion via Simulation with Surface/Volume Data Derived from X-Ray Tomography

2011 NNI Environment, Health, and Safety Research Strategy

Dr. Stuart Dickinson Dr. Donald H. Steinbrecher Naval Undersea Warfare Center, Newport, RI May 10, 2011

CENTER FOR ADVANCED ENERGY SYSTEM Rutgers University. Field Management for Industrial Assessment Centers Appointed By USDOE

Requirements for Scalable Application Specific Processing in Commercial HPEC

Cyber Threat Prioritization

PEO C4I Remarks for NPS Acquisition Research Symposium

M&S Strategic Initiatives to Support Test & Evaluation

ASPECTS OF USE OF CFD FOR UAV CONFIGURATION DESIGN

An Efficient Architecture for Ultra Long FFTs in FPGAs and ASICs

U.S. Army Research, Development and Engineering Command (IDAS) Briefer: Jason Morse ARMED Team Leader Ground System Survivability, TARDEC

Exploring the Query Expansion Methods for Concept Based Representation

New Dimensions in Microarchitecture Harnessing 3D Integration Technologies

US Army Industry Day Conference Boeing SBIR/STTR Program Overview

Enhanced Predictability Through Lagrangian Observations and Analysis

Basing a Modeling Environment on a General Purpose Theorem Prover

AFRL-ML-WP-TM

Coalition Interoperability Ontology:

Computer Aided Munitions Storage Planning

Compliant Baffle for Large Telescope Daylight Imaging. Stacie Williams Air Force Research Laboratory ABSTRACT

BUPT at TREC 2009: Entity Track

Increased Underwater Optical Imaging Performance Via Multiple Autonomous Underwater Vehicles

Current Threat Environment

Edwards Air Force Base Accelerates Flight Test Data Analysis Using MATLAB and Math Works. John Bourgeois EDWARDS AFB, CA. PRESENTED ON: 10 June 2010

Dynamic Information Management and Exchange for Command and Control Applications

Folded Shell Projectors and Virtual Optimization

SETTING UP AN NTP SERVER AT THE ROYAL OBSERVATORY OF BELGIUM

Super Energy Efficient Containerized Living Unit (SuperCLU) Technology Design and Development

Col Jaap de Die Chairman Steering Committee CEPA-11. NMSG, October

Using the SORASCS Prototype Web Portal

Speaker Verification Using SVM

Automation Middleware and Algorithms for Robotic Underwater Sensor Networks

DEVELOPMENT OF A NOVEL MICROWAVE RADAR SYSTEM USING ANGULAR CORRELATION FOR THE DETECTION OF BURIED OBJECTS IN SANDY SOILS

Monte Carlo Techniques for Estimating Power in Aircraft T&E Tests. Todd Remund Dr. William Kitto EDWARDS AFB, CA. July 2011

Transcription:

A Multilevel Secure MapReduce Framework for Cross-Domain Information Sharing in the Cloud Thuy D. Nguyen, Cynthia E. Irvine, Jean Khosalim Department of Computer Science Ground System Architectures Workshop March 18-21, 2013 Published by The Aerospace Corporation with permission. 1

Report Documentation Page Form Approved OMB No. 0704-0188 Public reporting burden for the collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden, to Washington Headquarters Services, Directorate for Information Operations and Reports, 1215 Jefferson Davis Highway, Suite 1204, Arlington VA 22202-4302. Respondents should be aware that notwithstanding any other provision of law, no person shall be subject to a penalty for failing to comply with a collection of information if it does not display a currently valid OMB control number. 1. REPORT DATE MAR 2013 2. REPORT TYPE 3. DATES COVERED 00-00-2013 to 00-00-2013 4. TITLE AND SUBTITLE A Multilevel Secure MapReduce Framework for Cross-Domain Information Sharing in the Cloud 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) 5d. PROJECT NUMBER 5e. TASK NUMBER 5f. WORK UNIT NUMBER 7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) Naval Postgraduate School,Department of Computer Science,Center for Information Systems Security Studies and Research,Monterey,CA,93943 8. PERFORMING ORGANIZATION REPORT NUMBER 9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES) 10. SPONSOR/MONITOR S ACRONYM(S) 12. DISTRIBUTION/AVAILABILITY STATEMENT Approved for public release; distribution unlimited 11. SPONSOR/MONITOR S REPORT NUMBER(S) 13. SUPPLEMENTARY NOTES Ground System Architectures Workshop (), Los Angeles, California, 18-21 March 2013. 14. ABSTRACT 15. SUBJECT TERMS 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF ABSTRACT a. REPORT unclassified b. ABSTRACT unclassified c. THIS PAGE unclassified Same as Report (SAR) 18. NUMBER OF PAGES 15 19a. NAME OF RESPONSIBLE PERSON Standard Form 298 (Rev. 8-98) Prescribed by ANSI Std Z39-18

Introduction Motivation - Develop a cross-domain MapReduce framework for a multilevel secure (MLS) cloud, allowing users to analyze data at different security classifications Topics - Apache Hadoop framework - MLS-aware Hadoop Distributed File System Concept of operations Requirements, design, implementation - Future work and conclusion 2

Open source software framework for reliable, scalable, distributed computing Inspired by Google s MapReduce computational paradigm and Google File System (GFS) Two main subprojects: - Hadoop Distributed File System, Hadoop MapReduce Support distributed computing on massive data sets on clusters of commodity computers Common usage patterns - ETL (Extract Transform Load) replacement - Data analytics, machine learning Apache Hadoop - Parallel processing platforms (Map without Reduce) 3

Hadoop Architecture Data Node c HDFS block access 2 HDFS metadata access 1 Client 3 Intra-HDFS operations Name Node HDFS access a MapReduce Job Submission Data Node d Intra-HDFS operations Job Tracker Secondary NameNode b 1 y HDFS access Client-to-Server communications Internal server-to-server communications Task Tracker Intra-MR Engine operations Task Tracker 4

MLS-Aware: A Definition A component is considered MLS-aware if it executes without privileges in an MLS environment, and yet takes advantage of that environment to provide useful functionality. Examples: - Reading from resources labeled at the same or lower security levels - Making access decisions based on the security level of the data - Returning the security level of the data 5

Objective Objective and Approach - Extend Hadoop to provide a cross-domain read-down capability without requiring the Hadoop server components to be trustworthy Approach - Modify Hadoop to run on a trusted platform that enforces an MLS policy on local file system Use Security Enhanced Linux (SELinux) for initial prototype - Modify HDFS to be MLS-aware Multiple single-level HDFS instances each is cognizant of HDFS namespaces at lower security levels HDFS servers running at a security level can access file objects at lower levels as permitted by underlying trusted computing base (TCB) - No trusted processes outside TCB boundary 6

User session level - Implicitly established by security level of receiving network interface and TCP/IP ports File access policy rules - A user can read and write file objects at user s session level - A user can read file objects if the user s session level dominates the level of the requested object File system abstraction - HDFS interface is similar to UNIX file system - Traditional Hadoop cluster: one file system HDFS Concept of Operation - MLS-enhanced cluster: multiple file systems, one per security level 7

Root directory at a particular level is expressed as /<user-defined security-level-indicator> HDFS File Organization Security-level-indicator is administratively assigned to an SELinux sensitivity level Traditional root directory (/) is root at the user s session level 8

MLS-aware Hadoop Design Multiple single-level HDFS server instances co-locate on same physical node All NameNode instances run on same physical node DataNode instances are distributed across multiple physical nodes - Authoritative DataNode instance: owner of local files used to store HDFS blocks - Surrogate DataNode instance: handles read-down requests on behalf of an authoritative DataNode instance running at a lower level Configuration file defines allocation of authoritative and surrogate DataNode instances on different physical nodes Design does not impact MapReduce subsystem - JobTracker and TaskTracker only interact with NameNode and DataNode as HDFS clients 9

N AVAL POST G RADUATE SCH OOL ~ ~~~~~~~~~~~~---------------------- CENTER FOR INFORMATION SYSTEMS SECURITY STUDIES AN D RESEARCH 0 Process D NN: NanteNode SNN: Secondary NN DN: DataNode JT: JobTracker TT: TaskTracker Client-U Client-S Client-TS Physical node MLS-enhanced Hadoop Cluster NN Node SNN Node JT Node s @ DN ITT Node 1 EVEV C V~ ~C 3) DN ITT Node 2 DN ITT Node n 10

Client running at user s session level - Contact NameNode at same level to request a file at a lower level NameNode instance at session level - Obtain metadata of requested file and storage locations of associated blocks from NameNode instance running at lower security level - Direct client to contact surrogate DataNode instances that colocate with the file s primary DataNode instances Surrogate DataNode instance at session level - Look up locations of local files used to store requested blocks - Read local files and return requested blocks Cross-domain Read-down Security level of local files is lower than session level 11

NAVAL POST GRADUATE SCH OOL ~ ~~~~~~~~~~~~--------------------- CENTER FOR INFORMATION SYSTEMS SECURITY STUDIES AND RESEARCH Read-down Example Bn: Block n ~ I Client-TS I J,..., File W@ TS: {(Ba,DNl-TS), (Bb,DN2-TS)} File X@ S: {(Bc,DNl-TS), (Bd,DN2-TS)} File Z@ U: {(Bf,DN3-TS)} File X@ S: {(Bc,DN1-S}, (Bd,DN2-S)} NN Node < V 1-~..., <EV I"' File Y @ U: {(Be,DN2-U)} I Client-U I, File Z @ U: {(Bf,DN3-U)},. < ).1' I ~I Client-S DN ITT Node 1 DN ITT Node 2 DN ITT Node 3... ~ C V~lva e l I+- - C V~r za!l l ~7-(Xsc )...... ~4Xsd ) ~,/...,,',;- It!, ~~" ~~(Waa) ~~[wab) s, - - I( TS operations ( ) S operations < )ao U operations ----~ Read-do\<\rn operations 12

Source Lines of Code (SLOC) Metric Use open source Count Lines of Code (CLOC) tool - Can calculate differences in blank, comment, and source lines Summary of code modification - Delta value is the sum of addition, removal, and modification of source lines - Overall change is less than 5% 13

Future work - Adding read-down support to HDFS Federation - Implementing an external Cache Manager - Investigating Hadoop s use of Kerberos for establishing user sessions at different security levels - Performing benchmark testing with larger datasets Prototype is the first step towards developing a highly secure MapReduce platform Does not introduce any trusted processes outside the pre-existing TCB boundary Only affects HDFS servers Future Work and Conclusions 14

Contact Cynthia E. Irvine, PhD, irvine@nps.edu Thuy D. Nguyen, tdnguyen@nps.edu Center for Information Systems Security Studies and Research Department of Computer Science Naval Postgraduate School, Monterey, CA 93943 U.S.A http://www.cisr.us 15