TerraSwarm
A Machine Learning and Optimization Toolkit for the Swarm
Ilge Akkaya, Shuhei Emoto, Edward A. Lee
University of California, Berkeley
TerraSwarm Tools Telecon, 17 November 2014
Sponsored by the TerraSwarm Research Center, one of six centers administered by the STARnet phase of the Focus Center Research Program (FCRP), a Semiconductor Research Corporation program sponsored by MARCO and DARPA.
Report Documentation Page (Standard Form 298, Rev. 8-98)
- Report date: 17 NOV 2014
- Dates covered: 00-00-2014 to 00-00-2014
- Title and subtitle: A Machine Learning and Optimization Toolkit for the Swarm
- Performing organization: University of California, Berkeley, Department of Electrical Engineering and Computer Sciences, Berkeley, CA 94720
- Distribution/availability statement: Approved for public release; distribution unlimited
- Security classification (report, abstract, this page): unclassified
- Limitation of abstract: Same as Report (SAR)
- Number of pages: 40
Overview
1. Motivation
2. Overview of Current ML Toolkit Capabilities
3. Case Study: Cooperative Robot Localization and Control
   - State Estimation: Particle Filtering
   - Path Planning: Information-Based Methods for Robot Trajectory Optimization
   - Actor-Oriented Design for State Space Dynamics and Measurements
4. Future Directions & Conclusions
Motivation
- ML technology is available in programming languages: MATLAB, Python, Octave, Julia, R, ...
- And in the form of toolkits: GMTK, StreamLab, SHOGUN, Weka, ...
- State-of-the-art tools traditionally interact with data and present no native way of incorporating system aspects
- Goal: make the ML aspects a native part of the system design by
  - Exploiting component-level interactions in the swarm
  - Restoring the system-level roots of machine learning methodologies by providing the right interfaces between machine learning tools and CPS design aspects
Motivation
- We present an actor-oriented machine learning toolkit that focuses on
  - Applications of ML algorithms to streaming data
  - Enabling ML techniques to be natively integrated into system design
  - Context-aware parameterization of a rich set of ML algorithms
  - A library of easy-to-use tools for developers who are not ML experts
  - Enhancing programmability of swarmlets
Inference for Streaming Data
- Goal: Inference on data that is evolving in time
[Figure: state-space graphical model with hidden states x_1, ..., x_{t-1}, x_t, x_{t+1}, ..., x_T, two observation sequences z^1_t and z^2_t, and the functions f(x_t, u_t, t) and g(x_t, u_t, t)]
The Machine Learning Toolkit in Ptolemy II
- Machine Learning:
  1. Hidden Markov Models (HMM)
  2. Gaussian Mixture Models (GMM)
- Parameter Estimation
- Classification
[Figure: state-space graphical model with hidden states x_1, ..., x_{t-1}, x_t, x_{t+1}, ..., x_T, two observation sequences z^1_t and z^2_t, and the functions f(x_t, u_t, t) and g(x_t, u_t, t)]
The Machine Learning Toolkit in Ptolemy II
- State Estimation: Particle Filtering
The Machine Learning Toolkit in Ptolemy II
- Optimization: CompositeOptimizer, an actor-oriented gradient-descent solver
Application: Swarmlets for Cooperative Robot Control
- Problem Definition: A team of robots tracking/pursuing a target
- Model: State space model of target dynamics
  - Observations: Robot sensor measurements (generally nonlinear functions of target position + noise)
- Tasks:
  - Target State Estimation
  - Robot Path Planning: Multiple Objectives
    - Collision/Obstacle Avoidance, Pursuit, SLAM, Fast Localization, Minimal Uncertainty, ...
Cooperative Robot Control: Challenges
- Cooperation between robots
- Complex measurement/noise models
  - Range measurements (e.g., RSSI)
  - Bearing measurements (e.g., cameras)
- Nonlinear robot dynamics
- Unknown environment
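Cooperative Robot Localization: State Space Models
- Target state (position): $\theta_t = \begin{bmatrix} x_t \\ y_t \end{bmatrix}$, with $x_0 \sim \mathrm{Uniform}([-100, 100])$ and $y_0 \sim \mathrm{Uniform}([-100, 100])$
- Measurement model (range measurements): $z_t = \begin{bmatrix} z^1_t \\ z^2_t \end{bmatrix}$, $z^i_t = \lVert r^i_t - \theta_t \rVert + \omega_t$, $i = 1, 2$, with $\omega_t \sim \mathcal{N}(0, \sigma^2)$, $\sigma^2 = 5.0$
- Target state dynamics: $\theta_{t+1} = \theta_t + \nu_t$, with $\nu_t \sim \mathcal{N}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 5.0 & 0.0 \\ 0.0 & 5.0 \end{bmatrix} \right)$
Here $\theta_t$ denotes the target state, $r^i_t$ the position of robot $i$, $\omega_t$ the measurement noise, and $\nu_t$ the process noise.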
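A minimal simulation sketch of this model, using only Python's standard library. The function names (`step_target`, `range_measurement`, `simulate`) and the robot positions in the usage line are hypothetical and not part of the toolkit. It assumes the initial target position is uniform on [-100, 100] per axis, the target follows a random walk with per-axis variance 5.0, and each robot's range reading carries Gaussian noise of variance 5.0.

```python
import math
import random

SIGMA2 = 5.0   # measurement noise variance (from the slide)
Q_VAR = 5.0    # per-axis process noise variance (from the slide)

def step_target(target, rng):
    """Random-walk dynamics: next position = current position + Gaussian noise."""
    std = math.sqrt(Q_VAR)
    return (target[0] + rng.gauss(0.0, std), target[1] + rng.gauss(0.0, std))

def range_measurement(robot, target, rng):
    """Noisy range: Euclidean distance robot-to-target plus Gaussian noise."""
    dist = math.hypot(robot[0] - target[0], robot[1] - target[1])
    return dist + rng.gauss(0.0, math.sqrt(SIGMA2))

def simulate(robots, steps, seed=0):
    """Return a list of (target_position, [range per robot]) pairs."""
    rng = random.Random(seed)
    target = (rng.uniform(-100.0, 100.0), rng.uniform(-100.0, 100.0))
    history = []
    for _ in range(steps):
        z = [range_measurement(r, target, rng) for r in robots]
        history.append((target, z))
        target = step_target(target, rng)
    return history

# Hypothetical usage: two stationary robots observing the target for 5 steps.
history = simulate([(0.0, 0.0), (10.0, 0.0)], steps=5, seed=1)
```

The robots are kept stationary here only to keep the sketch short; in the case study their positions change as the planned trajectories are executed.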
Algorithm Workflow
1. Robots make independent range measurements
2. A centralized (or local) cooperative state estimation algorithm estimates the target position given the measurements
3. Robot trajectories are optimized w.r.t. some objective function based on the estimated target position
4. Robots move according to the planned path
Target State Estimation
[Figure: state-space graphical model with hidden states x_1, ..., x_{t-1}, x_t, x_{t+1}, ..., x_T, two observation sequences z^1_t and z^2_t, and the functions f(x_t, u_t, t) and g(x_t, u_t, t)]
- Given $z_t$, $t = 1, \dots, T$: noisy measurements of a target state $x_t$
- Estimate $p(x_T \mid z_{1:T})$: the posterior density of the target state
- Particle filtering is a popular Bayesian filtering technique for this problem: it provides a density estimate of $x_T$ as a particle set
The Particle Filter
- Introducing the particle filter: Sequential Monte Carlo methods as a general family
- A Bayesian filter that performs maximum-likelihood state estimation for state-space models with nonlinear dynamics and non-Gaussian noise, in the general case
- A stochastic (and often better-performing) alternative to the Kalman filter (which is optimal only for the linear Gaussian case)
Particle Filter: Operation
- Establish a prior belief of the state, represented as a set of particles
- Each particle is a candidate state, which in this application is the intruder position
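One way to establish such a prior is sketched below, assuming the prior is uniform over [-100, 100] on each axis as in the state space model. The function name `init_particles` and the particle count are hypothetical choices for illustration.

```python
import random

def init_particles(n, low=-100.0, high=100.0, rng=None):
    """Draw n candidate (x, y) intruder positions uniformly from the prior box."""
    if rng is None:
        rng = random.Random()
    return [(rng.uniform(low, high), rng.uniform(low, high)) for _ in range(n)]

# Hypothetical usage: 1000 particles with a fixed seed for repeatability.
particles = init_particles(1000, rng=random.Random(42))
```

A larger particle count gives a finer approximation of the belief at a higher computational cost per filter step.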
Particle Filter: Operation
- Make a measurement
Particle Filter: Assigning Weights
- Assign a weight to each particle according to how well it explains the measurement (subject to a measurement model and noise specification)
Particle Filter: Assigning Weights
- The particle weights (under Gaussian noise) would look like the following:
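The weighting step can be sketched as follows, assuming range-only measurements with independent zero-mean Gaussian noise of variance 5.0 per robot. The function name `gaussian_weights` is hypothetical, and the log-domain normalization is an implementation choice of this sketch to avoid numerical underflow, not something the slides specify.

```python
import math

def gaussian_weights(particles, robots, z, sigma2=5.0):
    """Weight each particle by the Gaussian likelihood of all range readings.

    particles: list of (x, y) candidate target positions
    robots:    list of (x, y) robot positions
    z:         list of measured ranges, one per robot
    """
    log_w = []
    for (px, py) in particles:
        ll = 0.0
        for (rx, ry), zi in zip(robots, z):
            predicted = math.hypot(rx - px, ry - py)
            ll += -0.5 * (zi - predicted) ** 2 / sigma2
        log_w.append(ll)
    # Subtract the maximum log-weight before exponentiating, then normalize.
    m = max(log_w)
    w = [math.exp(l - m) for l in log_w]
    total = sum(w)
    return [wi / total for wi in w]

# Hypothetical usage: the first particle matches both readings exactly.
weights = gaussian_weights([(0.0, 0.0), (50.0, 50.0)],
                           [(10.0, 0.0), (0.0, 10.0)],
                           [10.0, 10.0])
```

Particles whose predicted ranges agree with the readings receive most of the normalized weight.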
Particle Filter: Resampling
- The resulting set of particles would look like:
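The slides do not state which resampling scheme is used. As an illustration, systematic resampling is one common choice; the sketch below assumes normalized weights, and the function name `systematic_resample` is hypothetical.

```python
import random

def systematic_resample(particles, weights, rng=None):
    """Draw len(particles) samples with probability proportional to weight.

    Uses one random offset and n evenly spaced pointers over the
    cumulative weight distribution.
    """
    if rng is None:
        rng = random.Random()
    n = len(particles)
    start = rng.random() / n
    resampled = []
    cumulative = weights[0]
    i = 0
    for k in range(n):
        u = start + k / n
        # Advance to the particle whose cumulative weight interval contains u.
        while u >= cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        resampled.append(particles[i])
    return resampled

# Hypothetical usage: all weight on the third particle.
survivors = systematic_resample(["a", "b", "c", "d"], [0.0, 0.0, 1.0, 0.0],
                                random.Random(0))
```

After resampling, high-weight particles are duplicated and low-weight particles tend to disappear, so the set concentrates where the measurement is best explained.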
Particle Filter: Propagation
- Propagate the resulting particles according to the dynamics model
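A minimal sketch of the propagation step, assuming the random-walk target dynamics of the case study with independent Gaussian noise of variance 5.0 on each axis. The function name `propagate` is hypothetical.

```python
import math
import random

def propagate(particles, q_var=5.0, rng=None):
    """Move every particle one step under random-walk dynamics."""
    if rng is None:
        rng = random.Random()
    std = math.sqrt(q_var)
    return [(x + rng.gauss(0.0, std), y + rng.gauss(0.0, std))
            for (x, y) in particles]

# Hypothetical usage: spread two particles by one time step.
moved = propagate([(0.0, 0.0), (10.0, -5.0)], rng=random.Random(1))
```

For other dynamics models, only the body of the list comprehension would change; the weighting and resampling steps are unaffected.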
Particle Filtering with Range Sensors
[Figure: "Collaborative Particle Output" plot in the x-y plane showing the particles, target, robot1, robot2, and state estimate, with range measurements z1 and z2 from Robot 1 and Robot 2; two further panels plot particle weights (scale x10^-3) against x]
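Two-Observer Particle Filter
[Figure: two-observer particle filter with labels "Measurement Input" and "State Space Model"]
- Target state: $\theta_t = \begin{bmatrix} x_t \\ y_t \end{bmatrix}$, with $x_0 \sim \mathrm{Uniform}([-100, 100])$ and $y_0 \sim \mathrm{Uniform}([-100, 100])$
- Measurements: $z_t = \begin{bmatrix} z^1_t \\ z^2_t \end{bmatrix}$, $z^i_t = \lVert r^i_t - \theta_t \rVert + \omega_t$, $i = 1, 2$, with $\omega_t \sim \mathcal{N}(0, \sigma^2)$, $\sigma^2 = 5.0$
- Dynamics: $\theta_{t+1} = \theta_t + \nu_t$, with $\nu_t \sim \mathcal{N}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 5.0 & 0.0 \\ 0.0 & 5.0 \end{bmatrix} \right)$
Here $\theta_t$ denotes the target state, $r^i_t$ the position of robot $i$, $\omega_t$ the measurement noise, and $\nu_t$ the process noise.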
Path Planning
- One candidate metric for online trajectory optimization comes from information-based methods: Mutual Information
  - A particle set is a good probabilistic measure of the uncertainty in a state variable
  - The size of the particle set can be used to tune approximation bounds
- Optimization goal: Maximize the mutual information between the measurements and the particle set
  - Locate the intruder as precisely as possible, in the fewest steps
- Can equivalently be formulated as: Minimize the uncertainty in the estimated intruder location
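The equivalence between the two formulations can be seen from the standard decomposition of mutual information, written here with $x_t$ for the target state and $z_t$ for the measurement (the notation is chosen for this note and is not taken from the slides):

```latex
\begin{aligned}
I(x_t; z_t) &= H(x_t) - H(x_t \mid z_t) \\
            &= H(z_t) - H(z_t \mid x_t)
\end{aligned}
```

For a given particle set, the prior entropy $H(x_t)$ does not depend on where the robots take their next measurement, so maximizing $I(x_t; z_t)$ over candidate robot positions is the same as minimizing the expected posterior entropy $H(x_t \mid z_t)$, that is, the remaining uncertainty in the intruder location.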
An Actor-Oriented Optimizer
- Consider the general constrained optimization problem of the form:
- Currently supports: COBYLA (Constrained Optimization BY Linear Approximation), a derivative-free constrained optimization solver
Cost Functions for Path Planning: Mutual Information
- Optimization goal: Maximize the mutual information between future measurements and the predicted particle set
  - Locate the intruder as precisely as possible, in the fewest steps
- This can equivalently be formulated as: Minimizing the uncertainty in the estimated intruder location
- One-step optimal trajectories:
Cooperative Target Localization: Models
Demo: MI Maximization
[Figure: trajectory plot with legend entries intruder, particles, robot1, robot2, robot3, robot4; second panel titled "Mean-Square Localization Error"]
Demo: Direct Pursuit
[Figure: plot titled "Trajectories and the Particle Filter Density Estimate" with legend entries intruder, particles, robot1, robot2, robot3, robot4; second panel titled "Mean-Square Localization Error"]
Demo: Hybrid Approach - 1 Follower
[Figure: trajectory plot with legend entries intruder, particles, robot1, robot2, robot3, robot4; second panel titled "Mean-Square Localization Error", plotted against time]
Bridging Actor-Oriented Modeling and ML Algorithms
- Goal: ML algorithms that are aware of the system models
- Methodology: Implement measurement models and system dynamics as decorator actors in the system model
  - Easy-to-share, consistent models of the underlying system
  - Scalable and unambiguous ML algorithm design for non-experts
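The idea of sharing one model definition across several algorithms can be illustrated outside Ptolemy II. The sketch below is plain Python and does not use or imitate the Ptolemy II actor API; `SharedModel`, `predict`, and `expected_ranges` are hypothetical names, and the dynamics shown is the noise-free mean of a random walk, chosen only to keep the example deterministic.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, float]

@dataclass(frozen=True)
class SharedModel:
    """One definition of dynamics and measurement, reused by every consumer."""
    dynamics: Callable[[State], State]
    measurement: Callable[[State, State], float]  # (robot, target) -> range

def predict(model: SharedModel, state: State, steps: int) -> State:
    """N-step prediction that reads the dynamics from the shared model."""
    for _ in range(steps):
        state = model.dynamics(state)
    return state

def expected_ranges(model: SharedModel, robots: Sequence[State],
                    state: State) -> List[float]:
    """Predicted sensor readings that read the measurement from the same model."""
    return [model.measurement(r, state) for r in robots]

model = SharedModel(
    dynamics=lambda s: s,  # mean of a random walk: the state stays put
    measurement=lambda r, s: math.hypot(r[0] - s[0], r[1] - s[1]),
)
```

Because both consumers take the model as an argument, changing the measurement function in one place (for example, to add a new sensor type) changes it consistently for estimation and prediction alike.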
Shared State Space Models for Model Predictive Control
Measurement Models and Dynamics as Decorators
Measurement Models and Dynamics as Decorators
Target Localization: Adding a New Sensor
Demo: Prediction
[Figure: plot titled "Range-Only Stationary Target Localization" with legend entries particles, robot1, target position, N-step prediction, predicted state, robot2; horizontal axis X]
ML and Optimization: Swarmlets
[Figure: diagram labeled "PILOT App: State Estimation and Control" and "service discovery"]
Conclusions
- Presented an actor-oriented machine learning toolkit designed for ML and optimization applications on streaming data
- Enhancing programmability of swarmlets
- Actor libraries for common state-space dynamics and sensor models
Looking Ahead
- Enhancing ML capabilities:
  - Discrete optimization solvers
  - (Mixed) integer programming
  - Tool integration: e.g., GMTK
- Developing Swarmlets: Providing services to TerraSwarm application developers
- More case studies
  - Anomaly detection
  - Multi-sensor fusion
Demos: Available in Ptolemy II
http://chess.eecs.berkeley.edu/ptexternal/
Thank You! Questions? Comments?