Machine Learning: Overview & Applications to Test


Machine Learning: Overview & Applications to Test. 1st Lt Takayuki Iguchi, 1st Lt Megan E. Lewis, AFOTEC/Det 5/DTS. Release Date: 6 MAR 17. Approved for Public Release: Distribution Unlimited. AFOTEC Public Affairs Public Release Number 2017-01 1

Why use Machine Learning in test? It takes more time to analyze large, high-dimensional data (video, audio, bus data) than it does to collect it. Machine learning is designed to work with large, high-dimensional data. 2

Visualizing Large High Dimensional Data (slides 3-8: figures only)

What is Machine Learning? "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." "The field of study that gives computers the ability to learn without being explicitly programmed." 9

Types of Machine Learning Reinforcement Learning: learn to select actions that maximize the accumulated reward over time. Unsupervised Learning: infer a function from unlabeled training data. Supervised Learning: infer a function from labeled training data. 10

Types of ML Unsupervised Learning: "These things are similar." "These things will add up to something that looks like a 2." (van der Maaten [2008]) (Hinton [2013]) 11

Types of Machine Learning Supervised Learning: "This is the correct salary of a professor given the time since highest degree earned." "These are camels. Those are people." [figure: a simple linear regression] (Weisberg [1985]) (ImageNet [2014]) 12

Unsupervised Learning Tasks: anomaly detection / outlier detection, dimensionality reduction, manifold learning, clustering. 13

Anomaly Detection As instrumentation has improved, the limiting factor is often not that there isn't enough data but that there is too much. In flight test there is often a small time window between sorties, so a quick cursory data analysis is desired but not currently possible. Anomaly detection methods can help identify otherwise hidden issues (not detected by aircrew) before they manifest into larger issues. 14

Anomaly Detection Perform problem identification with logged and uncontrollable factors. There is a variety of algorithms and methodologies; the choice of algorithm and methodology depends on the application and the nature of the data (see the sketch below). 15
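As a hedged illustration of the kind of workflow this slide alludes to (not part of the original briefing), the sketch below flags unusual frames in a synthetic stand-in for telemetry features using scikit-learn's IsolationForest; the data, feature count, and contamination setting are assumptions chosen only for the example.

```python
# Illustrative sketch: anomaly detection on synthetic stand-in telemetry features.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
nominal = rng.normal(0.0, 1.0, size=(1000, 8))   # stand-in for nominal bus/telemetry features
spikes = rng.normal(6.0, 1.0, size=(10, 8))      # stand-in for a handful of anomalous frames
frames = np.vstack([nominal, spikes])

model = IsolationForest(contamination=0.01, random_state=0)
labels = model.fit_predict(frames)               # -1 = anomaly, +1 = nominal

print("flagged frames:", np.where(labels == -1)[0])
```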

Dimensionality Reduction The goal: take a high-dimensional dataset and find a good representation in a lower dimension (e.g., 2-D). Signal decomposition methods: Principal Component Analysis (PCA), Kernel PCA, Factor Analysis, Non-negative Matrix Factorization. Manifold learning: Isomap, Locally Linear Embedding (LLE), Spectral Embedding, Multi-Dimensional Scaling (MDS), t-distributed Stochastic Neighbor Embedding (t-SNE). 16
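A minimal sketch (not from the original briefing) of one of the signal decomposition methods listed above: projecting scikit-learn's small handwritten-digits dataset to 2-D with PCA. The dataset choice and component count are illustrative assumptions.

```python
# Illustrative sketch: reduce 64-dimensional digit images to 2-D with PCA.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, y = load_digits(return_X_y=True)   # 1797 samples, 64 features (8x8 digit images)

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print("projected shape:", X_2d.shape)
print("variance explained by 2 components:", pca.explained_variance_ratio_.sum())
```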

t-Distributed Stochastic Neighbor Embedding (t-SNE) PCA is variance based: if the structure in the high-dimensional space lies on a non-linear manifold, PCA will not work well. (Vanderplas, scikit-learn [2016]) 17
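A hedged sketch of the point made on this slide (not from the original briefing): on data that lies on a non-linear manifold, a linear projection such as PCA collapses the structure, while t-SNE tends to preserve local neighborhoods. The S-curve dataset and the perplexity value are illustrative assumptions.

```python
# Illustrative sketch: PCA vs. t-SNE on points sampled from a non-linear (S-shaped) manifold.
from sklearn.datasets import make_s_curve
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

X, color = make_s_curve(n_samples=1000, random_state=0)   # 3-D points on an S-shaped manifold

X_pca = PCA(n_components=2).fit_transform(X)
X_tsne = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

print("PCA embedding:", X_pca.shape, " t-SNE embedding:", X_tsne.shape)
```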

t-Distributed Stochastic Neighbor Embedding (van der Maaten [2016]) (slides 18-21: figures only)

t-Distributed Stochastic Neighbor Embedding [figure: 2-D embeddings of handwritten digit classes 0-9 produced by t-SNE, Sammon mapping, Isomap, and Locally Linear Embedding] (van der Maaten [2008]) 22

Clustering The goal: partition a dataset to maximize similarity within each partition. Connectivity-based / hierarchical clustering: Single-Linkage Clustering (SLINK). Centroid-based clustering: k-means++, k-medians. Density-based clustering: Density-Based Spatial Clustering of Applications with Noise (DBSCAN). Distribution-based clustering: Gaussian Mixture Models. 23

k-means Randomly draw cluster centroids. Until the clustering remains unchanged: assign each point to the nearest centroid, then calculate new centroids. Output the clustering. (Slides 24-32 step through one run of the algorithm; see the sketch below.)
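A minimal sketch of the k-means loop described on the slides (Lloyd's algorithm), not from the original briefing; the synthetic 2-D data, k = 3, and the random initialization are illustrative assumptions.

```python
# Illustrative sketch: the k-means loop from the slides, written directly in NumPy.
import numpy as np

def kmeans(points, k, seed=0, max_iter=100):
    rng = np.random.default_rng(seed)
    # Randomly draw k cluster centroids from the data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    assignment = None
    for _ in range(max_iter):
        # Assign each point to its nearest centroid.
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_assignment = distances.argmin(axis=1)
        # Stop once the clustering remains unchanged.
        if assignment is not None and np.array_equal(new_assignment, assignment):
            break
        assignment = new_assignment
        # Recalculate each centroid as the mean of its assigned points.
        for j in range(k):
            if np.any(assignment == j):
                centroids[j] = points[assignment == j].mean(axis=0)
    return assignment, centroids

rng = np.random.default_rng(1)
data = np.vstack([rng.normal(c, 0.5, size=(100, 2)) for c in ((0, 0), (4, 4), (0, 4))])
labels, centers = kmeans(data, k=3)
print("cluster sizes:", np.bincount(labels))
```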

Supervised Learning Tasks Classification: output is discrete (e.g., speech recognition, image classification). Regression: output is continuous. 33

Neural Networks A Neuron Mathematical model for a neuron: each input a_i is multiplied by a weight w_ij, a fixed bias input a_0 = 1 enters with weight w_0j, the weighted inputs are summed, and an activation function g gives the output a_j = g(Σ_i w_ij a_i). (Russell, Norvig [2010]) 34

Perceptrons All inputs connected directly to the output unit [diagram: inputs x_1, x_2, x_3 with weights w_1, w_2, w_3 feeding the output y]. With target t, output y, and learning rate ε, the error function is E = ½(t - y)^2, and the weights are updated with each training case. If the output unit is a threshold unit: Δw_i = ε (t - y) x_i. If the output unit is a logistic unit, y = σ(w·x): Δw_i = -ε ∂E/∂w_i = ε (t - σ(w·x)) σ(w·x) (1 - σ(w·x)) x_i. (Russell, Norvig [2010]) 35
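A minimal sketch (not from the original briefing) of the threshold-unit update rule above, Δw_i = ε (t - y) x_i, trained on a toy linearly separable problem; the logical-OR data, learning rate, and epoch count are illustrative assumptions.

```python
# Illustrative sketch: perceptron training with the threshold-unit update rule.
import numpy as np

def train_perceptron(X, t, eps=0.1, epochs=20, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=X.shape[1] + 1)   # weights; last entry is the bias weight
    Xb = np.hstack([X, np.ones((len(X), 1))])          # append a constant bias input of 1
    for _ in range(epochs):
        for x_i, t_i in zip(Xb, t):
            y_i = 1.0 if np.dot(w, x_i) > 0 else 0.0   # threshold output unit
            w += eps * (t_i - y_i) * x_i               # update with each training case
    return w

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # toy logical-OR problem
t = np.array([0, 1, 1, 1], dtype=float)
print("learned weights:", train_perceptron(X, t).round(3))
```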

Multi-layer Perceptron Performs better than a single-layer feed-forward neural network. Trained with backpropagation. [diagram: inputs x_1, x_2, x_3 connected through weights w_1-w_6 to hidden units h_1, h_2, which connect through weights w_7, w_8 to the output y] 36
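A hedged illustration of training with backpropagation (not from the original briefing): the sketch below trains a tiny multi-layer perceptron on XOR in NumPy. The 2-4-1 architecture, learning rate, and iteration count are illustrative assumptions and differ from the three-input diagram on the slide; convergence depends on the random initialization.

```python
# Illustrative sketch: a small MLP trained with backpropagation on XOR.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(scale=1.0, size=(2, 4)); b1 = np.zeros((1, 4))   # input -> hidden
W2 = rng.normal(scale=1.0, size=(4, 1)); b2 = np.zeros((1, 1))   # hidden -> output
eps = 1.0

for _ in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)
    # Backward pass: propagate the error back through the network.
    delta_out = (y - t) * y * (1 - y)
    delta_hid = (delta_out @ W2.T) * h * (1 - h)
    # Gradient-descent weight updates.
    W2 -= eps * h.T @ delta_out
    b2 -= eps * delta_out.sum(axis=0, keepdims=True)
    W1 -= eps * X.T @ delta_hid
    b1 -= eps * delta_hid.sum(axis=0, keepdims=True)

print("predictions:", y.round(2).ravel())
```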

Image Text Recognition Over-the-shoulder videos are common data sources: cheap to implement, but processing is time intensive. ANNs can help. (Karpathy [2015]) (Shi et al. [2016]) 37

Convolutional Neural Network Typically used for image classification. RGB images can be thought of as a 3-D array (height x width x 3 color channels). Fully connected hidden layers would require too many weights. The forward pass slides a filter over the image (see the sketch below). (Hinton [2013]) 38
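A minimal sketch (not from the original briefing) of the forward pass as "passing a filter over the image", written for a single channel in NumPy; the toy image and the vertical-edge filter are illustrative assumptions.

```python
# Illustrative sketch: one channel of a convolutional forward pass (no padding, stride 1).
import numpy as np

def conv2d_single_channel(image, kernel):
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the filter with the image patch under it and sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(36, dtype=float).reshape(6, 6)   # stand-in for one channel of an image
edge_filter = np.array([[1.0, 0.0, -1.0]] * 3)      # simple 3x3 vertical-edge filter
feature_map = conv2d_single_channel(image, edge_filter)
print(feature_map.shape)   # (4, 4)
```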

Convolutional Neural Network (Karpathy [2016]) (slides 39-47: figures only)

Hyperspectral Classification Per-pixel classification from hyperspectral data. Data from https://engineering.purdue.edu/biehl/multispec/hyperspectral.html 48

CNNs on MNIST Misclassifications of LeNet-5 (LeCun [1998]) 49

Recurrent Neural Networks Directed cycles in their connection graph. MLPs and CNNs require fixed-size input; RNNs are used to model sequential data. Hard to train. [diagram: input, hidden, and output layers unrolled over time steps t_1 through t_6] 50
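A minimal sketch (not from the original briefing) of a vanilla RNN forward pass unrolled over time: the same weights are reused at every step and the hidden state carries information between steps. The layer sizes, sequence length, and random inputs are illustrative assumptions.

```python
# Illustrative sketch: vanilla RNN forward pass over a toy 6-step sequence.
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size, output_size, steps = 4, 8, 3, 6

W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))    # input -> hidden
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))   # hidden -> hidden (the recurrence)
W_hy = rng.normal(scale=0.1, size=(output_size, hidden_size))   # hidden -> output

xs = rng.normal(size=(steps, input_size))   # a toy input sequence x_1 ... x_6
h = np.zeros(hidden_size)                   # initial hidden state

outputs = []
for x_t in xs:
    h = np.tanh(W_xh @ x_t + W_hh @ h)      # new state depends on the input and the previous state
    outputs.append(W_hy @ h)                # output at this time step

print("per-step outputs:", np.array(outputs).shape)   # (6, 3)
```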

Recurrent Neural Networks (Karpathy [2015]) 51

Image Text Recognition There are different ways of thinking about the problem; a Long Short-Term Memory (LSTM) layer was used most recently. (Karpathy [2015]) (Shi et al. [2016]) 52

Audio-speech Recognition Traditional speech models chain several components: speech waveform -> Acoustic Model -> phonemes (e.g., /kaˈfā/) -> Pronunciation Model -> words (e.g., "cafe") -> Language Model -> sentence. The recognizer chooses W* = argmax_W P(W | X) = argmax_{W,L} P(X | L) P(L | W) P(W), where X is the acoustic signal, L the phoneme sequence, and W the word sequence. (Beaufays [2016]) 53

Other Acoustic Models Other DNN-based approaches to acoustic modeling (Beaufays [2016]): DBN (2012); Long Short-Term Memory (LSTM) (2013); Convolutional LSTM DNN (2014); Connectionist Temporal Classification (CTC) (2015). 54

Summary of Applications Audio to text: transcribe in-flight audio/conversations, transcribe survey conversations, easily slew to audio of interest. Image captioning: write text appearing in an image to a text file (in-flight data). Object recognition in images: help label truth data when testing sensors. Video just adds a time dimension to images, so techniques from images may be applied to video. 55

Next Steps Low-hanging fruit: use already existing open-source text recognition for images/video (OpenCV); use free audio transcription software (TensorFlow from Google, SwiftScribe from Baidu). 56

Next Steps Open areas for development: transcribing acronyms; using machine learning on bus data to alert a maintainer to a particular risk; ATC radar more accurately narrowing down aircraft location in real time (Hrastovec et al. [2014]); identifying early indications of airframe stress and strain (Hickinbotham et al. [2000]). 57

Acknowledgements Workshop organizers, AFOTEC Det 5 leadership, Mr. Jeff Wilson, Capt Joshua Vaughan. 58

References
Hinton, Geoffrey. Artificial Neural Networks. Coursera (2013).
ImageNet (2014). http://www.image-net.org/challenges/lsvrc/2014/ui/det.html
Karpathy, Andrej, et al. CS231n online course notes: http://cs231n.stanford.edu/
Karpathy, Andrej. RNN github page (2015): http://karpathy.github.io/2015/05/21/rnn-effectiveness/
LeCun, Yann. Gradient-Based Learning Applied to Document Recognition (1998).
MATLAB documentation (2017): https://www.mathworks.com/discovery/support-vector-machine.html
van der Maaten. t-SNE github page (2016): https://lvdmaaten.github.io/tsne/
van der Maaten, Hinton. Visualizing Data using t-SNE. JMLR (2008).
Russell, Norvig. Artificial Intelligence: A Modern Approach. 3rd Ed. 2010. New Jersey: Pearson.
scikit-learn documentation (2016). http://scikit-learn.org/stable/documentation.html
Weisberg, S. (1985). Applied Linear Regression, Second Edition. New York: John Wiley and Sons.
Wolberg, W.H., & Mangasarian, O.L. (1990). Multisurface method of pattern separation for medical diagnosis applied to breast cytology. Proceedings of the National Academy of Sciences, 87, 9193-9196. 59

References
van der Maaten, Hinton. Visualizing Data using t-SNE. JMLR (2008).
Shi, Baoguang, Xiang Bai, and Cong Yao. "An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition." IEEE Transactions on Pattern Analysis and Machine Intelligence (2016).
Ghamisi, Pedram, et al. "Advanced Supervised Spectral Classifiers for Hyperspectral Images: A Review." IEEE Geoscience and Remote Sensing Magazine (GRSM) (2017).
Dahl, George E., Dong Yu, Li Deng, and Alex Acero. "Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition." IEEE Transactions on Audio, Speech, and Language Processing (2012).
Gupta, Manish, et al. "Outlier Detection for Temporal Data: A Survey." IEEE Transactions on Knowledge and Data Engineering (2014).
Beaufays, Francoise. Speech Recognition. Google I/O (2016).
Yoon, Seunghyun, et al. "Efficient Transfer Learning Schemes for Personalized Language Modeling using Recurrent Neural Network." arXiv preprint arXiv:1701.03578 (2017). 60

Questions? 61