Beam Detection Based on Machine Learning Algorithms

Haoyuan Li, hyli16@stanford.edu
Qing Yin, qingyin@stanford.edu

Abstract—The positions of free electron laser beams on screens are precisely determined by a sequence of machine learning models. Transfer learning is conducted in a self-constructed convolutional neural network based on the VGG16 model, and the outputs of intermediate layers are passed as features to a support vector regression model. With this sequence, 85.8% correct prediction is achieved on the test data.

Index Terms—Beam Detection, SVM, CNN

I. INTRODUCTION

The free electron laser (FEL) at the Stanford Linear Accelerator Center (SLAC) is an ultra-fast X-ray laser. As one of the most advanced X-ray light sources [1] [2], it is famous for its high brightness and short pulse duration: it is 10 billion times brighter than the world's second brightest light source, and its pulse duration is several tens of femtoseconds. It plays a pivotal role in both fundamental science research and applied research [2].

The mechanism behind this laser is very delicate [1], so keeping the laser in optimal working condition is challenging. The positions of the electron beams and the laser beams are of fundamental importance in the control and maintenance of this FEL. Currently, the task of locating beam spots depends heavily on human labor. This is mainly attributed to the wide variety of beam spots and the presence of strong noise, as demonstrated in Figure 1, where the white square marks the boundary of the beam spot. Each picture requires a long sequence of signal processing steps to mark the beam position. To make things worse, different instrument configurations and working conditions require different processing parameters, and even within the same configuration the parameters drift over time because of the inherent delicacy of the instrument. This makes tuning and maintaining the FEL a tedious and burdensome task for researchers. Currently, the data update frequency of the laser is 120 Hz, which we can barely handle; in 2020, after the scheduled upgrade, the frequency will climb to 5000 Hz. Thus the only hope lies in automatic processing methods. Since simple signal processing methods cannot handle such nasty conditions, we hope to come up with a general machine learning algorithm capable of locating beam positions quickly and automatically.

Fig. 1: Raw photos

In this report, we demonstrate the consecutive application of a neural network and a support vector machine (SVM) to determine the positions of beam spots on virtual cathode camera (VCC) screens in simple cases.

II. CHALLENGES AND STRATEGIES

A closer look at this simple-looking task reveals the challenges behind it.

- Complicated background noise. The background noise is not static; on the contrary, it is coherent in both the time and space domains due to quantum effects. Thus we have to dynamically change the background when performing background subtraction for different images.
- Large variety in the intensity and shape of the beam spot. As indicated above, the maximum intensity of the beam can be incredibly high. However, it can also be 0, which corresponds to the case where no beam spot is present. The shape of the beam can also vary significantly from case to case.
- Ground truth is hard to find. The signal processing methods sometimes fail to find the beam spot. There are also cases where the photo is so bad that even a human cannot find the beam spot after all the signal processing procedures.

The task is similar to face recognition, and the expectation of solving it in one strike is unrealistic. Thus, we adopt the following strategy. First, we develop an algorithm capable of dealing with cases where the signal processing already realizes the fundamental functions. Second, we extend the application to cases where we gradually omit pre-processing steps, e.g. Gaussian blur. In the end, we extend the application to raw images. In this report, we describe the accomplishment of the first step.

III. RELATED WORK

A similar dilemma occurred when researchers tried to pinpoint the electron bunch's trace [3]. It was solved by first feeding the images into a convolutional neural network [4], VGG16 [5], and then performing linear regression on its codewords. This inspired the present project. Signal processing algorithms have already been developed for locating the beam spot, so we use the results of these algorithms as the ground truth for our training.

IV. DATASET

The training dataset for this project consists of 162 original VCC screen figures and 16200 figures generated from the original ones (each original image corresponds to 100 generated images). Each figure contains only one beam spot. To generate a new figure, we first cut out the part of the figure that contains the beam spot, rotate the small patch by a random angle, and place the patch at a random point; finally, we cut out the covered part of the figure and paste it over the original position of the beam spot. The position of the beam in each original figure is determined by the signal processing algorithm, and the position of the beam in each generated figure is determined by the generation process. The position of the beam in each figure is represented by the coordinates of two diagonal vertices of its bounding box:

$$z^{(i)} = \left(y^{(i)}_{\min},\; y^{(i)}_{\max},\; x^{(i)}_{\min},\; x^{(i)}_{\max}\right)^T$$
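For concreteness, a minimal sketch of this generation procedure follows. This is illustrative code under stated assumptions (8-bit grayscale NumPy arrays, a rotated patch that still fits inside the frame, and the use of scipy.ndimage.rotate), not the exact implementation used for the dataset.

    import numpy as np
    from scipy.ndimage import rotate

    def augment(img, box, rng):
        # img: 2-D uint8 VCC image; box: (y_min, y_max, x_min, x_max)
        # of the true beam spot. Returns a new image and a new box.
        y0, y1, x0, x1 = box
        patch = img[y0:y1, x0:x1].copy()
        # Rotate the patch by a random angle; reshape=True grows the
        # canvas so that nothing is clipped.
        patch = rotate(patch, angle=rng.uniform(0, 360), reshape=True, order=1)
        ph, pw = patch.shape
        # Pick a random top-left corner where the rotated patch fits.
        ny = int(rng.integers(0, img.shape[0] - ph + 1))
        nx = int(rng.integers(0, img.shape[1] - pw + 1))
        out = img.copy()
        # Paste the covered background over the original spot position,
        # then paste the rotated spot at the new position.
        out[y0:y1, x0:x1] = img[ny:ny + (y1 - y0), nx:nx + (x1 - x0)]
        out[ny:ny + ph, nx:nx + pw] = patch
        return out, (ny, ny + ph, nx, nx + pw)

    rng = np.random.default_rng(0)

Repeating this 100 times per original image with independent draws of angle and position yields the 16200 generated figures.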

V. METHOD

A. Pipeline

To obtain an algorithm robust to variations of the beam spot and noise, we use a two-step method. First, we use a convolutional neural network (CNN) as a feature extractor; then we take the intermediate outputs of the CNN as features and train a support vector regression (SVR) model on them. The pipeline is demonstrated in Figure 2.

Fig. 2: Algorithmic flowchart

B. Feature Extraction

Since we have a moderate sample size, transfer learning is suitable for this purpose, and VGG16 is selected as the pre-trained model. VGG16 is a deep convolutional neural network with 13 convolutional layers, 5 pooling layers and 3 fully connected layers in total; its structure is demonstrated in Figure 3. The outputs of the first and second fully connected layers are both 4096-dimensional vectors. We feed in our samples and use these two vectors as the features of the image.

Fig. 3: Structure of VGG16 [6]

To perform the transfer learning, we first re-construct VGG16 with Python 2.7 and TensorFlow 0.11. The last layer of the original VGG16 is a thousand-class classifier; we modify it to output the position of the center of the beam spot. The loss is defined as the squared distance between the predicted beam center and the true beam center, plus an $\ell_2$ regularization term over the parameters of the last three fully connected layers:

$$L_{\mathrm{CNN}} = \frac{1}{m}\sum_{i=1}^{m}\lVert x_i' - x_i\rVert^2 + \frac{\lambda}{m}\lVert W\rVert^2 \qquad (1)$$

where $x_i'$ and $x_i$ denote, respectively, the predicted center and the true center of the beam (represented by the center of the box), and $W$ is a vector consisting of all parameters in the last three fully connected layers; i.e., we only tune the parameters of the last three layers, because we do not have enough data to tune all of them. According to the modern viewpoint, the first several layers mainly detect fine-scale structures while the upper layers capture larger-scale objects, so it is more reasonable to tune the layers that directly affect our next step.
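As a concrete illustration of this step, the sketch below is written against the modern tf.keras API rather than our TensorFlow 0.11 code; the layer names 'fc1' and 'fc2' are Keras's bundled VGG16 names, and the momentum and regularization values are placeholder assumptions, not the tuned settings.

    import numpy as np
    import tensorflow as tf

    # Pre-trained VGG16 with its fully connected head; 'fc1' and 'fc2'
    # are the two 4096-dimensional fully connected layers.
    base = tf.keras.applications.VGG16(weights="imagenet", include_top=True)
    feat = tf.keras.Model(
        inputs=base.input,
        outputs=[base.get_layer("fc1").output, base.get_layer("fc2").output],
    )

    def extract_features(batch):
        # batch: (n, 224, 224, 3) float array of VCC images tiled to RGB.
        x = tf.keras.applications.vgg16.preprocess_input(batch)
        fc1, fc2 = feat.predict(x, verbose=0)
        return np.concatenate([fc1, fc2], axis=1)  # (n, 8192) features q_i

    # Transfer-learning head: replace the 1000-class output with a 2-unit
    # layer predicting the beam-spot centre; only the top layers are tuned
    # (Eq. (1) regularizes all three fully connected layers).
    for layer in base.layers:
        layer.trainable = layer.name in ("fc1", "fc2")
    head = tf.keras.layers.Dense(
        2, kernel_regularizer=tf.keras.regularizers.l2(1e-3)
    )(base.get_layer("fc2").output)
    model = tf.keras.Model(base.input, head)
    # Nesterov momentum with the t^{-1} learning-rate decay of Eq. (8):
    # alpha(t) = 1e-4 / (1 + t).
    lr = tf.keras.optimizers.schedules.InverseTimeDecay(1e-4, 1, 1.0)
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=lr, momentum=0.9,
                                          nesterov=True),
        loss="mse",
    )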
C. Support Vector Regression

The feature vector of the $i$-th figure is denoted by $q^{(i)} \in \mathbb{R}^{8192}$. To avoid overfitting, we use $\ell_2$ regularization when implementing support vector regression (SVR) with a Gaussian kernel. The four components of the position are treated as independent, so we train one SVR model for each of the four labels $s$ by maximizing the following dual function [7]:

$$L_s = \sum_{i=1}^{m}\alpha_i^s z_s^{(i)} - \frac{1}{2}\sum_{i,j=1}^{m}\alpha_i^s \alpha_j^s K\!\left(q^{(i)}, q^{(j)}\right), \qquad \alpha_i^s \in [-t, t],\; i \in \{1,\dots,m\},\; s = 1,2,3,4 \qquad (2)$$

where $m$ is the number of training samples, the $\alpha_i^s$ are the optimization parameters, and $t$ is the penalty ratio on the difference between the predicted label and the true label. This dual function is obtained by eliminating the original variables from the primal optimization problem [7]:

$$L_s = \frac{1}{2}\lVert\theta_s\rVert_2^2 + t\sum_{i=1}^{m}\xi_i^+ + t\sum_{i=1}^{m}\xi_i^- \qquad (3)$$

subject to, for all $i \in \{1,\dots,m\}$,

$$\xi_i^+ \ge 0 \qquad (4)$$
$$\xi_i^- \ge 0 \qquad (5)$$
$$z_s^{(i)} - \theta_s^T q^{(i)} \le \xi_i^+ \qquad (6)$$
$$\theta_s^T q^{(i)} - z_s^{(i)} \le \xi_i^- \qquad (7)$$

where $\xi_i^+$ and $\xi_i^-$ are the thresholds for the training error.
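In practice the four per-coordinate regressors reduce to a few lines of scikit-learn, the package we use throughout; the sketch below is a minimal restatement, with C left at a placeholder value rather than the tuned penalty.

    import numpy as np
    from sklearn.svm import SVR

    def fit_svr(features, labels, C=1.0):
        # features: (m, d) array of (PCA-reduced) CNN features.
        # labels: (m, 4) array of (y_min, y_max, x_min, x_max) boxes.
        # One independent RBF-kernel SVR per coordinate, as in Eq. (2).
        return [SVR(kernel="rbf", C=C).fit(features, labels[:, s])
                for s in range(4)]

    def predict_box(models, features):
        # Stack the four per-coordinate predictions into (m, 4) boxes.
        return np.stack([m.predict(features) for m in models], axis=1)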
VI. TRAINING

A. Convolutional Neural Network

The Nesterov momentum method [8] is utilized to minimize the CNN loss function. We use a vanishing learning rate that decays as $t^{-1}$:

$$\alpha = 10^{-4}\,(1 + t)^{-1} \qquad (8)$$

B. Cross Validation

To maximize the usage of the limited data, we use 9-fold cross-validation. First we randomly shuffle the 162 original images to erase potential correlations between different images, then divide them into 9 batches of 18 images each. For each batch, the other 144 original images, together with the related 14400 generated images, form the training dataset for the next two steps. This guarantees independence between the training and testing processes.
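A sketch of this split follows; the code is our own illustration, and the row layout of the generated images is an assumption. The key point is that every generated image travels with its parent, so no augmented copy of a test image leaks into training.

    import numpy as np

    def nine_fold_indices(n_orig=162, n_folds=9, seed=0):
        # Yield (train, test) index arrays over the 162 original images;
        # generated images follow their parents.
        order = np.random.default_rng(seed).permutation(n_orig)
        for fold in np.array_split(order, n_folds):
            test = fold                         # 18 original images
            train = np.setdiff1d(order, fold)   # the other 144
            yield train, test

    # Under the assumed layout where the 100 augmented copies of original
    # image i occupy rows i*100 .. i*100+99 of the generated-image matrix,
    # the generated training rows for a fold are:
    # rows = np.concatenate([np.arange(i * 100, (i + 1) * 100) for i in train])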
C. PCA and SVM

The sequential minimal optimization (SMO) algorithm is utilized to minimize the SVR loss function, and the package scikit-learn is used throughout to guarantee performance. The feature has a huge dimension of 8192, while the prediction contains only 4 numbers. To prevent overfitting, we use principal component analysis (PCA) to extract the most influential dimensions, and we iterate the training and testing procedure over 12 different dimensions roughly equally spaced on a log scale, i.e. {5, 10, 20, 40, 80, 160, 320, 640, 1280, 2560, 5120, 8000}.

VII. RESULTS AND DISCUSSIONS

A. Performance

After re-tuning, the regression algorithm gives superior performance: beam positions are correctly predicted in 139 of the 162 original images (85.8%). Figure 4 demonstrates the typical performance of the regression on the test set. When the regression algorithm catches the spot, it gives a very precise prediction, which implies that the algorithm really has learned how to pinpoint the position of the spot.

Fig. 4: Test results

Yet when it cannot catch the spot (second column of Figure 4), the regression algorithm just guesses a point near the center of the screen. After checking the pictures where the algorithm misses the spot, we found that the missing spots are not dim at all; in fact, the maximum intensities in those figures are even larger than in those with perfect predictions. But the beam spots in those figures are all very small.

B. Effect of PCA Dimension

To prevent overfitting, we use PCA to pick out the most influential dimensions of the features before regression. Figure 6 shows the training error of the SVR for different PCA dimensions, where the error is defined as the distance between the predicted beam center and the true beam center. Figure 5 shows the test performance of the SVR for different PCA dimensions; to quantify the performance, we use the average ratio between the overlapped area and the area of the true spot as the indicator. These two figures clearly show that when the PCA dimension exceeds 1000 there is a severe overfitting problem: the training error vanishes while the test performance drops in that region.

Fig. 5: Test performance (overlap area ratio vs. PCA dimension, log scale)

Fig. 6: Training error (center deviation per sample, for d = 5, 160, 1280)
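The performance indicator used above is easy to restate in code. The sketch below is our own formulation of the "overlap over true area" ratio, with boxes ordered (y_min, y_max, x_min, x_max) as in Section IV.

    def overlap_ratio(pred, true):
        # Overlapped area divided by the area of the true beam spot.
        oy = max(0.0, min(pred[1], true[1]) - max(pred[0], true[0]))
        ox = max(0.0, min(pred[3], true[3]) - max(pred[2], true[2]))
        true_area = (true[1] - true[0]) * (true[3] - true[2])
        return oy * ox / true_area

    # Example: a prediction shifted by two pixels still scores well.
    print(overlap_ratio((12, 52, 10, 50), (10, 50, 10, 50)))  # 0.95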
VIII. CONCLUSIONS AND FUTURE WORK

With the newly generated figures and the re-tuned neural network, the performance of the SVR improves greatly, yet it is still far from applicable in a real environment. Three approaches are scheduled for further improvement.

A. Fine-Tuning the Neural Network

VGG16-like neural networks are huge and hard to tune. It is highly likely that with improved tuning skill, and a bit of luck and patience, we can get better performance.

B. More and Harder Data

Noisy figures are currently beyond the signal processing algorithm's capability, but our regression algorithm sometimes does a better job on the existing samples, and even for raw pictures our model can occasionally generate good predictions (Figure 7). So we may try the algorithm on harder samples in the future. In particular, we will mark the beam positions in noisy figures by hand and train the model on those samples as well.

Fig. 7: Test results for raw pictures

C. Building a New Neural Network

It is possible that a smaller but better-trained neural network can do a better job; only after trying both methods can we decide which way is better. We have already constructed another, smaller neural network, and its training will begin soon.

REFERENCES

[1] K.-J. Kim, Z. Huang, and R. Lindberg, Synchrotron Radiation and Free-Electron Lasers. Cambridge University Press, 2017.
[2] C. Pellegrini, A. Marinelli, and S. Reiche, "The physics of x-ray free-electron lasers," Rev. Mod. Phys., vol. 88, p. 015006, Mar 2016. [Online]. Available: http://link.aps.org/doi/10.1103/revmodphys.88.015006
[3] "Spatial Localization - AI at SLAC - SLAC Confluence." [Online]. Available: https://confluence.slac.stanford.edu/display/ai/spatial+localization
[4] "CS231n Convolutional Neural Networks for Visual Recognition." [Online]. Available: http://cs231n.github.io/
[5] K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-Scale Image Recognition," Sep 2014. [Online]. Available: http://arxiv.org/abs/1409.1556
[6] "VGG in TensorFlow - Davi Frossard." [Online]. Available: https://www.cs.toronto.edu/~frossard/post/vgg16/
[7] A. J. Smola and B. Schölkopf, "A tutorial on support vector regression," Statistics and Computing, vol. 14, no. 3, pp. 199-222, 2004.
[8] Y. Bengio, N. Boulanger-Lewandowski, and R. Pascanu, "Advances in Optimizing Recurrent Networks," Dec 2012. [Online]. Available: http://arxiv.org/abs/1212.0901