Machine Learning for Big Fishery Visual Data

1 Machine Learning for Big Fishery Visual Data. Jenq-Neng Hwang, Professor, Associate Chair, UWEE

2 Acknowledgements: Farron Wallace, AFSC NOAA; Kresimir Williams, AFSC NOAA; Craig Rose, AFSC NOAA; George Cutter, SWFSC NOAA; Suzanne Romain, AFSC NOAA; Jason Sagmiller, AFSC NOAA; Paul Packer, AFSC NOAA; Richard Towler, AFSC NOAA; Meng-Che Chuang, Skytap; Tsung-Wei Huang, Gaoang Wang, Sheng-Ting Shen, UWEE

3 Electronic Monitoring: electronic monitoring (EM) system on federal fisheries; monitor the fish species and size; near real-time reporting; automatic & non-lethal analysis; fish image segmentation; length measurement and species ID; catch accounting; retention compliance

4 Visual Data Analyses for Fisheries Surveys

5 Outline: Machine Learning for Visual Data; Fish Segmentation and Length Measurement; Fish Tracking and Counting; Fish Species Identification; Conclusion

6 Visual Data Analytics. Unsupervised: clustering or grouping. Supervised: classification (recognition) or regression/prediction. Pipeline: Visual Sensor, Feature Representation (LBP, HoG, SIFT, SURF), Learning Algorithms, Application (e.g., segmentation, detection, tracking, ID, etc.)

7 Paradigms of Machine Learning. Unsupervised Learning: K-Means, GMM. Supervised Learning: CART/RF, SVM, ANN/CNN, HMM. Semi-Supervised (Inductive) Learning. Query (Active) Learning. Reinforcement Learning: Q-Learning, TD Learning.

8 Supervised Learning Approaches. Can be arbitrary functions of high-dimensional $x \in \mathbb{R}^m$: Nearest Neighbor (template matching); Decision Tree (CART & RF); Linear Functions (SVM & DML); Nonlinear Functions (ANN, CNN); Hidden Markov Model (HMM) over sequences $\{x_1, x_2, x_3, \ldots, x_T\}$.

9 A Multilayer Perceptron (MLP) Neural Network: input layer, hidden layers, output layer. Trained by backpropagation learning, a stochastic generalized least mean squares (LMS) algorithm [Rumelhart 1986].
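
As a point of reference for the LMS remark, here is a minimal sketch of the plain LMS (delta) rule on a single linear unit, which backpropagation generalizes to multilayer networks. It is not backpropagation itself; the toy data, learning rate, and function name are assumptions made for illustration.

```python
def lms_fit(samples, lr=0.05, epochs=100):
    """Fit y ~ w * x with the LMS (delta) rule, one sample at a time."""
    w = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - w * x   # prediction error for this sample
            w += lr * error * x      # stochastic gradient step on the squared error
    return w

# Toy data generated by y = 2x (hypothetical, for illustration only).
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = lms_fit(samples)
print(round(w, 6))
```

Each update uses one sample rather than the full dataset, which is the "stochastic" part of the description on the slide.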

10 Convolution Neural Networks (CNNs). Fully-connected layers do not take into account the spatial structure of the images. Convolution produces feature maps; pooling down-samples the maps. Weight sharing increases learning efficiency and achieves better generalization on vision problems [Y. LeCun 1989, A. Krizhevsky 2012].
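
The two operations named on this slide can be sketched in a few lines of plain Python. This is an illustrative toy, not a training-ready layer: the 5 × 5 image and the 2 × 2 edge kernel are made-up values, and "convolution" is implemented as cross-correlation, as is common in CNN libraries.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (cross-correlation, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool(fmap, size=2):
    """Down-sample a feature map with non-overlapping size x size max windows."""
    return [
        [
            max(fmap[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(fmap[0]) - size + 1, size)
        ]
        for r in range(0, len(fmap) - size + 1, size)
    ]

# Hypothetical image: dark left half, bright right half.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]  # responds to a dark-to-bright vertical edge

feature_map = conv2d_valid(image, edge_kernel)  # 4 x 4 map
pooled = max_pool(feature_map)                  # 2 x 2 map
print(feature_map[0], pooled)
```

The same four kernel weights are reused at every image position, which is the weight sharing the slide refers to: the layer has 4 parameters regardless of image size.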

11 Using CNN Features for Detection: Ross Girshick, Fast R-CNN, 2015; Wei Liu et al., SSD: Single Shot MultiBox Detector, 2015

12 Outline: Machine Learning for Visual Data; Fish Segmentation and Length Measurement; Fish Tracking and Counting; Fish Species Identification; Conclusion

13 Chute Fish Length Measurement. Automatic fish midline measurement: measure the fish length between the head and tail endpoints along the (curved) midline rather than from the bounding box. Water drop and blur detection: detect blurred image areas caused by water drops on the camera lens.
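
To illustrate why a midline measurement differs from a bounding-box one, the sketch below sums segment lengths along a polyline from head to tail and compares the result with the bounding-box width. The midline points are hypothetical pixel coordinates; how the midline is actually extracted from the segmentation is not described on the slide and is not modeled here.

```python
import math

def midline_length(points):
    """Sum of Euclidean segment lengths along a head-to-tail polyline."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def bbox_width(points):
    """Horizontal extent of the axis-aligned bounding box of the points."""
    xs = [x for x, _ in points]
    return max(xs) - min(xs)

# Hypothetical midline of a curved fish, in pixels (head first, tail last).
midline = [(0, 0), (3, 4), (6, 0), (9, 4)]

print(midline_length(midline))  # length along the curve
print(bbox_width(midline))      # what a bounding box would report
```

For a curved or rotated fish the bounding-box extent underestimates or misstates the body length, while the polyline length follows the body.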

14 Fish Length Measurement. On 3571 fish samples covering 11 species, a mean absolute error of 1.49% is achieved. Example cases: different orientation, curved body, forked tail.
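
One plausible reading of the reported figure is a mean absolute percentage error, that is, the absolute length error divided by the true length, averaged over all fish. The slide does not spell out the formula, so the sketch below is an assumption, and the lengths in it are invented.

```python
def mean_abs_pct_error(predicted, actual):
    """Mean of |predicted - actual| / actual, expressed in percent."""
    ratios = [abs(p - a) / a for p, a in zip(predicted, actual)]
    return 100.0 * sum(ratios) / len(ratios)

# Hypothetical lengths in mm (not data from the study).
actual = [200.0, 400.0]
predicted = [198.0, 404.0]
mape = mean_abs_pct_error(predicted, actual)
print(round(mape, 2))
```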

15 Fish Length Measurement Examples (Tsung-Wei Huang et al., IEEE ICASSP)

16 Outline: Machine Learning for Visual Data; Fish Segmentation and Length Measurement; Fish Tracking and Counting; Fish Species Identification; Conclusion

17 Camtrawl Tracking by Segmentation & Viterbi Data Association. GMM background subtraction for segmentation in Camtrawl (Meng-Che Chuang et al., IEEE ISCAS)
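
Viterbi data association can be read as a shortest-path search over per-frame candidate detections. The sketch below assumes a simple cost, the Euclidean displacement between consecutive centroids, and a single target; the cost actually used in the cited work is not given on the slide, and the candidate coordinates are invented.

```python
import math

def viterbi_track(frames):
    """frames: one list of (x, y) candidate centroids per frame.
    Returns, per frame, the index of the candidate on the path that
    minimizes the total displacement between consecutive frames."""
    cost = [0.0] * len(frames[0])  # best path cost ending at each candidate
    back = []                      # back[t][j]: best predecessor of candidate j in frame t+1
    for prev, cur in zip(frames, frames[1:]):
        new_cost, pointers = [], []
        for q in cur:
            best = min(range(len(prev)),
                       key=lambda i: cost[i] + math.dist(prev[i], q))
            new_cost.append(cost[best] + math.dist(prev[best], q))
            pointers.append(best)
        cost = new_cost
        back.append(pointers)
    # Backtrack from the cheapest final candidate.
    j = min(range(len(cost)), key=cost.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    return path[::-1]

# Hypothetical candidates: one slowly moving fish plus clutter in frames 1 and 2.
frames = [
    [(0, 0)],
    [(50, 40), (1, 0)],
    [(2, 1), (90, 5)],
    [(3, 1)],
]
path = viterbi_track(frames)
print(path)
```

Because the whole sequence is optimized jointly, an isolated clutter detection is skipped even when it would be a valid candidate in its own frame.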

18 ROV-based Fish Tracking. Problem: high false detection rate. Solution: tracking by detection (deformable part model, DPM), with motion information extracted across frames to reject false detections (fish move, but rocks do not).
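
The motion cue can be illustrated with a very small filter that keeps only tracks whose net displacement exceeds a threshold. This is a simplified stand-in for whatever motion features the system uses: the threshold, track names, and coordinates are hypothetical, and camera motion of the ROV is ignored.

```python
import math

def is_moving(track, min_displacement=5.0):
    """track: list of (x, y) centroids over consecutive frames.
    True if the net start-to-end displacement reaches the threshold."""
    return math.dist(track[0], track[-1]) >= min_displacement

# Hypothetical tracks in pixel coordinates.
tracks = {
    "candidate_a": [(10, 10), (14, 12), (20, 15)],  # drifts steadily: likely a fish
    "candidate_b": [(40, 40), (40, 41), (41, 40)],  # jitters in place: likely a rock
}
kept = [name for name, track in tracks.items() if is_moving(track)]
print(kept)
```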

19 SSD Detection + Segmentation + Depth Refinement. Figure panels: left edge in disparity; right edge in disparity; disparity; background subtraction.

20 Tracking and Measurement: tracking bounding box in 3D, and length (mm)
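
A length in millimetres can be recovered from stereo disparity with the standard rectified pinhole relation Z = f · B / d. The sketch below assumes a rectified stereo pair; the focal length, baseline, principal point, and pixel coordinates are all hypothetical values, not the calibration of the actual rig.

```python
import math

def backproject(u, v, disparity, f, baseline, cx, cy):
    """Map a pixel (u, v) with disparity (pixels) to a 3D point in the units of baseline."""
    z = f * baseline / disparity  # depth from disparity
    return ((u - cx) * z / f, (v - cy) * z / f, z)

# Hypothetical calibration: focal length in pixels, baseline in mm, principal point.
F_PX, BASELINE_MM, CX, CY = 1000.0, 100.0, 500.0, 500.0

head = backproject(400.0, 500.0, 50.0, F_PX, BASELINE_MM, CX, CY)
tail = backproject(600.0, 500.0, 50.0, F_PX, BASELINE_MM, CX, CY)
length_mm = math.dist(head, tail)
print(head, tail, length_mm)
```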

21 Outline: Machine Learning for Visual Data; Fish Segmentation and Length Measurement; Fish Tracking and Counting; Fish Species Identification; Conclusion

22 Bag-of-Features Recognition. Each SIFT descriptor is a vector $x_n \in \mathbb{R}^d$. Feature matrix $X = [x_1, x_2, \ldots, x_N]$ ($d \times N$); codebook $B = [b_1, b_2, \ldots, b_K]$ ($d \times K$); coding matrix $C = [c_1, c_2, \ldots, c_N]$ ($K \times N$), with $X \approx BC$. The image representation takes the maximum of each row of $C$. Can be easily adapted to new datasets; semi-supervised or query learning.
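
The coding and pooling steps can be sketched with the simplest possible coder, hard assignment to the nearest codeword, followed by the row-wise maximum. Hard assignment is a stand-in chosen for brevity; the slide does not specify the coding scheme, and the 2-D descriptors and codebook below are toy values rather than real SIFT features.

```python
def encode(X, B):
    """X: N descriptors, B: K codewords (all of length d).
    Returns the K x N coding matrix C with a 1 at each descriptor's nearest codeword."""
    K = len(B)
    C = [[0.0] * len(X) for _ in range(K)]
    for n, x in enumerate(X):
        nearest = min(range(K),
                      key=lambda k: sum((a - b) ** 2 for a, b in zip(x, B[k])))
        C[nearest][n] = 1.0
    return C

def max_pool_rows(C):
    """Image-level representation: maximum of each row of C (length K)."""
    return [max(row) for row in C]

# Toy 2-D example (hypothetical values).
B = [(0, 0), (10, 0), (0, 10)]  # codebook, K = 3
X = [(1, 0), (9, 1), (0, 1)]    # descriptors, N = 3

C = encode(X, B)
representation = max_pool_rows(C)
print(C, representation)
```

The pooled vector has length K regardless of how many descriptors the image produced, which is what makes it usable as a fixed-size classifier input.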

23 Parameter Settings. Resize images to no larger than [size missing]. Patch size 16 × 16 with a 6-pixel stride. Extract features in a pyramid structure with scale 0.75 at each level. Set Spatial Pyramid Matching (SPM) with (1 × 1, 2 × 2, 4 × 1) subregions. Figure panels: image patch; features extracted at different scales; SPM at level (2 × 2).
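
The patch grid and the SPM subregions can be made concrete with a short sketch. The 100 × 100 image size is a hypothetical placeholder because the resize limit is missing from the slide, and reading each grid as (columns, rows) is likewise an assumption.

```python
def patch_origins(width, height, patch=16, stride=6):
    """Top-left corners of all patch x patch windows that fit inside the image."""
    return [(x, y)
            for y in range(0, height - patch + 1, stride)
            for x in range(0, width - patch + 1, stride)]

def spm_cell(cx, cy, width, height, grid):
    """Index of the SPM subregion containing point (cx, cy); grid = (cols, rows)."""
    cols, rows = grid
    col = min(int(cx * cols / width), cols - 1)
    row = min(int(cy * rows / height), rows - 1)
    return row * cols + col

SPM_GRIDS = [(1, 1), (2, 2), (4, 1)]  # subregion layouts from the slide

origins = patch_origins(100, 100)     # hypothetical 100 x 100 image
total_bins = sum(cols * rows for cols, rows in SPM_GRIDS)
last_x, last_y = origins[-1]
cell = spm_cell(last_x + 8, last_y + 8, 100, 100, (2, 2))  # use the patch center
print(len(origins), total_bins, cell)
```

With K codewords, pooling separately in each subregion gives a final feature of length K times the total number of subregions.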

24 Multi-Spectral Chute Data: 43 categories, 6740 objects, 6 channels (multispectral images), collected in 2016. Figure: the same object in different channels.

25 Cam-Trawl Fish-ID Results. Cam-trawl accuracy: 98.4% (10-fold cross validation) for 5 classes (Eulachon, Pollock, Rockfish, Salmon, Squid). Chute accuracy: 94.4% (6 channels), 93.8% (RGB channels), 10-fold cross validation for 43 classes. Gaoang Wang et al., "Shrinking Encoding with Two Level Codebook Learning for Fine-grained Fish Recognition," CVAUI workshop, IEEE ICPR

26 Chute Fish-ID Results

27 Chute Fish-ID with VGG-16 CNN. Chute accuracy: 93.0% (RGB channels), 10-fold cross validation. Data augmentation: 10 times the data (6740). Translation: random translation of 20% of image L/W. Rotation range: 5. Scale: scale change of +/- 20% of the image. Pixel mean subtraction: following ImageNet.
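
The augmentation settings can be expressed as a parameter sampler that draws one random transform per augmented copy. The sketch reads the translation as up to 20% of the image size in either direction and the rotation range as +/- 5 degrees; both readings, the image size, and the function name are assumptions, and applying the transform to pixels is left to an image library.

```python
import random

def sample_augmentation(rng, width, height):
    """Draw one random set of augmentation parameters for a width x height image."""
    return {
        "dx": rng.uniform(-0.2, 0.2) * width,   # horizontal shift in pixels
        "dy": rng.uniform(-0.2, 0.2) * height,  # vertical shift in pixels
        "angle": rng.uniform(-5.0, 5.0),        # rotation, assumed to be in degrees
        "scale": rng.uniform(0.8, 1.2),         # +/- 20% scale change
    }

rng = random.Random(0)  # fixed seed so the draw is repeatable
# Ten augmented copies per original image, matching the 10x factor on the slide.
params = [sample_augmentation(rng, 224, 224) for _ in range(10)]
print(params[0])
```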

28 Conclusion. Real-time sensing, communication, computing, and control are being realized everywhere: smart city, smart car, intelligent house, smart manufacturing, etc. Big data allow machine learning to be effective, possibly with real-time response, thanks to powerful communication and computing. Analyzing these big fishery visual data is a step toward a smart ocean.
