Single Object Tracking with Organic Optic Attenuation
- Kellie Jones
- 5 years ago
Transcription
1 Single Object Tracking with Organic Optic Attenuation
Ibraheem Saleh
Note: The demo GIFs have been removed because they made the presentation too large to upload to Blackboard (the remaining GIFs have been lossy-compressed).
Download the original presentation (970 MB) at: XeXYlCstDQDBPJJdU/view?usp=sharing
2 A Report on Hierarchical Attentive Recurrent Tracking
Adam Kosiorek et al.
Submitted for publication June 2017
GitHub page:
Research paper publication link:
Research paper link:
3 Problem: Class-Agnostic Object Tracking
Different from class-aware object tracking, which knows the type of object being tracked.
Many challenges:
- Not knowing relational data about the object ahead of time
- Varying lighting conditions
- Environment occlusion
- Subject rotation
- Camera movement
- Motion blur
- Etc.
Object Tracking!
4 Significance
Object tracking is one of the core problems in the field of computer vision.
- Self-driving cars
- Artificial intelligence
- Special effects rendering
- Security systems
- Detecting abandoned bags at airports
5 What is HART?
Inspired by the general architecture of the human visual cortex and the role of attention mechanisms, this work presents a biologically-inspired recurrent model for single object tracking in videos. [1]
Goals of HART:
- Fast
- Operate in real time
- Data driven
- Perform object tracking by mimicking human eye mechanisms
6 Biologically Inspired by Human Vision?!
7 Biology of Human Vision: Spatial Attention
When a human tracks an object, the eyes retrieve the world-image information and pass it to the primary visual cortex in the brain, which further subdivides the image processing into two pathways: the ventral stream and the dorsal stream.
The dorsal stream is responsible for determining where the object of interest is (spatial attention). It quickly takes in the images from the visual cortex and discards spatially irrelevant information.
Unlike many popular object tracking algorithms, humans don't process each image in full.
Figure legend: green = dorsal stream; purple = ventral stream.
8 Biology of Human Vision: Appearance Attention (Ventral Stream)
The ventral stream is responsible for determining what the object of interest is. This is where our brain learns the details about the object that we are actively looking at. What was the color of the subject's shirt? His pants? His cape?
Humans have limited processing capability: "Whenever more than one visual stimulus is present in the receptive field of a neuron, all the stimuli compete for computational resources due to the limited processing capacity." [1]
Figure legend: green = dorsal stream; purple = ventral stream.
9 Building the Model
[Architecture diagram; symbols: x_t, g_t, s_t, s̄_t, V_t, v_t, h_{t-1}, h_t, o_t, b_t, α_{t+1}, a_{t+1}]
Diagram labels:
- Input images; glimpse
- Primary visual cortex
- Determine where the object to track is: computes a foreground and background segmentation of the glimpse
- Spatial Bernoulli distribution: each value in s_t represents the probability of the tracked object occupying the corresponding location
- Extract appearance-based features
- Hadamard product of the features, giving the masked features
- Learn the object that we are tracking
- Compute the next attention area
- Compute the next appearance area
- Bounding box of object
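10 Understanding the Model
[Architecture diagram; symbols: x_t, g_t, s_t, s̄_t, V_t, v_t, h_{t-1}, h_t, o_t, b_t, α_{t+1}, a_{t+1}]
11 Understanding the Model: Spatial Attention
[Architecture diagram; symbols: x_t, g_t, s_t, s̄_t, V_t, v_t, h_{t-1}, h_t, o_t, b_t, α_{t+1}, a_{t+1}]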
12 Understanding the Model: Spatial Attention
Given an image, the spatial attention mechanism creates two matrices. Each matrix contains one Gaussian per row. The width and position of the Gaussians determine which parts of the image are extracted.
Initial glimpse bounds are specified externally from the model. Future changes in the stride and centers of the glimpses are taken from the output of the LSTM.
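The Gaussian attention described on this slide can be sketched in a few lines of plain Python. This is only an illustration under assumptions, not the authors' implementation: the helper names (`gaussian_rows`, `extract_glimpse`), the parameterization by center, stride and sigma, and the toy 8x8 image are all hypothetical. The glimpse is computed as A_y · image · A_xᵀ, where each row of A_y and A_x holds one normalized Gaussian.

```python
import math


def gaussian_rows(n_out, n_in, center, stride, sigma):
    """Build an n_out x n_in matrix with one normalized Gaussian per row.

    Row i is centered at center + (i - (n_out - 1) / 2) * stride.
    """
    rows = []
    for i in range(n_out):
        mu = center + (i - (n_out - 1) / 2.0) * stride
        weights = [math.exp(-0.5 * ((j - mu) / sigma) ** 2) for j in range(n_in)]
        total = sum(weights) or 1.0  # guard against an all-zero row
        rows.append([w / total for w in weights])
    return rows


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def extract_glimpse(image, out_h, out_w, cy, cx, stride_y, stride_x, sigma):
    """Extract an out_h x out_w glimpse as A_y @ image @ A_x^T."""
    a_y = gaussian_rows(out_h, len(image), cy, stride_y, sigma)
    a_x = gaussian_rows(out_w, len(image[0]), cx, stride_x, sigma)
    return matmul(matmul(a_y, image), transpose(a_x))


# Toy image whose pixel value encodes its position: value = 10 * row + col.
image = [[10 * r + c for c in range(8)] for r in range(8)]
glimpse = extract_glimpse(image, 2, 2, cy=3.5, cx=3.5,
                          stride_y=1.0, stride_x=1.0, sigma=0.1)
print(glimpse)
```

With a very small sigma each Gaussian collapses to nearly a single pixel, so the glimpse behaves like a crop; a larger sigma or stride blurs and subsamples the region instead.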
13 Understanding the Model: V1 & Ventral/Dorsal Streams
[Architecture diagram; symbols: x_t, g_t, s_t, s̄_t, V_t, v_t, h_{t-1}, h_t, o_t, b_t, α_{t+1}, a_{t+1}]
14 Understanding the Model: V1 & Ventral/Dorsal Streams
V1 (the primary visual cortex) is implemented as a CNN made up of a number of convolutional and max-pooling layers. Given a glimpse, its output is passed to both the dorsal and ventral streams.
The ventral stream is also a CNN and outputs feature maps that capture visual features.
The dorsal stream is implemented as a DFN (Dynamic Filter Network) [2]. Filters for a DFN are computed on the fly, conditioned on input features (as opposed to a traditional CNN, whose filters remain static after training).
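To make the "filters computed on the fly" idea concrete, here is a minimal sketch in plain Python. It is an assumption-laden toy rather than the DFN from [2]: the filter generator is a single hypothetical linear map (`generate_filter`), the context vector and weights are made up, and the filter is applied with a plain valid cross-correlation.

```python
def generate_filter(context, weights, k):
    """Hypothetical filter generator: a linear map from a context vector
    to a k x k kernel. `weights` has k * k rows, one per kernel entry."""
    flat = [sum(w * c for w, c in zip(row, context)) for row in weights]
    return [flat[i * k:(i + 1) * k] for i in range(k)]


def correlate2d_valid(feature_map, kernel):
    """Slide the kernel over the feature map without padding."""
    k = len(kernel)
    h, w = len(feature_map), len(feature_map[0])
    return [
        [
            sum(kernel[a][b] * feature_map[i + a][j + b]
                for a in range(k) for b in range(k))
            for j in range(w - k + 1)
        ]
        for i in range(h - k + 1)
    ]


# The generator weights stay fixed; only the context changes between calls.
weights = [[1.0 if i == 4 else 0.0, 1.0] for i in range(9)]
feature_map = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]

# Context [1, 0] yields a kernel that picks out the center pixel.
out_a = correlate2d_valid(feature_map, generate_filter([1.0, 0.0], weights, 3))
# Context [0, 1] yields an all-ones kernel that sums the whole window.
out_b = correlate2d_valid(feature_map, generate_filter([0.0, 1.0], weights, 3))
print(out_a, out_b)
```

The learned parameters here are the generator weights, not the kernel itself, which is why the same trained network can apply a different filter to every input.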
15 Understanding the Model: V1 & Ventral/Dorsal Streams
After processing through those networks, the model takes the Hadamard product of the outputs from the dorsal and ventral streams.
This imitates the distractor-suppressing behavior of the human brain.
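The masking step is an element-wise (Hadamard) product. The sketch below assumes, for illustration only, a single-channel 2x2 feature map from the ventral stream and a same-sized map of per-location object probabilities from the dorsal stream; the function name `hadamard` and the numbers are hypothetical.

```python
def hadamard(a, b):
    """Element-wise product of two equally shaped matrices (lists of rows)."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("hadamard() needs two matrices of the same shape")
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Appearance features (ventral stream) and object probabilities (dorsal stream).
features = [[0.9, 0.2], [0.4, 0.8]]
object_prob = [[1.0, 0.0], [0.5, 1.0]]

# Locations the dorsal stream considers background are suppressed.
masked = hadamard(features, object_prob)
print(masked)
```

Features at locations with probability near zero vanish from the masked output, which is one way to read the distractor suppression mentioned on the slide.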
16
17 Understanding the Model: LSTM and MLP
[Architecture diagram; symbols: x_t, g_t, s_t, s̄_t, V_t, v_t, h_{t-1}, h_t, o_t, b_t, α_{t+1}, a_{t+1}]
18 Understanding the Model: LSTM and MLP
Masked feature outputs are then fed into a Long Short-Term Memory (LSTM) network, a special type of RNN.
The output from the LSTM is used to compute the predicted attention and appearance for the next frame.
The LSTM is designed with the assumption that the motion of these objects is representable as a Markovian state, i.e., future states depend only on the current state and not on the frames or states before it.
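A single LSTM step can be written out in plain Python to show what "depends only on the current state" means in practice: the next hidden and cell state are a function of the current input and the previous state, and nothing else. This is a generic textbook LSTM cell, not the tracker's actual network; the parameter layout (`W_i`, `b_i`, and so on), the helper `zero_params`, and the linear readout standing in for the MLP are assumptions made for the example.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def affine(weights, bias, vec):
    """Compute W @ vec + b for a matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step: (x_t, h_{t-1}, c_{t-1}) -> (h_t, c_t)."""
    z = x + h_prev  # concatenate the input and the previous hidden state
    i = [sigmoid(v) for v in affine(params["W_i"], params["b_i"], z)]    # input gate
    f = [sigmoid(v) for v in affine(params["W_f"], params["b_f"], z)]    # forget gate
    o = [sigmoid(v) for v in affine(params["W_o"], params["b_o"], z)]    # output gate
    g = [math.tanh(v) for v in affine(params["W_g"], params["b_g"], z)]  # candidate
    c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c_prev, i, g)]
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c


def zero_params(n_in, n_hidden):
    """Hypothetical all-zero parameters, used only to keep the demo checkable."""
    params = {}
    for gate in "ifog":
        params["W_" + gate] = [[0.0] * (n_in + n_hidden) for _ in range(n_hidden)]
        params["b_" + gate] = [0.0] * n_hidden
    return params


params = zero_params(n_in=2, n_hidden=1)
h, c = lstm_step([0.3, -0.7], h_prev=[0.0], c_prev=[2.0], params=params)

# A linear readout (standing in for the MLP) maps h to next-frame parameters.
next_attention = affine([[1.0], [-1.0]], [0.0, 0.0], h)
print(h, c, next_attention)
```

With zero weights every gate equals 0.5 and the candidate is 0, so the cell state is simply halved; trained weights would let the gates decide how much of the past appearance and motion to keep.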
19 Understanding the Model LSTM and MLP Benefits of using an LSTM Can learn rotating and occluded objects on the fly!
20 Loss Functions
- HART loss function
- Tracking loss function
- Spatial attention loss function
- Appearance attention loss function
21 Experimentation: Dataset - KITTI
The Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) [1] dataset.
A collection of real-world pedestrian and traffic video footage taken from the perspective of a car.
Annotation, per the dataset authors: "we hired a set of annotators, to label 3D bounding boxes as tracklets in point clouds."
22 Kosiorek et al. Results
"by introducing a set of auxiliary losses we are able to scale to challenging real world data, outperforming predecessor attempts and approaching state-of-the-art performance." [1]
23 My Experimentation Simple Pedestrian
24 My Experimentation Simple Huge Truck
25 My Experimentation Distant Car
26 My Experimentation Distant Pedestrian
27 My Experimentation Car Turning
28 My Experimentation Apple Rolling
29 My Experimentation My Cat
30 My Experimentation Me On Screen to Off
31 My Experimentation Brother Walking
32 My Experimentation Me Driving on Street
33 My Thoughts
- It learns the appearance features too slowly, so the tracked object gets lost during fast rotational changes or lighting shifts (a problem with appearance attention).
- Once the object is out of frame momentarily, the tracker doesn't know how to search for it anymore (a problem with the spatial attention mechanism).
- Too hard to train!
34 Questions?
35 Primary References
[1] Adam R. Kosiorek, Alex Bewley, and Ingmar Posner. Hierarchical Attentive Recurrent Tracking. 5 Sept. 2017, arXiv: v2.
[2] Bert De Brabandere, Xu Jia, Tinne Tuytelaars, and Luc Van Gool. Dynamic Filter Networks. NIPS.
[3] Samira Ebrahimi Kahou, Vincent Michalski, and Roland Memisevic. RATM: Recurrent Attentive Tracking Model. CVPR Work., 2017.