Computer Vision: A Challenge in Artificial Intelligence
- Melvin Fitzgerald
Transcription
1 Computer Vision: A Challenge in Artificial Intelligence. Prof. Carsten Rother, Computer Vision Lab Dresden, Institute of Artificial Intelligence. 11/12/2013 Computer Vision a hard case for AI
2 Roadmap for this lecture
- A few more words on the history of AI and subareas of AI
- An introduction to Computer Vision: What is it? Why is it hard? How can we solve it? What can we do with it?
- Roadmap for the remaining lectures
4 From the first lecture
5 Going back to 1973: Sir James Lighthill's report to the British Parliament. "The general purpose robot is a mirage" (that is, a robot that can do everything is an illusion). Full report on YouTube:
6 Going back to 1973: Sir James Lighthill's report to the British Parliament. He specifically mentioned the problem of "combinatorial explosion" or "intractability", which implied that many of AI's most successful algorithms would grind to a halt on real-world problems and were only suitable for solving "toy" versions.
7 What do we have today? Personal conclusion: he is correct, we do not have the general purpose robot. AI research has split into many sub-areas and related areas: Machine Learning, Computer Vision, ... (more later).
- In some areas we are doing a very good job: Natural Language Processing (NLP), playing chess.
- Some areas turned out to be very hard: Robotics, and Computer Vision seems to be one of the hardest (a few success stories come later).
8 Scene understanding in the 70s [Sussman, Lamport, Guzman 1966] [Slide credits: Andrew Blake]
9 Scene understanding today: we are getting there, 40 years later [Xiao et al. NIPS 2012]
10 Today: Topics / Subareas in AI
- Applications: Natural Language Processing, Planning, Computer Vision, Robotics, Biology, Human-Computer Interaction
- Algorithms: Search, Discrete Optimization, Continuous Optimization, Probabilistic Inference, Learning
- Theory: Logic, Machine Learning, Probability Theory, Decision Theory, Automated Reasoning
- Models: Knowledge representation, Undirected graphical models, Directed graphical models, Unstructured models
AI overlaps with many other disciplines. There is not one unique, overarching theory. AI has impact in many domains. [derived from first lecture]
12 Books for the following lectures
- Artificial Intelligence: A Modern Approach, Russell, Norvig (Third Edition, English); we cover (parts of) sections 4, 5, 6
- Pattern Recognition and Machine Learning, Bishop. Springer
- Learning from Data: A Short Course, Abu-Mostafa, Magdon-Ismail, Hsuan-Tien Lin. AMLbook
- Markov Random Fields for Vision and Image Processing, Blake, Kohli, Rother. MIT Press 2011
- Computer Vision: Algorithms and Applications, Szeliski, Springer. An earlier version of the book is online:
13 Roadmap for this lecture
- A few more words on the history of AI and subareas of AI
- An introduction to Computer Vision: What is it? Why is it hard? How can we solve it? What can we do with it?
- Roadmap for the remaining lectures
14 What is Computer Vision? (Potential) Definition: Developing computational models and algorithms to interpret digital images and visual data in order to understand the visual world we live in. 11/12/2013 Computer Vision I: Introduction
16 What does it mean to understand?
- Physics-based vision: geometry, segmentation, camera parameters, emitted light (sun), surface properties (reflectance, material)
- Semantic-based vision: objects (class, pose), scene (e.g. outdoor), attributes/properties (old-fashioned train, A-on-top-of-B)
17 Image-formation model: very many sources of variability in an image [Slide credits: John Winn, ICML 2008]
18-27 Image-formation model, built up step by step for a street scene [Slide credits: John Winn, ICML 2008]
- Scene type (street scene)
- Scene geometry
- Object classes (Sky, Building 3, Road, Sidewalk, Bicycle, Tree 3, Car 5, Person 4, Bench, Bollard)
- Object position
- Object orientation
- Object shape
- Depth/occlusions
- Object appearance
- Illumination
- Shadows
- Motion blur
- Camera effects
28 The Scene Parsing challenge: a grand challenge of computer vision. From a single image, recover the (probabilistic) Script = {Camera, Light, Geometry, Material, Objects, Scene, Attributes, Others}.
29 Why is scene parsing hard? Computer Graphics goes from the rich 3D representation, Script = {Camera, Light, Geometry, Material, Objects, Scene, Attributes, Others}, to the 2D pixel representation. Computer Vision goes in the opposite direction, so it can be seen as inverse graphics.
30 Example of a recent work: from the input image to a scene graph [Gupta, Efros, Hebert, ECCV 10]
31 Example: General object recognition & segmentation. Good results [TextonBoost; Shotton et al. 06]
32 Example: General object recognition & segmentation. Failure cases [TextonBoost; Shotton et al. 06]
33 Comparison: CV to NLP
Natural Language Processing (real-time speech translation):
- Amount of input data: audiobooks have 2.2 words per second, i.e. ~20 letters per second
- Sound is 1D
- Strong rules (context-free grammars exist)
- Real-time speech translation exists, more or less
Computer Vision (scene understanding):
- Amount of input data: 10 Mpixel/second for a robot
- Images are 2D (much harder inference!)
- Rules/models are hard to define since images are so varied (see next lecture)
- Scene understanding is far from being solved: the best method is correct about 47% of the time for 20 object classes
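The input rates quoted in the comparison can be put side by side with a few lines of arithmetic. This is only a rough sketch that treats one letter and one pixel as one "input symbol" each, which is an assumption made here for illustration and not a claim from the lecture.

```python
# Input rates as quoted on the slide (the lecture's own rough figures).
letters_per_second = 20          # speech: ~2.2 words/s, i.e. ~20 letters/s
pixels_per_second = 10_000_000   # vision: ~10 Mpixel/s for a robot

# Assumption for illustration: one letter and one pixel each count as one symbol.
ratio = pixels_per_second / letters_per_second
print(f"Vision delivers about {ratio:,.0f} times more input symbols per second")
```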
35 What is computer Vision? (Potential) Definition: Developing computational models and algorithms to interpret digital images and visual data in order to understand the visual world we live in. 11/12/2013 Computer Vision I: Introduction 35
36 Visual data is everywhere. Visual data is dense, structured data. Real world: RGB photo/video cameras, mobile phones, depth cameras, laser scanners, robotics, medicine, microscopy, surveillance, cars, web search. Physics simulations.
37 How can we interpret visual data? Computer Graphics maps the rich 3D representation, Script = {Camera, Light, Geometry, Material, Objects, Scene, Attributes, Others}, to the 2D pixel representation; Computer Vision has to go back.
- What general (prior) knowledge of the world (not necessarily visual) can be exploited?
- What properties / cues from the image can be used?
Both aspects are quite well understood (a lot is based on physics), but how to use them efficiently is an open challenge (see later).
39 Prior knowledge (examples)
- Hard prior knowledge: trains do not fly in the air; objects are connected in 3D
- Soft prior knowledge: the camera is more likely 1.70 m above ground than 0.1 m; self-similarity: all black pixels belong to the same object
40 Prior knowledge that is harder to describe
- Describing image texture: zoom into a real image versus zoom into an image that is not real
- Microscopic images: what is the true shape of these objects?
41 The importance of prior knowledge: which patch is brighter, A or B? [Edward Adelson]
43 The importance of prior knowledge (figure with three panels, patches A and B marked in each)
- 2D image, local: what the computer sees
- 3D, ambient light, true colours in the 3D world: an unlikely 3D representation (hard to see for a human)
- 3D, direct light, true colours in the 3D world: the most likely 3D representation
This is what humans see implicitly. Ideally the computer sees the same.
44 The importance of prior knowledge (figure: light, 2D image, 3D representation)
- Humans do not see an image as a set of 2D pixels. They understand an image as a projection of the 3D world we live in.
- Humans have prior knowledge about the world encoded, such as: light casts shadows; objects do not fly in the air; a car is likely to move but a table is unlikely to move.
- We have to teach the computer this prior knowledge so that it understands 2D images as pictures of the 3D world.
45 Male or Female? 11/12/2013 Computer Vision I: Introduction 46
46 How can we interpret visual data? Computer Graphics maps the rich 3D representation, Script = {Camera, Light, Geometry, Material, Objects, Scene, Attributes, Others}, to the 2D pixel representation; Computer Vision has to go back.
- What general (prior) knowledge of the world (not necessarily visual) can be exploited?
- What properties / cues from the image can be used?
Both aspects are quite well understood (a lot is based on physics), but how to use them efficiently is an open challenge (see later).
47 Cue: Appearance (colour, texture) for object recognition. To which object does the patch belong?
48 Cue: Outlines (shape) for object recognition
49 Guess the object: colour, texture, shape [from John Winn, ICML 2008]
50-51 Cue: Context for object recognition
52 Cue: Stereo vision (2 frames) for geometry estimation. Ground truth versus algorithmic output
53 Cue: Multiple frames for geometry estimation
54 Cue: Shading & shadows for geometry and light estimation
55 Cue: Texture gradient for geometry estimation
56 The Scene Parsing challenge: a grand challenge of computer vision. From a single image, recover the (probabilistic) Script = {Camera, Light, Geometry, Material, Objects, Scene, Attributes, Others}. Many applications do not have to extract the full probabilistic script but only a subset, e.g. does the image contain a car? Many examples come later.
57 Many application scenarios are within reach. To simplify the problem:
1) Richer input: modern sensing technology, moving images, user involvement
2) Rich data to learn from: use the web, crowdsourcing to get labels (online games, Mechanical Turk), powerful graphics engines
58 Real-time pedestrian detection
59 Animate the world [Chen et al. UIST 12]
60 Example: Xbox people tracking
61 Example: people tracking (test data)
62 Body tracking and gesture recognition have many applications. Start-up 2012: try fashion online. Very large impact in many fields: Gaming, Robotics, HCI, Medicine, ...
63 Start-up company: Like.com
64 What is computer Vision? (Potential) Definition: Developing computational models and algorithms to interpret digital images and visual data in order to understand the visual world we live in. 11/12/2013 Computer Vision I: Introduction 67
65 Example: Image Segmentation. Input: image with user input. Output: a labelling $y \in \{0,1\}^n$, where typically $n$ is large (about 1M). Undirected graphical model with unary terms $\theta_i(y_i)$ on each label $y_i$ and pairwise terms $\theta_{ij}(y_i, y_j)$ between neighbouring labels $y_i$ and $y_j$. 11/12/2013 Introducing the Computer Vision Lab Dresden
66 Example: Image Segmentation with graphical models, $y \in \{0,1\}^n$, with unary terms $\theta_i(y_i)$ and pairwise terms $\theta_{ij}(y_i, y_j)$
- Modelling: how to formulate the graphical model, e.g. $P(y \mid \theta)$ (this is one of many tasks)
- Inference/Optimization: $y^* = \arg\max_y P(y \mid \theta)$ (this is one of many tasks)
- Learning: find optimal parameters $\theta$ (this is one of many tasks)
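To make the unary and pairwise terms concrete, here is a minimal sketch of a binary segmentation energy on a 4-connected grid, with inference done by exhaustive search over all labellings. Several things are assumptions made for illustration and are not taken from the slides: the energy is read as the negative log of $P(y \mid \theta)$ up to a constant, the pairwise term is a Potts-style penalty `pairwise_weight` for neighbouring pixels with different labels, and the 3x3 toy image and the function names are hypothetical. Exhaustive search only works for tiny $n$; for $n$ around 1M one needs the inference algorithms the lecture refers to.

```python
import itertools
import numpy as np

def energy(labels, unary, pairwise_weight):
    """Energy of a binary labelling on a 4-connected grid.

    unary[r, c, k] is the cost of giving pixel (r, c) the label k (theta_i).
    Neighbouring pixels with different labels pay pairwise_weight
    (a Potts-style theta_ij, assumed here for illustration).
    """
    r_idx, c_idx = np.indices(labels.shape)
    e = unary[r_idx, c_idx, labels].sum()
    e += pairwise_weight * np.sum(labels[:, 1:] != labels[:, :-1])  # horizontal edges
    e += pairwise_weight * np.sum(labels[1:, :] != labels[:-1, :])  # vertical edges
    return float(e)

def exact_map(unary, pairwise_weight):
    """Try all 2^n labellings and keep the one with the lowest energy."""
    rows, cols = unary.shape[:2]
    best_labels, best_energy = None, float("inf")
    for bits in itertools.product((0, 1), repeat=rows * cols):
        labels = np.array(bits).reshape(rows, cols)
        e = energy(labels, unary, pairwise_weight)
        if e < best_energy:
            best_labels, best_energy = labels, e
    return best_labels, best_energy

# Hypothetical 3x3 "image": values near 1 suggest foreground (label 1).
image = np.array([[0.9, 0.8, 0.1],
                  [0.7, 0.4, 0.2],
                  [0.6, 0.1, 0.0]])
# Cost of label 0 is the intensity, cost of label 1 is one minus the intensity.
unary = np.stack([image, 1.0 - image], axis=-1)

labels, best_energy = exact_map(unary, pairwise_weight=0.3)
print(labels)
print(best_energy)
```

With 9 pixels there are only 512 labellings to check; every extra pixel doubles that number, which is why modelling and inference are treated as separate questions.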
67 What is Learning?
- Training: given images and ground truth, a probabilistic model $P(y \mid \theta)$, and an error function that says how we compare results, find the weights $\theta$ (can be up to 10M parameters)
- Testing: inference, i.e. maximum probability: $y^* = \arg\max_y P(y \mid \theta)$
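The training and testing loop can be sketched on a toy problem. The slides do not say which learning method is used, so the following swaps in the simplest possible one: a grid search over a single smoothness weight, scored by the Hamming error against the ground truth. The 1D chain model, the training signal, the candidate weights, and all function names are hypothetical.

```python
import itertools
import numpy as np

def map_labels(signal, weight):
    """Inference: exact MAP labelling of a short 1D chain by exhaustive search."""
    best, best_energy = None, float("inf")
    for bits in itertools.product((0, 1), repeat=len(signal)):
        y = np.array(bits)
        unary = np.sum(np.where(y == 1, 1.0 - signal, signal))
        pairwise = weight * np.sum(y[1:] != y[:-1])
        if unary + pairwise < best_energy:
            best, best_energy = y, unary + pairwise
    return best

def hamming_error(prediction, truth):
    """Error function: number of wrongly labelled positions."""
    return int(np.sum(prediction != truth))

# Hypothetical training pair: a noisy signal and its ground-truth labelling.
train_signal = np.array([0.9, 0.8, 0.4, 0.9, 0.2, 0.1, 0.6, 0.1])
train_truth = np.array([1, 1, 1, 1, 0, 0, 0, 0])

# Training: pick the weight whose MAP labelling has the lowest error.
candidates = [0.0, 0.2, 0.4, 0.6]
errors = {w: hamming_error(map_labels(train_signal, w), train_truth)
          for w in candidates}
best_weight = min(errors, key=errors.get)
print(errors)
print(best_weight)

# Testing: run inference with the learned weight on new (hypothetical) data.
test_signal = np.array([0.1, 0.6, 0.2, 0.8, 0.9, 0.4, 0.9, 0.8])
print(map_labels(test_signal, best_weight))
```

A single weight can be searched on a grid; with millions of parameters, as mentioned on the slide, this is not an option and gradient-based or other learning procedures are needed.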
68 Model versus Inference (Algorithm). Input: image sequence [Data courtesy of Oliver Woodford]. Output: new view. Model: minimize a binary 4-connected undirected graphical model (choose a colour mode at each pixel) [Fitzgibbon et al. 03]
69 Another Example: Model versus Algorithm (figure: ground truth next to the results of four inference methods)
- Graph Cut with truncation [Rother et al. 05] (approximate solution)
- Belief Propagation (approximate solution)
- ICM, Simulated Annealing (approximate solution)
- QPBOP [Boros et al. 06; Rother et al. 07] (exact solution)
Why is the result not perfect? Model or inference?
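The difference between model and inference can be reproduced on a toy problem: the same energy is minimized once with ICM (iterated conditional modes, a local search) and once exactly. This is a minimal sketch under stated assumptions: the 1D chain energy, the signal, and the weight are hypothetical, and the exact solver is plain exhaustive search standing in for methods such as QPBOP, which are not implemented here.

```python
import itertools
import numpy as np

def energy(y, signal, weight):
    """Unary cost per position plus a penalty for each label change along the chain."""
    unary = np.sum(np.where(y == 1, 1.0 - signal, signal))
    return float(unary + weight * np.sum(y[1:] != y[:-1]))

def exact(signal, weight):
    """Exact minimizer by exhaustive search (feasible only for short chains)."""
    all_y = (np.array(bits) for bits in itertools.product((0, 1), repeat=len(signal)))
    return min(all_y, key=lambda y: energy(y, signal, weight))

def icm(signal, weight, max_sweeps=10):
    """ICM: start from the best unary labels, then greedily change one label at a time."""
    y = (signal > 0.5).astype(int)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(y)):
            costs = []
            for label in (0, 1):
                trial = y.copy()
                trial[i] = label
                costs.append(energy(trial, signal, weight))
            new_label = int(np.argmin(costs))
            if new_label != y[i]:
                y[i] = new_label
                changed = True
        if not changed:
            break
    return y

# Hypothetical signal: the two middle values are weak evidence for label 0.
signal = np.array([0.9, 0.9, 0.4, 0.4, 0.9, 0.9])
weight = 0.3

y_icm = icm(signal, weight)
y_exact = exact(signal, weight)
print(y_icm, energy(y_icm, signal, weight))
print(y_exact, energy(y_exact, signal, weight))
```

Here ICM stops at a labelling with higher energy than the exact minimum, because lowering the energy would require changing the two middle labels together. When a result looks wrong, it is therefore worth asking whether the energy itself (the model) or the optimizer (the inference) is to blame.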
70 Summary: The key questions for the upcoming lectures
- What is the modelling language: undirected / directed graphical models; unstructured models
- What does the model look like: what is the structure? What do the functions look like?
- Can we learn the model from data: learn the structure; learn the potential functions; probabilistic learning / discriminative learning
- How do we optimize the model (perform inference): fast, approximate; marginals; exactly solvable?
71-72 Is Machine Learning feasible? We are looking at a mapping from $X = \{0,1\}^3$ to $Y = \{0,1\}$. We are given 5 training data instances. What is the value for the remaining 3 data points? [example from the book Learning from Data; Abu-Mostafa et al.]
73 Is Machine Learning feasible? Let us look at all possible functions $f(x_1, x_2, x_3) = y$. In total there are $2^{2^3} = 256$ possible functions. With the training data fixed, 8 functions remain, one for each way of labelling the 3 unseen points. Without any information about $f$, every one of these solutions is equally good. We need information about $f$. [example from the book Learning from Data; Abu-Mostafa et al.]
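The two counts can be checked by enumeration. The transcript does not preserve the five training points, so the `training` table below is an assumption, filled in from memory of the example in Learning from Data. The counts 256 and 8 do not depend on which five points or labels are used.

```python
import itertools

# All 8 input points of X = {0,1}^3.
inputs = list(itertools.product((0, 1), repeat=3))

# Assumed training set (not preserved in the transcript): five labelled points.
training = {
    (0, 0, 0): 0,
    (0, 0, 1): 1,
    (0, 1, 0): 1,
    (0, 1, 1): 0,
    (1, 0, 0): 1,
}

# Every function f: X -> Y is one choice of output for each of the 8 inputs.
all_functions = [dict(zip(inputs, outputs))
                 for outputs in itertools.product((0, 1), repeat=len(inputs))]

# Keep the functions that agree with the training data on all five points.
consistent = [f for f in all_functions
              if all(f[x] == y for x, y in training.items())]

print(len(all_functions))  # 256
print(len(consistent))     # 8
```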
74 Is Machine Learning feasible? Assume $f$ is smooth in the 3D space $(x_1, x_2, x_3)$, i.e. it has few 0-1 transitions in Manhattan space (the neighbourhood is drawn by the lines of the cube). Figure: cubes with axes $x_1$, $x_2$, $x_3$ showing candidate functions with 6 transitions (optimal), 9 transitions (less good), and 12 transitions (worst). [example from the book Learning from Data; Abu-Mostafa et al.]
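The smoothness assumption can be turned into a selection rule: among all functions that fit the training data, prefer the one with the fewest 0-1 transitions along the 12 edges of the cube. The sketch below does this by enumeration. As before, the `training` table is an assumption (recalled from the book example, not preserved in the transcript), and the function names are hypothetical.

```python
import itertools

inputs = list(itertools.product((0, 1), repeat=3))

# Assumed training set (not preserved in the transcript).
training = {
    (0, 0, 0): 0,
    (0, 0, 1): 1,
    (0, 1, 0): 1,
    (0, 1, 1): 0,
    (1, 0, 0): 1,
}

def transitions(f):
    """Count cube edges (pairs differing in one coordinate) whose endpoints have different labels."""
    count = 0
    for x in inputs:
        for axis in range(3):
            if x[axis] == 0:  # visit each edge once, from its 0 end
                neighbour = x[:axis] + (1,) + x[axis + 1:]
                count += f[x] != f[neighbour]
    return count

# All functions that agree with the training data.
consistent = []
for outputs in itertools.product((0, 1), repeat=len(inputs)):
    f = dict(zip(inputs, outputs))
    if all(f[x] == y for x, y in training.items()):
        consistent.append(f)

counts = sorted(transitions(f) for f in consistent)
smoothest = min(consistent, key=transitions)
unseen = [x for x in inputs if x not in training]

print(counts)
print({x: smoothest[x] for x in unseen})
```

Under the assumed training data, the smallest and largest counts agree with the 6 and 12 transitions quoted on the slide. The prior does the work here: the data alone leaves 8 candidates, and the smoothness assumption singles out one of them.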
75 Roadmap for this lecture A few more words on history of AI and subareas of AI An introduction to Computer Vision What is it? Why is it hard? How can we solve it? What can we do with it? Roadmap for the remaining lecture 11/12/2013 Computer Vision a hard case for AI 78
76 Roadmap for next lectures
- (1): Computer Vision a hard case for AI
- (2): Introduction to probability theory
- (1): Exercise: probability theory
- (2): Unstructured models: Decision theory
- 8.1 (1): Unstructured models: Probabilistic Learning
- 8.1 (2): Unstructured models: Discriminative Learning Intro
- 15.1 (1): Exercise: Learning
- 15.1 (2): Unstructured models: Discriminative Learning
Lecturers: Carsten Rother and Dimitri Schlesinger
77 Roadmap for next lectures
- 22.1 (1): Undirected Graphical models: Models and Inference
- 22.1 (2): Undirected Graphical models: Models and Inference
- 29.1 (1): Exercise: Learning
- 29.1 (2): Undirected Graphical models: Learning
- 5.2 (1): Directed Graphical models
- 5.2 (2): Wrap up; Putting theory to practice
Lecturers: Carsten Rother and Dimitri Schlesinger
78 Related Lectures in Master / Bachelor / Diploma
- Computer Vision 1: Algorithms and Applications (winter term; 2+2)
- Machine Learning (winter term; 2+2)
- Computer Vision 2: Models, Inference, and Learning (summer term; 4+2)
- Many seminars and practical sessions
- Topics for Bachelor, Master, Diploma theses
Computer Vision I - Introduction
Computer Vision I - Introduction Carsten Rother 21/10/2014 Computer Vision I:Introduction Computer Vision I: Introduction 21/10/2014 2 Admin Stuff Language: German/English; Slides: English (all the terminology
More informationComputer Vision I - Algorithms and Applications: Introduction
Computer Vision I - Algorithms and Applications: Introduction Carsten Rother 22/10/2013 Computer Vision I:Introduction Admin Stuff Computer Vision I: Introduction 22/10/2013 2 Language: German/English;
More informationSegmentation. Bottom up Segmentation Semantic Segmentation
Segmentation Bottom up Segmentation Semantic Segmentation Semantic Labeling of Street Scenes Ground Truth Labels 11 classes, almost all occur simultaneously, large changes in viewpoint, scale sky, road,
More informationAnalysis: TextonBoost and Semantic Texton Forests. Daniel Munoz Februrary 9, 2009
Analysis: TextonBoost and Semantic Texton Forests Daniel Munoz 16-721 Februrary 9, 2009 Papers [shotton-eccv-06] J. Shotton, J. Winn, C. Rother, A. Criminisi, TextonBoost: Joint Appearance, Shape and Context
More informationShadows in the graphics pipeline
Shadows in the graphics pipeline Steve Marschner Cornell University CS 569 Spring 2008, 19 February There are a number of visual cues that help let the viewer know about the 3D relationships between objects
More informationQuadratic Pseudo-Boolean Optimization(QPBO): Theory and Applications At-a-Glance
Quadratic Pseudo-Boolean Optimization(QPBO): Theory and Applications At-a-Glance Presented By: Ahmad Al-Kabbany Under the Supervision of: Prof.Eric Dubois 12 June 2012 Outline Introduction The limitations
More informationData-driven Depth Inference from a Single Still Image
Data-driven Depth Inference from a Single Still Image Kyunghee Kim Computer Science Department Stanford University kyunghee.kim@stanford.edu Abstract Given an indoor image, how to recover its depth information
More informationWhat is Computer Vision? Introduction. We all make mistakes. Why is this hard? What was happening. What do you see? Intro Computer Vision
What is Computer Vision? Trucco and Verri (Text): Computing properties of the 3-D world from one or more digital images Introduction Introduction to Computer Vision CSE 152 Lecture 1 Sockman and Shapiro:
More informationIntroduction to Computer Vision. Srikumar Ramalingam School of Computing University of Utah
Introduction to Computer Vision Srikumar Ramalingam School of Computing University of Utah srikumar@cs.utah.edu Course Website http://www.eng.utah.edu/~cs6320/ What is computer vision? Light source 3D
More informationCOMP 102: Computers and Computing
COMP 102: Computers and Computing Lecture 23: Computer Vision Instructor: Kaleem Siddiqi (siddiqi@cim.mcgill.ca) Class web page: www.cim.mcgill.ca/~siddiqi/102.html What is computer vision? Broadly speaking,
More informationMaking Machines See. Roberto Cipolla Department of Engineering. Research team
Making Machines See Roberto Cipolla Department of Engineering Research team http://www.eng.cam.ac.uk/~cipolla/people.html Cognitive Systems Engineering Cognitive Systems Engineering Introduction Making
More informationLecture 19: Depth Cameras. Visual Computing Systems CMU , Fall 2013
Lecture 19: Depth Cameras Visual Computing Systems Continuing theme: computational photography Cameras capture light, then extensive processing produces the desired image Today: - Capturing scene depth
More information(Sample) Final Exam with brief answers
Name: Perm #: (Sample) Final Exam with brief answers CS/ECE 181B Intro to Computer Vision March 24, 2017 noon 3:00 pm This is a closed-book test. There are also a few pages of equations, etc. included
More informationWhy study Computer Vision?
Computer Vision Why study Computer Vision? Images and movies are everywhere Fast-growing collection of useful applications building representations of the 3D world from pictures automated surveillance
More informationMarkov Random Fields and Segmentation with Graph Cuts
Markov Random Fields and Segmentation with Graph Cuts Computer Vision Jia-Bin Huang, Virginia Tech Many slides from D. Hoiem Administrative stuffs Final project Proposal due Oct 27 (Thursday) HW 4 is out
More informationELL 788 Computational Perception & Cognition July November 2015
ELL 788 Computational Perception & Cognition July November 2015 Module 6 Role of context in object detection Objects and cognition Ambiguous objects Unfavorable viewing condition Context helps in object
More informationWhy is computer vision difficult?
Why is computer vision difficult? Viewpoint variation Illumination Scale Why is computer vision difficult? Intra-class variation Motion (Source: S. Lazebnik) Background clutter Occlusion Challenges: local
More informationObject Recognition. Lecture 11, April 21 st, Lexing Xie. EE4830 Digital Image Processing
Object Recognition Lecture 11, April 21 st, 2008 Lexing Xie EE4830 Digital Image Processing http://www.ee.columbia.edu/~xlx/ee4830/ 1 Announcements 2 HW#5 due today HW#6 last HW of the semester Due May
More informationJamie Shotton, Andrew Fitzgibbon, Mat Cook, Toby Sharp, Mark Finocchio, Richard Moore, Alex Kipman, Andrew Blake CVPR 2011
Jamie Shotton, Andrew Fitzgibbon, Mat Cook, Toby Sharp, Mark Finocchio, Richard Moore, Alex Kipman, Andrew Blake CVPR 2011 Auto-initialize a tracking algorithm & recover from failures All human poses,
More informationDiscrete Optimization of Ray Potentials for Semantic 3D Reconstruction
Discrete Optimization of Ray Potentials for Semantic 3D Reconstruction Marc Pollefeys Joined work with Nikolay Savinov, Christian Haene, Lubor Ladicky 2 Comparison to Volumetric Fusion Higher-order ray
More informationComputer Vision. Introduction
Computer Vision Introduction Filippo Bergamasco (filippo.bergamasco@unive.it) http://www.dais.unive.it/~bergamasco DAIS, Ca Foscari University of Venice Academic year 2016/2017 About this course Official
More informationFinally: Motion and tracking. Motion 4/20/2011. CS 376 Lecture 24 Motion 1. Video. Uses of motion. Motion parallax. Motion field
Finally: Motion and tracking Tracking objects, video analysis, low level motion Motion Wed, April 20 Kristen Grauman UT-Austin Many slides adapted from S. Seitz, R. Szeliski, M. Pollefeys, and S. Lazebnik
More information12/3/2009. What is Computer Vision? Applications. Application: Assisted driving Pedestrian and car detection. Application: Improving online search
Introduction to Artificial Intelligence V22.0472-001 Fall 2009 Lecture 26: Computer Vision Rob Fergus Dept of Computer Science, Courant Institute, NYU Slides from Andrew Zisserman What is Computer Vision?
More informationImage Segmentation continued Graph Based Methods. Some slides: courtesy of O. Capms, Penn State, J.Ponce and D. Fortsyth, Computer Vision Book
Image Segmentation continued Graph Based Methods Some slides: courtesy of O. Capms, Penn State, J.Ponce and D. Fortsyth, Computer Vision Book Previously Binary segmentation Segmentation by thresholding
More informationHuman Body Recognition and Tracking: How the Kinect Works. Kinect RGB-D Camera. What the Kinect Does. How Kinect Works: Overview
Human Body Recognition and Tracking: How the Kinect Works Kinect RGB-D Camera Microsoft Kinect (Nov. 2010) Color video camera + laser-projected IR dot pattern + IR camera $120 (April 2012) Kinect 1.5 due
More informationUndirected Graphical Models. Raul Queiroz Feitosa
Undirected Graphical Models Raul Queiroz Feitosa Pros and Cons Advantages of UGMs over DGMs UGMs are more natural for some domains (e.g. context-dependent entities) Discriminative UGMs (CRF) are better
More informationDepth. Common Classification Tasks. Example: AlexNet. Another Example: Inception. Another Example: Inception. Depth
Common Classification Tasks Recognition of individual objects/faces Analyze object-specific features (e.g., key points) Train with images from different viewing angles Recognition of object classes Analyze
More informationTopics to be Covered in the Rest of the Semester. CSci 4968 and 6270 Computational Vision Lecture 15 Overview of Remainder of the Semester
Topics to be Covered in the Rest of the Semester CSci 4968 and 6270 Computational Vision Lecture 15 Overview of Remainder of the Semester Charles Stewart Department of Computer Science Rensselaer Polytechnic
More informationThanks to Chris Bregler. COS 429: Computer Vision
Thanks to Chris Bregler COS 429: Computer Vision COS 429: Computer Vision Instructor: Thomas Funkhouser funk@cs.princeton.edu Preceptors: Ohad Fried, Xinyi Fan {ohad,xinyi}@cs.princeton.edu Web page: http://www.cs.princeton.edu/courses/archive/fall13/cos429/
More informationOptimizing Monocular Cues for Depth Estimation from Indoor Images
Optimizing Monocular Cues for Depth Estimation from Indoor Images Aditya Venkatraman 1, Sheetal Mahadik 2 1, 2 Department of Electronics and Telecommunication, ST Francis Institute of Technology, Mumbai,
More informationLocal cues and global constraints in image understanding
Local cues and global constraints in image understanding Olga Barinova Lomonosov Moscow State University *Many slides adopted from the courses of Anton Konushin Image understanding «To see means to know
More informationA Simple Vision System
Chapter 1 A Simple Vision System 1.1 Introduction In 1966, Seymour Papert wrote a proposal for building a vision system as a summer project [4]. The abstract of the proposal starts stating a simple goal:
More informationComputer Vision: Making machines see
Computer Vision: Making machines see Roberto Cipolla Department of Engineering http://www.eng.cam.ac.uk/~cipolla/people.html http://www.toshiba.eu/eu/cambridge-research- Laboratory/ Vision: what is where
More informationCS4495/6495 Introduction to Computer Vision. 1A-L1 Introduction
CS4495/6495 Introduction to Computer Vision 1A-L1 Introduction Outline What is computer vision? State of the art Why is this hard? Course overview Software Why study Computer Vision? Images (and movies)
More informationInteractive segmentation, Combinatorial optimization. Filip Malmberg
Interactive segmentation, Combinatorial optimization Filip Malmberg But first... Implementing graph-based algorithms Even if we have formulated an algorithm on a general graphs, we do not neccesarily have
More informationTri-modal Human Body Segmentation
Tri-modal Human Body Segmentation Master of Science Thesis Cristina Palmero Cantariño Advisor: Sergio Escalera Guerrero February 6, 2014 Outline 1 Introduction 2 Tri-modal dataset 3 Proposed baseline 4
More informationEstimating Human Pose in Images. Navraj Singh December 11, 2009
Estimating Human Pose in Images Navraj Singh December 11, 2009 Introduction This project attempts to improve the performance of an existing method of estimating the pose of humans in still images. Tasks
More informationEECS 442 Computer Vision fall 2011
EECS 442 Computer Vision fall 2011 Instructor Silvio Savarese silvio@eecs.umich.edu Office: ECE Building, room: 4435 Office hour: Tues 4:30-5:30pm or under appoint. (after conversation hour) GSIs: Mohit
More informationThe Kinect Sensor. Luís Carriço FCUL 2014/15
Advanced Interaction Techniques The Kinect Sensor Luís Carriço FCUL 2014/15 Sources: MS Kinect for Xbox 360 John C. Tang. Using Kinect to explore NUI, Ms Research, From Stanford CS247 Shotton et al. Real-Time
Digital Images. Kyungim Baek. Department of Information and Computer Sciences. ICS 101 (November 1, 2016) Digital Images 1
Digital Images Kyungim Baek Department of Information and Computer Sciences ICS 101 (November 1, 2016) Digital Images 1 iclicker Question I know a lot about how digital images are represented, stored,
Introduction to SLAM Part II. Paul Robertson
Introduction to SLAM Part II Paul Robertson Localization Review Tracking, Global Localization, Kidnapping Problem. Kalman Filter Quadratic Linear (unless EKF) SLAM Loop closing Scaling: Partition space
Combining Appearance and Structure from Motion Features for Road Scene Understanding
STURGESS et al.: COMBINING APPEARANCE AND SFM FEATURES 1 Combining Appearance and Structure from Motion Features for Road Scene Understanding Paul Sturgess paul.sturgess@brookes.ac.uk Karteek Alahari karteek.alahari@brookes.ac.uk
Markov Networks in Computer Vision
Markov Networks in Computer Vision Sargur Srihari srihari@cedar.buffalo.edu 1 Markov Networks for Computer Vision Some applications: 1. Image segmentation 2. Removal of blur/noise 3. Stereo reconstruction
Announcements. Introduction. Why is this hard? What is Computer Vision? We all make mistakes. What do you see? Class Web Page is up:
Announcements Introduction Computer Vision I CSE 252A Lecture 1 Class Web Page is up: http://www.cs.ucsd.edu/classes/wi05/cse252a/ Assignment 0: Getting Started with Matlab is posted to web page, due 1/13/04
Markov Networks in Computer Vision. Sargur Srihari
Markov Networks in Computer Vision Sargur Srihari srihari@cedar.buffalo.edu 1 Markov Networks for Computer Vision Important application area for MNs 1. Image segmentation 2. Removal of blur/noise 3. Stereo reconstruction
Computer Vision at Cambridge: Reconstruction, Registration and Recognition
Computer Vision at Cambridge: Reconstruction, Registration and Recognition Roberto Cipolla Research team http://www.eng.cam.ac.uk/~cipolla/people.html Cognitive Systems Engineering Cognitive Systems Engineering
CS 534: Computer Vision Segmentation and Perceptual Grouping
CS 534: Computer Vision Segmentation and Perceptual Grouping Spring 2005 Ahmed Elgammal Dept of Computer Science CS 534 Segmentation - 1 Where are we? Image Formation Human vision Cameras Geometric Camera
Segmentation by Clustering Reading: Chapter 14 (skip 14.5)
Segmentation by Clustering Reading: Chapter 14 (skip 14.5) Data reduction - obtain a compact representation for interesting image data in terms of a set of components Find components that belong together
What have we learned so far?
What have we learned so far? Camera structure Eye structure Project 1: High Dynamic Range Imaging What have we learned so far? Image Filtering Image Warping Camera Projection Model Project 2: Panoramic
Segmentation by Clustering. Segmentation by Clustering Reading: Chapter 14 (skip 14.5) General ideas
Reading: Chapter 14 (skip 14.5) Data reduction - obtain a compact representation for interesting image data in terms of a set of components Find components that belong together (form clusters) Frame differencing
Soft shadows. Steve Marschner Cornell University CS 569 Spring 2008, 21 February
Soft shadows Steve Marschner Cornell University CS 569 Spring 2008, 21 February Soft shadows are what we normally see in the real world. If you are near a bare halogen bulb, a stage spotlight, or other
Feature Tracking and Optical Flow
Feature Tracking and Optical Flow Prof. D. Stricker Doz. G. Bleser Many slides adapted from James Hays, Derek Hoiem, Lana Lazebnik, Silvio Savarese, who in turn adapted slides from Steve Seitz, Rick Szeliski,
MRFs and Segmentation with Graph Cuts
02/24/10 MRFs and Segmentation with Graph Cuts Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem Today's class Finish up EM MRFs Segmentation with Graph Cuts EM Algorithm: Recap
Recap from Previous Lecture
Recap from Previous Lecture Tone Mapping Preserve local contrast or detail at the expense of large scale contrast. Changing the brightness within objects or surfaces unequally leads to halos. We are now
Spatial Latent Dirichlet Allocation
Spatial Latent Dirichlet Allocation Xiaogang Wang and Eric Grimson Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
Analysis of Image and Video Using Color, Texture and Shape Features for Object Identification
IOSR Journal of Computer Engineering (IOSR-JCE) e-ISSN: 2278-0661, p-ISSN: 2278-8727, Volume 16, Issue 6, Ver. VI (Nov-Dec. 2014), PP 29-33 Analysis of Image and Video Using Color, Texture and Shape Features
Feature Tracking and Optical Flow
Feature Tracking and Optical Flow Prof. D. Stricker Doz. G. Bleser Many slides adapted from James Hays, Derek Hoiem, Lana Lazebnik, Silvio Savarese, who in turn adapted slides from Steve Seitz, Rick Szeliski,
STRUCTURAL EDGE LEARNING FOR 3-D RECONSTRUCTION FROM A SINGLE STILL IMAGE. Nan Hu. Stanford University Electrical Engineering
STRUCTURAL EDGE LEARNING FOR 3-D RECONSTRUCTION FROM A SINGLE STILL IMAGE Nan Hu Stanford University Electrical Engineering nanhu@stanford.edu ABSTRACT Learning 3-D scene structure from a single still
Supervised texture detection in images
Supervised texture detection in images Branislav Mičušík and Allan Hanbury Pattern Recognition and Image Processing Group, Institute of Computer Aided Automation, Vienna University of Technology Favoritenstraße
Computer Vision I - Basics of Image Processing Part 1
Computer Vision I - Basics of Image Processing Part 1 Carsten Rother 28/10/2014 Computer Vision I: Basics of Image Processing Link to lectures Computer Vision I: Basics of Image Processing 28/10/2014 2
Colorado School of Mines. Computer Vision. Professor William Hoff Dept of Electrical Engineering & Computer Science.
Professor William Hoff Dept of Electrical Engineering & Computer Science http://inside.mines.edu/~whoff/ Introduction to Computer Vision What is Computer Vision? A process that produces from images of the external world a description
A Survey of Light Source Detection Methods
A Survey of Light Source Detection Methods Nathan Funk University of Alberta Mini-Project for CMPUT 603 November 30, 2003 Abstract This paper provides an overview of the most prominent techniques for light
Conditional Random Fields for Object Recognition
Conditional Random Fields for Object Recognition Ariadna Quattoni Michael Collins Trevor Darrell MIT Computer Science and Artificial Intelligence Laboratory Cambridge, MA 02139 {ariadna, mcollins, trevor}@csail.mit.edu
Combinatorial optimization and its applications in image Processing. Filip Malmberg
Combinatorial optimization and its applications in image Processing Filip Malmberg Part 1: Optimization in image processing Optimization in image processing Many image processing problems can be formulated
ME/CS 132: Introduction to Vision-based Robot Navigation. Low-level Image Processing. Larry Matthies
ME/CS 132: Introduction to Vision-based Robot Navigation Low-level Image Processing Larry Matthies lhm@jpl.nasa.gov, 818-354-3722 Announcements First homework grading is done! Second homework is due
Computer Vision. I-Chen Lin, Assistant Professor Dept. of CS, National Chiao Tung University
Computer Vision I-Chen Lin, Assistant Professor Dept. of CS, National Chiao Tung University About the course Course title: Computer Vision Lectures: EC016, 10:10~12:00(Tues.); 15:30~16:20(Thurs.) Pre-requisites:
Other Reconstruction Techniques
Other Reconstruction Techniques Ruigang Yang CS 684 CS 684 Spring 2004 1 Taxonomy of Range Sensing From Brian Curless, SIGGRAPH 00 Lecture notes CS 684 Spring 2004 2 Taxonomy of Range Scanning (cont.)
Image Analysis Lecture Segmentation. Idar Dyrdal
Image Analysis Lecture 9.1 - Segmentation Idar Dyrdal Segmentation Image segmentation is the process of partitioning a digital image into multiple parts The goal is to divide the image into meaningful
Simultaneous Multi-class Pixel Labeling over Coherent Image Sets
Simultaneous Multi-class Pixel Labeling over Coherent Image Sets Paul Rivera Research School of Computer Science Australian National University Canberra, ACT 0200 Stephen Gould Research School of Computer
Automatic Dense Semantic Mapping From Visual Street-level Imagery
Automatic Dense Semantic Mapping From Visual Street-level Imagery Sunando Sengupta [1], Paul Sturgess [1], Lubor Ladicky [2], Phillip H.S. Torr [1] [1] Oxford Brookes University [2] Visual Geometry Group,
Data Term. Michael Bleyer LVA Stereo Vision
Data Term Michael Bleyer LVA Stereo Vision What happened last time? We have looked at our energy function: $E(D) = \sum_{p \in I} m(p, d_p) + \sum_{\langle p, q \rangle \in N} s(p, q)$ We have learned about an optimization algorithm that
Lecture 24: More on Reflectance CAP 5415
Lecture 24: More on Reflectance CAP 5415 Recovering Shape We've talked about photometric stereo, where we assumed that a surface was diffuse Could calculate surface normals and albedo What if the surface
Learning 6D Object Pose Estimation and Tracking
Learning 6D Object Pose Estimation and Tracking Carsten Rother presented by: Alexander Krull 21/12/2015 6D Pose Estimation Input: RGBD-image Known 3D model Output: 6D rigid body transform of object 21/12/2015
Prof. Feng Liu. Spring /17/2017. With slides by F. Durand, Y.Y. Chuang, R. Raskar, and C.
Prof. Feng Liu Spring 2017 http://www.cs.pdx.edu/~fliu/courses/cs510/ 05/17/2017 With slides by F. Durand, Y.Y. Chuang, R. Raskar, and C. Rother Last Time Image segmentation Normalized cut and segmentation
Notes 9: Optical Flow
Course 049064: Variational Methods in Image Processing Notes 9: Optical Flow Guy Gilboa 1 Basic Model 1.1 Background Optical flow is a fundamental problem in computer vision. The general goal is to find
Graph-Based Superpixel Labeling for Enhancement of Online Video Segmentation
Graph-Based Superpixel Labeling for Enhancement of Online Video Segmentation Alaa E. Abdel-Hakim Electrical Engineering Department Assiut University Assiut, Egypt alaa.aly@eng.au.edu.eg Mostafa Izz Cairo
Lecture 10: Multi view geometry
Lecture 10: Multi view geometry Professor Fei Fei Li Stanford Vision Lab 1 What we will learn today? Stereo vision Correspondence problem (Problem Set 2 (Q3)) Active stereo vision systems Structure from
Computer Vision I - Filtering and Feature detection
Computer Vision I - Filtering and Feature detection Carsten Rother 30/10/2015 Computer Vision I: Basics of Image Processing Roadmap: Basics of Digital Image Processing Computer Vision I: Basics of Image
Object Recognition Using Pictorial Structures. Daniel Huttenlocher Computer Science Department. In This Talk. Object recognition in computer vision
Object Recognition Using Pictorial Structures Daniel Huttenlocher Computer Science Department Joint work with Pedro Felzenszwalb, MIT AI Lab In This Talk Object recognition in computer vision Brief definition
Kinect Device. How the Kinect Works. Kinect Device. What the Kinect does 4/27/16. Subhransu Maji Slides credit: Derek Hoiem, University of Illinois
4/27/16 Kinect Device How the Kinect Works T2 Subhransu Maji Slides credit: Derek Hoiem, University of Illinois Photo frame-grabbed from: http://www.blisteredthumbs.net/2010/11/dance-central-angry-review
3D Scanning. Qixing Huang Feb. 9 th Slide Credit: Yasutaka Furukawa
3D Scanning Qixing Huang Feb. 9 th 2017 Slide Credit: Yasutaka Furukawa Geometry Reconstruction Pipeline This Lecture Depth Sensing ICP for Pair-wise Alignment Next Lecture Global Alignment Pairwise Multiple
Automatic Tracking of Moving Objects in Video for Surveillance Applications
Automatic Tracking of Moving Objects in Video for Surveillance Applications Manjunath Narayana Committee: Dr. Donna Haverkamp (Chair) Dr. Arvin Agah Dr. James Miller Department of Electrical Engineering
Contexts and 3D Scenes
Contexts and 3D Scenes Computer Vision Jia-Bin Huang, Virginia Tech Many slides from D. Hoiem Administrative stuffs Final project presentation Nov 30 th 3:30 PM 4:45 PM Grading Three senior graders (30%)
Optical flow and tracking
EECS 442 Computer vision Optical flow and tracking Intro Optical flow and feature tracking Lucas-Kanade algorithm Motion segmentation Segments of this lecture are courtesy of Profs. S. Lazebnik, S. Seitz,
Robot vision review. Martin Jagersand
Robot vision review Martin Jagersand What is Computer Vision? Computer Graphics Three Related fields Image Processing: Changes 2D images into other 2D images Computer Graphics: Takes 3D models, renders
Waleed Pervaiz CSE 352
Waleed Pervaiz CSE 352 Computer Vision is the technology that enables machines to see and obtain information from digital images. It is seen as an integral part of AI in fields such as pattern recognition
DEPTH AND GEOMETRY FROM A SINGLE 2D IMAGE USING TRIANGULATION
2012 IEEE International Conference on Multimedia and Expo Workshops DEPTH AND GEOMETRY FROM A SINGLE 2D IMAGE USING TRIANGULATION Yasir Salih and Aamir S. Malik, Senior Member IEEE Centre for Intelligent
Multi-view stereo. Many slides adapted from S. Seitz
Multi-view stereo Many slides adapted from S. Seitz Beyond two-view stereo The third eye can be used for verification Multiple-baseline stereo Pick a reference image, and slide the corresponding window
Why study Computer Vision?
Why study Computer Vision? Images and movies are everywhere Fast-growing collection of useful applications building representations of the 3D world from pictures automated surveillance (who's doing what)
Augmented Reality, Advanced SLAM, Applications
Augmented Reality, Advanced SLAM, Applications Prof. Didier Stricker & Dr. Alain Pagani alain.pagani@dfki.de Lecture 3D Computer Vision AR, SLAM, Applications 1 Introduction Previous lectures: Basics (camera,
Joint Inference in Image Databases via Dense Correspondence. Michael Rubinstein MIT CSAIL (while interning at Microsoft Research)
Joint Inference in Image Databases via Dense Correspondence Michael Rubinstein MIT CSAIL (while interning at Microsoft Research) My work Throughout the year (and my PhD thesis): Temporal Video Analysis
Stereo. 11/02/2012 CS129, Brown James Hays. Slides by Kristen Grauman
Stereo 11/02/2012 CS129, Brown James Hays Slides by Kristen Grauman Multiple views Multi-view geometry, matching, invariant features, stereo vision Lowe Hartley and Zisserman Why multiple views? Structure
Computer Vision I - Appearance-based Matching and Projective Geometry
Computer Vision I - Appearance-based Matching and Projective Geometry Carsten Rother 01/11/2016 Computer Vision I: Image Formation Process Roadmap for next four lectures Computer Vision I: Image Formation
6.801/866. Segmentation and Line Fitting. T. Darrell
6.801/866 Segmentation and Line Fitting T. Darrell Segmentation and Line Fitting Gestalt grouping Background subtraction K-Means Graph cuts Hough transform Iterative fitting (Next time: Probabilistic segmentation)
Last update: May 4, Vision. CMSC 421: Chapter 24. CMSC 421: Chapter 24 1
Last update: May 4, 200 Vision CMSC 421: Chapter 24 CMSC 421: Chapter 24 Outline Perception generally Image formation Early vision 2D D Object recognition CMSC 421: Chapter 24 2 Perception generally Stimulus
Lecture 16 Segmentation and Scene understanding
Lecture 16 Segmentation and Scene understanding Introduction Mean-shift Graph-based segmentation Top-down segmentation Silvio Savarese Lecture 15 - 3-Mar-14 Segmentation Silvio Savarese Lecture 15
3D Spatial Layout Propagation in a Video Sequence
3D Spatial Layout Propagation in a Video Sequence Alejandro Rituerto 1, Roberto Manduchi 2, Ana C. Murillo 1 and J. J. Guerrero 1 arituerto@unizar.es, manduchi@soe.ucsc.edu, acm@unizar.es, and josechu.guerrero@unizar.es
08 An Introduction to Dense Continuous Robotic Mapping
NAVARCH/EECS 568, ROB 530 - Winter 2018 08 An Introduction to Dense Continuous Robotic Mapping Maani Ghaffari March 14, 2018 Previously: Occupancy Grid Maps Pose SLAM graph and its associated dense occupancy
Can Similar Scenes help Surface Layout Estimation?
Can Similar Scenes help Surface Layout Estimation? Santosh K. Divvala, Alexei A. Efros, Martial Hebert Robotics Institute, Carnegie Mellon University. {santosh,efros,hebert}@cs.cmu.edu Abstract We describe
Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009
Learning and Inferring Depth from Monocular Images Jiyan Pan April 1, 2009 Traditional ways of inferring depth Binocular disparity Structure from motion Defocus Given a single monocular image, how to infer