Kinect Cursor Control
EEE178 - Dr. Fethi Belkhouche
Christopher Harris, Danny Nguyen
Abstract: An Xbox 360 Kinect is used to develop two applications that control the desktop cursor of a Windows computer. Application A uses the skeletal tracking feature to track body joints in the X and Y directions to control both cursor positioning and clicking. Application B uses the skeletal tracking feature to track body joints in the X, Y, and Z directions for the same purpose. Along with a detailed layout of the algorithm behind the Kinect Cursor Control application, this paper also discusses in detail the process that makes the Kinect's skeletal tracking feature possible.

I. INTRODUCTION

The Xbox 360 Kinect has been available since 2010, and following the success of its controller-less gameplay technology, people have applied the Kinect to all kinds of applications. In this project, the Xbox 360 Kinect is used to control the desktop cursor of a Windows computer. The idea came from the need to control a PC-connected TV screen without purchasing an expensive wireless mouse, assuming the user already owns a Kinect. The project uses the Kinect's skeletal tracking together with Microsoft's Visual Studio 2013 to create two simple applications. These applications let us control the cursor's position with the right hand and its clicking with the left; but first it is necessary to understand the features of the Kinect and the process behind skeletal tracking, from the initial step of creating a depth map to the final step of joint position proposal.

II. MICROSOFT XBOX 360 KINECT

Figure 2.1. Kinect Main Components

The Kinect combines a few detection mechanisms to build an accurate and comprehensive set of 3D data about what is going on inside a room. The color camera
captures RGB images along with per-pixel depth information. The sensor also relies on a pair of depth sensors to measure the distance of objects in the room in three dimensions: an infrared emitter projects a structured-light grid pattern that is distorted according to each surface's distance from the emitter. The Kinect combines structured light with two classic computer vision techniques: depth from focus, using a special astigmatic lens with different focal lengths in the x and y directions (i.e., content that is more blurry is further away), and depth from stereo, which is discussed in detail later in this paper. The images are measured by an 11-bit 640x480 monochrome CMOS sensor providing 2048 levels of grey, which builds a map showing the distance from the sensor of every point in the image, as seen in Figure 2.2.

Figure 2.2. Infrared structured light beams to gray-level depth map

The depth calculations are performed using a method called stereo triangulation. Depth measurement requires that corresponding points in one image be found in the second image (known as the correspondence problem), which is solved by a method called stereo matching. Once corresponding points are found, we can compute the disparity between the two images. Disparity measures the displacement of a point between the two images (i.e., the number of pixels between a point in the right image and the corresponding point in the left image).

Figure 2.3. Stereo Matching - Correspondence Problem

Because the Kinect uses the simplest configuration, with the image planes of the cameras parallel to each other and to the baseline, the camera centers at the same height, and the focal lengths equal, the epipolar lines fall along the horizontal scan
lines of the images, and the images are rectified (aligned along the same parallel axis). The process is therefore simplified and the computation time reduced. Once we have the disparity, we use triangulation to calculate the depth of that point in the scene: a disparity is computed for every pixel of the image with stereo matching, then triangulation computes the 3D position for every disparity.

Figure 2.4. Depth from Stereo Images - Rectified Image

Using triangulation, the relationship between disparity and depth is determined as shown in Figure 2.5: with focal length f, baseline b, and disparity d, the depth of a point is Z = f * b / d.

Figure 2.5. Disparity: Relationship to Depth

From this relationship we can draw important conclusions: disparity values are inversely proportional to the depth Z of a point. That is, far points have low disparity (the horizon has a disparity of zero) and close points have high disparity. The disparity is also proportional to the baseline b (the larger the baseline, the higher the disparity).

For the Kinect, inferring body position is a two-stage process: first compute a depth map using structured light, then infer body position using machine learning.

Body Part Inference and Joint Proposals

A key component of this work is the intermediate body part representation, in which several localized body part labels that densely cover the body are defined and color-coded as in Figure 2.6.
Figure 2.6. Synthetic and Real Data. Pairs of depth image and ground truth body parts.

Some of these parts are defined to directly localize particular skeletal joints of interest, while others fill the gaps or can be used in combination to predict other joints. This intermediate representation recasts the problem as one that can be solved by efficient classification algorithms. Simple depth comparison features, as seen in Figure 2.7, are used to compute the following quantity at a given pixel x:

    f_theta(I, x) = d_I(x + u / d_I(x)) - d_I(x + v / d_I(x))    (1)

where d_I(x) is the depth at pixel x in image I, and the parameters theta = (u, v) describe the offsets u and v. Normalizing the offsets by the depth at x keeps the features invariant to the body's distance from the camera.

Figure 2.7. Depth Image Features

Figure 2.7 illustrates two features at different pixel locations x. Feature f_theta1 looks upwards: Equation 1 returns a large positive response for pixels x near the top of the body, but a value close to zero for pixels x lower down the body. Feature f_theta2 may instead find thin vertical structures such as the arm.
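A minimal sketch of the depth comparison feature in Eq. (1), assuming the depth image is stored as a 2D NumPy array and that probes falling outside the image read as a large background depth; the function name and the numbers used are illustrative, not the original implementation.

```python
import numpy as np

# Sketch of the depth comparison feature f_theta(I, x) in Eq. (1).
# Assumption: probes outside the image return a large "background" depth.
def depth_feature(depth, x, u, v):
    """f_theta(I, x) = d_I(x + u/d_I(x)) - d_I(x + v/d_I(x))."""
    def d(p):
        r, c = int(round(p[0])), int(round(p[1]))
        if 0 <= r < depth.shape[0] and 0 <= c < depth.shape[1]:
            return depth[r, c]
        return 1e6  # outside the image: treat as far background

    dx = d(x)
    # Dividing the offsets by the depth at x keeps the feature invariant
    # to how far the body is from the camera.
    p1 = (x[0] + u[0] / dx, x[1] + u[1] / dx)
    p2 = (x[0] + v[0] / dx, x[1] + v[1] / dx)
    return d(p1) - d(p2)

# A body pixel at depth 2.0 with background above it: an upward-looking
# feature (u pointing up, v = 0) gives a large positive response.
depth = np.full((10, 10), 2.0)
depth[:5, :] = 1e6
print(depth_feature(depth, (6, 5), (-4.0, 0.0), (0.0, 0.0)))  # large positive
```

With u = v = 0 the two probes coincide and the feature is zero, matching the behavior described for pixels lower down the body.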
Figure 2.8. Randomized Decision Forest

These features, combined in a decision forest, make a sufficiently accurate classification tool for all trained parts. A forest is an ensemble of T decision trees, each consisting of split and leaf nodes, as depicted in Figure 2.8. Each split node consists of a feature f_theta and a threshold tau. To classify pixel x in image I, one starts at the root and repeatedly evaluates Eq. 1, branching left or right according to the comparison with the threshold tau. At the leaf node reached in tree t, a learned distribution P_t(c | I, x) over body part labels c is stored. The distributions are averaged over all trees in the forest to give the final classification:

    P(c | I, x) = (1/T) * sum over t = 1..T of P_t(c | I, x)    (2)

Body part recognition as described above infers per-pixel information. This information is then pooled across pixels to generate reliable proposals for the 3D skeletal joints; these proposals are the final output of the algorithm. Because outlying pixels severely degrade the quality of global estimates (global 3D centers of probability mass for each part, accumulated using the known calibrated depth), a local mode-finding approach based on mean shift with a weighted Gaussian kernel is used instead.

Figure 2.9. Mean shift descriptors

Mean shift is used to find modes in this density efficiently. A final confidence estimate is given as the sum of the pixel weights reaching each mode. The detected modes lie on the surface of the body, and by means of the learning algorithm a final joint position proposal is produced, making skeletal tracking possible.

III. VISUAL STUDIO 2013

Due to the Kinect's popularity as a stereo vision sensor, Microsoft's website provides an SDK for Visual Studio for the sole purpose of programming applications for the Kinect, including games as well as Windows Forms applications. It is the programmability of Visual Studio combined with the Kinect SDK that makes the Kinect so robust in terms of
application development. Our application uses the Kinect SDK v1.8, which can be found on Microsoft's Kinect for Windows page. This package is essentially the bridge between the features of the Kinect (it includes all the software and libraries necessary for the sensors to work) and C#, C++, or Visual Basic code. By using the Kinect SDK, we can develop our own application that interacts directly with the Kinect.

IV. COORDINATE CONVERSION

One thing we need to consider while programming an application is the coordinate system of the Kinect. As shown in Figure 4.1, it uses Cartesian coordinates with the origin at the center of the image. In this case, the X and Y values range from -1 to 1; however, depending on the object's distance from the sensor, this range may change. For a typical application, the user stands roughly 6 feet away from the sensor, and -1 to 1 is a good approximation of the range of X and Y.

Figure 4.1. Kinect's Coordinate System

For certain applications such as ours, we need to find the relationship between the coordinate system of the Kinect and the display resolution of the computer, as shown in Figure 4.2. We also need to account for the fact that the origin of the PC's display coordinates is in the top-left corner, as opposed to the conventional bottom-left of an X-Y coordinate system.

Figure 4.2. PC Display Coordinate System

The equations needed to convert from the Kinect's coordinates to the PC coordinates are as follows:
    Xpixel = ((Xk + Xr) / (2 * Xr)) * Xpc
    Ypixel = ((-Yk + Yr) / (2 * Yr)) * Ypc

Figure 4.3. Conversion Equations

where:

Xpc / Ypc is the computer's maximum resolution.
Xk / Yk are the Kinect's X/Y coordinates.
Xr / Yr define the Kinect's reference x/y frame.

The minus sign in front of Yk flips the Y direction, compensating for the fact that the origin is in the top-left corner of the PC coordinate system. These two equations convert the Kinect's coordinates (input) to the PC display coordinates (output), which depend on the resolution of the screen.

In addition, we can limit the detection area of the Kinect to a localized window. Since our application controls the cursor, we can restrict the user's movement to a small region, as shown in Figure 4.4. This alleviates the issue of users having to move far in order to push the cursor to the corners of the screen.

Figure 4.4. Kinect's Reference Frame

The red box represents the localized window, which can be resized simply by changing the values of Xr / Yr above.

Example Calculation: Suppose we have the point (0.3, -0.3) in the Kinect's coordinate system, as shown in Figure 4.5, where the reference window frame values Xr and Yr are 0.7 and 0.5, respectively, and the maximum display resolution of the computer is 1366x768. Plugging the values into the equations, we get the converted point in pixels.
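The conversion can be sketched in code. The exact equations appeared as a figure in the original report, so the form below is an assumption reconstructed from the surrounding description: it maps the reference window [-Xr, Xr] x [-Yr, Yr] onto the full screen and flips the Y axis for the top-left display origin.

```python
# Kinect-to-screen coordinate conversion (assumed form: the reference
# window [-Xr, Xr] x [-Yr, Yr] maps to the full display, with a Y flip).
def kinect_to_screen(xk, yk, xr, yr, width, height):
    x_px = (xk + xr) / (2 * xr) * width
    y_px = (-yk + yr) / (2 * yr) * height
    return x_px, y_px

# The worked example from the text: point (0.3, -0.3), Xr = 0.7, Yr = 0.5,
# on a 1366x768 display. Under these assumed equations it maps to roughly
# (976, 614) pixels.
x, y = kinect_to_screen(0.3, -0.3, 0.7, 0.5, 1366, 768)
print(x, y)
```

The Kinect-space center (0, 0) then lands at the center of the screen, which is the sanity check one would expect of any such mapping.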
Figure 4.5. Kinect's Reference values 0.7, 0.5

Figure 4.6. Solving for the X and Y Pixel Coordinates

Figure 4.7. Converted Pixel Coordinates

Now that we have a way to convert between coordinate systems, we can simply track the joints, convert their positions from the Kinect coordinate system to PC coordinates, and set them as the new mouse coordinates.

V. SKELETAL TRACKING TO CONTROL CLICKING

The first application tracks four points of the body, as shown in Figure 5.1:
Figure 5.1. Skeletal Tracking Joints

The Right Wrist is tracked to control the movement, or position, of the cursor. The Left Wrist is tracked via its Y coordinate to determine the type of click. The Center Shoulder is tracked via its Y coordinate and acts as the threshold for double-clicking. The Left Hip is tracked via its Y coordinate and acts as the threshold for single click-and-hold.

Conditions for clicking:

A Left Wrist Y position less than the Center Shoulder's triggers a double click.
A Left Wrist Y position greater than the Left Hip's triggers a single click-and-hold.
A Left Wrist Y position between the Center Shoulder's and the Left Hip's does nothing.

VI. Z-DIRECTION VELOCITY TO CONTROL DOUBLE CLICK

The second application tracks only two points of the body: the Left Wrist and the Right Wrist. As in the first application, the Right Wrist controls the positioning of the cursor and the Left Wrist controls the clicking. In this case, we track an additional axis: the Z direction of the Left Wrist. We measure the Left Wrist's velocity in the positive direction (toward the sensor) and use it as the condition for performing a double click.

The velocity is measured by the change in position. By tracking the distance of the Left Wrist along the Z direction and taking the difference between the previous and current positions, we can calculate the instantaneous velocity. For example, if the current distance from the Kinect to the tracked joint is 2.5 feet and the previous distance was 2.3 feet, the measured instantaneous velocity is -0.2 feet per sample. The negative sign signifies that the joint is moving away from the Kinect, and clicking is not triggered for negative velocities. We then had to determine a threshold for triggering the click event.
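The per-sample velocity test just described can be sketched as follows; the function name and structure are illustrative (the actual C# implementation appears in Appendix B), and the units are feet per sample as in the text.

```python
# Sketch of the per-sample Z-velocity click test. Velocity is taken as
# previous Z minus current Z, so a positive value means the wrist is
# moving toward the Kinect.
def detect_push(prev_z_ft, curr_z_ft, threshold_ft_per_sample):
    """Return True when the forward push exceeds the threshold."""
    velocity = prev_z_ft - curr_z_ft
    return velocity > threshold_ft_per_sample

# The example from the text: previous 2.3 ft, current 2.5 ft gives a
# velocity of -0.2 ft/sample (moving away), so no click is triggered.
detect_push(2.3, 2.5, 0.05)  # returns False
detect_push(2.5, 2.4, 0.05)  # returns True: a push of +0.1 ft/sample
```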
After several trials ranging from slight movements to aggressive pushes, the velocity threshold chosen is about 0.05 feet per sample, which corresponds to a simple push: high enough that an incidental forward movement does not trigger a click, yet gentle enough that no forceful, potentially injurious, push is required.

VII. CONCLUSION
In this paper we discussed the process behind skeletal tracking, a two-stage process: computing a depth map using structured light, then inferring the body position using machine learning (i.e., transforming the depth image into a body part image, and the body part image into a skeleton). We then used the Kinect's skeletal tracking to create two versions of a cursor control application. The first version sets clicking thresholds on the Y position of the left wrist, whereas the second sets clicking thresholds on the left wrist's Z velocity. Both applications' clicking mechanisms work as intended. One small remaining problem is that the cursor positioning sometimes picks up unwanted noise, which results in the cursor jumping around the screen. This may be the result of mismatched aspect ratios between the Kinect's localized window frame and the PC display. Possible solutions include adjusting the aspect ratio of the localized window to match that of the computer, or adjusting the distance between the user and the Kinect.
APPENDIX A. CODE LISTING: Skeletal Tracking

// Check for the Kinect sensor, then check for a skeleton.
// As long as a skeleton is found, run the following:
bool leftClick;
if (WristLeft < ShoulderCenter)
{
    // The wrist is higher than the shoulder: double click.
    leftClick = true;
    Console.WriteLine("DOUBLE CLICK!");
    for (int cnt = 0; cnt < 4; cnt++)
    {
        // Call the clicking method four times: click, release, click, release.
        NativeMethods.SendMouseInput(cursorX, cursorY, resolution.Width, resolution.Height, leftClick);
        leftClick = !leftClick;
        Thread.Sleep(1000); // A small delay prevents endless clicking.
    }
}
else if (WristLeft > ShoulderCenter && WristLeft < HipLeft)
{
    // The hand is between the shoulder and the hip: do nothing.
    leftClick = false;
    NativeMethods.SendMouseInput(cursorX, cursorY, resolution.Width, resolution.Height, leftClick);
}
else
{
    // The hand is below the left hip: click and hold until the hand
    // is raised back above the hip.
    Console.WriteLine("SINGLE CLICK AND HOLD!");
    leftClick = true;
    NativeMethods.SendMouseInput(cursorX, cursorY, resolution.Width, resolution.Height, leftClick);
}

APPENDIX B. CODE LISTING: Velocity Tracking

// Check for the Kinect sensor, then check for a skeleton.
// As long as a skeleton is found, run the following:
bool leftClick;
if ((Zvel - jwl.Position.Z) > 0.05)
{
    // The previous Z position minus the current one exceeds 0.05: double click.
    leftClick = true;
    Console.Write("DOUBLE CLICK!");
    for (int cnt = 0; cnt < 4; cnt++)
    {
        // Call the clicking method four times: click, release, click, release.
        NativeMethods.SendMouseInput(cursorX, cursorY, resolution.Width, resolution.Height, leftClick);
        leftClick = !leftClick;
        Thread.Sleep(500); // A small delay prevents endless clicking.
    }
}
else
{
    leftClick = false;
    NativeMethods.SendMouseInput(cursorX, cursorY, resolution.Width, resolution.Height, leftClick);
}
Zvel = jwl.Position.Z; // Store the new previous Z position.
Range Sensors (time of flight) (1) Large range distance measurement -> called range sensors Range information: key element for localization and environment modeling Ultrasonic sensors, infra-red sensors
More informationAdvanced Vision Guided Robotics. David Bruce Engineering Manager FANUC America Corporation
Advanced Vision Guided Robotics David Bruce Engineering Manager FANUC America Corporation Traditional Vision vs. Vision based Robot Guidance Traditional Machine Vision Determine if a product passes or
More informationMultimedia Technology CHAPTER 4. Video and Animation
CHAPTER 4 Video and Animation - Both video and animation give us a sense of motion. They exploit some properties of human eye s ability of viewing pictures. - Motion video is the element of multimedia
More informationProcessing 3D Surface Data
Processing 3D Surface Data Computer Animation and Visualisation Lecture 17 Institute for Perception, Action & Behaviour School of Informatics 3D Surfaces 1 3D surface data... where from? Iso-surfacing
More information2 OVERVIEW OF RELATED WORK
Utsushi SAKAI Jun OGATA This paper presents a pedestrian detection system based on the fusion of sensors for LIDAR and convolutional neural network based image classification. By using LIDAR our method
More informationChapter 7: Geometrical Optics. The branch of physics which studies the properties of light using the ray model of light.
Chapter 7: Geometrical Optics The branch of physics which studies the properties of light using the ray model of light. Overview Geometrical Optics Spherical Mirror Refraction Thin Lens f u v r and f 2
More informationLecture 9 & 10: Stereo Vision
Lecture 9 & 10: Stereo Vision Professor Fei- Fei Li Stanford Vision Lab 1 What we will learn today? IntroducEon to stereo vision Epipolar geometry: a gentle intro Parallel images Image receficaeon Solving
More informationRealtime Omnidirectional Stereo for Obstacle Detection and Tracking in Dynamic Environments
Proc. 2001 IEEE/RSJ Int. Conf. on Intelligent Robots and Systems pp. 31-36, Maui, Hawaii, Oct./Nov. 2001. Realtime Omnidirectional Stereo for Obstacle Detection and Tracking in Dynamic Environments Hiroshi
More informationCSE152 Introduction to Computer Vision Assignment 3 (SP15) Instructor: Ben Ochoa Maximum Points : 85 Deadline : 11:59 p.m., Friday, 29-May-2015
Instructions: CSE15 Introduction to Computer Vision Assignment 3 (SP15) Instructor: Ben Ochoa Maximum Points : 85 Deadline : 11:59 p.m., Friday, 9-May-015 This assignment should be solved, and written
More information3D Vision Real Objects, Real Cameras. Chapter 11 (parts of), 12 (parts of) Computerized Image Analysis MN2 Anders Brun,
3D Vision Real Objects, Real Cameras Chapter 11 (parts of), 12 (parts of) Computerized Image Analysis MN2 Anders Brun, anders@cb.uu.se 3D Vision! Philisophy! Image formation " The pinhole camera " Projective
More informationECE 470: Homework 5. Due Tuesday, October 27 in Seth Hutchinson. Luke A. Wendt
ECE 47: Homework 5 Due Tuesday, October 7 in class @:3pm Seth Hutchinson Luke A Wendt ECE 47 : Homework 5 Consider a camera with focal length λ = Suppose the optical axis of the camera is aligned with
More informationThe XH-map algorithm: A method to process stereo video to produce a real-time obstacle map.
The XH-map algorithm: A method to process stereo video to produce a real-time obstacle map. Donald Rosselot and Ernest L. Hall Center for Robotics Research Department of Mechanical, Industrial, and Nuclear
More informationPublic Library, Stereoscopic Looking Room, Chicago, by Phillips, 1923
Public Library, Stereoscopic Looking Room, Chicago, by Phillips, 1923 Teesta suspension bridge-darjeeling, India Mark Twain at Pool Table", no date, UCR Museum of Photography Woman getting eye exam during
More informationDepth. Common Classification Tasks. Example: AlexNet. Another Example: Inception. Another Example: Inception. Depth
Common Classification Tasks Recognition of individual objects/faces Analyze object-specific features (e.g., key points) Train with images from different viewing angles Recognition of object classes Analyze
More informationComputer Vision Projective Geometry and Calibration. Pinhole cameras
Computer Vision Projective Geometry and Calibration Professor Hager http://www.cs.jhu.edu/~hager Jason Corso http://www.cs.jhu.edu/~jcorso. Pinhole cameras Abstract camera model - box with a small hole
More informationINFRARED AUTONOMOUS ACQUISITION AND TRACKING
INFRARED AUTONOMOUS ACQUISITION AND TRACKING Teresa L.P. Olson and Harry C. Lee Teresa.Lolson@lmco.com (407) 356-7109 Harrv.c.lee@lmco.com (407) 356-6997 Lockheed Martin Missiles and Fire Control - Orlando
More informationInternational Society for Photogrammetry and Remote Sensing
International Society or Photogrammetry and Remote Sensing Commission I: Sensors, Platorms, and Imagery Commission II: Systems or Data Processing, Analysis, and Representation Commission III: Theory and
More informationColorado School of Mines. Computer Vision. Professor William Hoff Dept of Electrical Engineering &Computer Science.
Professor William Hoff Dept of Electrical Engineering &Computer Science http://inside.mines.edu/~whoff/ 1 Stereo Vision 2 Inferring 3D from 2D Model based pose estimation single (calibrated) camera Stereo
More informationINFO - H Pattern recognition and image analysis. Vision
INFO - H - 501 Pattern recognition and image analysis Vision Stereovision digital elevation model obstacle avoidance 3D model scanner human machine interface (HMI)... Stereovision image of the same point
More informationSubpixel Corner Detection for Tracking Applications using CMOS Camera Technology
Subpixel Corner Detection for Tracking Applications using CMOS Camera Technology Christoph Stock, Ulrich Mühlmann, Manmohan Krishna Chandraker, Axel Pinz Institute of Electrical Measurement and Measurement
More informationTheory of Stereo vision system
Theory of Stereo vision system Introduction Stereo vision is a technique aimed at extracting depth information of a scene from two camera images. Difference in pixel position in two image produces the
More informationDense 3D Reconstruction. Christiano Gava
Dense 3D Reconstruction Christiano Gava christiano.gava@dfki.de Outline Previous lecture: structure and motion II Structure and motion loop Triangulation Today: dense 3D reconstruction The matching problem
More informationLecture'9'&'10:'' Stereo'Vision'
Lecture'9'&'10:'' Stereo'Vision' Dr.'Juan'Carlos'Niebles' Stanford'AI'Lab' ' Professor'FeiAFei'Li' Stanford'Vision'Lab' 1' Dimensionality'ReducIon'Machine'(3D'to'2D)' 3D world 2D image Point of observation
More informationUsing temporal seeding to constrain the disparity search range in stereo matching
Using temporal seeding to constrain the disparity search range in stereo matching Thulani Ndhlovu Mobile Intelligent Autonomous Systems CSIR South Africa Email: tndhlovu@csir.co.za Fred Nicolls Department
More informationHIGH SPEED 3-D MEASUREMENT SYSTEM USING INCOHERENT LIGHT SOURCE FOR HUMAN PERFORMANCE ANALYSIS
HIGH SPEED 3-D MEASUREMENT SYSTEM USING INCOHERENT LIGHT SOURCE FOR HUMAN PERFORMANCE ANALYSIS Takeo MIYASAKA, Kazuhiro KURODA, Makoto HIROSE and Kazuo ARAKI School of Computer and Cognitive Sciences,
More information3D Modeling of Objects Using Laser Scanning
1 3D Modeling of Objects Using Laser Scanning D. Jaya Deepu, LPU University, Punjab, India Email: Jaideepudadi@gmail.com Abstract: In the last few decades, constructing accurate three-dimensional models
More informationAccurate and Dense Wide-Baseline Stereo Matching Using SW-POC
Accurate and Dense Wide-Baseline Stereo Matching Using SW-POC Shuji Sakai, Koichi Ito, Takafumi Aoki Graduate School of Information Sciences, Tohoku University, Sendai, 980 8579, Japan Email: sakai@aoki.ecei.tohoku.ac.jp
More informationProcessing 3D Surface Data
Processing 3D Surface Data Computer Animation and Visualisation Lecture 12 Institute for Perception, Action & Behaviour School of Informatics 3D Surfaces 1 3D surface data... where from? Iso-surfacing
More informationFinal Review CMSC 733 Fall 2014
Final Review CMSC 733 Fall 2014 We have covered a lot of material in this course. One way to organize this material is around a set of key equations and algorithms. You should be familiar with all of these,
More information10/5/09 1. d = 2. Range Sensors (time of flight) (2) Ultrasonic Sensor (time of flight, sound) (1) Ultrasonic Sensor (time of flight, sound) (2) 4.1.
Range Sensors (time of flight) (1) Range Sensors (time of flight) (2) arge range distance measurement -> called range sensors Range information: key element for localization and environment modeling Ultrasonic
More informationStructured Light. Tobias Nöll Thanks to Marc Pollefeys, David Nister and David Lowe
Structured Light Tobias Nöll tobias.noell@dfki.de Thanks to Marc Pollefeys, David Nister and David Lowe Introduction Previous lecture: Dense reconstruction Dense matching of non-feature pixels Patch-based
More informationconvolution shift invariant linear system Fourier Transform Aliasing and sampling scale representation edge detection corner detection
COS 429: COMPUTER VISON Linear Filters and Edge Detection convolution shift invariant linear system Fourier Transform Aliasing and sampling scale representation edge detection corner detection Reading:
More informationDense 3D Reconstruction. Christiano Gava
Dense 3D Reconstruction Christiano Gava christiano.gava@dfki.de Outline Previous lecture: structure and motion II Structure and motion loop Triangulation Wide baseline matching (SIFT) Today: dense 3D reconstruction
More informationDEVELOPMENT OF REAL TIME 3-D MEASUREMENT SYSTEM USING INTENSITY RATIO METHOD
DEVELOPMENT OF REAL TIME 3-D MEASUREMENT SYSTEM USING INTENSITY RATIO METHOD Takeo MIYASAKA and Kazuo ARAKI Graduate School of Computer and Cognitive Sciences, Chukyo University, Japan miyasaka@grad.sccs.chukto-u.ac.jp,
More informationIntroduction to 3D Machine Vision
Introduction to 3D Machine Vision 1 Many methods for 3D machine vision Use Triangulation (Geometry) to Determine the Depth of an Object By Different Methods: Single Line Laser Scan Stereo Triangulation
More informationIntroducing Robotics Vision System to a Manufacturing Robotics Course
Paper ID #16241 Introducing Robotics Vision System to a Manufacturing Robotics Course Dr. Yuqiu You, Ohio University c American Society for Engineering Education, 2016 Introducing Robotics Vision System
More informationDepth estimation from stereo image pairs
Depth estimation from stereo image pairs Abhranil Das In this report I shall first present some analytical results concerning depth estimation from stereo image pairs, then describe a simple computational
More informationCOMPARATIVE STUDY OF DIFFERENT APPROACHES FOR EFFICIENT RECTIFICATION UNDER GENERAL MOTION
COMPARATIVE STUDY OF DIFFERENT APPROACHES FOR EFFICIENT RECTIFICATION UNDER GENERAL MOTION Mr.V.SRINIVASA RAO 1 Prof.A.SATYA KALYAN 2 DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING PRASAD V POTLURI SIDDHARTHA
More informationCreating a distortion characterisation dataset for visual band cameras using fiducial markers.
Creating a distortion characterisation dataset for visual band cameras using fiducial markers. Robert Jermy Council for Scientific and Industrial Research Email: rjermy@csir.co.za Jason de Villiers Council
More information