Open Source Computer Vision Tutorial for ICDSC 08
1 Open Source Computer Vision Tutorial for ICDSC 08. Gary Bradski, Senior Scientist, Willow Garage; Consulting Prof., Stanford University. Smart Cameras. Gary Bradski (c) 2008. Rights to original images herein explicitly retained; no redistribution without written permission.
2 OpenCV Book. Due out end of September. This tutorial is just a faint introduction; the new book will teach computer vision and its use via OpenCV: Learning OpenCV: Computer Vision with the OpenCV Library, O'Reilly Press. Search for "OpenCV" on Google or Amazon.
3 Outline. Installing OpenCV. Computer vision motivation. OpenCV overview. Programming with OpenCV: getting started. Some computer vision background, with program examples: an object training-set program; optic flow, segmentation, calibration, stereo, recognition. Some tips on tricks of the trade. Whither computer vision and OpenCV.
4 INSTALLING OPENCV
5 Installing OpenCV. First, install the files for this course (by key or by CD) and open this file. Hopefully you've preinstalled; otherwise see the next two pages, and/or see installing_opencv_cvs.txt in the files for this course. Also in the files there is FFMPEG_install_MacOSX.txt. Instructions can be found on the OpenCV wiki for Windows, Linux, or Mac OS. The release versions of the library are at sourceforge. I recommend you get the latest version of OpenCV from the sourceforge CVS repository. If you get the version 1.0 release from sourceforge, that's OK, but you won't be able to run the stereo demo.
6 Installing on Windows. For getting the code from CVS, you'll need a CVS program. I recommend TortoiseCVS, which integrates nicely with Windows Explorer. If you want the latest OpenCV from the CVS repository, you'll need to access the CVSROOT directory: :pserver:anonymous@opencvlibrary.cvs.sourceforge.net:2401/cvsroot/opencvlibrary. For TortoiseCVS, find the directory you want, right-click, and select CVS Checkout. When the program launches, put the above :pserver line in the CVSROOT box. The protocol is Password server (:pserver:); no protocol parameters; the server is opencvlibrary.cvs.sourceforge.net; the port is probably 2401; the repository folder is /cvsroot/opencvlibrary; the username is anonymous. After you download the code, you must build it using the MSVC sln or dsw file in the opencv _make directory. The alternative for Windows is to download the opencv-win release 1.0, a self-extracting exe file. You will be able to run all but the stereo-related demos.
7 Installing on Linux. On Linux, you can get the opencv CVS using the following two commands:
$> cvs login  (when asked for a password, hit return)
$> cvs -z3 co -P opencv
Linux needs other packages to build OpenCV. You can use sudo synaptic or sudo apt-get to install the following: GTK+ 2.x or higher, including headers; pkgconfig; libpng, zlib, libjpeg, libtiff, and libjasper with development files; Python 2.3, 2.4, or 2.5 with headers installed (developer package); libavcodec and the other libav* libraries (including headers) from ffmpeg pre1 or later.
FFMPEG instructions: download ffmpeg. You have to build a shared library of ffmpeg to use it with other open-source programs such as OpenCV. To build and install a shared ffmpeg library:
$> ./configure --enable-shared
$> make
$> sudo make install
You will end up with /usr/local/lib/libavcodec.so.*, /usr/local/lib/libavformat.so.*, /usr/local/lib/libavutil.so.*, and include files under various /usr/local/include/libav* directories. The latest FFMPEG might have changed where it puts its include files, so you might have to:
$> cd /usr/local/include
$> sudo mkdir ffmpeg
$> sudo cp libavcodec/* ffmpeg
$> sudo cp libavformat/* ffmpeg
$> sudo cp libavutil/* ffmpeg
FFMPEG is a complete pain due to their licensing and refusal to package.
To build OpenCV once you've done all of the above:
$> autoreconf --force
$> ./configure
$> make
$> sudo make install
$> sudo ldconfig
Ubuntu modification (if you can't get ffmpeg to work): use sudo synaptic to get gstreamer and install the gstreamer files, particularly libgstreamer0.10-dev. Then configure OpenCV with --with-gstreamer:
$> autoreconf --force
$> ./configure --with-gstreamer
$> make
$> sudo make install
$> sudo ldconfig
The alternative for Linux (though you'll still need the other packages) is to download the opencv-linux release 1.0, a gz file. Then:
$> tar xvzf OpenCV tar.gz
$> cd opencv
$> ./configure --prefix=/opencv_library_install_path/opencv
$> make
$> sudo make install
8 COMPUTER VISION MOTIVATION
9 My Motivation. Bring competent vision (recognition and 3D segmentation) to robotics: the Willow Garage robot; the Stanford STAIR project under Prof. Andrew Ng.
10 Perception Seems Easy. In 1966, Marvin Minsky told an undergraduate to solve perception over the summer. Perhaps he forgot to say which summer? Why is vision so hard?
11 Perception is Hard. Maybe look for larger features like edges?
12 But, What's an Edge? Depth discontinuity; surface orientation discontinuity; reflectance discontinuity (i.e., change in surface material properties); illumination discontinuity (e.g., shadow). Slide credit: Christopher Rasmussen.
13 To Deal With the Confusion, Your Brain has Rules... That can be wrong.
14 Color Changes with Lighting; the Brain Compensates. Sometimes wrongly.
15 The Brain Compensates for Lighting Too. Which square is darker?
16 Lighting Determines Shape Perception.
17 Don't believe me? Perception of surfaces depends on lighting assumptions.
18 Object Perception Depends on Context. [Slide credit: Kevin Murphy]
19 The Brain Assumes 3D Geometry. Perception is ambiguous depending on your point of view!
20 The Brain Has a Point of View* and it's a Tad Strange. Same-size things get smaller; we hardly notice. Parallel lines meet at a point. * A Cartoon Epistemology:
21 An Infinite World in a Finite Space. Perception must be mapped to a space-variant grid, logarithmic in nature. Steve Lehar
22 QUESTION: What is beyond the sky? The sky outside your head. We live in our own bubbles.
23 (Robots Will Be the Same...) Vision is hard. Now let's make some progress.
24 OPENCV OVERVIEW
25 What is OpenCV? A free AND open-source computer vision library. Then: started in 1999 as a project on my C: drive. Now: over 500 functions implementing computer vision, machine learning, image processing, and numeric algorithms. Portable and very efficient (implemented in C/C++). Full Python interface. Runs on Windows, Linux, and Mac OS. Has a BSD license (that is, free for ANY use). Available at sourceforge. Other sites: Open Source Computer Vision Library OpenCV (algorithms); Intel Integrated Performance Primitives (optimized routines).
26 OpenCV History. The motivation was to lower the bar to entry into computer vision, and to increase the incentive for higher-MIPS processors out in the market for Intel. Timeline: Willow Garage now supports OpenCV (and a robot OS called ROS) as a means of accelerating peaceful uses of robotics and AI.
27 OpenCV Tends Towards Real Time
28 License. Based on the BSD license: free for commercial or research use, in whole or in part. Does not force your code to be open; you need not contribute back. We hope you will contribute back; a recent contribution is the C++ wrapper class used for Google Street Maps*. * Thanks to Daniel Filip.
29 OpenCV Structure. CV: image processing and vision algorithms. MLL: statistical classifiers and clustering tools. HighGUI: GUI, image and video I/O. CXCORE: basic structures and algorithms, XML support, drawing functions. IPP: fast architecture-specific low-level functions.
30 OpenCV Big Picture: general image processing functions; image pyramids; segmentation; transforms; machine learning (detection, recognition); geometric descriptors; features; tracking; matrix math; camera calibration, stereo, 3D; utilities and data structures; fitting.
31 Where is OpenCV Used? Google Maps, Google Street View, Google Earth; books; academic and industry research; safety monitoring (dam sites, mines, swimming pools); security systems; image retrieval; on cell phones; video search; structure from motion in movies; machine vision factory production inspection systems; robotics. 2M downloads.
32 Pictorial Tour of OpenCV
33 Canny Edge Detector
34 Hough Transform
35 Scale Space
void cvPyrDown( IplImage* src, IplImage* dst, IplFilter filter = IPL_GAUSSIAN_5x5 );
void cvPyrUp( IplImage* src, IplImage* dst, IplFilter filter = IPL_GAUSSIAN_5x5 );
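The behaviour of cvPyrDown — Gaussian blur, then drop every other sample — can be sketched in plain C on a 1-D signal. The function name and the 5-tap binomial kernel below are illustrative, not OpenCV's actual implementation:

```c
#include <stddef.h>

/* One pyramid-down step on a 1-D signal: blur with the 5-tap
 * binomial kernel (1 4 6 4 1)/16, then keep every other sample.
 * Borders are handled by clamping the index. Returns output length. */
size_t pyr_down_1d(const double *src, size_t n, double *dst)
{
    static const double k[5] = {1/16.0, 4/16.0, 6/16.0, 4/16.0, 1/16.0};
    size_t m = 0;
    for (size_t i = 0; i < n; i += 2) {
        double acc = 0.0;
        for (int j = -2; j <= 2; ++j) {
            long idx = (long)i + j;
            if (idx < 0) idx = 0;
            if (idx >= (long)n) idx = (long)n - 1;
            acc += k[j + 2] * src[idx];
        }
        dst[m++] = acc;
    }
    return m;
}
```

Applying this step repeatedly builds the scale-space pyramid; cvPyrUp is the corresponding upsampling step.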
36 Space-Variant Vision: Log-Polar Transform
37 Inpainting: Image Textures. Removes damage to images; in this case, it removes the text.
38 Optical Flow
// opencv/samples/c/lkdemo.c, 190 lines (needs a camera to run)
int main( ) {
    CvCapture* capture = < > ? cvCaptureFromCAM(camera_id)
                             : cvCaptureFromFile(path);
    if( !capture ) return -1;
    for(;;) {
        IplImage* frame = cvQueryFrame( capture );
        if( !frame ) break;
        // copy and process image
        cvCalcOpticalFlowPyrLK( ... );
        cvShowImage( "LkDemo", result );
        c = cvWaitKey(30); // run at ~20-30 fps speed
        if( c >= 0 ) {
            // process key
        }
    }
    cvReleaseCapture( &capture );
}
Lucas-Kanade equations: brightness constancy gives I(x+dx, y+dy, t+dt) = I(x, y, t). With Ix = dI/dx, Iy = dI/dy, It = dI/dt, the flow X = (dx/dt, dy/dt) solves G X = b, where G = sum over the window of [ Ix^2, Ix*Iy ; Ix*Iy, Iy^2 ] and b = -[ sum of Ix*It ; sum of Iy*It ].
39 Morphological Operations Examples. Morphology: applying min-max filters and their combinations to an image I.
Erosion: I⊖B
Dilation: I⊕B
Opening: I∘B = (I⊖B)⊕B
Closing: I•B = (I⊕B)⊖B
Grad(I) = (I⊕B) − (I⊖B)
TopHat(I) = I − (I∘B)
BlackHat(I) = (I•B) − I
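The erode/dilate primitives that the compound operations above are built from are just min/max filters; a minimal 1-D binary sketch (function names hypothetical), where borders use only the in-range neighbours:

```c
/* Dilation: max over a 3-sample window. */
void dilate_1d(const int *src, int n, int *dst)
{
    for (int i = 0; i < n; ++i) {
        int v = src[i];
        if (i > 0     && src[i-1] > v) v = src[i-1];
        if (i < n - 1 && src[i+1] > v) v = src[i+1];
        dst[i] = v;
    }
}

/* Erosion: min over a 3-sample window. */
void erode_1d(const int *src, int n, int *dst)
{
    for (int i = 0; i < n; ++i) {
        int v = src[i];
        if (i > 0     && src[i-1] < v) v = src[i-1];
        if (i < n - 1 && src[i+1] < v) v = src[i+1];
        dst[i] = v;
    }
}
```

Opening (erode then dilate) removes specks smaller than the window; closing (dilate then erode) fills small gaps — exactly as the formulas above combine the two.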
40 Distance Transform. Distance field from edges of objects. Flood Filling.
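The distance field can be computed with a classic two-pass scan; here is a 1-D chamfer-style sketch (the function name is illustrative, not OpenCV's cvDistTransform):

```c
#define BIG 1000000  /* stands in for "infinity" */

/* Two-pass 1-D distance transform: dist[i] = distance (in samples)
 * to the nearest feature sample (mask[i] != 0). */
void dist_transform_1d(const int *mask, int n, int *dist)
{
    for (int i = 0; i < n; ++i)              /* forward pass */
        dist[i] = mask[i] ? 0 : (i ? dist[i-1] + 1 : BIG);
    for (int i = n - 2; i >= 0; --i)         /* backward pass */
        if (dist[i+1] + 1 < dist[i]) dist[i] = dist[i+1] + 1;
}
```

In 2-D the same idea runs over the image twice (top-left to bottom-right, then back), propagating distances from edge pixels.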
41 Histogram Equalization
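Histogram equalization maps each gray level through the normalized cumulative histogram; a compact 8-bit sketch (a hypothetical helper, not the cvEqualizeHist signature):

```c
/* Equalize an 8-bit signal of n samples: build the histogram,
 * accumulate it into a CDF, then remap every sample through the
 * CDF scaled to the 0..255 range. */
void hist_equalize(const unsigned char *src, int n, unsigned char *dst)
{
    int hist[256] = {0};
    for (int i = 0; i < n; ++i) hist[src[i]]++;
    int cdf[256], acc = 0;
    for (int v = 0; v < 256; ++v) { acc += hist[v]; cdf[v] = acc; }
    for (int i = 0; i < n; ++i)
        dst[i] = (unsigned char)((cdf[src[i]] * 255L) / n);
}
```

Levels that occur often get spread apart, which is what flattens the histogram and stretches contrast.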
42 Thresholds
43 Segmentation. Pyramid, mean-shift, graph-cut; here: watershed.
44 Contours
45 Delaunay Triangulation, Voronoi Tessellation
46 Background Subtraction
47 Motion Templates (my work with James Davis). Object silhouette; motion history images; motion history gradients; motion segmentation algorithm: silhouette → MHI → MHG.
48 Segmentation, Motion Tracking. Motion segmentation; pose recognition; gesture recognition.
49 Tracking with CAMSHIFT Control game with head
50 3D tracking Camera Calibration View Morphing POSIT
51 Projections
52 Single Camera Calibration. Now camera calibration can be done by holding a checkerboard in front of the camera for a few seconds. After that you'll get: a 3D view of the checkerboard; an un-distorted image.
53 Bird's-Eye View. Useful for navigation planning and security systems.
54 Stereo Calibration
55 3D Stereo Vision. Find epipolar lines; align images; triangulate points; depth.
56 ML for Recognition
57 Machine Learning Library (MLL).
CLASSIFICATION / REGRESSION: CART, Naïve Bayes, MLP (back propagation), statistical boosting, random forests, SVM, K-NN, face detector, (histogram matching), (correlation), stochastic discrimination, logistic regression.
CLUSTERING: K-means, EM, (Mahalanobis distance), agglomerative.
TUNING / VALIDATION: cross validation, bootstrapping, variable importance, sampling methods.
58 K-Means, Mahalanobis
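K-means alternates assignment and mean updates until the centers stop moving; a 1-D, k = 2 sketch using plain Euclidean distance (rather than the Mahalanobis distance the slide also mentions — the function name is illustrative, not cvKMeans2):

```c
#include <math.h>

/* One-dimensional k-means with k = 2: assign each point to the
 * nearer center, recompute each center as the mean of its points,
 * and stop when the centers no longer change. */
void kmeans2_1d(const double *x, int n, double *c0, double *c1)
{
    for (int iter = 0; iter < 100; ++iter) {
        double s0 = 0, s1 = 0; int n0 = 0, n1 = 0;
        for (int i = 0; i < n; ++i) {
            if (fabs(x[i] - *c0) <= fabs(x[i] - *c1)) { s0 += x[i]; n0++; }
            else                                      { s1 += x[i]; n1++; }
        }
        double m0 = n0 ? s0 / n0 : *c0;
        double m1 = n1 ? s1 / n1 : *c1;
        if (m0 == *c0 && m1 == *c1) break;   /* converged */
        *c0 = m0; *c1 = m1;
    }
}
```

Swapping the distance for a Mahalanobis distance (normalizing by cluster covariance) gives the variant named on the slide.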
59 Patch Matching
60 Gesture Recognition. Gradient-histogram*-based gesture recognition with tracking: the mean-shift algorithm is used to track; histogram intersection with gradients is used to recognize. Gestures: Up, R, L, Stop, OK. * Bill Freeman
61 Boosting: Face Detection with the Viola-Jones Rejection Cascade
62 We'll now delve into some programming, ending with an object-recognition dataset collection algorithm. PROGRAMMING WITH OPENCV, GETTING STARTED
63 We'll get a light gist of some algorithms and go over the algorithms in the samples directory. SOME COMPUTER VISION BACKGROUND
64 Seeing the Light: Cameras. The ideal camera is a pinhole camera; in theory it's a perfect imager. Magnification factor (by similar triangles): x/f = X/Z, so x = f·X/Z.
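The similar-triangles relation can be checked numerically with a hypothetical one-line helper:

```c
/* Pinhole projection by similar triangles: a point at depth Z and
 * lateral offset X images at x = f * X / Z, with f in the same
 * units as x (e.g., pixels). */
double pinhole_project(double f, double X, double Z)
{
    return f * X / Z;
}
```

Doubling the depth halves the projected coordinate, which is why distant objects shrink on the image plane.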
65 PROBLEM: a pinhole gives too little light. SOLUTION: use a lens. -- Brunelleschi, XVth century.
66 TRADEOFFS: more light, more problems. Geometrical aberrations: spherical aberration, astigmatism, distortion, coma. Vignetting. Marc Pollefeys
67 Astigmatism: different focal length for inclined rays. Marc Pollefeys
68 Distortion: magnification/focal length differs for different angles of inclination. Pincushion (tele-photo); barrel (wide-angle). Marc Pollefeys
69 Camera Calibration. We use a known calibration object (chessboard): collect the image it actually makes, then correct it to the projection it should make.
70 Camera Calibration. The projection equation can be written in matrix form. Given that these points come from images of a plane (they form something called a homography), we can stack the points together and solve for the projection-equation parameters. We then adjust for distortion and solve all over again.
71 Camera Calibration Lens distortion can then be corrected using a fast undistortion lookup map
72 Optical Flow Assumptions. Brightness constancy: I(x(t), t) = I(x(t+dt), t+dt). Temporal persistence (small movements). Spatial coherence.
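Under these assumptions, the 1-D flow has a closed form: brightness constancy plus small motion gives Ix·v + It = 0 at each pixel, and least squares over a patch yields v = -ΣIxIt / ΣIx². A sketch (function and variable names hypothetical):

```c
/* Least-squares 1-D Lucas-Kanade flow between two frames f0, f1.
 * Ix is a central spatial difference, It a temporal difference;
 * the returned v is the shift (in samples) of the pattern. */
double lk_flow_1d(const double *f0, const double *f1, int n)
{
    double num = 0, den = 0;
    for (int i = 1; i < n - 1; ++i) {
        double Ix = (f0[i+1] - f0[i-1]) * 0.5;  /* spatial derivative */
        double It = f1[i] - f0[i];              /* temporal derivative */
        num += Ix * It;
        den += Ix * Ix;
    }
    return den > 0 ? -num / den : 0.0;
}
```

The 2-D version solves the same normal equations with the 2×2 matrix G shown in the lkdemo slide; cvCalcOpticalFlowPyrLK additionally runs it coarse-to-fine over a pyramid so the small-motion assumption holds at each level.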
73 Stereo. Depth from correspondence is easy in a perfect stereo rig (frontal parallel, row-aligned).
74 Stereo. But stereo rigs are never perfectly aligned, so we use stereo calibration (chessboards again) to mathematically create the perfect rig.
75 Stereo. To get perfect alignment, we align something called the epipoles along the rows of the left and right images.
76 Stereo. We use calibration to find the rotation to align the epipoles. This means finding the rotation and translation between the left and right cameras. The math is involved, but similar to single-camera calibration.
77 Stereo. Once the left and right cameras are calibrated internally (intrinsics) and externally (extrinsics), we need to rectify the images.
78 Stereo. This converts correspondence search to a 1D search along image rows. We do the search using sum-of-absolute-differences (SAD) block matching, with various constraints.
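The row search can be sketched directly: for each left-image position, slide a block across the rectified right scanline and keep the disparity with the smallest SAD. This is illustrative code, not OpenCV's block matcher:

```c
#include <math.h>

/* 1-D SAD block matching on rectified scanlines. For left position x,
 * compare a (2*half+1)-sample block against right-image blocks shifted
 * by disparity d = 0..max_disp, and return the best d. A point at
 * disparity d appears at x in the left image and x - d in the right. */
int sad_match(const double *left, const double *right, int n,
              int x, int half, int max_disp)
{
    int best_d = 0;
    double best = 1e30;
    for (int d = 0; d <= max_disp; ++d) {
        if (x - half - d < 0 || x + half >= n) break;
        double sad = 0;
        for (int j = -half; j <= half; ++j)
            sad += fabs(left[x + j] - right[x + j - d]);
        if (sad < best) { best = sad; best_d = d; }
    }
    return best_d;
}
```

The max_disp cap is the "horopter" constraint mentioned on the next slide; thresholds on texture and match strength reject unreliable minima.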
79 Stereo. There is a left-right feature-ordering constraint, a texture-strength constraint, a max-disparity constraint (the "horopter"), and a match-strength constraint, all of which add robustness.
80 Stereo. Note that disparity is inversely proportional to distance, so we get the most depth resolution close by.
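The inverse relation is Z = f·T/d for focal length f, baseline T, and disparity d; a one-line check (hypothetical helper):

```c
/* Ideal frontal-parallel rig: depth Z = f * T / d, where f is the
 * focal length in pixels, T the baseline, and d the disparity in
 * pixels. A one-pixel disparity step changes Z far more at small d
 * (far away) than at large d (close by). */
double depth_from_disparity(double f, double T, double d)
{
    return f * T / d;
}
```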
81 Stereo. The result is a disparity map (if we don't know the scale of the calibration object) or a depth map (if we do). This can all be done at frame rate.
82 Optical Flow. Going to 2D is a simple extension. Note the need for the small-motion assumption.
83 Kalman Filter, Particle Filter for Tracking. Kalman; Condensation or particle filter.
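A scalar predict/update cycle shows the mechanics: predict (uncertainty grows by the process noise), then blend prediction and measurement by the Kalman gain. This is a sketch with a constant-state model and hypothetical names, not OpenCV's cvKalman API:

```c
/* Scalar Kalman filter: state estimate x with variance p.
 * Model: x stays constant up to process noise q; each measurement
 * z observes x with noise variance r. */
typedef struct { double x, p; } kalman1d;

void kalman_step(kalman1d *k, double z, double q, double r)
{
    double p = k->p + q;              /* predict: uncertainty grows */
    double gain = p / (p + r);        /* Kalman gain */
    k->x = k->x + gain * (z - k->x);  /* update toward the measurement */
    k->p = (1.0 - gain) * p;          /* uncertainty shrinks */
}
```

A particle (Condensation) filter replaces the single Gaussian (x, p) with a weighted sample set, which is why it can track multimodal, non-linear motion.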
84 Mean-Shift for Tracking
85 A few hints and some speculation. SOME TIPS ON TRICKS OF THE TRADE
86 Machine Learning. Features beat algorithms. Choose an operating point that trades off accuracy vs. cost (the confusion matrix: TP, FN / FP, TN, each row summing to 100%).
87 Debugging ML: Bias, Variance*. If your model is too strong or too weak, you get characteristic problems that have different strategies to fix. * See Andrew Ng's course notes.
88 Object Recognition, Segmentation. We want what (recognition) and where (segmentation). Quick features with 2D and 3D offsets; initial recognition and segmentation; rich feature sets; make use of lighting effects; model-based verification, 2D and/or 3D. Better sensors help: depth; higher dynamic range and sensitivity; adaptivity to lighting.
89 WHITHER COMPUTER VISION AND OPENCV
90 Vision. Faster computation, 2D and 3D sensors, and higher range and sensitivity will make vision recognition and segmentation work faster than people think, and will enable autonomous robotics. Visual geometry is already fairly mature for image stitching, and for model capture for games, movies, and VR.
91 OpenCV. Now has a full-time team of five people developing it again (Willow Garage's support). Will begin regular releases. Plans: even better stereo and multi-camera support; dealing with 2D+3D active and passive calibration; dealing with point clouds; a much richer 2D and 3D feature set; better handling of motion, lighting, color, and shadow.
92 OpenCV Book. Due out end of September. This tutorial is just a faint introduction; the new book will teach computer vision and its use via OpenCV: Learning OpenCV: Computer Vision with the OpenCV Library, O'Reilly Press.
93 Questions? BOOK: O'Reilly, late September 2008: Learning OpenCV: Computer Vision with OpenCV. OpenCV CODE: OpenCV WIKI: Willow Garage: garybradski@gmail.com
More informationBasic distinctions. Definitions. Epstein (1965) familiar size experiment. Distance, depth, and 3D shape cues. Distance, depth, and 3D shape cues
Distance, depth, and 3D shape cues Pictorial depth cues: familiar size, relative size, brightness, occlusion, shading and shadows, aerial/ atmospheric perspective, linear perspective, height within image,
More informationPractice Exam Sample Solutions
CS 675 Computer Vision Instructor: Marc Pomplun Practice Exam Sample Solutions Note that in the actual exam, no calculators, no books, and no notes allowed. Question 1: out of points Question 2: out of
More informationCameras and Radiometry. Last lecture in a nutshell. Conversion Euclidean -> Homogenous -> Euclidean. Affine Camera Model. Simplified Camera Models
Cameras and Radiometry Last lecture in a nutshell CSE 252A Lecture 5 Conversion Euclidean -> Homogenous -> Euclidean In 2-D Euclidean -> Homogenous: (x, y) -> k (x,y,1) Homogenous -> Euclidean: (x, y,
More informationAll human beings desire to know. [...] sight, more than any other senses, gives us knowledge of things and clarifies many differences among them.
All human beings desire to know. [...] sight, more than any other senses, gives us knowledge of things and clarifies many differences among them. - Aristotle University of Texas at Arlington Introduction
More informationPartitioning Data. IRDS: Evaluation, Debugging, and Diagnostics. Cross-Validation. Cross-Validation for parameter tuning
Partitioning Data IRDS: Evaluation, Debugging, and Diagnostics Charles Sutton University of Edinburgh Training Validation Test Training : Running learning algorithms Validation : Tuning parameters of learning
More informationStereo. Outline. Multiple views 3/29/2017. Thurs Mar 30 Kristen Grauman UT Austin. Multi-view geometry, matching, invariant features, stereo vision
Stereo Thurs Mar 30 Kristen Grauman UT Austin Outline Last time: Human stereopsis Epipolar geometry and the epipolar constraint Case example with parallel optical axes General case with calibrated cameras
More informationhttps://en.wikipedia.org/wiki/the_dress Recap: Viola-Jones sliding window detector Fast detection through two mechanisms Quickly eliminate unlikely windows Use features that are fast to compute Viola
More informationCHAPTER 3 DISPARITY AND DEPTH MAP COMPUTATION
CHAPTER 3 DISPARITY AND DEPTH MAP COMPUTATION In this chapter we will discuss the process of disparity computation. It plays an important role in our caricature system because all 3D coordinates of nodes
More informationFundamental matrix. Let p be a point in left image, p in right image. Epipolar relation. Epipolar mapping described by a 3x3 matrix F
Fundamental matrix Let p be a point in left image, p in right image l l Epipolar relation p maps to epipolar line l p maps to epipolar line l p p Epipolar mapping described by a 3x3 matrix F Fundamental
More informationLaser sensors. Transmitter. Receiver. Basilio Bona ROBOTICA 03CFIOR
Mobile & Service Robotics Sensors for Robotics 3 Laser sensors Rays are transmitted and received coaxially The target is illuminated by collimated rays The receiver measures the time of flight (back and
More information3D from Images - Assisted Modeling, Photogrammetry. Marco Callieri ISTI-CNR, Pisa, Italy
3D from Images - Assisted Modeling, Photogrammetry Marco Callieri ISTI-CNR, Pisa, Italy 3D from Photos Our not-so-secret dream: obtain a reliable and precise 3D from simple photos Why? Easier, cheaper
More informationCapturing, Modeling, Rendering 3D Structures
Computer Vision Approach Capturing, Modeling, Rendering 3D Structures Calculate pixel correspondences and extract geometry Not robust Difficult to acquire illumination effects, e.g. specular highlights
More informationApplication questions. Theoretical questions
The oral exam will last 30 minutes and will consist of one application question followed by two theoretical questions. Please find below a non exhaustive list of possible application questions. The list
More informationLecture 14: Basic Multi-View Geometry
Lecture 14: Basic Multi-View Geometry Stereo If I needed to find out how far point is away from me, I could use triangulation and two views scene point image plane optical center (Graphic from Khurram
More informationSingle-view 3D Reconstruction
Single-view 3D Reconstruction 10/12/17 Computational Photography Derek Hoiem, University of Illinois Some slides from Alyosha Efros, Steve Seitz Notes about Project 4 (Image-based Lighting) You can work
More informationFinal project bits and pieces
Final project bits and pieces The project is expected to take four weeks of time for up to four people. At 12 hours per week per person that comes out to: ~192 hours of work for a four person team. Capstone:
More informationImage Rectification (Stereo) (New book: 7.2.1, old book: 11.1)
Image Rectification (Stereo) (New book: 7.2.1, old book: 11.1) Guido Gerig CS 6320 Spring 2013 Credits: Prof. Mubarak Shah, Course notes modified from: http://www.cs.ucf.edu/courses/cap6411/cap5415/, Lecture
More informationStep-by-Step Model Buidling
Step-by-Step Model Buidling Review Feature selection Feature selection Feature correspondence Camera Calibration Euclidean Reconstruction Landing Augmented Reality Vision Based Control Sparse Structure
More informationStereo: Disparity and Matching
CS 4495 Computer Vision Aaron Bobick School of Interactive Computing Administrivia PS2 is out. But I was late. So we pushed the due date to Wed Sept 24 th, 11:55pm. There is still *no* grace period. To
More informationStereo Vision. MAN-522 Computer Vision
Stereo Vision MAN-522 Computer Vision What is the goal of stereo vision? The recovery of the 3D structure of a scene using two or more images of the 3D scene, each acquired from a different viewpoint in
More informationFace detection, validation and tracking. Océane Esposito, Grazina Laurinaviciute, Alexandre Majetniak
Face detection, validation and tracking Océane Esposito, Grazina Laurinaviciute, Alexandre Majetniak Agenda Motivation and examples Face detection Face validation Face tracking Conclusion Motivation Goal:
More informationUsing the Forest to See the Trees: Context-based Object Recognition
Using the Forest to See the Trees: Context-based Object Recognition Bill Freeman Joint work with Antonio Torralba and Kevin Murphy Computer Science and Artificial Intelligence Laboratory MIT A computer
More informationVision is inferential. (
Announcements Final: Thursday, December 15, 8am, here. Review Session, Wednesday, Dec 14, 1pm, AV Williams 4424. Review sheet with practice problems on-line. Hints for Final Focus on core techniques/ideas:
More informationVisual Odometry. Features, Tracking, Essential Matrix, and RANSAC. Stephan Weiss Computer Vision Group NASA-JPL / CalTech
Visual Odometry Features, Tracking, Essential Matrix, and RANSAC Stephan Weiss Computer Vision Group NASA-JPL / CalTech Stephan.Weiss@ieee.org (c) 2013. Government sponsorship acknowledged. Outline The
More informationTecnologie per la ricostruzione di modelli 3D da immagini. Marco Callieri ISTI-CNR, Pisa, Italy
Tecnologie per la ricostruzione di modelli 3D da immagini Marco Callieri ISTI-CNR, Pisa, Italy 3D from Photos Our not-so-secret dream: obtain a reliable and precise 3D from simple photos Why? Easier, less
More informationEE795: Computer Vision and Intelligent Systems
EE795: Computer Vision and Intelligent Systems Spring 2012 TTh 17:30-18:45 FDH 204 Lecture 12 130228 http://www.ee.unlv.edu/~b1morris/ecg795/ 2 Outline Review Panoramas, Mosaics, Stitching Two View Geometry
More informationHUMAN COMPUTER INTERFACE BASED ON HAND TRACKING
Proceedings of MUSME 2011, the International Symposium on Multibody Systems and Mechatronics Valencia, Spain, 25-28 October 2011 HUMAN COMPUTER INTERFACE BASED ON HAND TRACKING Pedro Achanccaray, Cristian
More informationComputer Graphics. Sampling Theory & Anti-Aliasing. Philipp Slusallek
Computer Graphics Sampling Theory & Anti-Aliasing Philipp Slusallek Dirac Comb (1) Constant & δ-function flash Comb/Shah function 2 Dirac Comb (2) Constant & δ-function Duality f(x) = K F(ω) = K (ω) And
More informationCV: 3D to 2D mathematics. Perspective transformation; camera calibration; stereo computation; and more
CV: 3D to 2D mathematics Perspective transformation; camera calibration; stereo computation; and more Roadmap of topics n Review perspective transformation n Camera calibration n Stereo methods n Structured
More informationEE795: Computer Vision and Intelligent Systems
EE795: Computer Vision and Intelligent Systems Spring 2012 TTh 17:30-18:45 FDH 204 Lecture 10 130221 http://www.ee.unlv.edu/~b1morris/ecg795/ 2 Outline Review Canny Edge Detector Hough Transform Feature-Based
More informationOutline. ETN-FPI Training School on Plenoptic Sensing
Outline Introduction Part I: Basics of Mathematical Optimization Linear Least Squares Nonlinear Optimization Part II: Basics of Computer Vision Camera Model Multi-Camera Model Multi-Camera Calibration
More informationComputer and Machine Vision
Computer and Machine Vision Lecture Week 12 Part-2 Additional 3D Scene Considerations March 29, 2014 Sam Siewert Outline of Week 12 Computer Vision APIs and Languages Alternatives to C++ and OpenCV API
More informationTA Section: Problem Set 4
TA Section: Problem Set 4 Outline Discriminative vs. Generative Classifiers Image representation and recognition models Bag of Words Model Part-based Model Constellation Model Pictorial Structures Model
More informationECE 172A: Introduction to Intelligent Systems: Machine Vision, Fall Midterm Examination
ECE 172A: Introduction to Intelligent Systems: Machine Vision, Fall 2008 October 29, 2008 Notes: Midterm Examination This is a closed book and closed notes examination. Please be precise and to the point.
More informationChapter 9 Object Tracking an Overview
Chapter 9 Object Tracking an Overview The output of the background subtraction algorithm, described in the previous chapter, is a classification (segmentation) of pixels into foreground pixels (those belonging
More informationVisual Tracking. Image Processing Laboratory Dipartimento di Matematica e Informatica Università degli studi di Catania.
Image Processing Laboratory Dipartimento di Matematica e Informatica Università degli studi di Catania 1 What is visual tracking? estimation of the target location over time 2 applications Six main areas:
More informationOverview. Augmented reality and applications Marker-based augmented reality. Camera model. Binary markers Textured planar markers
Augmented reality Overview Augmented reality and applications Marker-based augmented reality Binary markers Textured planar markers Camera model Homography Direct Linear Transformation What is augmented
More informationImage Formation I Chapter 1 (Forsyth&Ponce) Cameras
Image Formation I Chapter 1 (Forsyth&Ponce) Cameras Guido Gerig CS 632 Spring 215 cknowledgements: Slides used from Prof. Trevor Darrell, (http://www.eecs.berkeley.edu/~trevor/cs28.html) Some slides modified
More informationMulti-view Stereo. Ivo Boyadzhiev CS7670: September 13, 2011
Multi-view Stereo Ivo Boyadzhiev CS7670: September 13, 2011 What is stereo vision? Generic problem formulation: given several images of the same object or scene, compute a representation of its 3D shape
More informationColorado School of Mines. Computer Vision. Professor William Hoff Dept of Electrical Engineering &Computer Science.
Professor William Hoff Dept of Electrical Engineering &Computer Science http://inside.mines.edu/~whoff/ 1 Stereo Vision 2 Inferring 3D from 2D Model based pose estimation single (calibrated) camera Stereo
More informationShadows in the graphics pipeline
Shadows in the graphics pipeline Steve Marschner Cornell University CS 569 Spring 2008, 19 February There are a number of visual cues that help let the viewer know about the 3D relationships between objects
More informationCS201 Computer Vision Camera Geometry
CS201 Computer Vision Camera Geometry John Magee 25 November, 2014 Slides Courtesy of: Diane H. Theriault (deht@bu.edu) Question of the Day: How can we represent the relationships between cameras and the
More information