
3D omnistereo panorama generation from a monocular video sequence

Ning, Zhiyu U

A report for the COMP6703 eScience Project at the Department of Computer Science, Australian National University

Acknowledgement

I would like to thank my client and supervisor, Dr Hongdong Li, for giving me the chance to work on a computer vision project and for his support and inspiration. I would also like to thank my supervisor, Dr Alistair Rendell, and my lecturers, Dr Henry Gardner and Dr Pascal Vuylsteker, for their academic support and useful advice. Finally, thanks to my family and friends for all their support and help.

ABSTRACT

An interesting problem in computer vision is generating 3D video from traditional 2D video. To achieve this, we first need to understand how to use a single monocular camera to create a 3D scene sensation; the result is called a stereo panorama. In paper [1], Shmuel Peleg and his colleagues proposed an approach for generating a stereo panorama using only a single video camera rotating about an axis behind its lens, based on the X-slit camera principle of paper [2]. Specifically, the stereo panorama images are obtained by pasting together strips taken from each frame of the video sequence. However, they did not specify how to choose the strip width or the location of each strip, nor how these parameters affect the 3D sensation, and they gave no benchmark for judging the quality of a stereo image pair. The chosen strip width is clearly related to the camera's rotation speed, and the location of each strip is related to disparity. In this report, I use a virtual speed rather than the actual speed of the camera to determine the strip width, and I present the relationship between the virtual speed and the actual speed. I also analyse the location of each strip with respect to the actual 3D sensation of the stereo panorama images, and I give my own benchmark for distinguishing the quality of stereo image pairs.

Keywords: computer vision, stereo panorama, X-slit camera, parameters, disparity

CONTENTS

Acknowledgement
ABSTRACT
1 INTRODUCTION
  1.1 General introduction
  1.2 Stereo Panorama images
  1.3 Camera classifications
  1.4 Sharp RD3D
  1.5 Structure
2 BACKGROUNDS
  2.1 X-slit camera
  2.2 Camera calibration
  2.3 Single viewpoint projections
  2.4 Multiple viewpoint projections
  2.5 Benchmark on stereo panorama image quality
    2.5.1 Disparity
    2.5.2 Algorithm built
  2.6 Sharp RD3D API
3 CLIENT REQUIREMENT SPECIFICATIONS
  3.1 Theory analysis requirement
  3.2 Implementation requirement
4 PROJECT PLAN
  4.1 Milestone
  4.2 Future work
5 THEORY ANALYSES AND MODELING
  5.1 Methodology
  5.2 Parameter analysis
  5.3 Image points matching
  5.4 Algorithm building
  5.5 Modeling
6 IMPLEMENTATIONS
  6.1 Initial implementation methods
  6.2 Software introduction
  6.3 Input specification
  6.4 Output results
  6.5 Testing on Sharp RD3D API
7 RESULTS AND ANALYSIS
  7.1 Testing effect
  7.2 Result analysis and improvement
8 CONCLUSIONS
APPENDIX
Bibliography

1 INTRODUCTION

1.1 General introduction

A normal panoramic image covers 360 degrees of a natural scene. Traditionally, it is captured either with a camera that has a special lens or with multiple cameras. Nowadays, however, software can automatically generate a 360-degree panoramic image from a set of images, covering the whole panoramic scene, taken by a video camera. A problem arises when we try to use this kind of software to generate a stereo panoramic image, which consists of a pair of images, one for left-eye viewing and the other for right-eye viewing. This is because an ordinary panoramic image is generated from approximately a single viewpoint, whereas stereo panorama images are generated from two different viewpoints simulating the locations of the human eyes. One proposed approach therefore uses two cameras to simulate the two eyes when capturing the images. However, this approach is more complicated and hard for ordinary users to implement.

Shmuel Peleg proposed a new approach [1] to generate stereo panorama images using multiple viewpoint projections, which he called circular projections. It uses a single video camera rotating about a vertical axis behind its lens to capture a sequence of images with different viewpoints. This approach can generate a stereo panorama in accordance with the X-slit camera principle. An X-slit camera [2] has two slits rather than the single pinhole of a traditional camera; the details are presented in chapter 2.1. However, the parameters of the method and their effects on the 3D sensation are not discussed in [1].

To understand why a single camera can simulate a two-camera system, you need the basic principle of the X-slit camera and of multiple viewpoint projections, both of which are introduced in chapter 2. The section on camera calibration explains how a real camera works and what its critical parameters are.

Moreover, to implement the approach in [1], we need to know something about how to benchmark stereo panorama image quality and about the SHARP RD3D, the world's first laptop that presents 3D images without special glasses, which is used to display the test results. These are also presented in chapter 2. In the rest of this chapter, short introductions are given to stereo panorama images, camera classifications and the SHARP RD3D, followed by the structure of the remaining chapters.

1.2 Stereo Panorama images

A stereo panorama consists of a pair of images, one for left-eye viewing and the other for right-eye viewing. Viewed together they produce a stereo sensation of the panoramic scene. Figure 1.1 illustrates the principle of 3D imaging.

FIGURE 1.1

1.3 Camera classifications

In theory, we can define only one kind of camera: the X-slit camera [2, 3], which has two slits inside. The details are introduced in chapter 2.1. The familiar pinhole camera is just a special kind of X-slit camera whose two slits intersect and are orthogonal to each other. The parameters of the camera directly affect the critical parameters in this project, so they are quite important. The process of estimating the intrinsic and extrinsic parameters of a camera is called camera calibration, which is introduced in chapter 2.2.

1.4 Sharp RD3D

The Sharp RD3D is the world's first auto-stereo 3D notebook computer. Sharp Corporation's TFT 3D LCD technology makes it possible to view eye-popping 3D images without special glasses [6]. We will test our results on the Sharp RD3D laptop, and in chapter 2.6 we will discuss the 3D image generation API of the Sharp RD3D notebook computer.

1.5 Structure

This report is divided into eight chapters. Chapter 1 provides an introduction to the project. Chapter 2 provides background information relevant to the project. Chapter 3 addresses the client's requirement specification. Chapter 4 compares my initial project plan with the actual timetable. Chapter 5 describes the approach proposed by Shmuel Peleg that is used to achieve the target, together with my extension studies, and then presents my algorithm for achieving the target. Chapter 6 describes the experimental setup for the algorithms proposed in chapter 5. Chapter 7 evaluates the results and suggests future work. Finally, chapter 8 gives the conclusions of the report.

2 BACKGROUNDS

This chapter addresses the background knowledge relevant to the project. Chapter 2.1 introduces the X-slit camera and how a single camera can simulate the effect of an X-slit camera. Chapter 2.2 introduces the parameters of a regular camera and how to define them mathematically. Chapter 2.3 discusses the traditional projection used when capturing panoramic images, the single viewpoint projection. Chapter 2.4 discusses several kinds of multiple viewpoint projection and the one applied in this project. Chapter 2.5 gives a benchmark for judging the quality of a stereo panorama image, and finally chapter 2.6 explains the Sharp RD3D 3D imaging API.

2.1 X-slit camera

The first physical X-slit camera was designed by one of the pioneers of colour photography, Ducos du Hauron [3], in the 19th century. The X-slit camera model is shown in Figure 2.1. The general X-slit camera is designed with two arbitrary slits l1 and l2: the projection ray of a 3D point P first intersects slit l2, then intersects slit l1, and finally reaches the image plane.

FIGURE 2.1

In particular, if the two slits lie in the same plane and are orthogonal to each other, a pinhole camera is produced. I refer the reader to [8] for a detailed description of X-slit camera rendering and to [2, 7] for how a translating pinhole camera can simulate the effects produced by X-slit cameras with different parameters. Here I only introduce the basic knowledge of the X-slit camera that is applied in this project. According to [2, 7, 8], novel views in X-slit camera rendering are created by sampling column strips from the input images, as shown in figure 2.2.

FIGURE 2.2

The width of the column strips is related to the range of pinhole camera positions and to the width of each pinhole image. The width of the column strips can therefore vary, as shown in figure 2.3; it is determined by the speed of the translating camera. In particular, if the pinhole camera translates at constant speed, the column strips all share the same width.

FIGURE 2.3

The location of the column strips is related to the location of the virtual camera's vertical slit, as shown in figure 2.4. Therefore, if the orientation and location of the vertical slit vary, the novel views created will differ from each other.

FIGURE 2.4

As mentioned above, both the width and the location of the column strips are related to camera parameters such as the focal length of the aperture or vertical slit, the horizontal viewing angle of the camera and the resolution of the camera. We can therefore conclude that if we need to define the width and location of the column strips accurately, we should define the camera parameters first; that is camera calibration. A minimal sketch of the strip-sampling idea is given below.
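To make the strip-sampling idea concrete, the MATLAB fragment below assembles a mosaic by taking one column strip from each image of a translating (or rotating) pinhole camera. It is only a sketch of the principle described above: the variable frames is assumed to hold the input images, and stripWidth and stripCenter are placeholder values rather than the parameters derived later in chapter 5.

    % Minimal sketch: build a mosaic by pasting one column strip per frame.
    % 'frames' is assumed to be a cell array of same-sized RGB images read
    % from a video beforehand; stripWidth and stripCenter are illustrative.
    stripWidth  = 8;                 % strip width in pixels (placeholder)
    stripCenter = 320;               % column about which each strip is cut (placeholder)

    mosaic = [];
    for k = 1:numel(frames)
        img = frames{k};
        c1  = stripCenter - floor(stripWidth / 2);
        c2  = c1 + stripWidth - 1;
        mosaic = [mosaic, img(:, c1:c2, :)];   % append this frame's strip
    end
    imshow(mosaic);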

2.2 Camera Calibration

In this section, I introduce the mathematical definition [5] of the geometric camera model as well as the mathematical method for estimating the camera parameters. I then discuss the parameters that need to be defined in my project and propose a future research extension of the project.

2.2.1 Geometric camera models

Mathematically, two kinds of camera model can be defined [5]: the camera with perspective projection and the camera with affine projection. In my opinion, the latter is just a special case of the former, a reasonable approximation of perspective projection when the observed objects lie at an approximately constant distance from the camera. Reference [5] also divides the camera parameters into intrinsic parameters, which relate the camera's coordinate system to an idealised image coordinate system, and extrinsic parameters, which relate the camera's coordinate system to a fixed world coordinate system and specify its position and orientation in space. From these definitions, [5] obtains an equation that expresses the image projection of a 3D point P given in the camera coordinate system using only the camera's intrinsic parameters, as follows.
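Following the standard formulation in [5] (written here as a sketch consistent with the symbols described below, rather than as a verbatim copy of the original equation):

\[
\mathbf{p} \;=\; \frac{1}{z}\,\mathcal{M}\,\mathbf{P},
\qquad
\mathcal{M} \;=\; \bigl(\;\mathcal{K}\;\;\mathbf{0}\;\bigr),
\qquad
\mathcal{K} \;=\;
\begin{pmatrix}
\alpha & -\alpha\cot\theta & u_0\\[2pt]
0 & \dfrac{\beta}{\sin\theta} & v_0\\[2pt]
0 & 0 & 1
\end{pmatrix},
\]

where p = (u, v, 1)^T is the homogeneous image point, P is the homogeneous 3D point in the camera frame and z is its depth.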

Here P = (x, y, z, 1)^T denotes the homogeneous coordinate vector of P in the camera coordinate system, and M denotes the 3x4 matrix built from K. K depends on the camera's intrinsic parameters α, θ, u0, β and v0. α and β are the magnification values that convert metres to pixel units, with α = kf and β = lf, where f is the camera's focal length and k, l are scale parameters. u0 and v0 are the offsets that account for the difference between the centre of the CCD array and the principal point C0, which is the point where the camera's optical axis pierces the physical retina, as shown in figure 2.5.

FIGURE 2.5, taken from [5]

θ is a further offset that accounts for manufacturing error in the camera coordinate system; it is the angle between the two image axes when they are not exactly 90 degrees apart. Reference [5] then obtains an equation that expresses the projection of a 3D point P using both the intrinsic and the extrinsic parameters, as follows.
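Again sketching the standard form of [5] with the notation described in the following paragraph, the projection of a point P given in world coordinates is

\[
\mathbf{p} \;=\; \frac{1}{z}\,\mathcal{M}\,\mathbf{P},
\qquad
\mathcal{M} \;=\; \mathcal{K}
\begin{pmatrix}
\mathbf{r}_1^{T} & t_x\\
\mathbf{r}_2^{T} & t_y\\
\mathbf{r}_3^{T} & t_z
\end{pmatrix}
\;=\; \mathcal{K}\,\bigl(\,\mathcal{R}\;\;\mathbf{t}\,\bigr),
\]

where z is now the depth of P in the camera frame.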

The notation R is a rotation matrix that defines the rotation of the camera with respect to the world coordinate system, and t is a translation vector. r1^T, r2^T and r3^T denote the three rows of R, and tx, ty and tz are the coordinates of t in the frame attached to the camera.

2.2.2 Geometric camera calibration

Geometric camera calibration is the process of estimating the intrinsic and extrinsic parameters of a camera. In my project it is essential to know some of the camera parameters. Generally speaking, I need to know the camera's horizontal viewing angle and its focal length when capturing the input video sequence, as shown in figure 2.6, and I also want to know the input image size measured in pixels. These all belong to the camera's intrinsic parameters.

FIGURE 2.6, taken from [5].

In figure 2.6, 2φ is the camera's viewing angle, f is the focal length when capturing the current images, and d is the diameter of the film. I captured the input video sequence with my own camera, so I know the camera parameters exactly and there is no need to estimate them. In the future, however, if the study is extended, it will be essential to understand how to estimate the parameters needed in the project; only then can a good-quality stereo panorama video (quality is addressed in chapter 2.5) be generated from any input video sequence.

Reference [5] defines a least-squares parameter estimation method for estimating the intrinsic and extrinsic parameters; the details can be found in chapter 3 of [5]. It assumes that the camera observes several geometric features whose locations are known in a fixed world coordinate system. It then computes the perspective projection matrix M (chapter 2.2.1) associated with the camera and recovers the intrinsic and extrinsic parameters from this matrix. This is called a linear approach to camera calibration [5].
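As a sketch of that linear approach (my own summary of the standard formulation rather than a quotation from [5]): each known 3D point P_i with observed image coordinates (u_i, v_i) constrains the rows m_1^T, m_2^T, m_3^T of M through

\[
u_i = \frac{\mathbf{m}_1^{T}\mathbf{P}_i}{\mathbf{m}_3^{T}\mathbf{P}_i},
\qquad
v_i = \frac{\mathbf{m}_2^{T}\mathbf{P}_i}{\mathbf{m}_3^{T}\mathbf{P}_i}
\;\;\Longrightarrow\;\;
\begin{cases}
\mathbf{m}_1^{T}\mathbf{P}_i - u_i\,\mathbf{m}_3^{T}\mathbf{P}_i = 0,\\[2pt]
\mathbf{m}_2^{T}\mathbf{P}_i - v_i\,\mathbf{m}_3^{T}\mathbf{P}_i = 0.
\end{cases}
\]

Stacking these two linear equations for every correspondence gives a homogeneous system in the twelve entries of M, which is solved in the least-squares sense (for example with the singular value decomposition), and the intrinsic and extrinsic parameters are then factored out of the estimated M.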

2.3 Single viewpoint projections

A single viewpoint projection is, literally, a view from a single fixed point in the world coordinate system, as shown in figure 2.7: all the projection rays intersect at one fixed centre point, the pinhole of the camera.

FIGURE 2.7

Currently, software can create a panoramic image using the single viewpoint projection principle. One example is Canon's free software PHOTOSTITCH 3.1, an easy-to-use program that reads in a set of images and stitches them together to form a full 360-degree panoramic image. The final panorama looks like a picture taken by a special camera fixed at a centre point of the panoramic scene, similar to the pinhole viewpoint in figure 2.7. However, such software can only generate a single panoramic image rather than a pair, because it is based on the single viewpoint projection principle. To generate a stereo pair of panoramic images, we need software based on the multiple viewpoint principle, in particular on a two-viewpoint view.

2.4 Multiple viewpoint projections

Before discussing multiple viewpoint projections, consider nature. Humans have two eyes, and most animals have at least two eyes and move their heads when looking for food or anything else. The reason is that they need to analyse the environment, or, in other words, to perceive the depth of the scene, that is, the location of each object in it. Translating this to image processing, we cannot determine the depth of a scene point along its projection ray from a single image; we need at least two pictures, and then depth can be measured through triangulation [5]. To generate 3D video, depth information is very important. An object at effectively infinite depth in the scene is perceived by our eyes with zero disparity, while an object closer to us is perceived with greater disparity. This is why you see two forefingers when you hold one forefinger in front of you at a small distance. Therefore, to generate stereo panorama images, we should first understand multiple viewpoint projection.

Multiple viewpoint projection literally refers to viewing a scene from different points at the same time. Figure 2.8 illustrates two-viewpoint projection: the 3D point P is projected onto the image planes of a viewing camera with optical centre O and a second viewing camera with optical centre O′.

FIGURE 2.8, taken from [5].

With this simple two-viewpoint projection model we are able to perceive a real 3D object, because the two cameras simulate our two eyes viewing the object, giving us the information needed to judge the depth of every point on it; we can then actually see the object in 3D. In my project I use a camera rotating about an axis behind its lens, which is also a kind of multiple viewpoint projection, as shown in figure 2.9.

FIGURE 2.9

Since the camera rotates around the axis in figure 2.9, every camera location is different. However, two locations that are close to each other share part of the scene (the shaded part in figure 2.9). It is therefore possible to create a 3D view of the scene, because all the information needed exists in the image sequence. However, I do not use the shared (shaded) part as the information for creating the 3D scene; I use another approach, shown in figure 2.10.

FIGURE 2.10, taken from [10].

Recalling the X-slit camera principle introduced in chapter 2.1, when I rotate the camera around the central axis I can imagine two slit cameras capturing the scene at the same time, as shown in figure 2.10: the right-eye projection ray and the left-eye projection ray. After rotating through a full circle they cover the full 360-degree scene, and these two slit cameras play the role of our two eyes capturing the scene. We therefore obtain two sets of image strips, one from the right-eye projection rays and the other from the left-eye projection rays, and from these we can build a stereo panorama image pair. But how can we tell whether the stereo panorama has good quality? I propose an approach that is introduced in the next section.

2.5 Benchmark on stereo panorama image quality

In this section I introduce the approach I use to define panorama image quality. Using this benchmark, I can tell which panorama images have good quality and which do not.

2.5.1 Disparity

Disparity refers to the difference between the images seen by the left and right eyes, which the brain uses as a binocular cue to determine the depth or distance of an object [11], as shown in figure 2.11.

FIGURE 2.11, taken from [12].

According to the definition above, when we perceive the ball in figure 2.11 we rely on the difference between the images that our left and right eyes actually see. The same principle underlies 3D displays: the ball is displayed on the screen by extending the lines from the eyes to the ball and intersecting them with the screen. In my project the system behaves much like this human example: two virtual slit cameras take pictures of the same scene but are displaced by a certain distance, exactly like our eyes (figure 2.10), and the computer then computes depth from the two sets of image strips. As mentioned above, disparity plays an important role in forming the 3D objects that we see, so disparity is essential when defining the quality of a stereo image pair.

2.5.2 Algorithm built

My approach to defining a stereo image pair's quality is based on the algorithm below.

1. Input a stereo image pair.
2. Display both images at the same time with pixel information turned on, i.e. an axis that marks the width of each image in pixels, as shown in figure 2.12.
3. Find the same point on the object closest to the cameras (refer to chapter 5.3) in both images and compute the pixel difference a, that is OCI1 - OCI2 = a, where OCI1 is the pixel coordinate on the axis of the first image for the object close to the camera, and OCI2 is the corresponding value in the second image.
4. Find the same point on the object farthest from the cameras (refer to chapter 5.3) in both images and compute the pixel difference b, that is OFI1 - OFI2 = b, where OFI1 is the pixel coordinate on the axis of the first image for the object far from the camera, and OFI2 is the corresponding value in the second image.
5. Compare the value a with Δ1 and the value b with Δ2, where Δ1 and Δ2 are values that need to be obtained from experiment; the details are addressed in the discussion that follows.

6. If a ≈ Δ1, or a is only a little bigger than Δ1, and b ≈ Δ2, or b is only a little smaller than Δ2, the stereo image pair is judged to be of good quality. In particular, if the far objects lie at an almost infinite distance from the camera, then Δ2 ≈ 0 and b ≈ 0, which is what we call zero disparity.

FIGURE 2.12, taken from my home.

In my benchmark, then, I need to define a reasonable Δ1 and a reasonable Δ2. Recall that we see an object very close to us with great disparity and an object at infinite distance with zero disparity; as an object's distance grows, its disparity becomes smaller and smaller. As shown in figure 2.13, by the similar-triangle principle D/b ≈ f/d, where D is the radial distance, b is the distance between the two cameras that simulate the human eyes, f is the camera's focal length and d is the disparity. In addition, O and O′ mark the object the cameras are looking at, and D is the distance between the object and the cameras.

We can therefore relate an object's distance D to its disparity: the total disparity is d ≈ bf/D, so the offset seen in each single image is about d/2 = bf/(2D), measured in metres.

FIGURE 2.13, modified from [19].

From this I obtain the expression for my thresholds, Δ = λbf/(2D), where λ is a ratio measured in pixels per metre. The value of λ is determined by the camera's intrinsic parameters, as shown in figures 2.6 and 5.6, and the resulting relationship between distance and disparity is shown in figure 2.14.

FIGURE 2.14

Therefore Δ1 and Δ2 can be obtained from the expression above once the depth of the object is known (the depth can be obtained from two images with about 2/3 overlap, as proposed in [20]). Note that this algorithm cannot be applied to points lying in the plane exactly midway between the two cameras, shown in figure 2.13 as the plane perpendicular to b that contains the line OO′, because all points on that line are seen by the two cameras with no disparity. This plane is therefore called the zero-disparity plane. A small sketch of the benchmark computation is given below.
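The following MATLAB fragment sketches the benchmark check described above. All the numerical values are placeholders: lambda, the baseline, the focal length, the object depths and the tolerance are assumptions chosen only to illustrate how Δ1 and Δ2 are computed and compared with the measured pixel differences a and b.

    % Sketch of the stereo-quality benchmark (illustrative values only).
    lambda   = 150000;  % pixels per metre on the image plane (assumed)
    baseline = 0.065;   % distance b between the two virtual cameras, metres
    focal    = 0.006;   % focal length f, metres (assumed)
    Dnear    = 1.5;     % depth of the nearest object, metres (assumed)
    Dfar     = 50;      % depth of the farthest object, metres (assumed)

    % Thresholds from Delta = lambda*b*f/(2*D)
    delta1 = lambda * baseline * focal / (2 * Dnear);   % near-object threshold, pixels
    delta2 = lambda * baseline * focal / (2 * Dfar);    % far-object threshold, pixels

    % Measured pixel differences of matched points (placeholders)
    a    = 20;          % near-object difference
    bFar = 0.5;         % far-object difference

    tol = 0.2;          % allowed relative deviation (assumed)
    goodNear = a >= delta1 * (1 - tol) && a <= delta1 * (1 + tol);
    goodFar  = bFar <= delta2;          % far difference should not exceed its threshold
    if goodNear && goodFar
        disp('Stereo pair passes the disparity benchmark.');
    else
        disp('Stereo pair fails the disparity benchmark.');
    end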

2.6 Sharp RD3D API

In this section I introduce the environment used to test my results, which is based on the Sharp RD3D laptop. As introduced in chapter 1, the Sharp RD3D, the world's first 3D-display laptop, lets people perceive a 3D effect without glasses thanks to a special layer consisting of a series of specially designed vertical slits called a parallax barrier, as shown in figure 2.15.

FIGURE 2.15, taken from [21].

With the parallax barrier turned off, both eyes receive the same light from the screen; with the parallax barrier turned on, the two eyes receive separate light from the screen because of the barrier, and we therefore perceive a 3D sensation without special glasses. The Sharp RD3D API is an application platform that can display the pair of stereo panorama images. To make the 3D sensation work, the two images must be mosaicked in accordance with Sharp's special interleaving technology.

3 CLIENT REQUIREMENT SPECIFICATIONS

Client requirements are the most important guidance for the project objectives and also define the compulsory content of the project. My client requires not only a practical implementation of Peleg's theory but also an extension of it; to finish the project successfully I need to fulfil both.

3.1 Theory analysis requirement

1. Understand the basic principle of the X-slit camera and its projections.
2. Understand how X-slit camera projections can be achieved with a single camera.
3. Understand how the X-slit camera principle can be applied to stereo panorama image generation, and learn the approach proposed in [1].
4. Do extension research and provide a detailed mathematical derivation of how to choose the parameters proposed in [1], such as the strip width and the distance between the two strips.
5. Give a theoretical analysis of how the parameters affect the stereo sensation.
6. Synthesise the theory and provide a detailed algorithm for generating a stereo panorama with a single camera.

3.2 Implementation requirement

1. Understand the Matlab toolboxes for image processing and video processing.
2. Capture a video sequence with a single monocular camera as the input video.
3. Use the algorithm built to write Matlab code that processes the input video into a stereo panorama image pair.
4. Study the 3D display API for the Sharp RD3D laptop.
5. Use the stereo panorama images to generate a 3D video or virtual 3D scene on the Sharp RD3D laptop.
6. Improve the stereo sensation.

4 PROJECT PLAN

This chapter describes the initial project plan and the alternative plan actually followed during the project's development, and then addresses future work extending the current project.

4.1 MILESTONE

INITIAL PLAN:

Understanding the papers needed for this project: August 18th, 2006
Modeling and building the algorithm: September 8th, 2006
Implementing and debugging: October 8th, 2006
Testing and analyzing: October 22nd, 2006
Finalizing report: October 29th, 2006

ACTUAL PLAN:

Understanding the papers needed for this project: August 8th, 2006
Modeling and building the algorithm: September 18th, 2006
Implementing and debugging: October 10th, 2006
Testing and analyzing: October 20th, 2006
Finalizing report

4.2 FUTURE WORK

Because of the time limit for this project, there is a lot that could be extended from what I have done this semester. This section addresses part of it.

For the software part: firstly, the Matlab code should be converted to C++ code, because Matlab is convenient for researchers rather than for general users, and in my project a Matlab program runs much more slowly than a C++ program would. Secondly, an extra function could be built into the software to measure the quality of the stereo image pair automatically. Thirdly, another function could apply the camera calibration principle to estimate the camera's intrinsic and extrinsic parameters; with it, the software could process input from general users rather than input from one specific camera model.

For the research part: firstly, try to extend 3D image creation from a constrained camera track (rotation about an axis behind the lens) to free camera movement. Secondly, try to handle dynamic scenes rather than only the static scenes in my project. Thirdly, try to build an algorithm for general 2D-to-3D video conversion.

5 THEORY ANALYSES AND MODELING

This chapter first introduces the method used in this project, then focuses on the steps for choosing the parameters of the method and the techniques used when specifying those parameters. Finally, it gives the algorithm for achieving the goal and models the prototype based on that algorithm.

5.1 Methodology

Several approaches can be used to generate a stereo panorama image. A common one is proposed in [10], which uses two cameras simulating the two human eyes and rotating around an axis, as shown in figures 5.1 and 5.2.

FIGURE 5.1, a two-camera device, taken from [10].

FIGURE 5.2, the methodology of using two cameras to generate a stereo panorama image, taken from [10].

As shown in figures 5.2 and 2.9, the two cameras share some common information about the same object and each has some unique information about it. The system can therefore compute the depth of the object from this information and generate a 3D image pair. However, it is not easy for everyone to obtain a device like the one in figure 5.1, so it is difficult for a common user to generate an image with 3D sensation. In this project I use the technique proposed in [1, 13], based on the principle of [8], to generate a stereo panorama image pair using only one camera. Recalling the knowledge introduced in chapters 2.1 and 2.4, we can set up a prototype in which two virtual slit cameras behind the physical camera capture images while the physical camera records the scene as it rotates around an axis, as shown in figure 5.3.

FIGURE 5.3, taken from [13].

The letter O stands for the physical camera's optical centre, and VL and VR stand for the virtual slit cameras, where VL plays the role of the human left eye and VR the right eye. 2d is the distance between the two virtual cameras, and r is the physical camera's rotation radius. With the prototype of figure 5.3, rotating the physical camera about the axis captures a full 360-degree video sequence, giving us a set of images; in each image, one part is regarded as captured by each of the slit cameras. Recalling chapter 2.1, the slit cameras actually capture vertical strips of a single frame. In figure 5.3 we join the projection rays from VL and VR to the physical camera's optical centre and extend them to the image plane, obtaining the two corresponding vertical strips. Hence, over a full 360-degree rotation of the physical camera we obtain two sets of strips, one captured by the left-eye slit camera and the other by the right-eye slit camera. After stitching each set of strips together, we obtain two panorama images, one for left-eye viewing and the other for right-eye viewing. This approach is simple and direct; even common users can produce their own stereo pair with it. How to choose the strip locations in the image plane, the width of the strips and the camera's rotation radius therefore becomes an important issue, and whether the final stereo panorama generates a strong 3D sensation becomes an interesting topic. These questions are addressed in the next section.

5.2 Parameter analysis

In this section I analyse, in six steps, the parameters needed to create a stereo panorama image pair based on figure 5.3, as shown in figure 5.4.

FIGURE 5.4, derived from figure 5.3.

Firstly, to simulate the function of the human eyes closely, the distance between the two virtual slit cameras should be roughly the same as the human inter-eye distance, so 2d should be in the range of about 6.5 to 7 cm.

Secondly, we need to know the camera's rotation radius r. In this project I set r myself, so I know its exact value; in the future, software could be built to estimate the value of r + f, where f is the physical camera's focal length, by comparing several images containing the same objects, as addressed in chapter 4.2.

Thirdly, I use a Canon IXUS 4.0 as my physical camera (figure 5.5), so I know its focal length when recording video and the value of f is therefore known.

In the future, however, the camera calibration principle could be applied to estimate the camera's intrinsic parameters.

FIGURE 5.5

Fourthly, from the similar-triangle principle we get 2v = 2d·f / r, in millimetres, so the metric distance between the two vertical strips is computable. However, because I use a digital camera I can only read pixel values from the image, so I need to convert the millimetre value to pixels. Based on figure 2.6, I build a simple geometric model, shown in figure 5.6.

FIGURE 5.6, simple camera model.

The angle 2θ is the physical camera's horizontal viewing angle.

From this simple model we easily get the image plane width w = 2f·tan(θ). Dividing the 2v obtained in step four by w gives the fraction of the image width occupied by the separation between the two vertical strips; since the width of the image in pixels is known, multiplying that width by this fraction gives the distance between the two vertical strips in pixels.

Fifthly, to obtain the width of each vertical strip, I define a virtual speed of the camera rather than using its actual speed. The virtual speed is measured in pixels per second instead of millimetres per second. This is convenient because the value can be used directly to determine the width of the vertical strip in pixels, with no millimetre-to-pixel conversion, and the virtual speed is also easier to detect than the actual speed. When we rotate the camera clockwise, the images we obtain appear to move to the left, as shown in figure 5.7.

FIGURE 5.7

The point P's original position is occupied by another point P′, and P moves to the left-hand side of P′; the distance between P and P′ is the distance P has moved in one frame time. Assuming the camera's actual speed is constant, we obtain the virtual camera speed, in pixels per second, by multiplying this distance by the frame rate (24 frames per second for film, 25 to 30 for TV, and 20 frames per second in my project). A short sketch of these computations is given after this paragraph.
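The following MATLAB fragment sketches steps one to five for one pair of frames. The camera values (focal length, rotation radius, viewing angle, image width, frame rate) and the measured shift are illustrative assumptions, not the exact values of the Canon IXUS used in the project.

    % Sketch of the parameter computations in section 5.2 (assumed values).
    eyeDist    = 0.065;            % 2d: virtual inter-camera distance, metres
    r          = 0.30;             % rotation radius of the physical camera, metres
    f          = 0.006;            % focal length while recording, metres
    theta      = deg2rad(25);      % half of the horizontal viewing angle
    imgWidthPx = 640;              % image width in pixels
    fps        = 20;               % frame rate used in this project

    % Step 4: separation of the two strips, first in metres then in pixels.
    sep_m  = eyeDist * f / r;                 % 2v = 2d*f/r on the image plane
    w_m    = 2 * f * tan(theta);              % physical image plane width
    sep_px = round(imgWidthPx * sep_m / w_m); % strip separation in pixels

    % Step 5: strip width from the virtual speed between two adjacent frames.
    shift_px   = 6;                           % measured horizontal shift of P (placeholder)
    vSpeed     = shift_px * fps;              % virtual speed in pixels per second
    stripWidth = shift_px;                    % strip width equals the per-frame shift

    fprintf('strip separation = %d px, strip width = %d px, virtual speed = %d px/s\n', ...
            sep_px, stripWidth, vSpeed);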

However, the camera's actual speed is not always the same, so we need to detect the virtual speed at every point of the camera's motion. We can do this using image point matching, which is addressed in the next section. The distance between P and P′ is thus the value we need in order to determine the width of the vertical strips; we can also obtain a vertical virtual speed if the camera vibrates while rotating.

Finally, we need to discuss whether the values obtained above can actually generate a strong 3D sensation. Before that, we should understand the basic principle of how a 3D movie is played. The 3D movie process has two major parts. One is the generating part already discussed, using several cameras or the approach addressed in chapter 5.1 (the left part of figure 5.8). The other is the playing process, that is, the audience physically watching the movie (the right part of figure 5.8). To control the effect exactly, all the parameters of the left part and the right part of figure 5.8 would have to be exactly the same.

FIGURE 5.8

That is, camwide = eyewide, D = D′ (where D is the distance between the cameras and the object, and D′ is the perceived distance between the object and us), and a = B (where a is the camera's viewing angle and B is the human eye's viewing angle). If we can meet every one of these conditions, we can control exactly the distance by which object A jumps out of the screen; it is the same as the distance between A and the image plane.

In practice, however, we cannot meet all the conditions. The condition camwide = eyewide is easy, as defined in the first step of chapter 5.2. But a = B is almost impossible, because everybody's eyesight differs, and D = D′ is also very hard to achieve, because people cannot be forced to sit at a fixed location. Therefore, to use the parameters discussed in the six steps to generate a reasonable 3D sensation, we need an alternative to simply meeting the three conditions above. Here, a reasonable 3D sensation means that the audience can feel a reasonable amount of pop-out: the object appears to jump out of the screen by a certain amount, and the audience would like to reach out and touch it, as in figure 5.9.

FIGURE 5.9, taken from [6].

My alternative way to generate a reasonable 3D sensation is to adjust the parameters discussed in the six steps so as to compensate for the differences between D and D′ and between a and B. First, consider figure 5.10.

FIGURE 5.10

From figure 5.10 we can conclude that if D′ < D or B < a, the pop-out effect of object A is reduced, which may cause the 3D sensation to fail. To avoid this we could simply increase the size of the screen or ask the audience to sit closer to it. In my project, however, I can control neither of these; what I can control is the parameters. I can therefore increase the distance between the two virtual slit cameras to improve the 3D effect, as shown in figure 5.11.

FIGURE 5.11

The effect of increasing the distance between the left slit camera and the right slit camera is that we increase the perceived distance D′ to make it greater than D, adjusting the pop-out effect.

5.3 Image points matching

In this section I introduce the method used to detect the virtual speed discussed in the previous section, namely image point matching. There are currently two popular methods for image point matching: one is the L.K. (Lucas-Kanade) feature tracker and the other is the Harris corner detector. A newer method [14], proposed in 2006, is in my opinion the best of all: it detects invariant features of objects to match the same points, so it can be used for advanced searches that automatically look for the same points in an unordered set of images. In my project, however, the order of the input images is known and only a simple matching principle is needed, so I choose the Harris corner approach, which is easy to understand and to use. The details of the Harris corner detector are given in [15]. With this approach we can find feature points in an image, such as edges or intersection points. We can then generate putative matches between feature points detected in two images by looking for points that are maximally correlated with each other within windows surrounding each point, and only the points that correlate most strongly with each other in both directions are returned [20], as shown in figure 5.12.

FIGURE 5.12

In figure 5.12, A, B, C and D are feature points detected by the Harris corner detector; only when the correlation relationship is satisfied in both directions within a specific window around the points can we say that A is the same point as B. In computer vision the matching is generally carried out with a correlation (convolution-based) computation. As mentioned above, there are only two steps to determine the virtual speed in this project. The first step is to use the Harris corner approach to detect each frame's feature points and record their row and column positions. The second step is to compare the feature points of two adjacent frames, compute the difference in the horizontal and vertical coordinates of the matched points, and then multiply it by the frame rate (equivalently, divide by the time between frames) to obtain the virtual speed in pixels per second.

5.4 Algorithm building

In this section I bring together the analysis above to provide the algorithm for generating a stereo panorama image pair that is used in the implementation section. There are a few steps, listed below, followed by a short sketch of the pipeline.

Firstly, input a video sequence recorded using the setup of figures 5.3 and 5.4.

Secondly, extract each frame from the input video and save the frames as an image cube, as shown in figure 5.13.

FIGURE 5.13, taken from [10].

Thirdly, find the matching image points using the Harris corner approach and the correlation relationship, and compute the virtual speed.

Fourthly, compute the width of each vertical strip from the virtual speed, and compute the locations of the vertical strips. Specifically, the width of the vertical strip in the first frame is chosen from the virtual speed derived from the first and second frames, and is the same as the width used in the second frame.

Fifthly, stitch the strips together to form a pair of panorama images.
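The fragment below is a minimal MATLAB sketch of this five-step pipeline, not the actual project code. It assumes an uncompressed AVI input and a MATLAB version with VideoReader and the Image Processing Toolbox, and it estimates the per-frame shift by normalised cross-correlation of a central block rather than by full Harris corner matching; the file name, block size and strip separation are placeholders.

    % Minimal sketch of the strip-stitching pipeline (assumed file name and values).
    vid    = VideoReader('rotation.avi');         % step 1: input video (placeholder name)
    frames = {};
    while hasFrame(vid)                           % step 2: build the image cube
        frames{end+1} = readFrame(vid);           %#ok<AGROW>
    end

    sepPx    = 150;                               % strip separation in pixels (section 5.2)
    leftPan  = []; rightPan = [];
    prevGray = rgb2gray(frames{1});
    center   = round(size(prevGray, 2) / 2);

    for k = 2:numel(frames)
        gray = rgb2gray(frames{k});

        % Step 3: estimate the horizontal shift (virtual speed per frame) by
        % correlating a small central block of the previous frame with the current one.
        block  = prevGray(101:200, center-40:center+40);
        cc     = normxcorr2(block, gray);
        [~, i] = max(cc(:));
        [~, c] = ind2sub(size(cc), i);
        shift  = abs((center + 40) - c);          % per-frame shift in pixels
        shift  = max(shift, 1);

        % Steps 4 and 5: cut left and right strips of width 'shift' at +/- sepPx/2
        % from the image centre and append them to the two panoramas.
        lc = center - round(sepPx/2);
        rc = center + round(sepPx/2);
        leftPan  = [leftPan,  frames{k}(:, lc:lc+shift-1, :)];
        rightPan = [rightPan, frames{k}(:, rc:rc+shift-1, :)];

        prevGray = gray;
    end

    imwrite(leftPan,  'left_panorama.bmp');
    imwrite(rightPan, 'right_panorama.bmp');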

5.5 Modeling

The project is only an initial step, so the software is not as important as the theory. I therefore use the Matlab language to write code that tests whether the predicted effect of the theory agrees with practice, and the modeling process is accordingly quite simple.

Data format modeling. Firstly, I need to identify the input data format. Matlab supports only AVI video files, and if the AVI is compressed the decompression codec must be known to Matlab, so my test data are AVI files with a known codec. Secondly, I need to identify the image format. It can be any image format supported by Matlab, such as bmp, jpg, tiff or gif; I choose bmp.

Component modeling. As mentioned in chapter 5.4, I need a special unit to detect the virtual speed of the camera, so the overall structure is divided into two parts, as shown in figure 5.14.

FIGURE 5.14

As shown in figure 5.14, the motion estimation and compute-relative-shift units are used to detect the virtual speed, and the results are used to decide the width of each vertical strip. The panorama-building unit then processes the results passed from the compute-relative-shift unit to select the width of each strip, decides the locations of the left and right strips according to the camera's parameters, and finally outputs a stereo image pair.

6 IMPLEMENTATIONS

This chapter introduces the way I implemented the theory analysed above. It includes the methods I tried at the very beginning, based on my own understanding of the problem, compares those original approaches with the one finally used in my project, and explains why I did not keep the initial approaches.

6.1 Initial implementation methods

At the beginning of the project I used my own ways to generate a stereo panorama image pair. Firstly, I used a single camera to simulate the two-camera device shown in figure 5.1; figure 6.1 illustrates how I implemented it.

FIGURE 6.1

As shown in figure 6.1, I first fixed the camera at camera spot A to capture a video sequence, and then fixed it at camera spot B to capture another video sequence. In particular, if the actions in the two processes are exactly the same (the same vertical or horizontal shaking and the same speed at each frame), each pair of frames forms a stereo image pair, as shown in figure 6.2.

FIGURE 6.2

After mosaicking them, a stereo image is generated, as shown in figure 6.3.

FIGURE 6.3

Therefore, we can generate a stereo panorama image by stitching these unit pairs. This approach has several advantages. The first is that it is quite easy to understand for people without any computer vision knowledge. The second, and most important, is that we can generate the view of a virtual camera, as shown in figure 6.4.

FIGURE 6.4

In figure 6.4, the view of the virtual camera A′ in the virtual system O can be simulated, because all of A′'s projection rays can be computed from A's projection rays once A has rotated through 360 degrees. However, the mathematics involved in achieving this is complicated. The disadvantage of this approach is that it only gives good results when the camera actions at the left spot and at the right spot are exactly the same.

The second approach I tried for generating a stereo image pair is a little different from the one above, as shown in figure 6.5.

FIGURE 6.5

This approach uses several camera pairs to capture unit stereo pairs, as indicated by the circles in figure 6.5. In each circle I used the camera to capture a single image at camera spot A, then shifted the camera to camera spot B, located about 6.5 cm (the human inter-eye distance) away from spot A, to capture another image; these two images then comprise a stereo image pair, and repeating this gives a set of stereo image pairs. The location of each circle is chosen so that the image taken at the corresponding spot of the current circle (for example, spot A of the second circle, going clockwise) overlaps at least 1/3 of the previous image (the spot A marked in figure 6.5). I can then stitch the unit pairs together to generate the panorama images, using the SIFT algorithm to mosaic the images. In particular, if the radius from spot A to the centre axis is greater than a certain value, the arc from A to B is almost equal to the straight-line distance from A to B, so we can consider spot A and spot B to lie on the same circle. We then do not need a whole video sequence to generate a stereo panorama image pair; instead we need only some pairs of unit stereo images, as shown in figures 6.6 and 6.7.

FIGURE 6.6, unit pair generated by initial approach two.

FIGURE 6.7, 3D image generated from figure 6.6.

The advantage of this second approach is that it is easier and quicker to implement than both the first approach and the one finally used in my project, owing to the large reduction in image cube size. Its disadvantage is that it cannot be used to generate a virtual camera's view. In the following sections I introduce the way I implemented the theory actually used in my project.

6.2 Software introduction

In my project I use Matlab as my programming language, because the project is an initial research project and I need an easy-to-use language to implement the theory obtained. MATLAB is a high-level language and interactive environment that lets you develop computationally intensive applications faster than with traditional programming languages such as C, C++ and Fortran, and it has many features built especially for engineering, including computer vision. Video and images can be read directly into Matlab with simple built-in functions (comparable to classes in Java) and manipulated directly as matrices. I used MATLAB version R2006b as the main language for the stereo panorama image generation, and I also wrote a JavaScript file and a simple HTML page to display my stereo images on the SHARP RD3D computer.

6.3 Input specification

To generate a good 3D panorama image, the quality of the input video is very important. To characterise it, I classify the input video into two categories. One is the ideal situation, with a constant rotation speed and no vertical vibration.

The other is the normal situation, which allows varying rotation speed and some vertical vibration. In the ideal situation I do not need to apply the virtual speed detection algorithm to every frame of the video to correct vertical and horizontal vibration, so the program runs quickly. In the normal situation the virtual speed detection algorithm has to be applied to every frame, which is quite slow to execute in Matlab. In my project I used a Canon IXUS 4.0 to capture the video sequence and used my elbow as the rotation radius, so my input falls into the normal situation. Moreover, the scene content of the video greatly affects the 3D sensation; I discuss this later in the test section.

6.4 Output results

My program outputs a pair of stereo images, and I used the SHARP company's open-source SDK to generate a combined stereo image for viewing on the SHARP RD3D notebook computer. Figures 6.8 and 6.9 illustrate the output image pairs, and figures 6.10 and 6.11 illustrate the corresponding synthesised 3D images that can be displayed on the SHARP RD3D computer.

FIGURE 6.8

FIGURE 6.9

FIGURE 6.10

FIGURE 6.11

The advantages of this approach are obvious. Firstly, it is easy to capture a suitable video sequence with any home-use camera, so people without specialist knowledge can do it. Secondly, because of the continuous camera positions while capturing video, we can easily calculate a virtual 3D view at virtual points, as shown in figure 6.12.

FIGURE 6.12

There is always a projection ray that intersects the physical camera's moving circle at a point where a specific camera position can be found. For example, to find the camera's viewing rays at position A, we need to find the projection rays from the physical camera's locations B and B′.

Therefore, for a virtual position A, all the projection rays of a virtual camera located at A can be found among the set of rays generated by the physical camera's locations from B to B′ on the moving circle. Hence, in theory, we can generate a real-time 3D sensation anywhere within this circle.

6.5 Testing on Sharp RD3D API

The SHARP RD3D comes with software for viewing stereo images, and I can test my results with it. However, this software limits the image size of the stereo photo: for a panorama image the width is much greater than the height, and the software automatically fits the image to the screen size, which I cannot adjust, so the small image produces little 3D sensation. I therefore wrote an HTML page and a JavaScript file to display my image, which lets the user move forwards or backwards to explore the whole panorama scene.

7 RESULTS AND ANALYSIS

In this chapter I first introduce the way I tested my project and then discuss the results obtained from a set of test feedback questionnaires.

7.1 Testing effect

I used a questionnaire to test the 3D sensation of my stereo images among a group of 10 people, for two main purposes. In the first stage, I measured each tester's inter-eye distance with a ruler and asked the testers to look at a set of 5 stereo images of the same scene generated with different parameter choices, namely different distances between the two strips, calculated from different distances between the two virtual slit cameras ranging from 6 cm to 8 cm in 0.5 cm increments. I then asked them to pick the one they found best, because I wanted to know how the human inter-eye distance affects the 3D sensation. In the second stage, I used the stereo image voted best in the first stage together with 2 other stereo panorama images, generated with identical parameters but captured in different environments, as the test data, and asked the testers to decide which was best. I then discuss the results and suggest improvements to the 3D effect where applicable. Finally, I asked the testers for their own suggestions on how to improve the 3D sensation, based on their knowledge and imagination.


More information

Final Review CMSC 733 Fall 2014

Final Review CMSC 733 Fall 2014 Final Review CMSC 733 Fall 2014 We have covered a lot of material in this course. One way to organize this material is around a set of key equations and algorithms. You should be familiar with all of these,

More information

5LSH0 Advanced Topics Video & Analysis

5LSH0 Advanced Topics Video & Analysis 1 Multiview 3D video / Outline 2 Advanced Topics Multimedia Video (5LSH0), Module 02 3D Geometry, 3D Multiview Video Coding & Rendering Peter H.N. de With, Sveta Zinger & Y. Morvan ( p.h.n.de.with@tue.nl

More information

Camera model and multiple view geometry

Camera model and multiple view geometry Chapter Camera model and multiple view geometry Before discussing how D information can be obtained from images it is important to know how images are formed First the camera model is introduced and then

More information

CHAPTER 3. Single-view Geometry. 1. Consequences of Projection

CHAPTER 3. Single-view Geometry. 1. Consequences of Projection CHAPTER 3 Single-view Geometry When we open an eye or take a photograph, we see only a flattened, two-dimensional projection of the physical underlying scene. The consequences are numerous and startling.

More information

Outline. ETN-FPI Training School on Plenoptic Sensing

Outline. ETN-FPI Training School on Plenoptic Sensing Outline Introduction Part I: Basics of Mathematical Optimization Linear Least Squares Nonlinear Optimization Part II: Basics of Computer Vision Camera Model Multi-Camera Model Multi-Camera Calibration

More information

521466S Machine Vision Exercise #1 Camera models

521466S Machine Vision Exercise #1 Camera models 52466S Machine Vision Exercise # Camera models. Pinhole camera. The perspective projection equations or a pinhole camera are x n = x c, = y c, where x n = [x n, ] are the normalized image coordinates,

More information

CS4670: Computer Vision

CS4670: Computer Vision CS467: Computer Vision Noah Snavely Lecture 13: Projection, Part 2 Perspective study of a vase by Paolo Uccello Szeliski 2.1.3-2.1.6 Reading Announcements Project 2a due Friday, 8:59pm Project 2b out Friday

More information

Movie: Geri s Game. Announcements. Ray Casting 2. Programming 2 Recap. Programming 3 Info Test data for part 1 (Lines) is available

Movie: Geri s Game. Announcements. Ray Casting 2. Programming 2 Recap. Programming 3 Info Test data for part 1 (Lines) is available Now Playing: Movie: Geri s Game Pixar, 1997 Academny Award Winner, Best Short Film Quicksand Under Carpet New Radiant Storm King from Leftover Blues: 1991-003 Released 004 Ray Casting Rick Skarbez, Instructor

More information

Camera Models and Image Formation. Srikumar Ramalingam School of Computing University of Utah

Camera Models and Image Formation. Srikumar Ramalingam School of Computing University of Utah Camera Models and Image Formation Srikumar Ramalingam School of Computing University of Utah srikumar@cs.utah.edu Reference Most slides are adapted from the following notes: Some lecture notes on geometric

More information

Assignment 2 : Projection and Homography

Assignment 2 : Projection and Homography TECHNISCHE UNIVERSITÄT DRESDEN EINFÜHRUNGSPRAKTIKUM COMPUTER VISION Assignment 2 : Projection and Homography Hassan Abu Alhaija November 7,204 INTRODUCTION In this exercise session we will get a hands-on

More information

Camera Model and Calibration

Camera Model and Calibration Camera Model and Calibration Lecture-10 Camera Calibration Determine extrinsic and intrinsic parameters of camera Extrinsic 3D location and orientation of camera Intrinsic Focal length The size of the

More information

CSE 252B: Computer Vision II

CSE 252B: Computer Vision II CSE 252B: Computer Vision II Lecturer: Serge Belongie Scribe: Sameer Agarwal LECTURE 1 Image Formation 1.1. The geometry of image formation We begin by considering the process of image formation when a

More information

Image Formation. Antonino Furnari. Image Processing Lab Dipartimento di Matematica e Informatica Università degli Studi di Catania

Image Formation. Antonino Furnari. Image Processing Lab Dipartimento di Matematica e Informatica Università degli Studi di Catania Image Formation Antonino Furnari Image Processing Lab Dipartimento di Matematica e Informatica Università degli Studi di Catania furnari@dmi.unict.it 18/03/2014 Outline Introduction; Geometric Primitives

More information

Single-view 3D Reconstruction

Single-view 3D Reconstruction Single-view 3D Reconstruction 10/12/17 Computational Photography Derek Hoiem, University of Illinois Some slides from Alyosha Efros, Steve Seitz Notes about Project 4 (Image-based Lighting) You can work

More information

Chapter 23. Geometrical Optics (lecture 1: mirrors) Dr. Armen Kocharian

Chapter 23. Geometrical Optics (lecture 1: mirrors) Dr. Armen Kocharian Chapter 23 Geometrical Optics (lecture 1: mirrors) Dr. Armen Kocharian Reflection and Refraction at a Plane Surface The light radiate from a point object in all directions The light reflected from a plane

More information

3D Sensing. 3D Shape from X. Perspective Geometry. Camera Model. Camera Calibration. General Stereo Triangulation.

3D Sensing. 3D Shape from X. Perspective Geometry. Camera Model. Camera Calibration. General Stereo Triangulation. 3D Sensing 3D Shape from X Perspective Geometry Camera Model Camera Calibration General Stereo Triangulation 3D Reconstruction 3D Shape from X shading silhouette texture stereo light striping motion mainly

More information

TEAMS National Competition Middle School Version Photometry Solution Manual 25 Questions

TEAMS National Competition Middle School Version Photometry Solution Manual 25 Questions TEAMS National Competition Middle School Version Photometry Solution Manual 25 Questions Page 1 of 14 Photometry Questions 1. When an upright object is placed between the focal point of a lens and a converging

More information

Computer Vision I - Appearance-based Matching and Projective Geometry

Computer Vision I - Appearance-based Matching and Projective Geometry Computer Vision I - Appearance-based Matching and Projective Geometry Carsten Rother 01/11/2016 Computer Vision I: Image Formation Process Roadmap for next four lectures Computer Vision I: Image Formation

More information

Stereo Vision. MAN-522 Computer Vision

Stereo Vision. MAN-522 Computer Vision Stereo Vision MAN-522 Computer Vision What is the goal of stereo vision? The recovery of the 3D structure of a scene using two or more images of the 3D scene, each acquired from a different viewpoint in

More information

TEAMS National Competition High School Version Photometry Solution Manual 25 Questions

TEAMS National Competition High School Version Photometry Solution Manual 25 Questions TEAMS National Competition High School Version Photometry Solution Manual 25 Questions Page 1 of 15 Photometry Questions 1. When an upright object is placed between the focal point of a lens and a converging

More information

Computer Vision Projective Geometry and Calibration. Pinhole cameras

Computer Vision Projective Geometry and Calibration. Pinhole cameras Computer Vision Projective Geometry and Calibration Professor Hager http://www.cs.jhu.edu/~hager Jason Corso http://www.cs.jhu.edu/~jcorso. Pinhole cameras Abstract camera model - box with a small hole

More information

Machine vision. Summary # 11: Stereo vision and epipolar geometry. u l = λx. v l = λy

Machine vision. Summary # 11: Stereo vision and epipolar geometry. u l = λx. v l = λy 1 Machine vision Summary # 11: Stereo vision and epipolar geometry STEREO VISION The goal of stereo vision is to use two cameras to capture 3D scenes. There are two important problems in stereo vision:

More information

CS 664 Slides #9 Multi-Camera Geometry. Prof. Dan Huttenlocher Fall 2003

CS 664 Slides #9 Multi-Camera Geometry. Prof. Dan Huttenlocher Fall 2003 CS 664 Slides #9 Multi-Camera Geometry Prof. Dan Huttenlocher Fall 2003 Pinhole Camera Geometric model of camera projection Image plane I, which rays intersect Camera center C, through which all rays pass

More information

Natural Viewing 3D Display

Natural Viewing 3D Display We will introduce a new category of Collaboration Projects, which will highlight DoCoMo s joint research activities with universities and other companies. DoCoMo carries out R&D to build up mobile communication,

More information

3D Image Sensor based on Opto-Mechanical Filtering

3D Image Sensor based on Opto-Mechanical Filtering 3D Image Sensor based on Opto-Mechanical Filtering Barna Reskó 1,2, Dávid Herbay 3, Péter Korondi 3, Péter Baranyi 2 1 Budapest Tech 2 Computer and Automation Research Institute of the Hungarian Academy

More information

Calibrating an Overhead Video Camera

Calibrating an Overhead Video Camera Calibrating an Overhead Video Camera Raul Rojas Freie Universität Berlin, Takustraße 9, 495 Berlin, Germany http://www.fu-fighters.de Abstract. In this section we discuss how to calibrate an overhead video

More information

Computer Vision I Name : CSE 252A, Fall 2012 Student ID : David Kriegman Assignment #1. (Due date: 10/23/2012) x P. = z

Computer Vision I Name : CSE 252A, Fall 2012 Student ID : David Kriegman   Assignment #1. (Due date: 10/23/2012) x P. = z Computer Vision I Name : CSE 252A, Fall 202 Student ID : David Kriegman E-Mail : Assignment (Due date: 0/23/202). Perspective Projection [2pts] Consider a perspective projection where a point = z y x P

More information

Refraction at a single curved spherical surface

Refraction at a single curved spherical surface Refraction at a single curved spherical surface This is the beginning of a sequence of classes which will introduce simple and complex lens systems We will start with some terminology which will become

More information

(Refer Slide Time: 00:01:26)

(Refer Slide Time: 00:01:26) Computer Graphics Prof. Sukhendu Das Dept. of Computer Science and Engineering Indian Institute of Technology, Madras Lecture - 9 Three Dimensional Graphics Welcome back everybody to the lecture on computer

More information

Robotics - Projective Geometry and Camera model. Marcello Restelli

Robotics - Projective Geometry and Camera model. Marcello Restelli Robotics - Projective Geometr and Camera model Marcello Restelli marcello.restelli@polimi.it Dipartimento di Elettronica, Informazione e Bioingegneria Politecnico di Milano Ma 2013 Inspired from Matteo

More information

Autonomous Vehicle Navigation Using Stereoscopic Imaging

Autonomous Vehicle Navigation Using Stereoscopic Imaging Autonomous Vehicle Navigation Using Stereoscopic Imaging Project Proposal By: Beach Wlaznik Advisors: Dr. Huggins Dr. Stewart December 7, 2006 I. Introduction The objective of the Autonomous Vehicle Navigation

More information

Camera Models and Image Formation. Srikumar Ramalingam School of Computing University of Utah

Camera Models and Image Formation. Srikumar Ramalingam School of Computing University of Utah Camera Models and Image Formation Srikumar Ramalingam School of Computing University of Utah srikumar@cs.utah.edu VisualFunHouse.com 3D Street Art Image courtesy: Julian Beaver (VisualFunHouse.com) 3D

More information

Computer Vision I - Appearance-based Matching and Projective Geometry

Computer Vision I - Appearance-based Matching and Projective Geometry Computer Vision I - Appearance-based Matching and Projective Geometry Carsten Rother 05/11/2015 Computer Vision I: Image Formation Process Roadmap for next four lectures Computer Vision I: Image Formation

More information

3D Vision Real Objects, Real Cameras. Chapter 11 (parts of), 12 (parts of) Computerized Image Analysis MN2 Anders Brun,

3D Vision Real Objects, Real Cameras. Chapter 11 (parts of), 12 (parts of) Computerized Image Analysis MN2 Anders Brun, 3D Vision Real Objects, Real Cameras Chapter 11 (parts of), 12 (parts of) Computerized Image Analysis MN2 Anders Brun, anders@cb.uu.se 3D Vision! Philisophy! Image formation " The pinhole camera " Projective

More information

Practice Exam Sample Solutions

Practice Exam Sample Solutions CS 675 Computer Vision Instructor: Marc Pomplun Practice Exam Sample Solutions Note that in the actual exam, no calculators, no books, and no notes allowed. Question 1: out of points Question 2: out of

More information

3D Rendering and Ray Casting

3D Rendering and Ray Casting 3D Rendering and Ray Casting Michael Kazhdan (601.457/657) HB Ch. 13.7, 14.6 FvDFH 15.5, 15.10 Rendering Generate an image from geometric primitives Rendering Geometric Primitives (3D) Raster Image (2D)

More information

3D Rendering and Ray Casting

3D Rendering and Ray Casting 3D Rendering and Ray Casting Michael Kazhdan (601.457/657) HB Ch. 13.7, 14.6 FvDFH 15.5, 15.10 Rendering Generate an image from geometric primitives Rendering Geometric Primitives (3D) Raster Image (2D)

More information

Computer Vision cmput 428/615

Computer Vision cmput 428/615 Computer Vision cmput 428/615 Basic 2D and 3D geometry and Camera models Martin Jagersand The equation of projection Intuitively: How do we develop a consistent mathematical framework for projection calculations?

More information

COMP 175 COMPUTER GRAPHICS. Ray Casting. COMP 175: Computer Graphics April 26, Erik Anderson 09 Ray Casting

COMP 175 COMPUTER GRAPHICS. Ray Casting. COMP 175: Computer Graphics April 26, Erik Anderson 09 Ray Casting Ray Casting COMP 175: Computer Graphics April 26, 2018 1/41 Admin } Assignment 4 posted } Picking new partners today for rest of the assignments } Demo in the works } Mac demo may require a new dylib I

More information

CIS 580, Machine Perception, Spring 2016 Homework 2 Due: :59AM

CIS 580, Machine Perception, Spring 2016 Homework 2 Due: :59AM CIS 580, Machine Perception, Spring 2016 Homework 2 Due: 2015.02.24. 11:59AM Instructions. Submit your answers in PDF form to Canvas. This is an individual assignment. 1 Recover camera orientation By observing

More information

(Refer Slide Time: 00:04:20)

(Refer Slide Time: 00:04:20) Computer Graphics Prof. Sukhendu Das Dept. of Computer Science and Engineering Indian Institute of Technology, Madras Lecture 8 Three Dimensional Graphics Welcome back all of you to the lectures in Computer

More information

CV: 3D to 2D mathematics. Perspective transformation; camera calibration; stereo computation; and more

CV: 3D to 2D mathematics. Perspective transformation; camera calibration; stereo computation; and more CV: 3D to 2D mathematics Perspective transformation; camera calibration; stereo computation; and more Roadmap of topics n Review perspective transformation n Camera calibration n Stereo methods n Structured

More information

Chapters 1 7: Overview

Chapters 1 7: Overview Chapters 1 7: Overview Chapter 1: Introduction Chapters 2 4: Data acquisition Chapters 5 7: Data manipulation Chapter 5: Vertical imagery Chapter 6: Image coordinate measurements and refinements Chapter

More information

Depth Measurement and 3-D Reconstruction of Multilayered Surfaces by Binocular Stereo Vision with Parallel Axis Symmetry Using Fuzzy

Depth Measurement and 3-D Reconstruction of Multilayered Surfaces by Binocular Stereo Vision with Parallel Axis Symmetry Using Fuzzy Depth Measurement and 3-D Reconstruction of Multilayered Surfaces by Binocular Stereo Vision with Parallel Axis Symmetry Using Fuzzy Sharjeel Anwar, Dr. Shoaib, Taosif Iqbal, Mohammad Saqib Mansoor, Zubair

More information

Introduction to 3D Machine Vision

Introduction to 3D Machine Vision Introduction to 3D Machine Vision 1 Many methods for 3D machine vision Use Triangulation (Geometry) to Determine the Depth of an Object By Different Methods: Single Line Laser Scan Stereo Triangulation

More information

Homogeneous Coordinates. Lecture18: Camera Models. Representation of Line and Point in 2D. Cross Product. Overall scaling is NOT important.

Homogeneous Coordinates. Lecture18: Camera Models. Representation of Line and Point in 2D. Cross Product. Overall scaling is NOT important. Homogeneous Coordinates Overall scaling is NOT important. CSED44:Introduction to Computer Vision (207F) Lecture8: Camera Models Bohyung Han CSE, POSTECH bhhan@postech.ac.kr (",, ) ()", ), )) ) 0 It is

More information

A Review of Image- based Rendering Techniques Nisha 1, Vijaya Goel 2 1 Department of computer science, University of Delhi, Delhi, India

A Review of Image- based Rendering Techniques Nisha 1, Vijaya Goel 2 1 Department of computer science, University of Delhi, Delhi, India A Review of Image- based Rendering Techniques Nisha 1, Vijaya Goel 2 1 Department of computer science, University of Delhi, Delhi, India Keshav Mahavidyalaya, University of Delhi, Delhi, India Abstract

More information

DEPTH PERCEPTION. Learning Objectives: 7/31/2018. Intro & Overview of DEPTH PERCEPTION** Speaker: Michael Patrick Coleman, COT, ABOC, & former CPOT

DEPTH PERCEPTION. Learning Objectives: 7/31/2018. Intro & Overview of DEPTH PERCEPTION** Speaker: Michael Patrick Coleman, COT, ABOC, & former CPOT DEPTH PERCEPTION Speaker: Michael Patrick Coleman, COT, ABOC, & former CPOT Learning Objectives: Attendees will be able to 1. Explain what the primary cue to depth perception is (vs. monocular cues) 2.

More information

CS 4204 Computer Graphics

CS 4204 Computer Graphics CS 4204 Computer Graphics 3D Viewing and Projection Yong Cao Virginia Tech Objective We will develop methods to camera through scenes. We will develop mathematical tools to handle perspective projection.

More information

3D Geometry and Camera Calibration

3D Geometry and Camera Calibration 3D Geometry and Camera Calibration 3D Coordinate Systems Right-handed vs. left-handed x x y z z y 2D Coordinate Systems 3D Geometry Basics y axis up vs. y axis down Origin at center vs. corner Will often

More information

ECE Digital Image Processing and Introduction to Computer Vision. Outline

ECE Digital Image Processing and Introduction to Computer Vision. Outline ECE592-064 Digital Image Processing and Introduction to Computer Vision Depart. of ECE, NC State University Instructor: Tianfu (Matt) Wu Spring 2017 1. Recap Outline 2. Modeling Projection and Projection

More information

Topics and things to know about them:

Topics and things to know about them: Practice Final CMSC 427 Distributed Tuesday, December 11, 2007 Review Session, Monday, December 17, 5:00pm, 4424 AV Williams Final: 10:30 AM Wednesday, December 19, 2007 General Guidelines: The final will

More information

Computer Vision Lecture 17

Computer Vision Lecture 17 Computer Vision Lecture 17 Epipolar Geometry & Stereo Basics 13.01.2015 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Announcements Seminar in the summer semester

More information

Structure from motion

Structure from motion Structure from motion Structure from motion Given a set of corresponding points in two or more images, compute the camera parameters and the 3D point coordinates?? R 1,t 1 R 2,t R 2 3,t 3 Camera 1 Camera

More information

A Calibration Algorithm for POX-Slits Camera

A Calibration Algorithm for POX-Slits Camera A Calibration Algorithm for POX-Slits Camera N. Martins 1 and H. Araújo 2 1 DEIS, ISEC, Polytechnic Institute of Coimbra, Portugal 2 ISR/DEEC, University of Coimbra, Portugal Abstract Recent developments

More information

Skybox. Ruoqi He & Chia-Man Hung. February 26, 2016

Skybox. Ruoqi He & Chia-Man Hung. February 26, 2016 Skybox Ruoqi He & Chia-Man Hung February 26, 206 Introduction In this project, we present a method to construct a skybox from a series of photos we took ourselves. It is a graphical procedure of creating

More information

DD2423 Image Analysis and Computer Vision IMAGE FORMATION. Computational Vision and Active Perception School of Computer Science and Communication

DD2423 Image Analysis and Computer Vision IMAGE FORMATION. Computational Vision and Active Perception School of Computer Science and Communication DD2423 Image Analysis and Computer Vision IMAGE FORMATION Mårten Björkman Computational Vision and Active Perception School of Computer Science and Communication November 8, 2013 1 Image formation Goal:

More information

Computer Vision Lecture 17

Computer Vision Lecture 17 Announcements Computer Vision Lecture 17 Epipolar Geometry & Stereo Basics Seminar in the summer semester Current Topics in Computer Vision and Machine Learning Block seminar, presentations in 1 st week

More information

Projective Geometry and Camera Models

Projective Geometry and Camera Models /2/ Projective Geometry and Camera Models Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem Note about HW Out before next Tues Prob: covered today, Tues Prob2: covered next Thurs Prob3:

More information

Computer Vision, Laboratory session 1

Computer Vision, Laboratory session 1 Centre for Mathematical Sciences, january 2007 Computer Vision, Laboratory session 1 Overview In this laboratory session you are going to use matlab to look at images, study projective geometry representations

More information

Laser sensors. Transmitter. Receiver. Basilio Bona ROBOTICA 03CFIOR

Laser sensors. Transmitter. Receiver. Basilio Bona ROBOTICA 03CFIOR Mobile & Service Robotics Sensors for Robotics 3 Laser sensors Rays are transmitted and received coaxially The target is illuminated by collimated rays The receiver measures the time of flight (back and

More information

Image Transformations & Camera Calibration. Mašinska vizija, 2018.

Image Transformations & Camera Calibration. Mašinska vizija, 2018. Image Transformations & Camera Calibration Mašinska vizija, 2018. Image transformations What ve we learnt so far? Example 1 resize and rotate Open warp_affine_template.cpp Perform simple resize

More information

Homework #1. Displays, Image Processing, Affine Transformations, Hierarchical Modeling

Homework #1. Displays, Image Processing, Affine Transformations, Hierarchical Modeling Computer Graphics Instructor: Brian Curless CSE 457 Spring 215 Homework #1 Displays, Image Processing, Affine Transformations, Hierarchical Modeling Assigned: Thursday, April 9 th Due: Thursday, April

More information

An Algorithm for Seamless Image Stitching and Its Application

An Algorithm for Seamless Image Stitching and Its Application An Algorithm for Seamless Image Stitching and Its Application Jing Xing, Zhenjiang Miao, and Jing Chen Institute of Information Science, Beijing JiaoTong University, Beijing 100044, P.R. China Abstract.

More information

MERGING POINT CLOUDS FROM MULTIPLE KINECTS. Nishant Rai 13th July, 2016 CARIS Lab University of British Columbia

MERGING POINT CLOUDS FROM MULTIPLE KINECTS. Nishant Rai 13th July, 2016 CARIS Lab University of British Columbia MERGING POINT CLOUDS FROM MULTIPLE KINECTS Nishant Rai 13th July, 2016 CARIS Lab University of British Columbia Introduction What do we want to do? : Use information (point clouds) from multiple (2+) Kinects

More information

Project 4 Results. Representation. Data. Learning. Zachary, Hung-I, Paul, Emanuel. SIFT and HoG are popular and successful.

Project 4 Results. Representation. Data. Learning. Zachary, Hung-I, Paul, Emanuel. SIFT and HoG are popular and successful. Project 4 Results Representation SIFT and HoG are popular and successful. Data Hugely varying results from hard mining. Learning Non-linear classifier usually better. Zachary, Hung-I, Paul, Emanuel Project

More information

Short on camera geometry and camera calibration

Short on camera geometry and camera calibration Short on camera geometry and camera calibration Maria Magnusson, maria.magnusson@liu.se Computer Vision Laboratory, Department of Electrical Engineering, Linköping University, Sweden Report No: LiTH-ISY-R-3070

More information

Perspective Projection [2 pts]

Perspective Projection [2 pts] Instructions: CSE252a Computer Vision Assignment 1 Instructor: Ben Ochoa Due: Thursday, October 23, 11:59 PM Submit your assignment electronically by email to iskwak+252a@cs.ucsd.edu with the subject line

More information

Camera Model and Calibration. Lecture-12

Camera Model and Calibration. Lecture-12 Camera Model and Calibration Lecture-12 Camera Calibration Determine extrinsic and intrinsic parameters of camera Extrinsic 3D location and orientation of camera Intrinsic Focal length The size of the

More information

Computer Vision, Laboratory session 1

Computer Vision, Laboratory session 1 Centre for Mathematical Sciences, january 200 Computer Vision, Laboratory session Overview In this laboratory session you are going to use matlab to look at images, study the representations of points,

More information

Agenda. Rotations. Camera calibration. Homography. Ransac

Agenda. Rotations. Camera calibration. Homography. Ransac Agenda Rotations Camera calibration Homography Ransac Geometric Transformations y x Transformation Matrix # DoF Preserves Icon translation rigid (Euclidean) similarity affine projective h I t h R t h sr

More information

Cameras and Stereo CSE 455. Linda Shapiro

Cameras and Stereo CSE 455. Linda Shapiro Cameras and Stereo CSE 455 Linda Shapiro 1 Müller-Lyer Illusion http://www.michaelbach.de/ot/sze_muelue/index.html What do you know about perspective projection? Vertical lines? Other lines? 2 Image formation

More information