Shift-map Image Registration Svärm, Linus; Stranmark, Petter Unpublishe: 2010-01-01 Link to publication Citation for publishe version (APA): Svärm, L., & Stranmark, P. (2010). Shift-map Image Registration. Paper presente at Sweish Symposium on Image Analysis (SSBA) 2010, Uppsala, Sween. General rights Copyright an moral rights for the publications mae accessible in the public portal are retaine by the authors an/or other copyright owners an it is a conition of accessing publications that users recognise an abie by the legal requirements associate with these rights. Users may ownloa an print one copy of any publication from the public portal for the purpose of private stuy or research. You may not further istribute the material or use it for any profit-making activity or commercial gain You may freely istribute the URL ientifying the publication in the public portal? Take own policy If you believe that this ocument breaches copyright please contact us proviing etails, an we will remove access to the work immeiately an investigate your claim. L UNDUNI VERS I TY PO Box117 22100L un +46462220000
Shift-map Image Registration Linus Svärm, Petter Stranmark Centre for Mathematical Sciences, Lun University Email: {linus,petter}@maths.lth.se Abstract Shift-map image processing is a new framework base on energy minimization over a large space of labels. The optimization utilizes α-expansion moves an iterative refinement over a Gaussian pyrami. In this paper we exten the range of applications to image registration. To o this, new ata an smoothness terms have to be constructe. We note a great improvement when we measure pixel similarities with the ense DAISY escriptor. The main contributions of this paper are: The extension of the shift-map framework to inclue image registration. We register images for which SIFT only provies 3 correct matches. The first publicly available implementation of shift-map image processing (e.g. inpainting, registration). We conclue by comparing shift-map registration to a recent metho for optical flow with favorable results. I. INTRODUCTION TO SHIFT-MAPS Shift-map image processing has recently [1] been introuce an applie to image inpainting, content aware resizing, texture synthesis an image rearrangement. This paper will exten the range of applications to image registration. The motivation behin this work is the fact that we alreay were in possession of images we foun to be impossible to register using stanar methos (e.g. SIFT corresponences). The results we obtaine with shift-map registration seems promising. A. Problem formulation Registration can be performe using a parametric moel, e.g. an affine or a projective transformation estimate from point corresponences between the two images. In this paper, we consier a non-parametric moel. We have a base image B(i, j) an an input image I(i, j). The goal is to register the pixels of the input image ( onto the base ) image using a shift-map T(i, j) = t i (i, j), t j (i, j). ( The pixel I(i, j) is registere onto B i + t i (i, j), j + ) t j (i, j). Figure 2 shows the input an base image an the resulting image obtaine by moving the input image on top of the base image. Each possible shift-map is assigne an energy, base on a priori assumptions on what a goo shift-map typically looks like an how well the two images match each other. The goal is then to fin the optimal shift-map, that is, the shift-map with the lowest energy: E(T) = α 1 i m (i,j ) 1 j n 1 i m 1 j n (T(i, j))+ s (T(i, j), T(i, j )), (1) Fig. 1: Shift-map between two images where the last summation refers to summations over all (i, j ) in a neighborhoo N (i, j) of (i, j). an Eij s are the ata terms an smoothness terms, respectively. They will be escribe below. We have use 4-connectivity of ajacent pixels throughout this paper. II. REGISTRATION ENERGY TERMS The methos in [1] eal with constructing a new image from an ol one an the registration problem is about fining a map between two existing images. Hence the energy previously use for fining shift-maps is not suitable for registration an new energy terms must be constructe. a) Comparison of pixels: A relate problem to image registration is ense epth estimation from two images of the same object with known camera positions. This problem has been stuie extensively, see for example [3]. Recently a new escriptor, DAISY, was propose by Tola et al. [4], tailore to ense stereo estimation where the position of the two cameras iffer by a large amount. This escriptor is shown to outperform other approaches (e.g. SIFT, SURF an pixel ifferences) in extensive experiments. Therefore, it seems relevant to try an apply this escriptor to the relate problem of estimating a ense image registration. Not unlike SIFT [5], a DAISY escriptor samples the image erivative in ifferent irections. Eight ifferent irections an three ifferent scales are use. By sampling these fiels at ifferent points aroun the feature location, a escriptor of imensionality 200 is obtaine. Since the same fiels are use for all image locations, a ense fiel of escriptors can be compute in a couple of secons. The main goal of the DAISY escriptor was efficient ense computation. In orer to choose relevant parameters, we foun [6] helpful. b) Data terms: The ata terms were previously use in [1] to enforce har constraints on the shift-map. When inpainting an image, the ata term makes sure no pixels in the hole are use in the output image by assigning such shifts a cost of. In this paper, where image registration is consiere, we nee to evelop more complex ata terms to incorporate
(a) input image I (b) base image B (c) shift-map (x) () shift-map (y) (e) final location of pixels Fig. 2: Registration of two images from [2]. Each pixel in the input image is place on the base image as escribe by the shift-map. the fact that we want to fin a mapping between two images such that similar pixels are mappe to similar pixels The ata terms ictates that similar parts of the images shoul en up on top of each other. To measure similarity, ense DAISY is use. It might only be possible to register parts of the input image, so shifting pixels outsie the base image is permitte, at a constant cost P per pixel. The ata terms are then given by (T) = { Î(i, j) ˆB( (i, j) + T(i, j) ) 2 P when (i, j) + T(i, j) is outsie B, (2) where Î(i, j) is the DAISY escriptor escribing the image I at pixel location (i, j). Figure 5 shows a heat map (row 6) of the istance from the circle feature in the first row to all locations in the image in row 2. c) Smoothness terms: The smoothness term is use to enforce global consistency to the shift-map, while allowing iscontinuities at a limite number of places. In [1], the smoothness term compare the color an graient pixel-wise. Where a iscontinuity in the shift-map occurs, the penalty is compute as the ifference in color an graients. Our smoothness function takes the form of the Eucliean istance between the enpoints of the two shifts: s (T(i, j), T(i, j )) = (i, j ) + T(i, j ) (i, j) T(i, j) 2. (3) Here, (i, j) an (i, j ) are neighboring pixels, see (1). Using the shift ifference T(i, j ) T(i, j) 2 will penalize smoothly varying shift-maps too much, an hence it is important to compare the en points (as in (3)). ) Color information: The DAISY escriptor oes not use color information, yet intuitively it makes little sense to match pixels of very ifferent colors. Because of this, we have also mae experiments where the color information of the images are incorporate into the above ata terms. The color moel use assigne a cost of P to pixels with large ifference in hue, given that the intensity an saturation allowe a reliable value of the hue. This moel improve the result of the registration in Fig. 5. We i not use color information in the experiment shown in Fig. 2. III. EXPERIMENTS To minimize the energy (1), we use α-expansion as escribe by Boykov an Veksler [7] with the graph algorithms escribe in [8], [9]. Each possible shift value T(i, j) { m... m} { n... n} is mappe to a 1D label space. The number of labels for even moerately size images then becomes very large. In orer to make it tractable, a Gaussian pyrami was use. For the images in Fig. 5, an initial size of 128 23 was use The size was ouble 3 times until the final resolution of 1024 179 was reache. Each oubling of the image size is followe by a linear interpolation of the shift-map. This shift-map was use as a starting guess for the optimization at the larger level. At each level after the first, only 9 possible shifts then nee to be consiere: { 1, 0, 1} in each irection. To verify our implementation, we inpainte an example image use in [1], see Fig. 3. We trie to follow their implementation as closely as possible an got ifferent, but qualitatively similar results. During large-scale reconstruction of a city using images taken with a cylinrical camera [10], we have encountere many ifficult image pairs where SIFT is unable to provie useful corresponences. The top two rows in Fig. 5 show one of the harest. Compute SIFT features for the two images (794 an 1019 feature points, respectively) only yiele 3 correct matches. The main reason for this was the image geometry an large, repetitive patterns. Using shiftmap we obtaine a ense, mostly correct map between the images. This was then use as an ai to compute SIFT corresponences. We then obtaine 28 matches, of which 12 were correct. The runtime for this image was about 2 minutes. e) Comparison to previous work: We compare shiftmap registration to the optical flow algorithm escribe in [11]. This algorithm i not prouce useful results for the street images, see bottom of Fig. 5. IV. CONCLUSION AND FURTHER WORK We have stuie the application of shift-maps to image registration. Computing the smoothness term with color an graient ifferences as in [1] i not give satisfactory results when extene to image registration, but we foun a great improvement with the ense DAISY escriptor. For relatively easy cases (Fig. 4), we obtaine very goo results. For very har cases (Fig. 5) we obtaine results
which prove very useful for obtaining corresponences between the images. We compare shift-map registration to the optical flow algorithm escribe in [11] (Fig. 5e), which was significantly less accurate. One interesting future line of work woul be to investigate whether shift-map inpainting can be improve by the DAISY escriptor as well. We have also not investigate large rotations in this paper, which woul require aitional consierations. The source coe for our shift-map implementation an for all experiments in this paper has been release to the public. To our knowlege, this is the first publicly available shift-map implementation. (a) input image I (b) base image B (c) final location of pixels R EFERENCES [1] Y. Pritch, E. Kav-Venaki, an S. Peleg, Shift-map image eiting, in Int. Conf. Computer Vision, 2009. 1, 2, 3 [2] A. Kushal an J. Ponce, Moeling 3D objects from stereo views an recognizing them in photographs, in European Conf. Computer Vision, 2006. 2 [3] V. Kolmogorov an R. Zabih, Graph cut algorithms for binocular stereo with occlusions, in Hanbook of Mathematical Moels in Computer Vision. Springer, 2006. 1 [4] E. Tola, V. Lepetit, an P. Fua, Daisy: An efficient ense escriptor applie to wie baseline stereo, IEEE Trans. Pattern Analysis an Machine Intelligence, 2009. 1 [5] D. Lowe, Distinctive image features from scale-invariant keypoints, Int. Journal Computer Vision, vol. 20, pp. 91 110, 2004. 1 [6] S. Winer, G. Hua, an M. Brown, Picking the best DAISY, in Conf. Computer Vision an Pattern Recognition. IEEE, 2009, pp. 178 185. 1 [7] Y. Boykov, O. Veksler, an R. Zabih, Fast approximate energy minimization via graph cuts, IEEE Trans. Pattern Analysis an Machine Intelligence, vol. 23, no. 11, pp. 1222 1239, Nov 2001. 2 [8] Y. Boykov an V. Kolmogorov, An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision, IEEE Trans. Pattern Analysis an Machine Intelligence, vol. 26, no. 9, pp. 1124 1137, Sept. 2004. 2 [9] V. Kolmogorov an R. Zabih, What energy functions can be minimize via graph cuts? IEEE Trans. Pattern Analysis an Machine Intelligence, vol. 26, no. 2, pp. 147 159, 2004. 2 [10] Hitta.se streetview, http://www.hitta.se. 2 [11] C. Zach, T. Pock, an H. Bischof, A uality base approach for realtime TV-L1 optical flow, in Pattern Recognition (Proc. DAGM), 2007, pp. 214 223. 2, 3, 4 Fig. 4: Registration of two images of a builing. Fig. 3: Our reimplementation of the algorithm in [1]. The last two images show the shift-map in the x an y irection, respectively.
(a) input image I (b) base image B (c) final locations of the pixels in I () resulting shift-map (e) result using the metho in [11] (f) DAISY istance between the circle feature in I to all pixel locations in B Fig. 5: Registration of 1024 179 Hitta images. We note that we achieve a ense, highly nonlinear registration. This shift-map allowe us to obtain useful point-corresponences between the images, which was not possible using SIFT alone.