Non-Parametric Structure-Based Calibration of Radially Symmetric Cameras

Federico Camposeco, Torsten Sattler, Marc Pollefeys
Department of Computer Science, ETH Zürich, Switzerland
{federico.camposeco, torsten.sattler, marc.pollefeys}@inf.ethz.ch

Abstract

We propose a novel two-step method for estimating the intrinsic and extrinsic calibration of any radially symmetric camera, including non-central systems. The first step consists of estimating the camera pose, given a Structure from Motion (SfM) model, up to the translation along the optical axis. As a second step, we obtain the calibration by finding the translation of the camera center using an ordering constraint. The method makes use of the 1D radial camera model, which allows us to effectively handle any radially symmetric camera, including non-central ones. Using this ordering constraint, we show that we are able to calibrate several different (central and non-central) Wide Field of View (WFOV) cameras, including fisheye, hypercatadioptric and spherical catadioptric cameras, as well as pinhole cameras, using a single image or jointly solving for several views.

1. Introduction

Recently, cameras with a wide field of view (WFOV), such as fisheye and omnidirectional cameras, have become increasingly popular. Due to their ability to observe a large portion of the scene, using WFOV cameras is advantageous for 3D computer vision tasks such as the precise camera tracking that is done as part of visual navigation for robots and autonomous vehicles. At the same time, action cameras with WFOV such as the GoPro Hero are widely used. Similarly, camera mounts for mobile devices that enable them to take panoramic images, e.g., using a catadioptric lens, are becoming more frequent. As a result, more and more WFOV images are becoming available on photo sharing websites such as Flickr and Picasa. In the case of Structure-from-Motion (SfM) from photo community collections, these photos could be particularly helpful to strengthen the overall reconstruction as they provide constraints to many other cameras.
However, they are typically discarded in practice due to the challenge of automatically calibrating these cameras. In this paper, we present a novel method to automatically calibrate WFOV cameras from 2D-3D matches established between features extracted in their images and 3D points in a SfM reconstruction using image-based localization methods. Given a partial reconstruction obtained from regular images, our method can thus be used to calibrate WFOV images and then insert them into a SfM model. Because many parts of the scene are visible together in these photos, they strongly link different parts of the model and thus improve the quality of the reconstruction.

Our method is based on the 1D radial camera model [15-17], which can be used to describe any type of camera with radial distortion, including pinhole, fisheye, and non-central cameras such as catadioptric lenses, as long as the center of distortion is known. We combine this model with a nonparametric intrinsic calibration to obtain an extremely powerful calibration method that is capable of calibrating a wide range of camera types.

Our method consists of two steps. In the first step, the extrinsic calibration is computed from the 2D-3D matches up to the position of the camera center (or camera centers in the case of non-central cameras) along the optical axis. This can be done efficiently using a linear 7-point solver inside a RANSAC loop [5]. Given the partial extrinsic calibration, we employ a novel ordering constraint on the opening angle of the viewing rays corresponding to the 2D features to estimate the remaining extrinsic parameter in the case of a central camera. Since we are considering radially symmetric cameras, fixing the camera center directly provides the intrinsic calibration, as it provides a mapping from image positions to viewing rays. A slightly generalized version of this constraint enables us to obtain a more accurate calibration from multiple photos taken with the same camera. The constraint can be further generalized to also handle non-central cameras, both when only a single photo and when multiple photos are available.
In addition to its generality, our novel ordering constraint enables us to formulate the second part of the calibration process as a convex optimization program. We experimentally demonstrate the accuracy of our calibration pipeline for a wide range of cameras. Additionally, we make the source code of our method available [1].

The remainder of the paper is structured as follows. Sec. 2 discusses related work. Sec. 3 reviews the 1D radial camera model. Sec. 4 introduces our novel ordering constraint and derives our calibration method for the case of central cameras. The extension to non-central cameras is then provided in Sec. 5. Finally, Sec. 6 experimentally evaluates our approach on both synthetic and real data.

2. Related Work

Recently, there has been some interest in non-parametric calibration of radially symmetric cameras. For instance, in [8, 11] a non-iterative, non-parametric method for calibration of fisheye cameras is proposed. While the authors claim that the method can do without a calibration pattern, it is only tested using one, since their auto-calibration needs several correspondences and is very sensitive to noise. Similar to this, and much more related to our work, in [15-17] Thirthala and Pollefeys developed the 1D radial camera model. They propose a multi-focal tensor able to auto-calibrate any radially symmetric set of cameras (including non-central ones), and also produce a non-parametric calibration. In this work we use this same model; however, in contrast to their approach, we develop a more general and robust geometric ordering constraint to calibrate with.

Several other methods make use of either a specific scene structure (enough straight lines) or calibration objects to compute their calibration, e.g. [7, 13]. In [14] an approach for self-calibration of radially symmetric cameras is presented. The authors develop plumb-line methods (using the fact that straight lines in space must project into straight lines in the image) and plane-based methods. The method in [6] relies on the observation of at least three lines to compute the parameters of a para-catadioptric system. In contrast to these, we remain flexible by enforcing no requirements on the scene or a calibration object.

For our method we first estimate the (partial) extrinsics (i.e. the pose) w.r.t. a SfM model. There have been numerous advances geared toward pose estimation in the absence of calibration. For example, Kukelova et al. propose a 5-point pose estimator with unknown radial distortion and focal length [10]. They employ an idea similar to the 1D radial camera model in order to efficiently obtain a pose and a 3-parameter calibration.
Also, minimal 4-point solvers for this same case are presented in [3, 9], which likewise make use of a parametric model to obtain a solution. In contrast to these methods, we do not require that the camera can be represented by a specific model. Furthermore, we extend our method to seamlessly aggregate data from several views of the same camera to increase the accuracy of the calibration and to better handle non-central systems.

3. The 1D Radial Camera Model

In order to calibrate any type of radially symmetric camera, this paper builds on the 1D radial camera model. For a more in-depth analysis the reader is referred to [17]; however, we briefly review it in this section for completeness.

Let $\mathbf{C}_d$ be the center of distortion for a camera exhibiting radial distortion. Let $\mathbf{x}_u$ denote the undistorted projection of a 3D point $\mathbf{X}$ onto the camera's image. As illustrated in Fig. 1, applying radial distortion maps $\mathbf{x}_u$ to a point on the radial line $\mathbf{l} = \mathbf{x}_u \times \mathbf{C}_d$ through the center of distortion and the undistorted image coordinates. Similarly, $\mathbf{x}_u$ lies on the line $\mathbf{l} = \mathbf{x}_d \times \mathbf{C}_d$ defined by the distorted measurement $\mathbf{x}_d$. Instead of explicitly modeling the radial distortion, the 1D radial camera model defines a projection up to radial distortion. This is expressed as a mapping $\mathbb{P}^3 \to \mathbb{P}^1$ that associates each 3D point to a line $\lambda \mathbf{l} = P_r \mathbf{X}$. The projection matrix $P_r \in \mathbb{R}^{2 \times 4}$ relates to the first two rows of the camera pose $(R \;\, \mathbf{t})$ by

$$P_r = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} R_1 & t_x \\ R_2 & t_y \end{bmatrix}, \tag{1}$$

where $R_i$ is the $i$-th row of the rotation matrix $R$.

Figure 1: Radial 1D camera. Image plane (left) and top view (right) of the projection of point $\mathbf{X}$.

Notice that the 1D radial camera model, other than unit aspect ratio, makes no assumption on the internal calibration of the camera. In fact, it describes both central and non-central cameras as long as there is a single center of distortion, including pinhole, fisheye, and catadioptric cameras. As in [17], we assume that $\mathbf{C}_d$ is known, enabling us to center the image around $\mathbf{C}_d$. For most cameras, the center of the image is a reasonable approximation for $\mathbf{C}_d$.
Alternatively, it can be estimated using the visible rim of the catadioptric mirror or the edge of the fisheye lens (cf. Fig. 8).

4. Calibrating Central Radially Symmetric Cameras

Given a 3D model of the scene, our goal is to estimate both the extrinsic and intrinsic calibration from 2D-3D correspondences $(\mathbf{x}_i, \mathbf{X}_i)$ between positions in an image taken with a radially symmetric camera and the model. Since the projection matrix $P_r$ does not depend on the intrinsic calibration, we use a two-stage approach. In the first stage, we use RANSAC [5] to estimate the extrinsic calibration up to an unknown translation along the optical axis. The inliers to the pose are then used to non-parametrically estimate the intrinsic calibration. Sec. 4.1 details the computation of the partial extrinsic calibration. In Sec. 4.2, we then derive a novel ordering constraint that allows us to compute the intrinsic calibration by solving a convex optimization problem. Sec. 4.3 shows that the same constraint can be used to calibrate a camera from multiple images. We show in Sec. 5 how to extend our approach to handle non-central radially symmetric cameras.

4.1. Partial Extrinsic Calibration

Let $\mathbf{x}_d^i = (x_d^i, y_d^i)$ be the position of a distorted measurement in a coordinate system centered at the center of distortion. The radial line of the $i$-th correspondence $(\mathbf{x}_d^i, \mathbf{X}_i)$ can then be expressed as

$$\mathbf{l}_i = \begin{bmatrix} y_d^i / x_d^i \\ 1 \end{bmatrix} = \begin{bmatrix} l_i \\ 1 \end{bmatrix} = P_r \mathbf{X}_i. \tag{2}$$

By multiplying $\mathbf{l}_i$ by its perpendicular vector $(1, -l_i)$, we obtain

$$P_{r1} \mathbf{X}_i - l_i \left( P_{r2} \mathbf{X}_i \right) = 0, \tag{3}$$

where $P_{rn}$ represents the $n$-th row of the matrix $P_r$. Thus, each 2D-3D correspondence gives us one constraint. Since $P_r$ is only defined up to scale, it can be estimated linearly from seven matches by rearranging (3). Once we have an estimate for $P_r$, we can recover the full rotation matrix $R$ by exploiting the fact that rotation matrices are orthonormal matrices with determinant one (cf. (1)).

Given a set of 2D-3D correspondences, we estimate $P_r$ by using the 7-point solver inside a RANSAC loop. In order to distinguish between inliers and outliers, we measure the subtended angle between the predicted and the observed radial lines $\hat{\mathbf{l}}_i = P_r \mathbf{X}_i$ and $\mathbf{l}_i$. A match is considered to be an inlier if the angle is below a given threshold $\sigma$ (set to 1 in our experiments). Notice that $P_r$ has only five degrees of freedom in total: three degrees of freedom for the rotation and two degrees of freedom for the partial translation $t_x$, $t_y$. Thus, the linear 7-point solver is non-minimal. If a minimal solver is required due to a high outlier ratio, the 5-point approach from [10] can be used, which requires solving a fourth-degree polynomial in a single variable.

[Figure 2, panels (a) and (b): (a) Single image central case. (b) Multi-image central case.]

4.2. Non-Parametric Intrinsic Calibration

The intrinsic calibration of a camera defines a mapping $\mathbf{r}(\mathbf{x})$ from image coordinates to viewing rays. In the case of radial symmetry, the angle $\theta$ between the ray $\mathbf{r}(\mathbf{x})$ and the optical axis is constant for all positions $\mathbf{x}$ with the same distance to the center of distortion, i.e., $\|\mathbf{x}\|_2 = r$. Consequently, the point $\mathbf{X}$ projecting to $\mathbf{x}$ has to lie on a cone along the optical axis with opening angle $\theta$ (cf. Fig. 1).
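The radial constraint (3) and the angular inlier test of Sec. 4.1 can be checked numerically. The following is a minimal sketch, assuming the reading of (1)-(3) in which $P_r \mathbf{X}$ is parallel to $(y_d, x_d)$. The function names, the example pose and the distortion factor are illustrative assumptions and are not taken from the released code [1]. The residual is (3) multiplied by $x_d$, which avoids the division in $l_i = y_d / x_d$.

```python
import math

def radial_projection_matrix(R, t_xy):
    """Build P_r from the first two rows of [R | t] as in (1): the row swap
    puts (R_2, t_y) first and (R_1, t_x) second."""
    row_x = list(R[0]) + [t_xy[0]]   # R_1, t_x
    row_y = list(R[1]) + [t_xy[1]]   # R_2, t_y
    return [row_y, row_x]

def apply(P, X):
    """Multiply the 2x4 matrix P with the homogeneous point (X, 1)."""
    Xh = list(X) + [1.0]
    return [sum(p * x for p, x in zip(row, Xh)) for row in P]

def constraint_residual(P, xd, X):
    """Residual of (3) scaled by x_d: x_d * (P_r1 X) - y_d * (P_r2 X)."""
    a, b = apply(P, X)
    return xd[0] * a - xd[1] * b

def radial_angle_deg(P, xd, X):
    """Angle in degrees between the predicted and the observed radial line.
    Lines through the center are unoriented, so the result lies in [0, 90]."""
    a, b = apply(P, X)               # predicted direction in (x, y) is (b, a)
    cross = b * xd[1] - a * xd[0]
    dot = b * xd[0] + a * xd[1]
    return math.degrees(math.atan2(abs(cross), abs(dot)))

# Hypothetical pose: 90 degree rotation about the optical axis plus (t_x, t_y).
R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
P = radial_projection_matrix(R, (0.2, -0.1))
X = (1.0, 2.0, 5.0)
# Any radial scaling of the ideal projection (-1.8, 0.9) stays on the same line.
xd = (0.37 * -1.8, 0.37 * 0.9)
print(constraint_residual(P, xd, X), radial_angle_deg(P, xd, X))
```

Because the residual is linear in the eight entries of $P_r$, stacking one such row per match gives the homogeneous linear system that a 7-point solver would solve; the null-space computation itself is omitted here.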
For two points $\mathbf{x}_{r_1}$, $\mathbf{x}_{r_2}$ with radii $r_1 < r_2$, we have $\theta_{r_1} < \theta_{r_2}$. In the following, we derive a geometric constraint from this observation from which we explicitly compute the mapping from radii to opening angles.

Given $P_r$ (cf. Sec. 4.1), the transformation of the 3D points from the global into the local coordinate system of the camera is defined up to a translation along the optical axis. Using $R$, $\mathbf{t} = (t_x, t_y, 0)$, we obtain an intermediate coordinate system in which the unknown translation corresponds to the position of the camera center $c$ on the optical axis. We notice that fixing $c$ defines the opening angle $\theta_i$ for a given 3D point $R\mathbf{X}_i + \mathbf{t}$ in the intermediate coordinate system. Thus, fixing $c$ fully defines the intrinsic calibration of the camera.

[Figure 2, panels (c) and (d): (c) Single image non-central. (d) Multi-image non-central.]
Figure 2: Ordering constraints for different systems. The abscissa of each figure, labeled $z$, is aligned with the optical axis of the camera. For Figures a, b and c, $r_d^i > r_d^j$.

A geometric ordering constraint on the camera center. We express each point $(x_i, y_i, \hat{z}_i) = (x_i, y_i, z_i - c)$ as $(\phi_i, \rho_i, \hat{z}_i)$ in a cylindrical coordinate system (cf. Fig. 2). Since we consider radially symmetric cameras, we can drop the angle $\phi_i$ of the point around the optical axis from the notation and only consider the distance $\rho_i$ of the 3D point to the optical axis and its depth $\hat{z}_i$. Consider two 3D points $\mathbf{p}_i = (\rho_i, \hat{z}_i)$, $\mathbf{p}_j = (\rho_j, \hat{z}_j)$ in the intermediate frame (cf. Fig. 2a), corresponding to radii $r_d^i$ and $r_d^j$ of the distorted image measurements $\mathbf{x}_d^i$, $\mathbf{x}_d^j$. Without loss of generality, let $\rho_i \neq \rho_j$ and let

$$I_{ij} = \frac{\hat{z}_j \rho_i - \hat{z}_i \rho_j}{\rho_i - \rho_j} \tag{4}$$

be the intersection of the 2D line containing the point pair with the optical axis $z$. In the case where $r_d^i = r_d^j$, $I_{ij}$ corresponds to the camera center $c$. Unfortunately, it is rather unlikely to find two features with exactly the same radius. In [15], the authors propose to fit a line through 3D points corresponding to similar radii to obtain a camera center per radius. In contrast, we use an ordering constraint to directly obtain $c$, as explained below.
Without loss of generality, assume that $r_d^i > r_d^j$ and thus $\theta_i > \theta_j$. In the case that $\rho_i > \rho_j$, it follows that $c < I_{ij}$ (cf. Fig. 2a). Similarly, $\rho_i < \rho_j$ yields the constraint $c > I_{ij}$.
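The one-sided constraint can be illustrated with a short sketch. It assumes points given as $(\rho, z)$ pairs with depths measured in the intermediate frame (before $c$ is subtracted), so that (4) returns a position on the optical axis that can be compared with $c$ directly. The names `axis_intersection` and `one_sided_constraint`, and the simulated equiangular camera in the example, are hypothetical.

```python
import math

def axis_intersection(p_i, p_j):
    """Equation (4): where the line through p_i and p_j meets the optical axis."""
    (rho_i, z_i), (rho_j, z_j) = p_i, p_j
    return (z_j * rho_i - z_i * rho_j) / (rho_i - rho_j)

def one_sided_constraint(p_i, r_i, p_j, r_j):
    """Return ('upper', I) if the pair demands c < I, ('lower', I) if it
    demands c > I, or None when radii or axis distances coincide."""
    if r_i == r_j or p_i[0] == p_j[0]:
        return None
    if r_i < r_j:                 # reorder so that r_i > r_j, as in the text
        p_i, p_j = p_j, p_i
    I = axis_intersection(p_i, p_j)
    return ('upper', I) if p_i[0] > p_j[0] else ('lower', I)

# Simulated central camera: center at c_true, image radius grows with theta.
c_true = 0.3
points = [(1.0, 4.0), (2.5, 3.0), (0.5, 6.0), (3.0, 1.0), (2.0, 0.1)]
radii = [100.0 * math.atan2(rho, z - c_true) for rho, z in points]

for i in range(len(points)):
    for j in range(i + 1, len(points)):
        kind, I = one_sided_constraint(points[i], radii[i], points[j], radii[j])
        holds = c_true < I if kind == 'upper' else c_true > I
        print(i, j, kind, round(I, 3), holds)
```

Every pair produced by a radially symmetric central camera is satisfied by the true center, whereas a wrong candidate center violates at least some pairs; this is what the cost function introduced next penalizes.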

Thus, for each point pair we get a one-sided constraint that restricts the value of $c$ to lie either to the left or to the right of $I_{ij}$. For each constraint we then build a cost function which penalizes a given $c$ that violates a one-sided constraint by using a piecewise cost function. For $r_d^i > r_d^j$ and $\rho_i > \rho_j$,

$$E^l_{ij}(c; \mathbf{p}_i, \mathbf{p}_j) = \begin{cases} 0 & c < I_{ij} \\ f(I_{ij} - c) & \text{otherwise,} \end{cases} \tag{5}$$

which penalizes $c$ if it is to the right of $I_{ij}$ (cf. Fig. 2a). Here, $f$ is a function depending on the distance between the intersection point and the center $c$. For the opposite configuration, either $r_d^i < r_d^j$ or $\rho_i < \rho_j$, we may build a similar cost function which penalizes $c$ for being to the left of $I_{ij}$.

Calibration through convex optimization. Using (5) we can then take the sum over all the cost functions

$$E(c) = \sum_{\{i,j\} \in L} E^l_{ij} + \sum_{\{i,j\} \in R} E^r_{ij}, \tag{6}$$

where $L = \{\{i,j\} \mid \rho_i > \rho_j \text{ and } r_d^i > r_d^j\}$ and $R = \{\{i,j\} \mid \rho_i > \rho_j \text{ and } r_d^i < r_d^j\}$. If $f$ is chosen to be a convex function, $E(c)$ will be convex. We can then obtain the camera center, and thus the intrinsic calibration, by optimizing (6) using, e.g., gradient descent. We choose $f$ to be an $L_1$ norm to be robust to outliers, while allowing $E$ to remain convex.

Furthermore, we propose a very simple algorithm for computing (6) when $f$ is a linear function. Since the slope of $E(c)$ changes only wherever there is an intersection, we may efficiently compute it in two passes. We start by sorting the intersections, such that $I_k < I_{k+1}$. On the first pass, from left to right, we deal only with the intersections that constrain $c$ to be to their left (shown in green in Fig. 3a) and iteratively compute the cost for each intersection. Starting with $E(I_0)^l = 0$, we can express the cost of the $k$-th intersection as the cost of the previous intersection plus the cost increase of the $k$ violating constraints from $I_{k-1}$ to $I_k$. Since $f$ is a linear function, the latter costs only depend on the distance between the current intersection and the last, i.e.

$$E(I_k)^l = E(I_{k-1})^l + k\, f(I_k - I_{k-1}). \tag{7}$$

On the second pass we sweep in the opposite direction, taking into account the intersections that constrain $c$ to be to their right.
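The two-pass evaluation can be sketched as follows, assuming $f(d) = |d|$ and intersections that have already been classified: `upper` holds the intersections that constrain $c$ to lie to their left ($c < I$), `lower` those that constrain it to lie to their right ($c > I$). All names are illustrative. The sketch also includes a direct evaluation of (6) for comparison.

```python
def direct_cost(c, upper, lower):
    """E(c) from (6) with f(d) = |d|: a constraint c < I is penalized by
    c - I when violated, a constraint c > I by I - c."""
    return (sum(c - I for I in upper if c > I)
            + sum(I - c for I in lower if c < I))

def sweep_costs(upper, lower):
    """Evaluate E at every intersection with two linear passes over the
    sorted intersections; returns a list of (intersection, cost)."""
    pts = sorted([(I, True) for I in upper] + [(I, False) for I in lower])
    n = len(pts)
    cost_left = [0.0] * n    # cost of violated c < I constraints
    cost_right = [0.0] * n   # cost of violated c > I constraints
    violated, acc = 0, 0.0
    for k in range(n):                    # pass 1: left to right, cf. (7)
        if k > 0:
            acc += violated * (pts[k][0] - pts[k - 1][0])
        cost_left[k] = acc
        if pts[k][1]:
            violated += 1                 # violated by everything to its right
    violated, acc = 0, 0.0
    for k in range(n - 1, -1, -1):        # pass 2: right to left
        if k < n - 1:
            acc += violated * (pts[k + 1][0] - pts[k][0])
        cost_right[k] = acc
        if not pts[k][1]:
            violated += 1                 # violated by everything to its left
    return [(pts[k][0], cost_left[k] + cost_right[k]) for k in range(n)]

def best_center(upper, lower):
    """Lowest-cost intersection. A convex piecewise-linear E attains its
    minimum at one of its breakpoints, so checking them is sufficient."""
    return min(sweep_costs(upper, lower), key=lambda ic: ic[1])

# Consistent constraints bracket the center; 0.9 acts as an outlier.
print(best_center([2.5, 3.0, 4.0], [0.5, 1.0, 1.8]))
print(sweep_costs([2.5, 3.0, 4.0, 0.9], [0.5, 1.0, 1.8]))
```

After sorting, each pass is linear in the number of intersections, whereas evaluating `direct_cost` at every breakpoint would be quadratic.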
The cost of a given $c_k$ is obtained by checking its nearest left- and right-constraining intersections and summing their costs, $E(c_k) = E^r(I_k) + E^l(I_k)$.

Figure 3: In (a) we show the proposed single-image linear cost function. In the multi-image case, each point pair defines a 2D constraint, depicted in (b).

Selecting point pairs. Given $N$ points in an image, it is impractical to exhaustively take all point pairs, since the number of pairs is $N(N-1)/2$. Instead, for a given image we want to only operate on a fixed number of pairs. To do so, we sort the point pairs by their quality, i.e. we prefer pairs which yield stable intersections close to $c$. For each pair $\mathbf{p}_{ij}$ we compute $\Delta r_{ij} = |r_d^i - r_d^j|$ and $\Delta\rho_{ij} = |\rho_i - \rho_j|$. First, we discard pairs with $\Delta\rho_{ij}$ less than a given threshold, which takes care of unstable intersections. Then we sort the pairs by $\Delta r_{ij}$ in ascending order and take only the first $N_s$ (set to 120 in our experiments) pairs of the sorted list.

4.3. Joint Calibration from Multiple Images

The approach presented in Sec. 4.2 essentially determines an interval in which the camera center can lie. Using more points adds more constraints on this interval, which should lead to more accurate estimates. Synthetic experiments have shown that approximately 250 imaged points are enough to obtain a calibration that achieves less than 1 pixel of RMSE on the reprojected points (cf. Fig. 5), while using fewer than 100 points leaves $c$ very underconstrained and the resulting calibration will be unreliable¹. Naturally, additional points can be obtained by using multiple images for the calibration. Thus, in this section we show that our geometric ordering constraint can easily be extended to allow calibrating a camera from $M > 1$ images. By expressing 3D points in the $\rho z$-plane we can transform all cameras to a common frame of reference by finding a one-dimensional relative translation between them (cf. Fig. 2b). This allows us to employ our one-sided constraint to find this relative translation and a joint calibration.

Joint constraint for central cameras.
Given two cameras $s(i)$ and $s(j)$ we can express the intersection of any point pair $\mathbf{p}_{ij}$ between them as

$$I_{s(i),s(j)} = \left( z_j - c_{s(j)} \right) \rho_i - \left( z_i - c_{s(i)} \right) \rho_j, \tag{8}$$

where $s(i)$ indicates to which camera the point $i$ corresponds. Notice that this is almost the same as (4); however, (8) provides a constraint that now depends on two variables (cf. Fig. 3b), making the position of one camera dependent on the other. The cost function $E : \mathbb{R}^M \to \mathbb{R}$ can also be designed as a piecewise function. For the configuration $r_d^i > r_d^j$ and $\rho_i > \rho_j$ we define

$$E^l_{ij}\left( c_{s(i)}, c_{s(j)} \right) = \begin{cases} 0 & I_{s(i),s(j)} < 0 \\ g\left( c_{s(i)}, c_{s(j)} \right) & \text{otherwise,} \end{cases} \tag{9}$$

where $g$ is a cost function on the distance from the given center pair to the intersection (cf. Fig. 2b). As with (5), we decide to use the $L_1$ norm as a cost function to remain robust to outliers.

Calibration through convex optimization. Similarly to (6), we take the sum over all relevant point pairs to get $E$, which can be minimized using a convex optimization method. Notice that the selection criteria for point pairs described in Sec. 4.2 apply here as well, since we may aggregate image radii from all cameras into one single sorted list to choose $N_s$ relevant pairs. Finally, we get a calibration by translating the $z$ coordinate of the points by the camera center that obtained the observation: $\theta(r_d^i) = \arctan\left( \rho_i / (z_i - c_{s(i)}) \right)$.

¹ Notice that toolboxes such as the one described in [13] suggest using 6 to 10 images. Assuming a calibration pattern with 48 corners, such methods use up to 480 points.

5. Calibrating Non-Central Cameras

Since we are only dealing with radially symmetric cameras, the centers of the camera can be expressed as a function of the distorted image radius, $c = c(r_d)$. So, any point $\mathbf{p}_i = (\rho_i, \hat{z}_i)$ has $\hat{z}_i = z_i - c_i$, where $c_i = c(r_d^i)$.

Non-central constraint. Any point pair $\mathbf{p}_{ij}$ will constrain both centers $c_i$ and $c_j$ (see Fig. 2c). Given $N$ image points, we have $N - 1$ constraints for each center we need to estimate. We treat each of the $N$ camera centers as a different view of the scene (i.e. $s(i) = i$) and apply the method described in Sec. 4.3. However, in practice the $N - 1$ constraints might not limit the location of a given center enough, yielding inaccurate results for centers with weak or too few constraints (e.g. for centers that correspond to radii near the edge or center of the image, since these are mostly same-side constraints). To solve this we propose to impose an ordering constraint on the centers. We first sort all the points $\mathbf{p}_i$ by their radii such that $r_d^{i-1} < r_d^i < r_d^{i+1}$, which restricts their corresponding centers to

$$c\left( r_d^{i-1} \right) < c\left( r_d^i \right) < c\left( r_d^{i+1} \right). \tag{10}$$

This is sensible given that all radially symmetric non-central systems known to the authors follow this ordering (e.g. spherical catadioptric, para-catadioptric).
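Once the centers are known, the mapping from radii to opening angles and the ordering (10) are straightforward to evaluate. The sketch below assumes points given as $(\rho, z)$ pairs in the common intermediate frame and one center per observation: for the multi-image central case of Sec. 4.3 this is the center of the camera that made the observation, $c_{s(i)}$, and for the non-central case it is $c_i$ with $s(i) = i$. The function names are hypothetical, and `atan2` replaces the plain arctangent as a design choice so that rays more than 90 degrees from the optical axis keep the correct angle.

```python
import math

def calibration_mapping(points, radii, centers):
    """theta(r_d^i) = arctan(rho_i / (z_i - c_i)) for every observation,
    returned as (radius, theta) samples sorted by radius."""
    samples = [(r, math.atan2(rho, z - c))
               for (rho, z), r, c in zip(points, radii, centers)]
    return sorted(samples)

def centers_are_ordered(radii, centers):
    """Check the ordering (10): centers must increase with the image radius."""
    by_radius = [c for _, c in sorted(zip(radii, centers))]
    return all(a < b for a, b in zip(by_radius, by_radius[1:]))

# Synthetic non-central example: the center drifts along the axis with radius.
radii = [10.0, 20.0, 30.0]
centers = [0.1, 0.2, 0.3]
true_thetas = [0.2, 0.4, 0.6]
points = [(2.0 * math.sin(t), c + 2.0 * math.cos(t))
          for t, c in zip(true_thetas, centers)]

print(calibration_mapping(points, radii, centers))
print(centers_are_ordered(radii, centers))
```

The sorted samples form the non-parametric calibration; how they are smoothed or interpolated afterwards is a separate choice.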
This constraint can seamlessly be translated into the one-sided constraints (cf. Fig. 4). We define

$$E_k(\mathbf{c}) = \sum_{i=0}^{N} \left( E^r_{ik}(\mathbf{c}) + E^l_{ik}(\mathbf{c}) \right), \tag{11}$$

i.e. the cost of the $k$-th point against the rest, and write $E^r_k(\mathbf{c})$ and $E^l_k(\mathbf{c})$ for its right- and left-constraint parts. Then for a center $c_i$ the cost becomes

$$E(c_i) = \sum_{k=0}^{i} E^r_k(\mathbf{c}) + \sum_{k=i}^{N} E^l_k(\mathbf{c}), \tag{12}$$

in other words, we use the one-sided left constraints of the centers that should be larger than $c_i$ and the one-sided right constraints of those centers that should be smaller than $c_i$ (see Fig. 2c). Minimizing (12) we get a set of centers which can be used to get the final calibration mapping $\theta(r_d^i) = \arctan\left( \rho_i / (z_i - c_i) \right)$.

Figure 4: Illustration of the ordering constraint from simulated data. In red, each center is constrained to lie above a certain value, and vice versa for blue. Notice that we may use any of the blue constraints lying to the left of any given pixel radius and vice versa.

Joint solution for non-central cameras. To get a joint non-central calibration we use a two-step procedure. First, we treat each camera as a central system and solve for their joint calibration, which provides us with an estimate of their displacements $d_k$ (see Fig. 2d). We use this to translate all the data points across different views into the same frame of reference, i.e. to have a mutually consistent depth. Second, we solve for a single non-central system by treating all the translated points as if they came from a single view. This allows us to keep the number of points needed for a successful calibration relatively low (around 350 points per image, cf. Fig. 5).

5.1. Refinement and Final Calibration

One of the primary benefits of our method is that we provide a calibration that does not rely on a given parametrization; thus we can accommodate a very wide range of cameras, from planar to catadioptric, central and non-central. However, we wish to refine our obtained solution by removing views and points based on their reprojection errors, and to do so we must find a way to use the obtained mapping. For this we opt to use a sliding median [8] of the calibration

data obtained². For all views we compute the corresponding reprojection errors and remove those points whose errors rise beyond a certain threshold (set to 5 pixels in our experiments). After this, we get a final set of inliers and recompute the calibration by repeating the corresponding procedure.

² However, with a calibration that is agnostic to the particular optics of the setup, one is free to use a more sophisticated method to approximate the distortion function.

Figure 5: RMSE when varying the number of matches used to obtain the calibration (pixel $\sigma$ set to 1.2). The red asterisk indicates that prior to that number of matches, the calibration failed.

6. Experimental Evaluation

To evaluate the proposed method we perform experiments with real and synthetic data. Since one of the strengths of the method is that it can handle a very wide array of cameras, we make a point of trying as many cameras as possible (cf. Fig. 8).

6.1. Synthetic Data

We first carried out experiments on synthetic data to evaluate the performance of our methods. We populate the scene with 320 data points distributed randomly. To simulate the central case, we project the data points into the camera using a pinhole model as well as two well-known fisheye models, the Field of View (FOV) model [4] and the equiangular model. For the non-central case, we chose a spherical catadioptric camera. To compute the reflections we use [2]. Fig. 7 shows the calibration output of the central as well as the non-central simulations, Fig. 6 compares the accuracy of the generated calibrations against those obtained using the toolbox in [13], and Fig. 5 shows the error w.r.t. the number of matches used.

As can be seen from Fig. 7, the results for the central systems perfectly match the ground truth. In Fig. 7c we show the benefit of relaxing the method to handle non-central systems. At the beginning of the curve both the orange (central assumption) and blue (non-central) scatter plots match. However, as the non-centrality of the spherical model becomes more significant at higher radii of the simulated image, the deviation is more apparent. In Fig. 7d we show how the accuracy of $c(r_d)$ is affected when we do not enforce the ordering constraint (10). We show the resulting calibration mappings compared against the ground truth of the simulated data. This is of particular importance for the simulated spherical catadioptric system, since with real data we do not have reliable ground truth for $c(r_d)$.

Figure 6: Comparison of the reprojection error and its standard deviation against [13] for (a) equiangular, (b) FOV and (c) spherical catadioptric cameras. To obtain the calibrations, our method used 320 points while the method we compare against used 21 images of a 48-point synthetic calibration pattern. Notice that for Fig. 6c the error is always lower using our calibration since we explicitly support non-central systems.

6.2. Real Data

In order to test the flexibility of the method, we tried several different cameras and lenses (cf. Fig. 8): a Nikon D300 coupled with a fisheye lens, a 360One VR catadioptric lens and a spherical catadioptric lens (using a 3-inch steel ball). To assess the performance for the mentioned mobile phone attachments, we ran tests using an iPhone 4 with a GoPano catadioptric attachment. Results from other cameras and lenses are provided as supplemental material [1]. Due to the high distortion observed with most of the lenses, we do not obtain a very high number of matches and thus we must use the multi-camera methods.

To get more complete calibrations in the catadioptric cases we need to increase the number of matches near the edges of the reflection. To do so, we first use an equiangular calibration, $\theta = k\, r_d$, where we find $k$ by having $\theta = \pi/2$ map to the largest radii in the image. We use this to warp the image into a cylindrical map which we use to get matches against the SfM model³. Note that this places no restrictions on the images we are able to handle, since the only assumption, as before, is that the images are radially symmetric.

³ The toolbox used for comparison had to be manually assisted to get the corners of the calibration pattern for this particularly difficult data.

The SfM model used consists of a large-scale reconstruction, obtained in an outdoor location, cf. Fig. 10. We took several images with each camera type at the same location and obtained putative 3D-2D matches (around 120 for each image). In order to maximize the number of matches obtained, we employed the method proposed in [12], modified to return as many matches as possible. Because of the drastically large distortion, we observed inlier ratios as low as 20%, and thus several images were needed for each camera type (between 20 and 25), from which we were left with approximately 500 points.

Figure 9: Real calibrations for central and non-central cases: (a) Nikon D300 with fisheye lens, (b) iPhone 4 with GoPano catadioptric lens, (c) spherical reflection with Nikon D300, (d) 360One VR catadioptric lens with Nikon D300. All results shown are compared against the calibration obtained using [13], shown in gray. As can be seen for all four cameras, our solution accurately matches the one from this state-of-the-art calibration toolbox. Calibration points are colored according to the image index used, to emphasize the number of images used per camera. To highlight the comparison with the reference calibration, we also plot the distortion function $F$ (the plot on the right for each case) as detailed in [13].

In Fig. 9 we show the calibration obtained for a selection of the tested systems. For each case, we obtained calibrations that closely match the calibration computed with the toolbox in [13]. To emphasize the correctness of the obtained calibration, we compare the previously discussed mapping $\theta(r_d)$ and the function $F$. This function is described in [13] as the focal length as a function of the image radius, where a point with image coordinates $(u, v)$ can be expressed in the camera frame as $(u, v, F(r_d))$. For the fisheye images (cf. Fig. 9a), we got a very high number of matches, since the query images most closely resembled the images used to construct the SfM model.
For this particular model, we have enough data to see that the calibration near the center of the image suffers more than the rest due to weak constraints. However, for the case of the non-central system (cf. Fig. 9c) there is a larger mismatch throughout between our obtained calibration and that of [13], since our method fully supports non-central systems. For the GoPano attachment, as shown in Fig. 9b, we have very few matches near the border and this is reflected in the scattered data points at the end of the curve. Nevertheless, we are overall able to calibrate even such a low-quality lens system. The reference calibration method additionally computes the refined centers of distortion. However, the fact that our calibration closely matches the reference calibration shows that using the center of the image as the center of distortion is a valid assumption in practice.

Figure 7: Synthetic calibration $\theta(r_d)$ for the central cameras, (a) equiangular and (b) FOV, and for the non-central spherical camera (c), where in orange we show how a central assumption would not be as accurate. Panel (d) shows the estimate of $c(r_d)$ for the spherical non-central case; notice here the effect of employing the ordering constraint (10).

Figure 8: Sample of the data used for the experiments for the same outdoor location. From top to bottom, left to right: the 360One catadioptric lens, the spherical catadioptric setup, the GoPano iPhone attachment and the D300 with a fisheye lens. Notice the wide range of distortions as well as the low quality in the case of the GoPano.

Figure 10: The SfM point cloud used for the experimental evaluation.

7. Conclusion

In this work we presented a novel, flexible, structure-based calibration method for radially symmetric cameras. Indeed, this subset of cameras encompasses most of the systems used nowadays, such as planar, fisheye, catadioptric, WFOV, and so on. We are thus able to handle the calibration of several systems under a single framework, which would usually require several different calibration methods. Furthermore, WFOV imagery is becoming more ubiquitous through products such as the GoPro and WFOV lens attachments for mobile phones. With our method, we can make use of this increasingly popular image modality to augment and strengthen SfM models produced from online photo collections. Online WFOV images can thus be calibrated and inserted as part of an existing SfM model. This would be greatly beneficial for the quality of the model, since these types of images can strongly link several parts of the model which were never visible before from the same view.

The described calibration method makes use of the 1D radial camera [15] to decouple the estimation of the extrinsic (up to translation along the optical axis) and intrinsic calibration of any radially symmetric camera into two separate steps. In particular, the partial extrinsics are obtained via a linear 7-point solver in conjunction with RANSAC, while the computation of the intrinsics is carried out by minimizing an outlier-robust convex cost function for both the single and the multi-image case. We compute the calibration as a mapping from distorted image radii into the angle of the corresponding 3D ray w.r.t. the optical axis of the camera. By opting for a nonparametric calibration we are able to maintain a very broad compatibility with any camera that fits the 1D radial model.

The approach is validated experimentally on synthetic and real data, and its accuracy and robustness are assessed by comparing the obtained calibration mappings against the calibration from a state-of-the-art toolbox [13]. We make our source code available at [1].

Acknowledgements

The research leading to these results has received funding from Google's Project Tango. We would like to thank Dr. Martin Oswald for his valuable input regarding the convex optimization.

References

[1] Project page. http://www.cvg.ethz.ch/research/radially-symmetric-cameras/.
[2] A. Agrawal, Y. Taguchi, and S. Ramalingam. Analytical forward projection for axial non-central dioptric and catadioptric cameras. In ECCV, 2010.
[3] M. Bujnak, Z. Kukelova, and T. Pajdla. New efficient solution to the absolute pose problem for camera with unknown focal length and radial distortion. In ACCV, 2010.
[4] F. Devernay and O. Faugeras. Straight lines have to be straight. Machine Vision and Applications, 13(1):14-24, 2001.
[5] M. A. Fischler and R. C. Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381-395, 1981.
[6] C. Geyer and K. Daniilidis. Paracatadioptric camera calibration. PAMI, 24(5):687-695, 2002.
[7] A. Goshtasby. Correction of image deformation from lens distortion using bezier patches. Computer Vision, Graphics, and Image Processing, 47(3):385-394, 1989.
[8] R. Hartley and S. B. Kang. Parameter-free radial distortion correction with center of distortion estimation. PAMI, 29(8):1309-1321, 2007.
[9] K. Josephson and M. Byrod. Pose estimation with radial distortion and unknown focal length. In CVPR, 2009.
[10] Z. Kukelova, M. Bujnak, and T. Pajdla. Real-time solution to the absolute pose problem with unknown radial distortion and focal length. In ICCV, 2013.
[11] H. Li and R. Hartley. Plane-based calibration and autocalibration of a fish-eye camera. In ACCV, 2006.
[12] T. Sattler, B. Leibe, and L. Kobbelt. Improving image-based localization by active correspondence search. In ECCV, 2012.
[13] D. Scaramuzza, A. Martinelli, and R. Siegwart. A toolbox for easily calibrating omnidirectional cameras. In IROS, 2006.
[14] J.-P. Tardif, P. Sturm, and S. Roy. Self-calibration of a general radially symmetric distortion model. In ECCV, 2006.
[15] S. Thirthala and M. Pollefeys. Multi-view geometry of 1d radial cameras and its application to omnidirectional camera calibration. In CVPR, 2005.
[16] S. Thirthala and M. Pollefeys. The radial trifocal tensor: A tool for calibrating the radial distortion of wide-angle cameras. In CVPR, 2005.
[17] S. Thirthala and M. Pollefeys. Radial multi-focal tensors. International Journal of Computer Vision, 96(2):195-211, 2012.