Proceedings of the 6th International Symposium on Artificial Intelligence and Robotics & Automation in Space: i-SAIRAS 2001, Canadian Space Agency, St-Hubert, Quebec, Canada, June 18-22, 2001.

Multi-resolution Mapping Using Surface, Descent and Orbital Images

Clark F. Olson, Larry H. Matthies, Yalin Xiong
Jet Propulsion Laboratory, California Institute of Technology
4800 Oak Grove Drive, Pasadena, CA 91109-8099

Rongxing Li, Fei Ma, Fengliang Xu
Department of Civil and Environmental Engineering and Geodetic Science
The Ohio State University, Columbus, OH 43210-1275

Abstract

Our goal is to produce high-accuracy maps of the terrain elevation at landing sites on planetary bodies through the use of all available image data. We use images taken on the planetary surface by landers and rovers, images captured during the lander descent to the surface, and orbital images. Three new capabilities have been developed. First, we generate elevation maps from descent images using structure-from-motion techniques. These maps are useful for rover navigation and provide a link between the orbital images and surface images. Second, we have developed a methodology for performing rover localization using bundle adjustment that uses tie points between the rover and descent images to determine both the camera and the tie point locations. Finally, a new method to perform registration between orbital images and descent images has been developed that locates the landing position in the orbital imagery and allows integration of the entire data set. These technologies are important for performing rover navigation in future space missions and the maps provide a tool for coordinating rovers in a robotic colony.

Keywords: terrain mapping, structure-from-motion, rover localization, image registration.

1 Introduction

For the exploration of planetary surfaces, the images taken during the lander's descent provide a critical link between orbital imagery and imagery taken on the surface using rovers and landers.
The descent imagery not only provides information about the landing location in a global coordinate system, but also yields progressively higher resolution maps for mapping and mission planning for rovers. We address the issue of mapping using all available imagery, including imagery from the surface, descent imagery, and orbital imagery.

In order to map the data from all sources of imagery and combine them into a multi-resolution map, we have developed new capabilities. First, we have developed techniques for building three-dimensional terrain maps from descent imagery by comparing each pair of images in the nested sequence. Next, we have created techniques for localizing rovers on the surface using the descent imagery. This allows the incorporation of surface imagery into the multi-resolution map structure. Finally, we developed a new method for the registration of descent imagery to orbital imagery using entropy alignment.

Our approach to mapping descent imagery has two steps: motion refinement and depth recovery. In motion refinement, we use an initial motion estimate to avoid the intrinsic ambiguity in descending motions. The objective of the motion refinement is to adjust the motion parameters such that the epipolar constraints are valid between adjacent frames. The depth recovery step correlates adjacent frames to match pixels for triangulation. Due to the descending motion, the conventional rectification process is replaced by a set of anti-aliasing image warpings corresponding to a set of virtual parallel planes.

In order to locate rover positions in the map constructed from descent imagery and enable incorporation of the rover imagery into the map, we have developed bundle adjustment techniques for rover localization. In this method, tie points that represent the same locations in the rover and descent imagery are determined, and optimization techniques are used to determine the camera and tie point positions for each image in a batch optimization. These techniques have been extended to allow incremental localization, so that new imagery can be efficiently added to the
network of tie points.

We have also developed a new method for comparing descent imagery to orbital imagery in order to locate the landing site and provide context for the descent imagery. Unlike matching using mutual information [6, 9], which fails in our test cases, our method takes advantage of shape information in the imagery by comparing the alignment of spatial entropy in the orbital image and the descent image. The basic method is to transform each image into a new image representing the entropy at each location. The images are then compared (for example, using normalized correlation) to determine a position where the entropies are best aligned.

Each of these methods has been tested by applying it to a set of data collected during rover field testing at Silver Lake, California. During this field test, a set of descent imagery was collected using a helicopter. The data set consists of eight images taken at elevations that range from 086 m to 8 m above the ground. These images, together with the thousands of rover images collected at the site and SPOT satellite imagery, yield a rich set of data for testing rover mapping and localization algorithms in the context of a planetary landing scenario.

2 Mapping Descent Images

We recover depth maps from descent images using a two-stage process. First, motion refinement is performed in order to guarantee that the epipolar constraints are satisfied between the images. Then, depth recovery is performed using an algorithm based on correlation.

2.1 Motion Refinement

Recovering camera motion from two or more frames is one of the classical problems in computer vision. Linear [5] and nonlinear [8] solutions have been proposed. For descent motions (as in Fig. 1), generic motion recovery from matched features is ill-posed owing to a numerical singularity. Since the camera is rigidly attached to the lander, and the change in the lander orientation can be measured accurately by an onboard inertial navigation system, we can eliminate the singularity problem by adding a penalty term for deviating from the measured orientation.
We recover the camera motion by tracking features in the image and optimizing based on epipolar constraints. For each pair of adjacent frames in the sequence, we track features that have been selected in the higher resolution frame into the lower resolution frame. We use Förstner's interest operator [2] to evaluate the trackability of the features in the higher resolution frame. We select the features with high scores, while disallowing features that are too close together. Once the image resolutions have been equalized (through downsampling, or anti-aliasing warping, if necessary), feature tracking is performed in a straightforward manner using normalized correlation.

Figure 1: Descent motion.

The tracked features provide a rich set of observations to constrain the camera motion, even though the relationship between the locations of the tracked features and the camera motion parameters is highly nonlinear. Let us assume that the projection matrix of the camera is $M$, the location of feature $i$ at time $t$ is $[X_i, Y_i, Z_i]^T$, its image location at time $t$ represented in homogeneous coordinates is $[x_i^t, y_i^t, z_i^t]^T$, and the camera motion between time $t$ and time $t+1$ is composed of a translation $T$ and rotation $R$ ($3 \times 3$ matrix). The projection of the feature at time $t$ is, thus:

$$\begin{bmatrix} x_i^t \\ y_i^t \\ z_i^t \end{bmatrix} = M \begin{bmatrix} X_i \\ Y_i \\ Z_i \end{bmatrix}, \tag{1}$$

and the projection at time $t+1$ is:

$$\begin{bmatrix} x_i^{t+1} \\ y_i^{t+1} \\ z_i^{t+1} \end{bmatrix} = M \left( R \begin{bmatrix} X_i \\ Y_i \\ Z_i \end{bmatrix} + T \right). \tag{2}$$

Therefore, the feature motion in the image is:

$$\begin{bmatrix} x_i^{t+1} \\ y_i^{t+1} \\ z_i^{t+1} \end{bmatrix} = M \left( R M^{-1} \begin{bmatrix} x_i^t \\ y_i^t \\ z_i^t \end{bmatrix} + T \right) = U \begin{bmatrix} x_i^t \\ y_i^t \\ z_i^t \end{bmatrix} + V, \tag{3}$$
where $U = M R M^{-1}$ is a $3 \times 3$ matrix and $V = MT$ is a 3-vector. Let $[c_i^t, r_i^t] = [x_i^t / z_i^t,\; y_i^t / z_i^t]$ denote the actual column and row location of feature $i$ in image coordinates at time $t$. We then have the predicted feature locations at time $t+1$ as:

$$\hat{c}_i^{t+1} = \frac{u_{00} x_i^t + u_{01} y_i^t + u_{02} z_i^t + v_0}{u_{20} x_i^t + u_{21} y_i^t + u_{22} z_i^t + v_2}, \tag{4}$$

$$\hat{r}_i^{t+1} = \frac{u_{10} x_i^t + u_{11} y_i^t + u_{12} z_i^t + v_1}{u_{20} x_i^t + u_{21} y_i^t + u_{22} z_i^t + v_2}, \tag{5}$$

where $u_{ij}$ and $v_i$ are elements of $U$ and $V$ respectively.

In order to refine the motion estimate, we augment the parameters with depth estimates for each of the features. There are two advantages to this approach. First, the objective function becomes just the distance between the predicted and observed feature locations. Therefore, it is guaranteed to have no bias if the observations contain no bias. In addition, in the context of mapping descent images, we have a good initial estimate of the depth value from the spacecraft altimeter. Incorporating this information will, thus, improve the optimization in general.

Let us say that the depth value of feature $i$ at time $t$ is $d_i^t$ and the camera is pointing along the z-axis, so that the homogeneous coordinates of the feature are $[x_i^t, y_i^t, z_i^t]^T = d_i^t [c_i^t, r_i^t, 1]^T$. Therefore, the overall objective function we are minimizing is:

$$\sum_{i=0}^{N-1} \left[ \left( c_i^{t+1} - \hat{c}_i^{t+1} \right)^2 + \left( r_i^{t+1} - \hat{r}_i^{t+1} \right)^2 \right], \tag{6}$$

where $N$ is the number of features, and $\hat{c}_i^{t+1}$ and $\hat{r}_i^{t+1}$ are nonlinear functions of the camera motion and depth value $d_i^t$ given by Eq. (4) and (5). We perform nonlinear minimization using the Levenberg-Marquardt algorithm.

2.2 Depth Map Recovery

The second step of our method generates depth maps using correlations between image pairs. In order to compute the image correlation efficiently, we need to rectify the images in a manner similar to binocular stereo. Unfortunately, it is impossible to rectify the images along scanlines because the epipolar lines intersect each other near the center of the images. If we resample the images along epipolar lines as in stereo, we will oversample near the image center, and undersample near the image boundaries.
In order to avoid this problem, we adopt a slicing algorithm that allows us to perform the correlation efficiently. The main concept is to use a set of virtual planar surfaces slicing through the terrain. The virtual planar surfaces are similar, in concept, to horopter surfaces [1] in stereo. For every planar surface $k$, if the terrain surface lies on the planar surface, there exists a projective warping $P_k$ between two images. If we designate the first image $I_1(x, y)$ and the second image $I_2(x, y)$, then for every virtual planar surface, we can compute a correlation image as the sum-of-squared-differences (SSD):

$$C_k(x, y) = \sum_{m=x-W}^{x+W} \; \sum_{n=y-W}^{y+W} \left( I_2(m, n) - I_1^k(m, n) \right)^2, \tag{7}$$

where $2W + 1$ is the size of the correlation window and $I_1^k(x, y)$ is a warped version of $I_1(x, y)$:

$$I_1^k(x, y) = I_1\!\left( \frac{p_{00} x + p_{01} y + p_{02}}{p_{20} x + p_{21} y + p_{22}},\; \frac{p_{10} x + p_{11} y + p_{12}}{p_{20} x + p_{21} y + p_{22}} \right), \tag{8}$$

and $p_{ij}$ are elements of the $3 \times 3$ matrix $P_k$. Due to the drastic resolution difference, an anti-aliasing resampling such as [3] or a uniform downsampling of $I_2(x, y)$ is applied before the image warping. In practice, if the camera heading directions are close to perpendicular to the ground, a uniform downsampling before warping suffices. Otherwise, a space-variant downsampling is needed to equalize the image resolutions.

The depth value at each pixel is the depth of the planar surface $z_k$ whose corresponding SSD image pixel $C_k(x, y)$ is the smallest:

$$z(x, y) = z_k, \tag{9}$$

where

$$C_k(x, y) \le C_j(x, y), \quad j = 0, \ldots, M-1, \tag{10}$$

and $M$ is the number of planar surfaces. To further refine the depth values, the underlying SSD curve can be interpolated by a quadratic curve and the "subpixel" depth value can be computed as the location of the minimum of that curve:

$$z(x, y) = z_k + \frac{\delta z \left( C_{k-1}(x, y) - C_{k+1}(x, y) \right)}{2 \left( C_{k+1}(x, y) + C_{k-1}(x, y) - 2 C_k(x, y) \right)}, \tag{11}$$

where $\delta z$ is the depth increment between adjacent planar surfaces, so that $z_{k+1} = z_k + \delta z$.

The projective warping matrix $P_k$ is derived from the parameters of the camera motion and the planar surfaces.
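The per-pixel depth selection of Eqs. (9)-(11) can be illustrated with a minimal sketch. It assumes that the SSD scores for one pixel have already been computed for every plane, and that the planes are evenly spaced with $z_k = z_0 + k\,\delta z$. The function name `select_depth` and its arguments are hypothetical and chosen only for illustration; the fallback to the unrefined plane depth at the ends of the plane range, or where the SSD curve is flat, is likewise an assumption of this sketch rather than something stated in the paper.

```python
def select_depth(ssd_costs, z0, dz):
    """Return a refined depth for one pixel from its per-plane SSD scores.

    ssd_costs[k] is the SSD score C_k for plane k, whose depth is z0 + k * dz.
    (Hypothetical helper for illustration; not the authors' implementation.)
    """
    # Winner-take-all: the plane with the smallest SSD score, as in Eqs. (9)-(10).
    k = min(range(len(ssd_costs)), key=ssd_costs.__getitem__)
    z_k = z0 + k * dz

    # No neighbor on one side: quadratic interpolation is not possible.
    if k == 0 or k == len(ssd_costs) - 1:
        return z_k

    c_prev, c_min, c_next = ssd_costs[k - 1], ssd_costs[k], ssd_costs[k + 1]
    curvature = c_next + c_prev - 2.0 * c_min

    # A flat SSD curve has no well-defined minimum; keep the plane depth.
    if curvature <= 0.0:
        return z_k

    # Vertex of the parabola through the three samples around the minimum.
    return z_k + dz * (c_prev - c_next) / (2.0 * curvature)


if __name__ == "__main__":
    # SSD scores sampled from a parabola whose true minimum lies at depth 2.3.
    costs = [(z - 2.3) ** 2 for z in (0.0, 1.0, 2.0, 3.0, 4.0)]
    print(select_depth(costs, z0=0.0, dz=1.0))
```

Because three samples determine a parabola exactly, the refinement recovers the true minimum whenever the SSD curve is locally quadratic; for real imagery it is only an approximation.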
For an arbitrary point $X$ in some reference frame, its projection is expressed as $x = M(X - C)$, where $C$ is the position of the camera nodal point and $M$ is the projection matrix. Note that $C$ and
$M$ encapsulate the camera motion between the images, since they are represented in a common reference frame. Let $C_1$ and $M_1$ represent the higher camera, $C_2$ and $M_2$ represent the lower camera, and $N^T X + z_k = 0$ represent the set of planar surfaces. For any pixel in image 2 (i.e. the lower camera), its location must lie on a 3D ray:

$$X = s M_2^{-1} \begin{bmatrix} c_2 \\ r_2 \\ 1 \end{bmatrix} + C_2, \tag{12}$$

where $c_2$ and $r_2$ are the column and row location of the pixel and $s$ is a positive scale factor. If the pixel is from a point on the planar surface, then the following constraint must be satisfied:

$$s N^T M_2^{-1} \begin{bmatrix} c_2 \\ r_2 \\ 1 \end{bmatrix} + N^T C_2 + z_k = 0. \tag{13}$$

Therefore, the scale factor $s$ must be

$$s = -\frac{N^T C_2 + z_k}{N^T M_2^{-1} [c_2, r_2, 1]^T}. \tag{14}$$

We can then re-project the point onto the first image using Eq. (12) and (14):

$$\begin{bmatrix} x_1 \\ y_1 \\ z_1 \end{bmatrix} = M_1 (X - C_1) = P_k \begin{bmatrix} c_2 \\ r_2 \\ 1 \end{bmatrix}, \tag{15}$$

where the second equality holds up to a scale factor, which is immaterial in homogeneous coordinates, and $P_k$ is a $3 \times 3$ matrix specifying the projective warping:

$$P_k = M_1 (C_2 - C_1) N^T M_2^{-1} - (N^T C_2 + z_k) M_1 M_2^{-1}. \tag{16}$$

Note that the depth recovery is numerically unstable in the vicinity of the epipoles (at the center of the image for pure descending motion), since pixels near the epipoles have a small amount of parallax, even for large camera motions. Mathematically, the SSD curves in those areas are very flat and, thus, accurate depth recovery is difficult. These regions can be easily filtered, if desired, by imposing a minimum curvature threshold at the minima of the SSD curves.

Figure 2: Real descent sequence from a helicopter. (a) Image taken at higher altitude. (b) Image taken at lower altitude. (c) False-color estimated terrain map. (d) Rendered terrain map with image overlaid. (The rows have different height scales.)

2.3 Experiments

A set of real descent images was collected in the desert near Silver Lake, California using a helicopter. Figure 2 shows four frames from this sequence. The initial camera motions were estimated using control points on the ground. Several of the images contain significant lateral motions due to the difficulty in maintaining the x-y position of the helicopter during the data collection. Column (c) of Fig. 2
shows the false-color depth maps that were recovered from the sequence and column (d) shows the image draped over the visualized terrain. For the images in this data set, the terrain slopes downward from left to right, which can be observed in the rendered maps. Some of the interesting terrain features include the bushes visible in row and the channel in row. Note that the areas in which the helicopter shadow is present yield good results, despite the movement of the shadow. This can be attributed to the robust methods that we use for both motion estimation and template matching [7]. Overall, this data set indicates that we can robustly compute maps
that are useful for rover navigation over both small and large scales using real descent images.

3 Localization With Descent Images

A second aspect of the multi-resolution mapping problem that we have examined is the determination of the rover/camera position in the terrain using matches between surface and descent images [4]. This allows the construction of a map encompassing both sets of images.

For landing on Mars (or another planetary body), we have no ground control points. In this case, the bundle adjustment computation is a free network consisting of the exposure centers of the camera positions (both descent and surface images), measured image tie points, and the ground location of each of the tie points. We select the landing location as the origin of a local coordinate frame. Three constraints are applied to the bundle adjustment model: scale, azimuth, and zenith. These are supplied by a landmark location relative to the lander obtained by, for example, stereo vision. Given the landmark coordinates $(L_x, L_y, L_z)$, we constrain the scale $S$, azimuth $\alpha$, and zenith $\beta$ as follows:

$$S = \sqrt{L_x^2 + L_y^2 + L_z^2} \tag{17}$$

$$\alpha = \tan^{-1} \left( L_y / L_x \right) \tag{18}$$

$$\beta = \tan^{-1} \left( L_z \Big/ \sqrt{L_x^2 + L_y^2} \right) \tag{19}$$

If we let $A$ be the coefficient matrix after linearization, $L$ be the observation vector and $V$ be the correction vector, then the unknown vector $X$ (comprising the camera positions and ground coordinates) satisfies the observation equation:

$$V = AX - L, \tag{20}$$

and the three constraints above can be represented as:

$$HX = W. \tag{21}$$

With weight matrix $P$ on the measurements of the image points, we let $N = A^T P A$ and our least-squares solution becomes:

$$X = N^{-} \left( A^T P L + H^T \left( H N^{-} H^T \right)^{-1} \left( W - H N^{-} A^T P L \right) \right), \tag{22}$$

where $N^{-}$ is the generalized inverse of $N$.

Figure 3: Example tie points between rover and descent images. (a) Rover images. (b) Descent image.

For rover localization along a traverse, we would prefer an incremental method, rather than a batch method that may require more time than desired. Assuming that we have previously processed the first
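$m-1$ rover stations, we can decompose the observation equations into two parts:

$$v_{m-1} = A_{m-1} X_{m-1} - l_{m-1}, \tag{23}$$

and

$$v_m = A_m X_{m-1} + B_m Y_m - l_m. \tag{24}$$

Equation 23 represents the solution with all of the data until rover station $m-1$ and Equation 24 is an incremental update of the position using the data at station $m$, where $Y_m$ is the new unknown vector expressing the rover position at station $m$. In this case, the generalized inverse of $N_m$ becomes:

$$N_m^{-} = \begin{bmatrix} A_{m-1}^T P_{m-1} A_{m-1} + A_m^T P_m A_m & A_m^T P_m B_m \\ B_m^T P_m A_m & B_m^T P_m B_m \end{bmatrix}^{-} \tag{25}$$

$$\phantom{N_m^{-}} = \begin{bmatrix} K_m & G_m \\ G_m^T & H_m \end{bmatrix}. \tag{26}$$

The incremental solution for station $m$ is:

$$X_m = \hat{W}_m \left( X_{m-1} - F_m \left( l_m - A_m X_{m-1} \right) \right) + G_m B_m^T P_m l_m \tag{27}$$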
and

$$Y_m = -\left( B_m^T P_m B_m \right)^{-1} B_m^T P_m A_m \hat{W}_m \left( X_{m-1} - F_m \left( l_m - A_m X_{m-1} \right) \right) + H_m B_m^T P_m l_m, \tag{28}$$

where

$$\hat{W}_m = I + \hat{N}_m A_m^T P_m B_m H_m B_m^T P_m A_m, \tag{29}$$

$$F_m = \left( A_{m-1}^T P_{m-1} A_{m-1} \right)^{-} A_m^T \left( P_m^{-1} + A_m \left( A_{m-1}^T P_{m-1} A_{m-1} \right)^{-} A_m^T \right)^{-1}, \tag{30}$$

$$\hat{N}_m = \left( N_{m-1} + A_m^T P_m A_m \right)^{-}. \tag{31}$$

This method was tested using data from the Silver Lake test site. In this experiment, we used descent images and 4 pairs of rover stereo images taken at separate rover stations. Image features appearing in both the rover and descent images were selected as tie points (see Fig. 3). The results of the experiment were compared with ground-truth collected using GPS. In each case, we were able to obtain localization accuracy within m of the GPS estimate. For rover positions closer to the center of the descent imagery, the accuracy is much better, with errors below 5 cm for a position approximately 5 meters from the center.

4 Registering Descent Images

In order to determine the location of the landing site and provide context for the descent images, it is necessary to determine the location of the descent images within lower-resolution orbital images taken of the same location. This is a difficult problem for two reasons. First, the images are captured with different sensors that have different sensitivities to various wavelengths of light. Therefore, the same terrain location will yield different image intensities for the two sensors. In general, the relationship between the image intensities yielded by the same terrain location in the two images is highly non-linear.

The second problem is that the transformation between the camera positions (and, thus, the position of the image data) is complex. There are six degrees-of-freedom in the relative camera positions and this leads to a six degree-of-freedom transformation in the image space, if the terrain is approximated as planar. The transform is even more complex if the terrain is not approximated as planar. However, we shall use the planar approximation in this paper, since the distance of the terrain from the camera is large.

Figure 4: Entropy image example. (a) Orbital image of the Avawatz Mountains and Silurian Valley in California.
(b) Entropy image computed from (a).

A common technique that is used for this type of image registration is the maximization of mutual information between the images [6, 9]. This technique locates the relative position between the images at which the mutual statistical information content is maximized. Unfortunately, in experiments on real images, we have found that this method fails when the search space is large. We speculate that the reason for this failure is that mutual information does not make good use of the shape information that is present in the images. Another possible explanation is that smooth shading from the different illumination in the images causes the correct match to score poorly, since mutual information cannot handle this type of illumination change with a single reference image.

We use a different method, where each image is transformed into an entropy image (i.e. storing the entropy at each location in the original image). Registration is then performed using the entropy images. This method is more robust to changes in illumination and makes greater use of shape information in the image.

For a discrete random variable $A$, with marginal probability distribution $p_A(a)$, the entropy is defined as:

$$H(A) = -\sum_a p_A(a) \log p_A(a). \tag{32}$$

Note that $0 \log 0$ is taken to be zero, since

$$\lim_{x \to 0} x \log x = 0. \tag{33}$$

In order to compute the entropy image for both the template (the descent image) and the search image (the orbital image), we apply Eq. 32 to each square window of some particular size (5 5 pixels is typical) and replace the pixel value with the entropy score. Figure 4 shows an example of an entropy image created from an orbital image at our test site.
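The windowed entropy transform described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the image is taken to be a list of rows of integer intensities, the probability distribution in Eq. 32 is estimated from the histogram of raw intensities inside each window, the natural logarithm is used, and windows are simply clipped at the image border. The names `window_entropy`, `entropy_image`, and `half_width` are hypothetical.

```python
import math
from collections import Counter


def window_entropy(values):
    """Shannon entropy (natural log) of the empirical distribution of `values`."""
    total = len(values)
    probabilities = [count / total for count in Counter(values).values()]
    # Only intensities that actually occur contribute a term, which matches
    # the convention that 0 log 0 is taken to be zero.
    return sum(-p * math.log(p) for p in probabilities)


def entropy_image(image, half_width):
    """Replace every pixel by the entropy of the square window centered on it.

    The window has side 2 * half_width + 1 and is clipped at the image border
    (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - half_width), min(rows, r + half_width + 1))
                for cc in range(max(0, c - half_width), min(cols, c + half_width + 1))
            ]
            result[r][c] = window_entropy(window)
    return result


if __name__ == "__main__":
    # A vertical step edge: entropy is zero in flat regions and positive at the edge.
    step = [[0, 0, 1, 1] for _ in range(4)]
    for row in entropy_image(step, half_width=1):
        print(["%.3f" % value for value in row])
```

The example shows why the transform emphasizes shape: uniform regions map to zero regardless of their brightness, while windows that straddle an intensity boundary receive a positive score.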
Figure 5: Registration example. (a) Aerial image showing a detail of the image in Fig. 4. (b) Registered location of the aerial image with respect to the orbital image.

In order to locate the best position of the descent image in the orbital image, we combine the use of the fast Fourier transform (FFT) to perform correlation over translations with a search over the remaining four parameters of the search space. For each set of rotation, scaling, and warping parameters, correlation is performed efficiently in the frequency domain and the location with the highest normalized correlation can be located very quickly. At present, a brute-force search is used to search over the remaining parameters of the search space. We are currently investigating algorithms for efficiently searching these pose parameters.

Fig. 5 shows an example of the registration achieved with these techniques. In this example (where experiments with mutual information have failed to detect the correct position), our method performs well, finding a very close match between the descent image and the orbital image.

5 Summary

We have discussed new techniques for performing multi-resolution mapping and registration of image data from a variety of sources. For mapping landing sites on planetary bodies, we use images taken on the surface by landers and rovers, images taken during a lander's descent to the surface, and images captured from orbit.

Three techniques were described in this paper. First, a method for generating terrain maps from the descent images was described. This gives us multi-resolution information for navigation and planning, and provides a link between the surface images and the orbital images. Next, we described a method for determining the rover position on the surface using correspondences between surface imagery and descent imagery. In addition to being useful for navigation, these techniques allow the surface imagery to be accumulated into a map encompassing all of the data. Finally, we have described techniques for performing registration between descent images and orbital images.
This allows us to determine the location of the landing site and provides context for the descent images and, by extension, the surface images. The combination of these techniques yields an overall framework for registering and combining the terrain maps from all of the data sources.

Acknowledgments

The research described in this paper was carried out in part at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration.

References

[1] P. J. Burt, L. Wixson, and G. Salgian. Electronically directed "focal" stereo. In Proceedings of the International Conference on Computer Vision, pages 94-101, 1995.

[2] W. Förstner. A framework for low-level feature extraction. In Proceedings of the European Conference on Computer Vision, pages 383-394, 1994.

[3] P. Heckbert. Survey of texture mapping. IEEE Computer Graphics and Applications, 6(11):56-67, November 1986.

[4] R. Li, F. Ma, F. Xu, L. Matthies, C. Olson, and Y. Xiong. Large scale Mars mapping and rover localization using descent and rover imagery. In Proceedings of the 19th ISPRS Congress, 2000.

[5] H. C. Longuet-Higgins. A computer algorithm for reconstructing a scene from two projections. Nature, 293:133-135, September 1981.

[6] F. Maes, A. Collignon, D. Vandermeulen, G. Marchal, and P. Suetens. Multimodality image registration by maximization of mutual information. IEEE Transactions on Medical Imaging, 16(2):187-198, April 1997.

[7] C. F. Olson. Maximum-likelihood template matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, volume 2, pages 52-57, 2000.

[8] R. Szeliski and S. B. Kang. Recovering 3D shape and motion from image streams using non-linear least squares. Journal of Visual Communication and Image Representation, 5(1):10-28, March 1994.

[9] P. Viola and W. M. Wells. Alignment by maximization of mutual information. International Journal of Computer Vision, 24(2):137-154, 1997.