From: AAAI-82 Proceedings. Copyright 1982, AAAI (www.aaai.org). All rights reserved.

TRACKING KNOWN THREE-DIMENSIONAL OBJECTS*

Donald B. Gennery
Robotics and Teleoperator Group
Jet Propulsion Laboratory
Pasadena, California 91109

ABSTRACT

A method of visually tracking a known three-dimensional object is described. Predicted object position and orientation extrapolated from previous tracking data are used to find known features in one or more pictures. The measured image positions of the features are used to adjust the estimates of object position, orientation, velocity, and angular velocity in three dimensions. Filtering over time is included as an integral part of the adjustment, so that the filtering both smooths as appropriate to the measurements and allows stereo depth information to be obtained from multiple cameras taking pictures of a moving object at different times.

I INTRODUCTION

Previous work in visual tracking of moving objects has dealt mostly with two-dimensional scenes [1, 2, 3], with labelled objects [4], or with restricted domains in which only partial spatial information is extracted [5]. This paper describes a method of tracking a known solid object for which an accurate object model is available, determining its three-dimensional position and orientation rapidly as it moves, by using natural features on the object. Only the portion of the tracking problem concerning locking onto an object and tracking it when given initial approximate data is discussed here. The acquisition portion of the problem is currently being worked on and will be described in a later paper. Since the tracking proper portion discussed here has approximate information available from the acquisition data or previous tracking data, it can quickly find the expected features in the pictures, and it can be optimized to use these features to produce high accuracy, good coasting through times of poor data, and optimum combining of information obtained at different times. (An earlier, similar method lacking many of the features described here was previously reported [6].)

The current method uses a general object model consisting of planar surfaces.
The features found in the pictures are the brightness edges formed by the intersection of the planar faces of the object, caused by differences in illumination on the different faces. By comparing the positions of the actual features in the pictures to their predicted positions, discrepancies are generated that are used in a least-squares adjustment (based on a linearization using partial derivatives) to refine the current estimates of object position and orientation. Filtering over time is included in the adjustment to further reduce error by smoothing (including different amounts of smoothing automatically in different spatial directions as required by the accuracy of the data), to obtain velocities for prediction, and to enable information obtained at different times to be combined optimally. Thus stereo depth information is obtained when more than one camera is used, even though individual features are not tracked or matched between pictures, and even if the different cameras take pictures at different times. When only one camera is used, the approximate distance to the object is still determined, because of its known size.

* The research described in this paper was carried out by the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration.

In order to avoid the singularity in the Euler angle representation (and for other reasons mentioned later), the current orientation of the object is represented by quaternions, and the incremental adjustment to orientation is represented by an infinitesimal rotation vector. (Corben and Stehle [7] provide a discussion of quaternions, and Goldstein [8] provides a discussion of infinitesimal rotation vectors.)
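The combination of a quaternion for the current orientation with a small rotation vector for its correction can be illustrated with a minimal sketch. The function names, the (w, x, y, z) component order, and the choice to apply the rotation vector in the reference frame (pre-multiplication) are assumptions made for illustration; the paper does not state these conventions.

```python
import math

def quat_multiply(q, r):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return (w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
            w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
            w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
            w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2)

def quat_from_rotation_vector(rv):
    """Quaternion for a rotation of |rv| radians about the axis rv / |rv|."""
    angle = math.sqrt(sum(c * c for c in rv))
    if angle < 1e-12:
        # First-order form, adequate for an infinitesimal rotation.
        return (1.0, 0.5 * rv[0], 0.5 * rv[1], 0.5 * rv[2])
    s = math.sin(0.5 * angle) / angle
    return (math.cos(0.5 * angle), s * rv[0], s * rv[1], s * rv[2])

def normalize(q):
    """Rescale to unit length so numerical error does not accumulate."""
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def apply_rotation_vector(q, rv):
    """Correct orientation q by rotation vector rv (assumed convention:
    rv is expressed in the reference frame, so it pre-multiplies q)."""
    return normalize(quat_multiply(quat_from_rotation_vector(rv), q))

# Example: two successive quarter turns about the z axis.
q = (1.0, 0.0, 0.0, 0.0)
for _ in range(2):
    q = apply_rotation_vector(q, (0.0, 0.0, math.pi / 2))
print(q)
```

Under the same assumptions, the helper could also serve for extrapolating orientation by passing the angular velocity vector multiplied by the elapsed time as the rotation vector. No representation used here has a singular orientation, which is the property the text cites as the reason for avoiding Euler angles.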
The tracking program works in a loop with the following major steps: prediction of the object position and orientation for the time at which a picture is taken by extrapolating from the previously adjusted data (or from acquisition data when starting); detection of features by projecting into the picture to find the actual features and to measure their image positions relative to the predictions; and the use of the resulting data to adjust the position, orientation, and their time derivatives so that the best estimates for the time of the picture are obtained. These steps will be described briefly in the following sections. A more detailed description will appear in a paper published elsewhere.

II PREDICTION

The prediction of position and orientation is based upon the assumption of random acceleration
and angular acceleration (that is, a constant power spectrum to frequencies considerably higher than the rate of picture taking). Since random acceleration implies constant expected velocity, the predicted position itself is obtained simply by adding to the position estimate from the previous adjustment the product of the previous adjusted velocity times the elapsed time since the previous picture, for each of the three dimensions. Similarly, the predicted orientation is obtained by rotating the previous adjusted orientation as if the previous adjusted angular velocity vector applied constantly over the elapsed time interval. (This orientation extrapolation is particularly simple when quaternions are used.) The predicted velocity and angular velocity are simply equal to the previous adjusted values. However, these predicted values must have appropriate weight in the adjustment, and, since the weight matrix should be the inverse of the covariance matrix (see, for example, Mikhail [9]), the computation of the covariance matrix of the predicted data will now be discussed.

The covariance matrix of the previous adjusted data is denoted by $S$. This is a 12-by-12 matrix, since there are three components of position, three components of incremental rotation, three components of velocity, and three components of angular velocity (assumed to be arranged in that order in $S$). To a first approximation, the covariance matrix $\tilde{S}$ of the predicted data can be obtained by adding to $S$ terms to represent the additional uncertainty caused by the extrapolation. These must include both of the following: terms to increase the uncertainty in position and orientation caused by uncertainty in the velocity and angular velocity that were used to do the extrapolation, and terms to increase the uncertainty in velocity and angular velocity caused by the random acceleration and angular acceleration occurring over the extrapolation interval. The former effect can be produced by using the following 12-by-12 transformation matrix:

$$
A = \begin{bmatrix} I & 0 & \tau I & 0 \\ 0 & I & 0 & \tau I \\ 0 & 0 & I & 0 \\ 0 & 0 & 0 & I \end{bmatrix}
$$

where $I$ is the 3-by-3 identity matrix and $\tau$ is the elapsed time interval of the extrapolation. Then the covariance matrix can be transformed by this matrix, and additional terms can be added for the latter effect, as follows:

$$
\tilde{S} = A S A^{T} + \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & a \tau I & 0 \\ 0 & 0 & 0 & \alpha \tau I \end{bmatrix}
$$

where $a$ and $\alpha$ are the assumed values of the power spectra of acceleration and angular acceleration, respectively, and the superscript $T$ denotes the matrix transpose. (Mikhail [9] provides the necessary background information on matrix algebra.) The larger the values of $a$ and $\alpha$, the larger will be the uncertainty in the predicted values as indicated by $\tilde{S}$, and thus the less smoothing over time will be produced in the adjustment.

In practice, the above matrix multiplications are multiplied out so that the actual computations are expressed in terms of 3-by-3 matrices. This is computationally faster since $A$ is so sparse. However, for greater accuracy two additional effects are included in the implemented program. First, the effect on orientation of uncertainty in the previous orientation and angular velocity will be influenced by the rotation that has occurred during the time $\tau$. This causes some modification of the $A$ matrix. Second, additional terms involving $a$ and $\alpha$ are added to $\tilde{S}$ to reflect the influence that the random acceleration during the just elapsed time interval $\tau$ has on position and orientation. These refinements will be described in another paper.

III DETECTION OF FEATURES

Once the predicted object position and orientation are available for a picture, the vertices in the object model that are predicted to be visible (with a margin of safety) are projected into the picture by using the known camera model [10]. The lines in the picture that correspond to edges of the object are computed by connecting the appropriate projected vertices. Analytical partial derivatives of the projected quantities with respect to the object position vector and object incremental infinitesimal rotation vector are also computed.

Brightness edges are searched for near the positions of the predicted lines. The brightness edge elements are detected by a modified Sobel operator (including thresholding and thinning), which we have available both in software form and in special hardware that operates at the video rate [11]. The program only looks for edge elements every three pixels along the line, since the Sobel operator is three pixels wide. For each of these positions it searches approximately perpendicularly to the line. Currently it accepts the nearest edge element to the predicted line, if it is within five pixels. However, a more elaborate method has been devised. This new method varies the extent of the search according to the accuracy of the predicted data, accepts all edge elements within the search width, and gives the edge elements variable weight according to their agreement with the predicted line and their intensity. This method will be described in a later paper.

In principle, the position of each detected edge element could be used directly in the adjustment described in the next section. The observed quantity $e$ would be the perpendicular distance from the predicted line to the detected edge element, the 1-by-6 partial derivative matrix $B$ would be the partial derivatives of $-e$ with respect to the three components of object position and three components of incremental object rotation, and the weight $W$ of the observation would be the reciprocal of the square of its standard deviation (accuracy). (Currently, this standard deviation is a given quantity and is assumed to be the same for all edge elements.) However, for computational efficiency the program uses a mathematically equivalent two-step process. First, a corrected line is computed by a least-squares fit to the perpendicular discrepancies from the predicted line. In effect, the quantities obtained are the perpendicular corrections to the predicted line at its two end vertices, which form the 2-by-1 matrix $E$, and the corresponding 2-by-2 weight matrix $W$. $B$ is then the 2-by-6 partial derivative matrix of $-E$ with respect to the object position and incremental orientation. Second, these quantities for each predicted line are used in the adjustment described in the next section.

IV ADJUSTMENT

Now the nature of the adjustment to position and orientation will be discussed. If no filtering were desired, a weighted least squares solution could be done, ignoring the predicted values except as initial approximations to be corrected. The standard way of doing this [9, 12] is as follows:

$$
N = \sum_i B_i^{T} W_i B_i
$$

$$
C = \sum_i B_i^{T} W_i E_i
$$

$$
P = \tilde{P} + N^{-1} C
$$

where $B_i$ is the matrix of partial derivatives of the $i$th set of observed quantities with respect to the parameters being adjusted, $W_i$ is the weight matrix of the $i$th set of observed quantities, $E_i$ is a vector made up of the $i$th set of observed quantities, $P$ is the vector of parameters being adjusted, and $\tilde{P}$ is the initial approximation to $P$. The covariance matrix of $P$, which indicates the accuracy of the adjustment, is then $N^{-1}$. For the case at hand, $P$ is 6-by-1 and is composed of the components of position and incremental orientation, $N$ is 6-by-6, and $C$ is 6-by-1. The meanings of $E_i$, $W_i$, and $B_i$ for this case were described in the previous section.

The velocity and angular velocity are included in the adjustment by considering twelve adjusted parameters consisting of the six-vectors $P$ and $V$, where $V$ is composed of the three components of velocity and the three components of angular velocity. The measurements which produce $N$ and $C$ above contribute no information directly to $V$. However, the predicted values $\tilde{P}$ and $\tilde{V}$ can be considered to be additional measurements directly on $P$ and $V$ with covariance matrix $\tilde{S}$, and thus weight matrix $\tilde{S}^{-1}$. (Giving the predicted values weight in the solution produces the filtering action, similar to a Kalman filter, because of the memory of previous measurements contained in the predicted information.) Therefore, the adjustment including the information contained in the predicted values in principle could be obtained as follows:

$$
S = \left( \begin{bmatrix} N & 0 \\ 0 & 0 \end{bmatrix} + \tilde{S}^{-1} \right)^{-1}
$$

$$
\begin{bmatrix} P \\ V \end{bmatrix} = \begin{bmatrix} \tilde{P} \\ \tilde{V} \end{bmatrix} + S \begin{bmatrix} C \\ 0 \end{bmatrix}
$$

However, using the above equation is inefficient and may present numerical problems, since the two matrices to be inverted are 12-by-12 and may be nearly singular. If $\tilde{S}$ is partitioned into 6-by-6 matrices as follows,

$$
\tilde{S} = \begin{bmatrix} \tilde{S}_{PP} & \tilde{S}_{PV} \\ \tilde{S}_{PV}^{T} & \tilde{S}_{VV} \end{bmatrix}
$$

the following mathematically equivalent form in terms of 6-by-6 matrices and 6-vectors can be produced by means of some matrix manipulation:

$$
S_{PP} = (I + \tilde{S}_{PP} N)^{-1} \tilde{S}_{PP}
$$

$$
S_{PV} = (I + \tilde{S}_{PP} N)^{-1} \tilde{S}_{PV}
$$

$$
S_{VV} = \tilde{S}_{VV} - \tilde{S}_{PV}^{T} N (I + \tilde{S}_{PP} N)^{-1} \tilde{S}_{PV}
$$

$$
P = \tilde{P} + S_{PP} C
$$

$$
V = \tilde{V} + S_{PV}^{T} C
$$

Not only is this form more efficient computationally, but the matrix to be inverted, $(I + \tilde{S}_{PP} N)$, is guaranteed to be nonsingular, because both $\tilde{S}$ and $N$ are non-negative definite.

The first three elements of $P$ from the above form the new adjusted position vector of the object. The last three elements form an incremental rotation vector used to correct the object orientation.
This could be used directly to update the rotation matrix (as explained by Goldstein [8]), but, since the primary representation of orientation in the implemented tracker is in terms of quaternions, it is used instead to update the quaternion that represents orientation, and the rotation matrix is computed from that. This method also makes convenient the normalization to prevent accumulation of numerical error. (The relationship between quaternions and rotations is described by Corben and Stehle [7].)

The covariance matrix $S$ of the adjusted data is formed by assembling $S_{PP}$, $S_{PV}$, $S_{PV}^{T}$, and $S_{VV}$ into a 12-by-12 matrix, similarly to $\tilde{S}$.
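As a rough illustration of the prediction and adjustment equations of Sections II and IV, the following sketch applies them to a single axis, so that every 3-by-3 or 6-by-6 block becomes a scalar (one position component and one velocity component). The function names, the one-axis simplification, and the example numbers are assumptions made for illustration. This is not the implemented tracker, and the refinements mentioned in Section II are omitted.

```python
def predict_covariance(S, tau, a):
    """One-axis analogue of S~ = A S A^T + Q, with A = [[1, tau], [0, 1]].

    S is ((s_pp, s_pv), (s_pv, s_vv)). The assumed power spectrum of
    acceleration, a, adds a * tau to the velocity variance.
    """
    (s_pp, s_pv), (_, s_vv) = S
    t_pp = s_pp + 2.0 * tau * s_pv + tau * tau * s_vv
    t_pv = s_pv + tau * s_vv
    t_vv = s_vv + a * tau
    return ((t_pp, t_pv), (t_pv, t_vv))

def accumulate(observations):
    """Form N = sum(B W B) and C = sum(B W E) from scalar (B, W, E) triples."""
    N = sum(B * W * B for B, W, E in observations)
    C = sum(B * W * E for B, W, E in observations)
    return N, C

def adjust(p_pred, v_pred, S_pred, N, C):
    """Blockwise adjustment, with scalars standing in for 6-by-6 blocks."""
    (t_pp, t_pv), (_, t_vv) = S_pred
    d = 1.0 + t_pp * N            # scalar counterpart of (I + S~_PP N)
    s_pp = t_pp / d
    s_pv = t_pv / d
    s_vv = t_vv - t_pv * N * t_pv / d
    p = p_pred + s_pp * C         # P = P~ + S_PP C
    v = v_pred + s_pv * C         # V = V~ + S_PV^T C
    return p, v, ((s_pp, s_pv), (s_pv, s_vv))

# Example with made-up numbers (not taken from the paper).
S = ((4.0, 1.0), (1.0, 2.0))      # previous adjusted covariance
p, v = 10.0, 3.0                  # previous adjusted position and velocity
tau, a = 0.5, 1.0                 # elapsed time and acceleration parameter

p_pred = p + v * tau              # constant expected velocity
S_pred = predict_covariance(S, tau, a)
# Two direct observations of position (B = 1), unit weight, with
# discrepancies 0.4 and 0.6 relative to the prediction.
N, C = accumulate([(1.0, 1.0, 0.4), (1.0, 1.0, 0.6)])
p_new, v_new, S_new = adjust(p_pred, v, S_pred, N, C)
print(p_new, v_new, S_new)
```

In this scalar form the quantity `d` is at least 1 whenever the predicted position variance and `N` are both non-negative, which mirrors the nonsingularity argument given for the matrix to be inverted. Note also that the velocity estimate changes even though the observations are of position only, because the predicted covariance couples the two.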
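[Figures 1-6: Figure 1, digitized picture; Figures 2-4, tracker display for successive pictures; Figures 5 and 6, pictures from the right and left cameras with an obscuring object, five pictures apart.]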
V RESULTS

Figures 1, 2, 3, and 4 show the tracker in action. The object being tracked (a hexagonal prism) is 203 mm tall and is moving upwards at about 16 mm/sec. Pictures from two cameras were taken alternately. The values used for the acceleration parameters were $a = 1\ \mathrm{mm}^2/\mathrm{sec}^3$ and $\alpha = 0.0001\ \mathrm{radian}^2/\mathrm{sec}^3$. The assumed standard deviation of the edge measurements was one pixel. The software version of the edge detector was used. The program, which runs on a General Automation SPC-16/85 computer, was able to process each picture in this example in 1.6 seconds, so that the complete loop through both cameras required 3.2 seconds. (When the hardware edge detector is used, the time per picture in a case such as this is only 0.5 second.)

Figure 1 shows the raw digitized image corresponding to Figure 2. For successive pictures from the left, right, and left cameras, respectively, Figures 2, 3, and 4 show the following information. In a window that the program puts around the predicted object for applying the software edge detector, the raw digitized picture has been replaced by the detected brightness edges (showing as faint lines). (With the hardware edge detector the entire picture would be so replaced.) Superimposed on this are the predicted lines corresponding to edges of the object (showing as brighter lines). The bright dots are the edge elements which were used in the adjustment. (These may be somewhat obscured in the figures when they lie directly on the predicted lines.)

The program is able to tolerate a moderate amount of missing and spurious edges. This is because it looks for edges only near their expected positions, because the typical abundance of edges produces considerable overdetermination in the adjustment, and because of the smoothing produced by the filtering. Figures 5 and 6 (similar to Figures 2, 3, and 4) show an example of an obscuring object passing in front of the tracked object without causing loss of track. Figure 5 is from the right camera, and Figure 6 is from the left camera five pictures later (so that there are two pictures from the left camera and two from the right camera between these in time that are not shown).

ACKNOWLEDGMENTS

The programming of the tracker was done primarily by Eric Saund, with portions by Doug Varney and Bob Cunningham. Bob Cunningham assisted in conducting the tracking experiments.

REFERENCES

[1] H.-H. Nagel, Analysis Techniques for Image Sequences, Fourth International Joint Conference on Pattern Recognition, Tokyo, November 1978, pp. 186-211.

[2] W. N. Martin and J. K. Aggarwal, Dynamic Scene Analysis, Computer Graphics and Image Processing 7 (1978), pp. 356-374.

[3] A. L. Gilbert, M. K. Giles, G. M. Flachs, R. B. Rogers, and Y. H. U, A Real-Time Video Tracking System, IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-2 (1980), pp. 47-56.

[4] H. F. L. Pinkney, Theory and Development of an On-Line 30 Hz Video Photogrammetry System for Real-Time 3-Dimensional Control, Proceedings of the ISP Symposium on Photogrammetry for Industry, Stockholm, August 1978.

[5] J. W. Roach and J. K. Aggarwal, Computer Tracking of Objects Moving in Space, IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-1 (1979), pp. 127-135.

[6] E. Saund, D. B. Gennery, and R. T. Cunningham, Visual Tracking in Stereo, Joint Automatic Control Conference, sponsored by ASME, University of Virginia, June 1981.

[7] H. C. Corben and P. Stehle, Classical Mechanics (Second Edition), Wiley, 1960.

[8] H. Goldstein, Classical Mechanics (Second Edition), Addison-Wesley, 1980.

[9] E. M. Mikhail (with contributions by F. Ackermann), Observations and Least Squares, Harper and Row, 1976.

[10] Y. Yakimovsky and R. T. Cunningham, A System for Extracting Three-Dimensional Measurements from a Stereo Pair of TV Cameras, Computer Graphics and Image Processing 7 (1978), pp. 195-210.

[11] R. Eskenazi and J. M. Wilf, Low-Level Processing for Real-Time Image Analysis, Jet Propulsion Laboratory Report 79-79.

[12] D. B. Gennery, Modelling the Environment of an Exploring Vehicle by Means of Stereo Vision, AIM-339, STAN-CS-80-805, Computer Science Dept., Stanford University, 1980.