High-Accuracy 3D Image-Based Registration of Endoscopic Video to C-Arm Cone-Beam CT for Image-Guided Skull Base Surgery

Daniel J. Mirota,a Ali Uneri,a Sebastian Schafer,b Sajendra Nithiananthan,b Douglas D. Reh,c Gary L. Gallia,d Russell H. Taylor,a Gregory D. Hager,a and Jeffrey H. Siewerdsen*a,b

a Department of Computer Science, Johns Hopkins University, Baltimore, MD
b Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD
c Department of Otolaryngology-Head and Neck Surgery, Johns Hopkins Hospital, Baltimore, MD
d Department of Neurosurgery and Oncology, Johns Hopkins Hospital, Baltimore, MD

* Corresponding author. Email: jeff.siewerdsen@jhu.edu

SPIE Medical Imaging 2011: Proc. Visualization, Image-Guided Procedures, and Display, Volume 7964 (in press)

ABSTRACT

Registration of endoscopic video to preoperative CT facilitates high-precision surgery of the head, neck, and skull base. Conventional video-CT registration is limited by the accuracy of the tracker and does not use the underlying video or CT image data. A new image-based video registration method has been developed to overcome the limitations of conventional tracker-based registration. This method adds to a navigation system based on intraoperative C-arm cone-beam CT (CBCT) that reflects anatomical change, in turn providing high-accuracy registration of video to the surgical scene. The resulting registration enables visualization of the CBCT and planning data within the endoscopic video. The system incorporates a mobile C-arm for high-performance CBCT, integrated with an optical tracking system, video endoscopy, deformable registration of preoperative CT with intraoperative CBCT, and 3D visualization. As in the tracker-based approach, in the image-based video-CBCT registration the endoscope is localized using an optical tracking system, which provides a quick initialization followed by a direct 3D image-based registration of the video to the CBCT. In this way, the system achieves video-CBCT registration that is both fast and accurate. Application in skull-base surgery demonstrates overlay of critical structures (e.g., carotid arteries and optic nerves) and surgical target volumes with sub-mm accuracy. Phantom and cadaver experiments show consistent improvement in target registration error (TRE) in video overlay over conventional tracker-based registration, e.g., 0.92 mm versus 1.82 mm for image-based and tracker-based registration, respectively. The proposed method represents a two-fold advance: first, through registration of video to up-to-date intraoperative CBCT (overcoming limitations associated with navigation with respect to preoperative CT); and second, through direct 3D image-based video-CBCT registration. Together, these provide more confident visualization of target and normal tissues within up-to-date images and improved targeting precision.

Keywords: image registration, 3D visualization, image-guided surgery, cone-beam CT, intra-operative imaging, skull-base surgery, surgical navigation, video endoscopy, video fusion

1. INTRODUCTION

Complex head and neck procedures, tumor surgery in the skull base in particular, require high-accuracy navigation. Transnasal surgery to the skull base uses the endonasal sinus passage to gain access to lesions of the pituitary and the surrounding ventral region.1 Accuracy is critical when operating in this region due to the proximity of critical structures, including the optic nerves and carotid arteries, violation of which could result in blindness or death.
To assist in the preservation of critical anatomy while facilitating complete target resection, surgical navigation systems involving tracking and visualization of a pointer tool have been shown to support complex cases.2 Recent work3 extends such tracking capability to the endoscope, allowing registration of the endoscopic video scene to preoperative CT. However, such tracker-based approaches are limited by inaccuracies in the tracking system, with typical registration errors of 1-3 mm.4 In addition, conventional navigation based on preoperative CT is subject to gross inaccuracies arising from anatomical change (i.e., deformation and tissue excision) occurring during the procedure. As noted by Schlosser et al.,5 of the four major types of skull-base lesions (viz., neoplasms, orbital pathology, skull base defects, and inflammatory disease), conventional navigation systems are appropriate only for bony neoplasms, and an inability to account for

intraoperative change limits applicability in the treatment of more complex disease. The work reported below attempts to address the conventional limitations of anatomical change and tracker accuracy: first, by resolving the limitations of conventional tracker-based video registration through direct 3D image-based registration; and second, through registration of video to high-quality intraoperative C-arm cone-beam CT (CBCT) that properly reflects anatomical change. By incorporating CBCT and direct 3D image-based registration with video overlay, it is possible not only to improve surgical performance but also to lower the learning curve for skull-base surgery and to provide a tool for surgical training via enhanced visualization.

Conventional tracker-based registration3,6,7 is performed by attaching a rigid body to an endoscope and using an optical or electromagnetic tracking system to measure the pose (rotation and translation) of the endoscope. With the pose of the endoscope known, the intrinsic and extrinsic camera parameters are applied along with the CT-to-tracker (or CBCT-to-tracker) registration. The camera is then registered to the CT (or CBCT) image data, which can be fused in alpha-blended overlay on the video scene. Such approaches have demonstrated the potential value of augmenting endoscopic video with CT, CBCT, and/or planning data, but are fundamentally limited by the accuracy of registration provided by the tracking system.

The approach described below incorporates image-based video registration8 to directly register the video image (viz., a 3D video scene computed by structure-from-motion9) to the CT or CBCT image data. The image-based method is further integrated with a navigation system that combines intraoperative CBCT with real-time tool tracking, video endoscopy, preoperative data, video reconstruction, registration software, and 3D visualization software.10,11 The intraoperative CBCT is provided by a prototype mobile C-arm developed in collaboration with Siemens Healthcare (Erlangen, Germany) to provide high-quality intraoperative images with sub-millimeter spatial resolution and soft-tissue visibility.12,13 A newly developed open-source software platform enables image-based video registration to be integrated seamlessly with surgical navigation.

2. METHODS

2.1 System Setup

Surgical tools and an ENT video endoscope with infrared retro-reflective markers (Karl Storz, Tuttlingen, Germany) [E, Figure 1] were tracked using an optical tracking system (Polaris Vicra or Spectra, Northern Digital Inc., Waterloo, ON) [N, Figure 1]. The endoscopic video was captured using a video capture card (Hauppauge, Hauppauge, NY) and combined with other navigation subsystems using a software platform linking open-source libraries for surgical navigation (CISST, Johns Hopkins University)14 and 3D visualization (3D Slicer, Brigham and Women's Hospital, Boston, MA)15 [CBCT, Figure 1]. Components of the system are illustrated in Figure 1.

Figure 1: The bench-top system (left) and the relationship of the frame transformations (right). The labels refer to: N (the tracker), CBCT (the CBCT data), R (the reference marker frame), X (the pointer frame), E (the endoscope frame), O (the optical center of the endoscope), and P (the patient frame).
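To make the frame relationships of Figure 1 concrete, the following minimal NumPy sketch (not part of the original system; all numeric values are hypothetical) composes rigid transformations as 4 x 4 homogeneous matrices, in the manner of the tracker-based registration chain formalized in Eq. (1) below.

```python
# Minimal sketch (not the authors' code): composing the Figure 1 frame
# transformations as 4x4 homogeneous matrices. All values are hypothetical.
import numpy as np

def make_transform(R, t):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical placeholder poses (identity rotations, arbitrary offsets in mm):
T_cbct_ref     = make_transform(np.eye(3), [10.0, 0.0, 0.0])    # reference -> CBCT
T_ref_tracker  = make_transform(np.eye(3), [0.0, 5.0, 0.0])     # tracker -> reference
T_tracker_endo = make_transform(np.eye(3), [0.0, 0.0, 200.0])   # endoscope -> tracker
T_endo_cam     = make_transform(np.eye(3), [0.0, 0.0, -15.0])   # camera -> endoscope (hand-eye)

# Chaining all four links gives the camera pose in CBCT coordinates; every
# link contributes its own error, which motivates the error analysis of Eq. (1).
T_cbct_cam = T_cbct_ref @ T_ref_tracker @ T_tracker_endo @ T_endo_cam
print(T_cbct_cam[:3, 3])  # camera optical center expressed in the CBCT frame
```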
2.2 Camera Calibration

Camera calibration is required for both the conventional tracker-based registration and the proposed image-based registration. A passive arm was used to hold the endoscope (to minimize tracker latency effects) in view of a checkerboard calibration pattern (2 x 2 mm^2 squares). The camera was calibrated using the German Aerospace Center (DLR) camera calibration toolbox.16 The DLR camera calibration toolbox computed the intrinsic camera parameters (focal length, principal point, and radial distortion), the extrinsic camera parameters (the pose of the camera), and the hand-eye calibration (the transformation from the optically tracked rigid body to the camera center).
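For readers wishing to reproduce the intrinsic calibration step, the following sketch uses OpenCV's checkerboard calibration as a stand-in for the DLR toolbox used in the paper (the DLR toolbox additionally estimates the hand-eye calibration, which is omitted here); the image folder and pattern size are assumptions.

```python
# Sketch of intrinsic calibration from checkerboard images using OpenCV,
# analogous in spirit to the DLR toolbox used in the paper.
import glob
import cv2
import numpy as np

PATTERN = (9, 6)   # inner corners per row/column (assumed pattern size)
SQUARE_MM = 2.0    # 2x2 mm^2 squares, as in the paper

# 3D corner coordinates of the planar target in its own frame (z = 0).
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_MM

obj_pts, img_pts = [], []
for path in glob.glob("calib_images/*.png"):   # hypothetical image folder
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if found:
        # Refine corner locations to sub-pixel accuracy.
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_pts.append(objp)
        img_pts.append(corners)

# K holds the focal length and principal point; dist holds the distortion
# coefficients; rvecs/tvecs are the extrinsic pose for each view.
# (Assumes at least one checkerboard was detected.)
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_pts, img_pts, gray.shape[::-1], None, None)
print("RMS reprojection error (px):", rms)
```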

2.3 Tracker-Based Registration

Following camera calibration, the tracker-based registration used the tracker to localize the endoscope, and the hand-eye calibration was applied to resolve the camera in tracker coordinates. The registration of patient to tracker was then applied to render the camera in CT (or CBCT) coordinates, as shown in Eq. (1):

$T_{CT \leftarrow camera} = T_{CT \leftarrow reference} \, T_{reference \leftarrow tracker} \, T_{tracker \leftarrow endoscope} \, T_{endoscope \leftarrow camera}$  (1)

where each transformation is illustrated with respect to the experimental setup in Figure 1. The limitations in accuracy associated with tracker-based registration become evident in considering the number of transformations required in Eq. (1). Each non-tracker transformation is assumed to be fixed and rigid, but such is not always the case even with rigid endoscopes, and the errors in each transformation accumulate in an overall registration error that is typically limited to ~1-3 mm under best-case (laboratory) conditions. In particular, $T_{CT \leftarrow reference}$ is subject to errors arising from deformation during surgery, while $T_{reference \leftarrow tracker}$ and $T_{tracker \leftarrow endoscope}$ are subject to perturbations of the reference marker and other aging of the preoperative tracker registration over the course of the procedure.

2.4 Image-Based Registration

Instead of relying on the aforementioned tracker-based transformations, image-based registration builds a 3D reconstruction from tracked 2D image features and registers the reconstruction directly to a 3D isosurface of the air-tissue boundary in CT (or CBCT).17 2D image features selected by the Scale-Invariant Feature Transform (SIFT)18 are tracked in a pair of images. From the motion field created by the tracked 2D features over the pair of images, the 3D location of each tracked feature pair is triangulated. This significantly simplifies the transformations of Eq. (1) to a single image-based transform:

$T_{CT \leftarrow camera} = T_{CT \leftarrow video} \, T_{video \leftarrow camera}$  (2a)

or

$T_{CBCT \leftarrow camera} = T_{CBCT \leftarrow video} \, T_{video \leftarrow camera}$  (2b)

In particular, $T_{CBCT \leftarrow video}$ is robust to inaccuracies associated with preoperative images alone (i.e., $T_{CT \leftarrow video}$), since the CBCT image accurately accounts for intraoperative change. Similarly, $T_{video \leftarrow camera}$ uses up-to-date video imagery of the patient. While this approach has the potential to improve registration accuracy essentially to the spatial resolution of the video and CT (or CBCT) images (each better than 1 mm), the image-based approach is more computationally intense and requires a robust initialization (within ~10 mm of the camera location) to avoid local minima.

2.5 Hybrid Registration

We have developed a hybrid registration process that takes advantage of both the tracker-based registration (for fast, robust pose initialization) and the image-based registration (for precise refinement based on 3D image features). As illustrated in Figure 2, the tracker-based registration streams continuously and updates the registration in real time (but with coarse accuracy), while the image-based registration runs in the background (at a slower rate, due to computational complexity) and updates the registration to a finer level of precision as it becomes available (e.g., when the endoscope is moving slowly or is at rest).

Figure 2: System data flow diagram.
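As a rough illustration of the two-view reconstruction step described in Section 2.4 (SIFT feature matching followed by triangulation), the following OpenCV sketch is offered. It assumes the intrinsic matrix K from Section 2.2, recovers the relative pose from the essential matrix (translation up to scale), and omits the subsequent registration of the resulting point cloud to the CBCT isosurface; the authors' actual pipeline may differ in detail.

```python
# Sketch (not the authors' code) of two-view feature reconstruction:
# SIFT matching, relative pose from the essential matrix, triangulation.
import cv2
import numpy as np

def reconstruct_pair(img1, img2, K, ratio=0.75):
    """Return Nx3 triangulated points plus the relative pose (R, t)."""
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img1, None)
    k2, d2 = sift.detectAndCompute(img2, None)

    # Lowe ratio test keeps only distinctive correspondences.
    good = [m for m, n in cv2.BFMatcher().knnMatch(d1, d2, k=2)
            if m.distance < ratio * n.distance]
    pts1 = np.float32([k1[m.queryIdx].pt for m in good])
    pts2 = np.float32([k2[m.trainIdx].pt for m in good])

    # Relative camera motion from the essential matrix; RANSAC rejects
    # outlier matches, and t is recovered only up to scale.
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
    _, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

    # Linear triangulation of the inlier feature pairs into 3D points.
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    inl = mask.ravel().astype(bool)
    X = cv2.triangulatePoints(P1, P2, pts1[inl].T, pts2[inl].T)
    return (X[:3] / X[3]).T, R, t
```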

2.6 Phantom and Cadaver Studies

Phantom and cadaver studies were conducted to measure the accuracy of image-based video-CBCT registration and to gather expert evaluation of its integration and utility in CBCT-guided skull-base surgery. The phantom study included two anthropomorphic phantoms. The first was a simple, rigid head phantom for endoscopic sinus simulation, shown in Figure 1 and detailed in Reference 19. The phantom was printed with a 3D rapid-prototype printer based on segmentations of bone and sinus structures from a cadaver CT. It is composed of homogeneous plastic, presenting a simple, architecturally realistic simulation of the anatomy of the sinuses, skull base, nasopharynx, etc. Because the phantom is simply plastic (in air), CBCT images exhibit very low noise and ideal segmentation of surfaces.

The second phantom (referred to as the anthropomorphic phantom, or simply phantom #2) provided more realistic anatomy, imaging conditions, and CBCT image quality, because it more closely approximated a human head. Phantom #2 incorporated a natural skull within tissue-equivalent plastic (Rando material) while maintaining open space within the nasal sinuses, nasopharynx, oral cavity, oropharynx, etc. The phantom therefore provided stronger attenuation and realistic artifacts in CBCT of the head. As a result, the isosurface segmented in CBCT was not as smooth or free of artifact as in the simple plastic phantom. The second phantom therefore demonstrated the effects of image-quality degradation on image-based video-CBCT registration, and improvements in image quality (for example, improved artifact correction) are expected to yield better registration.

Each of the specimens was scanned preoperatively in CT, and structures of interest (fiducials, critical anatomy, and surgical targets) were segmented. Each was then registered to the navigation system using a fiducial-based registration, with fiducial markers spread over the exterior of the skull, as sketched below. All experiments used a passive arm to hold the endoscope and avoid the latency effects of video and tracker synchronization. Video and CBCT images were registered according to the conventional tracker-based approach and the image-based approach. In each image, the target registration error (TRE) was measured as the RMS difference in target points, viz., rigid divots in the phantoms and anatomical points identified by an expert surgeon for the cadavers.

Figure 3 shows the cadaver study experimental setup; the devices and reference frames are as previously labeled in Figure 1. The cadaver specimen was held in a Mayfield clamp on a carbon-fiber table. The C-arm provided intraoperative CBCT on request from the surgeon at various milestones in the course of approach to the skull base, identification of anatomical landmarks therein, and resection of a surgical target (typically the pituitary or clival volume).

Figure 3: Cadaver study experimental setup.
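The fiducial-based registration mentioned above can be illustrated with a minimal least-squares rigid point alignment (the classic SVD solution of Arun et al.); this is a generic sketch under stated assumptions, not the authors' code, and the fiducial coordinates below are synthetic.

```python
# Minimal sketch of fiducial-based rigid registration: least-squares
# alignment of corresponding fiducial points via SVD. Synthetic data only.
import numpy as np

def rigid_register(src, dst):
    """Return R (3x3), t (3,) minimizing sum ||R @ src_i + t - dst_i||^2."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    # Guard against a reflection solution (determinant -1).
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

# Hypothetical check: fiducials in tracker coordinates vs. CT coordinates.
rng = np.random.default_rng(0)
ct_fids = rng.uniform(-50, 50, size=(6, 3))            # fiducials in CT (mm)
R_true = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1.0]])
tracker_fids = ct_fids @ R_true.T + np.array([5.0, -3.0, 12.0])

R, t = rigid_register(tracker_fids, ct_fids)
fre = np.linalg.norm(tracker_fids @ R.T + t - ct_fids, axis=1).mean()
print("Fiducial registration error (mm):", fre)        # ~0 for noiseless data
```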

To evaluate the registration, target points were manually segmented in both the video and CBCT images. Tracker measurements were recorded 10 times each with the endoscope held stationary by a passive arm to avoid tracker jitter and single-sample measurement bias. The ten measurements were then averaged and used to compute the TRE for the tracker-based method. The same video frame from the stationary endoscope was used as the first image of the pair used for the image-based method. A total of 90 target points were identified in the phantoms (within the right and left maxillary, ethmoid, and sphenoid sinuses and the mouth), and a total of 4 target points were identified in the cadavers (viz., the left and right optic carotid recess in two specimens).

After segmenting the targets in both the video and CBCT, the camera pose estimate from either the tracker-based or the image-based method was used as the origin of a ray, $\vec{r}$, through the point segmented in the video. The distance between this ray and the 3D point in the CBCT was computed as the TRE. Thus, for a camera pose with rotation $R$ and center $t$ in CBCT coordinates and camera intrinsic matrix $K$, the ray formed from the homogeneous point $\tilde{p}_{video}$ segmented in the video is:

$\vec{r} = R \, K^{-1} \, \tilde{p}_{video}$  (3)

It then follows that the distance, $d$, between the ray (with unit direction $\hat{r}$) and the point in the CBCT, $p_{CBCT}$, is:

$d = \| (p_{CBCT} - t) - ((p_{CBCT} - t) \cdot \hat{r}) \, \hat{r} \|$  (4)

Geometrically, $d$ can be thought of as the distance between $p_{CBCT}$ and the projection of $p_{CBCT}$ onto the ray. We used $d$ as our TRE metric so that the tracker-based and image-based methods could be compared directly.

3. RESULTS

3.1 Phantom Studies

From a high-quality CBCT, an isosurface was segmented by applying a threshold at the value of the air/tissue interface. The resulting polygon surface was smoothed and de-noised using the Visualization Toolkit (VTK) available within Slicer, shown in Figure 4 (left). While there is some noise in the texture of the hard palate and the base of the mouth, the majority of the surface is smooth and artifact-free, which allows better registration. The simple, rigid phantom experiment therefore begins to probe the fundamental performance limits of image-based in comparison to tracker-based registration.

Figure 4: Comparison of tracker-based (top row, a-c) and image-based (bottom row, d-f) video-CBCT registration in the rigid phantom of Figure 1. The three views (left to right) are: the CBCT surface rendering, the endoscopic video scene, and the TRE colormap. Green spheres indicate the (true) endoscope image target locations. Blue spheres indicate the CBCT target locations.
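The TRE metric of Eqs. (3) and (4) amounts to a point-to-ray distance; the following NumPy sketch (illustrative names and values, not the authors' code) computes it for a single target.

```python
# Sketch of the TRE metric of Eqs. (3) and (4): distance between the 3D
# target segmented in CBCT and the ray cast from the estimated camera
# center through the target segmented in the video frame.
import numpy as np

def tre_point_to_ray(R, t, K, p_video, p_cbct):
    """R, t: camera pose in CBCT coordinates; K: intrinsic matrix;
    p_video: (u, v) target in the image; p_cbct: (x, y, z) target in CBCT."""
    # Eq. (3): back-project the image point into a ray direction in CBCT.
    ray = R @ np.linalg.inv(K) @ np.array([p_video[0], p_video[1], 1.0])
    ray /= np.linalg.norm(ray)
    # Eq. (4): residual after projecting (p_cbct - t) onto the unit ray,
    # i.e. the perpendicular distance from the point to the ray.
    v = np.asarray(p_cbct, float) - np.asarray(t, float)
    return np.linalg.norm(v - (v @ ray) * ray)

# Hypothetical usage with an identity pose and simple intrinsics:
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
d = tre_point_to_ray(np.eye(3), np.zeros(3), K, (320, 240), (0, 0, 50))
print("TRE (mm):", d)   # 0: this target lies exactly on the central ray
```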

In Figure 4, the manually segmented targets in the CBCT are indicated by blue spheres and the targets in the video by green spheres. The result in Figure 4 shows (from left to right) the surface rendering, endoscope image, and TRE colormap. Figures 4(a-c) present the tracker-based results, while Figures 4(d-f) show the image-based results. In the tracker-based results (4a-c), the misalignment of all of the target spheres is visible, whereas image-based registration shows lower error for all of the targets. The colormap shows more clearly the distribution of error over each of the targets. Note the consistent offset arising from the inaccuracy of the extrinsic camera parameters in the tracker-based approach, versus the radial distribution of error in the image-based approach.

Figure 5 summarizes the overall distribution of all TRE measurements within the rigid plastic phantom. The mean TRE for the tracker-based registration method was 1.82 mm (with 1.58 mm first quartile and 2.24 mm range). By comparison, the mean TRE for image-based registration was 0.92 mm (with 0.61 mm first quartile and 1.76 mm range). The improvement in TRE over the tracker-based method was statistically significant (p < 0.001), with the majority of all targets visualized showing sub-millimeter precision.

Figure 5: Summary of TRE measurements in the simple, rigid, plastic phantom (N=43, p < 0.001).

Figure 6 shows an example registration in the anthropomorphic phantom. As in Figure 4, the top row corresponds to tracker-based registration and the bottom row to the image-based results, where the images illustrate target points within the oropharynx (composed of a rapid-prototyped CAD model incorporated within the oral cavity and nasopharynx of the natural skull). Registration in this phantom presented a variety of realistic challenges, for example, a lack of texture on smooth surfaces and a prevalence of realistic CBCT reconstruction artifacts. Nonetheless, the image-based registration algorithm yielded an improvement over tracker-based registration, as illustrated in Figure 6. The CBCT targets are shown by blue spheres, and the video targets as light red spheres.

Figure 6: Comparison of tracker-based (top row) and image-based (bottom row) video-CBCT registration in the more realistic, anthropomorphic phantom. The three views (left to right) are: the CBCT surface rendering, the endoscopic video scene, and the TRE colormap. Light red spheres indicate the (true) endoscope image target locations. Blue spheres indicate the CBCT target locations.

The correspondence of the spheres is shown with a light blue line connecting each pair. A distinct pattern of consistent offset is again observed for the tracker-based approach, compared to a radial distribution of offset in the image-based approach.

Shown in Figure 7 is the overall distribution of TRE measured in the anthropomorphic phantom. The TRE was measured within the nasal passage, the oropharynx, and the mouth. The effect of CBCT artifacts and lack of surface texture is evident in an overall increase in TRE and the large variance of the distribution. The mean TRE for the tracker-based registration method was 2.54 mm (with 1.97 mm first quartile and 5.04 mm range). By comparison, the mean TRE for image-based registration was 1.62 mm (with 0.83 mm first quartile and 3.30 mm range). The results again demonstrate a significant improvement (p < 0.001) for the image-based registration approach.

Figure 7: Summary of TRE measurements in phantom #2 (N=47, p < 0.001).

3.2 Cadaver Study

The same experimental setup as for the phantom studies was repeated in a cadaver to evaluate the effectiveness of the algorithm in a more realistic, preclinical environment. The left and right optic carotid recess (OCR) were manually segmented in the CBCT image and the video for TRE analysis. Figure 8 shows the results of the tracker-based registration (Fig. 8a-b) and the image-based registration (Fig. 8c-d). Again, blue spheres denote points segmented in CBCT and light red spheres points segmented in the video. Note the closer alignment of points after video registration. Table 1 shows the mean and range of the tracker-based and image-based results, once again showing an improvement in the registration. Although the small number of points does not support a statistical comparison, the mean TRE (of the four OCR points) was 4.86 mm for the tracker-based approach, compared to 1.87 mm for the image-based registration. A more complete statistical analysis in cadaver is the subject of ongoing work.

Figure 8: Comparison of tracker-based and image-based video-CBCT registration in cadaver. The top row shows the tracker-based results from our cadaver study with two different views (CBCT [left], live endoscope [right]). The light red spheres indicate the endoscope image target locations. The blue spheres indicate the CBCT target locations. The bottom row shows the image-based results.

Table 1: Summary of TRE measurements in the cadaver study.

  TRE (mm) of the OCR (N=4)    Mean    Range
  Tracker-based                4.86    4.11
  Image-based                  1.87    1.27

After registering the planning data from a preoperative CT to the intraoperative CBCT (as shown in Figure 9), endoscopic video and CBCT were registered via the tracker-based and/or image-based approaches to yield overlay of the planning data on video. Figure 9 shows the posterior aspect of the sphenoid sinus (the surgical target) with the optic nerves overlaid in blue and the carotid arteries in red. The orthogonal views show the standard views visualizing the location of the camera's optical center within the CBCT. Registered to the CBCT is the preoperative CT with planning data overlaid in color on the slices. The planning data enable surgical tasks, such as resection volumes and dissection cut lines, to be defined ahead of time.

4. DISCUSSION

Schulze et al. noted that an exact match of the virtual and the real endoscopic image is virtually impossible,20 since the source of the virtual image is a preoperative CT. However, access to intraoperative CBCT makes possible an image-based registration in which the endoscopic video is matched to an up-to-date tomographic image that properly reflects anatomical changes (including deformations, tissue excisions, etc.) occurring during the procedure. Consider the video snapshot of the clivus, for example, as shown in Figure 8; the location of the camera might be within the anterior aspect of the sphenoid, making the rendering of the current endoscopic location very difficult in the context of preoperative CT alone (i.e., with the anterior sphenoid wall still intact). However, with a current CBCT, the source of the tomographic image is up to date, reflecting surgical resections, thus making it possible to render an isosurface that matches the endoscopic video. A significant improvement (p < 0.001) in TRE was demonstrated in these studies, but the method is sensitive to local minima, and there were cases in which insufficient matched features were available to produce a meaningful reconstruction.

Figure 9: The surgeon's view of the navigation system. Shown in the endoscope view (top right) are the carotid arteries (red) and the optic nerves (blue). The three orthogonal slices indicate the location of the endoscope in the CBCT with planning data overlaid in color.

Local minima can be produced by artifacts in the isosurface, and poor feature matches can occur when there is insufficient texture in the video images. In addition, the current use of only two images for 3D video reconstruction is limiting and not as robust as multi-view reconstruction. Areas of future work include investigating the use of the entire image for registration instead of a sparse feature set, and more tightly coupling the tracker data into the image-based registration. This would add more constraints to the registration and could simplify the formulation of the problem to a single image, thus making frame-by-frame video registration possible.

5. CONCLUSION

Conventional tracker-based video-CT registration provides an important addition to the image-guided endoscopic surgery arsenal, but it is limited by fundamental inaccuracies in tracker registration accuracy and by anatomical change imparted during surgery. As a result, conventional video-CT registration is limited to an accuracy of ~1-3 mm in best-case scenarios, and under clinical conditions with large anatomical change it can be rendered useless. The work reported here begins to resolve such limitations: first, by overcoming tracker-based inaccuracies through a high-precision 3D image-based registration of video (computed by structure-from-motion) with CT or CBCT (air-tissue isosurfaces); and second, by integration with intraoperative CBCT acquired using a high-performance prototype C-arm. Quantitative studies in phantoms show a significant improvement (p < 0.001) in TRE for image-based registration compared to tracker-based registration, with a hybrid arrangement proposed that initializes the image-based alignment with a tracker-based pose estimate to yield a system that is both robust and precise. Translation of the approach to clinical use within an integrated surgical navigation system is underway, guided by expert feedback from surgeons in cadaveric studies. These preclinical studies suggest that removing such conventional factors of geometric imprecision through intraoperative image-based registration could extend the applicability of high-precision guidance systems across a broader spectrum of complex head and neck surgeries.

ACKNOWLEDGMENTS

This work was supported by the Link Foundation Fellowship in Simulation and Training, Medtronic, and the National Institutes of Health grant number R01-CA127144. Academic-industry partnership in the development and application of the mobile C-arm prototype for CBCT is acknowledged, in particular collaborators at Siemens Healthcare (Erlangen, Germany): Dr. Rainer Graumann, Dr. Gerhard Kleinszig, and Dr. Christian Schmidgunst. Cadaver studies were performed at the Johns Hopkins Medical Institute, Minimally Invasive Surgical Training Center, with support and collaboration from Dr. Michael Marohn and Ms. Sue Eller gratefully acknowledged.

REFERENCES

[1] Carrau, R. L., Jho, H.-D., and Ko, Y., "Transnasal-Transsphenoidal Endoscopic Surgery of the Pituitary Gland," The Laryngoscope, 106, 914-918 (1996).
[2] Nasseri, S. S., Kasperbauer, J. L., Strome, S. E., McCaffrey, T. V., Atkinson, J. L., and Meyer, F. B., "Endoscopic Transnasal Pituitary Surgery: Report on 180 Cases," American Journal of Rhinology, 15, 281-287 (2001).
[3] Lapeer, R., Chen, M. S., Gonzalez, G., Linney, A., and Alusi, G., "Image-enhanced surgical navigation for endoscopic sinus surgery: evaluating calibration, registration and tracking," The International Journal of Medical Robotics and Computer Assisted Surgery, 4, 32-45 (2008).
[4] Chassat, F. and Lavallée, S., "Experimental protocol of accuracy evaluation of 6-D localizers for computer-integrated surgery: Application to four optical localizers," in MICCAI, 277-284 (1998).
[5] Schlosser, R. J. and Bolger, W. E., "Image-Guided Procedures of the Skull Base," Otolaryngologic Clinics of North America, 38, 483-490 (2005).
[6] Daly, M. J., Chan, H., Prisman, E., Vescan, A., Nithiananthan, S., Qiu, J., Weersink, R., Irish, J. C., and Siewerdsen, J. H., "Fusion of intraoperative cone-beam CT and endoscopic video for image-guided procedures," in SPIE Medical Imaging, 7625, 762503 (2010).
[7] Shahidi, R., Bax, M. R., Maurer, C. R., Jr., Johnson, J. A., Wilkinson, E. P., Wang, B., West, J. B., Citardi, M. J., Manwaring, K. H., and Khadem, R., "Implementation, Calibration and Accuracy Testing of an Image-Enhanced Endoscopy System," IEEE Transactions on Medical Imaging, 21, 1524-1535 (2002).

Endoscopy System," Medical Imaging, IEEE Transactions on, 21, 1524-1535 (2002). [8] Mirota, D., Wang, H., Taylor, R. H., Ishii, M., and Hager, G. D., "Toward Video-Based Navigation for Endoscopic Endonasal Skull Base Surgery," in MICCAI, 5761, 91-99 (2009). [9] Longuet-Higgins, H. C., "A Computer Algorithm for Reconstructing a Scene from Two Projections," Nature, 293, 133-135 (1981). [10] Siewerdsen, J. H., Daly, M. J., Chan, H., Nithiananthan, S., Hamming, N., Brock, K. K., and Irish, J. C., "Highperformance intraoperative cone-beam CT on a mobile C-arm: an integrated system for guidance of head and neck surgery," in SPIE Medical Imaging, 7261, 72610J (2009). [11] Uneri, A., Schafer, S., Mirota, D., Nithiananthan, S., Otake, Y., Reaungamornrat, S., Yoo, J., Stayman, W., Reh, D. D., Gallia, G. L., Khanna, J., Hager, G. D., Taylor, R. H., Kleinszig, G., and Siewerdsen, J. H., "Architecture of a high-performance surgical guidance system based on C-arm cone-beam CT: software platform for technical integration and clinical translation," in SPIE Medical Imaging (2011), To Appear. [12] Siewerdsen, J., Moseley, D., Burch, S., Bisland, S., Bogaards, A., Wilson, B., and Jaffray, D., "Volume CT with a flat-panel detector on a mobile, isocentric C-arm: Pre-clinical investigation in guidance of minimally invasive surgery," Medical physics, 32, 241-254 (2005). [13] Daly, M., Siewerdsen, J., Moseley, D., Jaffray, D., and Irish, J., "Intraoperative cone-beam CT for guidance of head and neck surgery: Assessment of dose and image quality using a C-arm prototype," Medical physics, 33, 3767-3780 (2006). [14] Deguet, A., Kumar, R., Taylor, R., and Kazanzides, P., "The cisst libraries for computer assisted intervention systems," in MICCAI Workshop (2008), https://trac.lcsr.jhu.edu/cisst/. [15] Pieper, S., Lorensen, B., Schroeder, W., and Kikinis, R., "The NA-MIC Kit: ITK, VTK, pipelines, grids and 3D slicer as an open platform for the medical image computing community," in International Symposium on Biomedical Imaging, 698-701 (2006). [16] Strobl, K. H., Sepp, W., Fuchs, S., Paredes, C., and Arbter, K. "DLR CalDe and DLR CalLab," [Online]. http://www.robotic.dlr.de/callab/ (2010) [17] Mirota, D., Taylor, R. H., Ishii, M., and Hager, G. D., "Direct Endoscopic Video Registration for Sinus Surgery," in SPIE Medical Imaging, 7261, 72612K-1-72612K-8 (2009). [18] Lowe, D. G., "Distinctive Image Features from Scale-Invariant Keypoints," International Journal of Computer Vision, 60(2), 91-110 (2004). [19] Vescan, A. D., Chan, H., Daly, M. J., Witterick, I., Irish, J. C., and Siewerdsen, J. H., "C-arm cone beam CT guidance of sinus and skull base surgery: quantitative surgical performance evaluation and development of a novel high-fidelity phantom," in SPIE Medical Imaging, 7261, 72610L (2009). [20] Schulze, F., Bühler, K., Neubauer, A., Kanitsar, A., Holton, L., and Wolfsberger, S., "Intra-operative virtual endoscopy for image guided endonasal transsphenoidal pituitary surgery," International Journal of Computer Assisted Radiology and Surgery, 5(2), 143-154 (2010).