Performance Evaluation of CT Systems

8/2/206 Performance Evaluation of CT Systems AAPM Task Group 233 Foundations, Methods, and Prospects Ehsan Samei Duke University Medical Center Disclosures Research grant: NIH R0 EB00838 Research grant: General Electric Research grant: Siemens Medical Advisory board: Imologix TG 233 membership Donovan Bakalyar Kirsten L Boedeker Samuel Brad Jiahua Fan Shuai Leng Michael McNitt-Gray Kyle J. Myers Lucretiu Popescu Juan Carlos Ramirez Giraldo Frank Ranallo Ehsan Samei Justin Solomon Jay Vaishnav Jia Wang

8/2/206 Duke Imaging Physics Carl E Ravin Advanced Imaging Laboratories Clinical Imaging Physics Group Outline Foundation What has led to TG233 Methods Select procedures Prospects Possibilities Future extensions Outline Foundation What has led to TG233 Methods Select procedures Prospects Possibilities Future extensions 2

8/2/206 Foundation New physics perspective New CT features New physics perspective Medical physics metrology is most relevant to the extend it relates to the clinical need. Conformance-based testing Validating physical specifications: pass/fail Relevance of non-compliance? 2. Performance-based testing Metrics relevant to clinical accuracy Application to optimization of use: precision medicine and patient-centric care Medical Physics 3.0 Medical physics extending from specifications to performance equipment to operation quality check to process consistency Presumption to actual utility compliance to excellence 3

8/2/206 Features of modern CT Variable kernels Iterative reconstruction Tube current modulation Special applications Spectral methods and applications Cone-beam CT Cardiac CT Perfusion imaging Iterative recons ASiR (GE) Veo (GE) IRIS (Siemens) SAFIRE (Siemens) AIDR (Toshiba) idose (Philips) Significant potential for dose reduction Potential for improved image quality Increased vendor-dependence Unconventional image appearance Limited utility of prior quality metrics Need for nuanced implementation for effective improvement in patient care FBP Reconstruction Iterative Reconstruction courtesy of University of Erlangen, Germany 4

MTF NPS MTF NPS 8/2/206 FBP Reconstruction Iterative Reconstruction Low Dose CT @ 4 DLP,.9 msv Courtesy of Dr de Mey and Dr Nieboer, UZ Brussel, Belgium Resolution and noise, eg Comparable resolution Lower noise but different texture -FBP -IRIS -SAFIRE x 0-6 2.5 2 -FBP -IRIS -SAFIRE.5 0.4 0.2 0 0 0.2 0.4 Spatial frequency 0 0 0.2 0.4 Spatial frequency Resolution and noise, eg 2 Higher resolution Lower noise but different texture 0.9 x 0-5 0.45 0.4 0.35 0.3 -MBIR -ASIR -FBP 0.4 0.3 0.2 -MBIR -ASIR -FBP 0.25 0.2 0.5 0. 0. 0.05 0 0 0.2 0.4 Spatial frequency 0 0 0.2 0.4 Spatial frequency 5

8/2/206 Non-Stationary Noise and Resolution The Local NPS and MTF NPS MTF 2.5 x0-6 0 0 Noise and resolution model for Penalized Likelihood (PL) model-based reconstruction.* Predictive framework for NPS, MTF, and detectability index (d ) enables task-based design and optimization of new systems using iterative reconstruction. *Fessler et al. IEEE-TIP (996) G. Gang et al. Med Phys 4 (204) Outline Foundation What has led to TG233 Methods Select procedures Prospects Possibilities Future extensions Methods. Visual inspection 2. Basic performance Summarizing existing methods in tabular form 3. Operational performance Tube current modulation Spatial resolution Noise Quasi-linear task-based performance Spatial domain task-based performance Appendix 6

8/2/206 Basic performance TCM Use: Variable-sized, continuous and step-wise phantoms imaged with TCM Measure: ma and noise per size Evaluate: ma and noise size dependencies Phase concordance Resolution Use: Phantom with 2 cm circular inserts of relevant contrast imaged using representative protocols Measure: Task transfer function (TTF) Evaluate: TTF at defined noise and contrast Frequencies at 50% and 0% TTF (f 50 and f 0 ) 7

8/2/206 Noise Use: Variable-sized, uniform phantom imaged using representative protocols Measure: Noise magnitude and NPS Evaluate: SD and NPS at defined noise levels Peak, average frequencies of the NPS (f P and f A ) Spatial-domain task-based detectability Use: Uniform phantoms with rod and sphere targets imaged using representative protocols Measure: Human or observer model target detection Evaluate: Localization success rate for targeted tasks Area under the LROC curve for targeted tasks Area under the EFROC curve for targeted tasks Quasi-linear task-based detectability index Use: TTF and NPS evaluation phantom imaged using representative protocols Measure: TTF and NPS Evaluate: Detectability indices for reference tasks (, 5, 0 mm, 0 and 00 HU, designer, rect, Gaussian) ' ( d NPWE ) 2 = òò [ òò MTF 2 (u,v)w 2 Task (u,v) E 2 (u,v)dudv] 2 MTF 2 2 (u,v)w Task (u,v) NPS(u,v)E 4 (u,v) + MTF 2 2 (u,v)w Task (u,v)n i dudv 8

37 cm 30 cm 23 cm 8.5 cm 2 cm 8/2/206 Suggested reference protocols Quasi-linear task-based measurements Mercury Phantom 3.0 Size matching population cohorts Designed for size, AEC, and d evaluations Mercury Phantom 3.0 8.5 cm 6 cm 6 cm 6 cm 6 cm 3.9 cm 8.5 cm 5.63 cm 6 cm 3 cm 30 9

8/2/206 Design: Size Pediatric representation percentages MP 3.0 section Water size equivalent 20 mm 2 mm 85 mm 77 mm Abdomen Chest Head Age Percentile Age Percentile Age Percentile 0 2 0 50 7 5 5.5 5 0 5 6 50 0 50 3 95 5 5 6 5 2 50 3 95 8 95 230 mm 220 mm 2 50 - - 6 50 2 5 2 95 300 mm 290 mm 9 95 - - 2 50 370 mm 355 mm 20 95 - - - - Design: Size Adult representation percentages MP 3.0 section Water size equivalent Abdomen Chest Head M F M F M F 20 mm 2 mm - - - - - - 85 mm 77 mm - - - - 25 75 230 mm 220 mm 0.4 9 0.06.4 - - 300 mm 290 mm 27. 6 4 48 - - 370 mm 355 mm 80 90.3 60 87 - - Design: Resolution, HU, noise Representation of abnormality-relevant HUs Sizes large enough for resolution sampling Maximum margin for individual assessment Iso-radius resolution properties Matching uniform section for noise assessment 30 0

8/2/206 Image Quality Evaluation Software HU, Contrast, Noise, CNR, MTF, NPS, d per patient size, fixed ma and TCM Expected release through AAPM TG 233 d vs observer performance Christianson et al, Radiology, 205 Comparing observer models Model: CNR CNRa NPW NPWE CHO CHOi Task Properties Lesion Contrast x x x x x x Lesion Size x x x x x Image Properties Noise Magnitude x x x x x x Noise Texture x x x x Resolution x x x x Observer Properties Visual System x x x Observer Noise x Assumptions Quasi LSI System x x Noise Stationarity x x

Detection Accuracy Detection Accuracy Detection Accuracy Detection Accuracy Detection Accuracy Detection Accuracy Human 8/2/206 Criteria for goodness of d calculation methods Strong correlation with human results Coefficient of determination (R 2 ) used as goodness of fit metric. Correlation independent of reconstruction algorithm Error, E, of linear discriminator used (bigger = better). Confidence interval for d should be small Average CI 95% used. Normalized to unity slope FBP ADMIRE Fit Lin. Disc. Model (c) Ehasn Samei R 2 = 0.4 R 2 = 0.4 R 2 = 0.4 E = 0.3E = 0.3 E = 0.3 CI = CI 5.60e-03 = 95% CI 5.60e-03 95% = 5.60e-03 95% Simple Metrics Fourier-Based Image Based CNR CNR CNR NPW NPW NPW Humans R R Rvs. Models 2 = 26 2 = 26 2 = 26 E = 0.25E = 0.25 E = 0.25 CI = CI 3.32e-03 = 95% CI 3.32e-03 95% = 3.32e-03 95% 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 CHO CHO CHO R 2 = 03 R 2 = 03 R 2 = 03 E = 0.4E = 0.4 E = 0.4 CI = CI 4.25e-02 = 95% CI 4.25e-02 95% = 4.25e-02 95% 0.9 0.9 0.9 0.9 0.9 0.9 0.9 0.9 A A CNR CNR A CNR A A NPW NPW A NPW A A CHO CHO A CHO CNRa CNRa CNRa NPWE NPWE NPWE CHOi CHOi CHOi 0.9 R 2 = 07 R 2 = 07 R 2 = 07 E = 0.5E = 0.5 E = 0.5 CI = CI.20e-02 = 95% CI.20e-02 =.20e-02 95% 95% 0.9 0.9 0.9 R 2 = 7 R 2 = 7 R 2 = 7 E = 0.3E = 0.3 E = 0.3 CI = CI 5.83e-03 = 95% CI 5.83e-03 = 5.83e-03 95% 95% 0.9 0.9 0.9 R 2 = 2 R 2 = 2 R 2 = 2 E = 0.45E = 0.45 E = 0.45 CI = CI 4.59e-02 = 95% CI 4.59e-02 = 4.59e-02 95% 95% 0.9 0.9 0.9 0.9 0.9 A A A CNRa CNRa CNRa 0.9 0.9 0.9 A A A NPWE NPWE NPWE 0.9 0.9 A A A CHOi CHOi CHOi 0.4 0.2 0 Comparing statistics across models R 2 0.4 0.3 0.2 0. 0 CNR CNRa NPW NPWE CHO CHOi CNR CNRa NPW NPWE CHO CHOi E 0.05 0.04 0.03 0.02 0.0 0.00 CI 95% CNR CNRa NPW NPWE CHO CHOi 2

Iodine concentration 8/2/206 No Somewhat Yes Model Comparison CNR CNRa NPW NPWE CHO CHOi Correlated with humans? Computed for generic tasks not present in phantom? Handling non-linearity or nonstationary noise? Correctly characterizing different recons? Acceptable uncertainty for a reasonable # of images? Outline Foundation What has led to TG233 Methods Select procedures Prospects Possibilities Future extensions Task-based performance Task characteristics for detection and estimation Size F C (r ) = C peak ( - r 2 æ ö ç èr ø ) n 3

AUC (NPWE) AUC (NPWE) 20% 29% 8/2/206 AUC vs. dose FBP IRIS SAFIRE3 SAFIRE5 Chen, SPIE, 203 Task-based dose reduction Large feature detection task 0.95 MBIR ACR Acrylic insert 0.9 5 5 5 78% FBP 5 0 2 3 4 5 6 7 Eff dose (msv) 0.2 msv 3.7 msv Task-based dose reduction Small feature detection task 0.95 MBIR ACR lp/mm bar pattern 0.9 5 5 5 64% FBP 5 0 2 3 4 5 6 7 Eff dose (msv) 0.2 msv 3.7 msv 4

Detectability 8/2/206 Dose-quality optimization Quality-dose gradient to achieve highest quality at lowest dose Iso-gradient operating points 2 Detectability index across systems 0 8 All Systems System System 2 System 3 System 4 System 5 System 6 System 7 System 8 System 9 System 0 System System 2 System 3 6 6 8 0 2 Date Intra system variability: -4% Inter system variability: 6% Detectability index across protocols pilot national trial # Protocols 5 3 8 7 4 67 5

Average resolution (f 50 ) 8/2/206 Protocol optimization: Noise texture matching GE Siemens Solomon, Samei, Med Phys, 202 Texture similarity Sharpness Solomon, Samei, Med Phys, 202 Protocol optimization: Noise texture and resolution matching Average noise texture (f A ) 6

Average resolution (f 50 ) Log of dose needed for a target noise of 0 8/2/206 Protocol optimization: Noise texture and resolution matching Average noise texture (f A ) Outline Foundation What has led to TG233 Methods Select procedures Prospects Possibilities Future extensions CT performance in anatomicallyinformed textured phantoms Lung Texture Soft-Tissue Texture 7

NPS nnps (HU (mm 2 *mm 2 ) 2 ) 8/2/206 Noise in textured phantoms What about noise texture? 300.4 FBP 250.2 200 50 00 0.4 0.2 50 Lung Texture NPS IR FBP: Red Pixels IR: Red Pixels IR: Blue Pixels 0 0 0.2 0.4 Spatial Frequency (cycles/mm) 8

8/2/206 Texture phantom library Cylinder phantom 30 mm thick 65 mm diameter 20 Low contrast signals size (6 mm) 5 contrast levels (~3, 5, 7, 0, 4 HU) Signal-present and signal-absent regions with identical background 4 phantoms made 3 textured + uniform 57 Conclusions New technologies and new paradigm necessitate an upgrade to performance metrology towards higher degrees of clinical relevance: Taskful surrogates of clinical performance Application for use optimization TG233 a first step towards uniformity and relevance of characterization TG 233 timeline Aug 206: v. 2 released for committee review Release anticipated in late 206 9