Center for Advanced Computing Research DANSE Kickoff Meeting Mark Stalzer stalzer@caltech.edu August 15, 2006
CACR Mission and Partners: Creating advanced computing methods to accelerate scientific discovery
CACR Competencies
  - High Performance Computing Systems
  - Facilities design, operation, and user support
  - Physics Based Simulation
  - Algorithm development
  - Validation & Verification
  - Data Intensive Science (Novel Instruments)
  - Data transport, storage, and analysis
  - Standards and community building
  - Visualization
  - Scientific Software Engineering
Intel Touchstone Delta: World's Fastest Computer in 1991 (30 Gflops)
Facilities
  - NVO
  - Powell-Booth Laboratory for Computational Science
  - Machine room: shc & LIGO silo
  - LHC/CMS Tier2
Caltech ASC Center: Multi-Physics Multi-Scale Modeling
  [Figure: damage mechanisms plotted against time scale (ns, µs, ms) and length scale (nm, µm, mm)]
  - Void growth, coalescence
  - Ductile fracture
  - Damage localization
  - Dislocation emission, nanovoid cavitation
  - Vacancy clustering, nanovoid nucleation
  - Vacancy generation
Ductile Fracture. Source: Dan Meiron
Data Intensive Science: Knowledge Gap
  - Large digital sky surveys are becoming the dominant source of data in astronomy: ~10-100 TB/survey, ~10^6-10^9 sources/survey, many wavelengths.
  - [Figure: growth by year, 1970-2000, on a logarithmic axis from 0.1 to 1000; curves labeled "CCDs" and "Glass"; doubling time ~1.5 yrs]
  - Data quantity is growing exponentially, driven by detector technology, but our understanding of the universe increases much more slowly!
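To make the doubling-time figure concrete, the following minimal sketch computes the growth factor implied by a fixed doubling time. The function name `growth_factor` is hypothetical, and applying the ~1.5 year doubling time uniformly across the whole 1970-2000 span of the chart is an assumption made here for illustration, not a claim from the slide.

```python
def growth_factor(years, doubling_time_years=1.5):
    """Factor by which a quantity grows over `years` at a fixed doubling time.

    Assumption: pure exponential growth, i.e. factor = 2 ** (years / doubling_time).
    """
    return 2.0 ** (years / doubling_time_years)


# Illustrative only: 30 years (1970-2000) at a 1.5 year doubling time
# is 20 doublings.
factor_30_years = growth_factor(30)
print(f"Growth over 30 years: about {factor_30_years:.3g}x")
```

Under that assumption, twenty doublings correspond to roughly a million-fold increase, which is the scale of the gap the slide contrasts with the slower growth in understanding.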
Astronomical Virtual Observatories (www.us-vo.org)
CERN LHC Cyberinfrastructure
  [Figure: tiered data-distribution hierarchy]
  - Online System
  - Tier 0+1: CERN Center (PBs of disk; tape robot)
  - Tier 1: IN2P3 Center, INFN Center, RAL Center, FNAL Center
  - Tier 2: Tier2 Centers
  - Tier 3: Institutes; physics data cache
  - Workstations
  - Link rates labeled in the figure: ~PByte/sec, ~100-1500 MBytes/sec, 10-40 Gbps, ~10 Gbps, ~1-10 Gbps, 1 to 10 Gbps
  - CERN/Outside Resource Ratio ~1:2; Tier0/(Σ Tier1)/(Σ Tier2) ~1:1:1
  - Tens of Petabytes by 2007-8; an Exabyte ~5-7 years later
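As a rough illustration of what these link rates mean for petabyte-scale data, the sketch below estimates how long one petabyte would take to cross a single 10 Gbps link. The helper `transfer_time_seconds` is hypothetical, and the calculation assumes decimal units, full sustained utilization, and no protocol overhead, none of which the slide states.

```python
def transfer_time_seconds(num_bytes, link_gbps):
    """Ideal transfer time for `num_bytes` over a link of `link_gbps` gigabits/s.

    Assumptions: decimal units (1 Gbps = 1e9 bits/s), the link is fully and
    continuously utilized, and protocol overhead is ignored.
    """
    bits = num_bytes * 8
    return bits / (link_gbps * 1e9)


PETABYTE = 10**15  # decimal petabyte, in bytes

seconds = transfer_time_seconds(PETABYTE, link_gbps=10)
days = seconds / 86400
print(f"1 PB over 10 Gbps: {seconds:.0f} s, about {days:.1f} days")
```

Even under these idealized assumptions, a single petabyte occupies a 10 Gbps link for roughly nine days, which suggests why the data is spread across several tiers of centers rather than served from one site.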
Novel Instrument: TeraVoxel
  - Observing turbulent mixing
  - ~300 MB/s captured from KPS camera (laser illumination)
  - Sent to local ~50 TB DataWulfs
  - Transmitted to CACR for image correction, processing, and visualization
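The capture rate and local storage size quoted for TeraVoxel can be related with a short calculation. The sketch below estimates how long continuous capture would take to fill the local store. The function name `fill_time_hours` is hypothetical, and the calculation assumes decimal units, uninterrupted capture at the full rate, and that ~50 TB is the total usable capacity; the slide does not say whether that figure is per machine or aggregate.

```python
def fill_time_hours(capacity_bytes, rate_bytes_per_s):
    """Hours of continuous capture needed to fill `capacity_bytes`."""
    return capacity_bytes / rate_bytes_per_s / 3600.0


CAPTURE_RATE = 300e6     # ~300 MB/s, decimal megabytes assumed
STORE_CAPACITY = 50e12   # ~50 TB, assumed to be total usable capacity

hours = fill_time_hours(STORE_CAPACITY, CAPTURE_RATE)
print(f"Time to fill local storage: about {hours:.1f} hours")
```

Under these assumptions the local store holds on the order of two days of continuous capture.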
TeraVoxel Operation. Source: Santiago Lombeyda
Turbulent Mixing Close-up
Visualization: ShakeMovie.com
CACR Initiative in Computational Biology
  - Biology is >25% of research at Caltech
  - Working with Biology Division, Beckman Inst., and E&AS
  - Current efforts:
    - Biological Network Modeling Center & ARO-ICB program
    - Parallel stochastic simulation algorithm with tau leaping
    - Center for the Integrative Study of Cell Regulation
    - Funded by Moore Foundation ($5.6M gift over 5 years)
    - MCell tool chain
    - Biological image processing
    - Phylogenetic inferencing
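For readers unfamiliar with tau leaping, the following is a minimal serial sketch of the idea, not the parallel algorithm mentioned on the slide. It simulates a single decay reaction A -> 0 with propensity `rate * x`, advancing time in fixed steps of length `tau` and firing a Poisson-distributed number of reactions per step. The function names, the fixed step size, the choice of reaction, and the simple Knuth Poisson sampler are all assumptions made for illustration.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) sample with Knuth's method (fine for small lam)."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def tau_leap_decay(x0, rate, tau, t_end, seed=0):
    """Fixed-step tau leaping for the decay reaction A -> 0.

    Each step fires Poisson(rate * x * tau) reactions, instead of simulating
    every reaction event one at a time as an exact stochastic simulation would.
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    trajectory = [(t, x)]
    while t < t_end and x > 0:
        fires = poisson(rate * x * tau, rng)
        x = max(x - fires, 0)  # clamp so the population never goes negative
        t += tau
        trajectory.append((t, x))
    return trajectory


trajectory = tau_leap_decay(x0=1000, rate=0.1, tau=0.1, t_end=5.0)
print(f"Final state after {len(trajectory) - 1} leaps: {trajectory[-1]}")
```

The trade-off in this sketch is the usual one for tau leaping: larger `tau` means fewer steps but a coarser approximation, since the propensity is held constant within each leap. Practical implementations typically choose `tau` adaptively.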
The Internet Hourglass
  [Figure: hourglass of protocol layers. Source: John Doyle]
  - Applications: User; Web, FTP, Mail, News, Video, telnet, ping, napster
  - TCP; telnet protocol; IP
  - Link technologies: Ethernet, ATM, Power lines, 802.11, Optical, Satellite, Bluetooth
Bridging the Semantic Gap
  [Figure: layered software stack]
  - e-science: geo (CIG), astro (NVO), bio, physics, other
  - Semantic Gap
  - Simulation; Data intensive
  - Libs (LAPACK, PETSc, ...)
  - Assembly Languages (C++, MPI, XML, SQL, ...)
  - Scheduling (Time & Space)
  - Computational Substrate
  - Generalized fast (synthetic) frameworks?
Software is the Secret Sauce: Palomar-QUEST Image Filtering & Fusion. Source: Roy Williams
For more information: www.cacr.caltech.edu