SST + MacSim. Case Studies Using SST MacSim. Genie Hsieh Sandia National Labs
Transcription
1 SST + MacSim: Case Studies Using SST MacSim
Genie Hsieh, Sandia National Labs
Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.
2 SST MacSim DEMO
- MacSim and DRAMSim2 integration
- Parallel execution of multiple MacSim instances
3 SST MacSim: Two Modes
Standalone
    ./configure <options>
    ./configure --prefix=/home/myhsieh/local/sst --with-mcpat=/home/myhsieh/local --with-hotspot=/home/myhsieh/local --with-m5=/home/myhsieh/m5-x86/
    make; make install
With DRAMSim2
    Build the DRAMSim2 library: make libdramsim.so
    ./configure <options> --with-dramsim=dir
    ./configure --prefix=/home/myhsieh/local/sst --with-mcpat=/home/myhsieh/local --with-hotspot=/home/myhsieh/local --with-m5=/home/myhsieh/m5-x86/ --with-dramsim=/home/mhsieh/dramsim2
    make; make install
4 MacSim + DRAMSim2 Example
SST-MacSim component:
    <component name=gpu0 type=macsimcomponent>
        <parampath>params_hetero_1_6</parampath>
        <tracepath>trace_file_list</tracepath>
        <outputpath>results</outputpath>
        <clock>1.4ghz</clock>
        <link name=membus port=bus latency=1ns />
DRAMSim2 component (DDR2, DDR3):
    <component name=mem0 type=dramsimc>
        <clock> 1.5 Ghz </clock>
        <megsofmemory> 1024 </megsofmemory>
        <systemini> system_gddr5.ini </systemini>
        <deviceini> ini/gddr5_hynix_1gb_16b.ini </deviceini>
        <link name=membus port=bus latency=1ns />
Both components attach to the same link, membus.
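The wiring on this slide relies on both components naming the same link (membus). A minimal sketch of checking that pairing is given below. It assumes a well-formed rendering of the fragment: the <sdl> root element, the quoted attributes and the closing tags are illustrative assumptions rather than confirmed SST syntax, and link_endpoints is a hypothetical helper, not part of SST.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed rendering of the slide's fragment. The <sdl> root,
# the attribute quoting and the closing tags are assumptions for illustration.
SDL_SKETCH = """
<sdl>
  <component name="gpu0" type="macsimcomponent">
    <link name="membus" port="bus" latency="1ns"/>
  </component>
  <component name="mem0" type="dramsimc">
    <link name="membus" port="bus" latency="1ns"/>
  </component>
</sdl>
"""


def link_endpoints(xml_text):
    """Map each link name to the list of component names that reference it."""
    endpoints = {}
    for comp in ET.fromstring(xml_text).iter("component"):
        for link in comp.iter("link"):
            endpoints.setdefault(link.get("name"), []).append(comp.get("name"))
    return endpoints


endpoints = link_endpoints(SDL_SKETCH)
# Assumption of this sketch: a link is point-to-point, so it needs two endpoints.
dangling = {name: comps for name, comps in endpoints.items() if len(comps) != 2}
print(endpoints)
print("dangling links:", dangling)
```

A link name that appears on only one component would show up in dangling, which is a quick way to catch a misspelled link name before launching a run.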
5 DRAMSim2 Simulation Output
    bin]$ ./sst.x --sdl-file=test_dram.xml
    SST: construct macsimcomponent and setsstcomponent with ID 0
    SST: construct DRAMSimC with ID 1
    src/macsim.cc:588: (I=0 C=439930): elapsed time:7.4 seconds
    Done
    DRAM: Background Energy
    DRAM: Burst Energy
    DRAM: ACT/PRE Energy
    DRAM: Refresh Energy
    Bus packet / Transaction / Transaction queue:
    1]T [Read] [0x45bbfa4]
    2]T [Write] [0x55fbfa0] [5439E]
Slide annotations: Memory statistics, Power
6 MacSim Memory Experiments
MacSim + DDR3
    <component name=mem0 type=dramsimc>
        <systemini> system_ddr3.ini </systemini>
        <deviceini> ini/gddr3.ini </deviceini>
MacSim + GDDR5
    <component name=mem0 type=dramsimc>
        <systemini> system_gddr5.ini </systemini>
        <deviceini> ini/gddr5.ini </deviceini>
Output (first run)
    **Core 1 Core_Total Finished: insts: cycles: seconds: IPC (0.47 IPC)
    (I=0 C=439930): finalize simulation
    DRAM: Background Energy
    DRAM: Burst Energy
    DRAM: ACT/PRE Energy
Output (second run)
    **Core 1 Core_Total Finished: insts: cycles: seconds: IPC (0.48 IPC)
    (I=0 C=428508): finalize simulation
    DRAM: Background Energy
    DRAM: Burst Energy
    DRAM: ACT/PRE Energy
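The two outputs differ mainly in their final cycle counts (C=439930 versus C=428508). The short sketch below works out the relative difference. It assumes both runs execute the same trace, so that cycle counts are directly comparable. The variable and function names are illustrative.

```python
# Cycle counts copied from the two "finalize simulation" lines on the slide.
cycles_first = 439930   # run reporting 0.47 IPC
cycles_second = 428508  # run reporting 0.48 IPC


def relative_cycle_reduction(baseline, improved):
    """Fraction of cycles saved when going from baseline to improved."""
    return (baseline - improved) / baseline


reduction = relative_cycle_reduction(cycles_first, cycles_second)
print(f"cycle reduction: {reduction:.2%}")
```

The result is a reduction of roughly 2.6% in cycles, in line with the small IPC change from 0.47 to 0.48.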
7 Parallel Execution of MacSim in SST
(Diagram: MacSim, SST-MacSim)
8 Parallel Execution of Multiple MacSim
(Diagram: SST-MacSim CPU and SST-MacSim GPU connected through SST-bus to SST-DRAMSim2 memory)
CPU (SST-MacSim)
    <component name=cpu0 type=macsimcomponent>
        <parampath>params_x86</parampath>
        <tracepath>trace_file_list_cpu</tracepath>
        <clock>4ghz</clock>
        <link name=cpu port=bus latency=1ns />
Bus (SST-bus)
    <component name=bus0 type=bus>
        <clock>1ghz</clock>
        <devicelist> cpu gpu mem</devicelist>
        <link name=cpu port=cpu latency=1ns />
        <link name=gpu port=gpu latency=1ns />
        <link name=mem port=mem latency=1ns />
GPU (SST-MacSim)
    <component name=gpu0 type=macsimcomponent>
        <parampath>params_gtx8800_v2</parampath>
        <tracepath>trace_file_list_gpu</tracepath>
        <clock>1.4ghz</clock>
        <link name=gpu port=bus latency=1ns />
Memory (SST-DRAMSim2)
    <component name=mem0 type=dramsimc>
        <systemini> system_gddr5.ini </systemini>
        <deviceini> ini/gddr5_.ini </deviceini>
        <link name=mem port=bus latency=1ns />
9 Parallel Execution of Multiple MacSim
Each component is assigned to its own rank:
    <component name=cpu0 type=macsimcomponent rank=0>
    <component name=cpu1 type=macsimcomponent rank=1>
    <component name=gpu0 type=macsimcomponent rank=2>
    <component name=gpu1 type=macsimcomponent rank=3>
    <component name=bus type=bus rank=4>
    <component name=dram type=dramsimc rank=5>
Launch with one MPI process per rank:
    mpirun -np 6 ./sst.x --sdl-file=macsim.xml
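The slide pairs six rank assignments (0 through 5) with a six-process mpirun launch. The sketch below derives the process count from the rank layout and builds the launch command. The dictionary and the helper required_process_count are illustrative only, and the requirement that ranks be contiguous from 0 is an assumption of this sketch, not a documented SST rule.

```python
from typing import Dict

# Component-to-rank layout copied from the slide. The dict itself is illustrative.
component_ranks: Dict[str, int] = {
    "cpu0": 0,
    "cpu1": 1,
    "gpu0": 2,
    "gpu1": 3,
    "bus": 4,
    "dram": 5,
}


def required_process_count(ranks: Dict[str, int]) -> int:
    """Return the number of MPI processes implied by the rank layout.

    Assumption of this sketch: ranks must cover 0..n-1 with no gaps.
    """
    used = sorted(set(ranks.values()))
    if used != list(range(len(used))):
        raise ValueError(f"ranks must be contiguous from 0, got {used}")
    return len(used)


np_count = required_process_count(component_ranks)
command = f"mpirun -np {np_count} ./sst.x --sdl-file=macsim.xml"
print(command)
```

Deriving the -np value from the layout keeps the launch command in step with the configuration when components are added or removed.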
10 Memory Experiments
1 CPU, 1 GPU, DDR3
    DRAM: Background Energy
    DRAM: Burst Energy 2380
    DRAM: ACT/PRE Energy 7080
    # Simulation times
    # Build time: 0.00 s
    # Simulation time: s
    # Total time: s
2 CPUs, 2 GPUs, DDR3
    DRAM: Background Energy
    DRAM: Burst Energy
    DRAM: ACT/PRE Energy
    # Simulation times
    # Build time: 0.00 s
    # Simulation time: s
    # Total time: s
Future work: SST-MacSim + SST-Iris for parallel simulation of a GPU cluster