EECS 598: Integrating Emerging Technologies with Computer Architecture. Lecture 14: Photonic Interconnect
1 EECS 598: Integrating Emerging Technologies with Computer Architecture. Lecture 14: Photonic Interconnect. Instructor: Ron Dreslinski. Winter
2 Announcements
Remaining lecture schedule:
- 3/15: Photonics
- 3/17: Project Meetings
- 3/22: Student Presentations (2)
- 3/24: Student Presentations (2)
- 3/29: Student Presentations (2)
- 3/31: Student Presentations (2)
- 4/5: Student Presentations (2)
- 4/7: Student Presentations (2)
- 4/12: Project Writeup Due; Group Project Presentations (2 or 3)
- 4/14: Group Project Presentations (2 or 3)
3 Photonic Interconnect
- Used heavily in the telecommunications industry
- Encodes data in photons (light) rather than electrons
- Multiple wavelengths of light provide natural communication channels over a single connection
- But can we integrate photonics into a CMOS system and use it for chip-to-chip or even on-chip communication?
4 Corona
- Enter the Corona paper: "Corona: System Implications of Emerging Nanophotonic Technology," Dana Vantrease, Robert Schreiber, Matteo Monchiero, Moray McLaren, Norman P. Jouppi, Marco Fiorentino, Al Davis, Nathan Binkert, Raymond G. Beausoleil, and Jung Ho Ahn. In ISCA-35, Beijing, China, June
- The paper discusses the use of 3D integration to provide all the components necessary for on-chip photonics
- It then addresses the architectural design
5 How does it work on chip?
[Figure: ring resonator configurations, panels a) to e). Labels: ring resonator, SiGe doped, coupler, waveguide. Callouts: can couple a selected wavelength; can detect the wavelength; can pass the value on to another line.]
Using ring resonators, the system can couple only those signals that match a certain wavelength out of the waveguide, pass them between waveguides, or detect their presence.
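The wavelength-selective behavior described on this slide can be illustrated with a toy model. This is only a sketch: the `RingResonator` class, the `pass_ring` function, and the wavelength values are hypothetical names and numbers chosen for illustration, and the model ignores all device physics (loss, tuning, partial coupling).

```python
from dataclasses import dataclass


@dataclass
class RingResonator:
    wavelength_nm: float  # wavelength the ring is tuned to (illustrative value)
    active: bool = True   # an inactive ring lets all light pass by


def pass_ring(signals, ring):
    """Return (light that continues down the waveguide, light coupled into the ring).

    `signals` is the set of wavelengths currently present on the waveguide.
    An active ring removes only its own wavelength; everything else passes.
    """
    if ring.active and ring.wavelength_nm in signals:
        coupled = {ring.wavelength_nm}
    else:
        coupled = set()
    return set(signals) - coupled, coupled


# Three wavelengths share one waveguide; the ring picks out exactly one.
waveguide = {1550.0, 1551.0, 1552.0}
remaining, coupled = pass_ring(waveguide, RingResonator(wavelength_nm=1551.0))
print(sorted(remaining), sorted(coupled))
```

The coupled light could then be handed to a detector or to a second waveguide, which corresponds to the "detect" and "pass" roles named on the slide.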
6 Putting them into a system
[Figure: 3D die stack. Labels: fiber I/Os, network, heat sink, processor/L1 die, memory controller/directory/L2 die, analog electronics die, optical die, package, face-to-face bonds, laser, sTSVs, pgcTSVs.]
7 High-Level Architecture
- Cluster-based approach (increases core count without increasing the interconnect as much)
- On-chip directory-based coherence
- A hub connects each cluster optically to the other clusters on chip
- Off-chip memory also uses optical connections to improve bandwidth
[Figure 2: Architecture Overview. (a) Clusters 0 through 63 attached to the optical interconnect. (b) One cluster: cores, shared L2 cache, directory, memory controller, network interface, and hub, connected to the optical interconnect.]
8 More Detailed Architecture
[Figure 3: Layout with Serpentine Crossbar and Resonator Ring Detail. Core die: cores 0 to 3 with L1-I and L1-D caches and L1/L2 interfaces. Cache die: L2 cache, memory controller (MC), directory, hub, network interface (NI), "my" and peer crossbar connections. Optical die: laser, star coupler, splitters, detectors, injectors, 4-waveguide bundles, crossbar, broadcast, arbitration, optically connected memory. A through-silicon via array joins the dies. Legend: photonic subsystem, waveguides, ring resonators.]
Text fragment visible on the slide: "... ensures that the memory bandwidth grows linearly with increased core count, and it provides local memory accessible ..."
9 Multiple-writer single-reader (MWSR) interconnects
- Latchless / wave-pipelined
- Arbitration prevents corruption of in-flight data
Source: Mikko Lipasti, University of Wisconsin
10 Arbitration solutions
- Token Channel: single token / serial writes. Token passing allows the token to pace the transmission tail (no bubbles).
- Token Slot: multiple tokens / simultaneous writes. Token passing allows the token to directly precede the slot.
Source: Mikko Lipasti, University of Wisconsin
11 Token Protocol
- How do you prevent more than one writer on a given wavelength (color)?
- Assign each cluster its own wavelength (color) to read
- Have a token that circulates through the system to indicate who is allowed to write
- This leads to underutilization of the potential interconnect (when the token is at a node that doesn't need it)
[Figure: Clusters 0, 1, and 2 with injectors and detectors on an arbitration waveguide and a power waveguide; each home cluster has its own wavelength (r, g, b). Legend: active ring resonator, inactive ring resonator, lit, unlit.]
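The circulating-token idea, and the underutilization it can cause, can be sketched with a small simulation. This is a simplified illustration rather than the Corona protocol itself: the function name `circulate_token`, the one-message-per-visit rule, and the hop counting are assumptions made for the example.

```python
def circulate_token(pending, num_clusters, max_hops=1000):
    """Simulate one token circulating among clusters for a single read channel.

    `pending` maps cluster id -> number of messages it wants to write on
    this channel. Only the cluster currently holding the token may write,
    so two writers never collide on the same wavelength.

    Returns (grants, idle_hops): the clusters that wrote, in order, and the
    number of token visits to clusters that had nothing to send.
    """
    pending = dict(pending)  # copy so the caller's dict is left untouched
    grants, idle_hops = [], 0
    position, hops = 0, 0
    while any(pending.values()) and hops < max_hops:
        if pending.get(position, 0) > 0:
            pending[position] -= 1      # token holder writes one message
            grants.append(position)
        else:
            idle_hops += 1              # token visit wasted on an idle cluster
        position = (position + 1) % num_clusters
        hops += 1
    return grants, idle_hops


# Cluster 1 has two messages and cluster 3 has one, out of four clusters.
grants, idle_hops = circulate_token({1: 2, 3: 1}, num_clusters=4)
print(grants, idle_hops)  # [1, 3, 1] 3
```

In this run three of the six token visits land on clusters with nothing to send, which is the kind of wasted opportunity the last bullet on the slide refers to.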
12 Evaluation Criteria

Resource | Value
Number of clusters | 64
Per-cluster:
L2 cache size/assoc | 4 MB/16-way
L2 cache line size | 64 B
L2 coherence | MOESI
Memory controllers | 1
Cores | 4
Per-core:
L1 ICache size/assoc | 16 KB/4-way
L1 DCache size/assoc | 32 KB/4-way
L1 I & D cache line size | 64 B
Frequency | 5 GHz
Threads | 4
Issue policy | In-order
Issue width | 2
64 b floating point SIMD width | 4
Fused floating point operations | Multiply-Add

Resource | Optically connected memory | ECM
Memory controllers | |
External connectivity | 256 fibers | 1536 pins
Channel width | 128 b half duplex | 12 b full duplex
Channel data rate | 10 Gb/s | 10 Gb/s
Memory bandwidth | TB/s | 0.96 TB/s
Memory latency | 20 ns | 20 ns

Synthetic benchmark | Description | # Network requests
Uniform | Uniform random | 1 M
Hot Spot | All clusters to one cluster | 1 M
Tornado | Cluster (i, j) to cluster ((i + ⌊k/2⌋ − 1) % k, (j + ⌊k/2⌋ − 1) % k), where k = network's radix | 1 M
Transpose | Cluster (i, j) to cluster (j, i) | 1 M

SPLASH-2 benchmark | Data set: experimental (default) | # Network requests
Barnes | 64 K particles (16 K) | 7.2 M
Cholesky | tk29.o (tk15.o) | 0.6 M
FFT | 16 M points (64 K) | 176 M
FMM | 1 M particles (16 K) | 1.8 M
LU | matrix ( ) | 34 M
Ocean | grid ( ) | 240 M
Radiosity | roomlarge (room) | 4.2 M
Radix | 64 M integers (1 M) | 189 M
Raytrace | balls4 (car) | 0.7 M
Volrend | head (head) | 3.6 M
Water-Sp | 32 K molecules (512) | 3.2 M

Table 3: Benchmarks and Configurations
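The four synthetic traffic patterns in the benchmark table can be written as destination functions. The sketch below is illustrative only: the function names are hypothetical, the Tornado mapping reads the slide's formula as ((i + ⌊k/2⌋ − 1) % k, (j + ⌊k/2⌋ − 1) % k), and the radix k = 8 assumes the 64 clusters are indexed as an 8 x 8 grid.

```python
import random


def uniform(i, j, k, rng=random):
    """Uniform random: any cluster in the k x k grid is equally likely."""
    return (rng.randrange(k), rng.randrange(k))


def hot_spot(i, j, k, target=(0, 0)):
    """Hot Spot: every cluster sends to the same cluster.

    The choice of (0, 0) as the target is an arbitrary assumption.
    """
    return target


def tornado(i, j, k):
    """Tornado: shift both coordinates by floor(k/2) - 1, wrapping at k."""
    shift = k // 2 - 1
    return ((i + shift) % k, (j + shift) % k)


def transpose(i, j, k):
    """Transpose: cluster (i, j) sends to cluster (j, i)."""
    return (j, i)


K = 8  # assumed radix for 64 clusters arranged 8 x 8
print(tornado(0, 0, K))    # (3, 3)
print(tornado(6, 7, K))    # (1, 2)
print(transpose(2, 5, K))  # (5, 2)
```

Tornado and Transpose are deterministic permutations, so each source always stresses the same destination, while Hot Spot concentrates all traffic on a single reader.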
13 Speedup
[Figure 8: Normalized Speedup. Y-axis: normalized speedup. X-axis: Uniform, Hot Spot, Tornado, Transpose, Barnes, Cholesky, FFT, FMM, LU, Ocean, Radiosity, Radix, Raytrace, Volrend, Water-Sp. Series: LMesh/ECM, HMesh/ECM, LMesh/, HMesh/, XBar/. One bar is annotated 13.5.]
14 Bandwidth
[Figure 9: Achieved Bandwidth. Y-axis: bandwidth (TB/s). X-axis: Uniform, Hot Spot, Tornado, Transpose, Barnes, Cholesky, FFT, FMM, LU, Ocean, Radiosity, Radix, Raytrace, Volrend, Water-Sp. Series: LMesh/ECM, HMesh/ECM, LMesh/, HMesh/, XBar/.]
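As a reference point for the achieved-bandwidth plot, the peak memory bandwidth of the ECM configuration can be cross-checked from the evaluation table. The sketch assumes 64 memory controllers in total (64 clusters with one controller each) and that peak bandwidth is simply controllers x channel width x per-bit data rate; the function name is hypothetical.

```python
def peak_memory_bandwidth_tb_per_s(controllers, channel_bits, gbit_per_s_per_bit):
    """Peak bandwidth in TB/s, assuming every channel bit runs at the full rate."""
    total_gbit_per_s = controllers * channel_bits * gbit_per_s_per_bit
    return total_gbit_per_s / 8 / 1000  # Gb/s -> GB/s -> TB/s (decimal units)


# ECM: 64 controllers, 12 b channel, 10 Gb/s per bit.
ecm = peak_memory_bandwidth_tb_per_s(64, 12, 10)
print(ecm)  # 0.96, matching the 0.96 TB/s listed for ECM

# Same formula applied to the 128 b optical channel. This is an estimate
# under the stated assumptions, not a figure quoted on the slide, and it
# does not account for the channel being half duplex.
optical_estimate = peak_memory_bandwidth_tb_per_s(64, 128, 10)
print(optical_estimate)
```

The ECM result reproducing the table's 0.96 TB/s suggests the simple product is the intended calculation for that column.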
15 L2 Miss Latency
[Figure 10: Average L2 Miss Latency. Y-axis: average request latency (ns). X-axis: Uniform, Hot Spot, Tornado, Transpose, Barnes, Cholesky, FFT, FMM, LU, Ocean, Radiosity, Radix, Raytrace, Volrend, Water-Sp. Series: LMesh/ECM, HMesh/ECM, LMesh/, HMesh/, XBar/.]
16 Power
[Figure 11: On-chip Network Power. Y-axis: power (W). X-axis: Uniform, Hot Spot, Tornado, Transpose, Barnes, Cholesky, FFT, FMM, LU, Ocean, Radiosity, Radix, Raytrace, Volrend, Water-Sp. Series: LMesh/ECM, HMesh/ECM, LMesh/, HMesh/, XBar/.]
More informationInternational Journal of Advanced Research in Computer Engineering &Technology (IJARCET) Volume 2, Issue 8, August 2013
ISSN: 2278 323 Signal Delay Control Based on Different Switching Techniques in Optical Routed Interconnection Networks Ahmed Nabih Zaki Rashed Electronics and Electrical Communications Engineering Department
More informationComputer Systems Architecture
Computer Systems Architecture Lecture 23 Mahadevan Gomathisankaran April 27, 2010 04/27/2010 Lecture 23 CSCE 4610/5610 1 Reminder ABET Feedback: http://www.cse.unt.edu/exitsurvey.cgi?csce+4610+001 Student
More informationLecture 2 Parallel Programming Platforms
Lecture 2 Parallel Programming Platforms Flynn s Taxonomy In 1966, Michael Flynn classified systems according to numbers of instruction streams and the number of data stream. Data stream Single Multiple
More informationFuture Memory and Interconnect Technologies
Future Memory and Interconnect Technologies Yuan Xie Pennsylvania State University, USA AMD Research, Advanced Micro Devices, Inc., USA Email: yuanxie@cse.psu.edu Abstract The improvement of the computer
More informationTDT Appendix E Interconnection Networks
TDT 4260 Appendix E Interconnection Networks Review Advantages of a snooping coherency protocol? Disadvantages of a snooping coherency protocol? Advantages of a directory coherency protocol? Disadvantages
More informationInterconnection Networks: Topology. Prof. Natalie Enright Jerger
Interconnection Networks: Topology Prof. Natalie Enright Jerger Topology Overview Definition: determines arrangement of channels and nodes in network Analogous to road map Often first step in network design
More informationTOPAZ: An Open-Source Interconnection Network Simulator for Chip Multiprocessors and Supercomputers
TOPAZ: An Open-Source Interconnection Network Simulator for Chip Multiprocessors and Supercomputers Pablo Abad, Pablo Prieto, Lucia Menezo, Adrian Colaso, Valentin Puente, Jose-Angel Gregorio University
More information