NOW and the Killer Network David E. Culler
|
|
- Thomasina Wilcox
- 5 years ago
- Views:
Transcription
1 NOW and the Killer Network David E. Culler NOW 1
2 Remember the Killer Micro 100,000,000 10,000,000 R10000 Pentium Transistors 1,000, ,000 i80286 i80386 R3000 R2000 i ,000 i8080 i4004 1, Year NOW 2
3 Timely Engineering of Large Systems SpecInt SpecFP Year NOW 3
4 90 s Technological Breakthrough the killer network the single-chip, high bandwidth, low-latency, high reliability building block for scalable networks => the new LAN? at least SAN (System Area Network) NOW 4
5 The Last Two Inches Problem Really Fast Workstations ns MS Network Stacks Drivers Network Interfaces µs Fast Scalable Networks NOW 5
6 Goals of the Research Fundamental change in how we design largescale computing systems snap together commodity components self-managing, self-tuning, highly available Make the killer network real realize the potential of emerging hardware technology and push its effect through the rest of the system Integrated system on a building-wide scale pool of resources (proc, disk mem) remote processor and memory closer than local disk federation of systems with local and global role The right way to build internet services NOW 6
7 NOW System Architecture Large Seq. Apps Parallel Apps Sockets, Split-C, MPI, HPF, vsm Global Layer Unix Resource Management Network RAM Distributed Files Process Migration Unix Workstation Unix Workstation Unix Workstation Unix Workstation Comm. SW Comm. SW Comm. SW Comm. SW Net Inter. HW Net Inter. HW Net Inter. HW Net Inter. HW Fast Commercial Switch (Myrinet) NOW 7
8 Intelligent Network Interfaces Processing power and storage embedded in the NIC Mryicom Net 160 MB/s Myricom NIC P M M I/O bus (S-Bus) 50 MB/s $ P Sun Ultra 170 NOW 8
9 AM: Fast, Portable Communication Overhead+Latency Gap (1/BW) NOW-SS20 NOW-Ultra Paragon Meiko NOW-SS20 NOW-Ultra Paragon Meiko µs g L Or Os NOW 9
10 MPI over AM System Start-up (µs) Peak BW (MB/s) NOW Paragon Meiko CS IBM SP NOW 10
11 Sockets over AM: Latency Time (µs) TCP 100BT TCP Fast Sockets GAM Transfer Size NOW 11
12 netperf, ttcp bandwidth Throughput (MB/s) Transfer Size FastSockets: netperf FastSockets: ttcp Myrinet TCP: netperf Myrinet TCP: ttcp NOW 12
13 Application Sensitivity to Overhead Slowdown ebarnes radix em3d sample p-ray murphi connect radb Overhead (µs) NOW 13
14 Sensitivity to gap (1/msg rate) Slowdown ebarnes radix sample em3d p-ray connect murphi radb gap (µs) NOW 14
15 Sensitivity to Latency Slowdown p-ray ebarnes connect em3d radix radb sample murphi Added latency (us) NOW 15
16 General Purpose, fault-tolerant Virtual Networks Many User / System Process User Processes NIC NIC Segment Driver Segment Driver Dynamic binding of multiple virtual network end-points directly to physical NIC resources Deep integration with VM and threads Smart NIC mux-demux, errors, and flow-control Clean error model - return-to-sender NOW 16
17 Truely Distibuted File System XFS Scalable Low-Latency Communication Network Cooperative Caching Local Cache P P P P P P P P File Cache File Cache File Cache File Cache File Cache File Cache File Cache File Cache Log Structured File Stripe Group G = Node Comm BW / Disk BW NOW 17
18 Virtualization of O.S. Services system call Std. Unix User Process redirection System Call translation modified system call on user s behalf Conventional Unix Kernel Small kernel insertion provides redirection to user-level translation facilty Translation provide global-local mapping of system functions for std. binaries NOW 18
19 World-Record Disk-to-Disk Sort Gigabytes sorted Minute Sort SGI Processors Seconds Datamation Million Record Sort SGI Processors NOW 19
20 Toward a Web O.S. Build on basic technology developed for NOW to provide a powerful operating environment for advanced web applications Global (URL-based) file system imports home environment to NOW and vice versa build services on a cache-coherent global file system OS sandbox isolates foreign entity Smart (java-based) browsers provide scalable, interactive front-end Rent-a-server when you re too hot Cooperative web caching around the planet Interactive services NOW 20
21 Toward Immense Disk Clusters NOW 21
22 Clusters of SMPs (CLUMPS) NOW 22
23 Overall System Configuration NOW-2 Myricom Scalable Network 100 Ultras 200 disks Tertiary Disk 24 PCs 400 disks 4 x 8 SMPs Switched EtherNet Servers ATM Backbone LBL-NERSC Router UCB NPACI NOW 23
24 Invitation System is operational enough for research CS267 is using it heavily Think about it for term projects CS252, CS262, CS286,... Ready to work with other research groups see: NOW 24
Parallel Computing Trends: from MPPs to NoWs
Parallel Computing Trends: from MPPs to NoWs (from Massively Parallel Processors to Networks of Workstations) Fall Research Forum Oct 18th, 1994 Thorsten von Eicken Department of Computer Science Cornell
More informationScalable Distributed Memory Machines
Scalable Distributed Memory Machines Goal: Parallel machines that can be scaled to hundreds or thousands of processors. Design Choices: Custom-designed or commodity nodes? Network scalability. Capability
More informationChapter 2: Computer-System Structures. Hmm this looks like a Computer System?
Chapter 2: Computer-System Structures Lab 1 is available online Last lecture: why study operating systems? Purpose of this lecture: general knowledge of the structure of a computer system and understanding
More informationUniprocessor Computer Architecture Example: Cray T3E
Chapter 2: Computer-System Structures MP Example: Intel Pentium Pro Quad Lab 1 is available online Last lecture: why study operating systems? Purpose of this lecture: general knowledge of the structure
More informationAdapted from: TRENDS AND ATTRIBUTES OF HORIZONTAL AND VERTICAL COMPUTING ARCHITECTURES
Adapted from: TRENDS AND ATTRIBUTES OF HORIZONTAL AND VERTICAL COMPUTING ARCHITECTURES Tom Atwood Business Development Manager Sun Microsystems, Inc. Takeaways Understand the technical differences between
More informationLow Cost Supercomputing. Rajkumar Buyya, Monash University, Melbourne, Australia. Parallel Processing on Linux Clusters
N Low Cost Supercomputing o Parallel Processing on Linux Clusters Rajkumar Buyya, Monash University, Melbourne, Australia. rajkumar@ieee.org http://www.dgs.monash.edu.au/~rajkumar Agenda Cluster? Enabling
More informationCSC501 Operating Systems Principles. OS Structure
CSC501 Operating Systems Principles OS Structure 1 Announcements q TA s office hour has changed Q Thursday 1:30pm 3:00pm, MRC-409C Q Or email: awang@ncsu.edu q From department: No audit allowed 2 Last
More informationConvergence of Parallel Architecture
Parallel Computing Convergence of Parallel Architecture Hwansoo Han History Parallel architectures tied closely to programming models Divergent architectures, with no predictable pattern of growth Uncertainty
More informationCS 350 Winter 2011 Current Topics: Virtual Machines + Solid State Drives
CS 350 Winter 2011 Current Topics: Virtual Machines + Solid State Drives Virtual Machines Resource Virtualization Separating the abstract view of computing resources from the implementation of these resources
More informationWhat are Clusters? Why Clusters? - a Short History
What are Clusters? Our definition : A parallel machine built of commodity components and running commodity software Cluster consists of nodes with one or more processors (CPUs), memory that is shared by
More informationPerformance Evaluation of Myrinet-based Network Router
Performance Evaluation of Myrinet-based Network Router Information and Communications University 2001. 1. 16 Chansu Yu, Younghee Lee, Ben Lee Contents Suez : Cluster-based Router Suez Implementation Implementation
More informationLecture 9: MIMD Architectures
Lecture 9: MIMD Architectures Introduction and classification Symmetric multiprocessors NUMA architecture Clusters Zebo Peng, IDA, LiTH 1 Introduction A set of general purpose processors is connected together.
More informationSeparating Access Control Policy, Enforcement, and Functionality in Extensible Systems. Robert Grimm University of Washington
Separating Access Control Policy, Enforcement, and Functionality in Extensible Systems Robert Grimm University of Washington Extensions Added to running system Interact through low-latency interfaces Form
More informationParallel Computer Architecture II
Parallel Computer Architecture II Stefan Lang Interdisciplinary Center for Scientific Computing (IWR) University of Heidelberg INF 368, Room 532 D-692 Heidelberg phone: 622/54-8264 email: Stefan.Lang@iwr.uni-heidelberg.de
More informationLecture 1: Course Introduction and Overview Prof. Randy H. Katz Computer Science 252 Spring 1996
Lecture 1: Course Introduction and Overview Prof. Randy H. Katz Computer Science 252 Spring 1996 RHK.S96 1 Computer Architecture Is the attributes of a [computing] system as seen by the programmer, i.e.,
More informationLearning Curve for Parallel Applications. 500 Fastest Computers
Learning Curve for arallel Applications ABER molecular dynamics simulation program Starting point was vector code for Cray-1 145 FLO on Cray90, 406 for final version on 128-processor aragon, 891 on 128-processor
More informationCS550. TA: TBA Office: xxx Office hours: TBA. Blackboard:
CS550 Advanced Operating Systems (Distributed Operating Systems) Instructor: Xian-He Sun Email: sun@iit.edu, Phone: (312) 567-5260 Office hours: 1:30pm-2:30pm Tuesday, Thursday at SB229C, or by appointment
More informationThe Overall SHRIMP Project. Related Work (from last section) Paper Goals. Bill Kramer April 17, 2002
CS 258 Reading Assignment 16 Discussion Design Choice in the SHRIMP System: An Empirical Study Bill Kramer April 17, 2002 # The Overall SHRIMP Project The SHRIMP (Scalable High-performance Really Inexpensive
More informationReal Parallel Computers
Real Parallel Computers Modular data centers Background Information Recent trends in the marketplace of high performance computing Strohmaier, Dongarra, Meuer, Simon Parallel Computing 2005 Short history
More informationTHE U-NET USER-LEVEL NETWORK ARCHITECTURE. Joint work with Werner Vogels, Anindya Basu, and Vineet Buch. or: it s easy to buy high-speed networks
Thorsten von Eicken Dept of Computer Science tve@cs.cornell.edu Cornell niversity THE -NET SER-LEVEL NETWORK ARCHITECTRE or: it s easy to buy high-speed networks but making them work is another story NoW
More informationXytech MediaPulse Equipment Guidelines (Version 8 and Sky)
Xytech MediaPulse Equipment Guidelines (Version 8 and Sky) MediaPulse Architecture Xytech Systems MediaPulse solution utilizes a multitier architecture, requiring at minimum three server roles: a database
More informationBlueGene/L. Computer Science, University of Warwick. Source: IBM
BlueGene/L Source: IBM 1 BlueGene/L networking BlueGene system employs various network types. Central is the torus interconnection network: 3D torus with wrap-around. Each node connects to six neighbours
More informationSurvey Of Volume Managers
Survey Of Volume Managers Nasser M. Abbasi May 24, 2000 page compiled on June 28, 2015 at 10:44am Contents 1 Advantages of Volume Managers 1 2 Terminology used in LVM software 1 3 Survey of Volume Managers
More informationXytech MediaPulse Equipment Guidelines (Version 8 and Sky)
Xytech MediaPulse Equipment Guidelines (Version 8 and Sky) MediaPulse Architecture Xytech s MediaPulse solution utilizes a multitier architecture, requiring at minimum three server roles: a database server,
More information1/5/2012. Overview of Interconnects. Presentation Outline. Myrinet and Quadrics. Interconnects. Switch-Based Interconnects
Overview of Interconnects Myrinet and Quadrics Leading Modern Interconnects Presentation Outline General Concepts of Interconnects Myrinet Latest Products Quadrics Latest Release Our Research Interconnects
More informationProcessor Architecture and Interconnect
Processor Architecture and Interconnect What is Parallelism? Parallel processing is a term used to denote simultaneous computation in CPU for the purpose of measuring its computation speeds. Parallel Processing
More informationMPI and comparison of models Lecture 23, cs262a. Ion Stoica & Ali Ghodsi UC Berkeley April 16, 2018
MPI and comparison of models Lecture 23, cs262a Ion Stoica & Ali Ghodsi UC Berkeley April 16, 2018 MPI MPI - Message Passing Interface Library standard defined by a committee of vendors, implementers,
More informationNUMA replicated pagecache for Linux
NUMA replicated pagecache for Linux Nick Piggin SuSE Labs January 27, 2008 0-0 Talk outline I will cover the following areas: Give some NUMA background information Introduce some of Linux s NUMA optimisations
More informationWorkload Optimized Systems: The Wheel of Reincarnation. Michael Sporer, Netezza Appliance Hardware Architect 21 April 2013
Workload Optimized Systems: The Wheel of Reincarnation Michael Sporer, Netezza Appliance Hardware Architect 21 April 2013 Outline Definition Technology Minicomputers Prime Workstations Apollo Graphics
More informationSMP and ccnuma Multiprocessor Systems. Sharing of Resources in Parallel and Distributed Computing Systems
Reference Papers on SMP/NUMA Systems: EE 657, Lecture 5 September 14, 2007 SMP and ccnuma Multiprocessor Systems Professor Kai Hwang USC Internet and Grid Computing Laboratory Email: kaihwang@usc.edu [1]
More informationDistributed OS and Algorithms
Distributed OS and Algorithms Fundamental concepts OS definition in general: OS is a collection of software modules to an extended machine for the users viewpoint, and it is a resource manager from the
More informationMIMD Overview. Intel Paragon XP/S Overview. XP/S Usage. XP/S Nodes and Interconnection. ! Distributed-memory MIMD multicomputer
MIMD Overview Intel Paragon XP/S Overview! MIMDs in the 1980s and 1990s! Distributed-memory multicomputers! Intel Paragon XP/S! Thinking Machines CM-5! IBM SP2! Distributed-memory multicomputers with hardware
More informationCOSC 6385 Computer Architecture - Multi Processor Systems
COSC 6385 Computer Architecture - Multi Processor Systems Fall 2006 Classification of Parallel Architectures Flynn s Taxonomy SISD: Single instruction single data Classical von Neumann architecture SIMD:
More informationCluster Computing. Interconnect Technologies for Clusters
Interconnect Technologies for Clusters Interconnect approaches WAN infinite distance LAN Few kilometers SAN Few meters Backplane Not scalable Physical Cluster Interconnects FastEther Gigabit EtherNet 10
More informationAgenda. System Performance Scaling of IBM POWER6 TM Based Servers
System Performance Scaling of IBM POWER6 TM Based Servers Jeff Stuecheli Hot Chips 19 August 2007 Agenda Historical background POWER6 TM chip components Interconnect topology Cache Coherence strategies
More informationOutline. Execution Environments for Parallel Applications. Supercomputers. Supercomputers
Outline Execution Environments for Parallel Applications Master CANS 2007/2008 Departament d Arquitectura de Computadors Universitat Politècnica de Catalunya Supercomputers OS abstractions Extended OS
More informationLecture 9: MIMD Architectures
Lecture 9: MIMD Architectures Introduction and classification Symmetric multiprocessors NUMA architecture Clusters Zebo Peng, IDA, LiTH 1 Introduction MIMD: a set of general purpose processors is connected
More informationHigh Performance MPI on IBM 12x InfiniBand Architecture
High Performance MPI on IBM 12x InfiniBand Architecture Abhinav Vishnu, Brad Benton 1 and Dhabaleswar K. Panda {vishnu, panda} @ cse.ohio-state.edu {brad.benton}@us.ibm.com 1 1 Presentation Road-Map Introduction
More informationOn the cost of tunnel endpoint processing in overlay virtual networks
J. Weerasinghe; NVSDN2014, London; 8 th December 2014 On the cost of tunnel endpoint processing in overlay virtual networks J. Weerasinghe & F. Abel IBM Research Zurich Laboratory Outline Motivation Overlay
More informationDistributed Computing: PVM, MPI, and MOSIX. Multiple Processor Systems. Dr. Shaaban. Judd E.N. Jenne
Distributed Computing: PVM, MPI, and MOSIX Multiple Processor Systems Dr. Shaaban Judd E.N. Jenne May 21, 1999 Abstract: Distributed computing is emerging as the preferred means of supporting parallel
More informationChapter 9 Multiprocessors
ECE200 Computer Organization Chapter 9 Multiprocessors David H. lbonesi and the University of Rochester Henk Corporaal, TU Eindhoven, Netherlands Jari Nurmi, Tampere University of Technology, Finland University
More informationFast access ===> use map to find object. HW == SW ===> map is in HW or SW or combo. Extend range ===> longer, hierarchical names
Fast access ===> use map to find object HW == SW ===> map is in HW or SW or combo Extend range ===> longer, hierarchical names How is map embodied: --- L1? --- Memory? The Environment ---- Long Latency
More informationMemory Management Strategies for Data Serving with RDMA
Memory Management Strategies for Data Serving with RDMA Dennis Dalessandro and Pete Wyckoff (presenting) Ohio Supercomputer Center {dennis,pw}@osc.edu HotI'07 23 August 2007 Motivation Increasing demands
More informationGPFS on a Cray XT. Shane Canon Data Systems Group Leader Lawrence Berkeley National Laboratory CUG 2009 Atlanta, GA May 4, 2009
GPFS on a Cray XT Shane Canon Data Systems Group Leader Lawrence Berkeley National Laboratory CUG 2009 Atlanta, GA May 4, 2009 Outline NERSC Global File System GPFS Overview Comparison of Lustre and GPFS
More informationA Disseminated Distributed OS for Hardware Resource Disaggregation Yizhou Shan
LegoOS A Disseminated Distributed OS for Hardware Resource Disaggregation Yizhou Shan, Yutong Huang, Yilun Chen, and Yiying Zhang Y 4 1 2 Monolithic Server OS / Hypervisor 3 Problems? 4 cpu mem Resource
More informationCommunication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems.
Cluster Networks Introduction Communication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems. As usual, the driver is performance
More informationRoadmap for Challenging Times System Virtualiztion
Roadmap for Challenging Times System Virtualiztion Most people thinking VIRTUALIZION as a strategy to CONSOLIDATE systems and reduce cost System Virtualization Grid Control Plane Virtualized Storage Resources
More information10-Gigabit iwarp Ethernet: Comparative Performance Analysis with InfiniBand and Myrinet-10G
10-Gigabit iwarp Ethernet: Comparative Performance Analysis with InfiniBand and Myrinet-10G Mohammad J. Rashti and Ahmad Afsahi Queen s University Kingston, ON, Canada 2007 Workshop on Communication Architectures
More informationNOW Handout Page 1. Recap: Gigaplane Bus Timing. Scalability
Recap: Gigaplane Bus Timing 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Address Rd A Rd B Scalability State Arbitration 1 4,5 2 Share ~Own 6 Own 7 A D A D A D A D A D A D A D A D CS 258, Spring 99 David E. Culler
More informationIO System. CP-226: Computer Architecture. Lecture 25 (24 April 2013) CADSL
IO System Virendra Singh Associate Professor Computer Architecture and Dependable Systems Lab Department of Electrical Engineering Indian Institute of Technology Bombay http://www.ee.iitb.ac.in/~viren/
More informationComputer Science 146. Computer Architecture
Computer Science 46 Computer Architecture Spring 24 Harvard University Instructor: Prof dbrooks@eecsharvardedu Lecture 22: More I/O Computer Science 46 Lecture Outline HW5 and Project Questions? Storage
More informationHuge market -- essentially all high performance databases work this way
11/5/2017 Lecture 16 -- Parallel & Distributed Databases Parallel/distributed databases: goal provide exactly the same API (SQL) and abstractions (relational tables), but partition data across a bunch
More informationThe NE010 iwarp Adapter
The NE010 iwarp Adapter Gary Montry Senior Scientist +1-512-493-3241 GMontry@NetEffect.com Today s Data Center Users Applications networking adapter LAN Ethernet NAS block storage clustering adapter adapter
More informationThe power of centralized computing at your fingertips
Pinnacle 3 Professional The power of centralized computing at your fingertips Philips Pinnacle 3 Professional specifications The power of centralized computing in a scalable offering for mid-size clinics
More informationDheeraj Bhardwaj May 12, 2003
HPC Systems and Models Dheeraj Bhardwaj Department of Computer Science & Engineering Indian Institute of Technology, Delhi 110 016 India http://www.cse.iitd.ac.in/~dheerajb 1 Sequential Computers Traditional
More informationLow-Latency Message Passing on Workstation Clusters using SCRAMNet 1 2
Low-Latency Message Passing on Workstation Clusters using SCRAMNet 1 2 Vijay Moorthy, Matthew G. Jacunski, Manoj Pillai,Peter, P. Ware, Dhabaleswar K. Panda, Thomas W. Page Jr., P. Sadayappan, V. Nagarajan
More informationEngineering Fault-Tolerant TCP/IP servers using FT-TCP. Dmitrii Zagorodnov University of California San Diego
Engineering Fault-Tolerant TCP/IP servers using FT-TCP Dmitrii Zagorodnov University of California San Diego Motivation Reliable network services are desirable but costly! Extra and/or specialized hardware
More informationLecture 5: Process Description and Control Multithreading Basics in Interprocess communication Introduction to multiprocessors
Lecture 5: Process Description and Control Multithreading Basics in Interprocess communication Introduction to multiprocessors 1 Process:the concept Process = a program in execution Example processes:
More informationSEDA: An Architecture for Well-Conditioned, Scalable Internet Services
SEDA: An Architecture for Well-Conditioned, Scalable Internet Services Matt Welsh, David Culler, and Eric Brewer Computer Science Division University of California, Berkeley Operating Systems Principles
More informationParallel Computers. CPE 631 Session 20: Multiprocessors. Flynn s Tahonomy (1972) Why Multiprocessors?
Parallel Computers CPE 63 Session 20: Multiprocessors Department of Electrical and Computer Engineering University of Alabama in Huntsville Definition: A parallel computer is a collection of processing
More informationNode Hardware. Performance Convergence
Node Hardware Improved microprocessor performance means availability of desktop PCs with performance of workstations (and of supercomputers of 10 years ago) at significanty lower cost Parallel supercomputers
More informationMultiprocessors and Thread Level Parallelism Chapter 4, Appendix H CS448. The Greed for Speed
Multiprocessors and Thread Level Parallelism Chapter 4, Appendix H CS448 1 The Greed for Speed Two general approaches to making computers faster Faster uniprocessor All the techniques we ve been looking
More informationDynamic Partitioned Global Address Spaces for Power Efficient DRAM Virtualization
Dynamic Partitioned Global Address Spaces for Power Efficient DRAM Virtualization Jeffrey Young, Sudhakar Yalamanchili School of Electrical and Computer Engineering, Georgia Institute of Technology Talk
More informationSystem input-output, performance aspects March 2009 Guy Chesnot
Headline in Arial Bold 30pt System input-output, performance aspects March 2009 Guy Chesnot Agenda Data sharing Evolution & current tendencies Performances: obstacles Performances: some results and good
More informationMellanox InfiniBand Solutions Accelerate Oracle s Data Center and Cloud Solutions
Mellanox InfiniBand Solutions Accelerate Oracle s Data Center and Cloud Solutions Providing Superior Server and Storage Performance, Efficiency and Return on Investment As Announced and Demonstrated at
More informationBest Practices for Deploying a Mixed 1Gb/10Gb Ethernet SAN using Dell EqualLogic Storage Arrays
Dell EqualLogic Best Practices Series Best Practices for Deploying a Mixed 1Gb/10Gb Ethernet SAN using Dell EqualLogic Storage Arrays A Dell Technical Whitepaper Jerry Daugherty Storage Infrastructure
More informationOctopus: A Multi-core implementation
Octopus: A Multi-core implementation Kalpesh Sheth HPEC 2007, MIT, Lincoln Lab Export of this products is subject to U.S. export controls. Licenses may be required. This material provides up-to-date general
More informationChapter 6. Storage and Other I/O Topics
Chapter 6 Storage and Other I/O Topics Introduction I/O devices can be characterized by Behaviour: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec I/O bus connections
More informationAN O/S PERSPECTIVE ON NETWORKS Adem Efe Gencer 1. October 4 th, Department of Computer Science, Cornell University
AN O/S PERSPECTIVE ON NETWORKS Adem Efe Gencer 1 October 4 th, 2012 1 Department of Computer Science, Cornell University Papers 2 Active Messages: A Mechanism for Integrated Communication and Control,
More informationSorting. Overview. External sorting. Warm up: in memory sorting. Purpose. Overview. Sort benchmarks
15-823 Advanced Topics in Database Systems Performance Sorting Shimin Chen School of Computer Science Carnegie Mellon University 22 March 2001 Sort benchmarks A base case: AlphaSort Improving Sort Performance
More informationLecture 23: I/O Redundant Arrays of Inexpensive Disks Professor Randy H. Katz Computer Science 252 Spring 1996
Lecture 23: I/O Redundant Arrays of Inexpensive Disks Professor Randy H Katz Computer Science 252 Spring 996 RHKS96 Review: Storage System Issues Historical Context of Storage I/O Storage I/O Performance
More informationHW Trends and Architectures
Pavel Tvrdík, Jiří Kašpar (ČVUT FIT) HW Trends and Architectures MI-POA, 2011, Lecture 1 1/29 HW Trends and Architectures prof. Ing. Pavel Tvrdík CSc. Ing. Jiří Kašpar Department of Computer Systems Faculty
More informationHardware and Software solutions for scaling highly threaded processors. Denis Sheahan Distinguished Engineer Sun Microsystems Inc.
Hardware and Software solutions for scaling highly threaded processors Denis Sheahan Distinguished Engineer Sun Microsystems Inc. Agenda Chip Multi-threaded concepts Lessons learned from 6 years of CMT
More informationFast access ===> use map to find object. HW == SW ===> map is in HW or SW or combo. Extend range ===> longer, hierarchical names
Fast access ===> use map to find object HW == SW ===> map is in HW or SW or combo Extend range ===> longer, hierarchical names How is map embodied: --- L1? --- Memory? The Environment ---- Long Latency
More informationMiAMI: Multi-Core Aware Processor Affinity for TCP/IP over Multiple Network Interfaces
MiAMI: Multi-Core Aware Processor Affinity for TCP/IP over Multiple Network Interfaces Hye-Churn Jang Hyun-Wook (Jin) Jin Department of Computer Science and Engineering Konkuk University Seoul, Korea {comfact,
More informationImpact of Cache Coherence Protocols on the Processing of Network Traffic
Impact of Cache Coherence Protocols on the Processing of Network Traffic Amit Kumar and Ram Huggahalli Communication Technology Lab Corporate Technology Group Intel Corporation 12/3/2007 Outline Background
More informationMoneta: A High-Performance Storage Architecture for Next-generation, Non-volatile Memories
Moneta: A High-Performance Storage Architecture for Next-generation, Non-volatile Memories Adrian M. Caulfield Arup De, Joel Coburn, Todor I. Mollov, Rajesh K. Gupta, Steven Swanson Non-Volatile Systems
More informationReal Parallel Computers
Real Parallel Computers Modular data centers Overview Short history of parallel machines Cluster computing Blue Gene supercomputer Performance development, top-500 DAS: Distributed supercomputing Short
More informationCOP Cloud Computing. Presented by: Sanketh Beerabbi University of Central Florida
COP6087 - Cloud Computing Presented by: Sanketh Beerabbi University of Central Florida A cloud is a collection of networked resources configured such that users can request scalable resources (VMs, platforms,
More informationAdvanced Operating Systems (CS 202)
Advanced Operating Systems (CS 202) Presenter today: Khaled N. Khasawneh Instructor: Nael Abu-Ghazaleh Jan, 9, 2016 Today Course organization and mechanics Introduction to OS 2 What is this course about?
More informationPatagonia Cluster Project Research Cluster
Patagonia Cluster Project Research Cluster Clusters of PCs Multi-Boot and Multi-Purpose? Christian Kurmann, Felix Rauch, Michela Taufer, Prof. Thomas M. Stricker Laboratory for Computer Systems ETHZ -
More informationA Case for High Performance Computing with Virtual Machines
A Case for High Performance Computing with Virtual Machines Wei Huang*, Jiuxing Liu +, Bulent Abali +, and Dhabaleswar K. Panda* *The Ohio State University +IBM T. J. Waston Research Center Presentation
More informationI/O Buffering and Streaming
I/O Buffering and Streaming I/O Buffering and Caching I/O accesses are reads or writes (e.g., to files) Application access is arbitary (offset, len) Convert accesses to read/write of fixed-size blocks
More informationIO virtualization. Michael Kagan Mellanox Technologies
IO virtualization Michael Kagan Mellanox Technologies IO Virtualization Mission non-stop s to consumers Flexibility assign IO resources to consumer as needed Agility assignment of IO resources to consumer
More informationIntroduction. Operating Systems. Introduction. Introduction. Introduction
Operating Systems User OS Kernel & Device Drivers Interface Programs Instructor Brian Mitchell - Brian bmitchel@mcs.drexel.edu www.mcs.drexel.edu/~bmitchel TA - To Be Announced Course Information MCS720
More informationIntel Enterprise Processors Technology
Enterprise Processors Technology Kosuke Hirano Enterprise Platforms Group March 20, 2002 1 Agenda Architecture in Enterprise Xeon Processor MP Next Generation Itanium Processor Interconnect Technology
More informationECE/CS 757: Advanced Computer Architecture II Interconnects
ECE/CS 757: Advanced Computer Architecture II Interconnects Instructor:Mikko H Lipasti Spring 2017 University of Wisconsin-Madison Lecture notes created by Natalie Enright Jerger Lecture Outline Introduction
More informationPerformance of DB2 Enterprise-Extended Edition on NT with Virtual Interface Architecture
Performance of DB2 Enterprise-Extended Edition on NT with Virtual Interface Architecture Sivakumar Harinath 1, Robert L. Grossman 1, K. Bernhard Schiefer 2, Xun Xue 2, and Sadique Syed 2 1 Laboratory of
More informationCOSC4201 Multiprocessors
COSC4201 Multiprocessors Prof. Mokhtar Aboelaze Parts of these slides are taken from Notes by Prof. David Patterson (UCB) Multiprocessing We are dedicating all of our future product development to multicore
More informationRapidIO.org Update. Mar RapidIO.org 1
RapidIO.org Update rickoco@rapidio.org Mar 2015 2015 RapidIO.org 1 Outline RapidIO Overview & Markets Data Center & HPC Communications Infrastructure Industrial Automation Military & Aerospace RapidIO.org
More informationNo Tradeoff Low Latency + High Efficiency
No Tradeoff Low Latency + High Efficiency Christos Kozyrakis http://mast.stanford.edu Latency-critical Applications A growing class of online workloads Search, social networking, software-as-service (SaaS),
More informationDistributed Systems Principles and Paradigms
Distributed Systems Principles and Paradigms Chapter 03 (version February 11, 2008) Maarten van Steen Vrije Universiteit Amsterdam, Faculty of Science Dept. Mathematics and Computer Science Room R4.20.
More informationCOSC 6385 Computer Architecture. - Memory Hierarchies (I)
COSC 6385 Computer Architecture - Hierarchies (I) Fall 2007 Slides are based on a lecture by David Culler, University of California, Berkley http//www.eecs.berkeley.edu/~culler/courses/cs252-s05 Recap
More informationThe MOSIX Scalable Cluster Computing for Linux. mosix.org
The MOSIX Scalable Cluster Computing for Linux Prof. Amnon Barak Computer Science Hebrew University http://www. mosix.org 1 Presentation overview Part I : Why computing clusters (slide 3-7) Part II : What
More informationParallel Programming Platforms
arallel rogramming latforms Ananth Grama Computing Research Institute and Department of Computer Sciences, urdue University ayg@cspurdueedu http://wwwcspurdueedu/people/ayg Reference: Introduction to arallel
More informationCS Computer Architecture
CS 35101 Computer Architecture Section 600 Dr. Angela Guercio Fall 2010 Structured Computer Organization A computer s native language, machine language, is difficult for human s to use to program the computer
More informationLecture 1 Introduction (Chapter 1 of Textbook)
Bilkent University Department of Computer Engineering CS342 Operating Systems Lecture 1 Introduction (Chapter 1 of Textbook) Dr. İbrahim Körpeoğlu http://www.cs.bilkent.edu.tr/~korpe 1 References The slides
More informationParallel & Cluster Computing. cs 6260 professor: elise de doncker by: lina hussein
Parallel & Cluster Computing cs 6260 professor: elise de doncker by: lina hussein 1 Topics Covered : Introduction What is cluster computing? Classification of Cluster Computing Technologies: Beowulf cluster
More informationGFS: The Google File System
GFS: The Google File System Brad Karp UCL Computer Science CS GZ03 / M030 24 th October 2014 Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one
More informationcsci3411: Operating Systems
csci3411: Operating Systems Lecture 3: System structure and Processes Gabriel Parmer Some slide material from Silberschatz and West System Structure System Structure How different parts of software 1)
More information