The Supercomputer Industry in Light of the Top500 Data


Dror G. Feitelson
School of Computer Science and Engineering
The Hebrew University of Jerusalem
Jerusalem, Israel

Abstract

The Top500 list ranks the 500 most powerful computers installed worldwide, and has been updated semiannually for the last 10 years. Analyzing this data enables an impartial assessment of the current state and the development trends of the supercomputer industry, and sheds some light on the challenges it faces.

Introduction

High-performance computing is now widely considered a strategic asset. The supercomputer industry, as the creator of this capability, is therefore very important, and is often also a source of national pride. At the same time, this industry has traditionally been a leader in the development of computer technology. It is therefore interesting to analyze the state of this industry and its development trends. Luckily, data for such an analysis is available in the form of the Top500 list, which now covers 10 years. By identifying invariants and trends in this data, we can predict growth rates, comment on technology, and identify limiting factors.

The Top500 List

In 1986, Hans Meuer started publishing lists of system counts for different supercomputer vendors. Even earlier, in 1979, Jack Dongarra formulated the Linpack benchmark, and started using it to gauge the capabilities of various models and configurations of computers manufactured by different vendors [4].

In 1993 these two efforts led to the establishment of the Top500 list, which ranks the 500 most powerful computer systems installed around the world. The emphasis on actual installations is crucial: it added a dimension of market success, and removed (or at least significantly reduced) the clutter caused by the numerous computer models and configurations that were offered but not used. The list of the top-ranking 500 installations worldwide has been updated twice a year, in June and November, ever since [5]. Over the years the list has gained in reputation, with national labs vying to host machines that contend for the top spot. It has been used to gauge the standing of different vendors and different countries in terms of the production and use of supercomputers, and also to comment on the state of the industry as a whole [7, 2, 12].

However, the list is not without its shortcomings. Chief among them is the use of the Linpack benchmark to rank the machines. This benchmark focuses on dense matrix operations, and tends to report optimistic performance figures that are closer to peak performance than those typically observed in practice with production applications. Moreover, the discrepancy between the Linpack measurements and the performance measured by other applications may differ for different classes of computers, so the relative ranking could change when using other benchmarks. In addition, Linpack is not a throughput benchmark, and there is no regard for the efficiency of the work done or for the loss of resources to fragmentation when multiple jobs are executed concurrently. Finally, it does not take cost into account [3, 9, 11, 15, 10]. Nevertheless, Linpack has the advantage that it is easily ported and measured, making the Top500 list the only list of its kind. And the accumulated data from 10 years allows for some interesting insights that are at least partly independent of the actual mechanism used for the ranking. In the following analysis we use the November lists, and denote the maximal computation rate achieved on the Linpack benchmark by Rmax.

Ranking and Performance Predictions

It has been widely observed that the performance of machines at a certain rank, and of the list as a whole, grows exponentially with time. This can be seen by tabulating the log of the Rmax values as a function of year for a given rank, which leads to a straight line (Figure 1). The slope, however, depends on rank: performing a linear regression on data from 1997 to 2003, Rmax doubles every 13.5 months for the rank 500 machines, but only about every 17 months for the rank 16 machines (the data at the very highest ranks is too noisy for an accurate estimate). In other words, the bottom of the list is improving faster than the top.

[Figure 1: Increase in computational power with time, as measured using Linpack.]

[Figure 2: Distribution of computing power across the lists.]

In any case, the rate across the list is somewhat lower than the doubling every year reported in the past [7], but it is still faster than the doubling every 18 months predicted by Moore's Law.
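The doubling times quoted above come from a linear regression of log Rmax against year at a fixed rank. A minimal sketch of that calculation follows; the (year, Rmax) values are illustrative placeholders rather than actual Top500 entries, so only the procedure, not the numbers, reflects the paper.

```python
# Sketch of the doubling-time regression described above. The data points are
# hypothetical stand-ins for the Rmax of one fixed rank across the November
# lists; real use would substitute the actual Top500 values for that rank.
import numpy as np

years = np.array([1997, 1998, 1999, 2000, 2001, 2002, 2003], dtype=float)
rmax = np.array([9.5e3, 17e3, 33e3, 55e3, 96e3, 195e3, 403e3])  # illustrative values only

# Fit log2(Rmax) against year; the slope is the number of doublings per year.
slope, _ = np.polyfit(years, np.log2(rmax), 1)
print(f"Rmax doubles roughly every {12.0 / slope:.1f} months")
```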

By tabulating the Rmax values as a function of rank on log-log axes, we find that the computational power drops off polynomially with rank (Figure 2). To measure this, we perform a least-squares regression starting from position 32 to the end of the list (the data near the top of the list tends to be much noisier). The results indicate that the slope of the line is becoming slightly smaller with time (except in 2000, when it grew). Again, this means that the bottom of the list is actually growing at a slightly faster rate than the top. The current value of the slope is about 0.7, which means that as we double the rank, the computational power drops to about 63% of what it was at the original rank.

Based on these regularities, we can actually make crude predictions of the Rmax values at different ranks in future years. We know that Rmax grows exponentially with time. Assuming an average time constant of about 1.15 years, we can write

    Rmax(r, t) = Rmax(r, t0) · 2^((t − t0)/1.15)

where Rmax(r, t) is the Linpack rating at rank r at time t, Rmax(r, t0) is the rating at this rank at time t0, and t and t0 are measured in years. We also know that log(Rmax) drops linearly with log(rank), with a slope of about 0.7,

    log(Rmax(r)) = C − 0.7 log(r)

which leads to

    Rmax(r) = C' / r^0.7

Using this to calculate the Rmax value at a rank that is a factor of α higher, we get

    Rmax(αr) = C' / (αr)^0.7 = Rmax(r) / α^0.7

implying that Rmax changes by a factor of 1/α^0.7. Putting all the above together, we get the approximate expression

    Rmax(r, t) = Rmax(r0, t0) · (r0/r)^0.7 · 2^((t − t0)/1.15)

As an example, let us use the Rmax of the rank 500 machine in 1997, which is 9513, to estimate the Rmax of the rank 100 machine in 2003. The formula gives 9513 · (500/100)^0.7 · 2^((2003−1997)/1.15) = 1,091,848. The true value is 1,142,000, so the error is less than 5%.

If the slope of log(Rmax) by log(rank) does indeed decrease, the consequence is that machines will tend to fall off the list faster, and the list will end up dominated by new machines.
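The prediction formula above is simple enough to capture in a few lines of code. The sketch below merely restates Rmax(r, t) = Rmax(r0, t0) · (r0/r)^0.7 · 2^((t − t0)/1.15) as a Python function and reproduces the worked example from the text; the function name and default arguments are mine, not part of the paper.

```python
# Sketch of the extrapolation formula derived above; the 0.7 exponent and the
# 1.15-year doubling constant are the values quoted in the text.

def predict_rmax(rmax_ref, rank_ref, year_ref, rank, year,
                 exponent=0.7, doubling_years=1.15):
    """Extrapolate the Linpack Rmax at (rank, year) from a reference point."""
    return (rmax_ref * (rank_ref / rank) ** exponent
                     * 2.0 ** ((year - year_ref) / doubling_years))

# Worked example from the text: rank 500 in 1997 (Rmax = 9513) used to
# estimate rank 100 in 2003; the value actually listed was 1,142,000.
estimate = predict_rmax(9513, 500, 1997, 100, 2003)
print(f"estimate = {estimate:,.0f}")                       # roughly 1.09 million
print(f"error = {abs(estimate - 1142000) / 1142000:.1%}")  # under 5%
```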

[Table 1: The number of new entries in the list each year.]

To a degree, this is already the case, and has been for a long time: the number of machines in the list drops off exponentially with age [7], and more than half the list is new each year. However, the number of new machines has been relatively static at around 300 at least since 1997 (Table 1). If it grows, it may make sense to increase the length of the list to some number higher than 500. This is also desirable since the prevalence of certain machine models and configurations leads to long stretches of constant values in the current lists.
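The counts in Table 1 come from comparing each list with the one from the year before. A minimal sketch of such a comparison is given below; it assumes each installation can be tracked by a stable identifier, which is an assumption of the sketch (matching real Top500 entries requires combining site, system, and installation fields), and the identifiers shown are hypothetical.

```python
# Sketch: counting new entries between two consecutive November lists, as in
# Table 1. Assumes a stable per-installation identifier; the IDs below are
# hypothetical examples, not actual Top500 records.

def count_new_entries(previous_ids, current_ids):
    """Number of systems in the current list that were not in the previous one."""
    return len(set(current_ids) - set(previous_ids))

prev_list = ["site-a/system-1", "site-b/system-2", "site-c/system-3"]
curr_list = ["site-b/system-2", "site-d/system-4", "site-e/system-5"]
print(count_new_entries(prev_list, curr_list))  # -> 2
```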

Vectors and Micros

Much publicity was given in 2002 to the capture of the top spot by Japan's Earth Simulator [6], after several years of dominance by American ASCI machines. This has also re-ignited the controversy regarding the use of proprietary vector processors as opposed to commodity microprocessors. NEC, the creator of the Earth Simulator, has pursued the vector path originally developed by Cray Research, as have other Japanese firms. Most American companies have preferred to use commodity microprocessors, including Cray itself in its T3D and T3E models. The Top500 list has been used repeatedly to show that vector machines are losing ground to those based on commodity microprocessors (e.g. [2]). But will the Earth Simulator and the recent introduction of the X1 line of vector machines by Cray herald a new era of vector processing? The truth of the matter is somewhat complex.

[Figure 3: Share of machines, processors, and cycles held by vector machines.]

Figure 3 shows the share of machines, processors, and cycles held by vector machines over the last 10 years. The number of vector machines has indeed dropped precipitously. But the share of cycles, as expressed by the Rmax performance values, has stabilized at about 12-15% of the total since 1998, and surpassed 18% when the Earth Simulator was introduced in 2002 (and, if we accept the claim that Linpack over-estimates the capability of microprocessor-based machines on various important applications, these numbers should actually be larger). Interestingly, the share of installed processors has remained relatively constant throughout the 10-year period. As vector processors are generally much more powerful, there were always many fewer of them, even when vector machines dominated the list. In the mid-1990s the other processors were part of massively parallel SIMD machines, and now they are the commodity microprocessors installed in large parallel machines and clusters.

Other interesting data is provided by focusing on the bottom end of the list, and tracking the minimal number of processors needed to make the list. The results, shown in Figure 4, indicate that for microprocessor-based machines this number doubles every three years (as predicted based on much less data in [7]). This explains the finding that the power of installed supercomputers grows faster than that of desktop machines: it is the product of improvements due to Moore's law and the use of increasing numbers of processors [13] (note also that even vector machines have required multiple processors to make the list since 1997).
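The "product of two improvements" argument can be checked with a one-line calculation, since the growth rates of a product of exponentials add. The sketch below uses the 18-month Moore's-law figure and the 3-year doubling of processor counts cited above; the resulting roughly 12-month combined doubling is only a rough consistency check against the 13 to 17 month doubling measured from the lists, not a result from the paper.

```python
# Sketch: combined doubling time of a product of two exponentially growing
# quantities (per-processor speed and processor count). The input periods are
# the figures quoted in the text, not fitted values.

def combined_doubling_months(chip_months, count_months):
    """If speed doubles every chip_months and counts every count_months, their product doubles every returned number of months."""
    return 1.0 / (1.0 / chip_months + 1.0 / count_months)

print(combined_doubling_months(18.0, 36.0))  # -> 12.0 months
```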

This product of two improvements applies across the list, as witnessed by the fact that the ratio between the Rmax value of the rank 100 machine and that of the last machine on the list has been remarkably stable at around 3 for several years.

[Figure 4: Maximal and minimal degrees of parallelism represented in the list, compared to the minimal number of microprocessors needed and to the parallelism at the top rank.]

Comparing the minimal size of microprocessor-based machines with the minimal parallelism in the list, we find that the latter grows faster than the former (Figure 4). As the minimal parallelism is invariably achieved using vector processors, the higher slope indicates that vector processors are improving at a slower rate than microprocessors. If the two types continue at their current rates, the two lines are expected to cross around 2009. Alternatively, the two types may converge as innovations initially developed for vector processors are incorporated in microprocessor designs.

Limiting Factors

The capabilities of top-ranked machines are the product of how many processors they have and what each one can do. Figure 4 also shows the biggest and top-ranking machines in the Top500 list. Since 1997, it seems that there is a sort of glass ceiling that prevents systems from growing beyond about 10,000 processors.

The significance of this bound is that it limits potential performance improvements by microprocessor-based machines at the top of the list to the doubling every 18 months due to Moore's law; and even this rate is being questioned, as chip feature sizes approach the atomic scale and manufacturing costs run into the billions of dollars [14]. The bottom of the list, meanwhile, is growing faster by increasing the number of processors used.

There are three main factors that limit the number of processors used in large-scale machines. One possible factor is cost. Large-scale machines obviously cost a lot of money, with Japan's Earth Simulator coming in at about 350 million dollars. The cost issue is the main motivation for the growth of clusters as an alternative to vendor machines [2]. A few years ago clusters were small-scale endeavors constructed by individual research groups. In the 2003 Top500 list, they occupy 7 of the top 10 slots, and account for more than 40% of the listed systems. Since such large-scale clusters are much less expensive than previous vendor machines, cost cannot explain the 10,000-processor limit.

The second limiting factor is management. This includes two components. One is general management of the machine by its operators, including configuration management, identifying and replacing faulty parts, etc. The other is on-line management by a combination of daemons and operating system services, including runtime support and scheduling. It seems that these management issues may be the main factor that imposes the 10,000-processor limit. Improved management, and especially improved fault tolerance, are needed in order to surpass this number. An example of progress in this direction is the STORM project from LANL [8]. This is a resource management framework that leverages the capabilities of the Quadrics interconnect in order to implement extremely scalable control functions. The idea is to construct the management software itself as a parallel application, rather than as a distributed one, as has typically been done up to now.

A third limiting factor is effective usage by applications. The problem is basically one of the programming model employed. Current programming practices are too rigid, and only scale effectively for very regular problems. There is a need for more dynamic and flexible mechanisms that will support massive resources and the reallocations needed to handle imbalance and failures (which may be seen as an extreme case of imbalance: a failed component has zero capacity, equivalent to infinite load).

To maintain progress, and in particular to outperform the Earth Simulator using commodity microprocessors, the 10,000-processor bound will have to be broken.

A current effort to do so is the BlueGene/L project [1], which has a design point of 65,536 processing nodes. One component of the design is to adopt a SIMD-like style, in which large blocks of processors are used en masse. In addition, nodes will not run a full operating system, but rather a simplified kernel. Interestingly, these design choices are re-incarnations of ideas that were common some 10 years ago, and have since fallen into disuse.

Table 2: Invariants and predictions from analysis of the Top500 list.

Invariants:
- Rmax grows exponentially, doubling every 14 to 17 months depending on rank
- computational power drops polynomially with rank, with an exponent of about 0.7
- minimal parallelism grows exponentially, doubling every 3 years for microprocessor-based machines and every 1.5 years for vector machines
- the maximum usable parallelism is 10,000
- about 15% of the total Rmax is due to vector processors
- age drops off exponentially in the list, with around 300 new machines each year
- about 50% of machines are used by industry

Predictions:
- rank 500 will achieve 1 teraflop in 2005, and rank 1 will achieve 1 petaflop in 2008 or 2009
- a machine at rank r will remain on the list about 7.2 − log r years
- in 2009 microprocessors will have the same power as vector processors, and about 512 will be needed to make the list
- the 10,000-processor limit will drive progress towards better management, programming, and reliability, and it will be broken
- vectors might make a modest comeback
- the current growth rate meets current needs; new markets are needed to expand the usage of parallel supercomputers

Conclusions

The above results (and updates to [7]) are summarized in Table 2, together with predictions based on them. In summary, a large number of parameters have remained relatively static over several years.

This does not mean that the supercomputing industry is stagnating, as some of the constants are exponential growth rates. But their stability over time makes it possible to make predictions of how things will change in the future.

Apart from allowing predictions, the analysis also allows us to identify major challenges. An increasingly important challenge is how to produce more efficient management and scheduling mechanisms that approach the high utilization that was typical of vector machines in the past, and at the same time use more than 10,000 processors effectively [10, 11]. This is partly a problem of designing better runtime systems, and partly a problem of devising more flexible programming models. These issues may be more important for long-term progress than the architectural innovations required to achieve a petaflop.

Another challenge is to increase the use of parallel machines outside of the research community. Investigating the fraction of machines installed in industry locations shows that the industry share of machines has doubled since the mid-1990s, peaking at over 52%. This is typically taken as a good sign, showing the maturity of the supercomputing field. However, in the last two years the industry share has dropped by 10 percentage points. Moreover, the industry share of Rmax has grown at a slower rate than its share of machines, and now stands at 27%. Overall, these findings indicate that new markets have to be found to further increase the use of parallel machines.

Acknowledgements

Many thanks to the reviewers who helped remove various inaccuracies and improve the paper.

References

[1] N. R. Adiga et al., An overview of the BlueGene/L supercomputer. In SC2002, Nov 2002.

[2] G. Bell and J. Gray, What's next in high-performance computing? Comm. ACM 45(2), Feb 2002.

[3] G. Cybenko and D. J. Kuck, Revolution or evolution? IEEE Spectrum 29(9), Sep 1992.

[4] J. Dongarra, P. Luszczek, and A. Petitet, The LINPACK benchmark: past, present and future. Concurrency & Computation: Pract. & Exp. 15(9), Aug 2003.

[5] J. J. Dongarra, H. W. Meuer, H. D. Simon, and E. Strohmaier, Top500 supercomputer sites. URL www.top500.org (updated every 6 months).

[6] Earth Simulator. URL.

[7] D. G. Feitelson, On the interpretation of Top500 data. Intl. J. High Performance Comput. Appl. 13(2), Summer 1999.

[8] E. Frachtenberg, F. Petrini, J. Fernandez, S. Pakin, and S. Coll, STORM: lightning-fast resource management. In SC2002, Nov 2002.

[9] J. Gustafson, Teraflops and other false goals. IEEE Parallel & Distributed Technology 2(2), pp. 5-6, Summer 1994.

[10] J. P. Jones and B. Nitzberg, Scheduling for parallel supercomputing: a historical perspective of achievable utilization. In Job Scheduling Strategies for Parallel Processing, pp. 1-16, Springer-Verlag, Lect. Notes Comput. Sci.

[11] L. Rudolph and P. Smith, Valuation of ultra-scale computing systems. In Job Scheduling Strategies for Parallel Processing, Springer-Verlag, Lect. Notes Comput. Sci.

[12] E. Strohmaier, Who's who in high performance computing. Scientific Computing & Instrumentation, Aug. URL.

[13] E. Strohmaier, J. J. Dongarra, H. W. Meuer, and H. D. Simon, The marketplace of high-performance computing. Parallel Comput. 25(13-14), Dec 1999.

[14] T. N. Theis, Beyond the silicon transistor: personal observations. Comput. in Sci. & Eng. 5(1), Jan/Feb 2003.

[15] A. Wong, L. Oliker, W. Kramer, T. Kaltz, and D. Bailey, System utilization benchmark on the Cray T3E and IBM SP2. In Job Scheduling Strategies for Parallel Processing, Springer-Verlag, Lect. Notes Comput. Sci.
