Multicore Chips and Parallel Processing for High-End Learning Environments


Learning on Demand Viewpoints
Marcelo Hoffmann, +1 650 859 3680; fax: +1 650 859 4544; e-mail: mhoffmann@sric-bi.com

Why is this topic significant?

Computing power is a key enabler for high-end learning environments such as three-dimensional simulations, games, and virtual worlds. The availability of inexpensive, multicore, parallel computers may revolutionize high-end learning environments if programmers can create and convert software for the new computer systems. This Viewpoints, which examines the field of multicore chips and parallel processing, is relevant to all learning-industry players interested in new hardware trends.

Since the invention of the microprocessor at Intel in the early 1970s, improvements in microprocessor speed and power have become expected and generally assumed by the marketplace. However, in 2004, Moore's law ran into a major speed bump that led developers to transform microprocessor designs and, with them, computer programming. Gordon Moore, one of the founders of Intel, predicted that the number of transistors fitting in the same surface area on a microchip would double roughly every 18 months. In 2004, chip designers hit a wall: each new chip not only doubled in performance but also doubled in power dissipation, producing excessive heat that the chip could not easily expel. Because of this barrier, Intel canceled its planned successor to the Pentium 4 in 2004 and gave up on its goal of producing a chip with a 10 GHz clock rate by 2010 (microprocessors now generally operate little faster than about 3 GHz). Currently, and for the first time in some 20 years, the line plotting microprocessor performance per unit of surface area has dipped below the line plotting Moore's law. This change has severe consequences for the consumer-electronics and computer industries, which expect continual improvements in microprocessor power for their ongoing business.

The most popular solution that chip architects have developed to alleviate the power-density limitation uses multiple processors, or cores, on each chip. The various cores execute portions of software code in parallel. Multicore architectures allow designers to lower the clock rate and voltage requirements of the chip (solving the heat problem) while continuing to improve overall performance for executing programs.
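Why lowering clock rate and voltage tames the heat problem follows from a standard first-order model of chip power (our addition, not from the original article). The dynamic power a chip dissipates is approximately

\[ P_{\text{dynamic}} \approx C \, V^2 \, f \]

where C is the switched capacitance, V the supply voltage, and f the clock frequency. Because supply voltage can usually scale down along with frequency, power falls roughly with the cube of frequency: to first order, two cores running at half the frequency and half the voltage match the throughput of one fast core while dissipating about a quarter of its power (each slow core uses about one-eighth). These figures assume ideal voltage scaling and perfect parallel speedup, so they mark a best case, but the direction of the trade-off is what drives multicore design.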

The number of cores on a microprocessor may soon join measurements (in megahertz) of chip clock rates as a combined metric for microprocessor performance. David Patterson, a noted computer-science researcher at the University of California, Berkeley, claims that we can now expect the number of cores per chip to double every 18 months, in a sense replacing Moore's law with a measure of parallelization.

The transition toward multicore chips runs across the microprocessor and computer industries:

- Intel Corp. and Advanced Micro Devices Inc. introduced two- and four-core microprocessors during 2006 for mainstream and power-user computers.
- Sun Microsystems is already producing eight-core chips for servers.
- IBM sells what amounts to an eight-core chip in its Cell microprocessor, which it designed and built in conjunction with Sony and Toshiba for use in Sony's latest PlayStation 3 game console and several other applications.
- Texas Instruments has produced chips with a variety of cores for cell-phone makers for several years.
- In 2006, LSI Logic Corp. introduced a microprocessor platform for consumer gadgets that uses three or four cores, depending on the particular application.
- Makers of chips for embedded systems have produced highly specialized chips with more than 200 cores on a die.

No one is sure how successful the multicore-development strategy will be for the consumer-electronics and computer industries, and several major questions about implementation remain.

Programming multicore processors is dramatically different from, and more complex than, programming traditional processors. Parallel programs require design and code that let multiple processors share computational tasks. This requirement is problematic because not all application programs have components that divide easily; tasks begun simultaneously often finish at different times and generate bottlenecks when their results must be rejoined.
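Amdahl's law, a classic result the article does not cite by name, quantifies how sharply the indivisible portion of a program limits the payoff from extra cores. If a fraction p of a program's running time can be parallelized and the remainder must run serially, the best possible speedup on n cores is

\[ S(n) = \frac{1}{(1 - p) + p/n} \]

For example, a program that is 90% parallelizable (p = 0.9) speeds up by only a factor of about 4.7 on eight cores, because 1 / (0.1 + 0.9/8) ≈ 4.7, and no number of cores can push it past a factor of 10.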

Shared resources can also generate problems: if an application needs access to data in memory that other cores are already sharing and using, the program can freeze, stopping all operations. Parallel programs are also inherently hard to debug, because mistakes are often not obvious at inception, making the source of later problems difficult to locate. The move to multicore therefore requires not only new programming skills but also new tools.
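A minimal sketch in C with POSIX threads (our illustration, not from the original article) shows one way such a freeze, a deadlock, arises: two threads lock two shared resources in opposite orders.

    #include <pthread.h>
    #include <stdio.h>

    /* Two shared accounts, each protected by its own lock. */
    static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
    static int balance_a = 100, balance_b = 100;

    /* Thread 1 locks A then B; thread 2 locks B then A. If each thread
       acquires its first lock before the other releases, both wait
       forever and the program hangs with no error message. */
    static void *transfer_a_to_b(void *arg) {
        (void)arg;
        pthread_mutex_lock(&lock_a);
        pthread_mutex_lock(&lock_b);   /* may block forever */
        balance_a -= 10; balance_b += 10;
        pthread_mutex_unlock(&lock_b);
        pthread_mutex_unlock(&lock_a);
        return NULL;
    }

    static void *transfer_b_to_a(void *arg) {
        (void)arg;
        pthread_mutex_lock(&lock_b);
        pthread_mutex_lock(&lock_a);   /* may block forever */
        balance_b -= 10; balance_a += 10;
        pthread_mutex_unlock(&lock_a);
        pthread_mutex_unlock(&lock_b);
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, transfer_a_to_b, NULL);
        pthread_create(&t2, NULL, transfer_b_to_a, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("done: %d %d\n", balance_a, balance_b);  /* may never print */
        return 0;
    }

The program runs correctly on most executions and hangs only when the scheduler interleaves the threads unluckily, which is exactly why such mistakes are "not obvious at inception." The standard remedy is to acquire locks everywhere in one agreed global order.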

Some experts claim that adapting to multicore processors and parallel programming is the biggest challenge that the information-technology (IT) industry has faced in 50 years. Jim Larus, a computer scientist who manages programming initiatives at Microsoft Research, notes, "We lack algorithms, languages, compilers and expertise in parallel programming. Programmers face many short-term issues, like developing better support for multithreading, synchronization, debugging and error detection" (Electronic Engineering Times, 28 December 2006, page 6).

A key driver toward parallelism in computation is the market for computers with high-end graphics processing units (GPUs), microprocessors designed specifically to improve computer graphics. High-end GPUs are important for computer games, including games such as Second Life that may find use in learning situations. Such GPUs are also necessary for three-dimensional learning environments with highly detailed, real-time representations of people and objects.

The game market has already driven the development of graphics chips and GPUs that are, for some operations, even more powerful than the leading general-purpose microprocessors that drive personal computers and workstations. According to Wired magazine ("Supercomputing's Next Revolution," 9 November 2006; www.wired.com/news/technology/0,72090-0.html), researchers from the University of North Carolina at Chapel Hill released benchmark tests showing that specialized GPUs developed for the games industry in the past few years can surpass the latest central-processing-unit (CPU-) based systems by two to five times in a wide variety of tasks. Some researchers, writing specialized code for the graphics chips, find even greater improvements, though these gains generally result from painstaking hand tuning of software to the specific hardware, which is often not cost-effective for commercial purposes.

Competition in the GPU market is now even more intense than in the CPU market, and a doubling of computational power per board in one year is not unusual in the industry. The GPU makers are also eager to address the issues facing programmers. NVIDIA, one of the two leading GPU developers (the other is ATI, recently bought by AMD), has announced that it will soon offer the first C-compiler development environment for its GPUs, making them easier to program for applications beyond graphics rendering and presentation. According to Andy Keane, NVIDIA's general manager for GPU computing, the company created a new architecture for its latest GPU, the GeForce 8800, adding a memory cache that allows the chip to work in two modes: one for graphics that uses stream processing (a specialized type of parallel processing) and a second, so-called load-store mode for more complex logic-based operations, making this leading-edge GPU operate much as a traditional CPU does.
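To illustrate the kind of programming such a C-based environment enables, here is a minimal data-parallel sketch in today's CUDA C (our illustration: the article describes the toolkit only as forthcoming, and we use the modern unified-memory call cudaMallocManaged for brevity; the earliest releases required explicit memory copies).

    #include <cuda_runtime.h>
    #include <stdio.h>

    /* Each GPU thread computes one element: y[i] = a * x[i] + y[i].
       The same C-like function body runs across thousands of threads in
       parallel; this is the stream-processing style Keane describes. */
    __global__ void saxpy(int n, float a, const float *x, float *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)                      /* guard against surplus threads */
            y[i] = a * x[i] + y[i];
    }

    int main(void) {
        const int n = 1 << 20;          /* one million elements */
        float *x, *y;
        cudaMallocManaged(&x, n * sizeof(float));  /* visible to CPU and GPU */
        cudaMallocManaged(&y, n * sizeof(float));
        for (int i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }

        /* Launch enough 256-thread blocks to cover all n elements. */
        saxpy<<<(n + 255) / 256, 256>>>(n, 3.0f, x, y);
        cudaDeviceSynchronize();        /* wait for the GPU to finish */

        printf("y[0] = %f\n", y[0]);    /* expect 5.0 */
        cudaFree(x);
        cudaFree(y);
        return 0;
    }

A conventional CPU loop would visit the million elements one at a time; here the work divides naturally across the GPU's many simple cores, which is why data-parallel workloads show the two-to-five-times gains the UNC benchmarks report.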

Offloading computational tasks from the general-purpose microprocessor could bring substantially greater realism to learning environments, if researchers can overcome the programming bottlenecks. In the January 2007 issue of Intelligent Enterprise, David Patterson described the problems and opportunities facing the computer industry, stating that the switch to parallel architectures will require researchers and practitioners to address "the biggest challenge and opportunity to face the IT industry in 50 years. If we solve the problem of making it easy to program thousands of processors efficiently, the future is rosy. If we don't, the IT industry will have to learn to live without the performance rush that it has been addicted to for decades." (See www.intelligententerprise.com/showarticle.jhtml?articleid=196603897&pgno=5.)

Patterson and a number of other leading computer scientists at the Massachusetts Institute of Technology, Carnegie Mellon University, Stanford University, and the University of Washington are now working on the problem by collaborating on a project they call Research Accelerator for Multiple Processors (RAMP). According to its Web site (ramp.eecs.berkeley.edu/), the goal of RAMP is to develop (i) component models (ranging from processors to coherent caches to networks) that developers can compose quickly to create and evaluate new multiprocessor architectural and microarchitectural concepts and (ii) a set of three reference machines that will use those component models. Reportedly, the reference systems are designed to scale to the 1,000-core range given the appropriate hardware platform (additional information is available at en.wikipedia.org/wiki/RAMP:_Research_Accelerator_for_Multiple_Processors).

About the LoD Program

SRI Consulting Business Intelligence's Learning-on-Demand (LoD) multiclient research program leverages the subscription fees of multiple clients to examine the evolution and features of the emerging technology-enabled learning marketplace, explore adoption issues, and define the components of effective workplace learning. The LoD multiclient program provides a cost-effective way to discover, evaluate, and implement LoD solutions that will yield high business payoffs by improving employee performance. The program benefits both LoD users and developers: potential LoD system users gain an unbiased source of information about LoD implementation, the benefits of LoD systems, and innovative LoD solutions emerging in the marketplace; LoD system developers receive information about the factors driving or constraining market demand for LoD systems.

For more information about the Learning-on-Demand multiclient research program, contact Eilif Trondsen, Program Director (etrondsen@sric-bi.com), 333 Ravenswood Avenue, Menlo Park, California 94025-3476; telephone +1 650 859 4600; or visit www.sric-bi.com.