Page 1. Distributed and Parallel Systems. High Performance Computing, Computational Grid, and Numerical Libraries. What is Grid Computing?

Size: px
Start display at page:

Download "Page 1. Distributed and Parallel Systems. High Performance Computing, Computational Grid, and Numerical Libraries. What is Grid Computing?"

Transcription

1 QuickTime and a decompressor are needed to see this picture. High omputing, omputational Grid, and Numerical Libraries Distributed systems heterogeneous Distributed and Parallel Systems STI@home ntropia Grid based omputing Beowulf cluster Network of ws lusters w/ special interconnect Massively parallel systems homogeneous ASI Tflops Parallel Dist mem!! "!#$ %&' ()! +, ( - ( -. / ( ( # 3 4 ()! + What is Grid omputing? 6 7 ' 8 DATA AQUISITION ADVAND VISUALIZATION,ANALYSIS The omputational Grid is 97 - :- :- ; '- '- <' :- ; ( = 7? 7 IMAGING INSTRUMNTS OMPUTATIONAL RSOURS LARG-SAL DATABASS omputational Grids and lectric Power Grids <":-! " " :- # # $ % & % ' % & ( An merging Grid A7<B - # - % ) % % + %, &!, % & -. %, # & % $&!/ Page

2 Grids a Grid-Aware Numerical Libraries & $'$ +'& ) Software omponents Feedback Problem Real-time Monitor #:3( 3( ( ;+++%- + +#: 7 ; Whole- Program ompiler Source Application onfigurable Object Program Resource Negotiator Scheduler Negotiation Grid Runtime System ; % -+ ( ; ;++--- < ++ Libraries Binder 3#3D ;++ ++ ; F ; D- ; # ;++ < + Grid-Aware Numerical Libraries & $'$ +'& ) Libraries Software omponents Whole- Program ompiler Feedback Source Application onfigurable Object Program Problem Resource Negotiator Scheduler 3 & $'$ + ) 4 Real-time Monitor Negotiation Binder ) $ & 4 Grid Runtime System ScaLAPAK (:( /? 7 ::6 :# - < < D? "'3D'?( :( # 7#, '= :/'D '3"' ' #''3( '# '9! 6 ScaLAPAK Grid nabled # (:( 7 < G :7 :- G <- 7 - ( A 7B / - / < To Use ScaLAPAK a User Must:? - < / <<:,( ',( ',( '6 :# :? -? : 7 :? 7! 7 -! A H 3 I B % 3; 7 /7 7 ' ' Page

3 GrADS Numerical Library < <- 7 G 7? 7 87? 7 :? 7 < % # GrADS Library Sequence 7 = AB <- < $ $ ) 4 Resource Selector 7? % 7-' ' 7 % / 7 ' - 77 Arrays of Values Generated by Resource Selector 7 $!'?'# % : D,- << : / '' / ( - < ' ' - ScaLAPAK Model Model v T ( n, p) = f t f + vtv + mtm 3 n f = 3p! 7 n = (3 + log p) 4 p! 7 m = n(6 + log p)! 7 t f! t v! t m! 7 : :< # ' 7 : #A7B' ' : Page 3

4 ontract Development Application Launcher 7 7? 7-? : - <7! 7' <? : (? : A H H7 III6 perimental Hardware / Software Grid Resource Selector/ Modeler MacroGrid Testbed Type OS TOR YPHR OPUS luster 8 Dual Pentium III Red Hat Linu.. SMP luster 6 Dual Pentium III Debian Linu..7 SMP luster 8 Pentium II Red Hat Linu..6 Memory MB MB 8 or 6 MB PU speed MHz MHz MHz Network Fast thernet ( Mbit/s) (3om 39B) and switch (BayStack 3T) with 6 ports Gigabit thernet (SK- 9843) and switch (Foundry FastIron II) with 4 ports Myrinet (LANai 4.3) with 6 ports each 7 J ( J 3 :#= (:(K (!(+,(J,( :( :#. (?GAB Independent components being put together and interacting 7-7! 7! : # -! 7<< 8! / ' '! +) 8!! 7 MDS, NWS Model Library writer to supply Optimizer oarse Grid! 7 Model Validation Opus4 Opus3 Opus6 Opus Torc4 Torc6 Torc mem(mb) speed load Speed = 6% of the peak Latency Opus4 Opus3 Opus6 Opus Torc4 Torc6 Torc7 Opus Opus Opus Opus Torc Torc Torc Latency in msec Time (seconds) PDGSV Time Breakdown ScaLAPAK - PDGSV - Using collapsed NWS query from USB 4 machine available, using mainly torc, cypher, msc clusters at UTK [Jan ] 6. other. PDGSV 4. spawn 3. NWS. MDS. Bandwidth Opus4 Opus3 Opus6 Opus Torc4 Torc6 Torc7 Opus Opus Opus Opus Torc Torc Torc Bandwidth in Mb/s This is for a refined grid other PDGSV spawn NWS MDS Matri Size - Nproc Page 4

5 Time (seconds) ScaLAPAK across 3 lusters 3 OPUS OPUS, YPHR OPUS, TOR, YPHR 3 OPUS, 4 TOR, 6 YPHR 8 OPUS, 4 TOR, 4 YPHR 8 OPUS, TOR, 6 YPHR 6 OPUS, YPHR 8 OPUS, 6 YPHR 8 OPUS OPUS 8 OPUS Matri Size Time (seconds) torcs, 3 mscs QR Timing Breakdown ecution of QR factorization over the grid Application eecution Startup time modeling NWS MDS 3 torcs, 4 mscs 4 torcs, 6 mscs 4 8 Matri Size 4 torcs, 6 mscs 4 torcs, 7 mscs 4 torcs, 7 mscs PDSYVX Timing Breakdown GradSoft & ScaLAPAK Time (s) other_grid_overhead perf_modeling_time nws mds pdsyev_driver_overhead back transformation compute eigenvectors compute eigenvalues tridiagonal reduction PDSYVX - torcs, cyphers Uses torcs only Uses torcs and cyphers 7? 3 3 '( +:7 (:( ' ( :( ( (:( : I ( I G :? "FI / 7 < N-nproc Rescheduling/Redistribution perimental Results Rescheduling/Redistribution perimental Results Time (seconds) Scenario: start application on 4 cyphers machines. After 4 minutes into the run 4 addition processors become available. Want to reorganize the computation to take advantage of the etra computing capability. What s the additional cost? Time (seconds) Application eecution Redistribute time heckpoint read+write Restart time Start time Grid overhead NWS App: 36, 4 cyphers No rescheduling App: 36, 4 cyphers+4 torcs No rescheduling App: 36, 4 cyphers No rescheduling App: 36, 4 App: 36, 4 torcs cyphers+4 torcs App: 36, 4 torcs+4 cyphers, No rescheduling introduced 4 mins. into the run With rescheduling About % better performance even with redistribution and restart. Page

6 Time stimate (sec) Adhoc vs Annealing Scheduling () (9) stimated ecution Time for PDGSV Using Adhoc Scheduler and Annealing Scheduler Given 7 possible hosts; all GrADS 86 machines est_adhoc est_anneal () () (4) (6) (4) () (8) () N (nproc_adhoc) (nproc_anneal) 3 () () 3 (7) (7) Largest Problem Solved /8J' ), J #! % 3.,' 4, : )! <4& % JK + % + % (:( ) - 7. < % :. = 8. + < % D ( :( ompiler analogy 7 General Library Interface In the past: Isolation Motivation for Grid omputing (? : - - <' '- 7 #G7A B +7?! 7-3 :! - < <'7 ' < 77 '7 7? 7 7 Today: ollaboration The Grid! : - < "7 7 H 7/ - < A - B A 3- < G<' 3 B! L Page 6

7 ? 3,D! = ", Motivation for NetSolve NetSolve Network nabled Server 3 / 7- +-, : 7-9 ' 7 7'7' ' '9 "! '7 = G :7 A B NetSolve: The Big Picture NetSolve: The Big Picture Schedule Schedule Database Database AGNT(s) AGNT(s) Matlab Mathematica IBP Depot Matlab Mathematica A, B IBP Depot, Fortran, Fortran Java, cel Java, cel S A S S3 S4 S A S S3 S4 No knowledge of the grid required, RP like. No knowledge of the grid required, RP like. NetSolve: The Big Picture NetSolve: The Big Picture Schedule Schedule Matlab Mathematica Handle back IBP Depot AGNT(s) Database Matlab Mathematica Request S! IBP Depot AGNT(s) Database, Fortran, Fortran Java, cel Java, cel S A S S3 S4 S Answer () A A, B OP, handle Op(, A, B) S S3 S4 No knowledge of the grid required, RP like. No knowledge of the grid required, RP like. Page 7

8 Basic Usage Scenarios NetSolve Agent Agent 7 7 G- 7 '(:( ' '(:(':"!' (M!"'(:(! < A: B/ : / - A, B,? <- -< / 'A B # 7 '' 3#3D'9 3 3 # NetSolve Agent Agent NetSolve G; : : #3:( 3- <7- ' - < :7 8+ / A! B 3 D,# 7 3 G (:# #7'D' 7' - < 3 7< ;7<' 7<'< '9 NetSolve NetSolve # 7 / ; (N,'O < ( 7 A = netsolve( matmul, B, ); Possible parallelisms hidden. Page 8

9 Java This is a linear solver for dense matrices from the LAPAK Library. MATRIX DOUBL A Double precision VTOR DOUBL b Right VTOR DOUBL Generating New Services in NetSolve (? 7 - # NetSolve Parser/ ompiler New Service Added! Server Service Service Service Service New Service Task Farming - Multiple Requests To Single Problem ( ; "9:;( ;: D ; & < "9 A B ( - ' Data Persistence Data Persistence (cont d) 3 ( 8 "?( - -! + < / / command(a, sequence(a, B, B) ) result input A, command(a, intermediate output ) result D intermediate output D, input command3(d, ) Server Server netsl_begin_sequence( ); netsl( command, netsl( command, A, A, B, B, ); ); netsl( command, netsl( command, A, A,,, D); D); netsl( command3, netsl( command3, D, D,,, F); F); netsl_end_sequence(, D); result F Server University of Tennessee Deployment: Scalable Intracampus Research Grid SInRG NPAI Alpha Project - Mell: 3-D Monte-arlo Simulation of Neuro-Transmitter Release in Between ells USD (F. Berman, H. asanova, M. llisman), Salk Institute (T. Bartol), MU (J. Stiles), UTK (Dongarra, M. Miller, R. Wolski) Study how neurotransmitters diffuse and activate receptors in synapses blue unbounded, red singly bounded, green doubly bounded closed, yellow doubly bounded open The Knoville ampus has two DS-3 commodity Internet connections and one DS-3 Internet/Abilene connection. An O-3 ATM link routes IP traffic between the Knoville campus, National Transportation Research enter, and Oak Ridge National Laboratory. UT participates in several national networking initiatives including Internet (I), Abilene, the federal Net Generation Internet (NGI) initiative, Southern Universities Research Association (SURA) Regional Information Infrastructure (RII), and Southern rossroads (SoX). D - ;' "' ' "'" " ' - ' -< The UT campus consists of a meshed ATM O- being migrated over to switched Gigabit by early. Page 9

10 Web Interface Web Server NetSolve IPARS-enabled Servers #:( G '! ( " 7<'-' J? - #"/ D - < :'-'7< : + ' 8 D #:( - # #:(#; 'D! ( 3' 7' ' 7 Netsolve and SIRun SIRun torso defibrillator application hris Johnson, U of Utah Things Not Touched On onclusions 7F. # = -- = ( 7 33-<!< 3-< 3-< #,< : - D! +7? 3 ( ( (? 7 7 "/ (? = "/ 7 F 7< - / (?- 7 ( ( / 7? 7 - < ( '- / O- - < # 8 ' ' Page

Grid Application Development Software

Grid Application Development Software Grid Application Development Software Department of Computer Science University of Houston, Houston, Texas GrADS Vision Goals Approach Status http://www.hipersoft.cs.rice.edu/grads GrADS Team (PIs) Ken

More information

Numerical Libraries And The Grid: The GrADS Experiments With ScaLAPACK

Numerical Libraries And The Grid: The GrADS Experiments With ScaLAPACK UT-CS-01-460 April 28, 2001 Numerical Libraries And The Grid: The GrADS Experiments With ScaLAPACK Antoine Petitet, Susan Blackford, Jack Dongarra, Brett Ellis, Graham Fagg, Kenneth Roche, and Sathish

More information

COSC 6374 Parallel Computation. Parallel Computer Architectures

COSC 6374 Parallel Computation. Parallel Computer Architectures OS 6374 Parallel omputation Parallel omputer Architectures Some slides on network topologies based on a similar presentation by Michael Resch, University of Stuttgart Edgar Gabriel Fall 2015 Flynn s Taxonomy

More information

Self Adaptivity in Grid Computing

Self Adaptivity in Grid Computing CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2004; 00:1 26 [Version: 2002/09/19 v2.02] Self Adaptivity in Grid Computing Sathish S. Vadhiyar 1, and Jack J.

More information

Compiler Technology for Problem Solving on Computational Grids

Compiler Technology for Problem Solving on Computational Grids Compiler Technology for Problem Solving on Computational Grids An Overview of Programming Support Research in the GrADS Project Ken Kennedy Rice University http://www.cs.rice.edu/~ken/presentations/gridcompilers.pdf

More information

COSC 6374 Parallel Computation. Parallel Computer Architectures

COSC 6374 Parallel Computation. Parallel Computer Architectures OS 6374 Parallel omputation Parallel omputer Architectures Some slides on network topologies based on a similar presentation by Michael Resch, University of Stuttgart Spring 2010 Flynn s Taxonomy SISD:

More information

Compilers and Run-Time Systems for High-Performance Computing

Compilers and Run-Time Systems for High-Performance Computing Compilers and Run-Time Systems for High-Performance Computing Blurring the Distinction between Compile-Time and Run-Time Ken Kennedy Rice University http://www.cs.rice.edu/~ken/presentations/compilerruntime.pdf

More information

Users. Applications. Client Library. NS Agent. NS Server. NS Server. Server

Users. Applications. Client Library. NS Agent. NS Server. NS Server. Server Request Sequencing: Optimizing Communication for the Grid 0 Dorian C. Arnold 1, Dieter Bachmann 2, and Jack Dongarra 1 1 Department of Computer Science, University oftennessee, Knoxville, TN 37996 [darnold,

More information

Experiments with Scheduling Using Simulated Annealing in a Grid Environment

Experiments with Scheduling Using Simulated Annealing in a Grid Environment Experiments with Scheduling Using Simulated Annealing in a Grid Environment Asim YarKhan Computer Science Department University of Tennessee yarkhan@cs.utk.edu Jack J. Dongarra Computer Science Department

More information

High Performance Computing. Without a Degree in Computer Science

High Performance Computing. Without a Degree in Computer Science High Performance Computing Without a Degree in Computer Science Smalley s Top Ten 1. energy 2. water 3. food 4. environment 5. poverty 6. terrorism and war 7. disease 8. education 9. democracy 10. population

More information

GrADSoft and its Application Manager: An Execution Mechanism for Grid Applications

GrADSoft and its Application Manager: An Execution Mechanism for Grid Applications GrADSoft and its Application Manager: An Execution Mechanism for Grid Applications Authors Ken Kennedy, Mark Mazina, John Mellor-Crummey, Rice University Ruth Aydt, Celso Mendes, UIUC Holly Dail, Otto

More information

Application Performance on Dual Processor Cluster Nodes

Application Performance on Dual Processor Cluster Nodes Application Performance on Dual Processor Cluster Nodes by Kent Milfeld milfeld@tacc.utexas.edu edu Avijit Purkayastha, Kent Milfeld, Chona Guiang, Jay Boisseau TEXAS ADVANCED COMPUTING CENTER Thanks Newisys

More information

Low Cost Supercomputing. Rajkumar Buyya, Monash University, Melbourne, Australia. Parallel Processing on Linux Clusters

Low Cost Supercomputing. Rajkumar Buyya, Monash University, Melbourne, Australia. Parallel Processing on Linux Clusters N Low Cost Supercomputing o Parallel Processing on Linux Clusters Rajkumar Buyya, Monash University, Melbourne, Australia. rajkumar@ieee.org http://www.dgs.monash.edu.au/~rajkumar Agenda Cluster? Enabling

More information

MAGMA a New Generation of Linear Algebra Libraries for GPU and Multicore Architectures

MAGMA a New Generation of Linear Algebra Libraries for GPU and Multicore Architectures MAGMA a New Generation of Linear Algebra Libraries for GPU and Multicore Architectures Stan Tomov Innovative Computing Laboratory University of Tennessee, Knoxville OLCF Seminar Series, ORNL June 16, 2010

More information

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2014/15

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2014/15 Systems Infrastructure for Data Science Web Science Group Uni Freiburg WS 2014/15 Lecture X: Parallel Databases Topics Motivation and Goals Architectures Data placement Query processing Load balancing

More information

MAGMA. Matrix Algebra on GPU and Multicore Architectures

MAGMA. Matrix Algebra on GPU and Multicore Architectures MAGMA Matrix Algebra on GPU and Multicore Architectures Innovative Computing Laboratory Electrical Engineering and Computer Science University of Tennessee Piotr Luszczek (presenter) web.eecs.utk.edu/~luszczek/conf/

More information

Early Evaluation of the Cray X1 at Oak Ridge National Laboratory

Early Evaluation of the Cray X1 at Oak Ridge National Laboratory Early Evaluation of the Cray X1 at Oak Ridge National Laboratory Patrick H. Worley Thomas H. Dunigan, Jr. Oak Ridge National Laboratory 45th Cray User Group Conference May 13, 2003 Hyatt on Capital Square

More information

MPI History. MPI versions MPI-2 MPICH2

MPI History. MPI versions MPI-2 MPICH2 MPI versions MPI History Standardization started (1992) MPI-1 completed (1.0) (May 1994) Clarifications (1.1) (June 1995) MPI-2 (started: 1995, finished: 1997) MPI-2 book 1999 MPICH 1.2.4 partial implemention

More information

A Decoupled Scheduling Approach for the GrADS Program Development Environment. DCSL Ahmed Amin

A Decoupled Scheduling Approach for the GrADS Program Development Environment. DCSL Ahmed Amin A Decoupled Scheduling Approach for the GrADS Program Development Environment DCSL Ahmed Amin Outline Introduction Related Work Scheduling Architecture Scheduling Algorithm Testbench Results Conclusions

More information

ANALYSIS OF CLUSTER INTERCONNECTION NETWORK TOPOLOGIES

ANALYSIS OF CLUSTER INTERCONNECTION NETWORK TOPOLOGIES ANALYSIS OF CLUSTER INTERCONNECTION NETWORK TOPOLOGIES Sergio N. Zapata, David H. Williams and Patricia A. Nava Department of Electrical and Computer Engineering The University of Texas at El Paso El Paso,

More information

Advances of parallel computing. Kirill Bogachev May 2016

Advances of parallel computing. Kirill Bogachev May 2016 Advances of parallel computing Kirill Bogachev May 2016 Demands in Simulations Field development relies more and more on static and dynamic modeling of the reservoirs that has come a long way from being

More information

Users. Applications. Figure 2. Sample C code: Left side before NetSolve, right side after NetSolve integration

Users. Applications. Figure 2. Sample C code: Left side before NetSolve, right side after NetSolve integration The NetSolve Environment: Progressing Towards the Seamless Grid Dorian C. Arnold and Jack Dongarra Computer Science Department University of Tennessee Knoxville, TN 37996 [darnold, dongarra]@cs.utk.edu

More information

Commodity Cluster Computing

Commodity Cluster Computing Commodity Cluster Computing Ralf Gruber, EPFL-SIC/CAPA/Swiss-Tx, Lausanne http://capawww.epfl.ch Commodity Cluster Computing 1. Introduction 2. Characterisation of nodes, parallel machines,applications

More information

Platforms Design Challenges with many cores

Platforms Design Challenges with many cores latforms Design hallenges with many cores Raj Yavatkar, Intel Fellow Director, Systems Technology Lab orporate Technology Group 1 Environmental Trends: ell 2 *Other names and brands may be claimed as the

More information

Real Parallel Computers

Real Parallel Computers Real Parallel Computers Modular data centers Background Information Recent trends in the marketplace of high performance computing Strohmaier, Dongarra, Meuer, Simon Parallel Computing 2005 Short history

More information

A Performance Oriented Migration Framework For The Grid Λ

A Performance Oriented Migration Framework For The Grid Λ A Performance Oriented Migration Framework For The Grid Λ Sathish S. Vadhiyar and Jack J. Dongarra Computer Science Department University of Tennessee fvss, dongarrag@cs.utk.edu Abstract At least three

More information

Parallel Data Mining on a Beowulf Cluster

Parallel Data Mining on a Beowulf Cluster Parallel Data Mining on a Beowulf Cluster Peter Strazdins, Peter Christen, Ole M. Nielsen and Markus Hegland http://cs.anu.edu.au/ Peter.Strazdins (/seminars) Data Mining Group Australian National University,

More information

Lecture 9: MIMD Architectures

Lecture 9: MIMD Architectures Lecture 9: MIMD Architectures Introduction and classification Symmetric multiprocessors NUMA architecture Clusters Zebo Peng, IDA, LiTH 1 Introduction MIMD: a set of general purpose processors is connected

More information

Innovations of the NetSolve Grid Computing System

Innovations of the NetSolve Grid Computing System CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE Concurrency Computat.: Pract. Exper. 2002; 14:1457 1479 (DOI: 10.1002/cpe.678) Innovations of the NetSolve Grid Computing System Dorian C. Arnold 1,,,HenriCasanova

More information

Performance comparison between a massive SMP machine and clusters

Performance comparison between a massive SMP machine and clusters Performance comparison between a massive SMP machine and clusters Martin Scarcia, Stefano Alberto Russo Sissa/eLab joint Democritos/Sissa Laboratory for e-science Via Beirut 2/4 34151 Trieste, Italy Stefano

More information

Lecture 9: MIMD Architecture

Lecture 9: MIMD Architecture Lecture 9: MIMD Architecture Introduction and classification Symmetric multiprocessors NUMA architecture Cluster machines Zebo Peng, IDA, LiTH 1 Introduction MIMD: a set of general purpose processors is

More information

Campus Network Design. 2003, Cisco Systems, Inc. All rights reserved. 2-1

Campus Network Design. 2003, Cisco Systems, Inc. All rights reserved. 2-1 Campus Network Design 2003, Cisco Systems, Inc. All rights reserved. 2-1 Design Objective Business Requirement Why do you want to build a network? Too often people build networks based on technological,

More information

1. NoCs: What s the point?

1. NoCs: What s the point? 1. Nos: What s the point? What is the role of networks-on-chip in future many-core systems? What topologies are most promising for performance? What about for energy scaling? How heavily utilized are Nos

More information

Campus Network Design

Campus Network Design Modular Network Design Campus Network Design Modules are analogous to building blocks of different shapes and sizes; when creating a building, each block has different functions Designing one of these

More information

Evaluation of Parallel Application s Performance Dependency on RAM using Parallel Virtual Machine

Evaluation of Parallel Application s Performance Dependency on RAM using Parallel Virtual Machine Evaluation of Parallel Application s Performance Dependency on RAM using Parallel Virtual Machine Sampath S 1, Nanjesh B R 1 1 Department of Information Science and Engineering Adichunchanagiri Institute

More information

How to perform HPL on CPU&GPU clusters. Dr.sc. Draško Tomić

How to perform HPL on CPU&GPU clusters. Dr.sc. Draško Tomić How to perform HPL on CPU&GPU clusters Dr.sc. Draško Tomić email: drasko.tomic@hp.com Forecasting is not so easy, HPL benchmarking could be even more difficult Agenda TOP500 GPU trends Some basics about

More information

NOW and the Killer Network David E. Culler

NOW and the Killer Network David E. Culler NOW and the Killer Network David E. Culler culler@cs http://now.cs.berkeley.edu NOW 1 Remember the Killer Micro 100,000,000 10,000,000 R10000 Pentium Transistors 1,000,000 100,000 i80286 i80386 R3000 R2000

More information

Parallel FEM Computation and Multilevel Graph Partitioning Xing Cai

Parallel FEM Computation and Multilevel Graph Partitioning Xing Cai Parallel FEM Computation and Multilevel Graph Partitioning Xing Cai Simula Research Laboratory Overview Parallel FEM computation how? Graph partitioning why? The multilevel approach to GP A numerical example

More information

High Performance Computing and Data Mining

High Performance Computing and Data Mining High Performance Computing and Data Mining Performance Issues in Data Mining Peter Christen Peter.Christen@anu.edu.au Data Mining Group Department of Computer Science, FEIT Australian National University,

More information

CS Understanding Parallel Computing

CS Understanding Parallel Computing CS 594 001 Understanding Parallel Computing Web page for the course: http://www.cs.utk.edu/~dongarra/web-pages/cs594-2006.htm CS 594 001 Wednesday s 1:30 4:00 Understanding Parallel Computing: From Theory

More information

New Grid Scheduling and Rescheduling Methods in the GrADS Project

New Grid Scheduling and Rescheduling Methods in the GrADS Project New Grid Scheduling and Rescheduling Methods in the GrADS Project K. Cooper, A. Dasgupta, K. Kennedy, C. Koelbel, A. Mandal, G. Marin, M. Mazina, J. Mellor-Crummey Computer Science Dept., Rice University

More information

ITTC High-Performance Networking The University of Kansas EECS 881 Architecture and Topology

ITTC High-Performance Networking The University of Kansas EECS 881 Architecture and Topology High-Performance Networking The University of Kansas EECS 881 Architecture and Topology James P.G. Sterbenz Department of Electrical Engineering & Computer Science Information Technology & Telecommunications

More information

Experiences in Managing Resources on a Large Origin3000 cluster

Experiences in Managing Resources on a Large Origin3000 cluster Experiences in Managing Resources on a Large Origin3000 cluster UG Summit 2002, Manchester, May 20 2002, Mark van de Sanden & Huub Stoffers http://www.sara.nl A oarse Outline of this Presentation Overview

More information

COSC 6385 Computer Architecture - Multi Processor Systems

COSC 6385 Computer Architecture - Multi Processor Systems COSC 6385 Computer Architecture - Multi Processor Systems Fall 2006 Classification of Parallel Architectures Flynn s Taxonomy SISD: Single instruction single data Classical von Neumann architecture SIMD:

More information

LAPACK. Linear Algebra PACKage. Janice Giudice David Knezevic 1

LAPACK. Linear Algebra PACKage. Janice Giudice David Knezevic 1 LAPACK Linear Algebra PACKage 1 Janice Giudice David Knezevic 1 Motivating Question Recalling from last week... Level 1 BLAS: vectors ops Level 2 BLAS: matrix-vectors ops 2 2 O( n ) flops on O( n ) data

More information

Adaptive Cluster Computing using JavaSpaces

Adaptive Cluster Computing using JavaSpaces Adaptive Cluster Computing using JavaSpaces Jyoti Batheja and Manish Parashar The Applied Software Systems Lab. ECE Department, Rutgers University Outline Background Introduction Related Work Summary of

More information

The GridWay. approach for job Submission and Management on Grids. Outline. Motivation. The GridWay Framework. Resource Selection

The GridWay. approach for job Submission and Management on Grids. Outline. Motivation. The GridWay Framework. Resource Selection The GridWay approach for job Submission and Management on Grids Eduardo Huedo Rubén S. Montero Ignacio M. Llorente Laboratorio de Computación Avanzada Centro de Astrobiología (INTA - CSIC) Associated to

More information

Adaptive Scientific Software Libraries

Adaptive Scientific Software Libraries Adaptive Scientific Software Libraries Lennart Johnsson Advanced Computing Research Laboratory Department of Computer Science University of Houston Challenges Diversity of execution environments Growing

More information

From Beowulf to the HIVE

From Beowulf to the HIVE Commodity Cluster Computing at Goddard Space Flight Center Dr. John E. Dorband NASA Goddard Space Flight Center Earth and Space Data Computing Division Applied Information Sciences Branch 1 The Legacy

More information

No Tradeoff Low Latency + High Efficiency

No Tradeoff Low Latency + High Efficiency No Tradeoff Low Latency + High Efficiency Christos Kozyrakis http://mast.stanford.edu Latency-critical Applications A growing class of online workloads Search, social networking, software-as-service (SaaS),

More information

OmniRPC: a Grid RPC facility for Cluster and Global Computing in OpenMP

OmniRPC: a Grid RPC facility for Cluster and Global Computing in OpenMP OmniRPC: a Grid RPC facility for Cluster and Global Computing in OpenMP (extended abstract) Mitsuhisa Sato 1, Motonari Hirano 2, Yoshio Tanaka 2 and Satoshi Sekiguchi 2 1 Real World Computing Partnership,

More information

@joerg_schad Nightmares of a Container Orchestration System

@joerg_schad Nightmares of a Container Orchestration System @joerg_schad Nightmares of a Container Orchestration System 2017 Mesosphere, Inc. All Rights Reserved. 1 Jörg Schad Distributed Systems Engineer @joerg_schad Jan Repnak Support Engineer/ Solution Architect

More information

Application Case Studies of Logistical Networking

Application Case Studies of Logistical Networking Application Case Studies of Logistical Networking Micah Beck, Assoc. Prof. & Director Terry Moore, Assoc. Director Logistical Computing & Internetworking (LoCI) Lab Computer Science Department Internet2

More information

Voltaire Making Applications Run Faster

Voltaire Making Applications Run Faster Voltaire Making Applications Run Faster Asaf Somekh Director, Marketing Voltaire, Inc. Agenda HPC Trends InfiniBand Voltaire Grid Backbone Deployment examples About Voltaire HPC Trends Clusters are the

More information

Motivation and goal Design concepts and service model Architecture and implementation Performance, and so on...

Motivation and goal Design concepts and service model Architecture and implementation Performance, and so on... Motivation and goal Design concepts and service model Architecture and implementation Performance, and so on... Autonomous applications have a demand for grasping the state of hosts and networks for: sustaining

More information

Improve Web Application Performance with Zend Platform

Improve Web Application Performance with Zend Platform Improve Web Application Performance with Zend Platform Shahar Evron Zend Sr. PHP Specialist Copyright 2007, Zend Technologies Inc. Agenda Benchmark Setup Comprehensive Performance Multilayered Caching

More information

MPI versions. MPI History

MPI versions. MPI History MPI versions MPI History Standardization started (1992) MPI-1 completed (1.0) (May 1994) Clarifications (1.1) (June 1995) MPI-2 (started: 1995, finished: 1997) MPI-2 book 1999 MPICH 1.2.4 partial implemention

More information

Basics of datacommunication

Basics of datacommunication Data communication I Lecture 1 Course Introduction About the course Basics of datacommunication How is information transported between digital devices? Essential data communication protocols Insight into

More information

Kenneth A. Hawick P. D. Coddington H. A. James

Kenneth A. Hawick P. D. Coddington H. A. James Student: Vidar Tulinius Email: vidarot@brandeis.edu Distributed frameworks and parallel algorithms for processing large-scale geographic data Kenneth A. Hawick P. D. Coddington H. A. James Overview Increasing

More information

Parallel & Cluster Computing. cs 6260 professor: elise de doncker by: lina hussein

Parallel & Cluster Computing. cs 6260 professor: elise de doncker by: lina hussein Parallel & Cluster Computing cs 6260 professor: elise de doncker by: lina hussein 1 Topics Covered : Introduction What is cluster computing? Classification of Cluster Computing Technologies: Beowulf cluster

More information

The NE010 iwarp Adapter

The NE010 iwarp Adapter The NE010 iwarp Adapter Gary Montry Senior Scientist +1-512-493-3241 GMontry@NetEffect.com Today s Data Center Users Applications networking adapter LAN Ethernet NAS block storage clustering adapter adapter

More information

Performance of ORBs on Switched Fabric Transports

Performance of ORBs on Switched Fabric Transports Performance of ORBs on Switched Fabric Transports Victor Giddings Objective Interface Systems victor.giddings@ois.com 2001 Objective Interface Systems, Inc. Switched Fabrics High-speed interconnects High-bandwidth,

More information

Component Architectures

Component Architectures Component Architectures Rapid Prototyping in a Networked Environment Ken Kennedy Rice University http://www.cs.rice.edu/~ken/presentations/lacsicomponentssv01.pdf Participants Ruth Aydt Bradley Broom Zoran

More information

Agent Semantic Communications Service (ASCS) Teknowledge

Agent Semantic Communications Service (ASCS) Teknowledge Agent Semantic Communications Service (ASCS) Teknowledge John Li, Allan Terry November 2004 0 Overall Program Summary The problem: Leverage semantic markup for integration of heterogeneous data sources

More information

Euro-Par Pisa - Italy

Euro-Par Pisa - Italy Euro-Par 2004 - Pisa - Italy Accelerating farms through ad- distributed scalable object repository Marco Aldinucci, ISTI-CNR, Pisa, Italy Massimo Torquati, CS dept. Uni. Pisa, Italy Outline (Herd of Object

More information

CloudView : Describe and Maintain Resource Views in Cloud

CloudView : Describe and Maintain Resource Views in Cloud The 2 nd IEEE International Conference on Cloud Computing Technology and Science CloudView : Describe and Maintain Resource Views in Cloud Dehui Zhou, Liang Zhong, Tianyu Wo and Junbin, Kang Beihang University

More information

SharePoint 2010 Technical Case Study: Microsoft SharePoint Server 2010 Social Environment

SharePoint 2010 Technical Case Study: Microsoft SharePoint Server 2010 Social Environment SharePoint 2010 Technical Case Study: Microsoft SharePoint Server 2010 Social Environment This document is provided as-is. Information and views expressed in this document, including URL and other Internet

More information

Taming your heterogeneous cloud with Red Hat OpenShift Container Platform.

Taming your heterogeneous cloud with Red Hat OpenShift Container Platform. Taming your heterogeneous cloud with Red Hat OpenShift Container Platform martin@redhat.com Business Problem: Building a Hybrid Cloud solution PartyCo Some Bare Metal machines Mostly Virtualised CosPlayUK

More information

Distributed Computing: PVM, MPI, and MOSIX. Multiple Processor Systems. Dr. Shaaban. Judd E.N. Jenne

Distributed Computing: PVM, MPI, and MOSIX. Multiple Processor Systems. Dr. Shaaban. Judd E.N. Jenne Distributed Computing: PVM, MPI, and MOSIX Multiple Processor Systems Dr. Shaaban Judd E.N. Jenne May 21, 1999 Abstract: Distributed computing is emerging as the preferred means of supporting parallel

More information

Optical Interconnection Networks in Data Centers: Recent Trends and Future Challenges

Optical Interconnection Networks in Data Centers: Recent Trends and Future Challenges Optical Interconnection Networks in Data Centers: Recent Trends and Future Challenges Speaker: Lin Wang Research Advisor: Biswanath Mukherjee Kachris C, Kanonakis K, Tomkos I. Optical interconnection networks

More information

Logistical Computing and Internetworking: Middleware for the Use of Storage in Communication

Logistical Computing and Internetworking: Middleware for the Use of Storage in Communication Logistical Computing and Internetworking: Middleware for the Use of Storage in Communication Micah Beck Dorian Arnold Alessandro Bassi Fran Berman Henri Casanova Jack Dongarra Terence Moore Graziano Obertelli

More information

Sami Saarinen Peter Towers. 11th ECMWF Workshop on the Use of HPC in Meteorology Slide 1

Sami Saarinen Peter Towers. 11th ECMWF Workshop on the Use of HPC in Meteorology Slide 1 Acknowledgements: Petra Kogel Sami Saarinen Peter Towers 11th ECMWF Workshop on the Use of HPC in Meteorology Slide 1 Motivation Opteron and P690+ clusters MPI communications IFS Forecast Model IFS 4D-Var

More information

GFS Overview. Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures

GFS Overview. Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures GFS Overview Design goals/priorities Design for big-data workloads Huge files, mostly appends, concurrency, huge bandwidth Design for failures Interface: non-posix New op: record appends (atomicity matters,

More information

Big Orange Bramble. August 09, 2016

Big Orange Bramble. August 09, 2016 Big Orange Bramble August 09, 2016 Overview HPL SPH PiBrot Numeric Integration Parallel Pi Monte Carlo FDS DANNA HPL High Performance Linpack is a benchmark for clusters Created here at the University

More information

EE/CSCI 451: Parallel and Distributed Computation

EE/CSCI 451: Parallel and Distributed Computation EE/CSCI 451: Parallel and Distributed Computation Lecture #8 2/7/2017 Xuehai Qian Xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 Outline From last class

More information

International Journal of Parallel Programming, Vol. 33, No. 2, June 2005 ( 2005) DOI: /s

International Journal of Parallel Programming, Vol. 33, No. 2, June 2005 ( 2005) DOI: /s International Journal of Parallel Programming, Vol. 33, No. 2, June 2005 ( 2005) DOI: 10.1007/s10766-005-3584-4 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 New Grid Scheduling and Rescheduling Methods in the

More information

TITLE: PRE-REQUISITE THEORY. 1. Introduction to Hadoop. 2. Cluster. Implement sort algorithm and run it using HADOOP

TITLE: PRE-REQUISITE THEORY. 1. Introduction to Hadoop. 2. Cluster. Implement sort algorithm and run it using HADOOP TITLE: Implement sort algorithm and run it using HADOOP PRE-REQUISITE Preliminary knowledge of clusters and overview of Hadoop and its basic functionality. THEORY 1. Introduction to Hadoop The Apache Hadoop

More information

NetSolve: past, present, and future; a look at a Grid enabled server 1

NetSolve: past, present, and future; a look at a Grid enabled server 1 24 NetSolve: past, present, and future; a look at a Grid enabled server 1 Sudesh Agrawal, Jack Dongarra, Keith Seymour, and Sathish Vadhiyar University of Tennessee, Tennessee, United States 24.1 INTRODUCTION

More information

On the Migration of the Scientific Code Dyana from SMPs to Clusters of PCs and on to the Grid

On the Migration of the Scientific Code Dyana from SMPs to Clusters of PCs and on to the Grid On the Migration of the Scientific Code Dyana from SMPs to Clusters of PCs and on to the Grid Michela Taufer Gérard Roos Thomas Stricker Peter Güntert Institute for Computer Systems Institute for Molecular

More information

Compiler Architecture for High-Performance Problem Solving

Compiler Architecture for High-Performance Problem Solving Compiler Architecture for High-Performance Problem Solving A Quest for High-Level Programming Systems Ken Kennedy Rice University http://www.cs.rice.edu/~ken/presentations/compilerarchitecture.pdf Context

More information

CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC. Guest Lecturer: Sukhyun Song (original slides by Alan Sussman)

CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC. Guest Lecturer: Sukhyun Song (original slides by Alan Sussman) CMSC 714 Lecture 6 MPI vs. OpenMP and OpenACC Guest Lecturer: Sukhyun Song (original slides by Alan Sussman) Parallel Programming with Message Passing and Directives 2 MPI + OpenMP Some applications can

More information

Virtual Grids. Today s Readings

Virtual Grids. Today s Readings Virtual Grids Last Time» Adaptation by Applications» What do you need to know? To do it well?» Grid Application Development Software (GrADS) Today» Virtual Grids» Virtual Grid Application Development Software

More information

An Exascale Programming, Multi objective Optimisation and Resilience Management Environment Based on Nested Recursive Parallelism.

An Exascale Programming, Multi objective Optimisation and Resilience Management Environment Based on Nested Recursive Parallelism. This project has received funding from the European Union s Horizon 2020 research and innovation programme under grant agreement No. 671603 An Exascale Programming, ulti objective Optimisation and Resilience

More information

Toward Improved Support for Loosely Coupled Large Scale Simulation Workflows. Swen Boehm Wael Elwasif Thomas Naughton, Geoffroy R.

Toward Improved Support for Loosely Coupled Large Scale Simulation Workflows. Swen Boehm Wael Elwasif Thomas Naughton, Geoffroy R. Toward Improved Support for Loosely Coupled Large Scale Simulation Workflows Swen Boehm Wael Elwasif Thomas Naughton, Geoffroy R. Vallee Motivation & Challenges Bigger machines (e.g., TITAN, upcoming Exascale

More information

Communication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems.

Communication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems. Cluster Networks Introduction Communication has significant impact on application performance. Interconnection networks therefore have a vital role in cluster systems. As usual, the driver is performance

More information

New research on Key Technologies of unstructured data cloud storage

New research on Key Technologies of unstructured data cloud storage 2017 International Conference on Computing, Communications and Automation(I3CA 2017) New research on Key Technologies of unstructured data cloud storage Songqi Peng, Rengkui Liua, *, Futian Wang State

More information

Parallel Discrete Event Simulation

Parallel Discrete Event Simulation IEEE/ACM DS RT 2016 September 2016 Parallel Discrete Event Simulation on Data Processing Engines Kazuyuki Shudo, Yuya Kato, Takahiro Sugino, Masatoshi Hanai Tokyo Institute of Technology Tokyo Tech Proposal:

More information

hgs/sip2001 Conferencing 1 SIP Conferencing

hgs/sip2001 Conferencing 1 SIP Conferencing hgs/sip2001 onferencing 1 SIP onferencing Henning Schulzrinne ept. of omputer Science olumbia University New York, New York (sip:)schulzrinne@cs.columbia.edu onference International SIP Paris, France February

More information

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono

Introduction to CUDA Algoritmi e Calcolo Parallelo. Daniele Loiacono Introduction to CUDA Algoritmi e Calcolo Parallelo References q This set of slides is mainly based on: " CUDA Technical Training, Dr. Antonino Tumeo, Pacific Northwest National Laboratory " Slide of Applied

More information

System Software for Big Data and Post Petascale Computing

System Software for Big Data and Post Petascale Computing The Japanese Extreme Big Data Workshop February 26, 2014 System Software for Big Data and Post Petascale Computing Osamu Tatebe University of Tsukuba I/O performance requirement for exascale applications

More information

CSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI.

CSCI 402: Computer Architectures. Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI. CSCI 402: Computer Architectures Parallel Processors (2) Fengguang Song Department of Computer & Information Science IUPUI 6.6 - End Today s Contents GPU Cluster and its network topology The Roofline performance

More information

Chapter 20: Database System Architectures

Chapter 20: Database System Architectures Chapter 20: Database System Architectures Chapter 20: Database System Architectures Centralized and Client-Server Systems Server System Architectures Parallel Systems Distributed Systems Network Types

More information

Lecture 9: MIMD Architectures

Lecture 9: MIMD Architectures Lecture 9: MIMD Architectures Introduction and classification Symmetric multiprocessors NUMA architecture Clusters Zebo Peng, IDA, LiTH 1 Introduction A set of general purpose processors is connected together.

More information

BlueGene/L. Computer Science, University of Warwick. Source: IBM

BlueGene/L. Computer Science, University of Warwick. Source: IBM BlueGene/L Source: IBM 1 BlueGene/L networking BlueGene system employs various network types. Central is the torus interconnection network: 3D torus with wrap-around. Each node connects to six neighbours

More information

Transparent Throughput Elas0city for IaaS Cloud Storage Using Guest- Side Block- Level Caching

Transparent Throughput Elas0city for IaaS Cloud Storage Using Guest- Side Block- Level Caching Transparent Throughput Elas0city for IaaS Cloud Storage Using Guest- Side Block- Level Caching Bogdan Nicolae (IBM Research, Ireland) Pierre Riteau (University of Chicago, USA) Kate Keahey (Argonne National

More information

The AppLeS Parameter Sweep Template: User-level middleware for the Grid 1

The AppLeS Parameter Sweep Template: User-level middleware for the Grid 1 111 The AppLeS Parameter Sweep Template: User-level middleware for the Grid 1 Henri Casanova a, Graziano Obertelli a, Francine Berman a, and Richard Wolski b a Computer Science and Engineering Department,

More information

Performance Characteristics of a Cost-Effective Medium-Sized Beowulf Cluster Supercomputer

Performance Characteristics of a Cost-Effective Medium-Sized Beowulf Cluster Supercomputer Performance Characteristics of a Cost-Effective Medium-Sized Beowulf Cluster Supercomputer Andre L.C. Barczak 1, Chris H. Messom 1, and Martin J. Johnson 1 Massey University, Institute of Information and

More information

Introduction to Distributed Systems. INF5040/9040 Autumn 2018 Lecturer: Eli Gjørven (ifi/uio)

Introduction to Distributed Systems. INF5040/9040 Autumn 2018 Lecturer: Eli Gjørven (ifi/uio) Introduction to Distributed Systems INF5040/9040 Autumn 2018 Lecturer: Eli Gjørven (ifi/uio) August 28, 2018 Outline Definition of a distributed system Goals of a distributed system Implications of distributed

More information

The Future of High-Performance Networking (The 5?, 10?, 15? Year Outlook)

The Future of High-Performance Networking (The 5?, 10?, 15? Year Outlook) Workshop on New Visions for Large-Scale Networks: Research & Applications Vienna, VA, USA, March 12-14, 2001 The Future of High-Performance Networking (The 5?, 10?, 15? Year Outlook) Wu-chun Feng feng@lanl.gov

More information

Using implicit fitness functions for genetic algorithm-based agent scheduling

Using implicit fitness functions for genetic algorithm-based agent scheduling Using implicit fitness functions for genetic algorithm-based agent scheduling Sankaran Prashanth, Daniel Andresen Department of Computing and Information Sciences Kansas State University Manhattan, KS

More information