High Performance and Productivity Computing with Windows HPC. George Yan, Group Manager, Windows HPC, Microsoft China


1 High Performance and Productivity Computing with Windows HPC. George Yan, Group Manager, Windows HPC, Microsoft China

2 HPC at Microsoft
1997: NCSA deploys first Windows clusters on NT
Windows Server 2000 ships
2001: Microsoft Computational Clustering Preview kit and the book "Beowulf Cluster Computing with Windows" released
2002: Cornell Theory Center migrates to all-Windows infrastructure, eventually reaching over 600 nodes and 1,200 user accounts; first Top500 appearance
2003: Argonne National Labs releases MPICH on Windows

3 HPC at Microsoft
2004: Windows HPC team established in both Redmond and Shanghai
2005: Microsoft launches HPC entry at SC 05 in Seattle with Bill Gates keynote
2006: Windows Compute Cluster Server 2003 ships
2007: Microsoft named one of the Top 5 companies to watch in HPC at SC
Windows HPC Server 2008

4 (Chart of Windows cluster benchmark results; the core counts were lost in transcription.)
Spring 2008, NCSA, # cores, 68.5 TF, 77.7%
Spring 2008, Umea, # cores, 46 TF, 85.5%
Spring 2008, Aachen, # cores, 18.8 TF, 76.5%
Fall 2007, Microsoft, # cores, 11.8 TF, 77.1%
30% efficiency improvement
Spring 2007, Microsoft, # cores, 9 TF, 58.8%
Spring 2006, NCSA, # cores, 4.1 TF
Winter 2005, Microsoft, 4 procs, 9.46 GFlops
Chart labels: Windows HPC Server 2008, Windows Compute Cluster 2003
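The "30% efficiency improvement" label can be checked against the two Microsoft entries. This is a sketch under two assumptions the slide does not state: that the percentages are benchmark efficiency (measured performance as a fraction of peak), and that the label compares Spring 2007 with Fall 2007.

```latex
\[
\frac{77.1\%}{58.8\%} \approx 1.31
\quad\Rightarrow\quad \text{roughly a } 31\% \text{ relative improvement}
\]
\[
77.1\% - 58.8\% = 18.3 \text{ percentage points}
\]
```

Read as a relative change, the figure matches the label; read as an absolute change, it would be about 18 points.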

5 HPC Clusters in Every Lab X64 Server

6 Parallelism Everywhere
Today's Architecture: Heat becoming an unmanageable problem!
(Chart: Power Density (W/cm2) for processors from the 486 to Pentium processors, against reference levels: Hot Plate, Nuclear Reactor, Rocket Nozzle, Sun's Surface.)
To Grow, To Keep Up, We Must Embrace Parallel Computing
(Chart: GOPS over time. Many-core Peak Parallel GOPs versus Single Threaded Perf at 10% per year; Parallelism Opportunity 80X.)
"we see a very significant shift in what architectures will look like in the future... fundamentally the way we've begun to look at doing that is to move from instruction level concurrency to multiple cores per die. But we're going to continue to go beyond there. And that just won't be in our server lines in the future; this will permeate every architecture that we build. All will have massively multicore implementations."
Pat Gelsinger, Chief Technology Officer, Senior Vice President, Intel Corporation. Intel Developer Forum, Spring 2004, February 19, 2004

7 Today's Environment
Diagram labels: High Speed networking, Corporate Infrastructure, Clusters/Super Computers, Storage, Engineers, Scientists, Financial Analysts, Information workers, Specialized languages, Mainstream Technologies, Compilers, Debuggers

8 High Productivity Computing Combined Infrastructure Integrated Desktop and HPC Environment Unified Development Environment

9 Microsoft's Productivity Vision
Windows HPC allows you to accomplish more, in less time, with reduced effort by leveraging users' existing skills and integrating with the tools they are already using.
Administrator: Integrated Turnkey Solution; Simplified Setup and Deployment; Built-In Diagnostics; Efficient Cluster Utilization; Integrates with IT Infrastructure and Policies
Application Developer: Highly Productive Parallel Programming Frameworks; Service-Oriented HPC Applications; Support for Key HPC Development Standards; Unix Application Migration
End-User: Seamless Integration with Workstation Applications; Integrated Collaboration and Workflow Solutions; Secure Job Execution and Data Access; World-class Performance

10 Industry Focused Solutions Academia Aerospace Automotive Financial Services Geo Services Government Life Sciences

11 Windows HPC Server 2008
Systems Management: Rapid large scale deployment and built-in diagnostics suite; Integrated monitoring, management and reporting; Familiar UI and rich scripting interface
Job Scheduling: Integrated security via Active Directory; Support for batch, interactive and service-oriented applications; High availability scheduling; Interoperability via OGF's HPC Basic Profile
Storage: Access to SQL, Windows and Unix file servers; Key parallel file server vendor support (GPFS, Lustre, Panasas); In-memory caching options
MPI: MS-MPI stack based on MPICH2 reference implementation; Performance improvements for RDMA networking and multi-core shared memory; MS-MPI integrated with Windows Event Tracing

12 Ease of deployment

13 Ease of Deployment

14 Comprehensive Diagnostics Suite

15 Single Management Console

16 Integrated Monitoring

17 Built-in Reporting

18 Integrated Job Scheduling

19 Service Oriented HPC
(Diagram labels: Scheduler; Jobs; UDF; Results; Head Node: Job Mgmt, Cluster Mgmt, Scheduling, Resource Mgmt; Compute Node: Job Execution, User App, MPI.)

20 HPC SOA Programming Model
Sequential:
    for (i = 0; i < 100000000; i++)
    {
        r[i] = worker.DoWork(dataSet[i]);
    }
    reduce(r);
Parallel:
    Session session = new Session(startInfo);
    PricingClient client = new PricingClient(binding, session.EndpointAddress);
    for (i = 0; i < 100000000; i++)
    {
        client.BeginDoWork(dataSet[i], new AsyncCallback(callback), i);
    }
    void callback(IAsyncResult handle)
    {
        r = client.EndDoWork(handle);
        // aggregate results
        reduce(r);
    }
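The contrast on this slide, a blocking loop versus issuing every request up front and aggregating as results arrive, can be sketched in a self-contained form. This is a minimal illustration, not the HPC Server SOA API: it swaps the slide's Begin/End callback pair for Task-based calls, and the names FanOutSketch, DoWork, RunSequential and RunParallelAsync are hypothetical stand-ins for a remote service call and its callers.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class FanOutSketch
{
    // Hypothetical stand-in for the remote service operation (DoWork on the slide).
    static double DoWork(double x) => x * x;

    // Sequential version: each call completes before the next one starts.
    public static double RunSequential(double[] dataSet)
    {
        var r = new double[dataSet.Length];
        for (int i = 0; i < dataSet.Length; i++)
            r[i] = DoWork(dataSet[i]);
        return r.Sum(); // the "reduce" step
    }

    // Fan-out version: start every call without waiting, then aggregate
    // once all of them have completed.
    public static async Task<double> RunParallelAsync(double[] dataSet)
    {
        Task<double>[] calls = dataSet
            .Select(x => Task.Run(() => DoWork(x)))
            .ToArray();
        double[] r = await Task.WhenAll(calls); // results keep input order
        return r.Sum();
    }

    public static void Main()
    {
        double[] dataSet = Enumerable.Range(1, 1000).Select(i => (double)i).ToArray();
        double sequential = RunSequential(dataSet);
        double parallel = RunParallelAsync(dataSet).GetAwaiter().GetResult();
        Console.WriteLine(sequential == parallel ? "results match" : "results differ");
    }
}
```

The sketch aggregates only after every call has finished, whereas the slide's callback reduces each result as it arrives; the callback form avoids holding all results in memory at once.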

21 Placement via Job Context: node grouping, job templates, filters
Application Aware: An ISV application (requires nodes where the application is installed)
Capacity Aware: Multi-threaded application (requires machine with many cores); A big model (requires large memory machines)
NUMA Aware: 4-way Structural Analysis MPI Job
(Diagram labels: MATLAB; cores C0 to C3; processors P0 to P3; memory M; Quad-core; 32-core; IO.)

22 NetworkDirect
A new RDMA networking interface built for speed and stability
Verbs-based design for close fit with native, high-perf networking interfaces
Equal to Hardware-Optimized stacks for MPI micro-benchmarks
2 usec latency, 2 GB/sec bandwidth on ConnectX
OpenFabrics driver for Windows includes support for Network Direct, Winsock Direct and IPoIB protocols
(Diagram labels, TCP/Ethernet Networking path: Socket-Based App; Windows Sockets (Winsock + WSD); TCP; IP; NDIS; Mini-port Driver; WinSock Direct Provider; Hardware Driver; Networking Hardware.)
(Diagram labels, RDMA Networking path: MPI App; MS-MPI; NetworkDirect Provider; User Mode Access Layer; Kernel By-Pass; Networking Hardware.)
(Legend: User Mode; Kernel Mode; (ISV) App; CCP Component; OS Component; IHV Component.)
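To put the quoted 2 usec latency and 2 GB/sec bandwidth in context, a common first-order model treats message transfer time as latency plus size over bandwidth. The model, the symbols, the choice of 1 GB = 10^9 bytes and the 1 MB message size are assumptions made here for illustration; the slide gives only the two headline numbers.

```latex
\[
T(n) \approx \alpha + \frac{n}{\beta},
\qquad \alpha = 2\,\mu\text{s},
\quad \beta = 2 \times 10^{9}\ \text{bytes/s}
\]
\[
n_{1/2} = \alpha\,\beta = (2 \times 10^{-6}\,\text{s})(2 \times 10^{9}\ \text{bytes/s}) = 4000\ \text{bytes}
\]
\[
T(10^{6}\ \text{bytes}) \approx 2\,\mu\text{s} + \frac{10^{6}}{2 \times 10^{9}}\,\text{s} = 502\,\mu\text{s}
\]
```

Under this model, messages well below about 4 KB are dominated by latency and larger ones by bandwidth.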

23 Partnering for Performance
Networking Hardware vendors: NetworkDirect design review; NetworkDirect & WinsockDirect provider development
Windows Core Networking Team
Commercial Software Vendors: Win64 best practices; MPI usage patterns; Collaborative performance tuning
4 benchmarking centers online: IBM, HP, Dell, SGI
Now working with Cray!

24 Devs can't tune what they can't see
MS-MPI integrated with Event Tracing for Windows
Single, time-correlated log of: OS, driver, MPI, and app events
CCS-specific additions: High-precision CPU clock correction; Log consolidation from multiple compute nodes into a single record of parallel app execution
Dual purpose: Performance Analysis; Application Trouble-Shooting
Trace Data Display: Visual Studio & Windows ETW tools; Intel Collector/Analyzer; Vampir; Jumpshot

25 HPC Storage Solutions
(Chart: Aggregate (Mb/s/core) against Number of cores in cluster, for Windows Server 2003 and Windows Server 2008.)
IBM GPFS; Panasas ActiveScale; HP PolyServe; Sun Lustre; Ibrix Fusion; Quantum StorNext; SANbolic Melio file system

26 Unix Application Porting
Windows Subsystem for Unix applications
Complete SVR-5 and BSD UNIX environment with 300 commands, utilities, shell scripts, compilers
Visual Studio extensions for debugging POSIX applications
Support for 32 and 64-bit applications
Recent port of WRF weather model
350K lines, Fortran 90 and C using MPI, OpenMP
Traditionally developed for Unix HPC systems
Two dynamical cores, full range of physics options
Porting experience
Fewer than 750 lines of code changed
Changes in only several hundred lines of code, primarily in the build mechanism (Makefiles, scripts)
Level of effort and nature of tasks not unlike porting to any new version of UNIX
Performance on par with the Linux systems

27 F# is... a functional, object-oriented, imperative and explorative programming language for .NET

28 Example: Taming Asynchronous I/O
Processing 200 images in parallel

    using System;
    using System.IO;
    using System.Threading;

    public class BulkImageProcAsync
    {
        public const String ImageBaseName = "tmpImage-";
        public const int numImages = 200;
        public const int numPixels = 512 * 512;

        // ProcessImage has a simple O(N) loop, and you can vary the number
        // of times you repeat that loop to make the application more CPU-
        // bound or more IO-bound.
        public static int processImageRepeats = 20;

        // Threads must decrement NumImagesToFinish, and protect
        // their access to it through a mutex.
        public static int NumImagesToFinish = numImages;
        public static Object[] NumImagesMutex = new Object[0];
        // WaitObject is signalled when all image processing is done.
        public static Object[] WaitObject = new Object[0];

        public class ImageStateObject
        {
            public byte[] pixels;
            public int imageNum;
            public FileStream fs;
        }

        public static void ReadInImageCallback(IAsyncResult asyncResult)
        {
            ImageStateObject state = (ImageStateObject)asyncResult.AsyncState;
            Stream stream = state.fs;
            int bytesRead = stream.EndRead(asyncResult);
            if (bytesRead != numPixels)
                throw new Exception(String.Format
                    ("In ReadInImageCallback, got the wrong number of " +
                    "bytes from the image: {0}.", bytesRead));
            ProcessImage(state.pixels, state.imageNum);
            stream.Close();

            // Now write out the image.
            // Using asynchronous I/O here appears not to be best practice.
            // It ends up swamping the threadpool, because the threadpool
            // threads are blocked on I/O requests that were just queued to
            // the threadpool.
            FileStream fs = new FileStream(ImageBaseName + state.imageNum +
                ".done", FileMode.Create, FileAccess.Write, FileShare.None,
                4096, false);
            fs.Write(state.pixels, 0, numPixels);
            fs.Close();

            // This application model uses too much memory.
            // Releasing memory as soon as possible is a good idea,
            // especially global state.
            state.pixels = null;
            fs = null;
            // Record that an image is finished now.
            lock (NumImagesMutex)
            {
                NumImagesToFinish--;
                if (NumImagesToFinish == 0)
                {
                    Monitor.Enter(WaitObject);
                    Monitor.Pulse(WaitObject);
                    Monitor.Exit(WaitObject);
                }
            }
        }

        public static void ProcessImagesInBulk()
        {
            Console.WriteLine("Processing images... ");
            long t0 = Environment.TickCount;
            NumImagesToFinish = numImages;
            AsyncCallback readImageCallback = new
                AsyncCallback(ReadInImageCallback);
            for (int i = 0; i < numImages; i++)
            {
                ImageStateObject state = new ImageStateObject();
                state.pixels = new byte[numPixels];
                state.imageNum = i;
                // Very large items are read only once, so you can make the
                // buffer on the FileStream very small to save memory.
                FileStream fs = new FileStream(ImageBaseName + i + ".tmp",
                    FileMode.Open, FileAccess.Read, FileShare.Read, 1, true);
                state.fs = fs;
                fs.BeginRead(state.pixels, 0, numPixels, readImageCallback,
                    state);
            }

            // Determine whether all images are done being processed.
            // If not, block until all are finished.
            bool mustBlock = false;
            lock (NumImagesMutex)
            {
                if (NumImagesToFinish > 0)
                    mustBlock = true;
            }
            if (mustBlock)
            {
                Console.WriteLine("All worker threads are queued. " +
                    " Blocking until they complete. numLeft: {0}",
                    NumImagesToFinish);
                Monitor.Enter(WaitObject);
                Monitor.Wait(WaitObject);
                Monitor.Exit(WaitObject);
            }
            long t1 = Environment.TickCount;
            Console.WriteLine("Total time processing images: {0}ms",
                (t1 - t0));
        }
    }

29 Example: Taming Asynchronous I/O
Equivalent F# code (same perf)

    let ProcessImageAsync(i) =
        // This object coordinates
        async {
            // Open the file, synchronously
            let inStream = File.OpenRead(sprintf "source%d.jpg" i)
            // Read from the file, asynchronously
            let! pixels = inStream.ReadAsync(numPixels)
            let pixels' = TransformImage(pixels, i)
            let outStream = File.OpenWrite(sprintf "result%d.jpg" i)
            // Write the result, asynchronously
            do! outStream.WriteAsync(pixels')
            do Console.WriteLine "done!"
        }

    let ProcessImagesAsync() =
        // Generate the tasks and queue them in parallel
        Async.Run (Async.Parallel
                     [ for i in 1 .. numImages -> ProcessImageAsync(i) ])

! = asynchronous

30 Microsoft HPC++ Experience
Application Benefits: The most productive distributed application development environment
Cluster Benefits: Complete HPC cluster platform integrated with the enterprise infrastructure
System Benefits: Cost-effective, reliable and high performance server operating system

31 Resources research.microsoft.com/fsharp

32 Thank you!


Improve Web Application Performance with Zend Platform Improve Web Application Performance with Zend Platform Shahar Evron Zend Sr. PHP Specialist Copyright 2007, Zend Technologies Inc. Agenda Benchmark Setup Comprehensive Performance Multilayered Caching

More information

OpenMP 4.0: A Significant Paradigm Shift in Parallelism

OpenMP 4.0: A Significant Paradigm Shift in Parallelism OpenMP 4.0: A Significant Paradigm Shift in Parallelism Michael Wong OpenMP CEO michaelw@ca.ibm.com http://bit.ly/sc13-eval SC13 OpenMP 4.0 released 2 Agenda The OpenMP ARB History of OpenMP OpenMP 4.0

More information

Administration. Course material. Prerequisites. CS 395T: Topics in Multicore Programming. Instructors: TA: Course in computer architecture

Administration. Course material. Prerequisites. CS 395T: Topics in Multicore Programming. Instructors: TA: Course in computer architecture CS 395T: Topics in Multicore Programming Administration Instructors: Keshav Pingali (CS,ICES) 4.26A ACES Email: pingali@cs.utexas.edu TA: Xin Sui Email: xin@cs.utexas.edu University of Texas, Austin Fall

More information

Box s 1 minute Bio l B. Eng (AE 1983): Khon Kean University

Box s 1 minute Bio l B. Eng (AE 1983): Khon Kean University CSC469/585: Winter 2011-12 High Availability and Performance Computing: Towards non-stop services in HPC/HEC/Enterprise IT Environments Chokchai (Box) Leangsuksun, Associate Professor, Computer Science

More information

Windows Compute Cluster Server 2003 allows MATLAB users to quickly and easily get up and running with distributed computing tools.

Windows Compute Cluster Server 2003 allows MATLAB users to quickly and easily get up and running with distributed computing tools. Microsoft Windows Compute Cluster Server 2003 Partner Solution Brief Image courtesy of The MathWorks Technical Computing Tools Combined with Cluster Computing Deliver High-Performance Solutions Microsoft

More information

Activator Library. Focus on maximizing the value of your data, gain business insights, increase your team s productivity, and achieve success.

Activator Library. Focus on maximizing the value of your data, gain business insights, increase your team s productivity, and achieve success. Focus on maximizing the value of your data, gain business insights, increase your team s productivity, and achieve success. ACTIVATORS Designed to give your team assistance when you need it most without

More information

PROGRAMOVÁNÍ V C++ CVIČENÍ. Michal Brabec

PROGRAMOVÁNÍ V C++ CVIČENÍ. Michal Brabec PROGRAMOVÁNÍ V C++ CVIČENÍ Michal Brabec PARALLELISM CATEGORIES CPU? SSE Multiprocessor SIMT - GPU 2 / 17 PARALLELISM V C++ Weak support in the language itself, powerful libraries Many different parallelization

More information

Oracle Enterprise Manager 12c IBM DB2 Database Plug-in

Oracle Enterprise Manager 12c IBM DB2 Database Plug-in Oracle Enterprise Manager 12c IBM DB2 Database Plug-in May 2015 Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and

More information

Future Routing Schemes in Petascale clusters

Future Routing Schemes in Petascale clusters Future Routing Schemes in Petascale clusters Gilad Shainer, Mellanox, USA Ola Torudbakken, Sun Microsystems, Norway Richard Graham, Oak Ridge National Laboratory, USA Birds of a Feather Presentation Abstract

More information

Innovative Alternate Architecture for Exascale Computing. Surya Hotha Director, Product Marketing

Innovative Alternate Architecture for Exascale Computing. Surya Hotha Director, Product Marketing Innovative Alternate Architecture for Exascale Computing Surya Hotha Director, Product Marketing Cavium Corporate Overview Enterprise Mobile Infrastructure Data Center and Cloud Service Provider Cloud

More information

Update of Post-K Development Yutaka Ishikawa RIKEN AICS

Update of Post-K Development Yutaka Ishikawa RIKEN AICS Update of Post-K Development Yutaka Ishikawa RIKEN AICS 11:20AM 11:40AM, 2 nd of November, 2017 FLAGSHIP2020 Project Missions Building the Japanese national flagship supercomputer, post K, and Developing

More information

Industrial system integration experts with combined 100+ years of experience in software development, integration and large project execution

Industrial system integration experts with combined 100+ years of experience in software development, integration and large project execution PRESENTATION Who we are Industrial system integration experts with combined 100+ years of experience in software development, integration and large project execution Background of Matrikon & Honeywell

More information

High Performance Computing Software Development Kit For Mac OS X In Depth Product Information

High Performance Computing Software Development Kit For Mac OS X In Depth Product Information High Performance Computing Software Development Kit For Mac OS X In Depth Product Information 2781 Bond Street Rochester Hills, MI 48309 U.S.A. Tel (248) 853-0095 Fax (248) 853-0108 support@absoft.com

More information

InsightConnector Version 1.0

InsightConnector Version 1.0 InsightConnector Version 1.0 2002 Bynari Inc. All Rights Reserved Table of Contents Table of Contents... 2 Executive Summary... 3 Examination of the Insight Messaging Solution... 3 Exchange or Outlook?...

More information

HPC Architectures. Types of resource currently in use

HPC Architectures. Types of resource currently in use HPC Architectures Types of resource currently in use Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International License. http://creativecommons.org/licenses/by-nc-sa/4.0/deed.en_us

More information

IBM Scale Out Network Attached Storage (SONAS) using the Acuo Universal Clinical Platform

IBM Scale Out Network Attached Storage (SONAS) using the Acuo Universal Clinical Platform IBM Scale Out Network Attached Storage (SONAS) using the Acuo Universal Clinical Platform A vendor-neutral medical-archive offering Dave Curzio IBM Systems and Technology Group ISV Enablement February

More information

Advanced Threading and Optimization

Advanced Threading and Optimization Mikko Byckling, CSC Michael Klemm, Intel Advanced Threading and Optimization February 24-26, 2015 PRACE Advanced Training Centre CSC IT Center for Science Ltd, Finland!$omp parallel do collapse(3) do p4=1,p4d

More information

BUILD BETTER MICROSOFT SQL SERVER SOLUTIONS Sales Conversation Card

BUILD BETTER MICROSOFT SQL SERVER SOLUTIONS Sales Conversation Card OVERVIEW SALES OPPORTUNITY Lenovo Database Solutions for Microsoft SQL Server bring together the right mix of hardware infrastructure, software, and services to optimize a wide range of data warehouse

More information

User Manual. Admin Report Kit for IIS 7 (ARKIIS)

User Manual. Admin Report Kit for IIS 7 (ARKIIS) User Manual Admin Report Kit for IIS 7 (ARKIIS) Table of Contents 1 Admin Report Kit for IIS 7... 1 1.1 About ARKIIS... 1 1.2 Who can Use ARKIIS?... 1 1.3 System requirements... 2 1.4 Technical Support...

More information

Scaling to Petaflop. Ola Torudbakken Distinguished Engineer. Sun Microsystems, Inc

Scaling to Petaflop. Ola Torudbakken Distinguished Engineer. Sun Microsystems, Inc Scaling to Petaflop Ola Torudbakken Distinguished Engineer Sun Microsystems, Inc HPC Market growth is strong CAGR increased from 9.2% (2006) to 15.5% (2007) Market in 2007 doubled from 2003 (Source: IDC

More information

7 DAYS AND 8 NIGHTS WITH THE CARMA DEV KIT

7 DAYS AND 8 NIGHTS WITH THE CARMA DEV KIT 7 DAYS AND 8 NIGHTS WITH THE CARMA DEV KIT Draft Printed for SECO Murex S.A.S 2012 all rights reserved Murex Analytics Only global vendor of trading, risk management and processing systems focusing also

More information

MM5 Modeling System Performance Research and Profiling. March 2009

MM5 Modeling System Performance Research and Profiling. March 2009 MM5 Modeling System Performance Research and Profiling March 2009 Note The following research was performed under the HPC Advisory Council activities AMD, Dell, Mellanox HPC Advisory Council Cluster Center

More information

Hardware and Software solutions for scaling highly threaded processors. Denis Sheahan Distinguished Engineer Sun Microsystems Inc.

Hardware and Software solutions for scaling highly threaded processors. Denis Sheahan Distinguished Engineer Sun Microsystems Inc. Hardware and Software solutions for scaling highly threaded processors Denis Sheahan Distinguished Engineer Sun Microsystems Inc. Agenda Chip Multi-threaded concepts Lessons learned from 6 years of CMT

More information

Linux multi-core scalability

Linux multi-core scalability Linux multi-core scalability Oct 2009 Andi Kleen Intel Corporation andi@firstfloor.org Overview Scalability theory Linux history Some common scalability trouble-spots Application workarounds Motivation

More information

OPEN MPI WITH RDMA SUPPORT AND CUDA. Rolf vandevaart, NVIDIA

OPEN MPI WITH RDMA SUPPORT AND CUDA. Rolf vandevaart, NVIDIA OPEN MPI WITH RDMA SUPPORT AND CUDA Rolf vandevaart, NVIDIA OVERVIEW What is CUDA-aware History of CUDA-aware support in Open MPI GPU Direct RDMA support Tuning parameters Application example Future work

More information

HETEROGENEOUS SYSTEM ARCHITECTURE: PLATFORM FOR THE FUTURE

HETEROGENEOUS SYSTEM ARCHITECTURE: PLATFORM FOR THE FUTURE HETEROGENEOUS SYSTEM ARCHITECTURE: PLATFORM FOR THE FUTURE Haibo Xie, Ph.D. Chief HSA Evangelist AMD China OUTLINE: The Challenges with Computing Today Introducing Heterogeneous System Architecture (HSA)

More information

Multi-core Programming - Introduction

Multi-core Programming - Introduction Multi-core Programming - Introduction Based on slides from Intel Software College and Multi-Core Programming increasing performance through software multi-threading by Shameem Akhter and Jason Roberts,

More information

The Arm Technology Ecosystem: Current Products and Future Outlook

The Arm Technology Ecosystem: Current Products and Future Outlook The Arm Technology Ecosystem: Current Products and Future Outlook Dan Ernst, PhD Advanced Technology Cray, Inc. Why is an Ecosystem Important? An Ecosystem is a collection of common material Developed

More information

System input-output, performance aspects March 2009 Guy Chesnot

System input-output, performance aspects March 2009 Guy Chesnot Headline in Arial Bold 30pt System input-output, performance aspects March 2009 Guy Chesnot Agenda Data sharing Evolution & current tendencies Performances: obstacles Performances: some results and good

More information

Advanced Computer Networks. End Host Optimization

Advanced Computer Networks. End Host Optimization Oriana Riva, Department of Computer Science ETH Zürich 263 3501 00 End Host Optimization Patrick Stuedi Spring Semester 2017 1 Today End-host optimizations: NUMA-aware networking Kernel-bypass Remote Direct

More information

Oracle Developer Studio 12.6

Oracle Developer Studio 12.6 Oracle Developer Studio 12.6 Oracle Developer Studio is the #1 development environment for building C, C++, Fortran and Java applications for Oracle Solaris and Linux operating systems running on premises

More information

Architecting Storage for Semiconductor Design: Manufacturing Preparation

Architecting Storage for Semiconductor Design: Manufacturing Preparation White Paper Architecting Storage for Semiconductor Design: Manufacturing Preparation March 2012 WP-7157 EXECUTIVE SUMMARY The manufacturing preparation phase of semiconductor design especially mask data

More information

An Introduction to GPFS

An Introduction to GPFS IBM High Performance Computing July 2006 An Introduction to GPFS gpfsintro072506.doc Page 2 Contents Overview 2 What is GPFS? 3 The file system 3 Application interfaces 4 Performance and scalability 4

More information

Windows Azure Solutions with Microsoft Visual Studio 2010

Windows Azure Solutions with Microsoft Visual Studio 2010 Windows Azure Solutions with Microsoft Visual Studio 2010 Course No. 50466 3 Days Instructor-led, Hands-on Introduction This class is an introduction to cloud computing and specifically Microsoft's public

More information

MCTS Guide to Microsoft Windows Server 2008 Applications Infrastructure Configuration (Exam # ) Chapter One Introducing Windows Server 2008

MCTS Guide to Microsoft Windows Server 2008 Applications Infrastructure Configuration (Exam # ) Chapter One Introducing Windows Server 2008 MCTS Guide to Microsoft Windows Server 2008 Applications Infrastructure Configuration (Exam # 70-643) Chapter One Introducing Windows Server 2008 Objectives Distinguish among the different Windows Server

More information

ECMWF Workshop on High Performance Computing in Meteorology. 3 rd November Dean Stewart

ECMWF Workshop on High Performance Computing in Meteorology. 3 rd November Dean Stewart ECMWF Workshop on High Performance Computing in Meteorology 3 rd November 2010 Dean Stewart Agenda Company Overview Rogue Wave Product Overview IMSL Fortran TotalView Debugger Acumem ThreadSpotter 1 Copyright

More information

Reducing Network Contention with Mixed Workloads on Modern Multicore Clusters

Reducing Network Contention with Mixed Workloads on Modern Multicore Clusters Reducing Network Contention with Mixed Workloads on Modern Multicore Clusters Matthew Koop 1 Miao Luo D. K. Panda matthew.koop@nasa.gov {luom, panda}@cse.ohio-state.edu 1 NASA Center for Computational

More information

Understanding the latent value in all content

Understanding the latent value in all content Understanding the latent value in all content John F. Kennedy (JFK) November 22, 1963 INGEST ENRICH EXPLORE Cognitive skills Data in any format, any Azure store Search Annotations Data Cloud Intelligence

More information

A Road Map to the Future of Linux in the Enterprise. Timothy D. Witham Lab Director Open Source Development Lab

A Road Map to the Future of Linux in the Enterprise. Timothy D. Witham Lab Director Open Source Development Lab A Road Map to the Future of Linux in the Enterprise Timothy D. Witham Lab Director Open Source Development Lab 1 Agenda Introduction Why Linux Current Linux Uses Roadmap for the Future Process 2 Open Source

More information

High-Performance Lustre with Maximum Data Assurance

High-Performance Lustre with Maximum Data Assurance High-Performance Lustre with Maximum Data Assurance Silicon Graphics International Corp. 900 North McCarthy Blvd. Milpitas, CA 95035 Disclaimer and Copyright Notice The information presented here is meant

More information

AMD ACCELERATING TECHNOLOGIES FOR EXASCALE COMPUTING FELLOW 3 OCTOBER 2016

AMD ACCELERATING TECHNOLOGIES FOR EXASCALE COMPUTING FELLOW 3 OCTOBER 2016 AMD ACCELERATING TECHNOLOGIES FOR EXASCALE COMPUTING BILL.BRANTLEY@AMD.COM, FELLOW 3 OCTOBER 2016 AMD S VISION FOR EXASCALE COMPUTING EMBRACING HETEROGENEITY CHAMPIONING OPEN SOLUTIONS ENABLING LEADERSHIP

More information

Scalability issues : HPC Applications & Performance Tools

Scalability issues : HPC Applications & Performance Tools High Performance Computing Systems and Technology Group Scalability issues : HPC Applications & Performance Tools Chiranjib Sur HPC @ India Systems and Technology Lab chiranjib.sur@in.ibm.com Top 500 :

More information