Revision 1.1. Copyright 2011, XLsoft K.K. All rights reserved.
Transcription
1 Revision 1.1
2 Cluster Studio XE 2012
Compiler: C/C++, Fortran
Library: MKL; MPI: MPI; C++: TBB; IPP
Analyzer
3 Compiler commands
icc: C
icpc: C++
ifort: Fortran
MPI compiler wrappers: mpiicc, mpiicpc, mpiifort
mpicc: C
mpic++, mpicxx: C++
mpif77, mpif90: Fortran
<Install dir> example: <VTune Install dir> = /opt/intel/vtune_amplifier_xe/
4 Cluster Studio XE 2012
5 C++ and Fortran compilers
Profile Guided Optimization (PGO), Guided Auto Parallelization (GAP)
Sandy Bridge-EP
OpenMP 3.1
x86, x86_64
GCC (Windows: Visual Studio)
C/C++, Fortran
6 Fortran 12.1
Co-array Fortran, Fortran 2008 (ISO/IEC :2004)
Fortran IV/77/90/95/2003 (Fortran 2008)
ISO: ISO/IEC 1539:1991, ISO/IEC :1997, ISO/IEC 1539-1:2004
ANSI: ANSI X, ANSI X, ANSI X, ANSI X3J3/
ISO/IEC :2004
7 C and C++
C++11 (ISO/IEC 14882:2011), IEEE, (Cilk Plus)
C/C++: C99 (C++11)
ISO/IEC 9899:1990, ISO/IEC 9899:1999, ISO/IEC 14882:1998, ISO/IEC 14882:
bit long double
8 x86_64:
source <composer Install dir>/bin/compilervars.sh intel64
source <icc Install dir>/bin/iccvars.sh intel64
source <ifort Install dir>/bin/ifortvars.sh intel64
x86_64 MPI:
source <Intel MPI Install dir>/bin64/mpivars.sh
VTune Amplifier XE:
source <VTune Install dir>/amplxe-vars.sh
Trace Analyzer/Collector:
source <TA/TC Install dir>/bin/itacvars.sh
The composer script sets up both icc and ifort. For the C shell, use the .csh scripts instead of the .sh scripts.
9 C: icc [options] <input files>
C++: icpc [options] <input files>
Fortran: ifort [options] <input files>
MPI
C: mpiicc [options] <input files>
C++: mpiicpc [options] <input files>
Fortran: mpiifort [options] <input files>
Examples:
icc -O3 -xavx -ipo main.c func1.c func2.c obj_c.o
icpc main.cpp func1.cpp func2.cpp obj_cpp.o
ifort -O3 -xsse4.2 prog.f90 sub.f -o my_prog_name.out
obj_c.o and obj_cpp.o are object files built from .c and .cpp sources.
10 Options (Linux)
mpiicc -parallel <input files>: auto-parallelization
mpiicc -openmp <input files>: OpenMP
mpiicc -O3 <input files>: optimization
mpiicc -xavx <input files>: AVX (Sandy Bridge)
mpiicc -ipo <input files>: interprocedural optimization
mpiicc -vec_report3 <input files>: vectorization report
mpiicc -trace -g <input files>: Trace Analyzer/Collector
Example: mpiicc -O3 -xavx -ipo [-openmp] [-parallel] <input files>
The same options apply to icc and ifort, and to mpiicpc and mpiifort in place of mpiicc (-trace applies only to mpiicc, mpiicpc, and mpiifort).
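11 Options (Linux)
-O0: no optimization
-O1, -O2, -O3: optimization (3 levels)
-x, -ax, -m: target instruction set
-ipo: interprocedural optimization
-openmp: OpenMP
-parallel: auto-parallelization
-prof_gen, -prof_use: Profile Guided Optimization (PGO)
-guide: Guided Auto Parallelization (GAP)
-vec_report<number>: vectorization report (number: 1, 2, 3, 4, 5)
-par_report<number>: parallelization report (number: 1, 2, 3)
-fast: same as -ipo -O3 -no-prec-div -static -xhost
-fp-model <model>: floating-point model (model: strict, precise, fast)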
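12 Options (Linux)
-prec-div: improves the precision of floating-point division
-static-intel, -shared-intel: link the Intel libraries statically or dynamically (default: static)
-mkl[=lib]: link MKL (example: -mkl=cluster)
lib = parallel: threaded MKL (used when =lib is omitted)
lib = sequential: sequential MKL
lib = cluster: cluster MKL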
13 OpenMP* thread affinity
export KMP_AFFINITY=<type>
<type>: physical, compact, scatter
14 Composer XE
Interprocedural optimization (IPO)
Profile guided optimization (PGO)
4. Guided auto-parallelization (GAP)
5. VTune Amplifier XE
15 -O3: High-Level Optimization (HLO)
Related options: -x, -ax
Optimization levels: -O0, -O1, -O2, -O3
16 -ipo: Inter Procedural Optimization (IPO)
[Figure: file1.c, file2.c, file3.c, and file4.c compiled without and with IPO]
17 -parallel: Parallelizer
-par_report[n] (n: 0, 1, 2, 3)
int_sin.cpp(74): (col. 4) remark:
int_sin.cpp(92): (col. 6) remark:
-par-threshold[n]
int_sin.c(92): (col. 6) remark:
18 -m, -x, -ax: Vectorizer
SIMD: Single Instruction Multiple Data
for ( i=0; i<max; i++ ) c[i] = a[i] + b[i];
-no-vec: disables vectorization
-xavx: uses Intel AVX
Scalar: 1 instruction performs 1 addition (A + B), repeated MAX times.
SIMD: 1 instruction performs 8 additions, repeated MAX/8 times (AVX, float).
[Figure: A8 ... A1 + B8 ... B1 = C8 ... C1]
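As a rough illustration of the slide's loop and its MAX versus MAX/8 count, here is a minimal C sketch. The function names add_arrays and simd_iterations are assumptions made for this example and are not from the slides; whether a compiler actually vectorizes the loop depends on its options and on what it can prove about the pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar form of the slide's loop: one addition per iteration, max iterations.
   A vectorizing compiler may turn this into SIMD additions on its own. */
void add_arrays(const float *a, const float *b, float *c, size_t max)
{
    for (size_t i = 0; i < max; i++) {
        c[i] = a[i] + b[i];
    }
}

/* Number of vector additions needed when each instruction handles `lanes`
   elements (8 floats fit in one 256-bit AVX register). Rounds up so that a
   partial final group still counts as one operation. */
size_t simd_iterations(size_t max, size_t lanes)
{
    assert(lanes > 0);
    return (max + lanes - 1) / lanes;
}
```

Calling simd_iterations(max, 8) gives the MAX/8 figure from the slide whenever max is a multiple of 8.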
19 -m, -x, -ax
-x<code>: <code> = host, avx, sse4.2, SSE4.1, SSE3_ATOM, SSSE3, SSE3, SSE2
-ax<code>: <code> = avx, sse4.2, sse4.1, SSSE3, SSE3, SSE2
-m<code>: <code> = ia32, sse2, sse3, ssse3, sse4.1
Example -xavx: AVX, SSE4.2, SSE4.1, SSSE3, SSE3, SSE2, SSE
Example -axavx: AVX, SSE4.2, SSE4.1, SSSE3, SSE3, SSE2, SSE
Example -msse4.1: SSE4.1, SSSE3, SSE3, SSE2, SSE
20 -ax
Example: -axavx,sse4.2,sse3
Processor with AVX: AVX code path
Processor with SSE4.2: SSE4.2 code path
Processor with SSE4.1: SSE3 code path
Processor with SSE2: default code path
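21 -guide: Guided Auto Parallelization (GAP)
-guide-par, -guide-vec, -guide-data-trans
Example message:
gap_vec.c(8): remark #30536: (LOOP)
The remark suggests adding the -fargument-noalias option (which applies to the whole compilation) so that the loop at line 8 can be optimized. As an alternative, it suggests adding the "restrict" keyword to the pointer parameters of the routine "matrix_mul_matrix" so that the loop at line 8 can be vectorized. In both cases the programmer must verify that the "restrict" semantics actually hold.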
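The source of gap_vec.c is not shown on the slide, so the following is only a sketch of what applying the "restrict" advice could look like. The name matrix_mul_matrix comes from the GAP message; the signature, the row-major n-by-n layout, and the loop order are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical n-by-n routine; only the name comes from the GAP message.
   restrict promises that a, b and c never overlap, which is the guarantee
   the compiler was missing. The caller must make sure this is true. */
void matrix_mul_matrix(size_t n, const float *restrict a,
                       const float *restrict b, float *restrict c)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            c[i * n + j] = 0.0f;
        }
        for (size_t k = 0; k < n; k++) {
            const float aik = a[i * n + k];
            /* Unit-stride inner loop over j: the candidate for vectorization. */
            for (size_t j = 0; j < n; j++) {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
}
```

Passing the same array as both an input and the output would break the restrict promise, which is exactly the condition the remark asks the programmer to verify.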
22 -fp-model (fast, precise, strict)
Source: float t0, t1, t2; t0 = 4.0f + 0.1f + t1 + t2;
-fp-model fast: t0 = 4.1f + t1 + t2
-fp-model precise: t0 = (4.1f + t1) + t2
-fp-model strict: t0 = (((4.0f + 0.1f) + t1) + t2)
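The evaluation order matters because floating-point addition is not associative. The following C sketch shows two groupings of the same three float values producing different results. The function names and the sample values are assumptions chosen for this illustration and do not appear on the slide.

```c
#include <assert.h>

/* (a + b) + c: the left-to-right order that the value-safe models keep. */
float sum_left_first(float a, float b, float c)
{
    volatile float t = a + b;   /* volatile forces rounding to float here */
    return t + c;
}

/* a + (b + c): one reordering that a fast model is allowed to choose. */
float sum_right_first(float a, float b, float c)
{
    volatile float t = b + c;   /* 1.0f is lost when added to -1.0e8f */
    return a + t;
}

/* Self-check with sample values (an assumption, not from the slide). */
void check_reassociation(void)
{
    const float a = 1.0e8f, b = -1.0e8f, c = 1.0f;
    assert(sum_left_first(a, b, c) == 1.0f);
    assert(sum_right_first(a, b, c) == 0.0f);
}
```

Near 1.0e8 adjacent float values are 8 apart, so adding 1.0f to -1.0e8f rounds back to -1.0e8f and the second grouping loses the small term entirely.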