Intel Xeon Phi™ Coprocessor Architecture and Tools
Intel Xeon Phi™ Coprocessor Architecture and Tools
The Guide for Application Developers
Rezaur Rahman
Intel Xeon Phi Coprocessor Architecture and Tools: The Guide for Application Developers
Rezaur Rahman
Copyright 2013 by Apress Media, LLC, all rights reserved

ApressOpen Rights: You have the right to copy, use and distribute this Work in its entirety, electronically without modification, for non-commercial purposes only. However, you have the additional right to use or alter any source code in this Work for any commercial or non-commercial purpose which must be accompanied by the licenses in (2) and (3) below to distribute the source code for instances of greater than 5 lines of code. Licenses (1), (2) and (3) below and the intervening text must be provided in any use of the text of the Work and fully describes the license granted herein to the Work.

(1) License for Distribution of the Work: This Work is copyrighted by Apress Media, LLC, all rights reserved. Use of this Work other than as provided for in this license is prohibited. By exercising any of the rights herein, you are accepting the terms of this license. You have the non-exclusive right to copy, use and distribute this English language Work in its entirety, electronically without modification except for those modifications necessary for formatting on specific devices, for all non-commercial purposes, in all media and formats known now or hereafter. While the advice and information in this Work are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

If your distribution is solely Apress source code or uses Apress source code intact, the following licenses (2) and (3) must accompany the source code. If your use is an adaptation of the source code provided by Apress in this Work, then you must use only license (3).

(2) License for Direct Reproduction of Apress Source Code: This source code, from TouchDevelop: Programming on the Go, ISBN is copyrighted by Apress Media, LLC, all rights reserved. Any direct reproduction of this Apress source code is permitted but must contain this license. The following license must be provided for any use of the source code from this product of greater than 5 lines wherein the code is adapted or altered from its original Apress form. This Apress code is presented AS IS and Apress makes no claims to, representations or warrantees as to the function, usability, accuracy or usefulness of this code.

(3) License for Distribution of Adaptation of Apress Source Code: Portions of the source code provided are used or adapted from TouchDevelop: Programming on the Go, ISBN copyright Apress Media LLC. Any use or reuse of this Apress source code must contain this License. This Apress code is made available at Apress.com/ as is and Apress makes no claims to, representations or warrantees as to the function, usability, accuracy or usefulness of this code.

ISBN-13 (pbk):
ISBN-13 (electronic):

Trademarked names, logos, and images may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, logo, or image we use the names, logos, and images only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

President and Publisher: Paul Manning
Lead Editors: Jeffrey Pepper (Apress); Patrick Hauke (Intel)
Development Editor: Robert Hutchinson
Coordinating Editor: Anamika Panchoo
Cover Designer: Anna Ishchenko

Distributed to the book trade worldwide by Springer Science+Business Media New York, 233 Spring Street, 6th Floor, New York, NY Phone SPRINGER, fax (201) , orders-ny@springer-sbm.com, or visit For information on translations, please rights@apress.com, or visit
About ApressOpen

What Is ApressOpen?
- ApressOpen is an open access book program that publishes high-quality technical and business information.
- ApressOpen ebooks are available for global, free, noncommercial use.
- ApressOpen ebooks are available in PDF, epub, and Mobi formats.
- The user-friendly ApressOpen free ebook license is presented on the copyright page of this book.
To my mother, who has always been so proud of me and wanted for me to be my best, and to my mother-in-law, who has eagerly been waiting for this book to be published.
Contents at a Glance

- About the Author
- About the Technical Reviewer
- Acknowledgments
- Introduction
- Part 1: Hardware Foundation: Intel Xeon Phi Architecture
  - Chapter 1: Introduction to Xeon Phi Architecture
  - Chapter 2: Programming Xeon Phi
  - Chapter 3: Xeon Phi Vector Architecture and Instruction Set
  - Chapter 4: Xeon Phi Core Microarchitecture
  - Chapter 5: Xeon Phi Cache and Memory Subsystem
  - Chapter 6: Xeon Phi PCIe Bus Data Transfer and Power Management
- Part 2: Software Foundation: Intel Xeon Phi System Software and Tools
  - Chapter 7: Xeon Phi System Software
  - Chapter 8: Xeon Phi Application Development Tools
- Part 3: Applications: Technical Computing Software Development on Intel Xeon Phi
  - Chapter 9: Xeon Phi Application Design and Implementation Considerations
  - Chapter 10: Application Performance Tuning on Xeon Phi
  - Chapter 11: Algorithm and Data Structures for Xeon Phi
  - Chapter 12: Xeon Phi Application Development on Windows OS
- Appendix A: OpenCL on Xeon Phi
- Appendix B: Virtual Shared Memory Programming on Xeon Phi
- Index
Contents

- About the Author
- About the Technical Reviewer
- Acknowledgments
- Introduction

Part 1: Hardware Foundation: Intel Xeon Phi Architecture

- Chapter 1: Introduction to Xeon Phi Architecture
  - History of Intel Xeon Phi Development
  - Evolution from Von Neumann Architecture to Cache Subsystem Architecture
  - Improvements in the Core and Memory
  - Interconnect and Cache Improvements
  - Intel Xeon Phi Coprocessor Chip Architecture
  - Applicability of the Intel Xeon Phi Coprocessor
  - Summary
- Chapter 2: Programming Xeon Phi
  - Intel Xeon Phi Execution Models
  - Development Tools for Intel Xeon Phi Architecture
  - Intel Composer XE
  - Setting Up an Intel Xeon Phi System
  - Install the MPSS Stack
  - Install the Development Tools
  - Code Generation for Intel Xeon Phi Architecture
  - Native Execution Mode
  - Language Extensions to Support Offload Computation on Intel Xeon Phi
  - Heterogeneous Computing Model and Offload Pragmas
  - Language Extensions and Execution Model
  - Runtime Library Routines
  - Offload Example
  - Summary
- Chapter 3: Xeon Phi Vector Architecture and Instruction Set
  - Xeon Phi Vector Microarchitecture
  - The VPU Pipeline
  - Vector Registers
  - Vector Mask Registers
  - Extended Math Unit
  - Xeon Phi Vector Instruction Set Architecture
  - Data Types
  - Vector Nomenclature
  - Vector Instruction Syntax
  - Xeon Phi Vector ISA by Categories
  - Summary
- Chapter 4: Xeon Phi Core Microarchitecture
  - Intel Xeon Phi Cores
  - Core Pipeline Stages
  - Cache and TLB Structure
  - L2 Cache Structure
  - Multithreading
  - Performance Considerations
  - Probing the Core
  - Summary
- Chapter 5: Xeon Phi Cache and Memory Subsystem
  - The Interconnect Topologies for Manycore Processors
  - Bidirectional Ring Topology
  - Two-Dimensional Mesh Topology
  - Two-Dimensional Torus Topology
  - Other Topologies
  - The Ring Interconnect Architecture in Intel Xeon Phi
  - L2 Cache
  - Tag Directory
  - Data Transactions
  - The Cache Coherency Protocol
  - Hardware Prefetcher
  - Memory Transactions Flow
  - Cacheable Memory Read Transaction
  - Managing Cache Hierarchy in Software
  - Probing the Memory Subsystem
  - Measuring the Memory Bandwidth on Intel Xeon Phi
  - Summary
- Chapter 6: Xeon Phi PCIe Bus Data Transfer and Power Management
  - DMA Engine
  - Measuring the Data Transfer Bandwidth over the PCIe Bus
  - Reading Data from the Coprocessor
  - Low-Level Data Transfer APIs for Intel Xeon Phi
  - Placement of PCIe Cards for Optimal Data Transfer BW
  - Power Management and Reliability
  - Idle State Management
  - Reliability Availability and Serviceability Features in the Intel Xeon Phi Coprocessor
  - Summary

Part 2: Software Foundation: Intel Xeon Phi System Software and Tools

- Chapter 7: Xeon Phi System Software
  - System Software Component
  - Ring 0 Driver Layer Components of the MPSS
  - System Boot Process
  - Coprocessor OS
  - Creating a Third-Party Coprocessor OS
  - mic0: Transition from State Booting to Online
  - Host Driver
  - Linux Virtual File System (Sysfs and Procfs)
  - Networking on Xeon Phi
  - Network File System
  - Open Fabrics Enterprise Distribution and Message Passing Interface Support
  - System Software Application Components
  - Summary
- Chapter 8: Xeon Phi Application Development Tools
  - The Application Development Tools
  - Intel C/C++ Composer XE
  - OpenMP 4.0 and Language Extensions
  - Pragmas
  - Asynchronous Data Transfer Over PCI Express
  - Keywords
  - Using Shared Virtual Memory
  - Valid Use of the Keywords
  - Macros
  - Intrinsics
  - C++ Class Libraries
  - Application Programming Interfaces
  - Environment Variables
  - Compiler Options
  - Creating Offload Libraries
  - Intel Fortran Composer XE
  - Directives
  - Macros
  - Application Programming Interfaces
  - Environment Variables, Compiler Options, and Creating Static Libraries
  - Third-Party Compilers Supporting Xeon Phi
  - CAPS Compiler
  - Debugging Xeon Phi Applications
  - Intel Debugger
  - Third-Party Debuggers
  - Optimization Tool: Intel VTune Amplifier XE
  - Libraries
  - Native or Symmetric Execution
  - Compiler-Assisted Offload
  - Using the Automatic Offload Version of the MKL Library
  - Third-Party Math Libraries
  - Intel Cluster Tools
  - Third-Party Cluster Tools
  - Summary

Part 3: Applications: Technical Computing Software Development on Intel Xeon Phi

- Chapter 9: Xeon Phi Application Design and Implementation Considerations
  - Workload-Related Considerations
  - Gustafson's Law
  - Scaled Speedup
  - Effect of Grid Shape on Performance
  - Algorithm Considerations
  - Data Structure
  - Offload Overhead
  - Load Balancing
  - Implementation Considerations
  - Memory Management
  - Mixed-Precision Arithmetic
  - Optimizing Memory Transfer Bandwidth over the PCIe Bus
  - Data Alignment Considerations
  - Communication
  - Summary
- Chapter 10: Application Performance Tuning on Xeon Phi
  - Getting Baseline Data
  - Timing Applications
  - Detecting Application Execution Bottlenecks
  - Some Basic Performance Events
  - Setting Target Performance
  - Optimizing Code
  - Compiler-Driven Optimizations
  - Data Alignment
  - Removing Pointer Aliasing
  - Streaming Store
  - Using Large Pages
  - Using Intel Cilk Plus Array Notation
  - Vectorization with Intel Compiler
  - Using the Math Kernel Library
  - Cluster-Level Tuning
  - Summary
- Chapter 11: Algorithm and Data Structures for Xeon Phi
  - Algorithm and Data Structure Design Rules for Xeon Phi
  - General Matrix-Matrix Multiply Algorithm (GEMM)
  - Rules 1 and 3: Scalable Parallelization and Optimal Cache Reuse
  - Rule 2: Efficient Vectorization
  - Molecular Dynamics
  - Rule 1: Scalable Parallelization
  - Rules 2 and 3: Efficient Vectorization and Optimal Cache Reuse
  - Stencil Operation
  - Rule 1: Scalable Parallelization
  - Rule 2: Efficient Vectorization
  - Rule 3: Optimal Cache Reuse
  - European Option Pricing Using Monte Carlo Simulation in Financial Applications
  - Rule 1: Scalable Parallelization
  - Rule 2: Efficient Vectorization
  - Rule 3: Optimal Cache Reuse
  - Summary
- Chapter 12: Xeon Phi Application Development on Windows OS
  - MPSS
  - MPSS Tools
  - Development Tools
  - Language Extensions for the Xeon Phi Coprocessor
  - Offload Environment Variables
  - Debugging Offload Execution
  - Logging into Xeon Phi Console using PuTTY
  - Using VTune Amplifier XE to Profile Offload Code on Windows
  - Building and Running Xeon Phi Native Applications from the Windows Host
  - Summary

- Appendix A: OpenCL on Xeon Phi
  - Installation
  - Building and Running OpenCL Application
  - Performance Optimization
- Appendix B: Virtual Shared Memory Programming on Xeon Phi
  - Placing Data on the Virtual Shared Memory Region
  - Shared Functions
  - Synchronizing Between the Host and the Coprocessors
- Index
About the Author

Rezaur Rahman is a Senior Software Engineer in the Intel Software and Services Group. He played a key role in the inception and development of the Xeon Phi coprocessor for technical computing applications by demonstrating the viability of applying Intel's manycore graphics processor codenamed Larrabee to solving technical computing problems. He led the worldwide technical enabling team for Intel Xeon Phi products, focused on porting and optimizing applications on the Xeon Phi coprocessor for hundreds of technical computing customers. He has worked internally with hardware architects and Intel compiler and tools teams to optimize and add features to improve the performance of Intel Many Integrated Core (MIC) and Xeon Phi software and hardware components. With 25 years' experience in computer architecture and software design, Rahman contributes his expertise in technical code optimization, performance tuning, and processor microarchitectural analysis in the HPC domain. Rahman holds a master's degree in computer science from Texas A&M University and a bachelor's in electrical engineering from Bangladesh University of Engineering and Technology.
About the Technical Reviewer

Leonardo Borges is a Senior Staff Engineer in the Intel Software and Services Group. He works on system and software design and optimization with a primary focus on the energy vertical. For the past two decades, Borges has specialized in applying his background in numerical analysis and in the development of parallel numerical mathematics libraries to the HPC domain. He joined the Intel Many Integrated Core (MIC) program at its early stages. Borges holds a master's degree in applied mathematics and a PhD in computer science from Texas A&M.
Acknowledgments

I appreciate all the support and encouragement I received while writing the book from my wife Farzana, my daughters Fariha and Ridwana, and my father. My sincere thanks go to the ApressOpen publishing team (Robert Hutchinson, Anamika Panchoo, and others) and to the technical reviewer, Leo Borges, for helping me get through the tough job of completing the book by editing the presentation and helping to refine the content. Thanks to Patrick Hauke and Stuart Douglas from Intel Press for encouraging me to take on the project and providing the necessary support to get it published. Finally, I appreciate the unstinting help from my engineering colleagues at Intel who have helped me understand and improve my knowledge of the architecture and optimization techniques for the Intel Xeon Phi coprocessor.
Introduction

This book provides a comprehensive introduction to Intel Xeon Phi architecture and the tools necessary for software engineers and scientists to develop optimized code for systems using Intel Xeon Phi coprocessors. It presents the in-depth knowledge of the Xeon Phi coprocessor architecture that developers need to utilize the power of Xeon Phi. My book presupposes prior knowledge of modern cache-based processor architecture, but it begins with a review of the general architectural history, concepts, and nomenclature that I assume my readers bring.

Because this book is intended for practitioners rather than theoreticians, I have filled it with code examples chosen to illuminate features of Xeon Phi architecture in the light of code optimization. The book is divided into three parts corresponding to the areas engineers and scientists need to know to develop and optimize code on Xeon Phi for high-performance technical computing:

- Part 1, Hardware Foundation: Intel Xeon Phi Architecture, sketches the salient features of modern cache-based architecture with reference to some of the history behind the development of Xeon Phi architecture that I was personally engaged in. It then walks the reader through the functional details of Xeon Phi architecture, using code samples to disclose the performance metrics and behavioral characteristics of the processor.
- Part 2, Software Foundation: Intel Xeon Phi System Software and Tools, describes the system software and tools necessary to build and run applications on the Xeon Phi system. I drill into the details of the software layers involved in coordinating communication and computations between the host processor and a Xeon Phi coprocessor.
- Part 3, Applications: Technical Computing Software Development on Intel Xeon Phi, discusses the characteristics of algorithms and data structures that are well tuned for the Xeon Phi coprocessor. I use C-like pseudo-algorithms to illustrate the various kinds of algorithms that are optimized for the Xeon Phi coprocessor. Although this final part of the book makes no pretensions to being comprehensive, it is rich with practical pointers for developing and optimizing your own code on the Xeon Phi coprocessor.

Although each of the three parts of the book is relatively self-contained, allowing readers to go directly to the topics that are of most interest to them, I strongly recommend that you read Part 1 for the architectural foundation needed to understand the discussion of algorithms in Part 3. These algorithms are mainly of practical interest to the Xeon Phi community for optimizing their code for this architecture.
More informationIntel Xeon Phi Coprocessor
Intel Xeon Phi Coprocessor http://tinyurl.com/inteljames twitter @jamesreinders James Reinders it s all about parallel programming Source Multicore CPU Compilers Libraries, Parallel Models Multicore CPU
More informationArchitecture, Programming and Performance of MIC Phi Coprocessor
Architecture, Programming and Performance of MIC Phi Coprocessor JanuszKowalik, Piotr Arłukowicz Professor (ret), The Boeing Company, Washington, USA Assistant professor, Faculty of Mathematics, Physics
More informationPro Java Clustering and Scalability
Pro Java Clustering and Scalability Building Real-Time Apps with Spring, Cassandra, Redis, WebSocket and RabbitMQ Jorge Acetozi Pro Java Clustering and Scalability: Building Real-Time Apps with Spring,
More informationProgramming for the Intel Many Integrated Core Architecture By James Reinders. The Architecture for Discovery. PowerPoint Title
Programming for the Intel Many Integrated Core Architecture By James Reinders The Architecture for Discovery PowerPoint Title Intel Xeon Phi coprocessor 1. Designed for Highly Parallel workloads 2. and
More informationCustom Raspberry Pi Interfaces
Custom Raspberry Pi Interfaces Design and build hardware interfaces for the Raspberry Pi Warren Gay Custom Raspberry Pi Interfaces: Design and build hardware interfaces for the Raspberry Pi Warren Gay
More informationIntroduction to Intel Xeon Phi programming techniques. Fabio Affinito Vittorio Ruggiero
Introduction to Intel Xeon Phi programming techniques Fabio Affinito Vittorio Ruggiero Outline High level overview of the Intel Xeon Phi hardware and software stack Intel Xeon Phi programming paradigms:
More informationPro Data Backup and Recovery. Steven Nelson
Pro Data Backup and Recovery Steven Nelson Pro Data Backup and Recovery Copyright 2011 by Steven Nelson All rights reserved. No part of this work may be reproduced or transmitted in any form or by any
More informationAgenda. Optimization Notice Copyright 2017, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Agenda VTune Amplifier XE OpenMP* Analysis: answering on customers questions about performance in the same language a program was written in Concepts, metrics and technology inside VTune Amplifier XE OpenMP
More informationMore performance options
More performance options OpenCL, streaming media, and native coding options with INDE April 8, 2014 2014, Intel Corporation. All rights reserved. Intel, the Intel logo, Intel Inside, Intel Xeon, and Intel
More informationAgile Database Techniques Effective Strategies for the Agile Software Developer. Scott W. Ambler
Agile Database Techniques Effective Strategies for the Agile Software Developer Scott W. Ambler Agile Database Techniques Effective Strategies for the Agile Software Developer Agile Database Techniques
More informationEliminate Threading Errors to Improve Program Stability
Eliminate Threading Errors to Improve Program Stability This guide will illustrate how the thread checking capabilities in Parallel Studio can be used to find crucial threading defects early in the development
More informationWindows Troubleshooting Series
Windows Troubleshooting Series Mike Halsey, MVP Series Editor Windows Networking Troubleshooting Mike Halsey Joli Ballew Windows Networking Troubleshooting Mike Halsey Sheffield, South Yorkshire, UK Joli
More informationPhilip Andrew Simpson. FPGA Design. Best Practices for Team-based Reuse. Second Edition
FPGA Design Philip Andrew Simpson FPGA Design Best Practices for Team-based Reuse Second Edition Philip Andrew Simpson San Jose, CA, USA ISBN 978-3-319-17923-0 DOI 10.1007/978-3-319-17924-7 ISBN 978-3-319-17924-7
More informationBeginning CSS Preprocessors
Beginning CSS Preprocessors With Sass, Compass, and Less Anirudh Prabhu Beginning CSS Preprocessors: With SASS, Compass.js, and Less.js Copyright 2015 by Anirudh Prabhu This work is subject to copyright.
More informationBeginning Oracle WebCenter Portal 12c
Beginning Oracle WebCenter Portal 12c Build next-generation Enterprise Portals with Oracle WebCenter Portal Vinay Kumar Daniel Merchán García Beginning Oracle WebCenter Portal 12c Vinay Kumar Rotterdam,
More informationDavid R. Mackay, Ph.D. Libraries play an important role in threading software to run faster on Intel multi-core platforms.
Whitepaper Introduction A Library Based Approach to Threading for Performance David R. Mackay, Ph.D. Libraries play an important role in threading software to run faster on Intel multi-core platforms.
More informationDigital Illustration Fundamentals
Wallace Jackson Digital Illustration Fundamentals Vector, Raster, WaveForm, NewMedia with DICF, DAEF and ASNMF 1st ed. 2015 Wallace Jackson Lompoc, California, USA ISBN 978-1-4842-1696-5 e-isbn 978-1-4842-1697-2
More informationContents. Preface xvii Acknowledgments. CHAPTER 1 Introduction to Parallel Computing 1. CHAPTER 2 Parallel Programming Platforms 11
Preface xvii Acknowledgments xix CHAPTER 1 Introduction to Parallel Computing 1 1.1 Motivating Parallelism 2 1.1.1 The Computational Power Argument from Transistors to FLOPS 2 1.1.2 The Memory/Disk Speed
More informationEliminate Threading Errors to Improve Program Stability
Introduction This guide will illustrate how the thread checking capabilities in Intel Parallel Studio XE can be used to find crucial threading defects early in the development cycle. It provides detailed
More informationPro SQL Server 2008 Mirroring
Pro SQL Server 2008 Mirroring Robert L. Davis, Ken Simmons Pro SQL Server 2008 Mirroring Copyright 2009 by Robert L. Davis, Ken Simmons All rights reserved. No part of this work may be reproduced or transmitted
More informationHetero Streams Library (hstreams Library) User's Guide
(hstreams Library) User's Guide January 2017 Copyright 2013-2017 Intel Corporation All Rights Reserved US Revision: 1.0 World Wide Web: http://www.intel.com Disclaimer and Legal Information You may not
More informationEliminate Memory Errors to Improve Program Stability
Introduction INTEL PARALLEL STUDIO XE EVALUATION GUIDE This guide will illustrate how Intel Parallel Studio XE memory checking capabilities can find crucial memory defects early in the development cycle.
More informationExpert C# 5.0 with.net 4.5 Framework
Expert C# 5.0 with.net 4.5 Framework Mohammad Rahman Apress Expert C# 5.0: with.net 4.5 Framework Copyright 2013 by Mohammad Rahman This work is subject to copyright. All rights are reserved by the Publisher,
More informationPerformance Profiler. Klaus-Dieter Oertel Intel-SSG-DPD IT4I HPC Workshop, Ostrava,
Performance Profiler Klaus-Dieter Oertel Intel-SSG-DPD IT4I HPC Workshop, Ostrava, 08-09-2016 Faster, Scalable Code, Faster Intel VTune Amplifier Performance Profiler Get Faster Code Faster With Accurate
More informationAndroid on x86. An Introduction to Optimizing for Intel Architecture. Iggy Krajci Darren Cummings
Android on x86 An Introduction to Optimizing for Intel Architecture Iggy Krajci Darren Cummings Android on x86: An Introduction to Optimizing for Intel Architecture Iggy Krajci and Darren Cummings Copyright
More informationEfficiently Introduce Threading using Intel TBB
Introduction This guide will illustrate how to efficiently introduce threading using Intel Threading Building Blocks (Intel TBB), part of Intel Parallel Studio XE. It is a widely used, award-winning C++
More informationDebugging Intel Xeon Phi KNC Tutorial
Debugging Intel Xeon Phi KNC Tutorial Last revised on: 10/7/16 07:37 Overview: The Intel Xeon Phi Coprocessor 2 Debug Library Requirements 2 Debugging Host-Side Applications that Use the Intel Offload
More informationIntel Software Development Products Licensing & Programs Channel EMEA
Intel Software Development Products Licensing & Programs Channel EMEA Intel Software Development Products Advanced Performance Distributed Performance Intel Software Development Products Foundation of
More informationPro MongoDB Development
Pro MongoDB Development Deepak Vohra Pro MongoDB Development Copyright 2015 by Deepak Vohra This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the
More informationStructured Parallel Programming Patterns for Efficient Computation
Structured Parallel Programming Patterns for Efficient Computation Michael McCool Arch D. Robison James Reinders ELSEVIER AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO
More informationDesign of a System-on-Chip Switched Network and its Design Support Λ
Design of a System-on-Chip Switched Network and its Design Support Λ Daniel Wiklund y, Dake Liu Dept. of Electrical Engineering Linköping University S-581 83 Linköping, Sweden Abstract As the degree of
More information"Charting the Course to Your Success!" MOC A Developing High-performance Applications using Microsoft Windows HPC Server 2008
Description Course Summary This course provides students with the knowledge and skills to develop high-performance computing (HPC) applications for Microsoft. Students learn about the product Microsoft,
More informationAddressing the Increasing Challenges of Debugging on Accelerated HPC Systems. Ed Hinkel Senior Sales Engineer
Addressing the Increasing Challenges of Debugging on Accelerated HPC Systems Ed Hinkel Senior Sales Engineer Agenda Overview - Rogue Wave & TotalView GPU Debugging with TotalView Nvdia CUDA Intel Phi 2
More informationKnights Corner: Your Path to Knights Landing
Knights Corner: Your Path to Knights Landing James Reinders, Intel Wednesday, September 17, 2014; 9-10am PDT Photo (c) 2014, James Reinders; used with permission; Yosemite Half Dome rising through forest
More informationLearn Excel 2016 for OS X
Learn Excel 2016 for OS X Second Edition Guy Hart-Davis Learn Excel 2016 for OS X Copyright 2015 by Guy Hart-Davis This work is subject to copyright. All rights are reserved by the Publisher, whether the
More informationIntel Advisor XE Future Release Threading Design & Prototyping Vectorization Assistant
Intel Advisor XE Future Release Threading Design & Prototyping Vectorization Assistant Parallel is the Path Forward Intel Xeon and Intel Xeon Phi Product Families are both going parallel Intel Xeon processor
More informationThe Intel Xeon Phi Coprocessor. Dr-Ing. Michael Klemm Software and Services Group Intel Corporation
The Intel Xeon Phi Coprocessor Dr-Ing. Michael Klemm Software and Services Group Intel Corporation (michael.klemm@intel.com) Legal Disclaimer & Optimization Notice INFORMATION IN THIS DOCUMENT IS PROVIDED
More informationFundamentals of the Java Programming Language
Fundamentals of the Java Programming Language Student Guide SL-110 REV E D61798GC10 Edition 1.0 2009 D62399 Copyright 2006, 2009, Oracle and/or its affiliates. All rights reserved. Disclaimer This document
More informationPerformance Tools for Technical Computing
Christian Terboven terboven@rz.rwth-aachen.de Center for Computing and Communication RWTH Aachen University Intel Software Conference 2010 April 13th, Barcelona, Spain Agenda o Motivation and Methodology
More informationIntel Xeon Phi Coprocessor
Intel Xeon Phi Coprocessor 1 Agenda Introduction Intel Xeon Phi Architecture Programming Models Outlook Summary 2 Intel Multicore Architecture Intel Many Integrated Core Architecture (Intel MIC) Foundation
More information