CPU Toolchain Launch Postmortem. Greg Bedwell

Size: px
Start display at page:

Download "CPU Toolchain Launch Postmortem. Greg Bedwell"

Transcription

1 CPU Toolchain Launch Postmortem Greg Bedwell

2 x86-64 AMD Jaguar 8-core CPU 1.84 TFLOPS AMD Radeon based GPU 8GB GDDR5 RAM

3

4 postmortem noun an analysis or discussion of an event after it is over Now that we have successfully launched PlayStation 4 it is a good time to look back on our initial period of development up to that point

5 First, some history

6 SN Systems Ltd. was founded in 1988 to provide development tools for the games industry 1990 Psy-Q 16-bit home systems

7 Psy-Q included a version of GCC that was highly customized for the needs of game developers 1994 Psy-Q PlayStation

8 Continued to provide GCC but started researching a proprietary compiler technology SNC 2000 ProDG PlayStation 2

9 Provided SNC as part of the ProDG suite of tools although GCC was also available 2004 ProDG PSP (PlayStation Portable)

10 Sony Computer Entertainment Inc. acquired SN Systems in 2005 Provided SNC as part of the ProDG suite of tools although GCC was also available 2006 ProDG PlayStation 3

11 CPU Compiler is SNC 2011 PlayStation Vita

12 CPU Compiler is Clang 2013 PlayStation 4

13 The CPU compiler project is a global effort

14 Builds and build systems (and test systems)

15 We wanted to make use of all of our pre-existing test suites and test systems, which are shared across all targets

16

17 Optimized Assertions Debug Info Debug MinSizeRel Release RelWithDebInfo

18 Optimized Assertions Debug Info Debug MinSizeRel Release RelWithDebInfo

19 seconds Clang configuration effect on game build time Too slow for continuous integration testing! 2h 41m 24s Useful for continuous integration testing but still too much overhead over the pure release configuration of clang 12m 56s 7m 44s 0 Debug Release+Asserts Release

20 Optimized Assertions Debug Info Debug Checking The same set of build configurations as we use for SNC Debug for debugging Release for releasing Release Checking (Release+Asserts) for continuous integration testing

21 Optimized Assertions Debug Info Debug O Checking Release

22 clang.exe size (MB) No executable size overhead for enabling debug data Default /Zi (Produce PDB) /Zi (Produce PDB) and /OPT:REF /OPT:REF

23 Optimized Assertions Debug Info Debug O Checking Release

24 Optimized Assertions Debug Info Debug Checking Release Suppress Windows crash dialog box for Checking and Release builds

25 Improving test coverage

26 Things we know Game code is usually BIG For a new platform, the amount of code that exists is small

27 Things we know Users value CORRECTNESS over all else

28 Why write tests when I can write a test generator?

29 Why write tests when I can write _ a test generator? already have

30 Generate random C++ tests using SIMD language extensions and intrinsics to increase test coverage Random input Result

31 Random input Result

32 Random input Result

33 Random input Result

34 Random input Result

35 Random input Result

36 Random input Result Undefined behaviour makes runtimebehaviour random testing hard!

37 Random input Result Solution: Use a safe wrapper to make all undefined behaviour defined for the purpose of the test

38 Reducing optimization bugs

39 bugpoint Windows Clang integration

40 An alternative approach

41 SNC s max_opts *Not representative of actual visuals

42

43 SNC s max_opts

44 SNC s max_opts

45 SNC s max_opts

46 SNC s max_opts

47 SNC s max_opts

48 SNC s max_opts SNC optimizer SSA Form Rule Based Every transformation is guarded by a specific check

49 SNC keeps an internal counter of the number of transformations it performs Allows the user to specify a limit on the command line after which no further transformations can be performed

50 SNC s max_opts Autochop harness -O2

51 SNC s max_opts Autochop harness -O0

52 SNC s Autochop harness max_opts -O0 -O2

53 SNC s max_opts Autochop harness -O0 -O2

54 SNC s max_opts Autochop harness -O0 -O2

55 SNC s max_opts Autochop harness We ve found the game source file with the bad transformation

56 SNC s max_opts Autochop harness -Xmax_opts=2048

57 SNC s max_opts Autochop harness -Xmax_opts=1024

58 SNC s max_opts Autochop harness -Xmax_opts=512

59 SNC s max_opts Autochop harness -Xmax_opts=988

60 SNC s max_opts Autochop harness -Xmax_opts=989 Now we ve found the specific transformation causing the miscompile

61 A compiler trace shows us the culprit optimization

62 Narrowed down to a single difference in IR Just a single line change in the IR

63 SNC s max_opts A question for the community: Would this work in LLVM/Clang? (even if just at pass level)

64 The release process

65 End-user documentation is lacking Docs Release notes aimed at Clang/LLVM developers, not users

66 Docs We plan to contribute our documentation improvements to the community

67 Docs We plan to contribute our documentation improvements to the community

68 Forward compatibility

69 ABI v3.x v3.x System libraries

70 ABI v3.x v4.x System libraries

71 ABI v3.x v5.x System libraries

72 ABI v3.x v6.x System libraries

73 ABI v3.x v13.x System libraries

74 Maintaining a stable ABI is a ABI MUST (including maintaining existing ABI bugs)

75 We have created a full IA64 ABI test suite

76 ABI TEST SUITE We hope to contribute our test suite to the community (some logistics still to be worked out)

77 Developer reaction

78

79 But

80 #pragma optimize Most requested feature by an order of magnitude and already supported by all the other major compilers

81 This is the most common use-case: #pragma optimize Game runs too slowly at O0, but is very hard to debug at O2

82 #pragma optimize Solution: use a pragma to selectively disable optimization on a small set of functions to be debugged We proposed this on the mailing lists, but it is a major change and we got a limited response

83 #pragma optimize Short term solution: function level attribute to disable optimization Not user-friendly for more than a very small number of functions at a time!

84 #pragma optimize Many of our users abstract this away behind a compilerindependent interface. Function attribute does not fit this model! We still need a rangebased solution

85 In summary

86 Our initial experience with Clang and LLVM has been very positive Thanks to all of you who helped make Clang and LLVM great!

87 There are still improvements that can be made We will be working alongside you to make them

OpenMP * 4 Support in Clang * / LLVM * Andrey Bokhanko, Intel

OpenMP * 4 Support in Clang * / LLVM * Andrey Bokhanko, Intel OpenMP * 4 Support in Clang * / LLVM * Andrey Bokhanko, Intel Clang * : An Excellent C++ Compiler LLVM * : Collection of modular and reusable compiler and toolchain technologies Created by Chris Lattner

More information

2012 LLVM Euro - Michael Spencer. lld. Friday, April 13, The LLVM Linker

2012 LLVM Euro - Michael Spencer. lld. Friday, April 13, The LLVM Linker lld Friday, April 13, 2012 The LLVM Linker What is lld? A system linker Produce final libraries and executables, no other tools or runtime required Understands platform ABI What is lld? A system linker

More information

BuildPal Documentation

BuildPal Documentation BuildPal Documentation Release 0.1.1 PKE sistemi August 05, 2014 Contents 1 Introduction 3 1.1 What is it?................................................ 3 1.2 Why another distributed compiler?...................................

More information

Connecting the EDG front-end to LLVM. Renato Golin, Evzen Muller, Jim MacArthur, Al Grant ARM Ltd.

Connecting the EDG front-end to LLVM. Renato Golin, Evzen Muller, Jim MacArthur, Al Grant ARM Ltd. Connecting the EDG front-end to LLVM Renato Golin, Evzen Muller, Jim MacArthur, Al Grant ARM Ltd. 1 Outline Why EDG Producing IR ARM support 2 EDG Front-End LLVM already has two good C++ front-ends, why

More information

BuildPal Documentation

BuildPal Documentation BuildPal Documentation Release 0.2 development PKE sistemi October 10, 2014 Contents 1 Introduction 3 1.1 What is it?................................................ 3 1.2 Why another distributed compiler?...................................

More information

Under the Compiler's Hood: Supercharge Your PLAYSTATION 3 (PS3 ) Code. Understanding your compiler is the key to success in the gaming world.

Under the Compiler's Hood: Supercharge Your PLAYSTATION 3 (PS3 ) Code. Understanding your compiler is the key to success in the gaming world. Under the Compiler's Hood: Supercharge Your PLAYSTATION 3 (PS3 ) Code. Understanding your compiler is the key to success in the gaming world. Supercharge your PS3 game code Part 1: Compiler internals.

More information

Evolution of CPUs & Memory in Video Game Consoles. Curtis Geiger & Matthew Meehan

Evolution of CPUs & Memory in Video Game Consoles. Curtis Geiger & Matthew Meehan Evolution of CPUs & Memory in Video Game Consoles Curtis Geiger & Matthew Meehan 1 ST GENERATION Magnavox Odyssey first console, released 1972 No CPU or Memory entirely made up of transistors, resistors,

More information

Introduction to GPU hardware and to CUDA

Introduction to GPU hardware and to CUDA Introduction to GPU hardware and to CUDA Philip Blakely Laboratory for Scientific Computing, University of Cambridge Philip Blakely (LSC) GPU introduction 1 / 35 Course outline Introduction to GPU hardware

More information

User Manual of SADP Software

User Manual of SADP Software User Manual of SADP Software Search Active Device Protocol V2.0 User Manual of SADP Software 1 Table of Contents Chapter 1 Overview... 1 1.1 Description... 1 1.2 System Requirements... 1 Chapter 2 Running

More information

lld: A Fast, Simple and Portable Linker Rui Ueyama LLVM Developers' Meeting 2017

lld: A Fast, Simple and Portable Linker Rui Ueyama LLVM Developers' Meeting 2017 lld: A Fast, Simple and Portable Linker Rui Ueyama LLVM Developers' Meeting 2017 Talk overview 1. Implementation status 2. Design goals 3. Comparisons with other linkers 4. Concurrency

More information

BEAMJIT, a Maze of Twisty Little Traces

BEAMJIT, a Maze of Twisty Little Traces BEAMJIT, a Maze of Twisty Little Traces A walk-through of the prototype just-in-time (JIT) compiler for Erlang. Frej Drejhammar 130613 Who am I? Senior researcher at the Swedish Institute

More information

C Code Verification based on the Extended Labeled Transition System Model

C Code Verification based on the Extended Labeled Transition System Model C Code Verification based on the Extended Labeled Transition System Model Dexi Wang, Chao Zhang, Guang Chen, Ming Gu, and Jiaguang Sun School of Software, TNLIST, Tsinghua University, China {dx-wang12,zhang-chao13,chenguan14}@mails.tsinghua.edu.cn

More information

Introduction. CS 2210 Compiler Design Wonsun Ahn

Introduction. CS 2210 Compiler Design Wonsun Ahn Introduction CS 2210 Compiler Design Wonsun Ahn What is a Compiler? Compiler: A program that translates source code written in one language to a target code written in another language Source code: Input

More information

Yuka Takahashi - Princeton University, CERN. Takeaway from LLVM dev meeting

Yuka Takahashi - Princeton University, CERN. Takeaway from LLVM dev meeting Takeaway from LLVM dev meeting Yuka Takahashi - Princeton University, CERN LLVM developers meeting - Annual 2 day meeting for developers in LLVM/Clang and related projects - Held in San Jose Convention

More information

LLVM Summer School, Paris 2017

LLVM Summer School, Paris 2017 LLVM Summer School, Paris 2017 David Chisnall June 12 & 13 Setting up You may either use the VMs provided on the lab machines or your own computer for the exercises. If you are using your own machine,

More information

Xcode Tricks. ios App Development Fall 2010 Lecture 13

Xcode Tricks. ios App Development Fall 2010 Lecture 13 Xcode Tricks ios App Development Fall 2010 Lecture 13 Questions? Announcements Reminder: Assignment #3 due Monday, October 18 th by 11:59pm Today s Topics Building & Running Code Troubleshooting Debugging

More information

Modern Processor Architectures. L25: Modern Compiler Design

Modern Processor Architectures. L25: Modern Compiler Design Modern Processor Architectures L25: Modern Compiler Design The 1960s - 1970s Instructions took multiple cycles Only one instruction in flight at once Optimisation meant minimising the number of instructions

More information

The LLVM Compiler Infrastructure How to build LLVM in ten seconds - or die trying!

The LLVM Compiler Infrastructure How to build LLVM in ten seconds - or die trying! IGGI Doctoral Center - PHD funding for LLVM[link] Gamification (FoldIt, DockIt: Imperial) [link] Art (Mutator II: Brighton/Brussels) [link] MSc Computer Games and Entertainment [link] Bioinformatics (Rosalind

More information

LDC: A Dragon on Mars. David Nadlinger DConf 2013

LDC: A Dragon on Mars. David Nadlinger DConf 2013 LDC: A Dragon on Mars David Nadlinger (@klickverbot) DConf 2013 LLVM-based D Compiler LLVM Looking just at the core:»compiler Infrastructure«x86, ARM, PPC, Mips, NVPTX, R600, Advanced optimizer Apple,

More information

Modern Processor Architectures (A compiler writer s perspective) L25: Modern Compiler Design

Modern Processor Architectures (A compiler writer s perspective) L25: Modern Compiler Design Modern Processor Architectures (A compiler writer s perspective) L25: Modern Compiler Design The 1960s - 1970s Instructions took multiple cycles Only one instruction in flight at once Optimisation meant

More information

trisycl Open Source C++17 & OpenMP-based OpenCL SYCL prototype Ronan Keryell 05/12/2015 IWOCL 2015 SYCL Tutorial Khronos OpenCL SYCL committee

trisycl Open Source C++17 & OpenMP-based OpenCL SYCL prototype Ronan Keryell 05/12/2015 IWOCL 2015 SYCL Tutorial Khronos OpenCL SYCL committee trisycl Open Source C++17 & OpenMP-based OpenCL SYCL prototype Ronan Keryell Khronos OpenCL SYCL committee 05/12/2015 IWOCL 2015 SYCL Tutorial OpenCL SYCL committee work... Weekly telephone meeting Define

More information

Parallel processing with OpenMP. #pragma omp

Parallel processing with OpenMP. #pragma omp Parallel processing with OpenMP #pragma omp 1 Bit-level parallelism long words Instruction-level parallelism automatic SIMD: vector instructions vector types Multiple threads OpenMP GPU CUDA GPU + CPU

More information

GCC and LLVM collaboration

GCC and LLVM collaboration GCC and LLVM collaboration GNU Tools Cauldron 2014, Cambridge, UK Renato Golin LLVM Tech-Lead Linaro 1 Agenda What we should discuss: Why & How to collaborate Common projects we already share (the good)

More information

AMD gdebugger 6.2 for Linux

AMD gdebugger 6.2 for Linux AMD gdebugger 6.2 for Linux by vincent Saturday, 19 May 2012 http://www.streamcomputing.eu/blog/2012-05-19/amd-gdebugger-6-2-for-linux/ The printf-funtion in kernels isn t the solution to everything, so

More information

Vulkan 1.1 March Copyright Khronos Group Page 1

Vulkan 1.1 March Copyright Khronos Group Page 1 Vulkan 1.1 March 2018 Copyright Khronos Group 2018 - Page 1 Vulkan 1.1 Launch and Ongoing Momentum Strengthening the Ecosystem Improved developer tools (SDK, validation/debug layers) More rigorous conformance

More information

The Role of Standards in Heterogeneous Programming

The Role of Standards in Heterogeneous Programming The Role of Standards in Heterogeneous Programming Multi-core Challenge Bristol UWE 45 York Place, Edinburgh EH1 3HP June 12th, 2013 Codeplay Software Ltd. Incorporated in 1999 Based in Edinburgh, Scotland

More information

Why C? Because we can t in good conscience espouse Fortran.

Why C? Because we can t in good conscience espouse Fortran. C Tutorial Why C? Because we can t in good conscience espouse Fortran. C Hello World Code: Output: C For Loop Code: Output: C Functions Code: Output: Unlike Fortran, there is no distinction in C between

More information

Testing. ECE/CS 5780/6780: Embedded System Design. Why is testing so hard? Why do testing?

Testing. ECE/CS 5780/6780: Embedded System Design. Why is testing so hard? Why do testing? Testing ECE/CS 5780/6780: Embedded System Design Scott R. Little Lecture 24: Introduction to Software Testing and Verification What is software testing? Running a program in order to find bugs (faults,

More information

LLVMLinux: x86 Kernel Build

LLVMLinux: x86 Kernel Build LLVMLinux: x86 Kernel Build Presented by: Jan-Simon Möller Presentation Date: 2012.08.30 Topics Common issues (x86 perspective) Specific Issues with Clang/LLVM Specific Issues with the Linux Kernel Status

More information

Sanitizing Sensitive Data: How to get it Right (or at least Less Wrong ) Roderick Chapman, 22nd September 2017

Sanitizing Sensitive Data: How to get it Right (or at least Less Wrong ) Roderick Chapman, 22nd September 2017 Sanitizing Sensitive Data: How to get it Right (or at least Less Wrong ) Roderick Chapman, 22nd September 2017 Contents The problem Technical issues Design goals Language support A policy for sanitization

More information

Traditional Smalltalk Playing Well With Others Performance Etoile. Pragmatic Smalltalk. David Chisnall. August 25, 2011

Traditional Smalltalk Playing Well With Others Performance Etoile. Pragmatic Smalltalk. David Chisnall. August 25, 2011 Étoilé Pragmatic Smalltalk David Chisnall August 25, 2011 Smalltalk is Awesome! Pure object-oriented system Clean, simple syntax Automatic persistence and many other great features ...but no one cares

More information

Rapid Development with Runtime Compiled C++ #RCCpp

Rapid Development with Runtime Compiled C++ #RCCpp Rapid Development with Runtime Compiled C++ Matthew Jack Doug Binks matthew.jack@mooncollider.net doug.binks@enkisoftware.com @MatthewTJack @dougbinks #RCCpp Scripts Programmers turn to scripts for rapid

More information

Introduction to Problem Solving and Programming in Python.

Introduction to Problem Solving and Programming in Python. Introduction to Problem Solving and Programming in Python http://cis-linux1.temple.edu/~tuf80213/courses/temple/cis1051/ Overview Types of errors Testing methods Debugging in Python 2 Errors An error in

More information

Fuzzing Clang to find ABI Bugs. David Majnemer

Fuzzing Clang to find ABI Bugs. David Majnemer Fuzzing Clang to find ABI Bugs David Majnemer What s in an ABI? The size, alignment, etc. of types Layout of records, RTTI, virtual tables, etc. The decoration of types, functions, etc. To generalize:

More information

Xbox 360 Architecture. Lennard Streat Samuel Echefu

Xbox 360 Architecture. Lennard Streat Samuel Echefu Xbox 360 Architecture Lennard Streat Samuel Echefu Overview Introduction Hardware Overview CPU Architecture GPU Architecture Comparison Against Competing Technologies Implications of Technology Introduction

More information

https://github.com/aguinet/dragon EuroLLVM Adrien Guinet 2018/04/17

https://github.com/aguinet/dragon EuroLLVM Adrien Guinet 2018/04/17 DragonFFI Foreign Function Interface and JIT for C code https://github.com/aguinet/dragon EuroLLVM 2018 - Adrien Guinet (@adriengnt) 2018/04/17 Content of this talk whoami FFI? (and related work) FFI for

More information

Dynamic code analysis tools

Dynamic code analysis tools Dynamic code analysis tools Stewart Martin-Haugh (STFC RAL) Berkeley Software Technical Interchange meeting Stewart Martin-Haugh (STFC RAL) Dynamic code analysis tools 1 / 16 Overview Introduction Sanitizer

More information

Partial Wave Analysis at BES III harnessing the power of GPUs

Partial Wave Analysis at BES III harnessing the power of GPUs Partial Wave Analysis at BES III harnessing the power of GPUs Niklaus Berger IHEP Beijing MENU 2010, College of William and Mary Overview Partial Wave Analysis at BES III PWA as a computational problem

More information

Rethinking the core OS in 2015

Rethinking the core OS in 2015 Rethinking the core OS in 2015 Presented by Bernhard "Bero" Rosenkränzer Date Embedded Linux Conference Europe, 2015 Are alternatives to gcc, libstdc++ and glibc viable yet? (And how do I use them?) The

More information

BEAMJIT: An LLVM based just-in-time compiler for Erlang. Frej Drejhammar

BEAMJIT: An LLVM based just-in-time compiler for Erlang. Frej Drejhammar BEAMJIT: An LLVM based just-in-time compiler for Erlang Frej Drejhammar 140407 Who am I? Senior researcher at the Swedish Institute of Computer Science (SICS) working on programming languages,

More information

A Case Study in Optimizing GNU Radio s ATSC Flowgraph

A Case Study in Optimizing GNU Radio s ATSC Flowgraph A Case Study in Optimizing GNU Radio s ATSC Flowgraph Presented by Greg Scallon and Kirby Cartwright GNU Radio Conference 2017 Thursday, September 14 th 10am ATSC FLOWGRAPH LOADING 3% 99% 76% 36% 10% 33%

More information

Improving Linux development with better tools

Improving Linux development with better tools Improving Linux development with better tools Andi Kleen Oct 2013 Intel Corporation ak@linux.intel.com Linux complexity growing Source lines in Linux kernel All source code 16.5 16 15.5 M-LOC 15 14.5 14

More information

Intro to Segmentation Fault Handling in Linux. By Khanh Ngo-Duy

Intro to Segmentation Fault Handling in Linux. By Khanh Ngo-Duy Intro to Segmentation Fault Handling in Linux By Khanh Ngo-Duy Khanhnd@elarion.com Seminar What is Segmentation Fault (Segfault) Examples and Screenshots Tips to get Segfault information What is Segmentation

More information

Unit Testing as Hypothesis Testing

Unit Testing as Hypothesis Testing Unit Testing as Hypothesis Testing Jonathan Clark September 19, 2012 You should test your code. Why? To find bugs. Even for seasoned programmers, bugs are an inevitable reality. Today, we ll take an unconventional

More information

logistics: ROP assignment

logistics: ROP assignment bug-finding 1 logistics: ROP assignment 2 2013 memory safety landscape 3 2013 memory safety landscape 4 different design points memory safety most extreme disallow out of bounds usually even making out-of-bounds

More information

The Fone Works. Sony PlayStation 4 Teardown. This guide presents a representation of how we fix your device in our Repair Showroom.

The Fone Works. Sony PlayStation 4 Teardown. This guide presents a representation of how we fix your device in our Repair Showroom. The Fone Works Sony PlayStation 4 Sony PlayStation 4 Teardown. This guide presents a representation of how we fix your device in our Repair Showroom. Written By: The Foneworks 2017 thefoneworks.dozuki.com

More information

Assignment 1c: Compiler organization and backend programming

Assignment 1c: Compiler organization and backend programming Assignment 1c: Compiler organization and backend programming Roel Jordans 2016 Organization Welcome to the third and final part of assignment 1. This time we will try to further improve the code generation

More information

Rethinking the core OS in 2015

Rethinking the core OS in 2015 Rethinking the core OS in 2015 Presented by Bernhard "Bero" Rosenkränzer Are alternatives to gcc, libstdc++ and glibc viable yet? Date Linux Plumbers Conference 2015 The traditional approach Building a

More information

Hardening LLVM with Random Testing

Hardening LLVM with Random Testing Hardening LLVM with Random Testing Xuejun Yang, Yang Chen Eric Eide, John Regehr {jxyang, chenyang, eeide, regehr}@cs.utah.edu University of Utah 11/3/2010 1 A LLVM Crash Bug int * p[2]; int i; for (...)

More information

OpenMP Device Offloading to FPGA Accelerators. Lukas Sommer, Jens Korinth, Andreas Koch

OpenMP Device Offloading to FPGA Accelerators. Lukas Sommer, Jens Korinth, Andreas Koch OpenMP Device Offloading to FPGA Accelerators Lukas Sommer, Jens Korinth, Andreas Koch Motivation Increasing use of heterogeneous systems to overcome CPU power limitations 2017-07-12 OpenMP FPGA Device

More information

Sanitizing Sensitive Data: How to get it Right (or at least Less Wrong ) Roderick Chapman, 14th June 2017

Sanitizing Sensitive Data: How to get it Right (or at least Less Wrong ) Roderick Chapman, 14th June 2017 Sanitizing Sensitive Data: How to get it Right (or at least Less Wrong ) Roderick Chapman, 14th June 2017 Contents The problem Technical issues Design goals Ada language support A policy for sanitization

More information

Processes and Threads

Processes and Threads COS 318: Operating Systems Processes and Threads Kai Li and Andy Bavier Computer Science Department Princeton University http://www.cs.princeton.edu/courses/archive/fall13/cos318 Today s Topics u Concurrency

More information

Lecture 11 - Portability and Optimizations

Lecture 11 - Portability and Optimizations Lecture 11 - Portability and Optimizations This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

More information

MULTI-CORE PROGRAMMING. Dongrui She December 9, 2010 ASSIGNMENT

MULTI-CORE PROGRAMMING. Dongrui She December 9, 2010 ASSIGNMENT MULTI-CORE PROGRAMMING Dongrui She December 9, 2010 ASSIGNMENT Goal of the Assignment 1 The purpose of this assignment is to Have in-depth understanding of the architectures of real-world multi-core CPUs

More information

Portability of OpenMP Offload Directives Jeff Larkin, OpenMP Booth Talk SC17

Portability of OpenMP Offload Directives Jeff Larkin, OpenMP Booth Talk SC17 Portability of OpenMP Offload Directives Jeff Larkin, OpenMP Booth Talk SC17 11/27/2017 Background Many developers choose OpenMP in hopes of having a single source code that runs effectively anywhere (performance

More information

Writing a fuzzer. for any language with american fuzzy lop. Ariel Twistlock Labs

Writing a fuzzer. for any language with american fuzzy lop. Ariel Twistlock Labs Writing a fuzzer for any language with american fuzzy lop Ariel Zelivansky @ Twistlock Labs What is fuzzing? Technique for testing software by providing it with random, unexpected or invalid input Dumb

More information

NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU

NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU GPGPU opens the door for co-design HPC, moreover middleware-support embedded system designs to harness the power of GPUaccelerated

More information

Tips on Using GDB to Track Down and Stamp Out Software Bugs

Tips on Using GDB to Track Down and Stamp Out Software Bugs Tips on Using GDB to Track Down and Stamp Out Software Bugs Brett Viren Physics Department MINOS Week In The Woods, 2005 Brett Viren (Brookhaven National Lab) Using GDB to Debug Ely 2005 1 / 34 Outline

More information

Statistical Debugging Benjamin Robert Liblit. Cooperative Bug Isolation. PhD Dissertation, University of California, Berkeley, 2004.

Statistical Debugging Benjamin Robert Liblit. Cooperative Bug Isolation. PhD Dissertation, University of California, Berkeley, 2004. Statistical Debugging Benjamin Robert Liblit. Cooperative Bug Isolation. PhD Dissertation, University of California, Berkeley, 2004. ACM Dissertation Award (2005) Thomas D. LaToza 17-654 Analysis of Software

More information

Partial Wave Analysis using Graphics Cards

Partial Wave Analysis using Graphics Cards Partial Wave Analysis using Graphics Cards Niklaus Berger IHEP Beijing Hadron 2011, München The (computational) problem with partial wave analysis n rec * * i=1 * 1 Ngen MC NMC * i=1 A complex calculation

More information

VMem. By Stewart Lynch.

VMem. By Stewart Lynch. VMem By Stewart Lynch. 1 Contents Introduction... 3 Overview... 4 Getting started... 6 Fragmentation... 7 Virtual Regions... 8 The FSA... 9 Biasing... 10 The Coalesce allocator... 11 Skewing indices...

More information

Linear Algebra libraries in Debian. DebConf 10 New York 05/08/2010 Sylvestre

Linear Algebra libraries in Debian. DebConf 10 New York 05/08/2010 Sylvestre Linear Algebra libraries in Debian Who I am? Core developer of Scilab (daily job) Debian Developer Involved in Debian mainly in Science and Java aspects sylvestre.ledru@scilab.org / sylvestre@debian.org

More information

Introduction to OpenMP

Introduction to OpenMP Introduction to OpenMP p. 1/?? Introduction to OpenMP More Syntax and SIMD Nick Maclaren Computing Service nmm1@cam.ac.uk, ext. 34761 June 2011 Introduction to OpenMP p. 2/?? C/C++ Parallel for (1) I said

More information

Unit Testing as Hypothesis Testing

Unit Testing as Hypothesis Testing Unit Testing as Hypothesis Testing Jonathan Clark September 19, 2012 5 minutes You should test your code. Why? To find bugs. Even for seasoned programmers, bugs are an inevitable reality. Today, we ll

More information

TraceBack: First Fault Diagnosis by Reconstruction of Distributed Control Flow

TraceBack: First Fault Diagnosis by Reconstruction of Distributed Control Flow TraceBack: First Fault Diagnosis by Reconstruction of Distributed Control Flow Andrew Ayers Chris Metcalf Junghwan Rhee Richard Schooler VERITAS Emmett Witchel Microsoft Anant Agarwal UT Austin MIT Software

More information

A Status Update of BEAMJIT, the Just-in-Time Compiling Abstract Machine. Frej Drejhammar and Lars Rasmusson

A Status Update of BEAMJIT, the Just-in-Time Compiling Abstract Machine. Frej Drejhammar and Lars Rasmusson A Status Update of BEAMJIT, the Just-in-Time Compiling Abstract Machine Frej Drejhammar and Lars Rasmusson 140609 Who am I? Senior researcher at the Swedish Institute of Computer Science

More information

Mingw-w64 and Win-builds.org - Building for Windows

Mingw-w64 and Win-builds.org - Building for Windows Mingw-w64 and Win-builds.org - Building for Windows February 2, 2014 1 Mingw-w64 2 3 Section outline Mingw-w64 History, motivations and philosophy What comes with a mingw-w64 tarball Environments to build

More information

Libabigail & ABI compatibility

Libabigail & ABI compatibility Libabigail & ABI compatibility Taming the runtime linking problem Ben Woodard Consulting Engineer Dodji Seketeli Tools Engineer Aug 2017 Problems due to ABI Compatibility There are several problems cause

More information

18-642: Code Style for Compilers

18-642: Code Style for Compilers 18-642: Code Style for Compilers 9/25/2017 1 Anti-Patterns: Coding Style: Language Use Code compiles with warnings Warnings are turned off or over-ridden Insufficient warning level set Language safety

More information

Operating Systems. Written by Justin Browning. Linux / UNIX Distributions Report

Operating Systems. Written by Justin Browning. Linux / UNIX Distributions Report Operating Systems Written by Justin Browning Linux / UNIX Distributions Report 1 Table of Contents Table of Contents... 2 Chapter 1: A Linux Free Distribution... 3 A Brief Description:... 3 Chapter 2:

More information

Memory Safety for Embedded Devices with nescheck

Memory Safety for Embedded Devices with nescheck Memory Safety for Embedded Devices with nescheck Daniele MIDI, Mathias PAYER, Elisa BERTINO Purdue University AsiaCCS 2017 Ubiquitous Computing and Security Sensors and WSNs are pervasive Small + cheap

More information

Improving Linux Development with better tools. Andi Kleen. Oct 2013 Intel Corporation

Improving Linux Development with better tools. Andi Kleen. Oct 2013 Intel Corporation Improving Linux Development with better tools Andi Kleen Oct 2013 Intel Corporation ak@linux.intel.com Linux complexity growing Source lines in Linux kernel All source code 16.5 16 15.5 M-LOC 15 14.5 14

More information

HSA foundation! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar Room A. Alario! 23 November, 2015!

HSA foundation! Advanced Topics on Heterogeneous System Architectures. Politecnico di Milano! Seminar Room A. Alario! 23 November, 2015! Advanced Topics on Heterogeneous System Architectures HSA foundation! Politecnico di Milano! Seminar Room A. Alario! 23 November, 2015! Antonio R. Miele! Marco D. Santambrogio! Politecnico di Milano! 2

More information

Porting Linux to x86-64

Porting Linux to x86-64 Porting Linux to x86-64 Andi Kleen SuSE Labs ak@suse.de Abstract... Some implementation details with changes over the existing i386 port are discussed. 1 Introduction x86-64 is a new architecture developed

More information

An NVMe-based Offload Engine for Storage Acceleration Sean Gibb, Eideticom Stephen Bates, Raithlin

An NVMe-based Offload Engine for Storage Acceleration Sean Gibb, Eideticom Stephen Bates, Raithlin An NVMe-based Offload Engine for Storage Acceleration Sean Gibb, Eideticom Stephen Bates, Raithlin 1 Overview Acceleration for Storage NVMe for Acceleration How are we using (abusing ;-)) NVMe to support

More information

CERTIFIED. Faster & Cheaper Testing. Develop standards compliant C & C++ faster and cheaper, with Cantata automated unit & integration testing.

CERTIFIED. Faster & Cheaper Testing. Develop standards compliant C & C++ faster and cheaper, with Cantata automated unit & integration testing. CERTIFIED Faster & Cheaper Testing Develop standards compliant C & C++ faster and cheaper, with Cantata automated unit & integration testing. Why Industry leaders use Cantata Cut the cost of standards

More information

Lucid Virtu Installation Guide

Lucid Virtu Installation Guide Lucid Virtu Installation Guide 1 Introduction Lucid VIRTU solution is designed for Intel Sandy Bridge platform with Intel Processor Graphics enabled. VIRTU dynamically assigns tasks to best available graphics

More information

Intel Xeon Phi Coprocessors

Intel Xeon Phi Coprocessors Intel Xeon Phi Coprocessors Reference: Parallel Programming and Optimization with Intel Xeon Phi Coprocessors, by A. Vladimirov and V. Karpusenko, 2013 Ring Bus on Intel Xeon Phi Example with 8 cores Xeon

More information

ECE 574 Cluster Computing Lecture 17

ECE 574 Cluster Computing Lecture 17 ECE 574 Cluster Computing Lecture 17 Vince Weaver http://web.eece.maine.edu/~vweaver vincent.weaver@maine.edu 28 March 2019 HW#8 (CUDA) posted. Project topics due. Announcements 1 CUDA installing On Linux

More information

Product Reviewer s Guide

Product Reviewer s Guide Product Reviewer s Guide Skype for Sony PlayStation Vita 2012 by Skype Table of Contents Summary System Requirements How to download Installation/How to Start Homepage Accepting/Rejecting incoming contact

More information

Beyond Hardware IP An overview of Arm development solutions

Beyond Hardware IP An overview of Arm development solutions Beyond Hardware IP An overview of Arm development solutions 2018 Arm Limited Arm Technical Symposia 2018 Advanced first design cost (US$ million) IC design complexity and cost aren t slowing down 542.2

More information

CDT 7.0 Helios Release Review. Planned Review Date: June 11, 2010 Communication Channel: cdt-dev Doug Schaefer

CDT 7.0 Helios Release Review. Planned Review Date: June 11, 2010 Communication Channel: cdt-dev Doug Schaefer CDT 7.0 Helios Release Review Planned Review Date: June 11, 2010 Communication Channel: cdt-dev Doug Schaefer Introduction The CDT (C/C++ Development Tools) project builds a platform that supports edit,

More information

Install Error Code 43 Nvidia Geforce 8500 Gt >>>CLICK HERE<<<

Install Error Code 43 Nvidia Geforce 8500 Gt >>>CLICK HERE<<< Install Error Code 43 Nvidia Geforce 8500 Gt Showed a code 43 error. site and downloaded the latest version of NVidia GeForce 8500 and restarted. A GFX TEAM member is running with a 8500GT? hi, i would

More information

O B J E C T L E V E L T E S T I N G

O B J E C T L E V E L T E S T I N G Source level testing and O B J E C T L E V E L T E S T I N G Objectives At the end of this section, you will be able to Explain the advantages and disadvantages of both instrumented testing and object

More information

Piecewise Holistic Autotuning of Compiler and Runtime Parameters

Piecewise Holistic Autotuning of Compiler and Runtime Parameters Piecewise Holistic Autotuning of Compiler and Runtime Parameters Mihail Popov, Chadi Akel, William Jalby, Pablo de Oliveira Castro University of Versailles Exascale Computing Research August 2016 C E R

More information

Locate a Hotspot and Optimize It

Locate a Hotspot and Optimize It Locate a Hotspot and Optimize It 1 Can Recompiling Just One File Make a Difference? Yes, in many cases it can! Often, you can get a major performance boost by recompiling a single file with the optimizing

More information

Introduction to OpenMP

Introduction to OpenMP Introduction to OpenMP p. 1/?? Introduction to OpenMP More Syntax and SIMD Nick Maclaren nmm1@cam.ac.uk September 2017 Introduction to OpenMP p. 2/?? C/C++ Parallel for (1) I said that I would give the

More information

AAlign: A SIMD Framework for Pairwise Sequence Alignment on x86-based Multi- and Many-core Processors

AAlign: A SIMD Framework for Pairwise Sequence Alignment on x86-based Multi- and Many-core Processors AAlign: A SIMD Framework for Pairwise Sequence Alignment on x86-based Multi- and Many-core Processors Kaixi Hou, Hao Wang, Wu-chun Feng {kaixihou,hwang121,wfeng}@vt.edu Pairwise Sequence Alignment Algorithms

More information

RISC-V: Opportunities and Challenges in SoCs

RISC-V: Opportunities and Challenges in SoCs December 5, 2018 @qualcomm Santa Clara, CA RISC-V: Opportunities and Challenges in SoCs Greg Wright Sr Director, Engineering Qualcomm Technologies, Inc. Introductions Who am I? Why am I here? 2 Quick tour

More information

Undefined Behaviour in C

Undefined Behaviour in C Undefined Behaviour in C Report Field of work: Scientific Computing Field: Computer Science Faculty for Mathematics, Computer Science and Natural Sciences University of Hamburg Presented by: Dennis Sobczak

More information

Mesos on ARM. Feng Li( 李枫 ),

Mesos on ARM. Feng Li( 李枫 ), Mesos on ARM Feng Li( 李枫 ), Agenda I. Background Information ARM Ecosystem Today Raspberry Pi II. Build Mesos for ARM Cross Compiling Native Compilation Build Mesos with Ninja Summary III. Clang/LLVM-based

More information

L25: Modern Compiler Design Exercises

L25: Modern Compiler Design Exercises L25: Modern Compiler Design Exercises David Chisnall Deadlines: October 26 th, November 23 th, November 9 th These simple exercises account for 20% of the course marks. They are intended to provide practice

More information

Introduction to Computing and Systems Architecture

Introduction to Computing and Systems Architecture Introduction to Computing and Systems Architecture 1. Computability A task is computable if a sequence of instructions can be described which, when followed, will complete such a task. This says little

More information

Ps3 system update storage media. Ps3 system update storage media.zip

Ps3 system update storage media. Ps3 system update storage media.zip Ps3 system update storage media Ps3 system update storage media.zip Safe Mode menu.download Sony PlayStation 3 Firmware Update create a folder named PS3 on the storage media Do not turn off the PS3 system

More information

Accelerating Ruby with LLVM

Accelerating Ruby with LLVM Accelerating Ruby with LLVM Evan Phoenix Oct 2, 2009 RUBY RUBY Strongly, dynamically typed RUBY Unified Model RUBY Everything is an object RUBY 3.class # => Fixnum RUBY Every code context is equal RUBY

More information

Cell Processor and Playstation 3

Cell Processor and Playstation 3 Cell Processor and Playstation 3 Guillem Borrell i Nogueras February 24, 2009 Cell systems Bad news More bad news Good news Q&A IBM Blades QS21 Cell BE based. 8 SPE 460 Gflops Float 20 GFLops Double QS22

More information

Palm Platform Hardware Intro to the Palm OS and application programming

Palm Platform Hardware Intro to the Palm OS and application programming Palm Platform Hardware Intro to the Palm OS and application programming Total memory (RAM/ROM) originals had only 128 Kb currently average is 4 Mb (max. 8Mb) 32 bit addresses 8, 16 & 32 bit data types

More information

PyCPUID Documentation

PyCPUID Documentation PyCPUID Documentation Release 0.5 Bram de Greve March 13, 2016 Contents 1 Introduction 3 1.1 Installation.............................................. 3 1.2 Source Code.............................................

More information

Introduction to the Raspberry Pi AND LINUX FOR DUMMIES

Introduction to the Raspberry Pi AND LINUX FOR DUMMIES Introduction to the Raspberry Pi AND LINUX FOR DUMMIES 700Mhz ARM v6 Broadcomm CPU+GPU 512 MB RAM (256MB on Model A) Boots off SD card for filesystem USB, Audio out, LAN (Model B only) HDMI + Composite

More information

A Plan 9 C Compiler for RISC-V

A Plan 9 C Compiler for RISC-V A Plan 9 C Compiler for RISC-V Richard Miller r.miller@acm.org Plan 9 C compiler - written by Ken Thompson for Plan 9 OS - used for Inferno OS kernel and limbo VM - used to bootstrap first releases of

More information