Project Proposals. Advanced Operating Systems / Embedded Systems (2016/2017)
1 Project Proposals: Advanced Operating Systems / Embedded Systems (2016/2017). Giuseppe Massari (giuseppe.massari@polimi.it), Federico Terraneo (federico.terraneo@polimi.it)
2 Project Rules: General rules. Two types of project: code development and bibliography research. Up to 2 students per project. A meeting or report e-mail every 15 days, with a brief discussion, report, or demo of the project progress. Source code: under version control with Git, uploaded to a remote Git repository (use services like Bitbucket, GitHub, ...). Report: format LaTeX + PDF; language English; download the template from the professor's website.
3 Project Rules: Evaluation. Development project, [0-10] extra points (added to the written exam grade): quality of the developed code (5); quality and completeness of the report (2); quality of the Git repository (2); fulfillment of the timing constraints (1). Monographic research, [0-5] extra points: coverage and relevance of the selected papers (2); writing quality (1); presentation (PowerPoint/LibreOffice/Beamer/...) held at the tutor's office in 20 minutes plus questions (1); fulfillment of the timing constraints (1).
4 BOSP proposals: The Barbeque Run-Time Resource Management Open-Source (BOSP). Website: Mailing lists: User / News: Developers:
5 BarbequeRTRM Open Source Project (BOSP): Overview. The core of the project is a Run-Time Resource Manager for multi/many-core systems (the BarbequeRTRM): scheduling, resource allocation, power management. Open source, written in C++. Includes libraries, benchmarks, sample applications, and tools. Version control with Git.
6 Barbeque Open Source Project (BOSP): EU-funded projects involving BOSP. [ ] 2PARMA: PARallel PAradigms and Run-time MAnagement techniques for Many-core Architecture. [ ] CONTREX: Design of embedded mixed-criticality CONTRol systems under consideration of EXtra-functional properties. [ ] HARPA: Harnessing Performance Variability. [ ] MANGO: exploring Manycore Architectures for Next-GeneratiOn HPC systems. [ ] ANTAREX: AutoTuning and Adaptivity approach for Energy efficient exascale HPC systems.
7 The BarbequeRTRM: Layer view. C/C++/OpenCL applications are supported. RTLib provides the API and manages the communication between the application and the RTRM. The resource manager daemon includes core components and plug-in modules. Platform support exploits Linux frameworks or custom drivers and libraries.
8 The BarbequeRTRM: Currently supported hardware systems. Intel/AMD x86 single multi-core processor systems, plus multiple GPUs (AMD) through the OpenCL runtime. Intel/AMD x86 multi-processor NUMA systems. ARM Cortex-A9 multi-core CPU based SoCs: PandaBoard, Freescale i.MX6 Quad SABRE. ARM big.LITTLE 8-core (Cortex-A7 + Cortex-A15) SoC (Samsung Exynos 54xx): Insignal Arndale OctaCore, ODROID XU-E, ODROID XU-3.
9 The BarbequeRTRM: Application Execution Model. Resource-aware execution of the application. Performance-aware resource management. Website:
10 The BarbequeRTRM: HARPA project use case. Landslide detection and prediction system by HENESIS S.r.l. Goal: the system (solar panel / battery powered) must remain ON.
11 BOSP: Google Protocol Buffer based communication. (Diagram: BarbequeRTRM, RTLib, Application.) Refactor the communication infrastructure between the BarbequeRTRM and the RTLib, basing it on Google Protocol Buffers. C/C++. Students: 1-2. CFU: 5/10.
12 BOSP: RTLib Java wrapper. (Diagram: BarbequeRTRM, C/C++ RTLib, Application.) Export the Abstract Execution Model to applications implemented in Java, by wrapping the C++ interfaces in Java classes. C/C++, Java, JNI. Students: 1-2. CFU: 5/10.
13 BOSP: RTLib Python wrapper. (Diagram: BarbequeRTRM, C/C++ RTLib, Application.) Export the Abstract Execution Model to applications implemented in Python, by wrapping the C++ interfaces in Python classes. C/C++, Python. Students: 1-2. CFU: 5/10.
14 BOSP: State-of-the-art resource allocation policy implementation. (Diagram: BarbequeRTRM implements Policy.) Implement a resource allocation/scheduling policy taken from the state of the art. A paper describing the policy will be provided. C/C++. Students: 1-2. CFU: 5/10.
15 BOSP: Linux Process Listener exploitation. (Diagram: BarbequeRTRM, Linux kernel connector, RTLib, Applications.) Enable the management of generic Linux processes (non-integrated applications) by introducing new internal data structures and a simple policy. C/C++. Students: 1. CFU: 5/10.
16 BOSP: Network bandwidth control characterization. (Diagram: WiFi, Ethernet, Network, Power consumption?) Exploit the recently integrated network bandwidth control to characterize the performance of network-based applications and the system power consumption, exploring different network interfaces and bandwidth constraints. C/C++, Linux, scripting languages (e.g., Python, Bash, ...). Students: 1. CFU: 5/10. Notes: possible extension into a thesis and/or a publication.
17 BOSP: Adapteva Epiphany-III Parallella board porting. Dual-core ARM and 16/64-core RISC processors on board. ANSI C/C++ and OpenCL programmable. Up to 90 GFLOPS processing capability. HDMI, Ethernet, USB, and 48 GPIO ports.
18 BOSP: Adapteva Epiphany-III Parallella board porting. (Diagram: BarbequeRTRM.) Develop the BarbequeRTRM support for the Adapteva Parallella board in order to enable run-time resource management of the many-core processor. C/C++. Students: 1-2. CFU: 5/10. Notes: possible to extend into a master thesis.
19 BOSP: Integration of multi-threaded benchmarks. (Diagram: Application, RTLib, Application.) Modify a PARSEC or Rodinia (OpenMP) multi-threaded benchmark according to the Abstract Execution Model to make it run-time manageable. C/C++. Students: 1. CFU: 5.
20 BOSP: Integration of OpenCV samples. (Diagram: Application, RTLib, Application, OpenCV.) Integrate samples provided with OpenCV into the Abstract Execution Model (AEM), such that the output accuracy/quality depends on the resources assigned by the BarbequeRTRM (e.g., FaceDetection). C/C++, OpenCV. Students: 1. CFU: 5.
21 BOSP: Integration of Approximate Computing applications. (Diagram: Application, RTLib, Application.) Modify an Approximate Computing application according to the Abstract Execution Model to make it run-time manageable. The output accuracy of the application must depend on the amount of resources assigned by the BarbequeRTRM. C/C++. Students: 1. CFU: 5.
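As an illustration of output accuracy depending on the assigned resources, here is a minimal C++ sketch. It does not use the RTLib API: `samplesForQuota`, `estimatePi`, and the idea of a CPU quota expressed in percent are hypothetical names and assumptions chosen for the example. A Monte Carlo estimate of pi stands in for the approximate computation, and the sample budget (hence the accuracy) scales with the quota that a resource manager would grant.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical mapping (not part of RTLib): the sample budget grows linearly
// with the CPU quota in percent, where 100 means one full core.
std::uint64_t samplesForQuota(int cpuQuotaPercent,
                              std::uint64_t samplesPerPercent = 1000) {
    assert(cpuQuotaPercent > 0);
    return static_cast<std::uint64_t>(cpuQuotaPercent) * samplesPerPercent;
}

// Monte Carlo estimate of pi: the fraction of random points of the unit
// square that fall inside the quarter circle approaches pi/4, so a larger
// sample budget gives a statistically more accurate result.
double estimatePi(std::uint64_t samples, std::uint32_t seed = 42) {
    assert(samples > 0);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    std::uint64_t inside = 0;
    for (std::uint64_t i = 0; i < samples; ++i) {
        const double x = unit(rng);
        const double y = unit(rng);
        if (x * x + y * y <= 1.0) ++inside;
    }
    return 4.0 * static_cast<double>(inside) / static_cast<double>(samples);
}
```

In an integrated application, a call such as `estimatePi(samplesForQuota(quota))` would presumably run once per processing cycle, with `quota` refreshed whenever the resource manager changes the assignment.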
22 BOSP on Android: Recover and update the RTLib integration. (Diagram: C/C++ BarbequeRTRM, RTLib, Application.) Re-enable the management of Android applications integrated in the Abstract Execution Model. Java, C++, JNI, Android. Students: 1. Notes: part of the code is already available.
23 BOSP on Android: Android process/VM migration evaluation. Evaluate the possibility of migrating Android processes (VMs) from one device to another by exploiting the Checkpoint/Restore in Userspace (CRIU) framework. Android, Linux. Students: 1. Notes: possible to extend into a thesis.
24 BOSP on Heterogeneous Systems: Who assigns devices to applications?
25 BOSP on Heterogeneous Systems: New heterogeneous resource allocation policy. (Diagram: the BarbequeRTRM Policy allocates CPU0, CPU1, ..., GPU0, GPU1.) Run experiments on CPU+GPU systems to implement a new resource allocation policy for OpenCL applications on heterogeneous systems. C/C++, OpenCL. Students: 1. CFU: 10. Notes: possible to extend into a master thesis.
26 BOSP on Heterogeneous Systems: Integration of Rodinia OpenCL benchmarks. (Diagram: Application, RTLib, Application.) Modify a Rodinia OpenCL benchmark according to the Abstract Execution Model to make it run-time manageable. C/C++, OpenCL. Students: 1. CFU: 5.
27 BOSP on Heterogeneous Systems: OpenCL on ARM CPU. Compile an OpenCL runtime (pocl) on an ARM board. Run some benchmarks. Provide a comparison in terms of execution time and power/energy consumption. C/C++, OpenCL, scripting languages. Students: 1. CFU: 5/10.
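A comparison of execution time and energy consumption could be computed along these lines. This is a sketch under stated assumptions: the `PowerSample` structure and both function names are hypothetical, and the power trace is assumed to come from whatever meter or on-board sensor is available. Energy is obtained by integrating power over time with the trapezoidal rule.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record of one reading from a power meter or on-board sensor.
struct PowerSample {
    double timeS;   // timestamp in seconds
    double powerW;  // instantaneous power in watts
};

// Energy in joules: integrate power over time with the trapezoidal rule.
// A trace with fewer than two samples yields 0.
double energyJoules(const std::vector<PowerSample>& trace) {
    double energy = 0.0;
    for (std::size_t i = 1; i < trace.size(); ++i) {
        const double dt = trace[i].timeS - trace[i - 1].timeS;
        assert(dt >= 0.0);  // samples must be sorted by time
        energy += 0.5 * (trace[i].powerW + trace[i - 1].powerW) * dt;
    }
    return energy;
}

// Speedup of a candidate run relative to a baseline run (>1 means faster).
double speedup(double baselineTimeS, double candidateTimeS) {
    assert(candidateTimeS > 0.0);
    return baselineTimeS / candidateTimeS;
}
```

Reporting both numbers per benchmark makes it visible when a configuration is faster but draws enough extra power to consume more energy overall.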
28 Open MPI: The MIG framework. Migration of MPI application processes over multiple nodes of a distributed system. (Diagram: MIG; Process 1 to Process 6; orted daemons; system0, system1, ..., systemN.)
29 MIG Open MPI framework: Rebase on top of Open MPI version 2.0. (Diagram: Open MPI 2.0, MIG.) The MIG framework for Open MPI must be rebased from version 1.7 to version 2.0. C/C++. Students: 1-2 (suggested 2). CFU: 5/10.
30 MIG Open MPI framework: Exploiting MIG by implementing a new BarbequeRTRM policy. (Diagram: BarbequeRTRM, MIG.) Implement a centralized resource allocation policy to manage the migration/placement of MPI application processes. C/C++. Students: 1-2. CFU: 5/10. Notes: possible to extend into a master thesis.
31 MIG Open MPI framework: Improve the framework performance with advanced techniques. (Diagram: MIG.) Reduce the framework overhead by introducing suitable optimization techniques. C/C++. Students: 1-2. CFU: 5/10. Notes: possible to extend into a master thesis / publication.
32 MIG Open MPI framework: [Thesis] Open MPI run-time application profiling support. (Diagram: MIG.) Design and implement a run-time profiling framework for MPI applications, to broaden the options for dynamic resource management and process migration. C/C++. Students: 1-2. CFU: 5/10.
33 MIG Open MPI framework: [Thesis] Development of the InfiniBand support. (Diagram: MIG.) The MIG framework lacks support for migrating application processes across nodes interconnected through InfiniBand. C/C++. Students: 1. CFU: 5/10.
34 STM32F4 development board + Miosix real-time operating system
35 STM32F4 + Miosix: Electronic pendulum. Goal: form a pendulum with the stm32f4discovery board and the USB cable; using the accelerometer, measure the pendulum period and estimate the cable length. C/C++. Students: 1-2. CFU: 5/10.
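The length estimate follows from the small-angle formula for a simple pendulum, T = 2*pi*sqrt(L/g), which gives L = g*(T/(2*pi))^2. The C++ sketch below shows one possible way to do the two steps; it is not tied to the Miosix accelerometer driver. Both function names are hypothetical, and the input is assumed to be a uniformly sampled, zero-mean acceleration component that oscillates at the pendulum frequency.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Small-angle simple pendulum: T = 2*pi*sqrt(L/g)  =>  L = g * (T/(2*pi))^2.
double lengthFromPeriod(double periodS, double g = 9.80665) {
    assert(periodS > 0.0);
    const double pi = std::acos(-1.0);
    const double k = periodS / (2.0 * pi);
    return g * k * k;
}

// Estimate the oscillation period from uniformly sampled, zero-mean data by
// averaging the time between rising zero crossings.
// Returns 0 if fewer than two crossings are found.
double periodFromSamples(const std::vector<double>& a, double sampleRateHz) {
    assert(sampleRateHz > 0.0);
    long first = -1;
    long last = -1;
    int crossings = 0;
    for (std::size_t i = 1; i < a.size(); ++i) {
        if (a[i - 1] < 0.0 && a[i] >= 0.0) {
            if (first < 0) first = static_cast<long>(i);
            last = static_cast<long>(i);
            ++crossings;
        }
    }
    if (crossings < 2) return 0.0;
    return static_cast<double>(last - first) / ((crossings - 1) * sampleRateHz);
}
```

For example, a measured period of 2 s corresponds to a length of roughly 0.99 m. On real hardware the samples would likely need low-pass filtering first, since noise near zero can produce spurious crossings.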
36 STM32F4 + Miosix: Tone generator. Goal: generate a sine-wave tone of a given frequency on the headphone output. C/C++. Students: CFU:
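The core of a tone generator is the computation of the sample values. The following C++ sketch fills a buffer with signed 16-bit PCM samples of a sine wave; the function name, the 16-bit format, and the default amplitude are assumptions made for the example, and sending the buffer to the audio DAC of the board is outside its scope.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Generate `count` signed 16-bit PCM samples of a sine wave.
// `amplitude` is a fraction of full scale in [0, 1].
std::vector<std::int16_t> sineTone(double frequencyHz, double sampleRateHz,
                                   std::size_t count, double amplitude = 0.8) {
    assert(sampleRateHz > 0.0 && frequencyHz >= 0.0);
    assert(frequencyHz < sampleRateHz / 2.0);  // stay below the Nyquist limit
    assert(amplitude >= 0.0 && amplitude <= 1.0);
    const double pi = std::acos(-1.0);
    const double step = 2.0 * pi * frequencyHz / sampleRateHz;  // phase per sample
    std::vector<std::int16_t> out(count);
    for (std::size_t n = 0; n < count; ++n) {
        out[n] = static_cast<std::int16_t>(
            std::lround(amplitude * 32767.0 * std::sin(step * n)));
    }
    return out;
}
```

On a microcontroller, a common alternative is to precompute one period into a lookup table and replay it, which avoids calling `std::sin` in the audio path.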
37 STM32F4 + Miosix: Audio compression. Goal: develop firmware that compresses the audio coming from the on-board microphone and sends it to a serial port. C/C++. Students: 1-2. CFU: 5. Notes: the Miosix microphone driver is already provided.
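The slide does not name a compression scheme. Purely as an illustration, the C++ sketch below uses mu-law companding, a classic lossy technique that maps each 16-bit sample to 8 bits and therefore halves the data rate on the serial link. The function names and the choice of mu = 255 are assumptions; this is the continuous mu-law formula, not the bit-exact G.711 encoding.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Compress one 16-bit PCM sample to 8 bits with continuous mu-law companding:
// y = sgn(x) * ln(1 + mu*|x|) / ln(1 + mu), with x normalized to [-1, 1].
std::int8_t muLawEncode(std::int16_t sample, double mu = 255.0) {
    const double x = static_cast<double>(sample) / 32768.0;
    const double y = std::log1p(mu * std::fabs(x)) / std::log1p(mu);
    const double sign = (x < 0.0) ? -1.0 : 1.0;
    return static_cast<std::int8_t>(std::lround(sign * y * 127.0));
}

// Expand an 8-bit code back to 16-bit PCM with the inverse formula:
// x = sgn(y) * ((1 + mu)^|y| - 1) / mu.
std::int16_t muLawDecode(std::int8_t code, double mu = 255.0) {
    const double y = static_cast<double>(code) / 127.0;
    const double x = (std::pow(1.0 + mu, std::fabs(y)) - 1.0) / mu;
    const double sign = (y < 0.0) ? -1.0 : 1.0;
    return static_cast<std::int16_t>(std::lround(sign * x * 32767.0));
}
```

The logarithmic curve spends more code values on quiet samples, so the round-trip error stays small relative to the signal level. A real project might prefer ADPCM or another codec with a better compression ratio.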
38 STM32F4 + Miosix: Port Miosix to a new STM32 board. Goal: make the Miosix kernel run on a new STM32 development board. C/C++. Students: 1-2. CFU: 5/10. Notes: a special board will be provided.
39 STM32F4 + Miosix: Port Mxgui to a new development board. Goal: Mxgui is the Miosix library to handle displays; add support for the display of the new STM32 development board. C/C++. Students: 1-2. CFU: 5/10. Notes: a special board will be provided. Requires the previous project to be completed.
40 Monographic Research: Topics. Linux cgroups version 2.0. Real-time scheduling algorithms. Fault-detection frameworks / techniques. Task scheduling on heterogeneous systems. Memory management on heterogeneous systems. ...
www.bsc.es Exploiting CUDA Dynamic Parallelism for low power ARM based prototypes Vishal Mehta Engineer, Barcelona Supercomputing Center vishal.mehta@bsc.es BSC/UPC CUDA Centre of Excellence (CCOE) Training
More informationEnergy Efficient Computing Systems (EECS) Magnus Jahre Coordinator, EECS
Energy Efficient Computing Systems (EECS) Magnus Jahre Coordinator, EECS Who am I? Education Master of Technology, NTNU, 2007 PhD, NTNU, 2010. Title: «Managing Shared Resources in Chip Multiprocessor Memory
More informationKontron s ARM-based COM solutions and software services
Kontron s ARM-based COM solutions and software services Peter Müller Product Line Manager COMs Kontron Munich, 4 th July 2012 Kontron s ARM Strategy Why ARM COMs? How? new markets for mobile applications
More informationOverview of research activities Toward portability of performance
Overview of research activities Toward portability of performance Do dynamically what can t be done statically Understand evolution of architectures Enable new programming models Put intelligence into
More informationA Closer Look at the Epiphany IV 28nm 64 core Coprocessor. Andreas Olofsson PEGPUM 2013
A Closer Look at the Epiphany IV 28nm 64 core Coprocessor Andreas Olofsson PEGPUM 2013 1 Adapteva Achieves 3 World Firsts 1. First processor company to reach 50 GFLOPS/W 3. First semiconductor company
More informationA Case for High Performance Computing with Virtual Machines
A Case for High Performance Computing with Virtual Machines Wei Huang*, Jiuxing Liu +, Bulent Abali +, and Dhabaleswar K. Panda* *The Ohio State University +IBM T. J. Waston Research Center Presentation
More informationARM big.little Technology Unleashed An Improved User Experience Delivered
ARM big.little Technology Unleashed An Improved User Experience Delivered Govind Wathan Product Specialist Cortex -A Mobile & Consumer CPU Products 1 Agenda Introduction to big.little Technology Benefits
More informationTessellation: Space-Time Partitioning in a Manycore Client OS
Tessellation: Space-Time ing in a Manycore Client OS Rose Liu 1,2, Kevin Klues 1, Sarah Bird 1, Steven Hofmeyr 3, Krste Asanovic 1, John Kubiatowicz 1 1 Parallel Computing Laboratory, UC Berkeley 2 Data
More informationCode Generation for Embedded Heterogeneous Architectures on Android
This is the author s version of the work. The definitive work was published in Proceedings of the Conference on Design, Automation and Test in Europe (DATE), Dresden, Germany, March 24-28, 2014. Code Generation
More informationSDACCEL DEVELOPMENT ENVIRONMENT. The Xilinx SDAccel Development Environment. Bringing The Best Performance/Watt to the Data Center
SDAccel Environment The Xilinx SDAccel Development Environment Bringing The Best Performance/Watt to the Data Center Introduction Data center operators constantly seek more server performance. Currently
More informationA unified multicore programming model
A unified multicore programming model Simplifying multicore migration By Sven Brehmer Abstract There are a number of different multicore architectures and programming models available, making it challenging
More informationVIProf: A Vertically Integrated Full-System Profiler
VIProf: A Vertically Integrated Full-System Profiler NGS Workshop, April 2007 Hussam Mousa Chandra Krintz Lamia Youseff Rich Wolski RACELab Research Dynamic software adaptation As program behavior or resource
More informationThe Use of Cloud Computing Resources in an HPC Environment
The Use of Cloud Computing Resources in an HPC Environment Bill, Labate, UCLA Office of Information Technology Prakashan Korambath, UCLA Institute for Digital Research & Education Cloud computing becomes
More informationSIGGRAPH Briefing August 2014
Copyright Khronos Group 2014 - Page 1 SIGGRAPH Briefing August 2014 Neil Trevett VP Mobile Ecosystem, NVIDIA President, Khronos Copyright Khronos Group 2014 - Page 2 Significant Khronos API Ecosystem Advances
More informationEnergy-Efficient Run-time Mapping and Thread Partitioning of Concurrent OpenCL Applications on CPU-GPU MPSoCs
Energy-Efficient Run-time Mapping and Thread Partitioning of Concurrent OpenCL Applications on CPU-GPU MPSoCs AMIT KUMAR SINGH, University of Southampton ALOK PRAKASH, Nanyang Technological University
More informationPorting of Real-Time Publish-Subscribe Middleware to Android
M.Vajnar, M. Sojka, P. Píša Czech Technical University in Prague Porting of Real-Time Publish-Subscribe Middleware to Android RTLWS15, Lugano-Manno Distributed applications problems 2/23 Distributed applications
More informationUnCovert: Evaluating thermal covert channels on Android systems. Pascal Wild
UnCovert: Evaluating thermal covert channels on Android systems Pascal Wild August 5, 2016 Contents Introduction v 1: Framework 1 1.1 Source...................................... 1 1.2 Sink.......................................
More informationEnabling a Richer Multimedia Experience with GPU Compute. Roberto Mijat Visual Computing Marketing Manager
Enabling a Richer Multimedia Experience with GPU Compute Roberto Mijat Visual Computing Marketing Manager 1 What is GPU Compute Operating System and most application processing continue to reside on the
More informationIntroduction to gem5. Nizamudheen Ahmed Texas Instruments
Introduction to gem5 Nizamudheen Ahmed Texas Instruments 1 Introduction A full-system computer architecture simulator Open source tool focused on architectural modeling BSD license Encompasses system-level
More information