Self-Aware Adaptation in FPGA-based Systems
1 DIPARTIMENTO DI ELETTRONICA E INFORMAZIONE
Self-Aware Adaptation in FPGA-based Systems
IEEE FPL 2010
Filippo Sironi: filippo.sironi@dresd.org
Marco Triverio: marco.triverio@dresd.org
Martina Maggio: mmaggio@mit.edu
Hank Hoffmann: hank@csail.mit.edu
Marco D. Santambrogio: santa@csail.mit.edu
August 31 - September 2, 2010
Politecnico di Milano, Milano, Italy
2 Rationale and Contribution
The complexity of modern computing systems is skyrocketing, mainly due to:
- The tremendous availability and integration of heterogeneous resources
- The demand for performance and reliability
We present a Self-Aware Adaptive computing system that blends techniques from different research fields to exploit the available resources according to the execution context.
3 Motivation example
Thanks to the availability of heterogeneous resources, modern computing systems can offer many different implementations of the same functionality.
When multiple implementations of a functionality are available, choosing at compile time the one that best suits the execution context is not a trivial task.
4 Outline
- Introduction to Self-Aware Adaptive computing systems
- Context definition
- Proposed approach to implementing Self-Aware Adaptive computing systems
  - Observe
  - Decide
  - Act
- Experimental results
- Conclusions and future work
5 Introduction (1 of 2)
The key idea: manage technology and its complexity using technology itself, by changing the computing system's behavior and resource management policies.
The key characteristics:
- Awareness
- Adaptation
- Approximation
- Goal-orientation
6 Introduction (2 of 2)
Self-Aware Adaptive computing systems follow the Observe, Decide, and Act (ODA) loop.
[Figure: ODA loop diagram connecting Observe, Decide, and Act, with the labels Input, Goal, Environment, Internals, Output, and State]
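The Observe, Decide, and Act loop can be sketched in a few lines of Python. This is only an illustration of the control structure: the `OdaLoop` class, the toy `state` dictionary, and the action names are assumptions made for the example and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class OdaLoop:
    """One Observe-Decide-Act controller (all names are illustrative)."""
    observe: Callable[[], float]            # reads a metric from the system
    decide: Callable[[float, float], str]   # maps (metric, goal) to an action
    act: Callable[[str], None]              # applies the chosen action
    goal: float

    def step(self) -> str:
        metric = self.observe()
        action = self.decide(metric, self.goal)
        self.act(action)
        return action


# Toy system: a single "speed" value that the act step can raise.
state = {"speed": 2.0}


def observe() -> float:
    return state["speed"]


def decide(metric: float, goal: float) -> str:
    return "speed_up" if metric < goal else "keep"


def act(action: str) -> None:
    if action == "speed_up":
        state["speed"] += 1.0


loop = OdaLoop(observe, decide, act, goal=4.0)
actions = [loop.step() for _ in range(4)]
```

Keeping the three phases as separate callables mirrors the idea that each phase can be provided by a different sub-system.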
7 Context definition (1 of 2)
The literature is filled with works that take advantage of online monitoring/profiling, decision making, or hardware/software partitioning, either static (defined at compile time) or dynamic (defined at run time).
Many of these works do not exploit all of these features together, which results in a lack of Self-Aware Adaptation.
8 Context definition (2 of 2)
Using static hardware/software partitioning to select an implementation of a functionality prevents any kind of Self-Adaptation.
Using dynamic hardware/software partitioning allows Self-Adaptation to a degree that depends on Self-Awareness, which in turn depends on online monitoring/profiling.
9 Proposed approach
The Self-Aware Adaptive computing system we have developed fully exploits the ODA loop to overcome some of the limits we outlined in state-of-the-art solutions.
The ODA loop is implemented by means of three sub-systems:
- Heartbeats: observe
- Heuristic decision making process: decide
- Implementation Switch Service: act
10 Observe
Online performance assertion and monitoring are performed by means of Heartbeats.
Heartbeats is a simple yet extremely powerful library used to declare performance goals and to update and monitor the overall throughput of an application.
It is based on two simple concepts:
- Heartbeat
- Heart rate
More information at: (links not preserved)
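To make the two concepts concrete, the sketch below records heartbeats as timestamps and derives a heart rate over a sliding window. It is not the Heartbeats library API: the `HeartRateMonitor` class, its method names, and the window size are hypothetical, and the goal is assumed to be expressed as a minimum and a maximum heart rate.

```python
from collections import deque


class HeartRateMonitor:
    """Hypothetical monitor; NOT the Heartbeats library API."""

    def __init__(self, min_rate: float, max_rate: float, window: int = 4):
        self.min_rate = min_rate              # performance goal, lower bound
        self.max_rate = max_rate              # performance goal, upper bound
        self._stamps = deque(maxlen=window)   # most recent heartbeat times

    def heartbeat(self, timestamp: float) -> None:
        """Record one heartbeat at the given time (seconds)."""
        self._stamps.append(timestamp)

    def heart_rate(self) -> float:
        """Heartbeats per second over the stored window (0.0 if undefined)."""
        if len(self._stamps) < 2:
            return 0.0
        elapsed = self._stamps[-1] - self._stamps[0]
        if elapsed <= 0:
            return 0.0
        return (len(self._stamps) - 1) / elapsed

    def goal_met(self) -> bool:
        return self.min_rate <= self.heart_rate() <= self.max_rate


monitor = HeartRateMonitor(min_rate=5.0, max_rate=20.0)
for t in (0.0, 0.5, 1.0, 1.5):   # one heartbeat every 0.5 s
    monitor.heartbeat(t)
```

In this run the application emits two heartbeats per second, which is below the declared minimum, so `goal_met()` reports that the goal is not satisfied.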
11 Decide
The decision making process can be implemented in many different ways:
- Heuristic methods
- Probabilistic methods
- Machine learning techniques (1)
- Control theory techniques (2)
We implement a heuristic that uses information collected by Heartbeats and avoids oscillations.
(1) J. Eastep, et al.: Smartlocks: Lock Acquisition Scheduling for Self-Aware Synchronization (ICAC '10)
(2) M. Maggio, et al.: Controlling software applications via resource allocation within the Heartbeats framework (to appear in CDC '10)
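The slides do not describe how the heuristic avoids oscillations. One common way to do so, shown here purely as an assumption, is to require several consecutive out-of-range observations before requesting a switch. The `SwitchHeuristic` class, the `patience` parameter, and the action names are hypothetical.

```python
class SwitchHeuristic:
    """Illustrative decision rule; the authors' actual heuristic is not
    given in the slides, so the patience counter is an assumption."""

    def __init__(self, min_rate: float, max_rate: float, patience: int = 3):
        self.min_rate = min_rate
        self.max_rate = max_rate
        self.patience = patience   # consecutive violations needed to act
        self._below = 0
        self._above = 0

    def decide(self, heart_rate: float) -> str:
        # Count consecutive observations outside the goal range.
        if heart_rate < self.min_rate:
            self._below += 1
            self._above = 0
        elif heart_rate > self.max_rate:
            self._above += 1
            self._below = 0
        else:
            self._below = self._above = 0

        if self._below >= self.patience:
            self._below = 0
            return "faster"    # ask for a faster implementation
        if self._above >= self.patience:
            self._above = 0
            return "slower"    # a cheaper implementation would suffice
        return "keep"


heuristic = SwitchHeuristic(min_rate=5.0, max_rate=20.0, patience=3)
observed = [4.0, 6.0, 4.0, 4.0, 4.0, 25.0]
decisions = [heuristic.decide(rate) for rate in observed]
```

A single dip below the minimum (the first observation) does not trigger a switch; only the run of three low readings does, which keeps short disturbances from causing back-and-forth reconfiguration.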
12 Act (1 of 2)
The ability to hot-swap one implementation of a given functionality for another is a fundamental characteristic of a Self-Aware Adaptive computing system.
The behavior of the Implementation Switch Service is inspired by the hot-swap mechanism available in K42.
J. Appavoo, et al.: Experience with K42, an open-source, Linux-compatible, scalable operating-system kernel (IBM Systems Journal '05)
13 Act (2 of 2)
Switchable unit: Dynamic-Link Library
State translation: on-the-fly translation of the canonical data structure
[Figure: self-aware adaptive library containing a software implementation and a hardware implementation, each with its own data structure, exchanging state through a canonical data structure]
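State translation through a canonical data structure can be illustrated as follows. Each implementation keeps its state in its own format and only needs to convert to and from the canonical form, so a switch is an export followed by an import. Everything here is a hypothetical sketch: the class names, the fields of `CanonicalState`, and the mock hardware core are assumptions, and the real switchable units are dynamic-link libraries rather than Python classes.

```python
from dataclasses import dataclass


@dataclass
class CanonicalState:
    """Implementation-independent state (fields are illustrative)."""
    key: bytes
    blocks_done: int


class SoftwareImpl:
    """Keeps the key as bytes and a plain block counter."""

    def __init__(self) -> None:
        self.key = b""
        self.blocks_done = 0

    def export_state(self) -> CanonicalState:
        return CanonicalState(self.key, self.blocks_done)

    def import_state(self, s: CanonicalState) -> None:
        self.key, self.blocks_done = s.key, s.blocks_done


class HardwareImplMock:
    """Stand-in for a hardware core: the key lives in an integer register."""

    def __init__(self) -> None:
        self.key_register = 0
        self.key_length = 0
        self.block_counter = 0

    def export_state(self) -> CanonicalState:
        key = self.key_register.to_bytes(self.key_length, "big")
        return CanonicalState(key, self.block_counter)

    def import_state(self, s: CanonicalState) -> None:
        self.key_register = int.from_bytes(s.key, "big")
        self.key_length = len(s.key)
        self.block_counter = s.blocks_done


def switch(current, target):
    """Move state from current to target via the canonical form."""
    target.import_state(current.export_state())
    return target


sw = SoftwareImpl()
sw.key, sw.blocks_done = b"\x01\x02", 7
hw = switch(sw, HardwareImplMock())   # software -> hardware
back = switch(hw, SoftwareImpl())     # hardware -> software
```

With a canonical form, adding a third implementation requires one pair of conversion routines instead of a translator for every pair of implementations.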
14 Experimental results
Testing platform:
- Xilinx XC2VP30 FPGA, IBM PowerPC 405, 256 MB of SDRAM, and 1 GB of Flash, running a Linux-based operating system
Static analysis of both the hardware and the software implementation of the DES cryptographic algorithm
Dynamic analysis of the Self-Aware Adaptive computing system
Analysis of the overheads of online monitoring, the decision making process, and dynamic reconfiguration
15 Static analysis
[Figure: execution time [ms] (0 to 1000) versus number of blocks [#] (1 to 1000) for the Software, Hardware, and Reconfigurable Hardware implementations]
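One rough way to reason about a comparison between software, hardware, and reconfigurable hardware execution times is a break-even calculation. The sketch below assumes a linear cost per block and assumes that the reconfigurable hardware case equals the hardware execution time plus a one-off reconfiguration time; neither assumption is stated in the slides, and the numbers are invented for illustration, not measured results.

```python
import math
from typing import Optional


def break_even_blocks(sw_ms_per_block: float,
                      hw_ms_per_block: float,
                      reconfig_ms: float) -> Optional[int]:
    """Smallest block count for which reconfiguring to hardware is
    strictly faster than staying in software, or None if it never is.

    Assumed model: t_sw(n) = sw_ms_per_block * n
                   t_rhw(n) = reconfig_ms + hw_ms_per_block * n
    """
    gain = sw_ms_per_block - hw_ms_per_block
    if gain <= 0:
        return None   # hardware is not faster per block
    return math.floor(reconfig_ms / gain) + 1


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
n_star = break_even_blocks(sw_ms_per_block=1.0,
                           hw_ms_per_block=0.25,
                           reconfig_ms=150.0)
```

With these invented figures, both options take 200 ms at 200 blocks, so reconfiguring only pays off from 201 blocks onward.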
16 Dynamic analysis
[Figure: heart rate versus time (t0 to t7), with the following quantities marked]
- Δ: observation delta
- R: reconfiguration time
- m: minimum heart rate
- M: maximum heart rate
17 Overheads (1 of 2)
[Figure: overhead on execution time (0.00% to 7.00%) versus number of blocks [#] (1 to 1000); average overhead: 3.52%]
18 Overheads (2 of 2)
[Figure: execution times [ms] (0 to 200) of the Hardware and Reconfigurable Hardware implementations and overhead on execution time (92.50% to 95.50%) versus number of blocks [#] (1 to 1000); average overhead: 93.87%]
19 Conclusions
The proposed approach merges the potential of reconfigurable architectures with online performance assertion and monitoring and with adaptation capabilities.
Experimental results show the effectiveness of the proposed approach when the computing system works within an unpredictable environment.
The overhead of the online monitoring and decision making process proved to be sustainable.
20 Future work
Run the Linux-based operating system on top of a multi-core processor paired with an FPGA used as an accelerator co-processor.
Implement a hot-swap mechanism within the Linux kernel to allow switching between device drivers optimized for different execution contexts.