Orchestration: Turning material breakthroughs into application performance

Size: px
Start display at page:

Download "Orchestration: Turning material breakthroughs into application performance"

Transcription

1 Orchestration: Turning material breakthroughs into application performance Jeronimo Castrillon (on behalf of the Orchestration team) TU Dresden Cfaed Chair for Compiler Construction

2 Cfaed: Research program (revisited) Goal to explore new technologies for electronic information processing which overcome the limits of CMOS technology Page: 2

3 Cfaed: Research program (revisited) Information Processing Systems research to handle heterogeneity Goal Systems to explore new Computing technologies for electronic Devices & information processing which overcome the limits Circuits of CMOS technology Materials & Functions CMOS (industry.focus) H HAEC: Highly Adaptive Energy6Efficient A!!Silicon!Nanowires B!!Carbon F Orchestration G Resilience C!!Organic D!!Biomolecular Assembly E!!Chemical Biology to inspire solutions I Biological Material research for post-cmos technologies Page: 3

4 Orchestration Path Mission Turn materials breakthroughs into application performance F Orchestration Page: 4

5 Underlying algorithms: self-adjusting Code synthesis: variant generation Orchestration Path Mission App.2Algorithms Skeletons Turn materials breakthroughs HW/SW%Stack Compilers into application performance Operating2system Heterogeneous template Cross-layer strategies Mobile2 Communi) cation Coordination High2 Performance Computing Hardware F Orchestration Database application Formal2Analysis Materials)Inspired Paths (Paths A2 E) Programming abstractions Micro-kernels System analysis: modeling, verification & reasoning Page: 5

6 Orchestration Path: the team! Investigators Prof. Uwe Aßmann Prof. Franz Baader Prof. Christel Baier Prof. Jeronimo Castrillon Prof. Gerhard Fettweis Prof. Christof Fetzer Prof. Jochen Fröhlich Prof. Hermann Härtig Prof. Wolfgang Lehner Prof. Wolfgang E. Nagel Dr. Marcus Völp Prof. Axel Voigt Page: 6

7 Orchestration Path: the team (2)! PhD/Postdocs Nils Asmußen Andres Goens Sebastian Haas Dirk Habich Immo Huismann Tomas Karnagel Sven Karol Sascha Klüppelholz Linda Leuschner Matthias Lieber Siqi Ling Steffen Märker Johannes Mey Benedikt Nöthen Michael Raitza Norman Rink Annett Ungethüm Page: 7

8 Outline! Introduction! HW/SW stacks! Orchestration stack! Outlook! Concluding remarks Page: 8

9 What is a stack! Software components that form a complete platform (where you can run applications on) Web app: Operating system, web server, database, programming language! Provide means to handle complexity Page: 9

10 HW/SW Stack: Example 1 Android%sack Source:% Page: 10

11 HW/SW Stack: Example 2 HSA%Stack Source:%Heterogeneous%System%Architecture eous=computing/what=is=heterogeneous= system=architecture=hsa/ Page: 11

12 Stacks and wild heterogeneity! Abstract/hide complexity! Leave as much visible as needed for efficiency! Emerging techs.: rethink established layers Languages, compilers, operating systems, Page: 12

13 Example: Compiler Runtime Compiler protocol assumptions Code Data Stack 0x0 Endurance problem: try not to use same locations that often! Runtime environment Heap 0xF..F Page: 13

14 Example: Flash FTL File%system File%system Flash%Translation% Layer%(FTL) Mapping Wear%leveling Hard%disk Flash%drive Page: 14

15 Outline! Introduction! HW/SW stacks! Orchestration stack! Outlook! Concluding remarks Page: 15

16 Recall: orchestration stack! Heterogeneous materials In Cfaed: Along with orchestration (see other DMA presentations)! Research plan Start with heterogeneous CMOS Tiled for scalability, integrability Pilot projects with materials as they become available Mobile2 Communi) cation Coordination High2 Performance Computing App.2Algorithms Skeletons Compilers Operating2system Hardware Database application Formal2Analysis Materials)Inspired Paths (Paths A2 E) Page: 16

17 Applications/Drivers! Mobile communication: typically run on heterogeneous! Data-bases: benefits of heterogeneity (see below)! HPC: high level languages, abstractions for scalability, productivity and portability Adaptive Finite Element Methods (FEM) Computational Fluid Dynamics (CFD) Coordination Applications App.% Algorithms Skeletons Compilers Operating% system Hardware Formal% Analysis Page: 17

18 Applications Applications/Drivers (2)! Adaptive Finite Element Methods (FEM) Working with multiple-meshes Communication across subdomains Requires repartitioning " load balancing Coordination App.% Algorithms Skeletons Compilers Operating% system Hardware Formal% Analysis Working on! Portable templatized library [Witkowski15] Page: 18! First results on heterogeneous platforms Dendritic%growth

19 Applications Applications/Drivers (3)! Computational Fluid Dynamics (CFD)! Large applications (10 10 unknowns)! Communication/computation is key Coordination App.% Algorithms Skeletons Compilers Operating% system Hardware Formal% Analysis! Working on! Heterogeneous implementation! More versatile fundamental concepts [Huismann15] Page: 19

20 Applications Hardware! Tomahawk: heterogeneous CMOS Coordination App.% Algorithms Skeletons Compilers Operating% system Hardware Formal% Analysis CM /µk! Tiled (with NoC) for scalability! Allows to integrate wildly heterogeneous components! CoreManager (CM): HW support for fine-grained task dispatching [Arnold13] Page: 20

21 Hardware Applications App.% Algorithms Skeletons Compilers Operating% system! DB-processor for sorting algorithms Hardware Coordination Formal% Analysis Selectivity: [Arnold14] Page: 21

22 Applications Operating Systems! Microkernel for heterogeneity Isolation (at NoC level)! Dynamic compartments! Isolation at network boundary Remote control: no need for OS everywhere! Simplify integration of novel materials! OS services on remote cores Coordination App.% Algorithms Skeletons Compilers Operating% system Hardware Formal% Analysis Page: 22

23 Applications Operating Systems! Microkernel for heterogeneity Coordination App.% Algorithms Skeletons Compilers Operating% system Hardware Formal% Analysis App2 1 File2 System [Asmussen15] App2 2 CM/ µk Isolation & remote control Page: 23

24 Compilers! Parallel dataflow programming languages and compilers PNkpn AudioAmp PNin(short A[2]) PNout(short B[2]) PNparam(short boost){ while (1) PNin(A) PNout(B) { for (int i = 0; i < 2; i++) B[i] = A[i]*boost; }} PNprocess Amp1 = AudioAmp PNin(C) PNout(F) PNparam(3); PNprocess Amp2 = AudioAmp PNin(D) PNout(G) PNparam(10); Coordination Applications App.% Algorithms Skeletons Compilers Operating% system Hardware Formal% Analysis [Castrillon14] Page: 24

25 Applications Compilers! SW synthesis for Tomahawk Static/dynamic mapping Automatic code generation Coordination App.% Algorithms Skeletons Compilers Operating% system Hardware Formal% Analysis P2 P5 P1 P3 P6 P4 CM / µk Page: 25

26 Skeletons! Basic building blocks for parallelism! Heterogeneous programming Different possible implementations of same principles (CUDA, Fortran, ) Advanced pre-compiler to generate and manage variants " Productivity Coordination Applications App.% Algorithms Skeletons Compilers Operating% system Hardware Formal% Analysis Page: 26

27 Skeletons Applications App.% Algorithms Skeletons Compilers Operating% system! Basic building blocks for parallelism Hardware Coordination Formal% Analysis 2 Style*Definitions Style:*OpenACC Type:*parallelization Fragment Target:DoConcurrent!$acc*loop*private(#PRIVATE_VARS#) #INNER#!$acc*end*do Attribute Loop.private_vars() Page: 27 1 Sequential*Program do*concurrent*(i =*1:n,* j =* 1:m) mat(i,j))=)x), y(i,j,k))*)2 end*do 2 Style*Application**Recipe RECIPE*{parallelization:OpenACC} OpenMP 3 Parallel*Program!$acc*parallel!$acc*loop*private(i,*j) do*k* =*1,*num_ele do*i*=*0,*po mat(i,j))=)x), y(i,j,k))*)2 end* do end*do!$acc*end*loop!$acc*end*parallel [Karol15]

28 Algorithms! Raise the level of abstraction High-level domain specific languages (more later) Algorithmic transformations: more optimization room Algorithmic specification: allow for adaptability Coordination Applications App.% Algorithms Skeletons Compilers Operating% system Hardware Formal% Analysis Page: 28

29 Formal analysis! Probabilistic model checking (PMC) Analyze approaches across layers of the stack Work on models for heterogeneous systems Examples:! Probability of finishing N tasks in T time! Probability of finishing N tasks with energy budget B! Trade-off analysis cost-utility (energy vs. performance) Model protocols: Spinlock, barriers, ebond, Coordination Applications App.% Algorithms Skeletons Compilers Operating% system Hardware Formal% Analysis Page: 29

30 Formal analysis Applications App.% Algorithms Skeletons Compilers Operating% system! Probabilistic model checking (PMC) Hardware Coordination Formal% Analysis [Baier14] ebond Minimal%expected%energy minimal&expected&energy&[wa2s] 2.9$ 2.7$ 2.5$ 2.3$ 2.1$ 1.9$ 1.7$ 1.5$ 1.3$ 7.3$ 6.3$ 5.3$ 4.3$ 3.3$ 2.3$ 1.3$ 200$ 300$ 400$ 500$ 600$ 700$ 800$ 900$ 1000$ 1100$ 1200$ 1300$ 1400$ 1500$ 1600$ 1700$ 1800$ 1900$ 2000$ 2100$ 2200$ 2300$ 2400$ 2500$ 2600$ 2700$ 2800$ 2900$ 3000$ 3100$ 3200$ 3300$ 3400$ 3500$ 3600$ 3700$ 3800$ 3900$ 4000$ 4100$ 4200$ 4300$ 4400$ 4500$ 4600$ 4700$ 4800$ 4900$ 5000$ 5100$ 5200$ 5300$ 5400$ 5500$ 5600$ 5700$ 5800$ 5900$ 6000$ 6100$ 6200$ 6300$ 6400$ 6500$ 6600$ 6700$ 6800$ 6900$ 7000$ 7100$ 7200$ 200$ 300$ 400$ 500$ 600$ 700$ 800$ 900$ 1000$ 1100$ 1200$ 1300$ 1400$ 1500$ 1600$ 1700$ 1800$ 1900$ 2000$ 2100$ 2200$ 2300$ 2400$ 2500$ 2600$ 2700$ 2800$ 2900$ 3000$ 3100$ 3200$ Maximal%required%bandwidth maximal&required&ebond+&bandwidth&[mbit/s] Page: 30

31 Outline! Introduction! HW/SW stacks! Orchestration stack! Outlook! Concluding remarks Page: 31

32 Current and future work! Further work on languages and methodologies Computational biology Chemical information processing! Technology analysis at system-level Pros (USPs) and cons Attempt skip incremental improvement System-level benefit based on assumptions Page: 32

33 Language for computational biology! Particle and mesh-based Simulations Discrete models of physical phenomena! Goal: Abstract programming language + optimization for heterogeneous HPC-clusters Particle0 Mesh DSL Analysis Static Opt. Runtime opt. [Karol15] Page: 33

34 Language for computational biology client grayscott integer, dimension(6) :: bcdef = real(..), dimension(:,:), pointer :: displace integer :: interval add_arg(k_rate,<#real(mk)#>,1.0_mk,0.0_m k,'k_rat e', ) add_arg(f,<#real(mk)#>,1.0_mk,0.0_mk,'f_ param', ') add_arg(d_u,<#real(mk)#>,1.0_mk,0.0_mk,' Du_param ', ) add_arg(d_v,<#real(mk)#>,1.0_mk,0.0_mk,' Dv_param ', ') ppm_init(1) U = create_field(1, "U") V = create_field(1, "V ) topo = create_topology(bcdef) c = create_particles(topo) allocate(displace(ppm_dim,c%npart)) call random_number(displace) call c%move(displace, info) call c%apply_bc(info) global_mapping(c, topo) discretize(u,c) discretize(v,c) ghost_mapping(c) foreach p in particles(c) with sca_fields(u,v) U_p = 1.0_mk V_p = 0.0_mk if (((x_p(1)-0.5)**2+(x_p(2)-0.5_mk)**2).lt.0. 01) [Karol15] U_p = 0.5_mk _mk*noise V_p = 0.25_mk _mk*noise end if end foreach Page: 34 then n = create_neighlist(c,cutoff=<#4._mk * c%h_avg#>) Lap = define_op(2, [2,0, 0,2], [1.0_mk, 1.0_mk], "Lap") W = discretize_op(lap, c, ppm_param_op_dcpse, [order=>2,c=>1.0_mk]) o,nstag = create_ode([u,v],grayscott_rhs,[u=>c,v], rk4) interval = 1 t = timeloop() do istage=1,nstag ghost_mapping(c) ode_step(o, t, time_step, 1) end do print([u=>c, V=>c],interval ) end timeloop ppm_finalize() end client rhs grayscott_rhs(u=>parts,v) get_fields(du,dv) du = apply_op(w, U) dv = apply_op(w, V) foreach p in particles(parts) with sca_fields(u,v,du,dv) du_p = D_u*dU_p - U_p*(V_p**2) + F*(1.0_mk-U_p) dv_p = D_v*dV_p + U_p*(V_p**2) - (F+k_rate)*V_p end foreach end rhs

35 SiNW: Components and reconfigurability! Edge of SiNW over other technologies P/N reconfigurability Multi-gate devices (with low latency penalty)! Components based on assumptions New uarchitectures? New pipeline structures and compiler optimizations? Page: 35

36 Outline! Introduction! HW/SW stacks! Orchestration stack! Outlook! Concluding remarks Page: 36

37 Summary! Orchestration path: Multi-disciplinary effort to turn materials breakthroughs into application performance! HW/SW stack to handle heterogeneity Using CMOS as vehicle Prepare for wildly heterogeneous systems! Interact with material experts, understand technologies and analyze at system-level Mobile2 Communi) cation Coordination High2 Performance Computing App.2Algorithms Skeletons Compilers Operating2system Hardware Database application Formal2Analysis Materials)Inspired Paths (Paths A2 E) Page: 37

38 Thanks! Retreat:%Orchestration% and%resilience%paths%( ) Page: 38

39 References [Witkowski15] Witkowski, T., Ling, S., Praetorius, S., & Voigt, A. (2015). Software concepts and numerical algorithms for a scalable adaptive parallel finite element method. Advances in Computational Mathematics, 1-33 [Huismann15] Huismann, I., Stiller, J., & Fröhlich, J. (2015). Two-level parallelization of a fluid mechanics algorithm exploiting hardware heterogenity. Computers & Fluids [Arnold13] Arnold, O., Matus, E., Noethen, B., Winter, M., Limberg, T., & Fettweis, G. (2014). Tomahawk: Parallelism and heterogeneity in communications signal processing MPSoCs. ACM Transactions on Embedded Computing Systems (TECS), 13(3s), 107 [Arnold14] Arnold, O., Haas, S., Fettweis, G., Schlegel, B., Kissinger, T., Karnagel, T., & Lehner, W. (2014). HASHI: An Application-Specific Instruction Set Extension for Hashing. ADMS@ VLDB, [Asmussen15] Asmussen, N., Volp, M., Nothen, B., & Ungethum, A. (2015, April). Demo abstract: Taming many heterogeneous cores. In Real-Time and Embedded Technology and Applications Symposium (RTAS), 2015 IEEE (pp ) [Castrillon14] J. Castrillon and R. Leupers, Programming Heterogeneous MPSoCs: Tool Flows to Close the Software Productivity Gap. Springer, 2014 [Karol15] Sven Karol, Well-Formed and Scalable Invasive Software Composition. PhD thesis. May 2015 [Baier14] Baier, C., Dubslaff, C., & Klüppelholz, S. (2014, July). Trade-off analysis meets probabilistic model checking. In Proceedings of the Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS) [Karol15] S. Karol, P. Incardona, Y. Afshar, I. Sbalzarini, and J. Castrillon. Towards a Next-Generation Parallel Particle-Mesh Language, in Domain-Specific Language Design and Implementation (DSLDI). July 2015 Page: 39

Compiling for deeply embedded and heterogeneous signal processing systems

Compiling for deeply embedded and heterogeneous signal processing systems Compiling for deeply embedded and heterogeneous signal processing systems Jeronimo Castrillon Cfaed Chair for Compiler Construction (CCC) 5G Summit, Dresden, Germany September 29, 2016 Multi-Processor/core

More information

An MPSoC for Energy-Efficient Database Query Processing

An MPSoC for Energy-Efficient Database Query Processing Vodafone Chair Mobile Communications Systems, Prof. Dr.-Ing. Dr. h.c. G. Fettweis An MPSoC for Energy-Efficient Database Query Processing TensilicaDay 2016 Sebastian Haas Emil Matúš Gerhard Fettweis 09.02.2016

More information

On mapping to multi/manycores

On mapping to multi/manycores On mapping to multi/manycores Jeronimo Castrillon Chair for Compiler Construction (CCC) TU Dresden, Germany MULTIPROG HiPEAC Conference Stockholm, 24.01.2017 Mapping for dataflow programming models MEM

More information

Dataflow programming for heterogeneous computing systems

Dataflow programming for heterogeneous computing systems Dataflow programming for heterogeneous computing systems Jeronimo Castrillon Cfaed Chair for Compiler Construction TU Dresden jeronimo.castrillon@tu-dresden.de Tutorial: Algorithmic specification, tools

More information

Overview on Hardware Optimizations for Database Engines

Overview on Hardware Optimizations for Database Engines Overview on Hardware Optimizations for Database Engines Annett Ungethüm, Dirk Habich, Tomas Karnagel, Sebastian Haas, Eric Mier, Gerhard Fettweis, Wolfgang Lehner BTW 2017, Stuttgart, Germany, 2017-03-09

More information

Programming Heterogeneous Embedded Systems for IoT

Programming Heterogeneous Embedded Systems for IoT Programming Heterogeneous Embedded Systems for IoT Jeronimo Castrillon Chair for Compiler Construction TU Dresden jeronimo.castrillon@tu-dresden.de Get-together toward a sustainable collaboration in IoT

More information

Adaptive Query Processing on Prefix Trees Wolfgang Lehner

Adaptive Query Processing on Prefix Trees Wolfgang Lehner Adaptive Query Processing on Prefix Trees Wolfgang Lehner Fachgruppentreffen, 22.11.2012 TU München Prof. Dr.-Ing. Wolfgang Lehner > Challenges for Database Systems Three things are important in the database

More information

Hardware Software Codesign of Embedded Systems

Hardware Software Codesign of Embedded Systems Hardware Software Codesign of Embedded Systems Rabi Mahapatra Texas A&M University Today s topics Course Organization Introduction to HS-CODES Codesign Motivation Some Issues on Codesign of Embedded System

More information

Analysis and software synthesis of KPN applications

Analysis and software synthesis of KPN applications Analysis and software synthesis of KPN applications Jeronimo Castrillon Chair for Compiler Construction TU Dresden jeronimo.castrillon@tu-dresden.de DREAMS Seminar UC Berkeley, CA. October 22 nd 2015 Acknowledgements

More information

A Database Accelerator for Energy-Efficient Query Processing and Optimization

A Database Accelerator for Energy-Efficient Query Processing and Optimization A Database Accelerator for Energy-Efficient Query Processing and Optimization Sebastian Haas, Oliver Arnold, Stefan Scholze, Sebastian Höppner, Georg Ellguth, Andreas Dixius, Annett Ungethüm, Eric Mier,

More information

EE382V: System-on-a-Chip (SoC) Design

EE382V: System-on-a-Chip (SoC) Design EE382V: System-on-a-Chip (SoC) Design Lecture 8 HW/SW Co-Design Sources: Prof. Margarida Jacome, UT Austin Andreas Gerstlauer Electrical and Computer Engineering University of Texas at Austin gerstl@ece.utexas.edu

More information

Hardware Software Codesign of Embedded System

Hardware Software Codesign of Embedded System Hardware Software Codesign of Embedded System CPSC489-501 Rabi Mahapatra Mahapatra - Texas A&M - Fall 00 1 Today s topics Course Organization Introduction to HS-CODES Codesign Motivation Some Issues on

More information

Design methodology for multi processor systems design on regular platforms

Design methodology for multi processor systems design on regular platforms Design methodology for multi processor systems design on regular platforms Ph.D in Electronics, Computer Science and Telecommunications Ph.D Student: Davide Rossi Ph.D Tutor: Prof. Roberto Guerrieri Outline

More information

Center Extreme Scale CS Research

Center Extreme Scale CS Research Center Extreme Scale CS Research Center for Compressible Multiphase Turbulence University of Florida Sanjay Ranka Herman Lam Outline 10 6 10 7 10 8 10 9 cores Parallelization and UQ of Rocfun and CMT-Nek

More information

NVMain Extension for Multi-Level Cache Systems

NVMain Extension for Multi-Level Cache Systems NVMain Extension for Multi-Level Cache Systems Asif Ali Khan, Fazal Hameed and Jeronimo Castrillon Chair For Compiler Construction Technische Universität Dresden, Germany { asif_ali.khan, fazal.hameed,

More information

Easy Multicore Programming using MAPS

Easy Multicore Programming using MAPS Easy Multicore Programming using MAPS Jeronimo Castrillon, Maximilian Odendahl Multicore Challenge Conference 2012 September 24 th, 2012 Institute for Communication Technologies and Embedded Systems Outline

More information

HETEROGENEOUS SYSTEM ARCHITECTURE: PLATFORM FOR THE FUTURE

HETEROGENEOUS SYSTEM ARCHITECTURE: PLATFORM FOR THE FUTURE HETEROGENEOUS SYSTEM ARCHITECTURE: PLATFORM FOR THE FUTURE Haibo Xie, Ph.D. Chief HSA Evangelist AMD China OUTLINE: The Challenges with Computing Today Introducing Heterogeneous System Architecture (HSA)

More information

... Lecture 10. Concepts of Mobile Operating Systems. Mobile Business I (WS 2017/18) Prof. Dr. Kai Rannenberg

... Lecture 10. Concepts of Mobile Operating Systems. Mobile Business I (WS 2017/18) Prof. Dr. Kai Rannenberg Lecture 10 Concepts of Mobile Operating Systems Mobile Business I (WS 2017/18) Prof. Dr. Kai Rannenberg... Deutsche Telekom Chair of Mobile Business & Multilateral Security Johann Wolfgang Goethe University

More information

COSADE Conference Series

COSADE Conference Series COSADE Conference Series Past, Present, and Future Sorin A. Huss 1 / 24 Initiators Werner Schindler Sorin Alexander Huss 2 / 24 Constructive Side-Channel Analysis and Secure Design Time Period 2010 to

More information

HW/SW-Database-CoDesign for Compressed Bitmap Index Processing

HW/SW-Database-CoDesign for Compressed Bitmap Index Processing HW/SW-Database-CoDesign for Compressed Bitmap Index Processing Sebastian Haas, Tomas Karnagel, Oliver Arnold, Erik Laux 1, Benjamin Schlegel 2, Gerhard Fettweis, Wolfgang Lehner Vodafone Chair Mobile Communications

More information

Conflict Detection-based Run-Length Encoding AVX-512 CD Instruction Set in Action

Conflict Detection-based Run-Length Encoding AVX-512 CD Instruction Set in Action Conflict Detection-based Run-Length Encoding AVX-512 CD Instruction Set in Action Annett Ungethüm, Johannes Pietrzyk, Patrick Damme, Dirk Habich, Wolfgang Lehner HardBD & Active'18 Workshop in Paris, France

More information

24. Framework Documentation

24. Framework Documentation 24. Framework Documentation 1 Prof. Uwe Aßmann TU Dresden Institut für Software und Multimediatechnik Lehrstuhl Softwaretechnologie 15-0.2, 23.01.16 Design Patterns and Frameworks, Prof. Uwe Aßmann Obligatory

More information

Hardware/Software Co-design

Hardware/Software Co-design Hardware/Software Co-design Zebo Peng, Department of Computer and Information Science (IDA) Linköping University Course page: http://www.ida.liu.se/~petel/codesign/ 1 of 52 Lecture 1/2: Outline : an Introduction

More information

Dynamic inter-core scheduling in Barrelfish

Dynamic inter-core scheduling in Barrelfish Dynamic inter-core scheduling in Barrelfish. avoiding contention with malleable domains Georgios Varisteas, Mats Brorsson, Karl-Filip Faxén November 25, 2011 Outline Introduction Scheduling & Programming

More information

THE COMPARISON OF PARALLEL SORTING ALGORITHMS IMPLEMENTED ON DIFFERENT HARDWARE PLATFORMS

THE COMPARISON OF PARALLEL SORTING ALGORITHMS IMPLEMENTED ON DIFFERENT HARDWARE PLATFORMS Computer Science 14 (4) 2013 http://dx.doi.org/10.7494/csci.2013.14.4.679 Dominik Żurek Marcin Pietroń Maciej Wielgosz Kazimierz Wiatr THE COMPARISON OF PARALLEL SORTING ALGORITHMS IMPLEMENTED ON DIFFERENT

More information

Reconstruction of Trees from Laser Scan Data and further Simulation Topics

Reconstruction of Trees from Laser Scan Data and further Simulation Topics Reconstruction of Trees from Laser Scan Data and further Simulation Topics Helmholtz-Research Center, Munich Daniel Ritter http://www10.informatik.uni-erlangen.de Overview 1. Introduction of the Chair

More information

! Readings! ! Room-level, on-chip! vs.!

! Readings! ! Room-level, on-chip! vs.! 1! 2! Suggested Readings!! Readings!! H&P: Chapter 7 especially 7.1-7.8!! (Over next 2 weeks)!! Introduction to Parallel Computing!! https://computing.llnl.gov/tutorials/parallel_comp/!! POSIX Threads

More information

COS 318: Operating Systems

COS 318: Operating Systems COS 318: Operating Systems OS Structures and System Calls Jaswinder Pal Singh Computer Science Department Princeton University (http://www.cs.princeton.edu/courses/cos318/) Outline Protection mechanisms

More information

MoCC - Models of Computation and Communication SystemC as an Heterogeneous System Specification Language

MoCC - Models of Computation and Communication SystemC as an Heterogeneous System Specification Language SystemC as an Heterogeneous System Specification Language Eugenio Villar Fernando Herrera University of Cantabria Challenges Massive concurrency Complexity PCB MPSoC with NoC Nanoelectronics Challenges

More information

Co-Design of Many-Accelerator Heterogeneous Systems Exploiting Virtual Platforms. SAMOS XIV July 14-17,

Co-Design of Many-Accelerator Heterogeneous Systems Exploiting Virtual Platforms. SAMOS XIV July 14-17, Co-Design of Many-Accelerator Heterogeneous Systems Exploiting Virtual Platforms SAMOS XIV July 14-17, 2014 1 Outline Introduction + Motivation Design requirements for many-accelerator SoCs Design problems

More information

Seminar Optimizing data management on new hardware (OpDaMNeHa)

Seminar Optimizing data management on new hardware (OpDaMNeHa) Seminar Optimizing data management on new hardware (OpDaMNeHa) Summer Term 2014 Lehrgebiet Informationssysteme Weiping Qu qu@cs.uni-kl.de AG Datenbanken und Informationssysteme AG Heterogene Informationssysteme

More information

Modeling, Testing and Executing Reo Connectors with the. Reo, Eclipse Coordination Tools

Modeling, Testing and Executing Reo Connectors with the. Reo, Eclipse Coordination Tools Replace this file with prentcsmacro.sty for your meeting, or with entcsmacro.sty for your meeting. Both can be found at the ENTCS Macro Home Page. Modeling, Testing and Executing Reo Connectors with the

More information

MPSOC Design examples

MPSOC Design examples MPSOC 2007 Eshel Haritan, VP Engineering, Inc. 1 MPSOC Design examples Freescale: ARM1136 + StarCore140e Broadcom: ARM11 + ARM9 + TeakLite + accelerators Qualcomm 4 processors + video, gps, wireless, audio

More information

NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU

NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU NVIDIA Think about Computing as Heterogeneous One Leo Liao, 1/29/2106, NTU GPGPU opens the door for co-design HPC, moreover middleware-support embedded system designs to harness the power of GPUaccelerated

More information

Improving the Practicality of Transactional Memory

Improving the Practicality of Transactional Memory Improving the Practicality of Transactional Memory Woongki Baek Electrical Engineering Stanford University Programming Multiprocessors Multiprocessor systems are now everywhere From embedded to datacenter

More information

Hierarchical vs. Flat Component Models

Hierarchical vs. Flat Component Models Hierarchical vs. Flat Component Models František Plášil, Petr Hnětynka DISTRIBUTED SYSTEMS RESEARCH GROUP http://nenya.ms.mff.cuni.cz Outline Component models (CM) Desired Features Flat vers. hierarchical

More information

REDUCTION IN RUN TIME USING TRAP ANALYSIS

REDUCTION IN RUN TIME USING TRAP ANALYSIS REDUCTION IN RUN TIME USING TRAP ANALYSIS 1 Prof. K.V.N.Sunitha 2 Dr V. Vijay Kumar 1 Professor & Head, CSE Dept, G.Narayanamma Inst.of Tech. & Science, Shaikpet, Hyderabad, India. 2 Dr V. Vijay Kumar

More information

The Microkernel Overhead

The Microkernel Overhead The Micro Overhead http://d3s.mff.cuni.cz Martin Děcký decky@d3s.mff.cuni.cz CHARLES UNIVERSITY IN PRAGUE faculty of mathematics and physics Martin Děcký, FOSDEM 2012, 5 th February 2012 The Micro Overhead

More information

ECE 8823: GPU Architectures. Objectives

ECE 8823: GPU Architectures. Objectives ECE 8823: GPU Architectures Introduction 1 Objectives Distinguishing features of GPUs vs. CPUs Major drivers in the evolution of general purpose GPUs (GPGPUs) 2 1 Chapter 1 Chapter 2: 2.2, 2.3 Reading

More information

Hardware-Software Codesign. 6. System Simulation

Hardware-Software Codesign. 6. System Simulation Hardware-Software Codesign 6. System Simulation Lothar Thiele 6-1 System Design specification system simulation (this lecture) (worst-case) perf. analysis (lectures 10-11) system synthesis estimation SW-compilation

More information

Outline EEL 5764 Graduate Computer Architecture. Chapter 3 Limits to ILP and Simultaneous Multithreading. Overcoming Limits - What do we need??

Outline EEL 5764 Graduate Computer Architecture. Chapter 3 Limits to ILP and Simultaneous Multithreading. Overcoming Limits - What do we need?? Outline EEL 7 Graduate Computer Architecture Chapter 3 Limits to ILP and Simultaneous Multithreading! Limits to ILP! Thread Level Parallelism! Multithreading! Simultaneous Multithreading Ann Gordon-Ross

More information

The Sluice Gate Theory: Have we found a solution for memory wall?

The Sluice Gate Theory: Have we found a solution for memory wall? The Sluice Gate Theory: Have we found a solution for memory wall? Xian-He Sun Illinois Institute of Technology Chicago, Illinois sun@iit.edu Keynote, HPC China, Nov. 2, 205 Scalable Computing Software

More information

DEPARTMENT OF COMPUTER SCIENCE

DEPARTMENT OF COMPUTER SCIENCE Department of Computer Science 1 DEPARTMENT OF COMPUTER SCIENCE Office in Computer Science Building, Room 279 (970) 491-5792 cs.colostate.edu (http://www.cs.colostate.edu) Professor L. Darrell Whitley,

More information

SPIN Operating System

SPIN Operating System SPIN Operating System Motivation: general purpose, UNIX-based operating systems can perform poorly when the applications have resource usage patterns poorly handled by kernel code Why? Current crop of

More information

IDE for medical device software development. Hyun-Do Lee, Field Application Engineer

IDE for medical device software development. Hyun-Do Lee, Field Application Engineer IDE for medical device software development Hyun-Do Lee, Field Application Engineer Agenda SW Validation Functional safety certified tool IAR Embedded Workbench Code Analysis tools SW Validation Certifications

More information

More on Conjunctive Selection Condition and Branch Prediction

More on Conjunctive Selection Condition and Branch Prediction More on Conjunctive Selection Condition and Branch Prediction CS764 Class Project - Fall Jichuan Chang and Nikhil Gupta {chang,nikhil}@cs.wisc.edu Abstract Traditionally, database applications have focused

More information

CUDA GPGPU Workshop 2012

CUDA GPGPU Workshop 2012 CUDA GPGPU Workshop 2012 Parallel Programming: C thread, Open MP, and Open MPI Presenter: Nasrin Sultana Wichita State University 07/10/2012 Parallel Programming: Open MP, MPI, Open MPI & CUDA Outline

More information

COS 318: Operating Systems

COS 318: Operating Systems COS 318: Operating Systems OS Structures and System Calls Prof. Margaret Martonosi Computer Science Department Princeton University http://www.cs.princeton.edu/courses/archive/fall11/cos318/ Outline Protection

More information

Computer Architecture Lecture 12: Out-of-Order Execution (Dynamic Instruction Scheduling)

Computer Architecture Lecture 12: Out-of-Order Execution (Dynamic Instruction Scheduling) 18-447 Computer Architecture Lecture 12: Out-of-Order Execution (Dynamic Instruction Scheduling) Prof. Onur Mutlu Carnegie Mellon University Spring 2015, 2/13/2015 Agenda for Today & Next Few Lectures

More information

Co-synthesis and Accelerator based Embedded System Design

Co-synthesis and Accelerator based Embedded System Design Co-synthesis and Accelerator based Embedded System Design COE838: Embedded Computer System http://www.ee.ryerson.ca/~courses/coe838/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer

More information

VIProf: A Vertically Integrated Full-System Profiler

VIProf: A Vertically Integrated Full-System Profiler VIProf: A Vertically Integrated Full-System Profiler NGS Workshop, April 2007 Hussam Mousa Chandra Krintz Lamia Youseff Rich Wolski RACELab Research Dynamic software adaptation As program behavior or resource

More information

Hardware/Software Codesign

Hardware/Software Codesign Hardware/Software Codesign SS 2016 Prof. Dr. Christian Plessl High-Performance IT Systems group University of Paderborn Version 2.2.0 2016-04-08 how to design a "digital TV set top box" Motivating Example

More information

Analyses, Hardware/Software Compilation, Code Optimization for Complex Dataflow HPC Applications

Analyses, Hardware/Software Compilation, Code Optimization for Complex Dataflow HPC Applications Analyses, Hardware/Software Compilation, Code Optimization for Complex Dataflow HPC Applications CASH team proposal (Compilation and Analyses for Software and Hardware) Matthieu Moy and Christophe Alias

More information

Energy Efficient Computing Systems (EECS) Magnus Jahre Coordinator, EECS

Energy Efficient Computing Systems (EECS) Magnus Jahre Coordinator, EECS Energy Efficient Computing Systems (EECS) Magnus Jahre Coordinator, EECS Who am I? Education Master of Technology, NTNU, 2007 PhD, NTNU, 2010. Title: «Managing Shared Resources in Chip Multiprocessor Memory

More information

HW/SW Co-Design Lab. Seminar 1 WS 2017/2018. chair. Vodafone Chair Mobile Communications Systems, Prof. Dr.-Ing. Dr. h.c. G.

HW/SW Co-Design Lab. Seminar 1 WS 2017/2018. chair. Vodafone Chair Mobile Communications Systems, Prof. Dr.-Ing. Dr. h.c. G. Vodafone Chair Mobile Communications Systems, Prof. Dr.-Ing. Dr. h.c. G. Fettweis HW/SW Co-Design Lab Seminar 1 WS 2017/2018 TU Dresden, Slide 1 Schedule of the Lab Introduction: Lab task: Work: Become

More information

Performance and Accuracy of Lattice-Boltzmann Kernels on Multi- and Manycore Architectures

Performance and Accuracy of Lattice-Boltzmann Kernels on Multi- and Manycore Architectures Performance and Accuracy of Lattice-Boltzmann Kernels on Multi- and Manycore Architectures Dirk Ribbrock, Markus Geveler, Dominik Göddeke, Stefan Turek Angewandte Mathematik, Technische Universität Dortmund

More information

Model-based Analysis of Event-driven Distributed Real-time Embedded Systems

Model-based Analysis of Event-driven Distributed Real-time Embedded Systems Model-based Analysis of Event-driven Distributed Real-time Embedded Systems Gabor Madl Committee Chancellor s Professor Nikil Dutt (Chair) Professor Tony Givargis Professor Ian Harris University of California,

More information

The Use Of Virtual Platforms In MP-SoC Design. Eshel Haritan, VP Engineering CoWare Inc. MPSoC 2006

The Use Of Virtual Platforms In MP-SoC Design. Eshel Haritan, VP Engineering CoWare Inc. MPSoC 2006 The Use Of Virtual Platforms In MP-SoC Design Eshel Haritan, VP Engineering CoWare Inc. MPSoC 2006 1 MPSoC Is MP SoC design happening? Why? Consumer Electronics Complexity Cost of ASIC Increased SW Content

More information

Distributed Operating Systems

Distributed Operating Systems Distributed Operating Systems Name no more precise Interesting/advanced Topics in Operating Systems scalability systems security modeling Some overlap with Distributed Systems (Prof Schill) In some cases

More information

Handling Challenges of Multi-Core Technology in Automotive Software Engineering

Handling Challenges of Multi-Core Technology in Automotive Software Engineering Model Based Development Tools for Embedded Multi-Core Systems Handling Challenges of Multi-Core Technology in Automotive Software Engineering VECTOR INDIA CONFERENCE 2017 Timing-Architects Embedded Systems

More information

M 3 Microkernel-based System for Heterogeneous Manycores

M 3 Microkernel-based System for Heterogeneous Manycores M 3 Microkernel-based System for Heterogeneous Manycores Nils Asmussen MKC, 06/29/2017 1 / 48 Overall System Design Prototype Platforms Capabilities OS Services Context Switching Evaluation Heterogeneous

More information

Parallel Computing Using OpenMP/MPI. Presented by - Jyotsna 29/01/2008

Parallel Computing Using OpenMP/MPI. Presented by - Jyotsna 29/01/2008 Parallel Computing Using OpenMP/MPI Presented by - Jyotsna 29/01/2008 Serial Computing Serially solving a problem Parallel Computing Parallelly solving a problem Parallel Computer Memory Architecture Shared

More information

Systems I: Programming Abstractions

Systems I: Programming Abstractions Systems I: Programming Abstractions Course Philosophy: The goal of this course is to help students become facile with foundational concepts in programming, including experience with algorithmic problem

More information

Department of Computer Science, Institute for System Architecture, Operating Systems Group. Real-Time Systems '08 / '09. Hardware.

Department of Computer Science, Institute for System Architecture, Operating Systems Group. Real-Time Systems '08 / '09. Hardware. Department of Computer Science, Institute for System Architecture, Operating Systems Group Real-Time Systems '08 / '09 Hardware Marcus Völp Outlook Hardware is Source of Unpredictability Caches Pipeline

More information

Database Management and Tuning

Database Management and Tuning Database Management and Tuning Concurrency Tuning Johann Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE Unit 8 May 10, 2012 Acknowledgements: The slides are provided by Nikolaus

More information

«Computer Science» Requirements for applicants by Innopolis University

«Computer Science» Requirements for applicants by Innopolis University «Computer Science» Requirements for applicants by Innopolis University Contents Architecture and Organization... 2 Digital Logic and Digital Systems... 2 Machine Level Representation of Data... 2 Assembly

More information

Challenges for GPU Architecture. Michael Doggett Graphics Architecture Group April 2, 2008

Challenges for GPU Architecture. Michael Doggett Graphics Architecture Group April 2, 2008 Michael Doggett Graphics Architecture Group April 2, 2008 Graphics Processing Unit Architecture CPUs vsgpus AMD s ATI RADEON 2900 Programming Brook+, CAL, ShaderAnalyzer Architecture Challenges Accelerated

More information

SELF-HEALING NETWORKS: REDUNDANCY AND STRUCTURE

SELF-HEALING NETWORKS: REDUNDANCY AND STRUCTURE SELF-HEALING NETWORKS: REDUNDANCY AND STRUCTURE Guido Caldarelli IMT, CNR-ISC and LIMS, London UK DTRA Grant HDTRA1-11-1-0048 INTRODUCTION The robustness and the shape Baran, P. On distributed Communications

More information

EE/CSCI 451: Parallel and Distributed Computation

EE/CSCI 451: Parallel and Distributed Computation EE/CSCI 451: Parallel and Distributed Computation Lecture #7 2/5/2017 Xuehai Qian Xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 Outline From last class

More information

Dongjun Shin Samsung Electronics

Dongjun Shin Samsung Electronics 2014.10.31. Dongjun Shin Samsung Electronics Contents 2 Background Understanding CPU behavior Experiments Improvement idea Revisiting Linux I/O stack Conclusion Background Definition 3 CPU bound A computer

More information

Run-time Environments

Run-time Environments Run-time Environments Status We have so far covered the front-end phases Lexical analysis Parsing Semantic analysis Next come the back-end phases Code generation Optimization Register allocation Instruction

More information

Run-time Environments

Run-time Environments Run-time Environments Status We have so far covered the front-end phases Lexical analysis Parsing Semantic analysis Next come the back-end phases Code generation Optimization Register allocation Instruction

More information

High-Level Simulations of On-Chip Networks

High-Level Simulations of On-Chip Networks High-Level Simulations of On-Chip Networks Claas Cornelius, Frank Sill, Dirk Timmermann 9th EUROMICRO Conference on Digital System Design (DSD) - Architectures, Methods and Tools - University of Rostock

More information

Design Space Exploration and Application Autotuning for Runtime Adaptivity in Multicore Architectures

Design Space Exploration and Application Autotuning for Runtime Adaptivity in Multicore Architectures Design Space Exploration and Application Autotuning for Runtime Adaptivity in Multicore Architectures Cristina Silvano Politecnico di Milano cristina.silvano@polimi.it Outline Research challenges in multicore

More information

Network Interface Architecture and Prototyping for Chip and Cluster Multiprocessors

Network Interface Architecture and Prototyping for Chip and Cluster Multiprocessors University of Crete School of Sciences & Engineering Computer Science Department Master Thesis by Michael Papamichael Network Interface Architecture and Prototyping for Chip and Cluster Multiprocessors

More information

Computer Architecture: Multi-Core Processors: Why? Prof. Onur Mutlu Carnegie Mellon University

Computer Architecture: Multi-Core Processors: Why? Prof. Onur Mutlu Carnegie Mellon University Computer Architecture: Multi-Core Processors: Why? Prof. Onur Mutlu Carnegie Mellon University Moore s Law Moore, Cramming more components onto integrated circuits, Electronics, 1965. 2 3 Multi-Core Idea:

More information

Huge market -- essentially all high performance databases work this way

Huge market -- essentially all high performance databases work this way 11/5/2017 Lecture 16 -- Parallel & Distributed Databases Parallel/distributed databases: goal provide exactly the same API (SQL) and abstractions (relational tables), but partition data across a bunch

More information

Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP

Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP Presenter: Course: EEC 289Q: Reconfigurable Computing Course Instructor: Professor Soheil Ghiasi Outline Overview of M.I.T. Raw processor

More information

Hardware Design Environments. Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University

Hardware Design Environments. Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University Hardware Design Environments Dr. Mahdi Abbasi Computer Engineering Department Bu-Ali Sina University Outline Welcome to COE 405 Digital System Design Design Domains and Levels of Abstractions Synthesis

More information

Design Space Exploration Using Parameterized Cores

Design Space Exploration Using Parameterized Cores RESEARCH CENTRE FOR INTEGRATED MICROSYSTEMS UNIVERSITY OF WINDSOR Design Space Exploration Using Parameterized Cores Ian D. L. Anderson M.A.Sc. Candidate March 31, 2006 Supervisor: Dr. M. Khalid 1 OUTLINE

More information

Proseminar. (with Eclipse) Jun.-Prof. Dr.-Ing. Steffen Becker. Model-Driven Software Engineering. Software Engineering Group

Proseminar. (with Eclipse) Jun.-Prof. Dr.-Ing. Steffen Becker. Model-Driven Software Engineering. Software Engineering Group Proseminar Model-Driven Software Engineering (with Eclipse) Jun.-Prof. Dr.-Ing. Steffen Becker Model-Driven Software Engineering Software Engineering Group 1 Outline Basic Requirements Preliminary Dates

More information

Real-Time Operating Systems Design and Implementation. LS 12, TU Dortmund

Real-Time Operating Systems Design and Implementation. LS 12, TU Dortmund Real-Time Operating Systems Design and Implementation (slides are based on Prof. Dr. Jian-Jia Chen) Anas Toma, Jian-Jia Chen LS 12, TU Dortmund October 19, 2017 Anas Toma, Jian-Jia Chen (LS 12, TU Dortmund)

More information

Productive Performance on the Cray XK System Using OpenACC Compilers and Tools

Productive Performance on the Cray XK System Using OpenACC Compilers and Tools Productive Performance on the Cray XK System Using OpenACC Compilers and Tools Luiz DeRose Sr. Principal Engineer Programming Environments Director Cray Inc. 1 The New Generation of Supercomputers Hybrid

More information

GPUs and GPGPUs. Greg Blanton John T. Lubia

GPUs and GPGPUs. Greg Blanton John T. Lubia GPUs and GPGPUs Greg Blanton John T. Lubia PROCESSOR ARCHITECTURAL ROADMAP Design CPU Optimized for sequential performance ILP increasingly difficult to extract from instruction stream Control hardware

More information

The CompSOC Design Flow for Virtual Execution Platforms

The CompSOC Design Flow for Virtual Execution Platforms NEST COBRA CA104 The CompSOC Design Flow for Virtual Execution Platforms FPGAWorld 10-09-2013 Sven Goossens*, Benny Akesson*, Martijn Koedam*, Ashkan Beyranvand Nejad, Andrew Nelson, Kees Goossens* * Introduction

More information

Distributed Operation Layer

Distributed Operation Layer Distributed Operation Layer Iuliana Bacivarov, Wolfgang Haid, Kai Huang, and Lothar Thiele ETH Zürich Outline Distributed Operation Layer Overview Specification Application Architecture Mapping Design

More information

Moneta: A High-performance Storage Array Architecture for Nextgeneration, Micro 2010

Moneta: A High-performance Storage Array Architecture for Nextgeneration, Micro 2010 Moneta: A High-performance Storage Array Architecture for Nextgeneration, Non-volatile Memories Micro 2010 NVM-based SSD NVMs are replacing spinning-disks Performance of disks has lagged NAND flash showed

More information

Continuum-Microscopic Models

Continuum-Microscopic Models Scientific Computing and Numerical Analysis Seminar October 1, 2010 Outline Heterogeneous Multiscale Method Adaptive Mesh ad Algorithm Refinement Equation-Free Method Incorporates two scales (length, time

More information

ReconOS: Multithreaded Programming and Execution Models for Reconfigurable Hardware

ReconOS: Multithreaded Programming and Execution Models for Reconfigurable Hardware ReconOS: Multithreaded Programming and Execution Models for Reconfigurable Hardware Enno Lübbers and Marco Platzner Computer Engineering Group University of Paderborn {enno.luebbers, platzner}@upb.de Outline

More information

Modeling and Simulation of System-on. Platorms. Politecnico di Milano. Donatella Sciuto. Piazza Leonardo da Vinci 32, 20131, Milano

Modeling and Simulation of System-on. Platorms. Politecnico di Milano. Donatella Sciuto. Piazza Leonardo da Vinci 32, 20131, Milano Modeling and Simulation of System-on on-chip Platorms Donatella Sciuto 10/01/2007 Politecnico di Milano Dipartimento di Elettronica e Informazione Piazza Leonardo da Vinci 32, 20131, Milano Key SoC Market

More information

Lars Schor, and Lothar Thiele ETH Zurich, Switzerland

Lars Schor, and Lothar Thiele ETH Zurich, Switzerland Iuliana Bacivarov, Wolfgang Haid, Kai Huang, Lars Schor, and Lothar Thiele ETH Zurich, Switzerland Efficient i Execution of KPN on MPSoC Efficiency regarding speed-up small memory footprint portability

More information

IQ for DNA. Interactive Query for Dynamic Network Analytics. Haoyu Song. HUAWEI TECHNOLOGIES Co., Ltd.

IQ for DNA. Interactive Query for Dynamic Network Analytics. Haoyu Song.   HUAWEI TECHNOLOGIES Co., Ltd. IQ for DNA Interactive Query for Dynamic Network Analytics Haoyu Song www.huawei.com Motivation Service Provider s pain point Lack of real-time and full visibility of networks, so the network monitoring

More information

Industrial Multicore Software with EMB²

Industrial Multicore Software with EMB² Siemens Industrial Multicore Software with EMB² Dr. Tobias Schüle, Dr. Christian Kern Introduction In 2022, multicore will be everywhere. (IEEE CS) Parallel Patterns Library Apple s Grand Central Dispatch

More information

NetSpeed ORION: A New Approach to Design On-chip Interconnects. August 26 th, 2013

NetSpeed ORION: A New Approach to Design On-chip Interconnects. August 26 th, 2013 NetSpeed ORION: A New Approach to Design On-chip Interconnects August 26 th, 2013 INTERCONNECTS BECOMING INCREASINGLY IMPORTANT Growing number of IP cores Average SoCs today have 100+ IPs Mixing and matching

More information

HARDWARE IN REAL-TIME SYSTEMS HERMANN HÄRTIG, WITH MARCUS VÖLP

HARDWARE IN REAL-TIME SYSTEMS HERMANN HÄRTIG, WITH MARCUS VÖLP Faculty of Computer Science Institute of Systems Architecture, Operating Systems Group HARDWARE IN REAL-TIME SYSTEMS HERMANN HÄRTIG, WITH MARCUS VÖLP A SIMPLE PROBLEM event PC how to create a precise timestamp

More information

Peta-Scale Simulations with the HPC Software Framework walberla:

Peta-Scale Simulations with the HPC Software Framework walberla: Peta-Scale Simulations with the HPC Software Framework walberla: Massively Parallel AMR for the Lattice Boltzmann Method SIAM PP 2016, Paris April 15, 2016 Florian Schornbaum, Christian Godenschwager,

More information

Computing on GPUs. Prof. Dr. Uli Göhner. DYNAmore GmbH. Stuttgart, Germany

Computing on GPUs. Prof. Dr. Uli Göhner. DYNAmore GmbH. Stuttgart, Germany Computing on GPUs Prof. Dr. Uli Göhner DYNAmore GmbH Stuttgart, Germany Summary: The increasing power of GPUs has led to the intent to transfer computing load from CPUs to GPUs. A first example has been

More information

systems & research project

systems & research project class 4 systems & research project prof. HTTP://DASLAB.SEAS.HARVARD.EDU/CLASSES/CS265/ index index knows order about the data data filtering data: point/range queries index data A B C sorted A B C initial

More information

HARDWARE IN REAL-TIME SYSTEMS HERMANN HÄRTIG, MARCUS VÖLP

HARDWARE IN REAL-TIME SYSTEMS HERMANN HÄRTIG, MARCUS VÖLP Faculty of Computer Science Institute of Systems Architecture, Operating Systems Group HARDWARE IN REAL-TIME SYSTEMS HERMANN HÄRTIG, MARCUS VÖLP A SIMPLE PROBLEM event PC How to create a precise timestamp

More information

Service Oriented Performance Analysis

Service Oriented Performance Analysis Service Oriented Performance Analysis Da Qi Ren and Masood Mortazavi US R&D Center Santa Clara, CA, USA www.huawei.com Performance Model for Service in Data Center and Cloud 1. Service Oriented (end to

More information