An In Depth Look at VOLK

Size: px
Start display at page:

Download "An In Depth Look at VOLK"

Transcription

1 An In Depth Look at VOLK The Vector-Optimize Library of Kernels Nathan West U.S. Naval Research Laboratory 26 August 2015 (U) 26 August / 19

2 A brief look at VOLK organization VOLK is a sub-project of GNU Raio Sub-maintainer: Nathan West (that s me!) Coe c Free Software Founation license uner GPLv3 Submit issues an pull requests through (U) 26 August / 19

3 What s a VOLK? VOLK: Vector-Optimize Library of Kernels Collaboration, compatibility, an spee (in that orer) A library of functions optimize for various CPU architectures usually this means SIMD An API (an ispatcher) for calling the fastest coe for your machine A buil system for coe targetting specific instruction sets/architectures (U) 26 August / 19

4 Outline We will cover the priorities of VOLK in reverse orer: 1 Spee 2 Compatibility 3 Collaboration Goals: Become fluent in VOLK Know how to a a protokernel Know how to a a kernel Know how to make VOLK work for your project Know where to look to support new harware (U) 26 August / 19

5 Motivation: a NEON register view of 32fc 32f ot pro 32fc (U) 26 August / 19

6 Motivation: Disassemble C Complex Dot Prouct excerpt Coe can be inefficient even when harware features are use 1. l o o p 1 mov r11, r copy new a r e s s avec 3 mov r10, r copy new a r e s s bvec v l {16, 18, 20, 22 }, [ r11 ]! 5 a r9, r9, number += 1 cmp r9, number < h a l f p o i n t s? 7 a r8, r8, c a l c u l a t e n e x t avec a r e s s a r4, r4, c a l c u l a t e n e x t bvec a r e s s 9 v l {24, 26, 28, 30 }, [ r10 ]! v l {17, 19, 21, 23 }, [ r11 ] 11 v l {25, 27, 29, 31 }, [ r10 ] 13 <s n i p p e NEON ot p r o u c t ops> 15 bcc. l o o p r e p e a t h a l f p o i n t s t i m e s (U) 26 August / 19

7 Problem: Architecture-optimize coe is not portable Historical solution is #ifef fences Meta-compilers exist (ORC, Spiral, PEACHPy) This brings us to VOLK s secon concern: compatibility (U) 26 August / 19

8 Compatibility All problems in computer science can be solve by another layer of inirection. Davi Wheeler VOLK abstractions: arch (gen/archs.xml) machines (gen/machines.xml) (U) 26 August / 19

9 Compatibility: the arch abstraction Architectures are harware features, escribe in an xml file with Arch name Compiler flags Alignment restrictions check name (efine in tmpl/volk cpu.tmpl.c) (U) 26 August / 19

10 Compatibility: the machine abstraction Machines are a collection of architectures that are foun together in real harware. Example ( neon ): generic neon softfp OR harfp ORC is optional (U) 26 August / 19

11 Compatibility: kernels an protokernels Kernel: the boy of a vectorize loop (kernels/volk/*.h) Proto-kernel: a specific implementation of a kernel (functions in kernels/volk/*.h or.s files in kernels/volk/asm/<arch>/) input signature output signature proto-kernel tag volk_32fc_magnitue_32f_u_neon_fancy_sweet volk namespace kernel alignment API: from your application call the ispatcher volk 32fc magnitue 32f(...) (U) 26 August / 19

12 (U) 26 August / 19

13 Compatibility: the profiler an ispatcher volk profile will run an time all proto-kernels for your machine Fastest proto-kernels per kernel are store in $HOME/.volk/volk config At run-time ispatcher loas volk config an runs what volk profile reporte as fastest (lib/volk rank archs.c) Dispatcher hanles selecting alignment an vali proto-kernels (U) 26 August / 19

14 Compatibility: putting it all together sp engineer buil system user kernels/volk/volk_32f_x2_ot_pro_32f.h #ifef LV_HAVE_NEON volk_32f_x2_ot_pro_32f_neon(...)... NEON machine #enif volk_32f_x2_ot_pro_32f(...) #ifef LV_HAVE_AVX volk_32f_x2_ot_pro_32f_avx(...) neon arch... #enif volk_32fc_x2_multiply_32fc(...) avx arch kernels/volk/volk_32fc_x2_multiply_32fc.h #ifef LV_HAVE_NEON volk_32fc_x2_multiply_32fc_neon(...) AVX machine libvolk.so gr-signals sse arch... #enif volk_32f_x2_ot_pro_32f(...) #ifef LV_HAVE_AVX volk_32fc_x2_multiply_32fc_avx(...)... volk_32fc_x2_multiply_32fc(...) #enif #ifef LV_HAVE_SSE volk_32fc_x2_multiply_32fc_sse(...)... #enif (U) 26 August / 19

15 A final note on compatibility: QA an Puppets VOLK QA reas function signatures to generate buffers an ranom input ata Arch-specific proto-kernels are compare to generic for correctness If VOLK QA cannot generate buffers for your signature you nee a puppet Puppets wrap your kernel in a VOLK QA-frienly form. See the rotatorpuppet for examples (U) 26 August / 19

16 Collaboration VOLK is a repository for efficient coe that people actually wrote (not the coe we wish they wrote) (U) 26 August / 19

17 Collaboration: VOLK for you volk motool: like gr motool, but for VOLK experiment with new archs A kernels (new heaer files) Hopefully contribute upstream (U) 26 August / 19

18 Parting Thoughts VOLK is a necessity cause by vectorization being very ifficult for compilers DSP esigners write coe specific to harware an wrap it insie VOLK Users call VOLK kernels when they are available After profiling your coe, create a VOLK kernel where it is neee Contribute VOLK kernels upstream for collaboration (U) 26 August / 19

19 Working Group Thursay at 3:00 PM Missing kernels/proto-kernels Arch-specific helper sub-libraries Buil issues General feeback (U) 26 August / 19

Unit 15. Building Wide Muxes. Building Wide Muxes. Common Hardware Components WIDE MUXES

Unit 15. Building Wide Muxes. Building Wide Muxes. Common Hardware Components WIDE MUXES 5. 5.2 Unit 5 Common Harware Components WIE MUXE 5.3 5.4 Builing Wie Muxes Builing Wie Muxes o far muxesonly have single bit inputs I is only -bit I is only -bit What if we still want to select between

More information

Computer Organization

Computer Organization Computer Organization Douglas Comer Computer Science Department Purue University 250 N. University Street West Lafayette, IN 47907-2066 http://www.cs.purue.eu/people/comer Copyright 2006. All rights reserve.

More information

Chapter 9 Memory Management

Chapter 9 Memory Management Contents 1. Introuction 2. Computer-System Structures 3. Operating-System Structures 4. Processes 5. Threas 6. CPU Scheuling 7. Process Synchronization 8. Dealocks 9. Memory Management 10.Virtual Memory

More information

Overview. Operating Systems I. Simple Memory Management. Simple Memory Management. Multiprocessing w/fixed Partitions.

Overview. Operating Systems I. Simple Memory Management. Simple Memory Management. Multiprocessing w/fixed Partitions. Overview Operating Systems I Management Provie Services processes files Manage Devices processor memory isk Simple Management One process in memory, using it all each program nees I/O rivers until 96 I/O

More information

You Can Do That. Unit 16. Motivation. Computer Organization. Computer Organization Design of a Simple Processor. Now that you have some understanding

You Can Do That. Unit 16. Motivation. Computer Organization. Computer Organization Design of a Simple Processor. Now that you have some understanding .. ou Can Do That Unit Computer Organization Design of a imple Clou & Distribute Computing (CyberPhysical, bases, Mining,etc.) Applications (AI, Robotics, Graphics, Mobile) ystems & Networking (Embee ystems,

More information

PART 5. Process Coordination And Synchronization

PART 5. Process Coordination And Synchronization PART 5 Process Coorination An Synchronization CS 503 - PART 5 1 2010 Location Of Process Coorination In The Xinu Hierarchy CS 503 - PART 5 2 2010 Coorination Of Processes Necessary in a concurrent system

More information

PART 2. Organization Of An Operating System

PART 2. Organization Of An Operating System PART 2 Organization Of An Operating System CS 503 - PART 2 1 2010 Services An OS Supplies Support for concurrent execution Facilities for process synchronization Inter-process communication mechanisms

More information

Recitation Caches and Blocking. 4 March 2019

Recitation Caches and Blocking. 4 March 2019 15-213 Recitation Caches an Blocking 4 March 2019 Agena Reminers Revisiting Cache Lab Caching Review Blocking to reuce cache misses Cache alignment Reminers Due Dates Cache Lab (Thursay 3/7) Miterm Exam

More information

MODULE VII. Emerging Technologies

MODULE VII. Emerging Technologies MODULE VII Emerging Technologies Computer Networks an Internets -- Moule 7 1 Spring, 2014 Copyright 2014. All rights reserve. Topics Software Define Networking The Internet Of Things Other trens in networking

More information

6.823 Computer System Architecture. Problem Set #3 Spring 2002

6.823 Computer System Architecture. Problem Set #3 Spring 2002 6.823 Computer System Architecture Problem Set #3 Spring 2002 Stuents are strongly encourage to collaborate in groups of up to three people. A group shoul han in only one copy of the solution to the problem

More information

Welcome to Wayland. Martin Gra ßlin BlueSystems Akademy 2015

Welcome to Wayland. Martin Gra ßlin BlueSystems Akademy 2015 Welcome to Waylan Martin Gra ßlin mgraesslin@ke.org BlueSystems Akaemy 2015 26.07.2015 Agena 1 Architecture 2 Evolution of KWin 3 The kwin Project 4 What s next? Welcome to Waylan Martin Gra ßlin Agena

More information

Computer Organization

Computer Organization Computer Organization Douglas Comer Computer Science Department Purue University 250 N. University Street West Lafayette, IN 47907-2066 http://www.cs.purue.eu/people/comer Copyright 2006. All rights reserve.

More information

Loop Scheduling and Partitions for Hiding Memory Latencies

Loop Scheduling and Partitions for Hiding Memory Latencies Loop Scheuling an Partitions for Hiing Memory Latencies Fei Chen Ewin Hsing-Mean Sha Dept. of Computer Science an Engineering University of Notre Dame Notre Dame, IN 46556 Email: fchen,esha @cse.n.eu Tel:

More information

Message Transport With The User Datagram Protocol

Message Transport With The User Datagram Protocol Message Transport With The User Datagram Protocol User Datagram Protocol (UDP) Use During startup For VoIP an some vieo applications Accounts for less than 10% of Internet traffic Blocke by some ISPs Computer

More information

Multilevel Paging. Multilevel Paging Translation. Paging Hardware With TLB 11/13/2014. CS341: Operating System

Multilevel Paging. Multilevel Paging Translation. Paging Hardware With TLB 11/13/2014. CS341: Operating System CS341: Operating System Lect31: 21 st Oct 2014 Dr A Sahu Dept o Comp Sc & Engg Inian Institute o Technology Guwahati ain Contiguous Allocation, Segmentation, Paging Page Table an TLB Paging : Larger Page

More information

CS 106 Winter 2016 Craig S. Kaplan. Module 01 Processing Recap. Topics

CS 106 Winter 2016 Craig S. Kaplan. Module 01 Processing Recap. Topics CS 106 Winter 2016 Craig S. Kaplan Moule 01 Processing Recap Topics The basic parts of speech in a Processing program Scope Review of syntax for classes an objects Reaings Your CS 105 notes Learning Processing,

More information

Workspace as a Service: an Online Working Environment for Private Cloud

Workspace as a Service: an Online Working Environment for Private Cloud 2017 IEEE Symposium on Service-Oriente System Engineering Workspace as a Service: an Online Working Environment for Private Clou Bo An 1, Xuong Shan 1, Zhicheng Cui 1, Chun Cao 2, Donggang Cao 1 1 Institute

More information

Design Principles for Practical Self-Routing Nonblocking Switching Networks with O(N log N) Bit-Complexity

Design Principles for Practical Self-Routing Nonblocking Switching Networks with O(N log N) Bit-Complexity IEEE TRANSACTIONS ON COMPUTERS, VOL. 46, NO. 10, OCTOBER 1997 1 Design Principles for Practical Self-Routing Nonblocking Switching Networks with O(N log N) Bit-Complexity Te H. Szymanski, Member, IEEE

More information

QEmu TCG Enhancements for Speeding-up the Emulation of SIMD instructions

QEmu TCG Enhancements for Speeding-up the Emulation of SIMD instructions QEmu TCG Enhancements for Speeding-up the Emulation of SIMD instructions Luc Michel, Nicolas Fournel and Frédéric Pétrot TIMA Laboratory System Level Synthesis Group DATE 11 W8 18/03/2011 Outline 1 About

More information

Finite Automata Implementations Considering CPU Cache J. Holub

Finite Automata Implementations Considering CPU Cache J. Holub Finite Automata Implementations Consiering CPU Cache J. Holub The finite automata are mathematical moels for finite state systems. More general finite automaton is the noneterministic finite automaton

More information

Study of Network Optimization Method Based on ACL

Study of Network Optimization Method Based on ACL Available online at www.scienceirect.com Proceia Engineering 5 (20) 3959 3963 Avance in Control Engineering an Information Science Stuy of Network Optimization Metho Base on ACL Liu Zhian * Department

More information

Problem Paper Atoms Tree. atoms.pas. atoms.cpp. atoms.c. atoms.java. Time limit per test 1 second 1 second 2 seconds. Number of tests

Problem Paper Atoms Tree. atoms.pas. atoms.cpp. atoms.c. atoms.java. Time limit per test 1 second 1 second 2 seconds. Number of tests ! " # %$ & Overview Problem Paper Atoms Tree Shen Tian Harry Wiggins Carl Hultquist Program name paper.exe atoms.exe tree.exe Source name paper.pas atoms.pas tree.pas paper.cpp atoms.cpp tree.cpp paper.c

More information

Politehnica University of Timisoara Mobile Computing, Sensors Network and Embedded Systems Laboratory. Testing Techniques

Politehnica University of Timisoara Mobile Computing, Sensors Network and Embedded Systems Laboratory. Testing Techniques Politehnica University of Timisoara Mobile Computing, Sensors Network an Embee Systems Laboratory ing Techniques What is testing? ing is the process of emonstrating that errors are not present. The purpose

More information

The Unix File System. Ken Wong Washington University. Bsd Unix Kernel I/O Structure. Monolithic OS Device Drivers (1) File Systems (CSE 422S)

The Unix File System. Ken Wong Washington University. Bsd Unix Kernel I/O Structure. Monolithic OS Device Drivers (1) File Systems (CSE 422S) File Systems (CSE 422S) Ken Wong Washington University kenw@wustl.eu www.arl.wustl.eu/~kenw usr bin ls 2 -Ken Wong, Dec 2007 The Unix File System vmunix tmp executables kenw mnt mount points arl other

More information

EFFICIENT ON-LINE TESTING METHOD FOR A FLOATING-POINT ADDER

EFFICIENT ON-LINE TESTING METHOD FOR A FLOATING-POINT ADDER FFICINT ON-LIN TSTING MTHOD FOR A FLOATING-POINT ADDR A. Droz, M. Lobachev Department of Computer Systems, Oessa State Polytechnic University, Oessa, Ukraine Droz@ukr.net, Lobachev@ukr.net Abstract In

More information

Improving Performance of Sparse Matrix-Vector Multiplication

Improving Performance of Sparse Matrix-Vector Multiplication Improving Performance of Sparse Matrix-Vector Multiplication Ali Pınar Michael T. Heath Department of Computer Science an Center of Simulation of Avance Rockets University of Illinois at Urbana-Champaign

More information

Automation of Bird Front Half Deboning Procedure: Design and Analysis

Automation of Bird Front Half Deboning Procedure: Design and Analysis Automation of Bir Front Half Deboning Proceure: Design an Analysis Debao Zhou, Jonathan Holmes, Wiley Holcombe, Kok-Meng Lee * an Gary McMurray Foo Processing echnology Division, AAS Laboratory, Georgia

More information

EFFICIENT STEREO MATCHING BASED ON A NEW CONFIDENCE METRIC. Won-Hee Lee, Yumi Kim, and Jong Beom Ra

EFFICIENT STEREO MATCHING BASED ON A NEW CONFIDENCE METRIC. Won-Hee Lee, Yumi Kim, and Jong Beom Ra th European Signal Processing Conference (EUSIPCO ) Bucharest, omania, August 7-3, EFFICIENT STEEO MATCHING BASED ON A NEW CONFIDENCE METIC Won-Hee Lee, Yumi Kim, an Jong Beom a Department of Electrical

More information

EXPLANATION OF THE ALGORITHMS FOR DISPLAYING 3D FIGURES ON THE COMPUTER SCREEN

EXPLANATION OF THE ALGORITHMS FOR DISPLAYING 3D FIGURES ON THE COMPUTER SCREEN Original Research Article: full paper [13] Galárraga, L., Heitz, G., Murphy, K., Suchanek, F. M. (2014). Canonicalizing Open Knowlege Bases. Proceeings of the 23r ACM International Conference on Conference

More information

Protocol Configuration

Protocol Configuration Configuration Protocol Configuration Many items must be set before protocols can be use IP aress of each network interface Aress mask for each network Initial values in the forwaring table Process is known

More information

CSCE Introduction to Computer Systems Spring 2019

CSCE Introduction to Computer Systems Spring 2019 CSCE 313-200 Introduction to Computer Systems Spring 2019 Threads Dmitri Loguinov Texas A&M University January 29, 2019 1 Updates Quiz on Thursday System Programming Tutorial (pay attention to exercises)

More information

Waleed K. Al-Assadi. Anura P. Jayasumana. Yashwant K. Malaiya y. February Colorado State University

Waleed K. Al-Assadi. Anura P. Jayasumana. Yashwant K. Malaiya y. February Colorado State University Dierential I DDQ Testable Static RAM Architecture Walee K. Al-Assai Anura P. Jayasumana Yashwant K. Malaiya y Technical Report CS-96-102 February 1996 Department of Electrical Engineering/ y Department

More information

Architecture Design of Mobile Access Coordinated Wireless Sensor Networks

Architecture Design of Mobile Access Coordinated Wireless Sensor Networks Architecture Design of Mobile Access Coorinate Wireless Sensor Networks Mai Abelhakim 1 Leonar E. Lightfoot Jian Ren 1 Tongtong Li 1 1 Department of Electrical & Computer Engineering, Michigan State University,

More information

National Aeronautics and Space and Administration Space Administration. CFE CMake Build System

National Aeronautics and Space and Administration Space Administration. CFE CMake Build System National Aeronautics and Space and Administration Space Administration CFE CMake Build System 1 1 Simplify integrating apps together CFS official Recycled from other projects Custom LC... SC HK A C B Z

More information

3M Promotional Products Catalogue 2009/2010. Think Creatively. Get Noticed

3M Promotional Products Catalogue 2009/2010. Think Creatively. Get Noticed 3M Promotional Proucts Catalogue 2009/2010 Think Creatively Get Notice Increase Bran Buil Traffic an Increase Sales Promote an Eve There s a note for that! ThankanCustomers Sales Call Create Goowill Leave

More information

Intensive Hypercube Communication: Prearranged Communication in Link-Bound Machines 1 2

Intensive Hypercube Communication: Prearranged Communication in Link-Bound Machines 1 2 This paper appears in J. of Parallel an Distribute Computing 10 (1990), pp. 167 181. Intensive Hypercube Communication: Prearrange Communication in Link-Boun Machines 1 2 Quentin F. Stout an Bruce Wagar

More information

SE-330 SERIES (NEW REVISION) IEC INTERFACE

SE-330 SERIES (NEW REVISION) IEC INTERFACE Tel: +1-800-832-3873 E-mail: techline@littelfuse.com www.littelfuse.com/se-330 SE-330 SERIES (NEW REVISION) IEC 61850 INTERFACE Revision 0-C-121117 Copyright 2018 Littelfuse Startco Lt. All rights reserve.

More information

k-nn Graph Construction: a Generic Online Approach

k-nn Graph Construction: a Generic Online Approach k-nn Graph Construction: a Generic Online Approach Wan-Lei Zhao arxiv:80.00v [cs.ir] Sep 08 Abstract Nearest neighbor search an k-nearest neighbor graph construction are two funamental issues arise from

More information

Politecnico di Torino. Porto Institutional Repository

Politecnico di Torino. Porto Institutional Repository Politecnico i Torino Porto Institutional Repository [Proceeing] Automatic March tests generation for multi-port SRAMs Original Citation: Benso A., Bosio A., i Carlo S., i Natale G., Prinetto P. (26). Automatic

More information

Impact of FTP Application file size and TCP Variants on MANET Protocols Performance

Impact of FTP Application file size and TCP Variants on MANET Protocols Performance International Journal of Moern Communication Technologies & Research (IJMCTR) Impact of FTP Application file size an TCP Variants on MANET Protocols Performance Abelmuti Ahme Abbasher Ali, Dr.Amin Babkir

More information

Performance Evaluation of a High Precision Software-based Timestamping Solution for Network Monitoring

Performance Evaluation of a High Precision Software-based Timestamping Solution for Network Monitoring 181 Performance Evaluation of a High Precision Software-base Timestamping Solution for Network Monitoring Peter Orosz, Tamas Skopko Faculty of Informatics University of Debrecen Debrecen, Hungary e-mail:

More information

Handling missing values in kernel methods with application to microbiology data

Handling missing values in kernel methods with application to microbiology data an Machine Learning. Bruges (Belgium), 24-26 April 2013, i6oc.com publ., ISBN 978-2-87419-081-0. Available from http://www.i6oc.com/en/livre/?gcoi=28001100131010. Hanling missing values in kernel methos

More information

SIMD Exploitation in (JIT) Compilers

SIMD Exploitation in (JIT) Compilers SIMD Exploitation in (JIT) Compilers Hiroshi Inoue, IBM Research - Tokyo 1 What s SIMD? Single Instruction Multiple Data Same operations applied for multiple elements in a vector register input 1 A0 input

More information

Principles of B-trees

Principles of B-trees CSE465, Fall 2009 February 25 1 Principles of B-trees Noes an binary search Anoe u has size size(u), keys k 1,..., k size(u) 1 chilren c 1,...,c size(u). Binary search property: for i = 1,..., size(u)

More information

Solution Representation for Job Shop Scheduling Problems in Ant Colony Optimisation

Solution Representation for Job Shop Scheduling Problems in Ant Colony Optimisation Solution Representation for Job Shop Scheuling Problems in Ant Colony Optimisation James Montgomery, Carole Faya 2, an Sana Petrovic 2 Faculty of Information & Communication Technologies, Swinburne University

More information

Registers. Registers

Registers. Registers All computers have some registers visible at the ISA level. They are there to control execution of the program hold temporary results visible at the microarchitecture level, such as the Top Of Stack (TOS)

More information

Porting BLIS to new architectures Early experiences

Porting BLIS to new architectures Early experiences 1st BLIS Retreat. Austin (Texas) Early experiences Universidad Complutense de Madrid (Spain) September 5, 2013 BLIS design principles BLIS = Programmability + Performance + Portability Share experiences

More information

CMSC 430 Introduction to Compilers. Spring Register Allocation

CMSC 430 Introduction to Compilers. Spring Register Allocation CMSC 430 Introuction to Compilers Spring 2016 Register Allocation Introuction Change coe that uses an unoune set of virtual registers to coe that uses a finite set of actual regs For ytecoe targets, can

More information

ELECTRONIC SCALE WB-380. Instruction manual

ELECTRONIC SCALE WB-380. Instruction manual ELECTRONIC SCALE WB-380 Instruction manual Temperature Range for Use : 5 C 35 C Relative Humiity : 30% 80% (without conensation) Temperature Range of Environment :

More information

Seamless Dynamic Runtime Reconfiguration in a Software-Defined Radio

Seamless Dynamic Runtime Reconfiguration in a Software-Defined Radio Seamless Dynamic Runtime Reconfiguration in a Software-Defined Radio Michael L Dickens, J Nicholas Laneman, and Brian P Dunn WINNF 11 Europe 2011-Jun-22/24 Overview! Background on relevant SDR! Problem

More information

A Formal Model and Efficient Traversal Algorithm for Generating Testbenches for Verification of IEEE Standard Floating Point Division

A Formal Model and Efficient Traversal Algorithm for Generating Testbenches for Verification of IEEE Standard Floating Point Division A Formal Moel an Efficient Traversal Algorithm for Generating Testbenches for Verification of IEEE Stanar Floating Point Division Davi W. Matula, Lee D. McFearin Department of Computer Science an Engineering

More information

API RI. Application Programming Interface Reference Implementation. Policies and Procedures Discussion

API RI. Application Programming Interface Reference Implementation. Policies and Procedures Discussion API Working Group Meeting, Harris County, TX March 22-23, 2016 Policies and Procedures Discussion Developing a Mission Statement What do we do? How do we do it? Whom do we do it for? What value are we

More information

Offloading Cellular Traffic through Opportunistic Communications: Analysis and Optimization

Offloading Cellular Traffic through Opportunistic Communications: Analysis and Optimization 1 Offloaing Cellular Traffic through Opportunistic Communications: Analysis an Optimization Vincenzo Sciancalepore, Domenico Giustiniano, Albert Banchs, Anreea Picu arxiv:1405.3548v1 [cs.ni] 14 May 24

More information

Baring it all to Software: The Raw Machine

Baring it all to Software: The Raw Machine Baring it all to Software: The Raw Machine Elliot Waingol, Michael Taylor, Vivek Sarkar, Walter Lee, Victor Lee, Jang Kim, Matthew Frank, Peter Finch, Srikrishna Devabhaktuni, Rajeev Barua, Jonathan Babb,

More information

Coupling the User Interfaces of a Multiuser Program

Coupling the User Interfaces of a Multiuser Program Coupling the User Interfaces of a Multiuser Program PRASUN DEWAN University of North Carolina at Chapel Hill RAJIV CHOUDHARY Intel Corporation We have evelope a new moel for coupling the user-interfaces

More information

Just-In-Time Software Pipelining

Just-In-Time Software Pipelining Just-In-Time Software Pipelining Hongbo Rong Hyunchul Park Youfeng Wu Cheng Wang Programming Systems Lab Intel Labs, Santa Clara What is software pipelining? A loop optimization exposing instruction-level

More information

SIT timing pulleys - EAGLE

SIT timing pulleys - EAGLE IT timing pulleys - EAGLE The IT EAGLE pulleys, manufacture with innovative high-tech equipments, have been specifically esigne to fit with the ILENT NC belt. This combination, thanks to the continuous

More information

Design of Controller for Crawling to Sitting Behavior of Infants

Design of Controller for Crawling to Sitting Behavior of Infants Design of Controller for Crawling to Sitting Behavior of Infants A Report submitte for the Semester Project To be accepte on: 29 June 2007 by Neha Priyaarshini Garg Supervisors: Luovic Righetti Prof. Auke

More information

Data Mining: Clustering

Data Mining: Clustering Bi-Clustering COMP 790-90 Seminar Spring 011 Data Mining: Clustering k t 1 K-means clustering minimizes Where ist ( x, c i t i c t ) ist ( x m j 1 ( x ij i, c c t ) tj ) Clustering by Pattern Similarity

More information

MR DAMPER OPTIMAL PLACEMENT FOR SEMI-ACTIVE CONTROL OF BUILDINGS USING AN EFFICIENT MULTI- OBJECTIVE BINARY GENETIC ALGORITHM

MR DAMPER OPTIMAL PLACEMENT FOR SEMI-ACTIVE CONTROL OF BUILDINGS USING AN EFFICIENT MULTI- OBJECTIVE BINARY GENETIC ALGORITHM 24th International Symposium on on Automation & Robotics in in Construction (ISARC 2007) Construction Automation Group I.I.T. Maras MR DAMPER OPTIMAL PLACEMENT FOR SEMI-ACTIVE CONTROL OF BUILDINGS USING

More information

Overview : Computer Networking. IEEE MAC Protocol: CSMA/CA Internet mobility TCP over noisy links

Overview : Computer Networking. IEEE MAC Protocol: CSMA/CA Internet mobility TCP over noisy links Overview 15-441 15-441: Computer Networking 15-641 Lecture 24: Wireless Eric Anerson Fall 2014 www.cs.cmu.eu/~prs/15-441-f14 Internet mobility TCP over noisy links Link layer challenges an WiFi Cellular

More information

SIMD. Utilization of a SIMD unit in the OS Kernel. Shogo Saito 1 and Shuichi Oikawa 2 2. SIMD. SIMD (Single SIMD SIMD SIMD SIMD

SIMD. Utilization of a SIMD unit in the OS Kernel. Shogo Saito 1 and Shuichi Oikawa 2 2. SIMD. SIMD (Single SIMD SIMD SIMD SIMD OS SIMD 1 2 SIMD (Single Instruction Multiple Data) SIMD OS (Operating System) SIMD SIMD OS Utilization of a SIMD unit in the OS Kernel Shogo Saito 1 and Shuichi Oikawa 2 Nowadays, it is very common that

More information

Comparison of Methods for Increasing the Performance of a DUA Computation

Comparison of Methods for Increasing the Performance of a DUA Computation Comparison of Methos for Increasing the Performance of a DUA Computation Michael Behrisch, Daniel Krajzewicz, Peter Wagner an Yun-Pang Wang Institute of Transportation Systems, German Aerospace Center,

More information

Classifying Facial Expression with Radial Basis Function Networks, using Gradient Descent and K-means

Classifying Facial Expression with Radial Basis Function Networks, using Gradient Descent and K-means Classifying Facial Expression with Raial Basis Function Networks, using Graient Descent an K-means Neil Allrin Department of Computer Science University of California, San Diego La Jolla, CA 9237 nallrin@cs.ucs.eu

More information

Github/Git Primer. Tyler Hague

Github/Git Primer. Tyler Hague Github/Git Primer Tyler Hague Why Use Github? Github keeps all of our code up to date in one place Github tracks changes so we can see what is being worked on Github has issue tracking for keeping up with

More information

Lakshmish Ramanna University of Texas at Dallas Dept. of Electrical Engineering Richardson, TX

Lakshmish Ramanna University of Texas at Dallas Dept. of Electrical Engineering Richardson, TX Boy Sensor Networks to Evaluate Staning Balance: Interpreting Muscular Activities Base on Inertial Sensors Rohith Ramachanran Richarson, TX 75083 rxr057100@utallas.eu Gaurav Prahan Dept. of Computer Science

More information

Frequency Domain Parameter Estimation of a Synchronous Generator Using Bi-objective Genetic Algorithms

Frequency Domain Parameter Estimation of a Synchronous Generator Using Bi-objective Genetic Algorithms Proceeings of the 7th WSEAS International Conference on Simulation, Moelling an Optimization, Beijing, China, September 15-17, 2007 429 Frequenc Domain Parameter Estimation of a Snchronous Generator Using

More information

Number Theoretic Attacks On Secure Password Schemes

Number Theoretic Attacks On Secure Password Schemes Number Theoretic Attacks On Secure Passwor Schemes Sarvar Patel Bellcore Math an Cryptography Research Group 445 South St, Morristown, NJ 07960, USA sarvar@bellcore.com Abstract Encrypte Key Exchange (EKE)

More information

DATA PARALLEL FPGA WORKLOADS: SOFTWARE VERSUS HARDWARE. Peter Yiannacouras, J. Gregory Steffan, and Jonathan Rose

DATA PARALLEL FPGA WORKLOADS: SOFTWARE VERSUS HARDWARE. Peter Yiannacouras, J. Gregory Steffan, and Jonathan Rose DATA PARALLEL FPGA WORKLOADS: SOFTWARE VERSUS HARDWARE Peter Yiannacouras, J. Gregory Steffan, an Jonathan Rose Ewar S. Rogers Sr. Department of Electrical an Computer Engineering University of Toronto

More information

+ E. Bit-Alignment for Retargetable Code Generators * 1 Introduction A D T A T A A T D A A. Keen Schoofs Gert Goossens Hugo De Mant

+ E. Bit-Alignment for Retargetable Code Generators * 1 Introduction A D T A T A A T D A A. Keen Schoofs Gert Goossens Hugo De Mant Bit-lignment for Retargetable Coe Generators * Keen Schoofs Gert Goossens Hugo e Mant IMEC, Kapelreef 75, B-3001 Leuven, Belgium bstract When builing a bit-true retargetable compiler, every signal type

More information

Compiler Optimisation

Compiler Optimisation Compiler Optimisation Michael O Boyle mob@inf.e.ac.uk Room 1.06 January, 2014 1 Two recommene books for the course Recommene texts Engineering a Compiler Engineering a Compiler by K. D. Cooper an L. Torczon.

More information

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017 ECE 550D Funamentals of Computer Systems an Engineering Fall 017 Datapaths Prof. John Boar Duke University Slies are erive from work by Profs. Tyler Bletch an Anrew Hilton (Duke) an Amir Roth (Penn) What

More information

Positions, Iterators, Tree Structures and Tree Traversals. Readings - Chapter 7.3, 7.4 and 8

Positions, Iterators, Tree Structures and Tree Traversals. Readings - Chapter 7.3, 7.4 and 8 Positions, Iterators, Tree Structures an Reaings - Chapter 7.3, 7.4 an 8 1 Positional Lists q q q q A position acts as a marker or token within the broaer positional list. A position p is unaffecte by

More information

Coupon Recalculation for the GPS Authentication Scheme

Coupon Recalculation for the GPS Authentication Scheme Coupon Recalculation for the GPS Authentication Scheme Georg Hofferek an Johannes Wolkerstorfer Graz University of Technology, Institute for Applie Information Processing an Communications (IAIK), Inffelgasse

More information

Scalable Deterministic Scheduling for WDM Slot Switching Xhaul with Zero-Jitter

Scalable Deterministic Scheduling for WDM Slot Switching Xhaul with Zero-Jitter FDL sel. VOA SOA 100 Regular papers ONDM 2018 Scalable Deterministic Scheuling for WDM Slot Switching Xhaul with Zero-Jitter Bogan Uscumlic 1, Dominique Chiaroni 1, Brice Leclerc 1, Thierry Zami 2, Annie

More information

Table-based division by small integer constants

Table-based division by small integer constants Table-base ivision by small integer constants Florent e Dinechin, Laurent-Stéphane Diier LIP, Université e Lyon (ENS-Lyon/CNRS/INRIA/UCBL) 46, allée Italie, 69364 Lyon Ceex 07 Florent.e.Dinechin@ens-lyon.fr

More information

Detour Planning for Fast and Reliable Failure Recovery in SDN with OpenState

Detour Planning for Fast and Reliable Failure Recovery in SDN with OpenState Detour Planning for Fast an Reliable Failure Recovery in SDN with OpenState Antonio Capone, Carmelo Cascone, Alessanro Q.T. Nguyen, Brunile Sansò Dipartimento i Elettronica, Informazione e Bioingegneria,

More information

Attitude Dynamics and Control of a Dual-Body Spacecraft using Variable-Speed Control Moment Gyros

Attitude Dynamics and Control of a Dual-Body Spacecraft using Variable-Speed Control Moment Gyros Attitue Dynamics an Control of a Dual-Boy Spacecraft using Variable-Spee Control Moment Gyros Abstract Marcello Romano, Brij N. Agrawal Department of Mechanical an Astronautical Engineering, US Naval Postgrauate

More information

OpenZFS Collaboration

OpenZFS Collaboration OpenZFS Collaboration OpenZFS User Conference March 16, 2017 Brian Behlendorf Lawrence Livermore National Laboratory This work was performed under the auspices of the U.S. Department of Energy by Lawrence

More information

ELECTRONIC PHYSICIAN SCALE WB-3000 INSTRUCTION MANUAL

ELECTRONIC PHYSICIAN SCALE WB-3000 INSTRUCTION MANUAL ELECTRONIC PHYSICIAN SCALE WB-3000 INSTRUCTION MANUAL Please keep this manual in a safe place, an make sure it is reaily available whenever necessary. Please use this prouct only after carefully reaing

More information

HKG18-506: HCQC HPC Compiler Quality Checker. Masaki Arai LEG HPC-SIG FUJITSU LABORATORIES LTD.

HKG18-506: HCQC HPC Compiler Quality Checker. Masaki Arai LEG HPC-SIG FUJITSU LABORATORIES LTD. HKG18-506: HCQC HPC Compiler Quality Checker Masaki Arai LEG HPC-SIG FUJITSU LABORATORIES LTD. Outline Background and Purpose Subject of Quality Check Metrics for Quality Evaluation Quality Evaluation

More information

Supporting Fully Adaptive Routing in InfiniBand Networks

Supporting Fully Adaptive Routing in InfiniBand Networks XIV JORNADAS DE PARALELISMO - LEGANES, SEPTIEMBRE 200 1 Supporting Fully Aaptive Routing in InfiniBan Networks J.C. Martínez, J. Flich, A. Robles, P. López an J. Duato Resumen InfiniBan is a new stanar

More information

The challenge of SVE in QEMU. Alex Bennée Senior Virtualization Engineer

The challenge of SVE in QEMU. Alex Bennée Senior Virtualization Engineer The challenge of SVE in QEMU Alex Bennée Senior Virtualization Engineer alex.bennee@linaro.org What is ARM s SVE? Why implement SVE in QEMU? How does QEMU s TCG work? Challenges for QEMU s emulation Current

More information

SSE and SSE2. Timothy A. Chagnon 18 September All images from Intel 64 and IA 32 Architectures Software Developer's Manuals

SSE and SSE2. Timothy A. Chagnon 18 September All images from Intel 64 and IA 32 Architectures Software Developer's Manuals SSE and SSE2 Timothy A. Chagnon 18 September 2007 All images from Intel 64 and IA 32 Architectures Software Developer's Manuals Overview SSE: Streaming SIMD (Single Instruction Multiple Data) Extensions

More information

JigCell Model Connector: building large molecular network models from components

JigCell Model Connector: building large molecular network models from components Simulation Applications JigCell Moel Connector: builing large molecular network moels from components Simulation: Transactions of the Society for Moeling an Simulation International 2018, Vol. 94(11) 993

More information

Accelerating the Single Cluster PHD Filter with a GPU Implementation

Accelerating the Single Cluster PHD Filter with a GPU Implementation Accelerating the Single Cluster PHD Filter with a GPU Implementation Chee Sing Lee, José Franco, Jérémie Houssineau, Daniel Clar Abstract The SC-PHD filter is an algorithm which was esigne to solve a class

More information

Webmail Instructions

Webmail Instructions Medway Grid for Learning Policies and Guidance Webmail Instructions (Version 1.10-29/04/2005) Connecting to the webmail service... 1 Accessing old email... 1 To Send a New Message... 3 Organising your

More information

Adjacency Matrix Based Full-Text Indexing Models

Adjacency Matrix Based Full-Text Indexing Models 1000-9825/2002/13(10)1933-10 2002 Journal of Software Vol.13, No.10 Ajacency Matrix Base Full-Text Inexing Moels ZHOU Shui-geng 1, HU Yun-fa 2, GUAN Ji-hong 3 1 (Department of Computer Science an Engineering,

More information

Introduction to Computers and Programming

Introduction to Computers and Programming 16.070 Introduction to Computers and Programming May 2 Recitation 12 Spring 2002 Topics: Project Q&A Real-time overview Multitasking Final Project: Any Questions? Real-time overview What is a real-time

More information

Animação e Visualização Tridimensional. Collision Detection Corpo docente de AVT / CG&M / DEI / IST / UTL

Animação e Visualização Tridimensional. Collision Detection Corpo docente de AVT / CG&M / DEI / IST / UTL Animação e Visualização Triimensional Collision Detection Collision Hanling Collision Detection Collision Determination Collision Response Collision Hanling Collision Detection Collision Determination

More information

FTSM FAST TAKAGI-SUGENO FUZZY MODELING. Manfred Männle

FTSM FAST TAKAGI-SUGENO FUZZY MODELING. Manfred Männle FTS FAST TAKAGI-SUGENO FUZZY ODELING anfre ännle Institute for Computer Design an Fault Tolerance University of Karlsruhe, D-7628 Karlsruhe, Germany anfre.aennle@informatik.uni-karlsruhe.e Abstract Takagi-Sugeno

More information

Git. SSE2034: System Software Experiment 3, Fall 2018, Jinkyu Jeong

Git. SSE2034: System Software Experiment 3, Fall 2018, Jinkyu Jeong Git Prof. Jinkyu Jeong (Jinkyu@skku.edu) TA -- Minwoo Ahn (minwoo.ahn@csl.skku.edu) TA -- Donghyun Kim (donghyun.kim@csl.skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu

More information

An Algorithm for Building an Enterprise Network Topology Using Widespread Data Sources

An Algorithm for Building an Enterprise Network Topology Using Widespread Data Sources An Algorithm for Builing an Enterprise Network Topology Using Wiesprea Data Sources Anton Anreev, Iurii Bogoiavlenskii Petrozavosk State University Petrozavosk, Russia {anreev, ybgv}@cs.petrsu.ru Abstract

More information

Generalized Edge Coloring for Channel Assignment in Wireless Networks

Generalized Edge Coloring for Channel Assignment in Wireless Networks Generalize Ege Coloring for Channel Assignment in Wireless Networks Chun-Chen Hsu Institute of Information Science Acaemia Sinica Taipei, Taiwan Da-wei Wang Jan-Jan Wu Institute of Information Science

More information

NAND flash memory is widely used as a storage

NAND flash memory is widely used as a storage 1 : Buffer-Aware Garbage Collection for Flash-Base Storage Systems Sungjin Lee, Dongkun Shin Member, IEEE, an Jihong Kim Member, IEEE Abstract NAND flash-base storage evice is becoming a viable storage

More information

Preamble. Singly linked lists. Collaboration policy and academic integrity. Getting help

Preamble. Singly linked lists. Collaboration policy and academic integrity. Getting help CS2110 Spring 2016 Assignment A. Linke Lists Due on the CMS by: See the CMS 1 Preamble Linke Lists This assignment begins our iscussions of structures. In this assignment, you will implement a structure

More information

Preliminary Performance Evaluation of Application Kernels using ARM SVE with Multiple Vector Lengths

Preliminary Performance Evaluation of Application Kernels using ARM SVE with Multiple Vector Lengths Preliminary Performance Evaluation of Application Kernels using ARM SVE with Multiple Vector Lengths Y. Kodama, T. Odajima, M. Matsuda, M. Tsuji, J. Lee and M. Sato RIKEN AICS (Advanced Institute for Computational

More information

Learning Subproblem Complexities in Distributed Branch and Bound

Learning Subproblem Complexities in Distributed Branch and Bound Learning Subproblem Complexities in Distribute Branch an Boun Lars Otten Department of Computer Science University of California, Irvine lotten@ics.uci.eu Rina Dechter Department of Computer Science University

More information

SAMPLE FREE SEO Analysis Report. December 5, 2005

SAMPLE FREE SEO Analysis Report. December 5, 2005 SAMPLE FREE SEO Analysis Report December 5, 2005 SAMPLE Keywor Phrase Analysis: Tier #3 Hello Via Net Marketing, Below is the list of keywors that represents the market research that has been performe

More information

INDEX SET that the utmost attention is paid to the following 5. 7 instructions and to save it.

INDEX SET that the utmost attention is paid to the following 5. 7 instructions and to save it. TLE 20 MICROPROCESSOR-BASED DIGITAL ELECTRIC FREEZER CTROLLER OPERATING INSTRUCTIS Vr. 0 (ENG) - 09/0 - co.: ISTR 0720 FOREWORD TECNOLOGIC S.p.A. VIA INDIPENDENZA 27029 VIGEVANO (PV) ITALY TEL.: +39 038

More information