Starfish Efficient Concurrency Support

Size: px
Start display at page:

Download "Starfish Efficient Concurrency Support"

Transcription

1 Efficient Concurrency Support for Computer Applications Robert LiKamWa Lin Zhong Rice University 1

2 In the year

3 Continuous mobile vision Oh, I haven't talked to Fred recently... Where did I place my keys? Remind me to tell Lin to let me graduate!?????????? 3

4 Continuous mobile vision Energy consumption overwhelms wearable battery! Insufficient concurrency support for vision! 4

5 Application Capture Service Headers 5

6 Application Capture Service Headers Scale Face 6

7 200+ ms 200+ ms 200+ ms Camera is overworked Computation is overworked We need efficient concurrency support! 7

8 Observations: apps utilize common vision library 8

9 Scale Face common primitives Scale Face Scale SURF Observations: apps utilize common vision library Capture/Computation is redundant across apps 9

10 Scale Face common primitives Scale Face Scale SURF Observations: apps utilize common vision library Capture/Computation is redundant across apps 10

11 Proposal: Split- process design for centralized control Scale SURF Face Key Idea: Share computations, Reduce redundancy 11

12 Core Service Uses vision headers to intercept library calls Core Service Executes, tracks, and shares library call computations 12

13 Core Service Split- process library solution for efficient, transparent multi- app service 13

14 facedetect(img) Core facedetect(img) Function Cache Arg Search Exec. Function Cache Storage/Retrieval First Call Execute call ( ms) Store results Subsequent Call(s) Skip execution (0 ms) Retrieve results 14

15 Core Cache Search Function Cache Share computations, Reduce redundancy 15

16 Core Cache Search Function Cache Timing optimizations: + Reuse "Fresh Frames to promote cache hits + Delay call return to encourage device sleep 16

17 Core Mitigating Cache Search Split- Process Overhead Function Cache + Promote sharing by reusing "Fresh Frames + Maintain code privacy by delaying call return + Manage concurrent requests through fine- grained locks 17

18 Core Cache Search Function Cache Reduce expense of argument passing Avoid library object modification 18

19 Split- Process in the Literature Object Redefinition Lightweight RPC SUN RPC Zero- copy Require library redesign Cloud Transfer COMET MAUI CloneCloud Object tracking Optimized for code offload 19

20 Split- Process Argument Passing Shared Memory Core facedetect( ) 3.5 ms Execution Goal: Minimize deep copy overhead is expensive 20

21 1) Protected shallow copy Shared Memory Core facedetect( ) CoW Execution CoW Issue copy- on- write (mprotect) on received objects 21

22 2) Direct output marshalling Shared Memory Core facedetect( ) CoW CoW Execution malloc() Write new data directly into shared memory 22

23 3) Reuse arguments from prior calls Shared Memory Core facedetect( ) CoW Argument Table (img, shm_ptr) CoW Execution malloc() Track, reuse previous inputs/outputs 23

24 (Usually) Zero- Copy Argument Passing Shared Memory Core facedetect( ) CoW Argument Table (img, shm_ptr) CoW Execution malloc() + Reduce deep copies through argument reuse + Reduce allocation through buffer reuse 24

25 Optimizations Share Computations Maintain expectations of developers & users Increase sharing efficiency by relaxing timing Decrease redundancy Reduce argument copy Reuse memory buffers 25

26 Experimental Platform OpenCV + Android + Google Glass Monsoon Power Monitor OMAP4430 dual- core Cortex- A9 pinned to 600 MHz Benchmarks: 1) Per- call micro- benchmarks 2) Multi- app benchmarks 26

27 Memory optimizations cut overhead in half Prepare&Inputs& Send&Inputs& Search&Cache& 'call'execution:'resize() Native& &Unopt.& & First Call Native : 21 ms Unopt. : 42 ms : 24ms Exec.&Function& Allocate&Outputs& Prepare&Outputs& Receive&Outputs& 0" 5" 10" 15" 20" 25" Execution&time&(ms)& 27

28 Memory optimizations cut overhead in half Prepare&Inputs& Send&Inputs& Search&Cache& Exec.&Function& Allocate&Outputs& Prepare&Outputs& 'call'execution:'resize() Native& &Unopt.& & First Call Native : 21 ms Unopt. : 42 ms : 24ms Second Call Native : 21 ms Unopt. : 10 ms : 6 ms Receive&Outputs& 0" 5" 10" 15" 20" 25" Execution&time&(ms)& works well when native function execution >> 5 ms 28

29 Places where fails 5-6 ms performance overhead per- call: Bad for fast computations Bad with large, deep arguments Non- cacheable functions Random, temporal, external dependencies Functions with specific parameters But great for many high- level functions: Face detect, Corner detect, Image resize 29

30 vs. Multi- App Workload If Lin then That Social Logger Facebook Google+ Twitter MySpace Whatsapp... Scale Face Detect Face Recog 0.3 FPS 30

31 When running multiple apps, achieves higher performance 5" 4.5" 4" 3.5" 3" Frame&Rate& 2.5" (FPS)& 2" 1.5" 1" 0.5" 31 0" Na/ve" " 1" 2" 3" 4" 5" 6" 7" 8" 9" 10" Number&of&App&Instances& Higher is better

32 When running multiple apps, draws single- app power Na.ve" " Power& Consump-on& (mw)& 2500" 2000" 1500" 1000" 500" 0" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10" Number&of&App&Instances& Lower is better 32

33 When running multiple apps, draws single- app power Na.ve" " Power& Consump-on& (mw)& 2500" 2000" 1500" 1000" 500" 0" 1" 2" 3" 4" 5" 6" 7" 8" 9" 10" Number&of&App&Instances& achieves efficient concurrency support! 33

34 Core Cache Search Function Cache Share computations, Reduce redundancy 34

35 Continuing mobile vision Mitigating Analog Signal Chain Bandwidth Preserving User/Subject Privacy Designing Efficient Systems 35

36 : Share computations, Reduce redundancy Transparent, efficient library call caching Timing optimizations for frame freshness and performance preservation Memory optimizations for minimal deep copy and small cache footprint Google Glass Experiments: Low overhead Memory optimizations slash overhead in half Reduced power draw Multi- app workloads draw Single app power 36

Energy-Aware, Security-Conscious Code Offloading for the Mobile Cloud

Energy-Aware, Security-Conscious Code Offloading for the Mobile Cloud Energy-Aware, Security-Conscious Code Offloading for the Mobile Cloud Kirill Mechitov, Atul Sandur, and Gul Agha October 12, 2016 Background: Mobile Cloud Computing Cloud Computing Access to virtually

More information

Rsyslog: going up from 40K messages per second to 250K. Rainer Gerhards

Rsyslog: going up from 40K messages per second to 250K. Rainer Gerhards Rsyslog: going up from 40K messages per second to 250K Rainer Gerhards What's in it for you? Bad news: will not teach you to make your kernel component five times faster Perspective user-space application

More information

Energy Discounted Computing On Multicore Smartphones Meng Zhu & Kai Shen. Atul Bhargav

Energy Discounted Computing On Multicore Smartphones Meng Zhu & Kai Shen. Atul Bhargav Energy Discounted Computing On Multicore Smartphones Meng Zhu & Kai Shen Atul Bhargav Overview Energy constraints in a smartphone Li-Ion Battery Arm big.little Hardware Sharing What is Energy Discounted

More information

Diffusing Your Mobile Apps: Extending In-Network Function Virtualisation to Mobile Function Offloading

Diffusing Your Mobile Apps: Extending In-Network Function Virtualisation to Mobile Function Offloading Diffusing Your Mobile Apps: Extending In-Network Function Virtualisation to Mobile Function Offloading Mario Almeida, Liang Wang*, Jeremy Blackburn, Konstantina Papagiannaki, Jon Crowcroft* Telefonica

More information

Mobile Middleware Course. Mobile Platforms and Middleware. Sasu Tarkoma

Mobile Middleware Course. Mobile Platforms and Middleware. Sasu Tarkoma Mobile Middleware Course Mobile Platforms and Middleware Sasu Tarkoma Role of Software and Algorithms Software has an increasingly important role in mobile devices Increase in device capabilities Interaction

More information

CSci 8980 Mobile Cloud Computing. Outsourcing: Components I

CSci 8980 Mobile Cloud Computing. Outsourcing: Components I CSci 8980 Mobile Cloud Computing Outsourcing: Components I MAUI: Making Smartphones Last Longer With Code Offload Microsoft Research Outline Motivation MAUI system design MAUI proxy MAUI profiler MAUI

More information

IoT Ecosystem and Business Opportunities

IoT Ecosystem and Business Opportunities IoT Ecosystem and Business Opportunities 17th May, 2017 1 Copyright 2017 Samsung. All Rights Reserved. Shivakumar Mathapathi Co-Founder & CTO Dew Mobility (Approved Vendor for Samsung) Table of Contents

More information

Phone Overview. Important buttons on your Jitterbug Smart

Phone Overview. Important buttons on your Jitterbug Smart Phone Overview Important buttons on your Jitterbug Smart A B A) Volume Button: PRESS upper end of button to increase volume, PRESS the lower end to decrease volume B) Power/Lock Button: PRESS and release

More information

Shareable Camera Framework for Multiple Computer Vision Applications

Shareable Camera Framework for Multiple Computer Vision Applications 669 Shareable Camera Framework for Multiple Computer Vision Applications Hayun Lee, Gyeonghwan Hong, Dongkun Shin Department of Electrical and Computer Engineering, Sungkyunkwan University, Korea lhy920806@skku.edu,

More information

Using the Xelio Tablet

Using the Xelio Tablet 1 Using the Xelio Tablet by Dick Evans www.rwevans.com The Eyeglass in the top left opens the Google search on the Internet The microphone picture lets you talk instead of typing The 6 squares in the upper

More information

L.C.Smith. Privacy-Preserving Offloading of Mobile App to the Public Cloud

L.C.Smith. Privacy-Preserving Offloading of Mobile App to the Public Cloud Privacy-Preserving Offloading of Mobile App to the Public Cloud Yue Duan, Mu Zhang, Heng Yin and Yuzhe Tang Department of EECS Syracuse University L.C.Smith College of Engineering 1 and Computer Science

More information

Call Paths for Pin Tools

Call Paths for Pin Tools , Xu Liu, and John Mellor-Crummey Department of Computer Science Rice University CGO'14, Orlando, FL February 17, 2014 What is a Call Path? main() A() B() Foo() { x = *ptr;} Chain of function calls that

More information

Ubiquitous and Mobile Computing CS 528:EnergyEfficiency Comparison of Mobile Platforms and Applications: A Quantitative Approach. Norberto Luna Cano

Ubiquitous and Mobile Computing CS 528:EnergyEfficiency Comparison of Mobile Platforms and Applications: A Quantitative Approach. Norberto Luna Cano Ubiquitous and Mobile Computing CS 528:EnergyEfficiency Comparison of Mobile Platforms and Applications: A Quantitative Approach Norberto Luna Cano Computer Science Dept. Worcester Polytechnic Institute

More information

Designing Next-Generation Data- Centers with Advanced Communication Protocols and Systems Services. Presented by: Jitong Chen

Designing Next-Generation Data- Centers with Advanced Communication Protocols and Systems Services. Presented by: Jitong Chen Designing Next-Generation Data- Centers with Advanced Communication Protocols and Systems Services Presented by: Jitong Chen Outline Architecture of Web-based Data Center Three-Stage framework to benefit

More information

Special Topics: CSci 8980 Edge Computing Outsourcing III

Special Topics: CSci 8980 Edge Computing Outsourcing III Special Topics: CSci 8980 Edge Computing Outsourcing III Jon B. Weissman (jon@cs.umn.edu) Department of Computer Science University of Minnesota Parametric Analysis for Adaptive Computation Offoading 2

More information

Next-Generation Cloud Platform

Next-Generation Cloud Platform Next-Generation Cloud Platform Jangwoo Kim Jun 24, 2013 E-mail: jangwoo@postech.ac.kr High Performance Computing Lab Department of Computer Science & Engineering Pohang University of Science and Technology

More information

Lecture 12: Model Serving. CSE599W: Spring 2018

Lecture 12: Model Serving. CSE599W: Spring 2018 Lecture 12: Model Serving CSE599W: Spring 2018 Deep Learning Applications That drink will get you to 2800 calories for today I last saw your keys in the store room Remind Tom of the party You re on page

More information

User-level Management of Kernel Memory

User-level Management of Kernel Memory User-level Management of Memory Andreas Haeberlen University of Karlsruhe Karlsruhe, Germany Kevin Elphinstone University of New South Wales Sydney, Australia 1 Motivation: memory Threads Files memory

More information

Speculations about Computer Architecture in Next Three Years. Jan. 20, 2018

Speculations about Computer Architecture in Next Three Years. Jan. 20, 2018 Speculations about Computer Architecture in Next Three Years shuchang.zhou@gmail.com Jan. 20, 2018 About me https://zsc.github.io/ Source-to-source transformation Cache simulation Compiler Optimization

More information

An Overlay File System for Cloud-Assisted Mobile Applications

An Overlay File System for Cloud-Assisted Mobile Applications An Overlay File System for Cloud-Assisted Mobile Applications Jianchen Shan, Nafize R. Paiker, Xiaoning Ding, Narain Gehani, Reza Curtmola, Cristian Borcea Department of Computer Science, New Jersey Institute

More information

Reduce Data Usage. 01 Cellular Data for Certain Apps Go to Settings > Cellular. Dad s iphone Tips Version: 1/1/2018 6:43:00 AM

Reduce Data Usage. 01 Cellular Data for Certain Apps Go to Settings > Cellular. Dad s iphone Tips Version: 1/1/2018 6:43:00 AM Page 1 of 6 Contents Reduce Data Usage... 1 01 Cellular Data for Certain Apps... 1 02 icoud Drive... 3 03 Wi-Fi Assist... 3 04 Automatic Downloads... 3 05 Background App Refresh... 3 06 Load Remote Images...

More information

GPUfs: Integrating a file system with GPUs

GPUfs: Integrating a file system with GPUs GPUfs: Integrating a file system with GPUs Mark Silberstein (UT Austin/Technion) Bryan Ford (Yale), Idit Keidar (Technion) Emmett Witchel (UT Austin) 1 Traditional System Architecture Applications OS CPU

More information

Go Deep: Fixing Architectural Overheads of the Go Scheduler

Go Deep: Fixing Architectural Overheads of the Go Scheduler Go Deep: Fixing Architectural Overheads of the Go Scheduler Craig Hesling hesling@cmu.edu Sannan Tariq stariq@cs.cmu.edu May 11, 2018 1 Introduction Golang is a programming language developed to target

More information

CS533 Concepts of Operating Systems. Jonathan Walpole

CS533 Concepts of Operating Systems. Jonathan Walpole CS533 Concepts of Operating Systems Jonathan Walpole Lightweight Remote Procedure Call (LRPC) Overview Observations Performance analysis of RPC Lightweight RPC for local communication Performance Remote

More information

CHAPTER - 4 REMOTE COMMUNICATION

CHAPTER - 4 REMOTE COMMUNICATION CHAPTER - 4 REMOTE COMMUNICATION Topics Introduction to Remote Communication Remote Procedural Call Basics RPC Implementation RPC Communication Other RPC Issues Case Study: Sun RPC Remote invocation Basics

More information

VARIOUS systems have been designed to allow mobile

VARIOUS systems have been designed to allow mobile IEEE TRANSACTION ON CLOUD COMPUTING 1 Design and Implementation of an Overlay File System for Cloud-Assisted Mobile Apps Nafize R. Paiker, Jianchen Shan, Cristian Borcea, Narain Gehani, Reza Curtmola,

More information

GPUfs: Integrating a file system with GPUs

GPUfs: Integrating a file system with GPUs ASPLOS 2013 GPUfs: Integrating a file system with GPUs Mark Silberstein (UT Austin/Technion) Bryan Ford (Yale), Idit Keidar (Technion) Emmett Witchel (UT Austin) 1 Traditional System Architecture Applications

More information

WearDrive: Fast and Energy Efficient Storage for Wearables

WearDrive: Fast and Energy Efficient Storage for Wearables WearDrive: Fast and Energy Efficient Storage for Wearables Reza Shisheie Cleveland State University CIS 601 Wearable Computing: A New Era 2 Wearable Computing: A New Era Notifications Fitness/Healthcare

More information

Using Transparent Compression to Improve SSD-based I/O Caches

Using Transparent Compression to Improve SSD-based I/O Caches Using Transparent Compression to Improve SSD-based I/O Caches Thanos Makatos, Yannis Klonatos, Manolis Marazakis, Michail D. Flouris, and Angelos Bilas {mcatos,klonatos,maraz,flouris,bilas}@ics.forth.gr

More information

Big and Fast. Anti-Caching in OLTP Systems. Justin DeBrabant

Big and Fast. Anti-Caching in OLTP Systems. Justin DeBrabant Big and Fast Anti-Caching in OLTP Systems Justin DeBrabant Online Transaction Processing transaction-oriented small footprint write-intensive 2 A bit of history 3 OLTP Through the Years relational model

More information

Enhancement of Open Source Monitoring Tool for Small Footprint JuneDatabases

Enhancement of Open Source Monitoring Tool for Small Footprint JuneDatabases Enhancement of Open Source Monitoring Tool for Small Footprint Databases Sukh Deo Under the guidance of Prof. Deepak B. Phatak Dept. of CSE IIT Bombay June 23, 2014 Enhancement of Open Source Monitoring

More information

DNNBuilder: an Automated Tool for Building High-Performance DNN Hardware Accelerators for FPGAs

DNNBuilder: an Automated Tool for Building High-Performance DNN Hardware Accelerators for FPGAs IBM Research AI Systems Day DNNBuilder: an Automated Tool for Building High-Performance DNN Hardware Accelerators for FPGAs Xiaofan Zhang 1, Junsong Wang 2, Chao Zhu 2, Yonghua Lin 2, Jinjun Xiong 3, Wen-mei

More information

PAC094 Performance Tips for New Features in Workstation 5. Anne Holler Irfan Ahmad Aravind Pavuluri

PAC094 Performance Tips for New Features in Workstation 5. Anne Holler Irfan Ahmad Aravind Pavuluri PAC094 Performance Tips for New Features in Workstation 5 Anne Holler Irfan Ahmad Aravind Pavuluri Overview of Talk Virtual machine teams 64-bit guests SMP guests e1000 NIC support Fast snapshots Virtual

More information

biosim App: Android Quick Reference Guide for i-limb devices

biosim App: Android Quick Reference Guide for i-limb devices biosim App: Android Quick Reference Guide for i-limb devices 1 Contents 1 Welcome and important points 2 Getting started 5 Activation 6 Firmware Update i-limb ultra revolution 12 Connection 12 Searching

More information

Building Ultra-Low Power Wearable SoCs

Building Ultra-Low Power Wearable SoCs Building Ultra-Low Power Wearable SoCs 1 Wearable noun An item that can be worn adjective Easy to wear, suitable for wearing 2 Wearable Opportunity: Fastest Growing Market Segment Projected Growth from

More information

Open Mobile Platforms. EE 392I, Lecture-6 May 4 th, 2010

Open Mobile Platforms. EE 392I, Lecture-6 May 4 th, 2010 Open Mobile Platforms EE 392I, Lecture-6 May 4 th, 2010 Open Mobile Platforms The Android Initiative T-Mobile s ongoing focus on Android based devices in US and EU markets In Nov 2007, Google announced

More information

Enhancement of Open Source Monitoring Tool for Small Footprint JuneDatabases

Enhancement of Open Source Monitoring Tool for Small Footprint JuneDatabases Enhancement of Open Source Monitoring Tool for Small Footprint Databases Sukh Deo Under the guidance of Prof. Deepak B. Phatak Dept. of CSE IIT Bombay June 22, 2014 Enhancement of Open Source Monitoring

More information

Enabling Efficient and Scalable Zero-Trust Security

Enabling Efficient and Scalable Zero-Trust Security WHITE PAPER Enabling Efficient and Scalable Zero-Trust Security FOR CLOUD DATA CENTERS WITH AGILIO SMARTNICS THE NEED FOR ZERO-TRUST SECURITY The rapid evolution of cloud-based data centers to support

More information

Software-Controlled Multithreading Using Informing Memory Operations

Software-Controlled Multithreading Using Informing Memory Operations Software-Controlled Multithreading Using Informing Memory Operations Todd C. Mowry Computer Science Department University Sherwyn R. Ramkissoon Department of Electrical & Computer Engineering University

More information

Portland State University ECE 588/688. Graphics Processors

Portland State University ECE 588/688. Graphics Processors Portland State University ECE 588/688 Graphics Processors Copyright by Alaa Alameldeen 2018 Why Graphics Processors? Graphics programs have different characteristics from general purpose programs Highly

More information

COL862 - Low Power Computing

COL862 - Low Power Computing COL862 - Low Power Computing Power Measurements using performance counters and studying the low power computing techniques in IoT development board (PSoC 4 BLE Pioneer Kit) and Arduino Mega 2560 Submitted

More information

FaRM: Fast Remote Memory

FaRM: Fast Remote Memory FaRM: Fast Remote Memory Problem Context DRAM prices have decreased significantly Cost effective to build commodity servers w/hundreds of GBs E.g. - cluster with 100 machines can hold tens of TBs of main

More information

Ajax Performance Analysis. Ryan Breen

Ajax Performance Analysis. Ryan Breen Ajax Performance Analysis Ryan Breen Ajax Performance Analysis Who Goals Ryan Breen: VP Technology at Gomez and blogger at ajaxperformance.com Survey tools available to developers Understand how to approach

More information

MoodScope: Asia! Sensing mood from smartphone usage pa5erns. Yunxin Liu Nicholas D. Lane. Robert Likamwa Lin Zhong

MoodScope: Asia! Sensing mood from smartphone usage pa5erns. Yunxin Liu Nicholas D. Lane. Robert Likamwa Lin Zhong MoodScope: Sensing mood from smartphone usage pa5erns Robert Likamwa Lin Zhong Yunxin Liu Nicholas D. Lane Asia! Mood-Enhanced Apps! Personal" analytics! Social" ecosystems! Media " recommendation! 2 of

More information

From Boolean Algebra to Smart Glass

From Boolean Algebra to Smart Glass From Boolean Algebra to Smart Glass George Tai 2014/03 Boolean Algebra Why mathematics is the base for today s computer technology? In mathematics and mathematical logic, Boolean algebra is the subarea

More information

Arachne. Core Aware Thread Management Henry Qin Jacqueline Speiser John Ousterhout

Arachne. Core Aware Thread Management Henry Qin Jacqueline Speiser John Ousterhout Arachne Core Aware Thread Management Henry Qin Jacqueline Speiser John Ousterhout Granular Computing Platform Zaharia Winstein Levis Applications Kozyrakis Cluster Scheduling Ousterhout Low-Latency RPC

More information

HPMMAP: Lightweight Memory Management for Commodity Operating Systems. University of Pittsburgh

HPMMAP: Lightweight Memory Management for Commodity Operating Systems. University of Pittsburgh HPMMAP: Lightweight Memory Management for Commodity Operating Systems Brian Kocoloski Jack Lange University of Pittsburgh Lightweight Experience in a Consolidated Environment HPC applications need lightweight

More information

TUNING CUDA APPLICATIONS FOR MAXWELL

TUNING CUDA APPLICATIONS FOR MAXWELL TUNING CUDA APPLICATIONS FOR MAXWELL DA-07173-001_v6.5 August 2014 Application Note TABLE OF CONTENTS Chapter 1. Maxwell Tuning Guide... 1 1.1. NVIDIA Maxwell Compute Architecture... 1 1.2. CUDA Best Practices...2

More information

STYROFOAM. Robert LiKamWa, David Ramirez, Jason Holloway. Advisors: Lin Zhong, Behnaam Aazhang, Ashok Veeraraghavan Rice University

STYROFOAM. Robert LiKamWa, David Ramirez, Jason Holloway. Advisors: Lin Zhong, Behnaam Aazhang, Ashok Veeraraghavan Rice University STYROFOAM Robert LiKamWa, David Ramirez, Jason Holloway Advisors: Lin Zhong, Behnaam Aazhang, Ashok Veeraraghavan Rice University STYROFOAM A Tightly-Packed Coding Scheme for Camera-based Visible Light

More information

416 Distributed Systems. Distributed File Systems 1: NFS Sep 18, 2018

416 Distributed Systems. Distributed File Systems 1: NFS Sep 18, 2018 416 Distributed Systems Distributed File Systems 1: NFS Sep 18, 2018 1 Outline Why Distributed File Systems? Basic mechanisms for building DFSs Using NFS and AFS as examples NFS: network file system AFS:

More information

Capriccio : Scalable Threads for Internet Services

Capriccio : Scalable Threads for Internet Services Capriccio : Scalable Threads for Internet Services - Ron von Behren &et al - University of California, Berkeley. Presented By: Rajesh Subbiah Background Each incoming request is dispatched to a separate

More information

data parallelism Chris Olston Yahoo! Research

data parallelism Chris Olston Yahoo! Research data parallelism Chris Olston Yahoo! Research set-oriented computation data management operations tend to be set-oriented, e.g.: apply f() to each member of a set compute intersection of two sets easy

More information

H.-S. Oh, B.-J. Kim, H.-K. Choi, S.-M. Moon. School of Electrical Engineering and Computer Science Seoul National University, Korea

H.-S. Oh, B.-J. Kim, H.-K. Choi, S.-M. Moon. School of Electrical Engineering and Computer Science Seoul National University, Korea H.-S. Oh, B.-J. Kim, H.-K. Choi, S.-M. Moon School of Electrical Engineering and Computer Science Seoul National University, Korea Android apps are programmed using Java Android uses DVM instead of JVM

More information

Distributed File Systems Issues. NFS (Network File System) AFS: Namespace. The Andrew File System (AFS) Operating Systems 11/19/2012 CSC 256/456 1

Distributed File Systems Issues. NFS (Network File System) AFS: Namespace. The Andrew File System (AFS) Operating Systems 11/19/2012 CSC 256/456 1 Distributed File Systems Issues NFS (Network File System) Naming and transparency (location transparency versus location independence) Host:local-name Attach remote directories (mount) Single global name

More information

IX: A Protected Dataplane Operating System for High Throughput and Low Latency

IX: A Protected Dataplane Operating System for High Throughput and Low Latency IX: A Protected Dataplane Operating System for High Throughput and Low Latency Belay, A. et al. Proc. of the 11th USENIX Symp. on OSDI, pp. 49-65, 2014. Reviewed by Chun-Yu and Xinghao Li Summary In this

More information

GeForce3 OpenGL Performance. John Spitzer

GeForce3 OpenGL Performance. John Spitzer GeForce3 OpenGL Performance John Spitzer GeForce3 OpenGL Performance John Spitzer Manager, OpenGL Applications Engineering jspitzer@nvidia.com Possible Performance Bottlenecks They mirror the OpenGL pipeline

More information

Smart Watch Phone User Guide. Please read the manual before use.

Smart Watch Phone User Guide. Please read the manual before use. Smart Watch Phone User Guide Please read the manual before use. 1. SAFETY WARNING 1.1 The information in this document won't be modified or extended in accordance with any notice. 1.2 The watch should

More information

Chapter 6: Distributed Systems: The Web. Fall 2012 Sini Ruohomaa Slides joint work with Jussi Kangasharju et al.

Chapter 6: Distributed Systems: The Web. Fall 2012 Sini Ruohomaa Slides joint work with Jussi Kangasharju et al. Chapter 6: Distributed Systems: The Web Fall 2012 Sini Ruohomaa Slides joint work with Jussi Kangasharju et al. Chapter Outline Web as a distributed system Basic web architecture Content delivery networks

More information

Azor: Using Two-level Block Selection to Improve SSD-based I/O caches

Azor: Using Two-level Block Selection to Improve SSD-based I/O caches Azor: Using Two-level Block Selection to Improve SSD-based I/O caches Yannis Klonatos, Thanos Makatos, Manolis Marazakis, Michail D. Flouris, Angelos Bilas {klonatos, makatos, maraz, flouris, bilas}@ics.forth.gr

More information

Slide 1. Opera Max. Migrating the next billion smartphone users for better app experience

Slide 1. Opera Max. Migrating the next billion smartphone users for better app experience Slide 1 Opera Max Migrating the next billion smartphone users for better app experience The 3 Consideration in the Next Billion Migration Slide 2 Cost of ownership (Device) Cost of usage (Data) Network

More information

High-Performance Data Loading and Augmentation for Deep Neural Network Training

High-Performance Data Loading and Augmentation for Deep Neural Network Training High-Performance Data Loading and Augmentation for Deep Neural Network Training Trevor Gale tgale@ece.neu.edu Steven Eliuk steven.eliuk@gmail.com Cameron Upright c.upright@samsung.com Roadmap 1. The General-Purpose

More information

A Comparison of Capacity Management Schemes for Shared CMP Caches

A Comparison of Capacity Management Schemes for Shared CMP Caches A Comparison of Capacity Management Schemes for Shared CMP Caches Carole-Jean Wu and Margaret Martonosi Princeton University 7 th Annual WDDD 6/22/28 Motivation P P1 P1 Pn L1 L1 L1 L1 Last Level On-Chip

More information

Acceleration Systems Technical Overview. September 2014, v1.4

Acceleration Systems Technical Overview. September 2014, v1.4 Acceleration Systems Technical Overview September 2014, v1.4 Acceleration Systems 2014 Table of Contents 3 Background 3 Cloud-Based Bandwidth Optimization 4 Optimizations 5 Protocol Optimization 5 CIFS

More information

CSci Introduction to Distributed Systems. Communication: RPC

CSci Introduction to Distributed Systems. Communication: RPC CSci 5105 Introduction to Distributed Systems Communication: RPC Today Remote Procedure Call Chapter 4 TVS Last Time Architectural styles RPC generally mandates client-server but not always Interprocess

More information

Are Mobile Technologies Safe Enough for Industrie 4.0?

Are Mobile Technologies Safe Enough for Industrie 4.0? Are Mobile Technologies Safe Enough for Industrie 4.0? Presented by Bryan Owen PE Mobile Technology is Awesome! Cameras Drone UAVs GPS Sensors Smart phones Wearables https://www.osisoft.com/presentations/geospatial-sensor---driven-analytics-using-drones/

More information

Lightweight RPC. Robert Grimm New York University

Lightweight RPC. Robert Grimm New York University Lightweight RPC Robert Grimm New York University The Three Questions What is the problem? What is new or different? What are the contributions and limitations? The Structure of Systems Monolithic kernels

More information

CS 654 Computer Architecture Summary. Peter Kemper

CS 654 Computer Architecture Summary. Peter Kemper CS 654 Computer Architecture Summary Peter Kemper Chapters in Hennessy & Patterson Ch 1: Fundamentals Ch 2: Instruction Level Parallelism Ch 3: Limits on ILP Ch 4: Multiprocessors & TLP Ap A: Pipelining

More information

Firefox for Android. Reviewer s Guide. Contact us:

Firefox for Android. Reviewer s Guide. Contact us: Reviewer s Guide Contact us: press@mozilla.com Table of Contents About Mozilla 1 Move at the Speed of the Web 2 Get Started 3 Mobile Browsing Upgrade 4 Get Up and Go 6 Customize On the Go 7 Privacy and

More information

OS-caused Long JVM Pauses - Deep Dive and Solutions

OS-caused Long JVM Pauses - Deep Dive and Solutions OS-caused Long JVM Pauses - Deep Dive and Solutions Zhenyun Zhuang LinkedIn Corp., Mountain View, California, USA https://www.linkedin.com/in/zhenyun Zhenyun@gmail.com 2016-4-21 Outline q Introduction

More information

TUNING CUDA APPLICATIONS FOR MAXWELL

TUNING CUDA APPLICATIONS FOR MAXWELL TUNING CUDA APPLICATIONS FOR MAXWELL DA-07173-001_v7.0 March 2015 Application Note TABLE OF CONTENTS Chapter 1. Maxwell Tuning Guide... 1 1.1. NVIDIA Maxwell Compute Architecture... 1 1.2. CUDA Best Practices...2

More information

LINUX KERNEL UPDATES FOR AUTOMOTIVE: LESSONS LEARNED

LINUX KERNEL UPDATES FOR AUTOMOTIVE: LESSONS LEARNED LINUX KERNEL UPDATES FOR AUTOMOTIVE: LESSONS LEARNED TOM MCREYNOLDS, VLAD BUZOV AUTOMOTIVE SOFTWARE OCTOBER 15TH, 2013 Why kernel upgrades : the problem Linux Kernel cadence doesn t match Automotive s

More information

POMAC: Properly Offloading Mobile Applications to Clouds

POMAC: Properly Offloading Mobile Applications to Clouds POMAC: Properly Offloading Mobile Applications to Clouds Mohammed Anowarul Hassan George Mason University Kshitiz Bhattarai SAP Lab Palo Alto Qi Wei and Songqing Chen George Mason University 1 Outline

More information

Thin Locks: Featherweight Synchronization for Java

Thin Locks: Featherweight Synchronization for Java Thin Locks: Featherweight Synchronization for Java D. Bacon 1 R. Konuru 1 C. Murthy 1 M. Serrano 1 Presented by: Calvin Hubble 2 1 IBM T.J. Watson Research Center 2 Department of Computer Science 16th

More information

Simplifying DSP Development with C6EZ Tools

Simplifying DSP Development with C6EZ Tools Simplifying DSP Development with C6EZ Tools DSP Development made easier with C6EZ Tools Seamlessly ports ARM code to DSP (ARM Developers) Provides ARM access to ready-to-use DSP kernels (System Developers)

More information

CA485 Ray Walshe Google File System

CA485 Ray Walshe Google File System Google File System Overview Google File System is scalable, distributed file system on inexpensive commodity hardware that provides: Fault Tolerance File system runs on hundreds or thousands of storage

More information

Footfall GPS Polling Scheduler for Power Saving on Wearable Devices. Kent W. Nixon, Xiang Chen, Yiran Chen University of Pittsburgh January 28, 2016

Footfall GPS Polling Scheduler for Power Saving on Wearable Devices. Kent W. Nixon, Xiang Chen, Yiran Chen University of Pittsburgh January 28, 2016 Footfall GPS Polling Scheduler for Power Saving on Wearable Devices Kent W. Nixon, Xiang Chen, Yiran Chen University of Pittsburgh January 28, 2016 Talk Outline Background Current GPS Usage Model Footfall

More information

GPUfs: Integrating a file system with GPUs

GPUfs: Integrating a file system with GPUs GPUfs: Integrating a file system with GPUs Mark Silberstein (UT Austin/Technion) Bryan Ford (Yale), Idit Keidar (Technion) Emmett Witchel (UT Austin) 1 Building systems with GPUs is hard. Why? 2 Goal of

More information

Systematic Cooperation in P2P Grids

Systematic Cooperation in P2P Grids 29th October 2008 Cyril Briquet Doctoral Dissertation in Computing Science Department of EE & CS (Montefiore Institute) University of Liège, Belgium Application class: Bags of Tasks Bag of Task = set of

More information

Analysis and Implementation of Global Preemptive Fixed-Priority Scheduling with Dynamic Cache Allocation

Analysis and Implementation of Global Preemptive Fixed-Priority Scheduling with Dynamic Cache Allocation Analysis and Implementation of Global Preemptive Fixed-Priority Scheduling with Dynamic Cache Allocation Meng Xu Linh Thi Xuan Phan Hyon-Young Choi Insup Lee Department of Computer and Information Science

More information

High-Performance Transaction Processing in Journaling File Systems Y. Son, S. Kim, H. Y. Yeom, and H. Han

High-Performance Transaction Processing in Journaling File Systems Y. Son, S. Kim, H. Y. Yeom, and H. Han High-Performance Transaction Processing in Journaling File Systems Y. Son, S. Kim, H. Y. Yeom, and H. Han Seoul National University, Korea Dongduk Women s University, Korea Contents Motivation and Background

More information

Ultra-Compact & Super Light-weight

Ultra-Compact & Super Light-weight Ultra-Compact & Super Light-weight Set-top Box Perfect size to fit into the glove compartment. Front & Rear Cameras Total weight of the cameras are less than 45grams. Front camera is more 50% smaller than

More information

Towards SDN-Defined Programmable BYOD (Bring Your Own Device) Security

Towards SDN-Defined Programmable BYOD (Bring Your Own Device) Security Towards SDN-Defined Programmable BYOD (Bring Your Own Device) Security Sungmin Hong, Robert Baykov, Lei Xu, Srinath Nadimpalli, Guofei Gu SUCCESS Lab Texas A&M University Outline Introduction & Motivation

More information

1. Memory technology & Hierarchy

1. Memory technology & Hierarchy 1 Memory technology & Hierarchy Caching and Virtual Memory Parallel System Architectures Andy D Pimentel Caches and their design cf Henessy & Patterson, Chap 5 Caching - summary Caches are small fast memories

More information

OpenACC Course. Office Hour #2 Q&A

OpenACC Course. Office Hour #2 Q&A OpenACC Course Office Hour #2 Q&A Q1: How many threads does each GPU core have? A: GPU cores execute arithmetic instructions. Each core can execute one single precision floating point instruction per cycle

More information

IBM PSSC Montpellier Customer Center. Blue Gene/P ASIC IBM Corporation

IBM PSSC Montpellier Customer Center. Blue Gene/P ASIC IBM Corporation Blue Gene/P ASIC Memory Overview/Considerations No virtual Paging only the physical memory (2-4 GBytes/node) In C, C++, and Fortran, the malloc routine returns a NULL pointer when users request more memory

More information

C 1. Recap. CSE 486/586 Distributed Systems Distributed File Systems. Traditional Distributed File Systems. Local File Systems.

C 1. Recap. CSE 486/586 Distributed Systems Distributed File Systems. Traditional Distributed File Systems. Local File Systems. Recap CSE 486/586 Distributed Systems Distributed File Systems Optimistic quorum Distributed transactions with replication One copy serializability Primary copy replication Read-one/write-all replication

More information

vcacheshare: Automated Server Flash Cache Space Management in a Virtualiza;on Environment

vcacheshare: Automated Server Flash Cache Space Management in a Virtualiza;on Environment vcacheshare: Automated Server Flash Cache Space Management in a Virtualiza;on Environment Fei Meng Li Zhou Xiaosong Ma Sandeep U3amchandani Deng Liu With VMware during this work Background Virtualization

More information

A Scalable Lock Manager for Multicores

A Scalable Lock Manager for Multicores A Scalable Lock Manager for Multicores Hyungsoo Jung Hyuck Han Alan Fekete NICTA Samsung Electronics University of Sydney Gernot Heiser NICTA Heon Y. Yeom Seoul National University @University of Sydney

More information

Low Overhead Concurrency Control for Partitioned Main Memory Databases. Evan P. C. Jones Daniel J. Abadi Samuel Madden"

Low Overhead Concurrency Control for Partitioned Main Memory Databases. Evan P. C. Jones Daniel J. Abadi Samuel Madden Low Overhead Concurrency Control for Partitioned Main Memory Databases Evan P. C. Jones Daniel J. Abadi Samuel Madden" Banks" Payment Processing" Airline Reservations" E-Commerce" Web 2.0" Problem:" Millions

More information

Information Sharing and User Privacy in the Third-party Identity Management Landscape

Information Sharing and User Privacy in the Third-party Identity Management Landscape Information Sharing and User Privacy in the Third-party Identity Management Landscape Anna Vapen¹, Niklas Carlsson¹, Anirban Mahanti², Nahid Shahmehri¹ ¹Linköping University, Sweden ²NICTA, Australia 2

More information

Introducing the Cray XMT. Petr Konecny May 4 th 2007

Introducing the Cray XMT. Petr Konecny May 4 th 2007 Introducing the Cray XMT Petr Konecny May 4 th 2007 Agenda Origins of the Cray XMT Cray XMT system architecture Cray XT infrastructure Cray Threadstorm processor Shared memory programming model Benefits/drawbacks/solutions

More information

2014 Honeywell Users Group Europe Middle East and Africa. Mobile App Guide

2014 Honeywell Users Group Europe Middle East and Africa. Mobile App Guide 2014 Honeywell Users Group Europe Middle East and Africa Mobile App Guide Introduction Welcome to the 2014 Honeywell Users Group EMEA Conference. This year, we have replaced the printed agenda book with

More information

Synchronization. CS61, Lecture 18. Prof. Stephen Chong November 3, 2011

Synchronization. CS61, Lecture 18. Prof. Stephen Chong November 3, 2011 Synchronization CS61, Lecture 18 Prof. Stephen Chong November 3, 2011 Announcements Assignment 5 Tell us your group by Sunday Nov 6 Due Thursday Nov 17 Talks of interest in next two days Towards Predictable,

More information

BzTree: A High-Performance Latch-free Range Index for Non-Volatile Memory

BzTree: A High-Performance Latch-free Range Index for Non-Volatile Memory BzTree: A High-Performance Latch-free Range Index for Non-Volatile Memory JOY ARULRAJ JUSTIN LEVANDOSKI UMAR FAROOQ MINHAS PER-AKE LARSON Microsoft Research NON-VOLATILE MEMORY [NVM] PERFORMANCE DRAM VOLATILE

More information

CloneCloud: Elastic Execution between Mobile Device and Cloud, Chun et al.

CloneCloud: Elastic Execution between Mobile Device and Cloud, Chun et al. CloneCloud: Elastic Execution between Mobile Device and Cloud, Chun et al. Noah Apthorpe Department of Computer Science Princeton University October 14th, 2015 Noah Apthorpe CloneCloud 1/16 Motivation

More information

CS551 Ad-hoc Routing

CS551 Ad-hoc Routing CS551 Ad-hoc Routing Bill Cheng http://merlot.usc.edu/cs551-f12 1 Mobile Routing Alternatives Why not just assume a base station? good for many cases, but not some (military, disaster recovery, sensor

More information

SECURED SOCIAL TUBE FOR VIDEO SHARING IN OSN SYSTEM

SECURED SOCIAL TUBE FOR VIDEO SHARING IN OSN SYSTEM ABSTRACT: SECURED SOCIAL TUBE FOR VIDEO SHARING IN OSN SYSTEM J.Priyanka 1, P.Rajeswari 2 II-M.E(CS) 1, H.O.D / ECE 2, Dhanalakshmi Srinivasan Engineering College, Perambalur. Recent years have witnessed

More information

EECS 192: Mechatronics Design Lab

EECS 192: Mechatronics Design Lab EECS 192: Mechatronics Design Lab Discussion 9 (Part 2): Embedded Software GSI: Richard Ducky Lin 18 & 19 Feb 2015 (Week 9) 1 Embedded Programming 2 Software Engineering Ducky (UCB EECS) Mechatronics Design

More information

Last Time. Making correct concurrent programs. Maintaining invariants Avoiding deadlocks

Last Time. Making correct concurrent programs. Maintaining invariants Avoiding deadlocks Last Time Making correct concurrent programs Maintaining invariants Avoiding deadlocks Today Power management Hardware capabilities Software management strategies Power and Energy Review Energy is power

More information

Goals. Facebook s Scaling Problem. Scaling Strategy. Facebook Three Layer Architecture. Workload. Memcache as a Service.

Goals. Facebook s Scaling Problem. Scaling Strategy. Facebook Three Layer Architecture. Workload. Memcache as a Service. Goals Memcache as a Service Tom Anderson Rapid application development - Speed of adding new features is paramount Scale Billions of users Every user on FB all the time Performance Low latency for every

More information