Dynamic Mobile Device Integration for Efficient Cross-device Resource Sharing

Size: px
Start display at page:

Download "Dynamic Mobile Device Integration for Efficient Cross-device Resource Sharing"

Transcription

1 : Dynamic Mobile Device Integration for Efficient Cross-device Resource Sharing Dongju Chae 1, Joonsung Kim 2, Gwangmu Lee 2, Hanjun Kim 1, Kyung-Ah Chang 3, Hyogun Lee 3, and Jangwoo Kim 2 1 Dept. of Computer Science and Engineering, POSTECH 2 Dept. of Electrical and Computer Engineering, Seoul National University 3 Samsung Electronics

2 Increasing Number of IoT Devices 2/35

3 Sharing Multi-device Enables Many Services I want to build my home theater Audio Smart Monitoring System? Custom Home Theater Storage Remote Gaming System?, and various custom services. Audio 3/35

4 Index Limitations of existing schemes I/O request forwarding Manual programming : Efficient dynamic resource sharing Evaluation Conclusion 4/35

5 Running Example: Home Theater Program open(storage); load(data); decode(data); render(data); 5/35

6 (1) I/O Request Forwarding - with the example of home theater Phone TV Program open(storage); load(data); decode(data); render(data); Decoded data I/O abstraction layer High Traffic! I/O Resources data Storage I/O Resources Storage 6/35

7 (1) I/O Request Forwarding - with the example of home theater Good load(data); decode(data); render(data); Storage Phone TV Program I/O open(storage); abstraction layer for transparency Easy programming environment Bad Decoded data I/O abstraction layer High Traffic! High network traffic duei/o to Resources unoptimizedi/o datapath Resources data Poor performance Storage 7/35

8 Sender Receiver (2) Manual Programming - with the example of home theater Phone Program TV Program open(storage); load(data); Low Traffic! recv(data); send(data); decode(data); I/O Interface Decoded data I/O Interface render(data); I/O Resources data Storage I/O Resources Storage 8/35

9 Sender Receiver (2) Manual Programming - with the example of home theater Good open(storage); open(storage); load(data); load(data); send(data); decode(data); render(data); Phone Program I/O Interface Phone Program Device-aware task partitioning Low Good performance Traffic! Bad TV Program Decoded data I/O I/O abstraction Interface layer TV recv(data); decode(data); render(data); Hand-tuned multi-device application data High programming effort I/O ResourcesI/O Resources I/O ResourcesI/O Resources Storage Storage Storage Storage 9/35

10 Programmability Design Goals Request Forwarding High Performance & Programmability Manual Programming Performance 10/35

11 Index Limitations of existing schemes : Efficient dynamic resource sharing Key ideas Architecture & Implementation Evaluation Conclusion 11/35

12 Key Ideas of 1) Transparent & Wide Resource Integration Kernel-level resource integration Transparent integration (easy programmability) Sharing multiple types of resources (i.e., CPU, Memory, I/O) Wide resource coverage 2) Resource-aware Dynamic Task Redistribution Contention detection based on resource usage Performance estimation for migration scenarios Optimized performance 12/35

13 Key Ideas of (1/2) 1) Transparent & Wide Resource Integration I want to use remote screen on TV! I/O datapath Local I/O Remote I/O User Kernel I/O Resource I/O Resource Phone TV 13/35

14 Key Ideas of (2/2) 2) Resource-aware Dynamic Task Redistribution I/O datapath Local I/O Remote I/O User High Traffic! Kernel I/O Resource I/O Resource Phone TV 14/35

15 Key Ideas of (2/2) 2) Resource-aware Dynamic Task Redistribution Low Traffic! User I/O datapath Local I/O Remote I/O Kernel I/O Resource I/O Resource Phone TV 15/35

16 Architecture User Migration Selector Migration Selector Kernel Contention Detector Contention Detector (1) Resource Integrator Select Resource Integrator Storage Local Device (Master) Remote Device 16/35

17 Architecture User Migration Selector Migration Selector Kernel (2) Contention Detector Contention Detector Resource Integrator Resource Integrator Storage Local Device (Master) Remote Device 17/35

18 Architecture User Kernel (3) Migration Selector Migration Selector Contention Detector Contention Detector Resource Integrator Resource Integrator Storage Local Device (Master) Remote Device 18/35

19 (1) Resource Integrator (Memory, CPU, I/O) DSM Memory integration Distributed shared memory (DSM) Three perf. optimizations Lazy release consistency model Page-level coherency block Memory prefetching Device Device 19/35

20 (1) Resource Integrator (Memory, CPU, I/O) Migration CPU integration Thread migration DSM Optimizations Thread group granularity Clone-based migration Transparent live migration Device Device 20/35

21 (1) Resource Integrator (Memory, CPU, I/O) Migration I/O resource integration Request forwarding (e.g., device file) DSM /dev/xx Device File /dev/xx Optimizations Data compression Platform-assisted handling Device Resource XX Device 21/35

22 2) Contention Detector Calculate the slowdown of each thread Collect per-thread resource usage (e.g., CPU, network, ) Measure a stall time due to resource access File user Comp kernel Low (fast) High Storage Task Latency Phone TV 22/35

23 2) Contention Detector Calculate the slowdown of each thread Collect per-thread resource usage (e.g., CPU, network, ) Measure a stall time due to resource access File user Comp Contention Low (fast) Network High Storage kernel Task Latency Phone TV 23/35

24 3) Migration Selector Estimate the performance tradeoff after migration Utilize intra-/inter-device information to decide migration targets 1. Intra-device communication 2. Inter-device bandwidth & latency Group 1 Group 2 1MB CPU 2MB Storage Player 5MB 5MB 3. Remote resource status Phone TV 24/35

25 3) Migration Selector Estimate the performance tradeoff after migration Utilize intra-/inter-device information to decide migration targets Calculate the tradeoffs of all possible migration scenarios High traffic! High traffic! High mig. overhead! Scenario #1 Scenario #2 Scenario #3 Low traffic! Scenario #4 Select Target Storage Storage Storage Storage

26 Index Limitations of existing schemes : Efficient dynamic resource sharing Evaluation Scenario #1: Home theater Scenario #2: Smart monitoring Conclusion 26/35

27 Evaluation Setup: Home theater Master Device (Local) Home Theater Large TV (Remote) HQ Speaker (Remote) Android-based smartphones (5.1.1) - Nexus 5 (master) - Nexus 4 (connected to a speaker) Tizen-based smart TV (2.3) Storage Audio Phone (Master) Speaker Audio TV 27/35

28 Frames per second (FPS) Performance Timeline - Resource reconfiguration (local screen remote large screen) 30 Local Remote Large Select a remote large screen Start migration Restore the target performance 6 0 Detect contention & Start analysis Time (sec) /35

29 Frames Per Second Throughput Improvement - Measured FPS for the home theater scenario Achieve (or closely) the target performance goal 30 Ideal (24 FPS) Request Forwarding 240p 360p 480p 720p Video quality (HD) 4x higher 8.3x higher 1080p (Full HD) 29/35

30 Avg. Stall Per Frame (ms) Minimized Network Stall Time - Per-frame latency analysis for each datapath Reduce the exposed network traffic Storage Phone TV Request Forwarding (RF) Storage Phone TV (DM) 50 0 RF DM RF DM RF DM RF DM 240p 480p 720p (HD) Video quality p (Full HD) 30/35

31 Evaluation Setup: Smart Monitoring Smart Monitoring Master Device (Local) Selector Processing Phone Processing Camera Camera Device (Remote) Android-based smartphones (5.1.1) - Nexus 5 (master) - Nexus 5 (camera) x 3 Camera Camera Phone (Master) Camera Phone 31/35

32 Frames Per Second Throughput Improvement - As the camera preview resolution increases Achieve higher throughput than Request Forwarding Ideal (15 FPS) Request Forwarding 176x x x x x480 Preview resolution (pixels) 3x higher 32/35

33 Avg. Frame Latency (ms) Computation Bottleneck - Per-frame detection latency Potential to achieve higher throughput with faster CPUs 400 Network Computation RF DM RF DM RF DM RF DM RF DM 176x x x x x480 Preview resolution (pixels) 33/35

34 Conclusion : Efficient cross-device resource sharing Transparent resource integration for diverse resources Resource-aware dynamic task redistribution Implementation on top of Android/Tizen devices Achieve target performance for multi-device services e.g., 8.2 FPS 24 FPS on the home theater scenario 34/35

35 : Dynamic Mobile Device Integration for Efficient Cross-device Resource Sharing Thank You! Dongju Chae Department of Computer Science & Engineering Pohang University of Science and Technology (POSTECH) 35/35

DynaMix: Dynamic Mobile Device Integration for Efficient Cross-device Resource Sharing

DynaMix: Dynamic Mobile Device Integration for Efficient Cross-device Resource Sharing DynaMix: Dynamic Mobile Device Integration for Efficient Cross-device Resource Sharing Dongju Chae, POSTECH; Joonsung Kim and Gwangmu Lee, Seoul National University; Hanjun Kim, POSTECH; Kyung-Ah Chang

More information

DCS-ctrl: A Fast and Flexible Device-Control Mechanism for Device-Centric Server Architecture

DCS-ctrl: A Fast and Flexible Device-Control Mechanism for Device-Centric Server Architecture DCS-ctrl: A Fast and Flexible ice-control Mechanism for ice-centric Server Architecture Dongup Kwon 1, Jaehyung Ahn 2, Dongju Chae 2, Mohammadamin Ajdari 2, Jaewon Lee 1, Suheon Bae 1, Youngsok Kim 1,

More information

Next-Generation Cloud Platform

Next-Generation Cloud Platform Next-Generation Cloud Platform Jangwoo Kim Jun 24, 2013 E-mail: jangwoo@postech.ac.kr High Performance Computing Lab Department of Computer Science & Engineering Pohang University of Science and Technology

More information

Distributed File Systems Issues. NFS (Network File System) AFS: Namespace. The Andrew File System (AFS) Operating Systems 11/19/2012 CSC 256/456 1

Distributed File Systems Issues. NFS (Network File System) AFS: Namespace. The Andrew File System (AFS) Operating Systems 11/19/2012 CSC 256/456 1 Distributed File Systems Issues NFS (Network File System) Naming and transparency (location transparency versus location independence) Host:local-name Attach remote directories (mount) Single global name

More information

WiZi-Cloud: Application-transparent Dual ZigBee-WiFi Radios for Low Power Internet Access

WiZi-Cloud: Application-transparent Dual ZigBee-WiFi Radios for Low Power Internet Access WiZi-Cloud: Application-transparent Dual ZigBee-WiFi Radios for Low Power Internet Access Tao Jin, Guevara Noubir, Bo Sheng College of Computer and Information Science Northeastern University InfoCom 2011,

More information

StressRight: Finding the Right Stress for Accurate In-development System Evaluation

StressRight: Finding the Right Stress for Accurate In-development System Evaluation StressRight: Finding the Right Stress for Accurate In-development System Evaluation Jaewon Lee 1, Hanhwi Jang 1, Jae-eon Jo 1, Gyu-Hyeon Lee 2, Jangwoo Kim 2 High Performance Computing Lab Pohang University

More information

System Support for Internet of Things

System Support for Internet of Things System Support for Internet of Things Kishore Ramachandran (Kirak Hong - Google, Dave Lillethun, Dushmanta Mohapatra, Steffen Maas, Enrique Saurez Apuy) Overview Motivation Mobile Fog: A Distributed

More information

Integrating CPU and GPU, The ARM Methodology. Edvard Sørgård, Senior Principal Graphics Architect, ARM Ian Rickards, Senior Product Manager, ARM

Integrating CPU and GPU, The ARM Methodology. Edvard Sørgård, Senior Principal Graphics Architect, ARM Ian Rickards, Senior Product Manager, ARM Integrating CPU and GPU, The ARM Methodology Edvard Sørgård, Senior Principal Graphics Architect, ARM Ian Rickards, Senior Product Manager, ARM The ARM Business Model Global leader in the development of

More information

Staged Memory Scheduling

Staged Memory Scheduling Staged Memory Scheduling Rachata Ausavarungnirun, Kevin Chang, Lavanya Subramanian, Gabriel H. Loh*, Onur Mutlu Carnegie Mellon University, *AMD Research June 12 th 2012 Executive Summary Observation:

More information

Computer parallelism Flynn s categories

Computer parallelism Flynn s categories 04 Multi-processors 04.01-04.02 Taxonomy and communication Parallelism Taxonomy Communication alessandro bogliolo isti information science and technology institute 1/9 Computer parallelism Flynn s categories

More information

Scalable Multi-DM642-based MPEG-2 to H.264 Transcoder. Arvind Raman, Sriram Sethuraman Ittiam Systems (Pvt.) Ltd. Bangalore, India

Scalable Multi-DM642-based MPEG-2 to H.264 Transcoder. Arvind Raman, Sriram Sethuraman Ittiam Systems (Pvt.) Ltd. Bangalore, India Scalable Multi-DM642-based MPEG-2 to H.264 Transcoder Arvind Raman, Sriram Sethuraman Ittiam Systems (Pvt.) Ltd. Bangalore, India Outline of Presentation MPEG-2 to H.264 Transcoding Need for a multiprocessor

More information

Multiple Context Processors. Motivation. Coping with Latency. Architectural and Implementation. Multiple-Context Processors.

Multiple Context Processors. Motivation. Coping with Latency. Architectural and Implementation. Multiple-Context Processors. Architectural and Implementation Tradeoffs for Multiple-Context Processors Coping with Latency Two-step approach to managing latency First, reduce latency coherent caches locality optimizations pipeline

More information

Dia: AutoDirective Audio Capturing Through a Synchronized Smartphone Array

Dia: AutoDirective Audio Capturing Through a Synchronized Smartphone Array Dia: AutoDirective Audio Capturing Through a Synchronized Smartphone Array Sanjib Sur Teng Wei and Xinyu Zhang University of Wisconsin - Madison 1 Multimedia applications in smartphones Growing mobile

More information

HV-320 DVB-T FPV TV Transmitter Box Quick Installation Guide

HV-320 DVB-T FPV TV Transmitter Box Quick Installation Guide HV-320 DVB-T FPV TV Transmitter Box Quick Installation Guide PACKAGE CONTENTS 2 FRONT PANEL VIEW 2 BACK PANEL VIEW 2 BOARD VIEW 3 POWER ON 4 CONFIGURE THE TRANSMISSION PARAMETERS 5 BACKUP AND RESTORE DC

More information

Sudhakar Yalamanchili, Georgia Institute of Technology (except as indicated) Active thread Idle thread

Sudhakar Yalamanchili, Georgia Institute of Technology (except as indicated) Active thread Idle thread Intra-Warp Compaction Techniques Sudhakar Yalamanchili, Georgia Institute of Technology (except as indicated) Goal Active thread Idle thread Compaction Compact threads in a warp to coalesce (and eliminate)

More information

ENGN1640: Design of Computing Systems Topic 06: Advanced Processor Design

ENGN1640: Design of Computing Systems Topic 06: Advanced Processor Design ENGN1640: Design of Computing Systems Topic 06: Advanced Processor Design Professor Sherief Reda http://scale.engin.brown.edu Electrical Sciences and Computer Engineering School of Engineering Brown University

More information

Agenda. System Performance Scaling of IBM POWER6 TM Based Servers

Agenda. System Performance Scaling of IBM POWER6 TM Based Servers System Performance Scaling of IBM POWER6 TM Based Servers Jeff Stuecheli Hot Chips 19 August 2007 Agenda Historical background POWER6 TM chip components Interconnect topology Cache Coherence strategies

More information

ApsaraDB for Redis. Product Introduction

ApsaraDB for Redis. Product Introduction ApsaraDB for Redis is compatible with open-source Redis protocol standards and provides persistent memory database services. Based on its high-reliability dual-machine hot standby architecture and seamlessly

More information

Mali GPU acceleration of HEVC and VP9 Decoder

Mali GPU acceleration of HEVC and VP9 Decoder Mali GPU acceleration of HEVC and VP9 Decoder 2 Web Video continues to grow!!! Video accounted for 50% of the mobile traffic in 2012 - Citrix ByteMobile's 4Q 2012 Analytics Report. Globally, IP video traffic

More information

Ultra Low Power GPUs for Wearables

Ultra Low Power GPUs for Wearables Ultra Low Power GPUs for Wearables Georgios Keramidas January 2015 The Company Who we are? Think Silicon is a privately held company founded in 2007. What we do? Development of low power GPU IP semiconductor

More information

S WHAT THE PROFILER IS TELLING YOU: OPTIMIZING GPU KERNELS. Jakob Progsch, Mathias Wagner GTC 2018

S WHAT THE PROFILER IS TELLING YOU: OPTIMIZING GPU KERNELS. Jakob Progsch, Mathias Wagner GTC 2018 S8630 - WHAT THE PROFILER IS TELLING YOU: OPTIMIZING GPU KERNELS Jakob Progsch, Mathias Wagner GTC 2018 1. Know your hardware BEFORE YOU START What are the target machines, how many nodes? Machine-specific

More information

ibench: Quantifying Interference in Datacenter Applications

ibench: Quantifying Interference in Datacenter Applications ibench: Quantifying Interference in Datacenter Applications Christina Delimitrou and Christos Kozyrakis Stanford University IISWC September 23 th 2013 Executive Summary Problem: Increasing utilization

More information

NC B103/210/220-DN. Full HD Fixed Network Camera

NC B103/210/220-DN. Full HD Fixed Network Camera Full HD Fixed Network Camera NC B103/210/220-DN ㆍD1 / 720p / 1080p Resolution ㆍDual Codec (H.264, MJPEG) / Dual Streaming (B103), Triple Streaming (B210/220) ㆍVideo Analytics (Tripzone, Tampering)/ Region

More information

Mali Developer Resources. Kevin Ho ARM Taiwan FAE

Mali Developer Resources. Kevin Ho ARM Taiwan FAE Mali Developer Resources Kevin Ho ARM Taiwan FAE ARM Mali Developer Tools Software Development SDKs for OpenGL ES & OpenCL OpenGL ES Emulators Shader Development Studio Shader Library Asset Creation Texture

More information

Venezia: a Scalable Multicore Subsystem for Multimedia Applications

Venezia: a Scalable Multicore Subsystem for Multimedia Applications Venezia: a Scalable Multicore Subsystem for Multimedia Applications Takashi Miyamori Toshiba Corporation Outline Background Venezia Hardware Architecture Venezia Software Architecture Evaluation Chip and

More information

Profiling and Debugging Games on Mobile Platforms

Profiling and Debugging Games on Mobile Platforms Profiling and Debugging Games on Mobile Platforms Lorenzo Dal Col Senior Software Engineer, Graphics Tools Gamelab 2013, Barcelona 26 th June 2013 Agenda Introduction to Performance Analysis with ARM DS-5

More information

I/O Buffering and Streaming

I/O Buffering and Streaming I/O Buffering and Streaming I/O Buffering and Caching I/O accesses are reads or writes (e.g., to files) Application access is arbitary (offset, len) Convert accesses to read/write of fixed-size blocks

More information

Realtime Issues. Real-Time CPU Scheduling. Implementing Real-Time Systems. Operating Systems CSC 256/456 1

Realtime Issues. Real-Time CPU Scheduling. Implementing Real-Time Systems. Operating Systems CSC 256/456 1 Implementing Real-Time Systems In general, real-time operating systems must provide: Realtime Issues (1) Preemptive, priority-based scheduling (2) Preemptive kernels (3) Latency must be minimized CSC 2/456

More information

Lecture 27 DASH (Dynamic Adaptive Streaming over HTTP)

Lecture 27 DASH (Dynamic Adaptive Streaming over HTTP) CS 414 Multimedia Systems Design Lecture 27 DASH (Dynamic Adaptive Streaming over HTTP) Klara Nahrstedt Spring 2012 Administrative MP2 posted MP2 Deadline April 7, Saturday, 5pm. APPLICATION Internet Multimedia

More information

A LOW-COMPLEXITY AND LOSSLESS REFERENCE FRAME ENCODER ALGORITHM FOR VIDEO CODING

A LOW-COMPLEXITY AND LOSSLESS REFERENCE FRAME ENCODER ALGORITHM FOR VIDEO CODING 2014 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP) A LOW-COMPLEXITY AND LOSSLESS REFERENCE FRAME ENCODER ALGORITHM FOR VIDEO CODING Dieison Silveira, Guilherme Povala,

More information

Increasing Performance for PowerCenter Sessions that Use Partitions

Increasing Performance for PowerCenter Sessions that Use Partitions Increasing Performance for PowerCenter Sessions that Use Partitions 1993-2015 Informatica LLC. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying,

More information

Sandor Heman, Niels Nes, Peter Boncz. Dynamic Bandwidth Sharing. Cooperative Scans: Marcin Zukowski. CWI, Amsterdam VLDB 2007.

Sandor Heman, Niels Nes, Peter Boncz. Dynamic Bandwidth Sharing. Cooperative Scans: Marcin Zukowski. CWI, Amsterdam VLDB 2007. Cooperative Scans: Dynamic Bandwidth Sharing in a DBMS Marcin Zukowski Sandor Heman, Niels Nes, Peter Boncz CWI, Amsterdam VLDB 2007 Outline Scans in a DBMS Cooperative Scans Benchmarks DSM version VLDB,

More information

Mobile Edge Computing for 5G: The Communication Perspective

Mobile Edge Computing for 5G: The Communication Perspective Mobile Edge Computing for 5G: The Communication Perspective Kaibin Huang Dept. of Electrical & Electronic Engineering The University of Hong Kong Hong Kong Joint Work with Yuyi Mao (HKUST), Changsheng

More information

Introduction I/O 1. I/O devices can be characterized by Behavior: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec

Introduction I/O 1. I/O devices can be characterized by Behavior: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec Introduction I/O 1 I/O devices can be characterized by Behavior: input, output, storage Partner: human or machine Data rate: bytes/sec, transfers/sec I/O bus connections I/O Device Summary I/O 2 I/O System

More information

Fall 2011 Prof. Hyesoon Kim. Thanks to Prof. Loh & Prof. Prvulovic

Fall 2011 Prof. Hyesoon Kim. Thanks to Prof. Loh & Prof. Prvulovic Fall 2011 Prof. Hyesoon Kim Thanks to Prof. Loh & Prof. Prvulovic Flynn s Taxonomy of Parallel Machines How many Instruction streams? How many Data streams? SISD: Single I Stream, Single D Stream A uniprocessor

More information

Voice Over LTE (VoLTE) Technology. July 23, 2018 Tim Burke

Voice Over LTE (VoLTE) Technology. July 23, 2018 Tim Burke Voice Over LTE (VoLTE) Technology July 23, 2018 Tim Burke Range of Frequencies Humans Can Hear 20,000 Hz 20 Hz Human Hearing 8,000 Hz 10,000 Hz 14,000 Hz 12,000 Hz Range of Frequencies Designed For Entertainment

More information

A Row Buffer Locality-Aware Caching Policy for Hybrid Memories. HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu

A Row Buffer Locality-Aware Caching Policy for Hybrid Memories. HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu A Row Buffer Locality-Aware Caching Policy for Hybrid Memories HanBin Yoon Justin Meza Rachata Ausavarungnirun Rachael Harding Onur Mutlu Overview Emerging memories such as PCM offer higher density than

More information

Multimedia. Sape J. Mullender. Huygens Systems Research Laboratory Universiteit Twente Enschede

Multimedia. Sape J. Mullender. Huygens Systems Research Laboratory Universiteit Twente Enschede Multimedia Sape J. Mullender Huygens Systems Research Laboratory Universiteit Twente Enschede 1 What is Multimedia? Videophone CD Interactive One socket, many services (video) phone TV video on demand

More information

Under The Hood: Performance Tuning With Tizen. Ravi Sankar Guntur

Under The Hood: Performance Tuning With Tizen. Ravi Sankar Guntur Under The Hood: Performance Tuning With Tizen Ravi Sankar Guntur How to write a Tizen App Tools already available in IDE v2.3 Dynamic Analyzer Valgrind 2 What s NEXT? Want to optimize my application App

More information

Workload Characterization and Optimization of TPC-H Queries on Apache Spark

Workload Characterization and Optimization of TPC-H Queries on Apache Spark Workload Characterization and Optimization of TPC-H Queries on Apache Spark Tatsuhiro Chiba and Tamiya Onodera IBM Research - Tokyo April. 17-19, 216 IEEE ISPASS 216 @ Uppsala, Sweden Overview IBM Research

More information

HD Audio Converter Incorporates HDMI technology

HD Audio Converter Incorporates HDMI technology HD Audio Converter Incorporates HDMI technology HDMI input HDMI, Optical, Coaxial and 3.5mm audio output MODEL : HA-110SA OWNERS MANUAL 1 INTRODUCTION Congratulations on your purchase of the HD Audio Converter.

More information

CSCI-GA Multicore Processors: Architecture & Programming Lecture 10: Heterogeneous Multicore

CSCI-GA Multicore Processors: Architecture & Programming Lecture 10: Heterogeneous Multicore CSCI-GA.3033-012 Multicore Processors: Architecture & Programming Lecture 10: Heterogeneous Multicore Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com Status Quo Previously, CPU vendors

More information

D E N A L I S T O R A G E I N T E R F A C E. Laura Caulfield Senior Software Engineer. Arie van der Hoeven Principal Program Manager

D E N A L I S T O R A G E I N T E R F A C E. Laura Caulfield Senior Software Engineer. Arie van der Hoeven Principal Program Manager 1 T HE D E N A L I N E X T - G E N E R A T I O N H I G H - D E N S I T Y S T O R A G E I N T E R F A C E Laura Caulfield Senior Software Engineer Arie van der Hoeven Principal Program Manager Outline Technology

More information

Profiling and Workflow

Profiling and Workflow Profiling and Workflow Preben N. Olsen University of Oslo and Simula Research Laboratory preben@simula.no September 13, 2013 1 / 34 Agenda 1 Introduction What? Why? How? 2 Profiling Tracing Performance

More information

Application-Specific Configuration Selection in the Cloud: Impact of Provider Policy and Potential of Systematic Testing

Application-Specific Configuration Selection in the Cloud: Impact of Provider Policy and Potential of Systematic Testing Application-Specific Configuration Selection in the Cloud: Impact of Provider Policy and Potential of Systematic Testing Mohammad Hajjat +, Ruiqi Liu*, Yiyang Chang +, T.S. Eugene Ng*, Sanjay Rao + + Purdue

More information

GreenBag: Energy-efficient Bandwidth Aggregation For Real-time Streaming in Heterogeneous Mobile Wireless Networks

GreenBag: Energy-efficient Bandwidth Aggregation For Real-time Streaming in Heterogeneous Mobile Wireless Networks GreenBag: Energy-efficient Bandwidth Aggregation For Real-time Streaming in Heterogeneous Mobile Wireless Networks Duc Hoang Bui, Kilho Lee, Sangeun Oh, Insik Shin Dept. of Computer Science KAIST, South

More information

Profiling: Understand Your Application

Profiling: Understand Your Application Profiling: Understand Your Application Michal Merta michal.merta@vsb.cz 1st of March 2018 Agenda Hardware events based sampling Some fundamental bottlenecks Overview of profiling tools perf tools Intel

More information

RPT: Re-architecting Loss Protection for Content-Aware Networks

RPT: Re-architecting Loss Protection for Content-Aware Networks RPT: Re-architecting Loss Protection for Content-Aware Networks Dongsu Han, Ashok Anand ǂ, Aditya Akella ǂ, and Srinivasan Seshan Carnegie Mellon University ǂ University of Wisconsin-Madison Motivation:

More information

Review: Creating a Parallel Program. Programming for Performance

Review: Creating a Parallel Program. Programming for Performance Review: Creating a Parallel Program Can be done by programmer, compiler, run-time system or OS Steps for creating parallel program Decomposition Assignment of tasks to processes Orchestration Mapping (C)

More information

Datacenter Wide- area Enterprise

Datacenter Wide- area Enterprise Datacenter Wide- area Enterprise Client LOAD- BALANCER Can t choose path : ( Servers Outline and goals A new architecture for distributed load-balancing joint (server, path) selection Demonstrate a nation-wide

More information

Reliable Stream Analysis on the Internet of Things

Reliable Stream Analysis on the Internet of Things Reliable Stream Analysis on the Internet of Things ECE6102 Course Project Team IoT Submitted April 30, 2014 1 1. Introduction Team IoT is interested in developing a distributed system that supports live

More information

Distributed Shared Memory

Distributed Shared Memory Distributed Shared Memory History, fundamentals and a few examples Coming up The Purpose of DSM Research Distributed Shared Memory Models Distributed Shared Memory Timeline Three example DSM Systems The

More information

The Efficient Point to Point or Multipoint Live Video Streaming in Wireless Devices Mahesh 1 R Rajkumar 2 Dr M V Sudhamani 3

The Efficient Point to Point or Multipoint Live Video Streaming in Wireless Devices Mahesh 1 R Rajkumar 2 Dr M V Sudhamani 3 IJSRD - International Journal for Scientific Research & Development Vol. 3, Issue 04, 2015 ISSN (online): 2321-0613 The Efficient Point to Point or Multipoint Live Video Streaming in Wireless Devices Mahesh

More information

Ubiquitous and Mobile Computing CS 525M: Virtually Unifying Personal Storage for Fast and Pervasive Data Accesses

Ubiquitous and Mobile Computing CS 525M: Virtually Unifying Personal Storage for Fast and Pervasive Data Accesses Ubiquitous and Mobile Computing CS 525M: Virtually Unifying Personal Storage for Fast and Pervasive Data Accesses Pengfei Tang Computer Science Dept. Worcester Polytechnic Institute (WPI) Introduction:

More information

Virtual Machines. 2 Disco: Running Commodity Operating Systems on Scalable Multiprocessors([1])

Virtual Machines. 2 Disco: Running Commodity Operating Systems on Scalable Multiprocessors([1]) EE392C: Advanced Topics in Computer Architecture Lecture #10 Polymorphic Processors Stanford University Thursday, 8 May 2003 Virtual Machines Lecture #10: Thursday, 1 May 2003 Lecturer: Jayanth Gummaraju,

More information

Transmit Smart with Transmit Beamforming

Transmit Smart with Transmit Beamforming WHITE PAPER Transmit Smart with Transmit Beamforming Bhama Vemuru Senior Technical Marketing Engineering Marvell November 2011 www.marvell.com Introduction One of the challenges often faced with Wi-Fi

More information

Next Generation Visual Computing

Next Generation Visual Computing Next Generation Visual Computing (Making GPU Computing a Reality with Mali ) Taipei, 18 June 2013 Roberto Mijat ARM Addressing Computational Challenges Trends Growing display sizes and resolutions Increasing

More information

NUMA-Aware Data-Transfer Measurements for Power/NVLink Multi-GPU Systems

NUMA-Aware Data-Transfer Measurements for Power/NVLink Multi-GPU Systems NUMA-Aware Data-Transfer Measurements for Power/NVLink Multi-GPU Systems Carl Pearson 1, I-Hsin Chung 2, Zehra Sura 2, Wen-Mei Hwu 1, and Jinjun Xiong 2 1 University of Illinois Urbana-Champaign, Urbana

More information

CUDA OPTIMIZATION WITH NVIDIA NSIGHT ECLIPSE EDITION

CUDA OPTIMIZATION WITH NVIDIA NSIGHT ECLIPSE EDITION CUDA OPTIMIZATION WITH NVIDIA NSIGHT ECLIPSE EDITION WHAT YOU WILL LEARN An iterative method to optimize your GPU code Some common bottlenecks to look out for Performance diagnostics with NVIDIA Nsight

More information

Impact of TCP Window Size on a File Transfer

Impact of TCP Window Size on a File Transfer Impact of TCP Window Size on a File Transfer Introduction This example shows how ACE diagnoses and visualizes application and network problems; it is not a step-by-step tutorial. If you have experience

More information

IP Video Phone on DM64x

IP Video Phone on DM64x IP Video Phone on DM64x Sriram Sethuraman Ittiam Systems Pvt. Ltd., Bangalore Acknowledgments to: Ittiam AV Systems and VVOIP Teams Video Phone Brief history Over IP New Markets Suitability of DM64x Solution

More information

Video-Aware Wireless Networks (VAWN) Final Meeting January 23, 2014

Video-Aware Wireless Networks (VAWN) Final Meeting January 23, 2014 Video-Aware Wireless Networks (VAWN) Final Meeting January 23, 2014 1/26 ! Real-time Video Transmission! Challenges and Opportunities! Lessons Learned for Real-time Video! Mitigating Losses in Scalable

More information

Designing Next-Generation Data- Centers with Advanced Communication Protocols and Systems Services. Presented by: Jitong Chen

Designing Next-Generation Data- Centers with Advanced Communication Protocols and Systems Services. Presented by: Jitong Chen Designing Next-Generation Data- Centers with Advanced Communication Protocols and Systems Services Presented by: Jitong Chen Outline Architecture of Web-based Data Center Three-Stage framework to benefit

More information

Experience with MultiAgent-Based Distributed Service Composition

Experience with MultiAgent-Based Distributed Service Composition FIWC 2009, SNU Feb. 24, 2009 Experience with Multi-Based Distributed Composition Sang Woo Han and JongWon Kim {swhan, jongwon}@nm.gist.ac.kr Networked Media Lab., Dept. of Information and Communications

More information

15-740/ Computer Architecture Lecture 20: Main Memory II. Prof. Onur Mutlu Carnegie Mellon University

15-740/ Computer Architecture Lecture 20: Main Memory II. Prof. Onur Mutlu Carnegie Mellon University 15-740/18-740 Computer Architecture Lecture 20: Main Memory II Prof. Onur Mutlu Carnegie Mellon University Today SRAM vs. DRAM Interleaving/Banking DRAM Microarchitecture Memory controller Memory buses

More information

OS-caused Long JVM Pauses - Deep Dive and Solutions

OS-caused Long JVM Pauses - Deep Dive and Solutions OS-caused Long JVM Pauses - Deep Dive and Solutions Zhenyun Zhuang LinkedIn Corp., Mountain View, California, USA https://www.linkedin.com/in/zhenyun Zhenyun@gmail.com 2016-4-21 Outline q Introduction

More information

Samsung Galaxy S3 Repair Video Format Support Movie

Samsung Galaxy S3 Repair Video Format Support Movie Samsung Galaxy S3 Repair Video Format Support Movie Unable to find the latest update for KitKat on Samsung Galaxy Tab 3 SM T210 Their technicians should be able to fix it or replace it.. Unable to play

More information

JEDEC Mobile Forum 2014

JEDEC Mobile Forum 2014 How Storage Solutions Are Accelerating the Mobile Revolution Stephen Lum, Mobile Memory Product Marketing Hank Lai, Mobile Memory Product Planning Samsung Electronics, AHQ JEDEC Mobile Forum 2014 Copyright

More information

Final Lecture. A few minutes to wrap up and add some perspective

Final Lecture. A few minutes to wrap up and add some perspective Final Lecture A few minutes to wrap up and add some perspective 1 2 Instant replay The quarter was split into roughly three parts and a coda. The 1st part covered instruction set architectures the connection

More information

Rocksteady: Fast Migration for Low-Latency In-memory Storage. Chinmay Kulkarni, Aniraj Kesavan, Tian Zhang, Robert Ricci, Ryan Stutsman

Rocksteady: Fast Migration for Low-Latency In-memory Storage. Chinmay Kulkarni, Aniraj Kesavan, Tian Zhang, Robert Ricci, Ryan Stutsman Rocksteady: Fast Migration for Low-Latency In-memory Storage Chinmay Kulkarni, niraj Kesavan, Tian Zhang, Robert Ricci, Ryan Stutsman 1 Introduction Distributed low-latency in-memory key-value stores are

More information

Advanced d Instruction Level Parallelism. Computer Systems Laboratory Sungkyunkwan University

Advanced d Instruction Level Parallelism. Computer Systems Laboratory Sungkyunkwan University Advanced d Instruction ti Level Parallelism Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu ILP Instruction-Level Parallelism (ILP) Pipelining:

More information

Flash-Conscious Cache Population for Enterprise Database Workloads

Flash-Conscious Cache Population for Enterprise Database Workloads IBM Research ADMS 214 1 st September 214 Flash-Conscious Cache Population for Enterprise Database Workloads Hyojun Kim, Ioannis Koltsidas, Nikolas Ioannou, Sangeetha Seshadri, Paul Muench, Clem Dickey,

More information

Architectural Musings

Architectural Musings Architectural Musings Rethinking Computer Systems Architecture & Evaluation Christopher Vick cvick@qti.qualcomm.com March 23, 2014 1 Introduction Vision Talk How should we analyze, reason about and evaluate

More information

Parallelizing Inline Data Reduction Operations for Primary Storage Systems

Parallelizing Inline Data Reduction Operations for Primary Storage Systems Parallelizing Inline Data Reduction Operations for Primary Storage Systems Jeonghyeon Ma ( ) and Chanik Park Department of Computer Science and Engineering, POSTECH, Pohang, South Korea {doitnow0415,cipark}@postech.ac.kr

More information

Accelerate Network Protocol Stack Performance and Adoption in the Cloud Networking via DMM

Accelerate Network Protocol Stack Performance and Adoption in the Cloud Networking via DMM Accelerate Network Protocol Stack Performance and Adoption in the Cloud Networking via DMM Waterman Cao Senior Researcher Cloud Networking Lab, Huawei AGENDA 01 02 03 Overview What we face DMM Overview

More information

Multimedia Systems 2011/2012

Multimedia Systems 2011/2012 Multimedia Systems 2011/2012 System Architecture Prof. Dr. Paul Müller University of Kaiserslautern Department of Computer Science Integrated Communication Systems ICSY http://www.icsy.de Sitemap 2 Hardware

More information

Collecting OpenCL*-related Metrics with Intel Graphics Performance Analyzers

Collecting OpenCL*-related Metrics with Intel Graphics Performance Analyzers Collecting OpenCL*-related Metrics with Intel Graphics Performance Analyzers Collecting Important OpenCL*-related Metrics with Intel GPA System Analyzer Introduction Intel SDK for OpenCL* Applications

More information

PowerVR GPU IP from Wearables to Servers. Kristof Beets Director of Business Development May 2015

PowerVR GPU IP from Wearables to Servers. Kristof Beets Director of Business Development May 2015 PowerVR GPU IP from Wearables to Servers Kristof Beets Director of Business Development May 2015 www.imgtec.com Expanding embedded GPU market opportunities Huge range of market opportunities equates to

More information

Hybrid Implementation of 3D Kirchhoff Migration

Hybrid Implementation of 3D Kirchhoff Migration Hybrid Implementation of 3D Kirchhoff Migration Max Grossman, Mauricio Araya-Polo, Gladys Gonzalez GTC, San Jose March 19, 2013 Agenda 1. Motivation 2. The Problem at Hand 3. Solution Strategy 4. GPU Implementation

More information

2 x 3.5-inch SATA 6Gb/s, SATA 3Gb/s hard drive. 1. The standard system is shipped without HDD. 2. For the HDD compatibility list, please

2 x 3.5-inch SATA 6Gb/s, SATA 3Gb/s hard drive. 1. The standard system is shipped without HDD. 2. For the HDD compatibility list, please VS-2112 Pro+ Hardware Spec. Processor Memory HDD Capacity Dual-core Intel processor 4GB RAM 2 x 3.5-inch SATA 6Gb/s, SATA 3Gb/s hard drive NOTE: 1. The standard system is shipped without HDD. 2. For the

More information

Managing Array of SSDs When the Storage Device is No Longer the Performance Bottleneck

Managing Array of SSDs When the Storage Device is No Longer the Performance Bottleneck Managing Array of Ds When the torage Device is No Longer the Performance Bottleneck Byung. Kim, Jaeho Kim, am H. Noh UNIT (Ulsan National Institute of cience & Technology) Outline Motivation & Observation

More information

ND9322P ND9424P. H CH/16-CH Embedded Plug & Play NVR

ND9322P ND9424P. H CH/16-CH Embedded Plug & Play NVR ND9322P ND9424P H.265 8-CH/16-CH Embedded Plug & Play NVR H.265/H.264 4K Display 8/16 x PoE Port RAID 0, 1 Fisheye Dewarp Trend Micro IoT Security VIVOCloud VIVOTEK s ND9322P/ND9424P is a new generation

More information

High-Performance Data Loading and Augmentation for Deep Neural Network Training

High-Performance Data Loading and Augmentation for Deep Neural Network Training High-Performance Data Loading and Augmentation for Deep Neural Network Training Trevor Gale tgale@ece.neu.edu Steven Eliuk steven.eliuk@gmail.com Cameron Upright c.upright@samsung.com Roadmap 1. The General-Purpose

More information

Ubiquitous, lightweight NFV in the age of the Internet of Things

Ubiquitous, lightweight NFV in the age of the Internet of Things Ubiquitous, lightweight NFV in the age of the Internet of Things Richard Cziva - University of Glasgow, United Kingdom Richard.Cziva@glasgow.ac.uk NFW WORLD CONGRESS, San Jose, CA, US 19/04/2016 About

More information

Piccolo. Fast, Distributed Programs with Partitioned Tables. Presenter: Wu, Weiyi Yale University. Saturday, October 15,

Piccolo. Fast, Distributed Programs with Partitioned Tables. Presenter: Wu, Weiyi Yale University. Saturday, October 15, Piccolo Fast, Distributed Programs with Partitioned Tables 1 Presenter: Wu, Weiyi Yale University Outline Background Intuition Design Evaluation Future Work 2 Outline Background Intuition Design Evaluation

More information

HyperFlow A Distributed Control Plane for OpenFlow

HyperFlow A Distributed Control Plane for OpenFlow HyperFlow A Distributed Control Plane for OpenFlow Amin Tootoonchian Yashar Ganjali System and Networking Group Department of Computer Science University of Toronto Brief Overview of OpenFlow Root cause

More information

Model: LT-122-PCIE For PCI Express

Model: LT-122-PCIE For PCI Express Model: LT-122-PCIE For PCI Express Data Sheet JUNE 2014 Page 1 Introduction... 3 Board Dimensions... 4 Input Video Connections... 5 Host bus connectivity... 6 Functional description... 7 Video Front-end...

More information

Introducing the Cray XMT. Petr Konecny May 4 th 2007

Introducing the Cray XMT. Petr Konecny May 4 th 2007 Introducing the Cray XMT Petr Konecny May 4 th 2007 Agenda Origins of the Cray XMT Cray XMT system architecture Cray XT infrastructure Cray Threadstorm processor Shared memory programming model Benefits/drawbacks/solutions

More information

15-740/ Computer Architecture, Fall 2011 Midterm Exam II

15-740/ Computer Architecture, Fall 2011 Midterm Exam II 15-740/18-740 Computer Architecture, Fall 2011 Midterm Exam II Instructor: Onur Mutlu Teaching Assistants: Justin Meza, Yoongu Kim Date: December 2, 2011 Name: Instructions: Problem I (69 points) : Problem

More information

Multimedia Decoder Using the Nios II Processor

Multimedia Decoder Using the Nios II Processor Multimedia Decoder Using the Nios II Processor Third Prize Multimedia Decoder Using the Nios II Processor Institution: Participants: Instructor: Indian Institute of Science Mythri Alle, Naresh K. V., Svatantra

More information

Increasing Performance of Existing Oracle RAC up to 10X

Increasing Performance of Existing Oracle RAC up to 10X Increasing Performance of Existing Oracle RAC up to 10X Prasad Pammidimukkala www.gridironsystems.com 1 The Problem Data can be both Big and Fast Processing large datasets creates high bandwidth demand

More information

A Performance Modeling and Simulation Approach to Software Defined Radio

A Performance Modeling and Simulation Approach to Software Defined Radio A Performance Modeling and Simulation Approach to Software Defined Radio OMG Software-Based Communications (SBC) Workshop San Diego, CA - August, 2005 Shawkang Wu & Long Ho Integrated Defense Systems The

More information

Storage. Hwansoo Han

Storage. Hwansoo Han Storage Hwansoo Han I/O Devices I/O devices can be characterized by Behavior: input, out, storage Partner: human or machine Data rate: bytes/sec, transfers/sec I/O bus connections 2 I/O System Characteristics

More information

Milestone Solution Partner IT Infrastructure Components Certification Report

Milestone Solution Partner IT Infrastructure Components Certification Report Milestone Solution Partner IT Infrastructure Components Certification Report Dell MD3860i Storage Array Multi-Server 1050 Camera Test Case 4-2-2016 Table of Contents Executive Summary:... 3 Abstract...

More information

6 x 3.5-inch SATA 6Gb/s, SATA 3Gb/s hard drive. 1. The standard system is shipped without HDD. 2. For the HDD compatibility list, please visit :

6 x 3.5-inch SATA 6Gb/s, SATA 3Gb/s hard drive. 1. The standard system is shipped without HDD. 2. For the HDD compatibility list, please visit : VS-6112 Pro+ Hardware Spec. Processor Memory HDD Capacity Dual-core Intel processor 4GB RAM 6 x 3.5-inch SATA 6Gb/s, SATA 3Gb/s hard drive NOTE: 1. The standard system is shipped without HDD. 2. For the

More information

Deployment Planning and Optimization for Big Data & Cloud Storage Systems

Deployment Planning and Optimization for Big Data & Cloud Storage Systems Deployment Planning and Optimization for Big Data & Cloud Storage Systems Bianny Bian Intel Corporation Outline System Planning Challenges Storage System Modeling w/ Intel CoFluent Studio Simulation Methodology

More information

NUMA replicated pagecache for Linux

NUMA replicated pagecache for Linux NUMA replicated pagecache for Linux Nick Piggin SuSE Labs January 27, 2008 0-0 Talk outline I will cover the following areas: Give some NUMA background information Introduce some of Linux s NUMA optimisations

More information

SMCCSE: PaaS Platform for processing large amounts of social media

SMCCSE: PaaS Platform for processing large amounts of social media KSII The first International Conference on Internet (ICONI) 2011, December 2011 1 Copyright c 2011 KSII SMCCSE: PaaS Platform for processing large amounts of social media Myoungjin Kim 1, Hanku Lee 2 and

More information

Datacenter Wide- area Enterprise

Datacenter Wide- area Enterprise Datacenter Wide- area Enterprise Client LOAD- BALANCER Can t choose path : ( Servers Outline and goals A new architecture for distributed load-balancing joint (server, path) selection Demonstrate a nation-wide

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff and Shun Tak Leung Google* Shivesh Kumar Sharma fl4164@wayne.edu Fall 2015 004395771 Overview Google file system is a scalable distributed file system

More information