Performance Analysis of Virtual Environments

Size: px
Start display at page:

Download "Performance Analysis of Virtual Environments"

Transcription

1 Performance Analysis of Virtual Environments Nikhil V Mishrikoti (nikmys@cs.utah.edu) Advisor: Dr. Rob Ricci Co-Advisor: Anton Burtsev 1

2 Introduction Motivation Virtual Machines (VMs) becoming pervasive in data centers and academic institutions. Honouring SLAs. Promised vs Actual. Quantify impact of Virtualization. Making sense of performance data. 2

3 Introduction Project Goals Empower user to investigate performance problems with as little inertia as possible. 3

4 Introduction Project Framework to build tools for performing fine grained analysis of, resource utilization overheads and performance bottlenecks, in virtual setups. 4

5 Agenda Introduction Xen Overview Data Collection Analysis Framework Analysis Algorithms Composibility 5

6 Agenda Introduction Xen Overview Data Collection Analysis Framework Analysis Algorithms Composibility 6

7 Xen Overview Dom 0 Dom 1... Dom U Admin VM Xen Hardware Open source. Widely used. Ex Amazon EC2 7

8 Agenda Introduction Xen Overview Data Collection Analysis Framework Analysis Algorithms Composibility 8

9 Data Collection Xentrace - Overview Xentrace is a lightweight tracing utility that collects hypervisor and domain level events. Ships with Xen. Dom 0 Dom 1 Dom n Xentrace Domain Events... Domain Events Hypervisor Events Xen Hardware Xentrace logs HDD 9

10 Data Collection Xentrace - Advantages Widely available since it ships with Xen. Easily extensible. 10

11 Data Collection Xentrace - Data Not originally intended for performance. Xentrace collects enormous amounts of raw information. E.x: data collected during a disk intensive load for 1 minute exceeds 700 MB Hence, chose as the source of performance data. 11

12 Data Collection Xentrace - Details Event masks to selectively capture event data. Log data is in binary format. Additional events not provided by Xentrace, can be manually added and collected by inserting trace macros in Xen or domain source. 12

13 Data Collection Xentrace - Event Format uint64 uint32 uint32 uint32 CPU tsc ns event_id d0 d1 d2 d3 d4 d5 Physical CPU id Optional trace data Timestamps - CPU clock cycles & nanoseconds Trace Event type 13

14 Data Collection Xentrace - Limitations xentrace_format : binary to text. E.x: 700 MB log has 20+ million lines of text Very difficult to manually peruse and, identify performance problems. gain high level overview of performance. 14

15 Agenda Introduction Xen Overview Data Collection Analysis Framework Analysis Algorithms Composibility 15

16 Analysis Framework Architecture Dom 0 Dom 1... Dom n Xentrace Domain Events Domain Events Hypervisor Events Xen Hardware Xentrace logs HDD Reader Analyses 16

17 Analysis Framework Overview Implementation split in two components. Reader: Parses binary log data offline and passes C - style structs to Analyses component. Analyses: Algorithms consisting a group of handlers for different event types. Generate high level performance metrics like CPU utilization, disk i/o performance etc. 17

18 Introduction Xen Overview Data Collection Analysis Framework Reader Analyses Analysis Algorithms Composibility Agenda 18

19 Analysis Framework Reader - Caveats Two caveats make parsing non-trivial. Events collected in logs not completely ordered. 19

20 Analysis Framework Reader - Unordered logs CPU 0 20

21 Analysis Framework Reader - Unordered logs CPU 0 ev1 ev2 ev3 ev4 ev5 ev6 Time 21

22 Analysis Framework Reader - Unordered logs CPU 0 ev1 ev2 ev3 ev4 ev5 ev6 CPU 1 ev1 ev2 ev3 ev4 ev5 ev6 CPU 2 ev1 ev2 ev3 ev4 ev5 ev6 Time 22

23 Analysis Framework Reader - Unordered logs CPU 0 ev1 ev2 ev3 ev4 ev5 ev6 ev1 ev1 ev2 Ordered ev1 CPU 1 ev1 ev2 ev3 ev4 ev5 ev6 ev2 ev2 ev3 CPU 2 ev1 ev2 ev3 ev4 ev5 ev6 ev4 ev3 ev3 ev4 ev5 Time ev5 ev6 23 ev4 ev5

24 Analysis Framework Reader - Unordered logs CPU 0 CPU 1 CPU 2 ev1 ev2 ev3 ev4 ev5 ev6 ev1 ev2 ev3 ev4 ev5 ev6 ev1 ev2 ev3 ev4 ev5 ev6 Time ev1 ev1 ev2 ev1 ev2 ev2 ev3 ev4 ev3 ev3 ev4 ev5 ev5 Ordered Log ev1 ev2 ev3 ev4 ev5 ev6 ev1 ev2 ev3 ev4 ev5 ev6 ev1 ev2 ev3 ev6 ev4 24 ev4 ev5 ev5 ev6

25 Analysis Framework Reader - Unordered logs Unordered logs ev1 ev2 ev3 ev4 ev5 ev6 ev1 ev2 Sort? ev1 ev1 ev2 ev1 ev2 ev2 ev3 Ordered logs ev3 ev4 ev4 ev5 ev3 ev6 ev3 ev1 ev4 ev2 ev5 ev3 ev4 ev5 ev6 25 ev5 ev6 ev4 ev5

26 Analysis Framework Reader - Unordered logs Unordered large logs ev1 ev2 ev3 ev4 ev5 ev6 External Sort? ev1 ev2 ev3 ev4 ev5 ev6 d-way Merge Sort ev1 ev1 ev2 ev1 ev2 ev2 ev3 ev4 ev3 ev3 Ordered large logs ev1 ev2 ev4 ev5 ev3 ev4 ev5 ev6 26 ev5 ev6 ev4 ev5

27 Analysis Framework Reader - Unordered logs Unordered large logs ev1 ev2 ev3 ev4 ev5 ev6 ev1 ev2 d-way Merge ev1 ev1 ev2 ev1 ev2 ev2 ev3 Ordered large logs ev3 ev4 ev4 ev5 ev3 ev6 ev3 ev1 ev4 ev2 ev5 ev3 ev4 ev5 ev6 Min-Heap 27 ev5 ev6 ev4 ev5

28 Analysis Framework Reader - Caveats Two caveats make parsing non-trivial. Events collected in logs not completely ordered. Loss of events during log collection. 28

29 Analysis Framework Reader - Lost Records Events come in faster than they can be flushed to disk. Xentrace inserts a lost_record event in the logs. Interferes with analysis esp. time sensitive. ev1 ev1 ev2 ev1 ev2 ev3 ev4 ev5 ev5 ev6 29 ev4 ev5

30 Analysis Framework Reader - Lost Records Tried different approaches. Increase buffer size. Use Event masks when possible. Fix Xentrace bug. Treat lost_records as just another event. It s handler notifies other event handlers in execution of its occurrence. They deal with it appropriately (discarding analysis, ignoring it completely etc.) 30

31 Introduction Xen Overview Data Collection Analysis Framework Reader Analyses Analysis Algorithms Composibility Agenda 31

32 Analyses Each tool s analysis logic is made up of Event Handlers. Event Handlers registered with Reader. Handler needs 3 methods written by the user. Initialize Handle Finalize Ex: Count of Events. Initialize count to 0, increment count, print count at the end. 32

33 Analyses Architecture Xentrace logs Reader Handlers Parse Xentrace records ev_id_1 ev_handler_1(..) Call Event Handlers ev_id_2 ev_id_3 ev_handler_2(..) Finalizers ev_handler_3(..) Free Handlers 33

34 Introduction Xen Overview Data Collection Analysis Framework Reader Analyses Analysis Algorithms Composibility Agenda 34

35 Analysis Algorithms Reasoning about performance in virtual environments is not always straightforward. The user has to follow his instinct or data from another analysis. 35

36 Analysis Algorithms Motivation Quantify phenomena observed on virtual setups. Utilization Saturation Errors (USE) methodology [1] 36

37 Analysis Algorithms CPU Utilization CPU Scheduling Latency Time in Hypervisor Disk I/O Device Driver Queue Status Device Driver Queue Request and Response Latency 37

38 Analysis Algorithms CPU Utilization - Why? Fine grained CPU utilization information can, Check if hypervisor adheres to Service Level Agreements (SLA) between hosting providers and clients. Detecting unbalanced mapping between physical CPUs and VMs. 38

39 Analysis Algorithms CPU Utilization dom 0 dom1 dom U vcpu0 vcpu1 v0 v1 v2 vcpu0 CPU 0 CPU 1 CPU n 39

40 Analysis Algorithms CPU Utilization CPU Scheduling Latency Time in Hypervisor Disk I/O Device Driver Queue Status Device Driver Queue Request and Response Latency 40

41 Analysis Algorithms CPU Scheduling Latency domain_wake(dom1, vcpu 0) domain_wake(dom1, vcpu 1) enter_sched(v0) enter_sched(v1) 41

42 Analysis Algorithms CPU Scheduling Latency Measures time a VM had to wait to get scheduled since the context switch request was sent out. Delay in context switch not only affects CPU bound tasks but also I/O jobs. Since domain needs CPU to process I/O requests/responses. 42

43 Analysis Algorithms CPU Scheduling Latency Wait times can increase for a number of reasons, VCPUs are over-scheduled. Physical CPU is always busy. Imbalance in VCPU => CPU affinity. 43

44 Analysis Algorithms CPU Scheduling Latency domid: 0 : CPU Wait Time: (ms) domid: 1 : CPU Wait Time: (ms) Total CPU Wait time for all domains: (ms) (ns) : (ns) : (ns) : (ns) : (ns) : (ns) : (ns) : (ns) : (ns) : (ns) : 0 > 7000 (ns) : 0 44

45 Analysis Algorithms CPU Utilization CPU Scheduling Latency Time in Hypervisor Disk I/O Device Driver Queue Status Device Driver Queue Request and Response Latency 45

46 Analysis Algorithms Time in Hypervisor - Why? CPU utilization data can sometimes be unreliable to infer performance problems from. Possible case is when most execution takes place in hypervisor. E.x: When significant amount of time is spent in the hypervisor, executing instructions or performing I/O on behalf of a domain, CPU util data will not show this behavior. E.x: Honoring SLAs. Does SLA include Xen runtime? 46

47 Analysis Algorithms Time in Hypervisor Total of 0 lost_record events encountered Total time spent in Domain 0 : CPU 1 = (ms) Total time spent in Domain IDLE : CPU 1 = (ms) Total time spent in Domain 0 : CPU 0 = (ms) Total time spent in Domain IDLE : CPU 0 = (ms) Total time spent in Domain 1 : CPU 2 = (ms) Total time spent in Domain IDLE : CPU 2 = (ms) Total time spent in Xen: (ms) 47

48 Analysis Algorithms CPU Utilization CPU Scheduling Latency Time in Hypervisor Disk I/O Device Driver Queue Status Device Driver Queue Request and Response Latency 48

49 Analysis Algorithms Disk I/O - Split Device Driver Model Backend Frontend Xen 49

50 Analysis Algorithms Disk I/O - Why? Performance impact of split device drivers on Disk I/O. 50

51 Analysis Algorithms 1 Disk I/O - Shared Ring Buffer 2 [2] 3 DomU writes Req 1 DomU writes Req 2 Dom0 writes Resp DomU reads Resp 1 Dom0 writes Resp 2 DomU reads Resp 2 51

52 Analysis Algorithms Disk I/O - Simplified Disk Response Shared Ring buffer Disk Request Dom Dom... Dom 0 1 U Event Mechanism Xen Hardware HDD 52

53 Analysis Algorithms Disk I/O - Device Driver Queues Front Shared-Ring Request Queue Dom 0 Dom 1 Back Shared-Ring Request Queue Front Request Queue Back Shared-Ring Response Queue BDD FDD Front Shared-Ring Response Queue Xen Hardware HDD 53

54 Analysis Algorithms Disk I/O - Device Driver Queue States Blocked : Cannot process requests to/from queue. Unable to add new requests to queue or Queue is empty. Unblocked : Can enqueue new incoming requests. 54

55 Analysis Algorithms Disk I/O - Device Driver Queue States Intuition was that a queue blocked for a long time would block the entire pipeline. 55

56 Analysis Algorithms Disk I/O - Device Driver Queue States Q_blocked Blocked Q_unblocked Q_blocked Q_unblocked Unblocked 56

57 Analysis Algorithms Disk I/O - Device Driver Queue States Q_blocked Blocked Q_unblocked Q_blocked Q_unblocked lost_records Q_blocked lost_records lost_records Unblocked Unknown Q_unblocked 57

58 Analysis Algorithms Disk I/O - Observations Blocked Unbuffered : 99 % - All queues Buffered : 50 % - Frontend request queue, 99 % rest. Buffer cache enables faster request processing at frontend. Disk I/O so slow, virtualization overheads negligible. 58

59 Analysis Algorithms CPU Utilization CPU Scheduling Latency Time in Hypervisor Disk I/O Device Driver Queue Status Device Driver Queue Request and Response Latency 59

60 Analysis Algorithms Queue Latency - Simplified DomU writes Req & notifies Dom0 Dom0 reads Req 3 4 Dom0 writes Resp & notifies DomU 60 DomU reads Resp

61 Analysis Algorithms Queue Latency - Simplified t2 - t1 SR Request Latency DomU writes Req & notifies Dom0 Dom0 reads Req 3 t4 - t3 4 SR Response Latency Dom0 writes Resp & notifies DomU 61 DomU reads Resp

62 Analysis Algorithms Disk I/O - Queue Latency SR Request Latency t1 t0 Dom 0 Dom 1 Disk Request Disk Response BDD FDD t3 t0 Xen Hardware HDD t3 t4 SR Response Latency 62

63 1 Analysis Algorithms Queue Latency - Results SR Response Latency >> SR Request Latency approx. 2 order of magnitudes greater for buffered i/o 63

64 Analysis Algorithms Queue Latency - Results QUEUE TIMES ================================================================== Queue BLOCKED: Unable to add new requests to queue or queue empty. Queue UNBLOCKED: Can enqueue new incoming requests. 1 Front Request Queue Unblocked : (ms) ; Blocked : (ms) Back Request Queue Unblocked : (ms) ; Blocked : (ms) Front Shared Ring Resp Queue Unblocked : (ms) ; Blocked : (ms) QUEUE WAIT TIMES ================================================================== Back Request Queue Wait Time : Back Response Queue Wait Time : (ms) (ms) 64

65 Introduction Xen Overview Data Collection Analysis Framework Reader Analyses Analysis Algorithms Composibility Agenda 65

66 Composibility Problem Analysis : Average queue blocked times on domain 0 66

67 Composibility Problem Analysis : Average queue blocked times on domain 0 Disk I/O analysis Averaging function Logic from CPU utilization 67

68 Composibility Problem Analysis : Average queue blocked times on domain 0 Disk I/O analysis Averaging function New Tool Logic from CPU utilization Lots of duplication of effort 68

69 Composibility Overview Framework, so far, gives us stand alone tools for focussed analysis. Composibility gives agility to this framework. 69

70 Composibility Overview Ability to reuse analysis algorithms logic for different event types. Easy to combine analysis tool outputs without having to rewrite large part of logic. Compose new analysis using reusable parts. 70

71 Composibility Overview Ability to reuse analysis algorithms logic for different event types. Stages Easy to combine analysis tool outputs without having to rewrite large parts of logic. Operators 71

72 Composibility Pipeline Analysis : Average queue blocked times on domain 0 event dom 0? Disk I/O tool Average output 72

73 Composibility Pipeline Analysis : Average queue blocked times on domain 0 event dom 0? Disk I/O tool Average output Stages Pipeline Operators 73

74 Composibility Pipeline - Operators Pipe ( ) : Connects a single stage to another. Split ( + ) : Connects a single stage to multiple stages. Executes all stages. Or ( or ) : Connects a single stage to multiple stages. Executes stages until valid return. Join : Connects multiple stages to a single stage. Either wait for a single connected stage to pass a valid event (JOIN_OR) or wait for all the connected stages to pass a successful event (JOIN_SPLIT). Syntax ideas inspired from A Universal Calculus for Streaming Processing Languages [3] 74

75 Composibility Pipeline - Operators Simplified split AND join_split or OR join_or 75

76 Composibility Pipeline - Stages Reusable and independent analysis components. Input: One or more events. Output : Same event, new event with results from execution or invalid event. If invalid event returned, break from Pipeline. 76

77 Composibility Pipeline Analysis : Average queue blocked times on domain 0 event dom 0? Disk I/O tool Average output 77

78 Composibility Pipeline Analysis : Average queue blocked times on domain 0 event vm(dom0) Disk I/O tool Average output 78

79 Composibility Pipeline Analysis : Average queue blocked times on domain 0 event vm(dom0) wait_time (Q_B, Q_U) Average output 79

80 Composibility Pipeline Analysis : Average queue blocked times on domain 0 event_id(q_b) event vm(dom0) + join_split wait_time() Average output + join_split event_id(q_u) 80

81 Composibility Pipeline Analysis : Average queue blocked times on domain 0 event_id(q_b) event vm(dom0) + join_split wait_time() average() output + join_split event_id(q_u) 81

82 Composibility Pipeline - Syntax event_id(q_b) event vm(dom0) + join_split wait_time() average() output + join_split event_id(q_u) vm(dom0) event_id(q_b) + event_id(q_u) wait_time() average() 82

83 Composibility Pipeline - Runtime vm(dom0) event_id(q_b) + event_id(q_u) wait_time() average() Parser [4] s1 = create_stage(vm, dom0); s2 = create_stage(event_id, Q_b); s3 = create_stage(event_id, Q_u); s4 = create_stage(wait_time, NULL); s5 = create_stage(average, NULL); split(s1, s2); split(s1, s3); join(s2, s4, JOIN_SPLIT); join(s3, s4, JOIN_SPLIT); pipe(s4, s5); while(!feof(fp)) { parse_next_event(&ev); execute_pipe(s1, ev); }

84 Composibility Summary Reusable and independent analysis components - Stages Connect stages using Operators. Compose Pipeline using Stages and Operators

85 Demo If time permits?? 85

86 Conclusion Goals met. Easier to build tools for fine grained performance analysis of Xen - Reader & Analyses Build complex analysis tools in a short time - Composibility 86

87 Thank You Q & A 87

88 References [1] USE method ( [2] Xen reference guide [3] R. Soule, M. Hirzel, R. Grimm. A Universal Calculus of Streaming Languages. ESOP 10. [4] Vembyr. ( 88

Design of Parallel Algorithms. Course Introduction

Design of Parallel Algorithms. Course Introduction + Design of Parallel Algorithms Course Introduction + CSE 4163/6163 Parallel Algorithm Analysis & Design! Course Web Site: http://www.cse.msstate.edu/~luke/courses/fl17/cse4163! Instructor: Ed Luke! Office:

More information

Buffer Management for XFS in Linux. William J. Earl SGI

Buffer Management for XFS in Linux. William J. Earl SGI Buffer Management for XFS in Linux William J. Earl SGI XFS Requirements for a Buffer Cache Delayed allocation of disk space for cached writes supports high write performance Delayed allocation main memory

More information

Abstractions for Practical Virtual Machine Replay. Anton Burtsev, David Johnson, Mike Hibler, Eric Eride, John Regehr University of Utah

Abstractions for Practical Virtual Machine Replay. Anton Burtsev, David Johnson, Mike Hibler, Eric Eride, John Regehr University of Utah Abstractions for Practical Virtual Machine Replay Anton Burtsev, David Johnson, Mike Hibler, Eric Eride, John Regehr University of Utah 2 3 Number of systems supporting replay: 0 Determinism 4 CPU is deterministic

More information

Interrupt Coalescing in Xen

Interrupt Coalescing in Xen Interrupt Coalescing in Xen with Scheduler Awareness Michael Peirce & Kevin Boos Outline Background Hypothesis vic-style Interrupt Coalescing Adding Scheduler Awareness Evaluation 2 Background Xen split

More information

Resource Containers. A new facility for resource management in server systems. Presented by Uday Ananth. G. Banga, P. Druschel, J. C.

Resource Containers. A new facility for resource management in server systems. Presented by Uday Ananth. G. Banga, P. Druschel, J. C. Resource Containers A new facility for resource management in server systems G. Banga, P. Druschel, J. C. Mogul OSDI 1999 Presented by Uday Ananth Lessons in history.. Web servers have become predominantly

More information

Xen and the Art of Virtualization. CSE-291 (Cloud Computing) Fall 2016

Xen and the Art of Virtualization. CSE-291 (Cloud Computing) Fall 2016 Xen and the Art of Virtualization CSE-291 (Cloud Computing) Fall 2016 Why Virtualization? Share resources among many uses Allow heterogeneity in environments Allow differences in host and guest Provide

More information

Keeping up with the hardware

Keeping up with the hardware Keeping up with the hardware Challenges in scaling I/O performance Jonathan Davies XenServer System Performance Lead XenServer Engineering, Citrix Cambridge, UK 18 Aug 2015 Jonathan Davies (Citrix) Keeping

More information

Using Alluxio to Improve the Performance and Consistency of HDFS Clusters

Using Alluxio to Improve the Performance and Consistency of HDFS Clusters ARTICLE Using Alluxio to Improve the Performance and Consistency of HDFS Clusters Calvin Jia Software Engineer at Alluxio Learn how Alluxio is used in clusters with co-located compute and storage to improve

More information

Operating System Design Issues. I/O Management

Operating System Design Issues. I/O Management I/O Management Chapter 5 Operating System Design Issues Efficiency Most I/O devices slow compared to main memory (and the CPU) Use of multiprogramming allows for some processes to be waiting on I/O while

More information

DAT (cont d) Assume a page size of 256 bytes. physical addresses. Note: Virtual address (page #) is not stored, but is used as an index into the table

DAT (cont d) Assume a page size of 256 bytes. physical addresses. Note: Virtual address (page #) is not stored, but is used as an index into the table Assume a page size of 256 bytes 5 Page table size (determined by size of program) 1 1 0 1 0 0200 00 420B 00 xxxxxx 1183 00 xxxxxx physical addresses Residency Bit (0 page frame is empty) Note: Virtual

More information

24-vm.txt Mon Nov 21 22:13: Notes on Virtual Machines , Fall 2011 Carnegie Mellon University Randal E. Bryant.

24-vm.txt Mon Nov 21 22:13: Notes on Virtual Machines , Fall 2011 Carnegie Mellon University Randal E. Bryant. 24-vm.txt Mon Nov 21 22:13:36 2011 1 Notes on Virtual Machines 15-440, Fall 2011 Carnegie Mellon University Randal E. Bryant References: Tannenbaum, 3.2 Barham, et al., "Xen and the art of virtualization,"

More information

Chapter 5 Memory Hierarchy Design. In-Cheol Park Dept. of EE, KAIST

Chapter 5 Memory Hierarchy Design. In-Cheol Park Dept. of EE, KAIST Chapter 5 Memory Hierarchy Design In-Cheol Park Dept. of EE, KAIST Why cache? Microprocessor performance increment: 55% per year Memory performance increment: 7% per year Principles of locality Spatial

More information

Abstract. Testing Parameters. Introduction. Hardware Platform. Native System

Abstract. Testing Parameters. Introduction. Hardware Platform. Native System Abstract In this paper, we address the latency issue in RT- XEN virtual machines that are available in Xen 4.5. Despite the advantages of applying virtualization to systems, the default credit scheduler

More information

Xenrelay: An Efficient Data Transmitting Approach for Tracing Guest Domain

Xenrelay: An Efficient Data Transmitting Approach for Tracing Guest Domain Xenrelay: An Efficient Data Transmitting Approach for Tracing Guest Domain Hai Jin, Wenzhi Cao, Pingpeng Yuan, Xia Xie Cluster and Grid Computing Lab Services Computing Technique and System Lab Huazhong

More information

238P: Operating Systems. Lecture 14: Process scheduling

238P: Operating Systems. Lecture 14: Process scheduling 238P: Operating Systems Lecture 14: Process scheduling This lecture is heavily based on the material developed by Don Porter Anton Burtsev November, 2017 Cooperative vs preemptive What is cooperative multitasking?

More information

Supporting Soft Real-Time Tasks in the Xen Hypervisor

Supporting Soft Real-Time Tasks in the Xen Hypervisor Supporting Soft Real-Time Tasks in the Xen Hypervisor Min Lee 1, A. S. Krishnakumar 2, P. Krishnan 2, Navjot Singh 2, Shalini Yajnik 2 1 Georgia Institute of Technology Atlanta, GA, USA minlee@cc.gatech.edu

More information

Introduction to Cloud Computing and Virtualization. Mayank Mishra Sujesha Sudevalayam PhD Students CSE, IIT Bombay

Introduction to Cloud Computing and Virtualization. Mayank Mishra Sujesha Sudevalayam PhD Students CSE, IIT Bombay Introduction to Cloud Computing and Virtualization By Mayank Mishra Sujesha Sudevalayam PhD Students CSE, IIT Bombay Talk Layout Cloud Computing Need Features Feasibility Virtualization of Machines What

More information

Memory Management. Frédéric Haziza Spring Department of Computer Systems Uppsala University

Memory Management. Frédéric Haziza Spring Department of Computer Systems Uppsala University Memory Management Frédéric Haziza Department of Computer Systems Uppsala University Spring 2008 Operating Systems Process Management Memory Management Storage Management Compilers Compiling

More information

An Energy-Efficient Asymmetric Multi-Processor for HPC Virtualization

An Energy-Efficient Asymmetric Multi-Processor for HPC Virtualization An Energy-Efficient Asymmetric Multi-Processor for HP Virtualization hung Lee and Peter Strazdins*, omputer Systems Group, Research School of omputer Science, The Australian National University (slides

More information

ClearSpeed Visual Profiler

ClearSpeed Visual Profiler ClearSpeed Visual Profiler Copyright 2007 ClearSpeed Technology plc. All rights reserved. 12 November 2007 www.clearspeed.com 1 Profiling Application Code Why use a profiler? Program analysis tools are

More information

COMPILER CONSTRUCTION FOR A NETWORK IDENTIFICATION SUMIT SONI PRAVESH KUMAR

COMPILER CONSTRUCTION FOR A NETWORK IDENTIFICATION SUMIT SONI PRAVESH KUMAR COMPILER CONSTRUCTION FOR A NETWORK IDENTIFICATION SUMIT SONI 13 PRAVESH KUMAR language) into another computer language (the target language, often having a binary form known as object code The most common

More information

Input/Output Management

Input/Output Management Chapter 11 Input/Output Management This could be the messiest aspect of an operating system. There are just too much stuff involved, it is difficult to develop a uniform and consistent theory to cover

More information

UNIT I (Two Marks Questions & Answers)

UNIT I (Two Marks Questions & Answers) UNIT I (Two Marks Questions & Answers) Discuss the different ways how instruction set architecture can be classified? Stack Architecture,Accumulator Architecture, Register-Memory Architecture,Register-

More information

COSC3330 Computer Architecture Lecture 20. Virtual Memory

COSC3330 Computer Architecture Lecture 20. Virtual Memory COSC3330 Computer Architecture Lecture 20. Virtual Memory Instructor: Weidong Shi (Larry), PhD Computer Science Department University of Houston Virtual Memory Topics Reducing Cache Miss Penalty (#2) Use

More information

OpenVMS Performance Update

OpenVMS Performance Update OpenVMS Performance Update Gregory Jordan Hewlett-Packard 2007 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice Agenda System Performance Tests

More information

Jyotheswar Kuricheti

Jyotheswar Kuricheti Jyotheswar Kuricheti 1 Agenda: 1. Performance Tuning Overview 2. Identify Bottlenecks 3. Optimizing at different levels : Target Source Mapping Session System 2 3 Performance Tuning Overview: 4 What is

More information

ni.com Best Practices for Architecting Embedded Applications in LabVIEW

ni.com Best Practices for Architecting Embedded Applications in LabVIEW Best Practices for Architecting Embedded Applications in LabVIEW Overview of NI RIO Architecture PC Real Time Controller FPGA 2 Where to Start? 3 Requirements Before you start to design your system, you

More information

Advanced Topics UNIT 2 PERFORMANCE EVALUATIONS

Advanced Topics UNIT 2 PERFORMANCE EVALUATIONS Advanced Topics UNIT 2 PERFORMANCE EVALUATIONS Structure Page Nos. 2.0 Introduction 4 2. Objectives 5 2.2 Metrics for Performance Evaluation 5 2.2. Running Time 2.2.2 Speed Up 2.2.3 Efficiency 2.3 Factors

More information

8 Introduction to Distributed Computing

8 Introduction to Distributed Computing CME 323: Distributed Algorithms and Optimization, Spring 2017 http://stanford.edu/~rezab/dao. Instructor: Reza Zadeh, Matroid and Stanford. Lecture 8, 4/26/2017. Scribed by A. Santucci. 8 Introduction

More information

FIFO-based Event Channel ABI

FIFO-based Event Channel ABI FIFO-based Event Channel ABI David Vrabel Draft C Contents 1 Introduction 3 1.1 Revision History........................... 3 1.2 Purpose................................ 3 1.3

More information

Reducing Hit Times. Critical Influence on cycle-time or CPI. small is always faster and can be put on chip

Reducing Hit Times. Critical Influence on cycle-time or CPI. small is always faster and can be put on chip Reducing Hit Times Critical Influence on cycle-time or CPI Keep L1 small and simple small is always faster and can be put on chip interesting compromise is to keep the tags on chip and the block data off

More information

CSE Lecture 11: Map/Reduce 7 October Nate Nystrom UTA

CSE Lecture 11: Map/Reduce 7 October Nate Nystrom UTA CSE 3302 Lecture 11: Map/Reduce 7 October 2010 Nate Nystrom UTA 378,000 results in 0.17 seconds including images and video communicates with 1000s of machines web server index servers document servers

More information

ò mm_struct represents an address space in kernel ò task represents a thread in the kernel ò A task points to 0 or 1 mm_structs

ò mm_struct represents an address space in kernel ò task represents a thread in the kernel ò A task points to 0 or 1 mm_structs Last time We went through the high-level theory of scheduling algorithms Scheduling Today: View into how Linux makes its scheduling decisions Don Porter CSE 306 Lecture goals Understand low-level building

More information

A common scenario... Most of us have probably been here. Where did my performance go? It disappeared into overheads...

A common scenario... Most of us have probably been here. Where did my performance go? It disappeared into overheads... OPENMP PERFORMANCE 2 A common scenario... So I wrote my OpenMP program, and I checked it gave the right answers, so I ran some timing tests, and the speedup was, well, a bit disappointing really. Now what?.

More information

Scheduling. Don Porter CSE 306

Scheduling. Don Porter CSE 306 Scheduling Don Porter CSE 306 Last time ò We went through the high-level theory of scheduling algorithms ò Today: View into how Linux makes its scheduling decisions Lecture goals ò Understand low-level

More information

A Case for High Performance Computing with Virtual Machines

A Case for High Performance Computing with Virtual Machines A Case for High Performance Computing with Virtual Machines Wei Huang*, Jiuxing Liu +, Bulent Abali +, and Dhabaleswar K. Panda* *The Ohio State University +IBM T. J. Waston Research Center Presentation

More information

L41: Lab 1 - Getting Started with Kernel Tracing - I/O

L41: Lab 1 - Getting Started with Kernel Tracing - I/O L41: Lab 1 - Getting Started with Kernel Tracing - I/O Dr Robert N.M. Watson Michaelmas Term 2015 The goals of this lab are to: Introduce you to our experimental environment and DTrace Have you explore

More information

SANDPIPER: BLACK-BOX AND GRAY-BOX STRATEGIES FOR VIRTUAL MACHINE MIGRATION

SANDPIPER: BLACK-BOX AND GRAY-BOX STRATEGIES FOR VIRTUAL MACHINE MIGRATION SANDPIPER: BLACK-BOX AND GRAY-BOX STRATEGIES FOR VIRTUAL MACHINE MIGRATION Timothy Wood, Prashant Shenoy, Arun Venkataramani, and Mazin Yousif * University of Massachusetts Amherst * Intel, Portland Data

More information

Ti Parallel Computing PIPELINING. Michał Roziecki, Tomáš Cipr

Ti Parallel Computing PIPELINING. Michał Roziecki, Tomáš Cipr Ti5317000 Parallel Computing PIPELINING Michał Roziecki, Tomáš Cipr 2005-2006 Introduction to pipelining What is this What is pipelining? Pipelining is an implementation technique in which multiple instructions

More information

Spring 2017 :: CSE 506. Device Programming. Nima Honarmand

Spring 2017 :: CSE 506. Device Programming. Nima Honarmand Device Programming Nima Honarmand read/write interrupt read/write Spring 2017 :: CSE 506 Device Interface (Logical View) Device Interface Components: Device registers Device Memory DMA buffers Interrupt

More information

Fine-Grained Interconnect Synthesis

Fine-Grained Interconnect Synthesis Fine-Grained Interconnect Synthesis Alex Rodionov, David Biancolin, Jonathan Rose Department of Electrical & Computer Engineering University of Toronto (1) Making Hardware Design Easier n Interconnect:

More information

Visual Profiler. User Guide

Visual Profiler. User Guide Visual Profiler User Guide Version 3.0 Document No. 06-RM-1136 Revision: 4.B February 2008 Visual Profiler User Guide Table of contents Table of contents 1 Introduction................................................

More information

5 Solutions. Solution a. no solution provided. b. no solution provided

5 Solutions. Solution a. no solution provided. b. no solution provided 5 Solutions Solution 5.1 5.1.1 5.1.2 5.1.3 5.1.4 5.1.5 5.1.6 S2 Chapter 5 Solutions Solution 5.2 5.2.1 4 5.2.2 a. I, J b. B[I][0] 5.2.3 a. A[I][J] b. A[J][I] 5.2.4 a. 3596 = 8 800/4 2 8 8/4 + 8000/4 b.

More information

Course Review. Hui Lu

Course Review. Hui Lu Course Review Hui Lu Syllabus Cloud computing Server virtualization Network virtualization Storage virtualization Cloud operating system Object storage Syllabus Server Virtualization Network Virtualization

More information

CHAPTER 8 - MEMORY MANAGEMENT STRATEGIES

CHAPTER 8 - MEMORY MANAGEMENT STRATEGIES CHAPTER 8 - MEMORY MANAGEMENT STRATEGIES OBJECTIVES Detailed description of various ways of organizing memory hardware Various memory-management techniques, including paging and segmentation To provide

More information

L41: Lab 2 - Kernel implications of IPC

L41: Lab 2 - Kernel implications of IPC L41: Lab 2 - Kernel implications of IPC Dr Robert N.M. Watson Michaelmas Term 2015 The goals of this lab are to: Continue to gain experience tracing user-kernel interactions via system calls Explore the

More information

FIFO-based Event Channel ABI

FIFO-based Event Channel ABI FIFO-based Event Channel ABI David Vrabel Draft F Contents 1 Introduction 3 1.1 Revision History........................... 3 1.2 Purpose................................ 4 1.3

More information

!! What is virtual memory and when is it useful? !! What is demand paging? !! When should pages in memory be replaced?

!! What is virtual memory and when is it useful? !! What is demand paging? !! When should pages in memory be replaced? Chapter 10: Virtual Memory Questions? CSCI [4 6] 730 Operating Systems Virtual Memory!! What is virtual memory and when is it useful?!! What is demand paging?!! When should pages in memory be replaced?!!

More information

QoS support for Intelligent Storage Devices

QoS support for Intelligent Storage Devices QoS support for Intelligent Storage Devices Joel Wu Scott Brandt Department of Computer Science University of California Santa Cruz ISW 04 UC Santa Cruz Mixed-Workload Requirement General purpose systems

More information

Main Memory (Part II)

Main Memory (Part II) Main Memory (Part II) Amir H. Payberah amir@sics.se Amirkabir University of Technology (Tehran Polytechnic) Amir H. Payberah (Tehran Polytechnic) Main Memory 1393/8/17 1 / 50 Reminder Amir H. Payberah

More information

I/O and virtualization

I/O and virtualization I/O and virtualization CSE-C3200 Operating systems Autumn 2015 (I), Lecture 8 Vesa Hirvisalo Today I/O management Control of I/O Data transfers, DMA (Direct Memory Access) Buffering Single buffering Double

More information

Exploiting Concurrency

Exploiting Concurrency Exploiting Concurrency How I stopped worrying and started threading Michael Meeks michael.meeks@collabora.com mmeeks / irc.freenode.net Collabora Productivity Stand at the crossroads and look; ask for

More information

MapReduce: Simplified Data Processing on Large Clusters 유연일민철기

MapReduce: Simplified Data Processing on Large Clusters 유연일민철기 MapReduce: Simplified Data Processing on Large Clusters 유연일민철기 Introduction MapReduce is a programming model and an associated implementation for processing and generating large data set with parallel,

More information

davidklee.net gplus.to/kleegeek linked.com/a/davidaklee

davidklee.net gplus.to/kleegeek linked.com/a/davidaklee @kleegeek davidklee.net gplus.to/kleegeek linked.com/a/davidaklee Specialties / Focus Areas / Passions: Performance Tuning & Troubleshooting Virtualization Cloud Enablement Infrastructure Architecture

More information

Time Capsule: Tracing Packet Latency across Different Layers in Virtualized Systems

Time Capsule: Tracing Packet Latency across Different Layers in Virtualized Systems Time Capsule: Tracing Packet Latency across Different Layers in Virtualized Systems Kun Suo *, Jia Rao *, Luwei Cheng w, Francis C. M. Lau UT Arlington *, Facebook w, The University of Hong Kong Virtualiza@on

More information

Threads. CS3026 Operating Systems Lecture 06

Threads. CS3026 Operating Systems Lecture 06 Threads CS3026 Operating Systems Lecture 06 Multithreading Multithreading is the ability of an operating system to support multiple threads of execution within a single process Processes have at least

More information

Software Speculative Multithreading for Java

Software Speculative Multithreading for Java Software Speculative Multithreading for Java Christopher J.F. Pickett and Clark Verbrugge School of Computer Science, McGill University {cpicke,clump}@sable.mcgill.ca Allan Kielstra IBM Toronto Lab kielstra@ca.ibm.com

More information

SPIN Operating System

SPIN Operating System SPIN Operating System Motivation: general purpose, UNIX-based operating systems can perform poorly when the applications have resource usage patterns poorly handled by kernel code Why? Current crop of

More information

Inferring Temporal Behaviours Through Kernel Tracing

Inferring Temporal Behaviours Through Kernel Tracing Inferring Temporal Behaviours Through Kernel Tracing Paolo Rallo, Nicola Manica, Luca Abeni University of Trento Trento - Italy prallo@gmail.com, nicola.manica@disi.unitn.it, luca.abeni@unitn.it Technical

More information

R16 SET - 1 '' ''' '' ''' Code No: R

R16 SET - 1 '' ''' '' ''' Code No: R 1. a) Define Latency time and Transmission time? (2M) b) Define Hash table and Hash function? (2M) c) Explain the Binary Heap Structure Property? (3M) d) List the properties of Red-Black trees? (3M) e)

More information

Operating Systems. Designed and Presented by Dr. Ayman Elshenawy Elsefy

Operating Systems. Designed and Presented by Dr. Ayman Elshenawy Elsefy Operating Systems Designed and Presented by Dr. Ayman Elshenawy Elsefy Dept. of Systems & Computer Eng.. AL-AZHAR University Website : eaymanelshenawy.wordpress.com Email : eaymanelshenawy@yahoo.com Reference

More information

YCSB++ benchmarking tool Performance debugging advanced features of scalable table stores

YCSB++ benchmarking tool Performance debugging advanced features of scalable table stores YCSB++ benchmarking tool Performance debugging advanced features of scalable table stores Swapnil Patil M. Polte, W. Tantisiriroj, K. Ren, L.Xiao, J. Lopez, G.Gibson, A. Fuchs *, B. Rinaldi * Carnegie

More information

Martin Kruliš, v

Martin Kruliš, v Martin Kruliš 1 Optimizations in General Code And Compilation Memory Considerations Parallelism Profiling And Optimization Examples 2 Premature optimization is the root of all evil. -- D. Knuth Our goal

More information

Overview of Data Exploration Techniques. Stratos Idreos, Olga Papaemmanouil, Surajit Chaudhuri

Overview of Data Exploration Techniques. Stratos Idreos, Olga Papaemmanouil, Surajit Chaudhuri Overview of Data Exploration Techniques Stratos Idreos, Olga Papaemmanouil, Surajit Chaudhuri data exploration not always sure what we are looking for (until we find it) data has always been big volume

More information

Virtual Memory: From Address Translation to Demand Paging

Virtual Memory: From Address Translation to Demand Paging Constructive Computer Architecture Virtual Memory: From Address Translation to Demand Paging Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology November 9, 2015

More information

Internals of Active Dataguard. Saibabu Devabhaktuni

Internals of Active Dataguard. Saibabu Devabhaktuni Internals of Active Dataguard Saibabu Devabhaktuni PayPal DB Engineering team Sehmuz Bayhan Our visionary director Saibabu Devabhaktuni Sr manager of DB engineering team http://sai-oracle.blogspot.com

More information

Real-Time Internet of Things

Real-Time Internet of Things Real-Time Internet of Things Chenyang Lu Cyber-Physical Systems Laboratory h7p://www.cse.wustl.edu/~lu/ Internet of Things Ø Convergence of q Miniaturized devices: integrate processor, sensors and radios.

More information

Chapter 8: Memory- Management Strategies. Operating System Concepts 9 th Edition

Chapter 8: Memory- Management Strategies. Operating System Concepts 9 th Edition Chapter 8: Memory- Management Strategies Operating System Concepts 9 th Edition Silberschatz, Galvin and Gagne 2013 Chapter 8: Memory Management Strategies Background Swapping Contiguous Memory Allocation

More information

Chapter 8 & Chapter 9 Main Memory & Virtual Memory

Chapter 8 & Chapter 9 Main Memory & Virtual Memory Chapter 8 & Chapter 9 Main Memory & Virtual Memory 1. Various ways of organizing memory hardware. 2. Memory-management techniques: 1. Paging 2. Segmentation. Introduction Memory consists of a large array

More information

Chapter 7: Main Memory. Operating System Concepts Essentials 8 th Edition

Chapter 7: Main Memory. Operating System Concepts Essentials 8 th Edition Chapter 7: Main Memory Operating System Concepts Essentials 8 th Edition Silberschatz, Galvin and Gagne 2011 Chapter 7: Memory Management Background Swapping Contiguous Memory Allocation Paging Structure

More information

Streaming Massive Environments From Zero to 200MPH

Streaming Massive Environments From Zero to 200MPH FORZA MOTORSPORT From Zero to 200MPH Chris Tector (Software Architect Turn 10 Studios) Turn 10 Internal studio at Microsoft Game Studios - we make Forza Motorsport Around 70 full time staff 2 Why am I

More information

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe Introduction to Query Processing and Query Optimization Techniques Outline Translating SQL Queries into Relational Algebra Algorithms for External Sorting Algorithms for SELECT and JOIN Operations Algorithms

More information

Best Practices for Architecting Embedded Applications in LabVIEW Jacques Cilliers Applications Engineering

Best Practices for Architecting Embedded Applications in LabVIEW Jacques Cilliers Applications Engineering Best Practices for Architecting Embedded Applications in LabVIEW Jacques Cilliers Applications Engineering Overview of NI RIO Architecture PC Real Time Controller FPGA 4 Where to Start? 5 Requirements

More information

Performance Optimization for Informatica Data Services ( Hotfix 3)

Performance Optimization for Informatica Data Services ( Hotfix 3) Performance Optimization for Informatica Data Services (9.5.0-9.6.1 Hotfix 3) 1993-2015 Informatica Corporation. No part of this document may be reproduced or transmitted in any form, by any means (electronic,

More information

Computer Systems Architecture I. CSE 560M Lecture 18 Guest Lecturer: Shakir James

Computer Systems Architecture I. CSE 560M Lecture 18 Guest Lecturer: Shakir James Computer Systems Architecture I CSE 560M Lecture 18 Guest Lecturer: Shakir James Plan for Today Announcements No class meeting on Monday, meet in project groups Project demos < 2 weeks, Nov 23 rd Questions

More information

XML Parsers. Asst. Prof. Dr. Kanda Runapongsa Saikaew Dept. of Computer Engineering Khon Kaen University

XML Parsers. Asst. Prof. Dr. Kanda Runapongsa Saikaew Dept. of Computer Engineering Khon Kaen University XML Parsers Asst. Prof. Dr. Kanda Runapongsa Saikaew (krunapon@kku.ac.th) Dept. of Computer Engineering Khon Kaen University 1 Overview What are XML Parsers? Programming Interfaces of XML Parsers DOM:

More information

What s An OS? Cyclic Executive. Interrupts. Advantages Simple implementation Low overhead Very predictable

What s An OS? Cyclic Executive. Interrupts. Advantages Simple implementation Low overhead Very predictable What s An OS? Provides environment for executing programs Process abstraction for multitasking/concurrency scheduling Hardware abstraction layer (device drivers) File systems Communication Do we need an

More information

The HAMMER Filesystem DragonFlyBSD Project Matthew Dillon 11 October 2008

The HAMMER Filesystem DragonFlyBSD Project Matthew Dillon 11 October 2008 The HAMMER Filesystem DragonFlyBSD Project Matthew Dillon 11 October 2008 HAMMER Quick Feature List 1 Exabyte capacity (2^60 = 1 million terrabytes). Fine-grained, live-view history retention for snapshots

More information

Automatic NUMA Balancing. Rik van Riel, Principal Software Engineer, Red Hat Vinod Chegu, Master Technologist, HP

Automatic NUMA Balancing. Rik van Riel, Principal Software Engineer, Red Hat Vinod Chegu, Master Technologist, HP Automatic NUMA Balancing Rik van Riel, Principal Software Engineer, Red Hat Vinod Chegu, Master Technologist, HP Automatic NUMA Balancing Agenda What is NUMA, anyway? Automatic NUMA balancing internals

More information

Chapter 18 Indexing Structures for Files

Chapter 18 Indexing Structures for Files Chapter 18 Indexing Structures for Files Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Disk I/O for Read/ Write Unit for Disk I/O for Read/ Write: Chapter 18 One Buffer for

More information

CS307: Operating Systems

CS307: Operating Systems CS307: Operating Systems Chentao Wu 吴晨涛 Associate Professor Dept. of Computer Science and Engineering Shanghai Jiao Tong University SEIEE Building 3-513 wuct@cs.sjtu.edu.cn Download Lectures ftp://public.sjtu.edu.cn

More information

Systems Architecture II

Systems Architecture II Systems Architecture II Topics Interfacing I/O Devices to Memory, Processor, and Operating System * Memory-mapped IO and Interrupts in SPIM** *This lecture was derived from material in the text (Chapter

More information

CAVA: Exploring Memory Locality for Big Data Analytics in Virtualized Clusters

CAVA: Exploring Memory Locality for Big Data Analytics in Virtualized Clusters 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing : Exploring Memory Locality for Big Data Analytics in Virtualized Clusters Eunji Hwang, Hyungoo Kim, Beomseok Nam and Young-ri

More information

CSE 544 Principles of Database Management Systems

CSE 544 Principles of Database Management Systems CSE 544 Principles of Database Management Systems Alvin Cheung Fall 2015 Lecture 5 - DBMS Architecture and Indexing 1 Announcements HW1 is due next Thursday How is it going? Projects: Proposals are due

More information

ECE902 Virtual Machine Final Project: MIPS to CRAY-2 Binary Translation

ECE902 Virtual Machine Final Project: MIPS to CRAY-2 Binary Translation ECE902 Virtual Machine Final Project: MIPS to CRAY-2 Binary Translation Weiping Liao, Saengrawee (Anne) Pratoomtong, and Chuan Zhang Abstract Binary translation is an important component for translating

More information

FAWN. A Fast Array of Wimpy Nodes. David Andersen, Jason Franklin, Michael Kaminsky*, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan

FAWN. A Fast Array of Wimpy Nodes. David Andersen, Jason Franklin, Michael Kaminsky*, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan FAWN A Fast Array of Wimpy Nodes David Andersen, Jason Franklin, Michael Kaminsky*, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan Carnegie Mellon University *Intel Labs Pittsburgh Energy in computing

More information

CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 21: Network Protocols (and 2 Phase Commit)

CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring Lecture 21: Network Protocols (and 2 Phase Commit) CS 162 Operating Systems and Systems Programming Professor: Anthony D. Joseph Spring 2003 Lecture 21: Network Protocols (and 2 Phase Commit) 21.0 Main Point Protocol: agreement between two parties as to

More information

COSC4201. Chapter 5. Memory Hierarchy Design. Prof. Mokhtar Aboelaze York University

COSC4201. Chapter 5. Memory Hierarchy Design. Prof. Mokhtar Aboelaze York University COSC4201 Chapter 5 Memory Hierarchy Design Prof. Mokhtar Aboelaze York University 1 Memory Hierarchy The gap between CPU performance and main memory has been widening with higher performance CPUs creating

More information

AN 831: Intel FPGA SDK for OpenCL

AN 831: Intel FPGA SDK for OpenCL AN 831: Intel FPGA SDK for OpenCL Host Pipelined Multithread Subscribe Send Feedback Latest document on the web: PDF HTML Contents Contents 1 Intel FPGA SDK for OpenCL Host Pipelined Multithread...3 1.1

More information

Final Lecture. A few minutes to wrap up and add some perspective

Final Lecture. A few minutes to wrap up and add some perspective Final Lecture A few minutes to wrap up and add some perspective 1 2 Instant replay The quarter was split into roughly three parts and a coda. The 1st part covered instruction set architectures the connection

More information

Memory Management. CSE 2431: Introduction to Operating Systems Reading: , [OSC]

Memory Management. CSE 2431: Introduction to Operating Systems Reading: , [OSC] Memory Management CSE 2431: Introduction to Operating Systems Reading: 8.1 8.3, [OSC] 1 Outline Basic Memory Management Swapping Variable Partitions Memory Management Problems 2 Basic Memory Management

More information

Department of Computer Science San Marcos, TX Report Number TXSTATE-CS-TR Clustering in the Cloud. Xuan Wang

Department of Computer Science San Marcos, TX Report Number TXSTATE-CS-TR Clustering in the Cloud. Xuan Wang Department of Computer Science San Marcos, TX 78666 Report Number TXSTATE-CS-TR-2010-24 Clustering in the Cloud Xuan Wang 2010-05-05 !"#$%&'()*+()+%,&+!"-#. + /+!"#$%&'()*+0"*-'(%,1$+0.23%(-)+%-+42.--3+52367&.#8&+9'21&:-';

More information

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS

PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS Objective PRINCIPLES OF COMPILER DESIGN UNIT I INTRODUCTION TO COMPILERS Explain what is meant by compiler. Explain how the compiler works. Describe various analysis of the source program. Describe the

More information

Parallel and Distributed Optimization with Gurobi Optimizer

Parallel and Distributed Optimization with Gurobi Optimizer Parallel and Distributed Optimization with Gurobi Optimizer Our Presenter Dr. Tobias Achterberg Developer, Gurobi Optimization 2 Parallel & Distributed Optimization 3 Terminology for this presentation

More information

Stateless Network Functions:

Stateless Network Functions: Stateless Network Functions: Breaking the Tight Coupling of State and Processing Murad Kablan, Azzam Alsudais, Eric Keller, Franck Le University of Colorado IBM Networks Need Network Functions Firewall

More information

Hardware-Based Speculation

Hardware-Based Speculation Hardware-Based Speculation Execute instructions along predicted execution paths but only commit the results if prediction was correct Instruction commit: allowing an instruction to update the register

More information

Performance Sentry VM Provider Objects April 11, 2012

Performance Sentry VM Provider Objects April 11, 2012 Introduction This document describes the Performance Sentry VM (Sentry VM) Provider performance data objects defined using the VMware performance groups and counters. This version of Performance Sentry

More information

Dynamic Fine Grain Scheduling of Pipeline Parallelism. Presented by: Ram Manohar Oruganti and Michael TeWinkle

Dynamic Fine Grain Scheduling of Pipeline Parallelism. Presented by: Ram Manohar Oruganti and Michael TeWinkle Dynamic Fine Grain Scheduling of Pipeline Parallelism Presented by: Ram Manohar Oruganti and Michael TeWinkle Overview Introduction Motivation Scheduling Approaches GRAMPS scheduling method Evaluation

More information

CSE 544: Principles of Database Systems

CSE 544: Principles of Database Systems CSE 544: Principles of Database Systems Anatomy of a DBMS, Parallel Databases 1 Announcements Lecture on Thursday, May 2nd: Moved to 9am-10:30am, CSE 403 Paper reviews: Anatomy paper was due yesterday;

More information

Chapter 8: Main Memory. Operating System Concepts 9 th Edition

Chapter 8: Main Memory. Operating System Concepts 9 th Edition Chapter 8: Main Memory Silberschatz, Galvin and Gagne 2013 Chapter 8: Memory Management Background Swapping Contiguous Memory Allocation Segmentation Paging Structure of the Page Table Example: The Intel

More information