10. Multiprocessor Scheduling (Advanced)
|
|
- Timothy Hood
- 5 years ago
- Views:
Transcription
1 10. Multirocessor Scheduling (Advanced) Oerating System: Three Easy Pieces 1
2 Multirocessor Scheduling The rise of the multicore rocessor is the source of multirocessorscheduling roliferation. w Multicore: Multile CPU cores are acked onto a single chi. Adding more CPUs does not make that single alication run faster. à You ll have to rewrite alication to run in arallel, using threads. How to schedule jobs on Multile CPUs? AOS@UC 2
3 Single CPU with cache CPU Cache Cache Small, fast memories Hold coies of oular data that is found in the main memory. Utilize temoral and satial locality Memory Main Memory Holds all of the data Access to main memory is slower than cache. By keeing data in cache, the system can make slow memory aear to be a fast one AOS@UC 3
4 Cache coherence Coherence of shared resource data stored in multile caches. 0. Two CPUs with caches sharing memory 1. CPU0 reads a data at address 1. CPU 0 CPU 1 CPU 0 CPU 1 Cache Cache Cache! Cache Bus Bus! Memory ! Memory AOS@UC 4
5 Cache coherence (Cont.) 2.! is udated and CPU1 is scheduled. 3. CPU1 re-reads the value at address A CPU 0 CPU 1 CPU 0 CPU 1 Cache Cache!!! Cache Cache Bus Bus! Memory ! Memory CPU1 gets the old value # instead of the correct value #. AOS@UC 5
6 Cache coherence solution Bus snooing w Each cache ays attention to memory udates by observing the bus. w When a CPU sees an udate for a data item it holds in its cache, it will notice the change and either invalidate its coy or udate it. AOS@UC 6
7 Don t forget synchronization When accessing shared data across CPUs, mutual exclusion rimitives should likely be used to guarantee correctness. 1 tyedef struct Node_t { 2 int value; 3 struct Node_t *next; 4 } Node_t; 5 6 int List_Po() { 7 Node_t *tm = head; // remember old head... 8 int value = head->value; //... and its value 9 head = head->next; // advance head to next ointer 10 free(tm); // free old head 11 return value; // return value at head 12 } Simle List Delete Code AOS@UC 7
8 Don t forget synchronization (Cont.) Solution 1 thread_mtuex_t m; 2 tyedef struct Node_t { 3 int value; 4 struct Node_t *next; 5 } Node_t; 6 7 int List_Po() { 8 lock(&m) 9 Node_t *tm = head; // remember old head int value = head->value; //... and its value 11 head = head->next; // advance head to next ointer 12 free(tm); // free old head 13 unlock(&m) 14 return value; // return value at head 15 } Simle List Delete Code with lock AOS@UC 8
9 Cache Affinity Kee a rocess on the same CPU if at all ossible w A rocess builds u a fair bit of state in the cache of a CPU. w The next time the rocess run, it will run faster if some of its state is already resent in the cache on that CPU. A multirocessor scheduler should consider cache affinity when making its scheduling decision. AOS@UC 9
10 Single queue Multirocessor Scheduling (SQMS) Put all jobs that need to be scheduled into a single queue. w Each CPU simly icks the next job from the globally shared queue. w Cons: Some form of locking have to be inserted à Lack of scalability Cache affinity Examle: Queue A B C D E NULL Possible job scheduler across CPUs: CPU0 A E D C B CPU1 B A E D C CPU2 C B A E D CPU3 D C B A E (reeat) (reeat) (reeat) (reeat) AOS@UC 10
11 Scheduling Examle with Cache affinity Queue A B C D E NULL CPU0 A E A A A CPU1 B B E B B CPU2 C C C E C CPU3 D D D D E (reeat) (reeat) (reeat) (reeat) w Preserving affinity for most Jobs A through D are not moved across rocessors. Only job e Migrating from CPU to CPU. w Imlementing such a scheme can be comlex. AOS@UC 11
12 Multi-queue Multirocessor Scheduling (MQMS) MQMS consists of multile scheduling queues. w Each queue will follow a articular scheduling disciline. w When a job enters the system, it is laced on exactly one scheduling queue. w Avoid the roblems of information sharing and synchronization. AOS@UC 12
13 MQMS Examle With round robin, the system might roduce a schedule that looks like this: Q0 A C Q1 B D CPU0 CPU1 A A C C A A C C A A C C B B D D B B D D B B D D MQMS rovides more scalability and cache affinity. AOS@UC 13
14 Load Imbalance issue of MQMS After job C in Q0 finishes: CPU0 CPU1 Q0 A Q1 B D A A A A A A A A A A A A B B D D B B D D B B D D A gets twice as much CPU as B and D. After job A in Q0 finishes: Q0 Q1 B D CPU0 CPU1 B B D D B B D D B B D D CPU0 will be left idle! AOS@UC 14
15 How to deal with load imbalance? The answer is to move jobs (Migration). w Examle: Q0 Q1 B D The OS moves one of B or D to CPU 0 Q0 D Q1 B Or Q0 B Q1 D AOS@UC 15
16 How to deal with load imbalance? (Cont.) A more tricky case: Q0 A Q1 B D A ossible migration attern: w Kee switching jobs CPU0 CPU1 A A A A B A B A B B B B B D B D D D D D A D A D Migrate B to CPU0 Migrate A to CPU1 AOS@UC 16
17 Work Stealing Move jobs between queues w Imlementation: w Cons: A source queue that is low on jobs is icked. The source queue occasionally eeks at another target queue. If the target queue is more full than the source queue, the source will steal one or more jobs from the target queue. High overhead and trouble scaling AOS@UC 17
18 Linux Multirocessor Schedulers O(1) w A Priority-based scheduler w Use Multile queues w Change a rocess s riority over time w Schedule those with highest riority w Interactivity is a articular focus Comletely Fair Scheduler (CFS) (current mainline) w Deterministic roortional-share aroach w Based on Staircase Deadline (fairness is the focus) w Red-black tree for scalability AOS@UC 18
19 Linux Multirocessor Schedulers (Cont.) BF Scheduler (BFS) (Not in the mainline) w A single queue aroach w Proortional-share w Based on Earliest Eligible Virtual Deadline First (EEVDF) w Focus on interactive (not scale well with cores). Suerseded by MuQSS to fix that The battle of schedulers : Kolivas (SD) vs Molnar (CFS) AOS@UC 19
20 And you have to realize that there are not very many things that have a ged as well as the scheduler. Which is just another roof that scheduling is easy. Linus Torvalds, 2001 [43] Scheduling is not easy!, E.g: The Linux Scheduler: a Decade of Wasted Cores htt:// final29.df AOS@UC 20
21 Disclaimer: Disclaimer: This lecture slide set is used in AOS course at University of Cantabria. Was initially develoed for Oerating System course in Comuter Science Det. at Hanyang University. This lecture slide set is for OSTEP book written by Remzi and Andrea Araci-Dusseau (at University of Wisconsin) 21
32. Common Concurrency Problems.
32. Common Concurrency Problems. Oerating System: Three Easy Pieces AOS@UC 1 Common Concurrency Problems More recent work focuses on studying other tyes of common concurrency bugs. w Take a brief look
More information28. Locks. Operating System: Three Easy Pieces
28. Locks Oerating System: Three Easy Pieces AOS@UC 1 Locks: The Basic Idea Ensure that any critical section executes as if it were a single atomic instruction. w An examle: the canonical udate of a shared
More information33. Event-based Concurrency
33. Event-based Concurrency Oerating System: Three Easy Pieces AOS@UC 1 Event-based Concurrency A different style of concurrent rogramming without threads w Used in GUI-based alications, some tyes of internet
More information22. Swaping: Policies
22. Swaing: Policies Oerating System: Three Easy Pieces 1 Beyond Physical Memory: Policies Memory ressure forces the OS to start aging out ages to make room for actively-used ages. Deciding which age to
More information6. Mechanism: Limited Direct Execution
6. Mechanism: Limited Direct Execution Oerating System: Three Easy Pieces AOS@UC 1 How to efficiently virtualize the CPU with control? The OS needs to share the hysical CPU by time sharing. Issue w Performance:
More information20. Paging: Smaller Tables
20. Paging: Smaller Tables Oerating System: Three Easy Pieces AOS@UC 1 Paging: Linear Tables We usually have one age table for every rocess in the system. w Assume that 32-bit address sace with 4KB ages
More information37. Hard Disk Drives. Operating System: Three Easy Pieces
37. Hard Disk Drives Oerating System: Three Easy Pieces AOS@UC 1 Hard Disk Driver Hard disk driver have been the main form of ersistent data storage in comuter systems for decades ( and consequently file
More information30. Condition Variables
30. Condition Variables Oerating System: Three Easy Pieces AOS@UC 1 Condition Variables There are many cases where a thread wishes to check whether a condition is true before continuing its execution.
More information2. Introduction to Operating Systems
2. Introduction to Oerating Systems Oerating System: Three Easy Pieces 1 What a haens when a rogram runs? A running rogram executes instructions. 1. The rocessor fetches an instruction from memory. 2.
More information15. Address Translation
15. Address Translation Oerating System: Three Easy Pieces AOS@UC 1 Memory Virtualizing with Efficiency and Control Memory virtualizing takes a similar strategy known as limited direct execution(lde) for
More information43. Log-structured File Systems
43. Log-structured File Systems Oerating System: Three Easy Pieces AOS@UC 1 LFS: Log-structured File System Proosed by Stanford back in 91 Motivated by: w DRAM Memory sizes where growing. w Large ga between
More information42. Crash Consistency: FSCK and Journaling
42. Crash Consistency: FSCK and Journaling Oerating System: Three Easy Pieces AOS@UC 1 Crash Consistency AOS@UC 2 Crash Consistency Unlike most data structure, file system data structures must ersist w
More information36. I/O Devices. Operating System: Three Easy Pieces 1
36. I/O Devices Oerating System: Three Easy Pieces AOS@UC 1 I/O Devices I/O is critical to comuter system to interact with systems. Issue : w How should I/O be integrated into systems? w What are the general
More information4. The Abstraction: The Process
4. The Abstraction: The Process Operating System: Three Easy Pieces AOS@UC 1 How to provide the illusion of many CPUs? p CPU virtualizing w The OS can promote the illusion that many virtual CPUs exist.
More informationIntroduction to Parallel Algorithms
CS 1762 Fall, 2011 1 Introduction to Parallel Algorithms Introduction to Parallel Algorithms ECE 1762 Algorithms and Data Structures Fall Semester, 2011 1 Preliminaries Since the early 1990s, there has
More information14. Memory API. Operating System: Three Easy Pieces
14. Memory API Oerating System: Three Easy Pieces 1 Memory API: malloc() #include void* malloc(size_t size) Allocate a memory region on the hea. w Argument size_t size : size of the memory block(in
More informationCS Computer Systems. Lecture 6: Process Scheduling
CS 5600 Computer Systems Lecture 6: Process Scheduling Scheduling Basics Simple Schedulers Priority Schedulers Fair Share Schedulers Multi-CPU Scheduling Case Study: The Linux Kernel 2 Setting the Stage
More information39. File and Directories
39. File and Directories Oerating System: Three Easy Pieces AOS@UC 1 Persistent Storage Kee a data intact even if there is a ower loss. w Hard disk drive w Solid-state storage device Two key abstractions
More informationRecap: Consensus. CSE 486/586 Distributed Systems Mutual Exclusion. Why Mutual Exclusion? Why Mutual Exclusion? Mutexes. Mutual Exclusion C 1
Reca: Consensus Distributed Systems Mutual Exclusion Steve Ko Comuter Sciences and Engineering University at Buffalo On a synchronous system There s an algorithm that works. On an asynchronous system It
More informationDistributed Systems (5DV147)
Distributed Systems (5DV147) Mutual Exclusion and Elections Fall 2013 1 Processes often need to coordinate their actions Which rocess gets to access a shared resource? Has the master crashed? Elect a new
More informationSPITFIRE: Scalable Parallel Algorithms for Test Set Partitioned Fault Simulation
To aear in IEEE VLSI Test Symosium, 1997 SITFIRE: Scalable arallel Algorithms for Test Set artitioned Fault Simulation Dili Krishnaswamy y Elizabeth M. Rudnick y Janak H. atel y rithviraj Banerjee z y
More informationCOMP Parallel Computing. BSP (1) Bulk-Synchronous Processing Model
COMP 6 - Parallel Comuting Lecture 6 November, 8 Bulk-Synchronous essing Model Models of arallel comutation Shared-memory model Imlicit communication algorithm design and analysis relatively simle but
More informationEECS 750: Advanced Operating Systems. 01/29 /2014 Heechul Yun
EECS 750: Advanced Operating Systems 01/29 /2014 Heechul Yun 1 Administrative Next summary assignment Resource Containers: A New Facility for Resource Management in Server Systems, OSDI 99 due by 11:59
More informationAUTOMATIC GENERATION OF HIGH THROUGHPUT ENERGY EFFICIENT STREAMING ARCHITECTURES FOR ARBITRARY FIXED PERMUTATIONS. Ren Chen and Viktor K.
inuts er clock cycle Streaming ermutation oututs er clock cycle AUTOMATIC GENERATION OF HIGH THROUGHPUT ENERGY EFFICIENT STREAMING ARCHITECTURES FOR ARBITRARY FIXED PERMUTATIONS Ren Chen and Viktor K.
More informationSimulating Ocean Currents. Simulating Galaxy Evolution
Simulating Ocean Currents (a) Cross sections (b) Satial discretization of a cross section Model as two-dimensional grids Discretize in sace and time finer satial and temoral resolution => greater accuracy
More informationParallel Construction of Multidimensional Binary Search Trees. Ibraheem Al-furaih, Srinivas Aluru, Sanjay Goil Sanjay Ranka
Parallel Construction of Multidimensional Binary Search Trees Ibraheem Al-furaih, Srinivas Aluru, Sanjay Goil Sanjay Ranka School of CIS and School of CISE Northeast Parallel Architectures Center Syracuse
More informationLecture 18. Today, we will discuss developing algorithms for a basic model for parallel computing the Parallel Random Access Machine (PRAM) model.
U.C. Berkeley CS273: Parallel and Distributed Theory Lecture 18 Professor Satish Rao Lecturer: Satish Rao Last revised Scribe so far: Satish Rao (following revious lecture notes quite closely. Lecture
More informationTexture Mapping with Vector Graphics: A Nested Mipmapping Solution
Texture Maing with Vector Grahics: A Nested Mimaing Solution Wei Zhang Yonggao Yang Song Xing Det. of Comuter Science Det. of Comuter Science Det. of Information Systems Prairie View A&M University Prairie
More informationSource Coding and express these numbers in a binary system using M log
Source Coding 30.1 Source Coding Introduction We have studied how to transmit digital bits over a radio channel. We also saw ways that we could code those bits to achieve error correction. Bandwidth is
More informationRecap. Run to completion in order of arrival Pros: simple, low overhead, good for batch jobs Cons: short jobs can stuck behind the long ones
Recap First-Come, First-Served (FCFS) Run to completion in order of arrival Pros: simple, low overhead, good for batch jobs Cons: short jobs can stuck behind the long ones Round-Robin (RR) FCFS with preemption.
More informationShuigeng Zhou. May 18, 2016 School of Computer Science Fudan University
Query Processing Shuigeng Zhou May 18, 2016 School of Comuter Science Fudan University Overview Outline Measures of Query Cost Selection Oeration Sorting Join Oeration Other Oerations Evaluation of Exressions
More informationCSCI-GA Operating Systems Lecture 3: Processes and Threads -Part 2 Scheduling Hubertus Franke
CSCI-GA.2250-001 Operating Systems Lecture 3: Processes and Threads -Part 2 Scheduling Hubertus Franke frankeh@cs.nyu.edu Processes Vs Threads The unit of dispatching is referred to as a thread or lightweight
More informationCS 470 Spring Mike Lam, Professor. Performance Analysis
CS 470 Sring 2018 Mike Lam, Professor Performance Analysis Performance analysis Why do we arallelize our rograms? Performance analysis Why do we arallelize our rograms? So that they run faster! Performance
More informationAn Efficient Video Program Delivery algorithm in Tree Networks*
3rd International Symosium on Parallel Architectures, Algorithms and Programming An Efficient Video Program Delivery algorithm in Tree Networks* Fenghang Yin 1 Hong Shen 1,2,** 1 Deartment of Comuter Science,
More informationOptimizing Dynamic Memory Management!
Otimizing Dynamic Memory Management! 1 Goals of this Lecture! Hel you learn about:" Details of K&R hea mgr" Hea mgr otimizations related to Assignment #6" Faster free() via doubly-linked list, redundant
More informationAn Indexing Framework for Structured P2P Systems
An Indexing Framework for Structured P2P Systems Adina Crainiceanu Prakash Linga Ashwin Machanavajjhala Johannes Gehrke Carl Lagoze Jayavel Shanmugasundaram Deartment of Comuter Science, Cornell University
More informationCSE 120 Principles of Operating Systems
CSE 120 Principles of Operating Systems Spring 2018 Lecture 15: Multicore Geoffrey M. Voelker Multicore Operating Systems We have generally discussed operating systems concepts independent of the number
More informationMultiprocessor and Real- Time Scheduling. Chapter 10
Multiprocessor and Real- Time Scheduling Chapter 10 Classifications of Multiprocessor Loosely coupled multiprocessor each processor has its own memory and I/O channels Functionally specialized processors
More informationLecture notes for CS Chapter 4 11/27/18
Chapter 5: Thread-Level arallelism art 1 Introduction What is a parallel or multiprocessor system? Why parallel architecture? erformance potential Flynn classification Communication models Architectures
More informationRandomized algorithms: Two examples and Yao s Minimax Principle
Randomized algorithms: Two examles and Yao s Minimax Princile Maximum Satisfiability Consider the roblem Maximum Satisfiability (MAX-SAT). Bring your knowledge u-to-date on the Satisfiability roblem. Maximum
More informationOn-Chip Interconnect Implications of Shared Memory Multicores
On-Chi Interconnect Ilications of Shared Meory Multicores Srini Devadas Couter Science and Artificial Intelligence Laboratory (CSAIL) Massachusetts Institute of Technology 1 Prograing 1000 cores MPI has
More informationControl plane and data plane. Computing systems now. Glacial process of innovation made worse by standards process. Computing systems once upon a time
Classical work Architecture A A A Intro to SDN A A Oerating A Secialized Packet A A Oerating Secialized Packet A A A Oerating A Secialized Packet A A Oerating A Secialized Packet Oerating Secialized Packet
More informationDesign Trade-offs in Customized On-chip Crossbar Schedulers
J Sign Process Syst () 8:9 8 DOI.7/s-8--x Design Trade-offs in Customized On-chi Crossbar Schedulers Jae Young Hur Stehan Wong Todor Stefanov Received: October 7 / Revised: June 8 / cceted: ugust 8 / Published
More information521493S Computer Graphics Exercise 3 (Chapters 6-8)
521493S Comuter Grahics Exercise 3 (Chaters 6-8) 1 Most grahics systems and APIs use the simle lighting and reflection models that we introduced for olygon rendering Describe the ways in which each of
More informationA Study of Protocols for Low-Latency Video Transport over the Internet
A Study of Protocols for Low-Latency Video Transort over the Internet Ciro A. Noronha, Ph.D. Cobalt Digital Santa Clara, CA ciro.noronha@cobaltdigital.com Juliana W. Noronha University of California, Davis
More informationCS2 Algorithms and Data Structures Note 8
CS2 Algorithms and Data Structures Note 8 Heasort and Quicksort We will see two more sorting algorithms in this lecture. The first, heasort, is very nice theoretically. It sorts an array with n items in
More informationImplementing Locks. Nima Honarmand (Based on slides by Prof. Andrea Arpaci-Dusseau)
Implementing Locks Nima Honarmand (Based on slides by Prof. Andrea Arpaci-Dusseau) Lock Implementation Goals We evaluate lock implementations along following lines Correctness Mutual exclusion: only one
More information1.5 Case Study. dynamic connectivity quick find quick union improvements applications
. Case Study dynamic connectivity quick find quick union imrovements alications Subtext of today s lecture (and this course) Stes to develoing a usable algorithm. Model the roblem. Find an algorithm to
More informationCS2 Algorithms and Data Structures Note 8
CS2 Algorithms and Data Structures Note 8 Heasort and Quicksort We will see two more sorting algorithms in this lecture. The first, heasort, is very nice theoretically. It sorts an array with n items in
More informationDirected File Transfer Scheduling
Directed File Transfer Scheduling Weizhen Mao Deartment of Comuter Science The College of William and Mary Williamsburg, Virginia 387-8795 wm@cs.wm.edu Abstract The file transfer scheduling roblem was
More informationSensitivity Analysis for an Optimal Routing Policy in an Ad Hoc Wireless Network
1 Sensitivity Analysis for an Otimal Routing Policy in an Ad Hoc Wireless Network Tara Javidi and Demosthenis Teneketzis Deartment of Electrical Engineering and Comuter Science University of Michigan Ann
More informationEfficient Processing of Top-k Dominating Queries on Multi-Dimensional Data
Efficient Processing of To-k Dominating Queries on Multi-Dimensional Data Man Lung Yiu Deartment of Comuter Science Aalborg University DK-922 Aalborg, Denmark mly@cs.aau.dk Nikos Mamoulis Deartment of
More informationSpace-efficient Region Filling in Raster Graphics
"The Visual Comuter: An International Journal of Comuter Grahics" (submitted July 13, 1992; revised December 7, 1992; acceted in Aril 16, 1993) Sace-efficient Region Filling in Raster Grahics Dominik Henrich
More informationComplexity Issues on Designing Tridiagonal Solvers on 2-Dimensional Mesh Interconnection Networks
Journal of Comuting and Information Technology - CIT 8, 2000, 1, 1 12 1 Comlexity Issues on Designing Tridiagonal Solvers on 2-Dimensional Mesh Interconnection Networks Eunice E. Santos Deartment of Electrical
More informationPROCESS SCHEDULING II. CS124 Operating Systems Fall , Lecture 13
PROCESS SCHEDULING II CS124 Operating Systems Fall 2017-2018, Lecture 13 2 Real-Time Systems Increasingly common to have systems with real-time scheduling requirements Real-time systems are driven by specific
More informationSFO The Linux Kernel Scheduler. Viresh Kumar (PMWG)
SFO17-421 The Linux Kernel Scheduler Viresh Kumar (PMWG) Topics CPU Scheduler The O(1) scheduler Current scheduler design Scheduling classes schedule() Scheduling classes and policies Sched class: STOP
More informationMidterm Exam. October 20th, Thursday NSC
CSE 421/521 - Operating Systems Fall 2011 Lecture - XIV Midterm Review Tevfik Koşar University at Buffalo October 18 th, 2011 1 Midterm Exam October 20th, Thursday 9:30am-10:50am @215 NSC Chapters included
More informationEquality-Based Translation Validator for LLVM
Equality-Based Translation Validator for LLVM Michael Ste, Ross Tate, and Sorin Lerner University of California, San Diego {mste,rtate,lerner@cs.ucsd.edu Abstract. We udated our Peggy tool, reviously resented
More informationModels for Advancing PRAM and Other Algorithms into Parallel Programs for a PRAM-On-Chip Platform
Models for Advancing PRAM and Other Algorithms into Parallel Programs for a PRAM-On-Chi Platform Uzi Vishkin George C. Caragea Bryant Lee Aril 2006 University of Maryland, College Park, MD 20740 UMIACS-TR
More informationConcurrency: Locks. Announcements
CS 537 Introduction to Operating Systems UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department Concurrency: Locks Andrea C. Arpaci-Dusseau Remzi H. Arpaci-Dusseau Questions answered in this lecture:
More information10 File System Mass Storage Structure Mass Storage Systems Mass Storage Structure Mass Storage Structure FILE SYSTEM 1
10 File System 1 We will examine this chater in three subtitles: Mass Storage Systems OERATING SYSTEMS FILE SYSTEM 1 File System Interface File System Imlementation 10.1.1 Mass Storage Structure 3 2 10.1
More informationOperating Systems. Process scheduling. Thomas Ropars.
1 Operating Systems Process scheduling Thomas Ropars thomas.ropars@univ-grenoble-alpes.fr 2018 References The content of these lectures is inspired by: The lecture notes of Renaud Lachaize. The lecture
More informationSimple example. Analysis of programs with pointers. Points-to relation. Program model. Points-to graph. Ordering on points-to relation
Simle eamle Analsis of rograms with ointers := 5 tr := @ *tr := 9 := rogram S1 S2 S3 S4 deendences What are the deendences in this rogram? Problem: just looking at variable names will not give ou the correct
More informationRandomized Selection on the Hypercube 1
Randomized Selection on the Hyercube 1 Sanguthevar Rajasekaran Det. of Com. and Info. Science and Engg. University of Florida Gainesville, FL 32611 ABSTRACT In this aer we resent randomized algorithms
More informationA common scenario... Most of us have probably been here. Where did my performance go? It disappeared into overheads...
OPENMP PERFORMANCE 2 A common scenario... So I wrote my OpenMP program, and I checked it gave the right answers, so I ran some timing tests, and the speedup was, well, a bit disappointing really. Now what?.
More informationLecture 2: Fixed-Radius Near Neighbors and Geometric Basics
structure arises in many alications of geometry. The dual structure, called a Delaunay triangulation also has many interesting roerties. Figure 3: Voronoi diagram and Delaunay triangulation. Search: Geometric
More informationTime and Coordination in Distributed Systems. Operating Systems
Time and Coordination in Distributed Systems Oerating Systems Clock Synchronization Physical clocks drift, therefore need for clock synchronization algorithms Many algorithms deend uon clock synchronization
More information10. Parallel Methods for Data Sorting
10. Parallel Methods for Data Sorting 10. Parallel Methods for Data Sorting... 1 10.1. Parallelizing Princiles... 10.. Scaling Parallel Comutations... 10.3. Bubble Sort...3 10.3.1. Sequential Algorithm...3
More informationPrivacy Preserving Moving KNN Queries
Privacy Preserving Moving KNN Queries arxiv:4.76v [cs.db] 4 Ar Tanzima Hashem Lars Kulik Rui Zhang National ICT Australia, Deartment of Comuter Science and Software Engineering University of Melbourne,
More informationTimers 1 / 46. Jiffies. Potent and Evil Magic
Timers 1 / 46 Jiffies Each timer tick, a variable called jiffies is incremented It is thus (roughly) the number of HZ since system boot A 32-bit counter incremented at 1000 Hz wraps around in about 50
More informationMitigating the Impact of Decompression Latency in L1 Compressed Data Caches via Prefetching
Mitigating the Imact of Decomression Latency in L1 Comressed Data Caches via Prefetching by Sean Rea A thesis resented to Lakehead University in artial fulfillment of the requirement for the degree of
More informationCS370 Operating Systems
CS370 Operating Systems Colorado State University Yashwant K Malaiya Spring 2019 Lecture 8 Scheduling Slides based on Text by Silberschatz, Galvin, Gagne Various sources 1 1 FAQ POSIX: Portable Operating
More informationUsing Permuted States and Validated Simulation to Analyze Conflict Rates in Optimistic Replication
Using Permuted States and Validated Simulation to Analyze Conflict Rates in Otimistic Relication An-I A. Wang Comuter Science Deartment Florida State University Geoff H. Kuenning Comuter Science Deartment
More informationCS370 Operating Systems
CS370 Operating Systems Colorado State University Yashwant K Malaiya Fall 2017 Lecture 10 Slides based on Text by Silberschatz, Galvin, Gagne Various sources 1 1 Chapter 6: CPU Scheduling Basic Concepts
More informationLinear Data Structure Linked List
. Definition. Reresenting List in C. Imlementing the oerations a. Inserting a node b. Deleting a node c. List Traversal. Linked imlementation of Stack 5. Linked imlementation of Queue 6. Circular List
More informationCLIC CLient-Informed Caching for Storage Servers
CLIC CLient-Informed Caching for Storage Servers Xin Liu Ashraf Aboulnaga Ken Salem Xuhui Li David R. Cheriton School of Comuter Science University of Waterloo February 2009 Two-Tier Caching DBMS storage
More informationStorage Allocation CSE 143. Pointers, Arrays, and Dynamic Storage Allocation. Pointer Variables. Pointers: Review. Pointers and Types
CSE 143 Pointers, Arrays, and Dynamic Storage Allocation [Chater 4,. 148-157, 172-177] Storage Allocation Storage (memory) is a linear array of cells (bytes) Objects of different tyes often reuire differing
More informationExtracting Optimal Paths from Roadmaps for Motion Planning
Extracting Otimal Paths from Roadmas for Motion Planning Jinsuck Kim Roger A. Pearce Nancy M. Amato Deartment of Comuter Science Texas A&M University College Station, TX 843 jinsuckk,ra231,amato @cs.tamu.edu
More informationEC 513 Computer Architecture
EC 513 Computer Architecture Cache Coherence - Snoopy Cache Coherence rof. Michel A. Kinsy Consistency in SMs CU-1 CU-2 A 100 Cache-1 A 100 Cache-2 CU- bus A 100 Consistency in SMs CU-1 CU-2 A 200 Cache-1
More informationConcurrency: Mutual Exclusion (Locks)
Concurrency: Mutual Exclusion (Locks) Questions Answered in this Lecture: What are locks and how do we implement them? How do we use hardware primitives (atomics) to support efficient locks? How do we
More informationCPU Scheduling. Operating Systems (Fall/Winter 2018) Yajin Zhou ( Zhejiang University
Operating Systems (Fall/Winter 2018) CPU Scheduling Yajin Zhou (http://yajin.org) Zhejiang University Acknowledgement: some pages are based on the slides from Zhi Wang(fsu). Review Motivation to use threads
More informationEE678 Application Presentation Content Based Image Retrieval Using Wavelets
EE678 Alication Presentation Content Based Image Retrieval Using Wavelets Grou Members: Megha Pandey megha@ee. iitb.ac.in 02d07006 Gaurav Boob gb@ee.iitb.ac.in 02d07008 Abstract: We focus here on an effective
More informationRESEARCH ARTICLE. Simple Memory Machine Models for GPUs
The International Journal of Parallel, Emergent and Distributed Systems Vol. 00, No. 00, Month 2011, 1 22 RESEARCH ARTICLE Simle Memory Machine Models for GPUs Koji Nakano a a Deartment of Information
More informationContinuous Visible k Nearest Neighbor Query on Moving Objects
Continuous Visible k Nearest Neighbor Query on Moving Objects Yaniu Wang a, Rui Zhang b, Chuanfei Xu a, Jianzhong Qi b, Yu Gu a, Ge Yu a, a Deartment of Comuter Software and Theory, Northeastern University,
More informationImproved heuristics for the single machine scheduling problem with linear early and quadratic tardy penalties
Imroved heuristics for the single machine scheduling roblem with linear early and quadratic tardy enalties Jorge M. S. Valente* LIAAD INESC Porto LA, Faculdade de Economia, Universidade do Porto Postal
More information12) United States Patent 10) Patent No.: US 6,321,328 B1
USOO6321328B1 12) United States Patent 10) Patent No.: 9 9 Kar et al. (45) Date of Patent: Nov. 20, 2001 (54) PROCESSOR HAVING DATA FOR 5,961,615 10/1999 Zaid... 710/54 SPECULATIVE LOADS 6,006,317 * 12/1999
More informationComparing IS-IS and OSPF
Comaring IS-IS and OSPF ISP Workshos These materials are licensed under the Creative Commons Attribution-NonCommercial 4.0 International license (htt://creativecommons.org/licenses/by-nc/4.0/) Last udated
More informationPower Savings in Embedded Processors through Decode Filter Cache
Power Savings in Embedded Processors through Decode Filter Cache Weiyu Tang Rajesh Guta Alexandru Nicolau Deartment of Information and Comuter Science University of California, Irvine Irvine, CA 92697-3425
More informationA BICRITERION STEINER TREE PROBLEM ON GRAPH. Mirko VUJO[EVI], Milan STANOJEVI] 1. INTRODUCTION
Yugoslav Journal of Oerations Research (00), umber, 5- A BICRITERIO STEIER TREE PROBLEM O GRAPH Mirko VUJO[EVI], Milan STAOJEVI] Laboratory for Oerational Research, Faculty of Organizational Sciences University
More informationA Metaheuristic Scheduler for Time Division Multiplexed Network-on-Chip
Downloaded from orbit.dtu.dk on: Jan 25, 2019 A Metaheuristic Scheduler for Time Division Multilexed Network-on-Chi Sørensen, Rasmus Bo; Sarsø, Jens; Pedersen, Mark Ruvald; Højgaard, Jasur Publication
More informationò mm_struct represents an address space in kernel ò task represents a thread in the kernel ò A task points to 0 or 1 mm_structs
Last time We went through the high-level theory of scheduling algorithms Scheduling Today: View into how Linux makes its scheduling decisions Don Porter CSE 306 Lecture goals Understand low-level building
More informationCache Coherence in Scalable Machines
ache oherence in Scalable Machines SE 661 arallel and Vector Architectures rof. Muhamed Mudawar omputer Engineering Department King Fahd University of etroleum and Minerals Generic Scalable Multiprocessor
More informationScheduling. Don Porter CSE 306
Scheduling Don Porter CSE 306 Last time ò We went through the high-level theory of scheduling algorithms ò Today: View into how Linux makes its scheduling decisions Lecture goals ò Understand low-level
More informationW4118: advanced scheduling
W4118: advanced scheduling Instructor: Junfeng Yang References: Modern Operating Systems (3 rd edition), Operating Systems Concepts (8 th edition), previous W4118, and OS at MIT, Stanford, and UWisc Outline
More informationDistributed Estimation from Relative Measurements in Sensor Networks
Distributed Estimation from Relative Measurements in Sensor Networks #Prabir Barooah and João P. Hesanha Abstract We consider the roblem of estimating vectorvalued variables from noisy relative measurements.
More informationPREDICTING LINKS IN LARGE COAUTHORSHIP NETWORKS
PREDICTING LINKS IN LARGE COAUTHORSHIP NETWORKS Kevin Miller, Vivian Lin, and Rui Zhang Grou ID: 5 1. INTRODUCTION The roblem we are trying to solve is redicting future links or recovering missing links
More informationThe R-LRPD Test: Speculative Parallelization of Partially Parallel Loops
The R-LRPD Test: Seculative Parallelization of Partially Parallel Loos Francis Dang, Hao Yu, Lawrence Rauchwerger Det. of Comuter Science, Texas A&M University College Station, TX 778- {fhd,hy89,rwerger}@cs.tamu.edu
More informationFrequently asked questions from the previous class survey
CS 370: OPERATING SYSTEMS [CPU SCHEDULING] Shrideep Pallickara Computer Science Colorado State University L15.1 Frequently asked questions from the previous class survey Could we record burst times in
More informationBasic Types and Arrays. Pointers. Records and Pointers. Record Definition. Creating a Record. Pointer. Basic Types. Arrays
Basic Tyes and Arrays ointers SE 6 Data Structures Lecture Basic Tyes integer, real (floating oint), boolean (0,), character Arrays A[099] : integer array A 0 6 7 99 A[] /9/0 ointers and Lists- Lecture
More informationImproving Parallel-Disk Buffer Management using Randomized Writeback æ
Aears in Proceedings of ICPP 98 1 Imroving Parallel-Disk Buffer Management using Randomized Writeback æ Mahesh Kallahalla Peter J Varman Detartment of Electrical and Comuter Engineering Rice University
More information