Computer and Machine Vision

Size: px
Start display at page:

Download "Computer and Machine Vision"

Transcription

1 Computer and Machine Vision Lecture Week 3 Part-2 January 27, 2014 Sam Siewert

2 Resource Scaling Processing Co-Processors GPU CUDA, OpenCL Many-Core E.g. Intel Xeon Phi MICA FPGA E.g. Altera Stratix Ideally Camera Interface I/O High Rate Transport HD-SDI, Camera Link, GigE/10GE Memory SSD, PCIe Nand, NVM FusionIO, Micron, Intel Memristor (Future) Sam Siewert 2

3 SIMD Vector Instructions Intel MMX, SSE 1, 2, 3, 4.x Code Generation Using SIMD Extensions to Accelerate Algorithms (Edge Enhancement) PSF Sam Siewert 3

4 Offload, Co-Proc, Vector Proc 1. GPU (Graphics Processing Units) Evolved for Consumer CGI and Games Physics Engines 3D Rendering + Texture (4D Vector Operations) Game Engines and Simulation HD Output: HDMI, HD-SDI, Headless GP-GPU Higher End Used for Digital Cinema / Post Production, Broadcast PNY Quadro FX NVIDIA CUDA for Post GP-GPU Being Used to Accelerate Encode, Transcode, Trans-rate, etc Built-In SIMD Instruction Set Extensions Intel SSE Sam Siewert 4

5 GP-GPU, What Is It? Ideal for Large Bitwise, Integer, and Floating Point Vector Math Flynn s Taxonomy SIMD Architecture often leverages GP-GPU Co- Processors or Cell for MPMD Single Data Multiple Data Single Instruction/Prog Multiple Instruction SISD (Traditional Uniprocessor) SIMD (SSE 4.2, Vector Processing) SPMD (Single Program 5 Multiple Data), GP-GPU MISD (Voting schemes and active-active controllers) MIMD (Distributed systems (MPMD), Clusters with MPI/PVM (SPMD), AMP/SMP)

6 SSE Streaming SIMD Extensions 128-bit registers known as XMM0 through XMM7 Large Operands and Operators (Multi-Word) E.g. 128-bit XOR of Two Operands Multiple Multiply and Accumulate Operations for Floating Point (DSP Kernel Operations) E.g. 4 Component Vector addition 4 Single Precision Pixel Multiply and Accumulate in Single Instruction vec_res.x = v1.x + v2.x; vec_res.y = v1.y + v2.y; vec_res.z = v1.z + v2.z; vec_res.w = v1.w + v2.w; movaps xmm0,address-of-v1 addps xmm0,address-of-v2 movaps address-of-vec_res,xmm0 16 operations to load 2 operands, add, store 3 SSE operations to load, add, store ;xmm0=v1.w v1.z v1.y v1.x ;xmm0=v1.w+v2.w v1.z+v2.z v1.y+v2.y v1.x+v2.x Sam Siewert 6

7 Scheduling Parallel/Cluster HW MIMD OS SMP threading, provides load balancing, affinity operations, routable interrupts (e.g. MSI- X), e.g. NPTL RTOS AMP is most often used in Embedded Systems MPMD OpenCL, CUDA, DirectCompute (DirectX extension) Intel OpenMP, Linux Cluster, MPI Sam Siewert 7

8 How Does NPTL Work? No Thread Manager or M-on-N Mapping Previous POSIX Threading Model Manager Becomes Bottleneck Two-Level Scheduling Not Deterministic Many Pthreads (M) to N Kernel Threads Still an Issue O(n) Scheduling for each Manager Direct Mapping of User to Kernel Thread or 1-to-1 User Space Pthread Maps Directly onto Kernel Thread (Requires Root privilege) Deterministic (Non-Determinism due to Kernel Preemptability Issues) O(1) Scheduling Scheduling Policies Selectable Similar to RTOS Tasking Sam Siewert 8

9 Linux NPTL Scheduling Policies Fixed Priority Preemptive SCHED_FIFO This is Priority Preemptive SCHED_RR This is Fair, but at Kernel Level SCHED_OTHER This is OS default and should not be used POSIX Threads have Policy (FIFO, RR, OTHER) Priority (RT min to RT max) Creation (Fork) Join (Wait for thread completion at rendezvous) Synchronization Methods Semaphores Message Queues Asynchronous Communication Methods Signals Queued Signals POSIX RT Extensions Include Virtual Timer Services Signals Tied to Timer Services Priority Inversion Protection (Availability on Linux TBD) Sam Siewert 9

10 NPTL Coding Code Walk-through July 7, 2004 Sam Siewert

11 Thread Scheduling Policy pthread_attr_init(&rt_sched_attr); pthread_attr_setinheritsched(&rt_sched_attr, PTHREAD_EXPLICIT_SCHED); pthread_attr_setschedpolicy(&rt_sched_attr, SCHED_FIFO); rt_max_prio = sched_get_priority_max(sched_fifo); rt_min_prio = sched_get_priority_min(sched_fifo); rt_param.sched_priority = rt_max_prio-1; rc=sched_setscheduler(getpid(), SCHED_FIFO, &rt_param); pthread_attr_getscope(&rt_sched_attr, &scope); if(scope == PTHREAD_SCOPE_SYSTEM) printf("pthread SCOPE SYSTEM\n"); else if (scope == PTHREAD_SCOPE_PROCESS) printf("pthread SCOPE PROCESS\n"); else printf("pthread SCOPE UNKNOWN\n"); Sam Siewert 11

12 Thread Creation and Join rc = pthread_create(&main_thread, &main_sched_attr, testthread, (void *)0); if (rc) { printf("error; pthread_create() rc is %d\n", rc); perror(null); exit(-1); } pthread_join(main_thread, NULL); if(pthread_attr_destroy(&rt_sched_attr)!= 0) perror("attr destroy"); Sam Siewert 12

13 Issues Beyond Policy and Feasibility Throughput Latency How do they Differ? E.g. Frame Rate vs. Time to First Frame Sam Siewert 13

14 Digital Video (Quick Reminders) Sam Siewert 14

15 Simple Encode/Decode is Processing Intensive GPU Co-Processors Can Offload CPU Example with Mplayer and VDPAU (Video Decode and Presentation Acceleration Unit) for Linux Core Loading with Mplayer VDPAU MPEG Decode (Load balancing and offload) Dual-Core SW Decode (Load balancing) Sam Siewert 15

16 Discussion What Does Eye See? Ewald Hering (1872), Opponent Colors (R/G, Y/B) Color Models RGB Cube HSV - Hue/Saturation/Value Hue Similarity to R, G, Y, B Saturation Color vs. Brightness Value Low=Black, High=Color RGB Cube Red and Green Opponent Colors Can t See Both Simultaneously Yellow and Blue Opponent Colors HSV Cylinder Luminance (Candela/Square-Meter) Light Passing Through Area Forming a Solid Angle in A Direction Candela (Photonic Power )= Watts/Steradian More Precise than Brightness Chrominance ( CrCb or UV in YCrCb or YUV) U=Blue Luminance (Y) V=Red - Luminance (Y) Wavelength Spectrum - ROYGBIV Sam Siewert 16

17 Summary of Capture Bayer Pixel RGB Sampled in 4:3 or 16:9 or other AR Frame Array G R B G Graymap is Green, or Y alone in YCrCb Eye Integrates the Color-bands to Perceive Color (any band alone appears gray) Frames are Processed with CV/MV CV/MV Processing of Frames Over Time Distinguishes from Image Processing Interactive or Real-Time Sam Siewert 17

18 Frame Analysis and Image Processing Resources for Raw Frame Data GNU Image Processing Single Frame Analysis and Transforms Octave Similar to MATLAB Irfanview Simple Viewer includes PPM OpenCV (C/C++ and Python API) Single Frame Viewing and Analysis Image Processing Libraries Sam Siewert 18

19 Practice with Linux GIMP PPM and JPEG Frame Analysis FFMPEG MPEG-4 DV to Frames Sobel Image Transformation Real-Time Sobel or Canny Image Transformation Batch Mode FFMPEG Re-encoding Sam Siewert 19

A320 Supplemental Multi-Core Materials

A320 Supplemental Multi-Core Materials A320 Supplemental Multi-Core Materials Scaling for Data-centric Computing (Overview for OS) April 18, 2013 Sam Siewert Scaling Processors and Processing Distributed Systems Networked Machines, Map Reduce

More information

CEC 450 Real-Time Systems

CEC 450 Real-Time Systems CEC 450 Real-Time Systems Lecture 2 Introduction Part 1 August 31, 2015 Sam Siewert So Why SW for HRT Systems? ASIC and FPGA State-Machine Solutions Offer Hardware Clocked Deterministic Solutions FPGAs

More information

Review of Last Lecture. CS 61C: Great Ideas in Computer Architecture. The Flynn Taxonomy, Intel SIMD Instructions. Great Idea #4: Parallelism.

Review of Last Lecture. CS 61C: Great Ideas in Computer Architecture. The Flynn Taxonomy, Intel SIMD Instructions. Great Idea #4: Parallelism. CS 61C: Great Ideas in Computer Architecture The Flynn Taxonomy, Intel SIMD Instructions Instructor: Justin Hsia 1 Review of Last Lecture Amdahl s Law limits benefits of parallelization Request Level Parallelism

More information

CS 61C: Great Ideas in Computer Architecture. The Flynn Taxonomy, Intel SIMD Instructions

CS 61C: Great Ideas in Computer Architecture. The Flynn Taxonomy, Intel SIMD Instructions CS 61C: Great Ideas in Computer Architecture The Flynn Taxonomy, Intel SIMD Instructions Instructor: Justin Hsia 3/08/2013 Spring 2013 Lecture #19 1 Review of Last Lecture Amdahl s Law limits benefits

More information

CS 61C: Great Ideas in Computer Architecture. The Flynn Taxonomy, Intel SIMD Instructions

CS 61C: Great Ideas in Computer Architecture. The Flynn Taxonomy, Intel SIMD Instructions CS 61C: Great Ideas in Computer Architecture The Flynn Taxonomy, Intel SIMD Instructions Guest Lecturer: Alan Christopher 3/08/2014 Spring 2014 -- Lecture #19 1 Neuromorphic Chips Researchers at IBM and

More information

CS A331 Programming Language Concepts

CS A331 Programming Language Concepts CS A331 Programming Language Concepts Lecture 12 Alternative Language Examples (General Concurrency Issues and Concepts) March 30, 2014 Sam Siewert Major Concepts Concurrent Processing Processes, Tasks,

More information

CEC 450 Real-Time Systems

CEC 450 Real-Time Systems CEC 450 Real-Time Systems Lecture 2 Introduction to Scheduling of RT Services Part 1 September 4, 2018 Sam Siewert So Why SW for HRT Systems? ASIC and FPGA State-Machine Solutions Offer Hardware Clocked

More information

Flynn Taxonomy Data-Level Parallelism

Flynn Taxonomy Data-Level Parallelism ecture 27 Computer Science 61C Spring 2017 March 22nd, 2017 Flynn Taxonomy Data-Level Parallelism 1 New-School Machine Structures (It s a bit more complicated!) Software Hardware Parallel Requests Assigned

More information

Review: Parallelism - The Challenge. Agenda 7/14/2011

Review: Parallelism - The Challenge. Agenda 7/14/2011 7/14/2011 CS 61C: Great Ideas in Computer Architecture (Machine Structures) The Flynn Taxonomy, Data Level Parallelism Instructor: Michael Greenbaum Summer 2011 -- Lecture #15 1 Review: Parallelism - The

More information

Parallelism and Performance Instructor: Steven Ho

Parallelism and Performance Instructor: Steven Ho Parallelism and Performance Instructor: Steven Ho Review of Last Lecture Cache Performance AMAT = HT + MR MP 2 Multilevel Cache Diagram Main Memory Legend: Request for data Return of data CPU L1$ Memory

More information

CEC 450 Real-Time Systems

CEC 450 Real-Time Systems CEC 450 Real-Time Systems Lecture 2 Introduction to Scheduling of RT Services Part 1 September 2, 2017 Sam Siewert So Why SW for HRT Systems? ASIC and FPGA State-Machine Solutions Offer Hardware Clocked

More information

Agenda. Review. Costs of Set-Associative Caches

Agenda. Review. Costs of Set-Associative Caches Agenda CS 61C: Great Ideas in Computer Architecture (Machine Structures) Technology Trends and Data Level Instructor: Michael Greenbaum Caches - Replacement Policies, Review Administrivia Course Halfway

More information

CS A320 Operating Systems for Engineers

CS A320 Operating Systems for Engineers CS A320 Operating Systems for Engineers Lecture 4 Conclusion of MOS Chapter 2 September 18, 2013 Sam Siewert Many Ways to Schedule a CPU Core We ve Come a Long way Since Batch Scheduling Sam Siewert 2

More information

The Flynn Taxonomy, Intel SIMD Instruc(ons Miki Lus(g and Dan Garcia

The Flynn Taxonomy, Intel SIMD Instruc(ons Miki Lus(g and Dan Garcia CS 61C: Great Ideas in Computer Architecture The Flynn Taxonomy, Intel SIMD Instruc(ons Miki Lus(g and Dan Garcia 1 Nostalgia DALLAS (Aug. 7, 2001) - - Leveraging programmable digital signal processors

More information

ME964 High Performance Computing for Engineering Applications

ME964 High Performance Computing for Engineering Applications ME964 High Performance Computing for Engineering Applications Execution Scheduling in CUDA Revisiting Memory Issues in CUDA February 17, 2011 Dan Negrut, 2011 ME964 UW-Madison Computers are useless. They

More information

! Readings! ! Room-level, on-chip! vs.!

! Readings! ! Room-level, on-chip! vs.! 1! 2! Suggested Readings!! Readings!! H&P: Chapter 7 especially 7.1-7.8!! (Over next 2 weeks)!! Introduction to Parallel Computing!! https://computing.llnl.gov/tutorials/parallel_comp/!! POSIX Threads

More information

Parallel Processors. The dream of computer architects since 1950s: replicate processors to add performance vs. design a faster processor

Parallel Processors. The dream of computer architects since 1950s: replicate processors to add performance vs. design a faster processor Multiprocessing Parallel Computers Definition: A parallel computer is a collection of processing elements that cooperate and communicate to solve large problems fast. Almasi and Gottlieb, Highly Parallel

More information

CS 61C: Great Ideas in Computer Architecture

CS 61C: Great Ideas in Computer Architecture CS 61C: Great Ideas in Computer Architecture Flynn Taxonomy, Data-level Parallelism Instructors: Vladimir Stojanovic & Nicholas Weaver http://inst.eecs.berkeley.edu/~cs61c/ 1 New-School Machine Structures

More information

Lecture Topics. Announcements. Today: Advanced Scheduling (Stallings, chapter ) Next: Deadlock (Stallings, chapter

Lecture Topics. Announcements. Today: Advanced Scheduling (Stallings, chapter ) Next: Deadlock (Stallings, chapter Lecture Topics Today: Advanced Scheduling (Stallings, chapter 10.1-10.4) Next: Deadlock (Stallings, chapter 6.1-6.6) 1 Announcements Exam #2 returned today Self-Study Exercise #10 Project #8 (due 11/16)

More information

Using SSE and IPP to Accelerate Algorithms

Using SSE and IPP to Accelerate Algorithms Using SSE and IPP to Accelerate Algorithms By Sam Siewert Algorithm Acceleration Using SIMD Computing architecture can be described at the highest level using Flynn s architecture classification scheme

More information

Chapter 5: CPU Scheduling. Operating System Concepts 8 th Edition,

Chapter 5: CPU Scheduling. Operating System Concepts 8 th Edition, Chapter 5: CPU Scheduling Operating System Concepts 8 th Edition, Hanbat National Univ. Computer Eng. Dept. Y.J.Kim 2009 Chapter 5: Process Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms

More information

CS 61C: Great Ideas in Computer Architecture. The Flynn Taxonomy, Data Level Parallelism

CS 61C: Great Ideas in Computer Architecture. The Flynn Taxonomy, Data Level Parallelism CS 61C: Great Ideas in Computer Architecture The Flynn Taxonomy, Data Level Parallelism Instructor: Justin Hsia 7/11/2012 Summer 2012 Lecture #14 1 Review of Last Lecture Performance programming When possible,

More information

SE300 SWE Practices. Lecture 10 Introduction to Event- Driven Architectures. Tuesday, March 17, Sam Siewert

SE300 SWE Practices. Lecture 10 Introduction to Event- Driven Architectures. Tuesday, March 17, Sam Siewert SE300 SWE Practices Lecture 10 Introduction to Event- Driven Architectures Tuesday, March 17, 2015 Sam Siewert Copyright {c} 2014 by the McGraw-Hill Companies, Inc. All rights Reserved. Four Common Types

More information

Chapter 5: CPU Scheduling

Chapter 5: CPU Scheduling COP 4610: Introduction to Operating Systems (Fall 2016) Chapter 5: CPU Scheduling Zhi Wang Florida State University Contents Basic concepts Scheduling criteria Scheduling algorithms Thread scheduling Multiple-processor

More information

ME964 High Performance Computing for Engineering Applications

ME964 High Performance Computing for Engineering Applications ME964 High Performance Computing for Engineering Applications Memory Issues in CUDA Execution Scheduling in CUDA February 23, 2012 Dan Negrut, 2012 ME964 UW-Madison Computers are useless. They can only

More information

Introduction to Parallel and Distributed Computing. Linh B. Ngo CPSC 3620

Introduction to Parallel and Distributed Computing. Linh B. Ngo CPSC 3620 Introduction to Parallel and Distributed Computing Linh B. Ngo CPSC 3620 Overview: What is Parallel Computing To be run using multiple processors A problem is broken into discrete parts that can be solved

More information

PROGRAMOVÁNÍ V C++ CVIČENÍ. Michal Brabec

PROGRAMOVÁNÍ V C++ CVIČENÍ. Michal Brabec PROGRAMOVÁNÍ V C++ CVIČENÍ Michal Brabec PARALLELISM CATEGORIES CPU? SSE Multiprocessor SIMT - GPU 2 / 17 PARALLELISM V C++ Weak support in the language itself, powerful libraries Many different parallelization

More information

Computer and Machine Vision

Computer and Machine Vision Computer and Machine Vision Lecture Week 5 Part-2 February 13, 2014 Sam Siewert Outline of Week 5 Background on 2D and 3D Geometric Transformations Chapter 2 of CV Fundamentals of 2D Image Transformations

More information

Chapter 6: CPU Scheduling

Chapter 6: CPU Scheduling Chapter 6: CPU Scheduling Silberschatz, Galvin and Gagne Histogram of CPU-burst Times 6.2 Silberschatz, Galvin and Gagne Alternating Sequence of CPU And I/O Bursts 6.3 Silberschatz, Galvin and Gagne CPU

More information

CPU Scheduling. Operating Systems (Fall/Winter 2018) Yajin Zhou ( Zhejiang University

CPU Scheduling. Operating Systems (Fall/Winter 2018) Yajin Zhou (  Zhejiang University Operating Systems (Fall/Winter 2018) CPU Scheduling Yajin Zhou (http://yajin.org) Zhejiang University Acknowledgement: some pages are based on the slides from Zhi Wang(fsu). Review Motivation to use threads

More information

Lecture 25: Board Notes: Threads and GPUs

Lecture 25: Board Notes: Threads and GPUs Lecture 25: Board Notes: Threads and GPUs Announcements: - Reminder: HW 7 due today - Reminder: Submit project idea via (plain text) email by 11/24 Recap: - Slide 4: Lecture 23: Introduction to Parallel

More information

Shared-Memory Hardware

Shared-Memory Hardware Shared-Memory Hardware Parallel Programming Concepts Winter Term 2013 / 2014 Dr. Peter Tröger, M.Sc. Frank Feinbube Shared-Memory Hardware Hardware architecture: Processor(s), memory system(s), data path(s)

More information

CS 61C: Great Ideas in Computer Architecture. The Flynn Taxonomy, Data Level Parallelism

CS 61C: Great Ideas in Computer Architecture. The Flynn Taxonomy, Data Level Parallelism CS 61C: Great Ideas in Computer Architecture The Flynn Taxonomy, Data Level Parallelism Instructor: Alan Christopher 7/16/2014 Summer 2014 -- Lecture #14 1 Review of Last Lecture Performance programming

More information

CS A490 Machine Vision and Computer Graphics

CS A490 Machine Vision and Computer Graphics CS A490 Machine Vision and Computer Graphics Lecture 1 - Introduction August 28, 2012 Sam Siewert Sam Siewert UC Berkeley National Research University, Philosophy/Physics 1984-85 University of Notre Dame,

More information

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing

INSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing UNIVERSIDADE TÉCNICA DE LISBOA INSTITUTO SUPERIOR TÉCNICO Departamento de Engenharia Informática Architectures for Embedded Computing MEIC-A, MEIC-T, MERC Lecture Slides Version 3.0 - English Lecture 12

More information

RT Digital Media Extended Lab Choose One: Work Alone Work in a Pair Extra Work Required 3 or More Students May Collaborate, but Submissions must be Un

RT Digital Media Extended Lab Choose One: Work Alone Work in a Pair Extra Work Required 3 or More Students May Collaborate, but Submissions must be Un ECEN 5033 RT Digital Media Systems Lecture 10 Extended Lab Background March 31, 2008 Sam Siewert RT Digital Media Extended Lab Choose One: Work Alone Work in a Pair Extra Work Required 3 or More Students

More information

Chapter 5 CPU scheduling

Chapter 5 CPU scheduling Chapter 5 CPU scheduling Contents Basic Concepts Scheduling Criteria Scheduling Algorithms Multiple-Processor Scheduling Real-Time Scheduling Thread Scheduling Operating Systems Examples Java Thread Scheduling

More information

CSCE Introduction to Computer Systems Spring 2019

CSCE Introduction to Computer Systems Spring 2019 CSCE 313-200 Introduction to Computer Systems Spring 2019 Threads Dmitri Loguinov Texas A&M University January 29, 2019 1 Updates Quiz on Thursday System Programming Tutorial (pay attention to exercises)

More information

CS8803SC Software and Hardware Cooperative Computing GPGPU. Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology

CS8803SC Software and Hardware Cooperative Computing GPGPU. Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology CS8803SC Software and Hardware Cooperative Computing GPGPU Prof. Hyesoon Kim School of Computer Science Georgia Institute of Technology Why GPU? A quiet revolution and potential build-up Calculation: 367

More information

Computer and Machine Vision

Computer and Machine Vision Computer and Machine Vision Deeper Dive into MPEG Digital Video Encoding January 22, 2014 Sam Siewert Reminders CV and MV Use UNCOMPRESSED FRAMES Remote Cameras (E.g. Security) May Need to Transport Frames

More information

Job Scheduling. CS170 Fall 2018

Job Scheduling. CS170 Fall 2018 Job Scheduling CS170 Fall 2018 What to Learn? Algorithms of job scheduling, which maximizes CPU utilization obtained with multiprogramming Select from ready processes and allocates the CPU to one of them

More information

CS510 Operating System Foundations. Jonathan Walpole

CS510 Operating System Foundations. Jonathan Walpole CS510 Operating System Foundations Jonathan Walpole The Process Concept 2 The Process Concept Process a program in execution Program - description of how to perform an activity instructions and static

More information

RT extensions/applications of general-purpose OSs

RT extensions/applications of general-purpose OSs EECS 571 Principles of Real-Time Embedded Systems Lecture Note #15: RT extensions/applications of general-purpose OSs General-Purpose OSs for Real-Time Why? (as discussed before) App timing requirements

More information

Computer and Machine Vision

Computer and Machine Vision Computer and Machine Vision Lecture Week 7 Part-1 (Convolution Transform Speed-up and Hough Linear Transform) February 26, 2014 Sam Siewert Outline of Week 7 Basic Convolution Transform Speed-Up Concepts

More information

Chapter 5: Process Scheduling

Chapter 5: Process Scheduling Chapter 5: Process Scheduling Chapter 5: Process Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Thread Scheduling Multiple-Processor Scheduling Real-Time CPU Scheduling Operating Systems

More information

WHY PARALLEL PROCESSING? (CE-401)

WHY PARALLEL PROCESSING? (CE-401) PARALLEL PROCESSING (CE-401) COURSE INFORMATION 2 + 1 credits (60 marks theory, 40 marks lab) Labs introduced for second time in PP history of SSUET Theory marks breakup: Midterm Exam: 15 marks Assignment:

More information

Chapter 5: CPU Scheduling

Chapter 5: CPU Scheduling Chapter 5: CPU Scheduling Chapter 5: CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Thread Scheduling Multiple-Processor Scheduling Operating Systems Examples Algorithm Evaluation

More information

Technology for a better society. hetcomp.com

Technology for a better society. hetcomp.com Technology for a better society hetcomp.com 1 J. Seland, C. Dyken, T. R. Hagen, A. R. Brodtkorb, J. Hjelmervik,E Bjønnes GPU Computing USIT Course Week 16th November 2011 hetcomp.com 2 9:30 10:15 Introduction

More information

CEC 450 Real-Time Systems

CEC 450 Real-Time Systems CEC 450 Real-Time Systems Lecture 7 Review October 9, 2017 Sam Siewert Coming Next Finish Up with Recount of Mars Pathfinder and Unbounded Priority Inversion Mike Jone s Page (Microsoft) Glenn Reeves on

More information

Chapter 5: CPU Scheduling. Operating System Concepts Essentials 8 th Edition

Chapter 5: CPU Scheduling. Operating System Concepts Essentials 8 th Edition Chapter 5: CPU Scheduling Silberschatz, Galvin and Gagne 2011 Chapter 5: CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Thread Scheduling Multiple-Processor Scheduling Operating

More information

CPU Scheduling (Part II)

CPU Scheduling (Part II) CPU Scheduling (Part II) Amir H. Payberah amir@sics.se Amirkabir University of Technology (Tehran Polytechnic) Amir H. Payberah (Tehran Polytechnic) CPU Scheduling 1393/7/28 1 / 58 Motivation Amir H. Payberah

More information

Parallel Computing Introduction

Parallel Computing Introduction Parallel Computing Introduction Bedřich Beneš, Ph.D. Associate Professor Department of Computer Graphics Purdue University von Neumann computer architecture CPU Hard disk Network Bus Memory GPU I/O devices

More information

Computing on GPUs. Prof. Dr. Uli Göhner. DYNAmore GmbH. Stuttgart, Germany

Computing on GPUs. Prof. Dr. Uli Göhner. DYNAmore GmbH. Stuttgart, Germany Computing on GPUs Prof. Dr. Uli Göhner DYNAmore GmbH Stuttgart, Germany Summary: The increasing power of GPUs has led to the intent to transfer computing load from CPUs to GPUs. A first example has been

More information

Lecture notes Lectures 1 through 5 (up through lecture 5 slide 63) Book Chapters 1-4

Lecture notes Lectures 1 through 5 (up through lecture 5 slide 63) Book Chapters 1-4 EE445M Midterm Study Guide (Spring 2017) (updated February 25, 2017): Instructions: Open book and open notes. No calculators or any electronic devices (turn cell phones off). Please be sure that your answers

More information

Independent Study RM Scheduling Feasibility Tests conducted on

Independent Study RM Scheduling Feasibility Tests conducted on RM Scheduling Feasibility Tests conducted on TI DM3730 Processor - 1 GHz ARM Cortex-A8 core with Angstrom and TimeSys Linux ported on to BeagleBoard xm. Nisheeth Bhat Aim: The independent study involved

More information

Parallel design patterns ARCHER course. Vectorisation and active messaging

Parallel design patterns ARCHER course. Vectorisation and active messaging Parallel design patterns ARCHER course Vectorisation and active messaging Reusing this material This work is licensed under a Creative Commons Attribution- NonCommercial-ShareAlike 4.0 International License.

More information

Chapter 5: Process Scheduling

Chapter 5: Process Scheduling Chapter 5: Process Scheduling Chapter 5: Process Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Multiple-Processor Scheduling Thread Scheduling Operating Systems Examples Algorithm

More information

CS307: Operating Systems

CS307: Operating Systems CS307: Operating Systems Chentao Wu 吴晨涛 Associate Professor Dept. of Computer Science and Engineering Shanghai Jiao Tong University SEIEE Building 3-513 wuct@cs.sjtu.edu.cn Download Lectures ftp://public.sjtu.edu.cn

More information

A General Discussion on! Parallelism!

A General Discussion on! Parallelism! Lecture 2! A General Discussion on! Parallelism! John Cavazos! Dept of Computer & Information Sciences! University of Delaware! www.cis.udel.edu/~cavazos/cisc879! Lecture 2: Overview Flynn s Taxonomy of

More information

Color and Shading. Color. Shapiro and Stockman, Chapter 6. Color and Machine Vision. Color and Perception

Color and Shading. Color. Shapiro and Stockman, Chapter 6. Color and Machine Vision. Color and Perception Color and Shading Color Shapiro and Stockman, Chapter 6 Color is an important factor for for human perception for object and material identification, even time of day. Color perception depends upon both

More information

Chapter 6: CPU Scheduling. Operating System Concepts 9 th Edition

Chapter 6: CPU Scheduling. Operating System Concepts 9 th Edition Chapter 6: CPU Scheduling Silberschatz, Galvin and Gagne 2013 Objectives To introduce CPU scheduling, which is the basis for multiprogrammed operating systems To describe various CPU-scheduling algorithms

More information

CS 110 Computer Architecture

CS 110 Computer Architecture CS 110 Computer Architecture Amdahl s Law, Data-level Parallelism Instructor: Sören Schwertfeger http://shtech.org/courses/ca/ School of Information Science and Technology SIST ShanghaiTech University

More information

Chapter 6: CPU Scheduling

Chapter 6: CPU Scheduling Chapter 6: CPU Scheduling Chapter 6: CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Thread Scheduling Multiple-Processor Scheduling Real-Time CPU Scheduling Operating Systems Examples

More information

Chapter 6: CPU Scheduling. Operating System Concepts 9 th Edition

Chapter 6: CPU Scheduling. Operating System Concepts 9 th Edition Chapter 6: CPU Scheduling Silberschatz, Galvin and Gagne 2013 Chapter 6: CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Thread Scheduling Multiple-Processor Scheduling Real-Time

More information

Introduction II. Overview

Introduction II. Overview Introduction II Overview Today we will introduce multicore hardware (we will introduce many-core hardware prior to learning OpenCL) We will also consider the relationship between computer hardware and

More information

CPU Scheduling. Basic Concepts. Histogram of CPU-burst Times. Dispatcher. CPU Scheduler. Alternating Sequence of CPU and I/O Bursts

CPU Scheduling. Basic Concepts. Histogram of CPU-burst Times. Dispatcher. CPU Scheduler. Alternating Sequence of CPU and I/O Bursts CS307 Basic Concepts Maximize CPU utilization obtained with multiprogramming CPU Scheduling CPU I/O Burst Cycle Process execution consists of a cycle of CPU execution and I/O wait CPU burst distribution

More information

Chapter 6: CPU Scheduling

Chapter 6: CPU Scheduling Chapter 6: CPU Scheduling Silberschatz, Galvin and Gagne 2013 Chapter 6: CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Thread Scheduling Multiple-Processor Scheduling Real-Time

More information

CEC 450 Real-Time Systems

CEC 450 Real-Time Systems CEC 450 Real-Time Systems Lecture 6 Accounting for I/O Latency September 28, 2015 Sam Siewert A Service Release and Response C i WCET Input/Output Latency Interference Time Response Time = Time Actuation

More information

Scheduling. Today. Next Time Process interaction & communication

Scheduling. Today. Next Time Process interaction & communication Scheduling Today Introduction to scheduling Classical algorithms Thread scheduling Evaluating scheduling OS example Next Time Process interaction & communication Scheduling Problem Several ready processes

More information

GPU programming. Dr. Bernhard Kainz

GPU programming. Dr. Bernhard Kainz GPU programming Dr. Bernhard Kainz Overview About myself Motivation GPU hardware and system architecture GPU programming languages GPU programming paradigms Pitfalls and best practice Reduction and tiling

More information

Parallel Computing. Hwansoo Han (SKKU)

Parallel Computing. Hwansoo Han (SKKU) Parallel Computing Hwansoo Han (SKKU) Unicore Limitations Performance scaling stopped due to Power consumption Wire delay DRAM latency Limitation in ILP 10000 SPEC CINT2000 2 cores/chip Xeon 3.0GHz Core2duo

More information

SE310 Analysis and Design of Software Systems

SE310 Analysis and Design of Software Systems SE310 Analysis and Design of Software Systems Lecture 9 Review of Event-Driven Architectures March 6, 2018 Sam Siewert Reminders No class on Thursday - use time for Scrum meeting and to work on designs

More information

DLP, Amdahl s Law, Intro to Multi-thread/processor systems Instructor: Nick Riasanovsky

DLP, Amdahl s Law, Intro to Multi-thread/processor systems Instructor: Nick Riasanovsky DLP, Amdahl s Law, Intro to Multi-thread/processor systems Instructor: Nick Riasanovsky Review of Last Lecture Performance measured in latency or bandwidth Latency measurement for a program: Flynn Taxonomy

More information

Chap. 2 part 1. CIS*3090 Fall Fall 2016 CIS*3090 Parallel Programming 1

Chap. 2 part 1. CIS*3090 Fall Fall 2016 CIS*3090 Parallel Programming 1 Chap. 2 part 1 CIS*3090 Fall 2016 Fall 2016 CIS*3090 Parallel Programming 1 Provocative question (p30) How much do we need to know about the HW to write good par. prog.? Chap. gives HW background knowledge

More information

Massively Parallel Architectures

Massively Parallel Architectures Massively Parallel Architectures A Take on Cell Processor and GPU programming Joel Falcou - LRI joel.falcou@lri.fr Bat. 490 - Bureau 104 20 janvier 2009 Motivation The CELL processor Harder,Better,Faster,Stronger

More information

Introduction to Parallel Computing

Introduction to Parallel Computing Portland State University ECE 588/688 Introduction to Parallel Computing Reference: Lawrence Livermore National Lab Tutorial https://computing.llnl.gov/tutorials/parallel_comp/ Copyright by Alaa Alameldeen

More information

Types of Parallel Computers

Types of Parallel Computers slides1-22 Two principal types: Types of Parallel Computers Shared memory multiprocessor Distributed memory multicomputer slides1-23 Shared Memory Multiprocessor Conventional Computer slides1-24 Consists

More information

Introduction to GPU hardware and to CUDA

Introduction to GPU hardware and to CUDA Introduction to GPU hardware and to CUDA Philip Blakely Laboratory for Scientific Computing, University of Cambridge Philip Blakely (LSC) GPU introduction 1 / 35 Course outline Introduction to GPU hardware

More information

Processor Architecture and Interconnect

Processor Architecture and Interconnect Processor Architecture and Interconnect What is Parallelism? Parallel processing is a term used to denote simultaneous computation in CPU for the purpose of measuring its computation speeds. Parallel Processing

More information

Martin Dubois, ing. Contents

Martin Dubois, ing. Contents Martin Dubois, ing Contents Without OpenNet vs With OpenNet Technical information Possible applications Artificial Intelligence Deep Packet Inspection Image and Video processing Network equipment development

More information

CISC 7310X. C05: CPU Scheduling. Hui Chen Department of Computer & Information Science CUNY Brooklyn College. 3/1/2018 CUNY Brooklyn College

CISC 7310X. C05: CPU Scheduling. Hui Chen Department of Computer & Information Science CUNY Brooklyn College. 3/1/2018 CUNY Brooklyn College CISC 7310X C05: CPU Scheduling Hui Chen Department of Computer & Information Science CUNY Brooklyn College 3/1/2018 CUNY Brooklyn College 1 Outline Recap & issues CPU Scheduling Concepts Goals and criteria

More information

Computer and Machine Vision

Computer and Machine Vision Computer and Machine Vision Lecture Week 4 Part-2 February 5, 2014 Sam Siewert Outline of Week 4 Practical Methods for Dealing with Camera Streams, Frame by Frame and De-coding/Re-encoding for Analysis

More information

Operating systems and concurrency (B10)

Operating systems and concurrency (B10) Operating systems and concurrency (B10) David Kendall Northumbria University David Kendall (Northumbria University) Operating systems and concurrency (B10) 1 / 26 Introduction This lecture looks at Some

More information

Chapter 6: CPU Scheduling. Operating System Concepts 9 th Edition

Chapter 6: CPU Scheduling. Operating System Concepts 9 th Edition Chapter 6: CPU Scheduling Silberschatz, Galvin and Gagne 2013 Chapter 6: CPU Scheduling Basic Concepts Scheduling Criteria Scheduling Algorithms Thread Scheduling Multiple-Processor Scheduling Real-Time

More information

Chapter 1. Introduction: Part I. Jens Saak Scientific Computing II 7/348

Chapter 1. Introduction: Part I. Jens Saak Scientific Computing II 7/348 Chapter 1 Introduction: Part I Jens Saak Scientific Computing II 7/348 Why Parallel Computing? 1. Problem size exceeds desktop capabilities. Jens Saak Scientific Computing II 8/348 Why Parallel Computing?

More information

Parallel Architectures

Parallel Architectures Parallel Architectures CPS343 Parallel and High Performance Computing Spring 2018 CPS343 (Parallel and HPC) Parallel Architectures Spring 2018 1 / 36 Outline 1 Parallel Computer Classification Flynn s

More information

Operating Systems CS 323 Ms. Ines Abbes

Operating Systems CS 323 Ms. Ines Abbes Taibah University College of Community of Badr Computer Science Department Operating Systems CS71/CS72 جامعة طيبة كلية المجتمع ببدر قسم علوم الحاسب مقرر: نظم التشغيل Operating Systems CS 323 Ms. Ines Abbes

More information

CS 590: High Performance Computing. Parallel Computer Architectures. Lab 1 Starts Today. Already posted on Canvas (under Assignment) Let s look at it

CS 590: High Performance Computing. Parallel Computer Architectures. Lab 1 Starts Today. Already posted on Canvas (under Assignment) Let s look at it Lab 1 Starts Today Already posted on Canvas (under Assignment) Let s look at it CS 590: High Performance Computing Parallel Computer Architectures Fengguang Song Department of Computer Science IUPUI 1

More information

Build cost-effective, reliable signage solutions with the 8 display output, single slot form factor NVIDIA NVS 810

Build cost-effective, reliable signage solutions with the 8 display output, single slot form factor NVIDIA NVS 810 WEB COPY NVIDIA NVS 810 for Eight DP Displays Part No. VCNVS810DP-PB Overview Build cost-effective, reliable signage solutions with the 8 display output, single slot form factor NVIDIA NVS 810 The NVIDIA

More information

Processes, Threads, SMP, and Microkernels

Processes, Threads, SMP, and Microkernels Processes, Threads, SMP, and Microkernels Slides are mainly taken from «Operating Systems: Internals and Design Principles, 6/E William Stallings (Chapter 4). Some materials and figures are obtained from

More information

MAPPING VIDEO CODECS TO HETEROGENEOUS ARCHITECTURES. Mauricio Alvarez-Mesa Techische Universität Berlin - Spin Digital MULTIPROG 2015

MAPPING VIDEO CODECS TO HETEROGENEOUS ARCHITECTURES. Mauricio Alvarez-Mesa Techische Universität Berlin - Spin Digital MULTIPROG 2015 MAPPING VIDEO CODECS TO HETEROGENEOUS ARCHITECTURES Mauricio Alvarez-Mesa Techische Universität Berlin - Spin Digital MULTIPROG 2015 Video Codecs 70% of internet traffic will be video in 2018 [CISCO] Video

More information

Chapter 5 Process Scheduling

Chapter 5 Process Scheduling Chapter 5 Process Scheduling Da-Wei Chang CSIE.NCKU Source: Abraham Silberschatz, Peter B. Galvin, and Greg Gagne, "Operating System Concepts", 9th Edition, Wiley. 1 Outline Basic Concepts Scheduling Criteria

More information

Multi-Processors and GPU

Multi-Processors and GPU Multi-Processors and GPU Philipp Koehn 7 December 2016 Predicted CPU Clock Speed 1 Clock speed 1971: 740 khz, 2016: 28.7 GHz Source: Horowitz "The Singularity is Near" (2005) Actual CPU Clock Speed 2 Clock

More information

Computer and Machine Vision

Computer and Machine Vision Computer and Machine Vision Lecture Week 12 Part-1 Additional Programming Considerations March 29, 2014 Sam Siewert Outline of Week 12 Computer Vision APIs and Languages Alternatives to C++ and OpenCV

More information

Overview. Web Copy. NVIDIA Quadro M4000 Extreme Performance in a Single-Slot Form Factor

Overview. Web Copy. NVIDIA Quadro M4000 Extreme Performance in a Single-Slot Form Factor Web Copy NVIDIA Quadro M4000 Part No. VCQM4000-PB Overview NVIDIA Quadro M4000 Extreme Performance in a Single-Slot Form Factor Get real interactive expression with NVIDIA Quadro the world s most powerful

More information

ECEN 5653/4653 Real-Time Digital Media (A Linux-Based Systems Approach)

ECEN 5653/4653 Real-Time Digital Media (A Linux-Based Systems Approach) ECEN 5653/4653 Real-Time Digital Media (A Linux-Based Systems Approach) Lecture 1 - Introduction January 17, 2012 Sam Siewert Prof. Sam Siewert - My Background Co-Founder of Embedded Certificate Program

More information

Hybrid KAUST Many Cores and OpenACC. Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS

Hybrid KAUST Many Cores and OpenACC. Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS + Hybrid Computing @ KAUST Many Cores and OpenACC Alain Clo - KAUST Research Computing Saber Feki KAUST Supercomputing Lab Florent Lebeau - CAPS + Agenda Hybrid Computing n Hybrid Computing n From Multi-Physics

More information

Lecture 2 Process Management

Lecture 2 Process Management Lecture 2 Process Management Process Concept An operating system executes a variety of programs: Batch system jobs Time-shared systems user programs or tasks The terms job and process may be interchangeable

More information

A General Discussion on! Parallelism!

A General Discussion on! Parallelism! Lecture 2! A General Discussion on! Parallelism! John Cavazos! Dept of Computer & Information Sciences! University of Delaware!! www.cis.udel.edu/~cavazos/cisc879! Lecture 2: Overview Flynn s Taxonomy

More information

Addressing Heterogeneity in Manycore Applications

Addressing Heterogeneity in Manycore Applications Addressing Heterogeneity in Manycore Applications RTM Simulation Use Case stephane.bihan@caps-entreprise.com Oil&Gas HPC Workshop Rice University, Houston, March 2008 www.caps-entreprise.com Introduction

More information