Interrupt Service Threads - A New Approach to Handle Multiple Hard Real-Time Events on a Multithreaded Microcontroller

U. Brinkschulte, C. Krakowski
Institute of Process Control, Automation and Robotics
University of Karlsruhe
D Karlsruhe, Germany

J. Kreuzinger, Th. Ungerer
Institute of Computer Design and Fault Tolerance
University of Karlsruhe
D Karlsruhe, Germany

Abstract

We propose a new event handling mechanism based on a multithreaded microcontroller that allows efficient handling of concurrent events with hard real-time requirements. Real-time threads are used as interrupt service threads (ISTs) instead of interrupt service routines (ISRs) and are executed on a multithreaded microcontroller. Several thread priority schemes are managed in hardware. In particular, we propose a guaranteed percentage scheme in which each real-time thread is assigned a rate of the full processing power. We present an analytical evaluation of the IST technique and the guaranteed percentage scheme using the real-time requirements of an autonomous guided vehicle. The evaluation shows that the ISR solution with the fixed priority preemptive scheme cannot guarantee the specified deadlines, whereas the IST solution meets them and even leaves a spare of 5%. Moreover, in our case guaranteed percentage scheduling has an advantage over earliest deadline first scheduling if not only deadlines but also data rates must be met. When the maximum vehicle speed that does not violate the real-time constraints is calculated, ISTs outperform ISRs by a speed increase of 28%.

1 Introduction

The market for embedded systems is expanding rapidly, as can be seen from the number of microcontrollers sold. A special requirement of many embedded systems is real-time behavior. Usually interrupt service routines (ISRs) are used to implement event handling routines. However, when several hard real-time events that occur in an irregular time pattern must be serviced, ISRs are cumbersome to program and may even miss hard deadlines because of non-optimal processor utilization.

In our approach we propose interrupt service threads (ISTs) as a new, efficient, hardware-supported event handling mechanism which simplifies and improves the handling of concurrent events with hard real-time requirements. To this end, we propose a multithreaded microcontroller that supports multiple ISTs, has zero-cycle context switching overhead, and triggers ISTs directly by hardware. A hardware unit within the microcontroller, called the priority manager, manages several thread priority schemes. We propose a guaranteed percentage scheme in which each real-time thread is assigned a rate of the full processor power.

Section 2 introduces our thread-based real-time event handling by ISTs in combination with thread priority strategies, in particular the guaranteed percentage scheme. Section 3 introduces multithreaded processor techniques and presents our multithreaded microcontroller model. The benefits of using ISTs instead of ISRs, and of the guaranteed percentage scheme, for handling concurrent events with hard real-time requirements are evaluated in Section 4 with respect to an autonomous guided vehicle.

2 Thread-based Real-time Event Handling

The conventional method of event handling on today's processors and microcontrollers is event handling by interrupt service routines (ISRs). Events of different priorities are handled by ISRs with appropriate priorities. The priority scheme generally used is fixed priority preemptive. This scheme has several disadvantages. First, fixed priority preemptive scheduling limits the guaranteed processor utilization to about 70% [1]. Furthermore, the handling of lower-priority events can be blocked for a long time by higher-priority events.
This forces the programmer to keep ISRs as short as possible and to move work outside the ISR. If several concurrent time-critical events must be handled that way, the resulting programs are complex and hard to test.

An alternative method is to handle events by threads. In this case an occurring event activates an assigned thread. Using threads instead of ISRs has several advantages:

A flexible context switch between event-triggered threads and other threads is possible. ISRs can only be interrupted by other ISRs, but not by threads.

Threads allow flexible priority schemes. For example, earliest deadline first (EDF) or guaranteed percentage scheduling can be used instead of fixed priority preemptive scheduling. Both the EDF scheme [2] and the guaranteed percentage scheme allow a processor utilization of 100%.

The guaranteed percentage scheme makes it possible to assign a rate of the full processing power to each thread. In this way response times and data rates can be guaranteed even for several concurrent events, independent of other processor activities, as long as there is no overload condition. Furthermore, overload conditions can be detected early, as soon as the total requested percentage of processor cycles exceeds 100%. Scheduling strategies like EDF detect overload conditions late, by missed deadlines.

On a conventional microcontroller, thread-based event handling can be emulated by activating a thread indirectly through an ISR. The only task of the ISR is to launch the corresponding event handling thread. This concept is realized by several operating systems, e.g. as Asynchronous System Traps (AST) by DEC [3].

In our approach, we propose a multithreaded microcontroller which handles events by activating Interrupt Service Threads (ISTs) directly by hardware. First, this avoids the latencies caused by indirect thread activation. Operating system calls are also avoided for the EDF and guaranteed percentage scheduling schemes, so response times can be improved. Second, a multithreaded microcontroller with zero-cycle context switching overhead allows a very fine-grained realization of the guaranteed percentage scheduling scheme. The requested percentage can be guaranteed within the very short period of a few dozen processor cycles, which is not possible when a context switch itself takes time.

This is demonstrated in Fig. 1. The top part of Fig. 1 shows three events with assigned rates and deadlines to be met. The shaded column picks a time period in which all three ISTs that handle the three events are active. On a conventional microcontroller a context switch generates context switching overhead. If the guaranteed rates of the three ISTs are to be fulfilled within the shown time period, the 90% of processing power needed by the three concurrently executing ISTs exceeds the available 100% as soon as the context switching overhead of the three IST switches amounts to more than 10% of the processing power, as demonstrated in the middle part of the figure. In contrast, on a multithreaded processor the instructions of the three ISTs can be interleaved in a very fine-grained fashion without context switching overhead. The remaining 10% of processing power remains as a reserve.

Figure 1. Guaranteed percentage scheme on a multithreaded processor

For these reasons, the IST concept simplifies and improves the programming of concurrent real-time events. Moreover, it offers, for example, the possibility of debug threads which monitor the system at a low percentage without changing the real-time behavior. The IST concept and the guaranteed percentage scheme unfold their full power only in combination with hardware support by a multithreaded processor core.

3 A multithreaded microcontroller

A multithreaded processor is characterized by multiple on-chip instruction counters for different threads of control and by the ability to execute instructions from different threads in the pipeline simultaneously. A multithreaded processor usually features multiple on-chip register sets to yield a fast context switch.
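
To make the fine-grained interleaving of Section 2 concrete, the following minimal Python sketch allocates the cycles of one 100-cycle window among threads according to their guaranteed percentages, assuming that switching between threads costs no cycles. The function name `interleave`, the thread names, and the rates 50/30/10 (chosen only so that they add up to the 90% of Fig. 1) are hypothetical, and the credit-based selection rule is one possible software illustration, not the hardware algorithm of the priority manager.

```python
def interleave(shares, window=100):
    """Return the thread chosen in each cycle of one scheduling window.

    shares maps a thread name to its guaranteed percentage of cycles.
    Cycles that nobody requested go to a pseudo-thread named "idle".
    """
    total = sum(shares.values())
    if total > window:
        # Overload is visible up front, before any deadline is missed.
        raise ValueError(f"requested {total}% exceeds {window}%")
    weights = dict(shares)
    weights["idle"] = window - total
    credit = {name: 0 for name in weights}
    schedule = []
    for _ in range(window):
        # Every thread earns credit in proportion to its share ...
        for name, weight in weights.items():
            credit[name] += weight
        # ... and the thread with the largest credit gets this cycle.
        chosen = max(credit, key=credit.get)
        credit[chosen] -= window
        schedule.append(chosen)
    return schedule


# Hypothetical rates for three ISTs, 90% in total as in Fig. 1.
shares = {"ist_a": 50, "ist_b": 30, "ist_c": 10}
schedule = interleave(shares)
counts = {name: schedule.count(name) for name in [*shares, "idle"]}
print(counts)
print(schedule[:10])
```

Over the full window each thread receives exactly its requested number of cycles, and the cycles are spread out rather than granted in long blocks. With a non-zero switching cost the same interleaving would consume the 10% reserve after only a few switches, which is the situation shown in the middle part of Fig. 1.
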
Multithreading techniques may be applied within a microprocessor or microcontroller to enhance its performance by masking latencies of instructions of the currently scheduled thread with instructions of other threads. Thus the throughput of a multiprogramming workload is increased, leading to very powerful techniques that appear in next-generation multiple-issue microprocessors (see Sun's MAJC [4] and DEC/Compaq's EV8 architecture [5]). However, to date multithreading has never been applied to event handling by taking advantage of its fast context switching ability.

The basic multithreading techniques that are appropriate for microcontrollers with a single-issue RISC processor kernel are the cycle-by-cycle and the block interleaving techniques [6]. Cycle-by-cycle interleaving switches context every cycle with zero-cycle context-switching overhead but with poor single-thread performance, because in every cycle an instruction of another thread is introduced into the pipeline. Block interleaving processors execute a single thread until a context-switch event occurs. Typically a switch-on-cache-miss strategy is applied, leading to a context-switching overhead of 6-12 cycles.

To meet the requirements of thread-based real-time event handling, we propose a multithreaded microcontroller that implements ISTs with several priority schemes. In particular we focus on the guaranteed percentage scheme. Our multithreaded microcontroller is intermediate between the cycle-by-cycle and the block interleaving approaches and achieves zero-cycle context-switching overhead. The microcontroller holds the contexts of up to four hardware threads. Such hardware threads can be non-real-time threads (e.g. operating system, debugging, garbage collection) or real-time threads (ISTs).

Because it is intended for embedded systems, the processor core of the multithreaded microcontroller is kept at the hardware level of a simple microcontroller similar to the M [...]. As shown in Fig. 2, the microcontroller core consists of an instruction fetch unit, a decode unit, a memory access unit (MEM), and an execution unit (ALU). Four register sets are provided on the processor chip, restricting the number of hardware-supported ISTs to at most four. A signal unit triggers IST execution in response to external signals directly by hardware. There are no caches, because of the needs of real-time applications.

The instruction fetch unit holds four program counters (PC1 to PC4 in Fig. 2) with dedicated status bits (e.g. thread active/suspended); each PC is assigned to a different thread. Instructions are fetched depending on the status bits and the fill levels of the IWs. The instruction decode unit contains the IWs (IW1 to IW4 in Fig. 2), dedicated status bits (e.g. priority, delay), and counters if the guaranteed percentage scheme is implemented. Based on these bits and counters, a priority manager decides from which IW the next instruction will be decoded. Besides the guaranteed percentage scheme, other priority schemes may also be supported by the priority manager. Each opcode is propagated through the pipeline together with its thread id. Opcodes from multiple threads can be present in the different pipeline stages simultaneously.

Memory access instructions are executed by the MEM unit. If the memory interface permits only one access per cycle, an arbiter is needed for instruction fetch and data access. All other instructions are executed by the ALU. Both units (MEM and ALU) can take several cycles to complete the execution of an instruction. After that, the result is written back to the register set of the corresponding thread.

External signals are delivered to the signal unit from the peripheral components of the microcontroller core, e.g. timer, counter, or serial interface. When such a signal occurs, the corresponding IST is activated. As soon as an IST activation ends, its assigned real-time thread is suspended and its status is stored. An external signal may activate the same thread again.

To avoid pipeline stalls, instructions from other threads can be fed into the pipeline using various static or dynamic multithreading techniques. Possible causes of idle times are branches and memory accesses. The decode unit may predict the latency after such an instruction and inform the priority manager via delay bits (switch-on-branch and switch-on-load strategies). There is no overhead for such a context switch. No save/restore of registers or removal of instructions from the pipeline is needed, because each thread has its own register set.

Because of the unpredictability of cache accesses, non-cached memory access is preferred for real-time microcontrollers.
The resulting load latencies are bridged by the priority manager, which schedules instructions of other threads. Therefore a cache is omitted from our multithreaded microcontroller.

Figure 2. Block diagram of the Komodo microcontroller

4 Evaluation using an industrial application example

This section gives an evaluation of the IST technique and of the guaranteed percentage scheme. The evaluation uses a real industrial application example, autonomous guided vehicles (AGVs), which is a good example of several concurrent time-critical events.

The vehicles in our example are guided by a reflective tape glued to the floor. A vehicle follows its track by means of a CCD line camera. This camera produces periodic events with a period of 10 milliseconds. This period is the deadline for converting and reading the camera information and for executing the control loop which keeps the vehicle on the track.

A second time-critical event is produced asynchronously by transponder-based position marks. These position marks notify the vehicle that a predefined position has been reached (e.g. a docking station). If the vehicle detects a position mark, the corresponding transponder, which is installed in the floor beside the track, must be read. The precision needed for position detection using these marks is 1 cm. This gives a deadline for reading the transponder information that depends on the vehicle speed. The vehicle speed can vary in the range of 0.5 to 1 meters per second, which results in a deadline range from 20 to 10 milliseconds.

To do this job, the vehicle software is structured into three tasks. The control task performs the control loop based on the current camera information; it is triggered by a timer event with a period of 10 milliseconds. The camera task, triggered by the same timer event, converts and reads the next camera information. The transponder task is triggered by the position mark events and reads the transponder information.

We compare the real-time behavior of three different realizations of these tasks: first, each task is realized by a conventional ISR; second, each task is realized by an IST using guaranteed percentage scheduling; third, each task is realized by an IST using EDF scheduling. To allow a fair comparison, we assume identical processor performance for all three techniques. This is based on the real performance of a 20 MHz M68302 microcontroller, which is often used in industrial applications, so our timing values stem from real applications.

The evaluation itself proceeds as follows. First we examine the three techniques at a fixed vehicle speed of 0.65 meters per second. Then we calculate the maximum speed that can be reached with each technique without violating the real-time constraints.
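
The speed-dependent transponder deadline described above is simply the time the vehicle needs to travel one precision interval. A minimal Python sketch of this arithmetic follows; the function name `transponder_deadline_ms` and its parameter names are chosen here for illustration only.

```python
def transponder_deadline_ms(speed_m_per_s, precision_cm=1.0):
    """Time needed to travel one precision interval, in milliseconds."""
    if speed_m_per_s <= 0:
        raise ValueError("speed must be positive")
    precision_m = precision_cm / 100.0
    return precision_m / speed_m_per_s * 1000.0


# Lower speed limit, the speed of the first evaluation, upper speed limit.
for speed in (0.5, 0.65, 1.0):
    print(f"{speed:.2f} m/sec -> {transponder_deadline_ms(speed):.1f} msec")
```

The two speed limits reproduce the 20 and 10 millisecond deadlines stated above, and 0.65 meters per second gives roughly 15.4 milliseconds.
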
The following table summarizes the basic values for the first evaluation:

vehicle speed                          0.65 m/sec
camera period                          10 msec
position mark precision                1 cm
control task calculation time 1)       5 msec
camera task execution time 1)          1 msec
transponder task execution time 1)     5.5 msec
1) based on a 20 MHz M68302 processor

This gives the following execution-time / deadline ratios:

control task        5 msec / 10 msec        = 50%
camera task         1 msec / 10 msec        = 10%
transponder task    5.5 msec / 15.4 msec 1) ≈ 35%
1) 15.4 msec = 1 cm / 0.65 m/sec

1. Realization with ISRs

First we demonstrate the solution with a conventional ISR realization of the tasks. The priority scheme of ISRs is a fixed priority scheme. Fixed priority scheduling can only guarantee a processor utilization of 78% for three events [1], which is less than the required 95%. Deadlines will therefore be missed in the worst case regardless of the priority assignment. Figure 3 shows an example of missed deadlines for a specific assignment.

Figure 3. ISRs with P_control > P_camera > P_transp

2. Realization with ISTs using guaranteed percentage scheduling

Guaranteed percentage scheduling assigns a guaranteed percentage of processor cycles to a thread. On the proposed microcontroller with its zero-time context switches, this percentage is guaranteed by the hardware priority manager within a very short period of a few processor cycles. So the realization of the three tasks is simple: each task is assigned to a thread (IST) with its execution-time/deadline ratio as the requested percentage of processor cycles (50% control task, 10% camera task, 35% transponder task). This gives a total requested percentage of 95%, which means that all deadlines will be met in any case. Furthermore, there is an additional spare of 5%, which could be used, for example, for a debug thread. Figure 4 shows the worst-case scenario, in which all events occur at the same time.

Figure 4. ISTs with guaranteed percentage

3. Realization with ISTs using EDF scheduling

EDF and guaranteed percentage scheduling allow a processor utilization of 100% [1]. This means that ISTs with EDF or guaranteed percentage scheduling meet all deadlines in the example above. Figure 5 shows the above scenario using EDF scheduling.
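
The utilization argument used in the three realizations above can be checked numerically. The sketch below assumes that each deadline is also the minimum interval between two activations of a task, and it uses the bound n(2^(1/n) - 1) from [1] for fixed priority scheduling, which is a sufficient condition only. The helper name `liu_layland_bound` and the dictionary layout are illustrative.

```python
def liu_layland_bound(n):
    """Utilization that fixed priority scheduling can always guarantee for n tasks [1]."""
    return n * (2 ** (1.0 / n) - 1)


# (execution time, deadline) in msec, taken from the table above.
tasks = {
    "control": (5.0, 10.0),
    "camera": (1.0, 10.0),
    "transponder": (5.5, 10.0 / 0.65),  # deadline for 1 cm at 0.65 m/sec
}

utilization = sum(exec_time / deadline for exec_time, deadline in tasks.values())
bound = liu_layland_bound(len(tasks))

print(f"total utilization: {utilization:.1%}")
print(f"fixed priority bound for {len(tasks)} tasks: {bound:.1%}")
print(f"guaranteed with fixed priorities: {utilization <= bound}")
print(f"feasible with EDF or guaranteed percentage: {utilization <= 1.0}")
```

With the unrounded deadline of 1 cm / 0.65 m/sec the transponder ratio is about 35.75%, so the total comes to about 95.75% rather than the rounded 95% used in the text. The conclusion is unchanged: the total lies above the fixed priority bound of about 78% and below 100%.
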

Figure 5. ISTs with EDF

However, Figure 5 reveals a disadvantage of EDF scheduling. The camera task and the transponder task not only have to meet a deadline, they also have to cope with data rates. The transponder information starts at the position mark event and lasts 4 milliseconds (8 bytes, baud serial link). With guaranteed percentage scheduling this is no problem, because the transponder task also starts at the position mark event, with 35% of the processor cycles, which is enough to read the information. With EDF scheduling, the transponder task starts 6 milliseconds after the position mark event. This means that the information will be lost if the serial link does not contain an 8-byte hardware buffer. For the same reason, a 10 times faster AD converter is needed for reading the CCD camera.

In conclusion, the ISR solution cannot guarantee the specified deadlines, whereas the IST solution meets them and even leaves a spare of 5%. Moreover, guaranteed percentage scheduling has an advantage over EDF scheduling if not only deadlines but also data rates must be met.

Maximum vehicle speed for IST and ISR

In a second step, the maximum vehicle speed that does not violate the real-time constraints can be calculated. For ISTs using guaranteed percentage scheduling, the maximum vehicle speed is reached if the transponder task uses the remaining spare of 5%. This leads to a transponder task share of 40% and a total processor utilization of 100%. In this case, the transponder task can guarantee a deadline of (5.5 msec * 100%) / 40% = 13.75 msec. With this deadline, the position resolution of 1 cm can be retained up to a vehicle speed of

$V_{IST}^{max} = 1\,\mathrm{cm} / 13.75\,\mathrm{msec} = 0.73\,\mathrm{m/sec}$

This calculation is valid for ISTs using EDF scheduling as well, because the same processor utilization of 100% is reached.

In the case of ISRs, the reachable deadline for the transponder task can be taken from Fig. 3. It amounts to 5 msec + 1 msec + 4 msec + 5 msec + 1 msec + 1.5 msec = 17.5 msec. So the maximum vehicle speed is

$V_{ISR}^{max} = 1\,\mathrm{cm} / 17.5\,\mathrm{msec} = 0.57\,\mathrm{m/sec}$

As a result, the vehicle speedup of the IST architecture compared to ISRs is

$\mathrm{Speedup}_{IST/ISR} = 28\%$

Finally, we would like to point out that this section evaluated only the benefits of the IST concept. Using a multithreaded microcontroller has the additional benefit of latency masking [7], which has not yet been taken into account. Unfortunately this benefit cannot be evaluated statically, so our calculated values are worst-case values.

5 Conclusions

We combine interrupt service threads (ISTs) with a multithreaded microcontroller to form a new hardware-supported event handling mechanism which allows efficient handling of concurrent events with hard real-time requirements. We propose a guaranteed percentage scheme in which each real-time thread is assigned a percentage of the full processor power. An analytical evaluation shows the advantages of our approach in handling concurrent overlapping events. The additional ability of a multithreaded microcontroller to bridge instruction latencies by scheduling instructions of a different thread is not considered in the calculations, although it can yield a higher execution speed. So the calculated values are worst-case values.

We are working on the simulation of the proposed multithreaded microcontroller. Our goal is to evaluate its real-time performance against that of a conventional microprocessor with ISRs and a fixed priority preemptive scheme.

References

[1] C. L. Liu and J. W. Layland. Scheduling Algorithms for Multiprogramming in a Hard Real-Time Environment. JACM, 20(1):46-61.
[2] J. A. Stankovic, M. Spuri, K. Ramamritham, G. C. Buttazzo. Deadline Scheduling for Real-Time Systems: EDF and Related Algorithms. Kluwer Academic Publishers.
[3] Digital. Guide to DECthreads. March.
[4] L. Gwennap. MAJC Gives VLIW a New Twist. Microprocessor Report, Vol. 13, No. 12, September.
[5] J. Emer. Simultaneous Multithreading: Multiplying Alpha Performance. Microprocessor Forum, San Jose, October.
[6] J. Kreuzinger and T. Ungerer. Context Switching Techniques for Decoupled Multithreaded Processors. 25th EUROMICRO Conference, Milano, Vol. 1, September.
[7] W. Grünewald, T. Ungerer. A Multithreaded Processor Designed for Distributed Shared Memory Systems. International Conference on Advances in Parallel and Distributed Computing, Shanghai, March 1997.

A Multithreaded Java Microcontroller for Thread-Oriented Real-Time Event-Handling

A Multithreaded Java Microcontroller for Thread-Oriented Real-Time Event-Handling A Multithreaded Java Microcontroller for Thread-Oriented Real-Time Event-Handling U. Brinkschulte, C. Krakowski J. Kreuzinger, Th. Ungerer Institute of Process Control, Institute of Computer Design Automation

More information

instruction fetch memory interface signal unit priority manager instruction decode stack register sets address PC2 PC3 PC4 instructions extern signals

instruction fetch memory interface signal unit priority manager instruction decode stack register sets address PC2 PC3 PC4 instructions extern signals Performance Evaluations of a Multithreaded Java Microcontroller J. Kreuzinger, M. Pfeer A. Schulz, Th. Ungerer Institute for Computer Design and Fault Tolerance University of Karlsruhe, Germany U. Brinkschulte,

More information

Real-time Scheduling on Multithreaded Processors

Real-time Scheduling on Multithreaded Processors Real-time Scheduling on Multithreaded Processors J. Kreuzinger, A. Schulz, M. Pfeffer, Th. Ungerer Institute for Computer Design, and Fault Tolerance University of Karlsruhe D-76128 Karlsruhe, Germany

More information

Real-time Scheduling on Multithreaded Processors

Real-time Scheduling on Multithreaded Processors Real-time Scheduling on Multithreaded Processors J. Kreuzinger, A. Schulz, M. Pfeffer, Th. Ungerer U. Brinkschulte, C. Krakowski Institute for Computer Design, Institute for Process Control, and Fault

More information

A Microkernel Architecture for a Highly Scalable Real-Time Middleware

A Microkernel Architecture for a Highly Scalable Real-Time Middleware A Microkernel Architecture for a Highly Scalable Real-Time Middleware U. Brinkschulte, C. Krakowski,. Riemschneider. Kreuzinger, M. Pfeffer, T. Ungerer Institute of Process Control, Institute of Computer

More information

A Scheduling Technique Providing a Strict Isolation of Real-time Threads

A Scheduling Technique Providing a Strict Isolation of Real-time Threads A Scheduling Technique Providing a Strict Isolation of Real-time Threads U. Brinkschulte ¾, J. Kreuzinger ½, M. Pfeffer, Th. Ungerer ½ Institute for Computer Design and Fault Tolerance University of Karlsruhe,

More information

Priority manager. I/O access

Priority manager. I/O access Implementing Real-time Scheduling Within a Multithreaded Java Microcontroller S. Uhrig 1, C. Liemke 2, M. Pfeffer 1,J.Becker 2,U.Brinkschulte 3, Th. Ungerer 1 1 Institute for Computer Science, University

More information

A Real-Time Java System on a Multithreaded Java Microcontroller

A Real-Time Java System on a Multithreaded Java Microcontroller A Real-Time Java System on a Multithreaded Java Microcontroller M. Pfeffer, S. Uhrig, Th. Ungerer Institute for Computer Science University of Augsburg D-86159 Augsburg fpfeffer, uhrig, ungererg @informatik.uni-augsburg.de

More information

The Komodo Project: Thread-based Event Handling Supported by a Multithreaded Java Microcontroller

The Komodo Project: Thread-based Event Handling Supported by a Multithreaded Java Microcontroller The Komodo Project: Thread-based Event Handling Supported by a Multithreaded Java Microcontroller J. Kreuzinger, R. Marston, Th. Ungerer Dept. of Computer Design and Fault Tolerance University of Karlsruhe

More information

Multimedia Systems 2011/2012

Multimedia Systems 2011/2012 Multimedia Systems 2011/2012 System Architecture Prof. Dr. Paul Müller University of Kaiserslautern Department of Computer Science Integrated Communication Systems ICSY http://www.icsy.de Sitemap 2 Hardware

More information

4. Hardware Platform: Real-Time Requirements

4. Hardware Platform: Real-Time Requirements 4. Hardware Platform: Real-Time Requirements Contents: 4.1 Evolution of Microprocessor Architecture 4.2 Performance-Increasing Concepts 4.3 Influences on System Architecture 4.4 A Real-Time Hardware Architecture

More information

Chapter 12. CPU Structure and Function. Yonsei University

Chapter 12. CPU Structure and Function. Yonsei University Chapter 12 CPU Structure and Function Contents Processor organization Register organization Instruction cycle Instruction pipelining The Pentium processor The PowerPC processor 12-2 CPU Structures Processor

More information

EC EMBEDDED AND REAL TIME SYSTEMS

EC EMBEDDED AND REAL TIME SYSTEMS EC6703 - EMBEDDED AND REAL TIME SYSTEMS Unit I -I INTRODUCTION TO EMBEDDED COMPUTING Part-A (2 Marks) 1. What is an embedded system? An embedded system employs a combination of hardware & software (a computational

More information

Embedded Systems. Read pages

Embedded Systems. Read pages Embedded Systems Read pages 385-417 Definition of Embedded Systems Embedded systems Computer dedicated to serve specific purposes Many physical systems today use computer for powerful and intelligent applications

More information

CPU Structure and Function. Chapter 12, William Stallings Computer Organization and Architecture 7 th Edition

CPU Structure and Function. Chapter 12, William Stallings Computer Organization and Architecture 7 th Edition CPU Structure and Function Chapter 12, William Stallings Computer Organization and Architecture 7 th Edition CPU must: CPU Function Fetch instructions Interpret/decode instructions Fetch data Process data

More information

Processors. Young W. Lim. May 12, 2016

Processors. Young W. Lim. May 12, 2016 Processors Young W. Lim May 12, 2016 Copyright (c) 2016 Young W. Lim. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version

More information

Computer-System Architecture (cont.) Symmetrically Constructed Clusters (cont.) Advantages: 1. Greater computational power by running applications

Computer-System Architecture (cont.) Symmetrically Constructed Clusters (cont.) Advantages: 1. Greater computational power by running applications Computer-System Architecture (cont.) Symmetrically Constructed Clusters (cont.) Advantages: 1. Greater computational power by running applications concurrently on all computers in the cluster. Disadvantages:

More information

CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS

CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS UNIT-I OVERVIEW & INSTRUCTIONS 1. What are the eight great ideas in computer architecture? The eight

More information

Main Points of the Computer Organization and System Software Module

Main Points of the Computer Organization and System Software Module Main Points of the Computer Organization and System Software Module You can find below the topics we have covered during the COSS module. Reading the relevant parts of the textbooks is essential for a

More information

ECE519 Advanced Operating Systems

ECE519 Advanced Operating Systems IT 540 Operating Systems ECE519 Advanced Operating Systems Prof. Dr. Hasan Hüseyin BALIK (10 th Week) (Advanced) Operating Systems 10. Multiprocessor, Multicore and Real-Time Scheduling 10. Outline Multiprocessor

More information

Multithreaded Processors. Department of Electrical Engineering Stanford University
