This course provides an overview of the SH-2 32-bit RISC CPU core used in the popular SH-2 series microcontrollers


Course Introduction
Purpose: This course provides an overview of the SH-2 32-bit RISC CPU core used in the popular SH-2 series microcontrollers.
Objectives:
  Learn about error detection and address errors
  Explore the pipeline in the SH-2 CPU
  Understand branching
  Obtain some helpful coding tips
Content: 16 pages, 3 questions
Learning Time: 15 minutes

CPU Error Detection
The SH-2 CPU can detect and react to various error conditions:
  Address errors
  Illegal instructions (invalid opcodes): an undefined instruction is executed
  Slot illegal instructions: certain instructions, such as one that changes the PC, cannot be located in a delay slot (i.e., the slot after a delayed branch)

Address Errors
  Instruction fetch from an odd address
  Instruction fetch from on-chip peripheral module space
  Instruction fetch from external memory space in single-chip mode
  Word data accessed from an odd address
  Longword data accessed from other than a longword boundary
  Longword accessed in 8-bit on-chip peripheral module space
  External memory space accessed when in single-chip mode
  Stack access to an address that is not a multiple of four (longword alignment required)
  Access using the vector base register when VBR is not a multiple of four (longword alignment required)
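The alignment rules in this list can be sketched as a small check. This is a minimal illustration, not the CPU's actual detection logic: the function name `is_address_error` and its parameters are hypothetical, and the memory-map conditions (on-chip peripheral space, single-chip mode) are not modeled.

```python
def is_address_error(address: int, size: int, is_instruction_fetch: bool = False) -> bool:
    """Return True if the access breaks the alignment rules listed above.

    size is the access width in bytes: 1 (byte), 2 (word), or 4 (longword).
    Hypothetical helper: peripheral-space and single-chip-mode checks are omitted.
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2, or 4 bytes")
    if is_instruction_fetch:
        # SH-2 instructions are 16 bits long, so a fetch from an odd address is an error.
        return address % 2 != 0
    # Byte accesses are always aligned; words need an even address;
    # longwords need an address that is a multiple of four.
    return address % size != 0
```

For example, a longword access to 0x1002 is flagged, while the same access to 0x1004 is not.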


Why Use a Pipeline?
CISC CPUs perform all the operations of an instruction sequentially and execute instructions in series, one at a time; executing an instruction can take several cycles.
RISC CPUs use a pipeline to overlap independent parts of instructions, so that one instruction may be able to complete every cycle.
Content of a typical instruction:
  IF   Instruction fetch
  ID   Instruction decode / register access
  ALU  ALU operation
  MA   Memory access / data cycle
  WB   Write back to register
Instruction length: CISC: variable; SH-2: 16 bits
[Figure: on a CISC CPU, three instructions run IF ID ALU MA WB one after another; on a RISC CPU, the same three instructions overlap in the pipeline. The difference is the time saved.]
Ideal pipeline speedup: T_pipeline = T_unpipelined / (number of pipeline stages)
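The ideal speedup formula can be checked with a little arithmetic. The sketch below assumes every stage takes exactly one cycle and that no stalls occur; the function names are illustrative and not part of the course.

```python
def unpipelined_cycles(instructions: int, stages: int) -> int:
    # Without a pipeline, each instruction runs through every stage
    # before the next instruction starts.
    return instructions * stages


def pipelined_cycles(instructions: int, stages: int) -> int:
    # With an ideal pipeline, the first instruction needs `stages` cycles,
    # and each following instruction completes one cycle later.
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


def speedup(instructions: int, stages: int) -> float:
    return unpipelined_cycles(instructions, stages) / pipelined_cycles(instructions, stages)
```

With five stages, three instructions take 15 cycles unpipelined and 7 cycles pipelined. As the instruction count grows, the speedup approaches the number of stages, which is the ideal case the formula describes.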

Ideal SH-2 Pipeline Flow
IF (Instruction Fetch): Fetches an instruction from the memory in which the program is stored
ID (Instruction Decode): Decodes the fetched instruction
EX (Instruction Execution): Performs data operations and address calculations according to the results of decoding
MA (Memory Access): Accesses data in memory; generated by instructions that involve memory access (with some exceptions)
WB (Write Back): Returns the results of the memory access (data) to a register; generated by instructions that involve memory loads (with some exceptions)
Instruction stream (one column per slot, time runs left to right):
  Instruction 1  IF ID EX MA WB
  Instruction 2     IF ID EX MA WB
  Instruction 3        IF ID EX MA WB
  Instruction 4           IF ID EX MA WB
  Instruction 5              IF ID EX MA WB
  Instruction 6                 IF ID EX MA WB
Ideal execution: one instruction every slot (1 cycle per slot)
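The staggered flow of the ideal pipeline can be generated programmatically. This is a minimal sketch that assumes one cycle per slot and no hazards; `stage_in_slot` and `ideal_pipeline_rows` are hypothetical helper names used only for illustration.

```python
from typing import List, Optional

STAGES = ["IF", "ID", "EX", "MA", "WB"]


def stage_in_slot(instr_index: int, slot: int) -> Optional[str]:
    """Stage occupied by an instruction (0-based) in a slot (0-based), or None."""
    # Instruction i enters IF in slot i and advances one stage per slot.
    position = slot - instr_index
    if 0 <= position < len(STAGES):
        return STAGES[position]
    return None


def ideal_pipeline_rows(count: int) -> List[str]:
    """Build one text row per instruction, each shifted right by one slot."""
    rows = []
    for i in range(count):
        cells = ["  "] * i + STAGES  # i empty slots before the instruction starts
        rows.append(f"Instruction {i + 1}  " + " ".join(cells))
    return rows


if __name__ == "__main__":
    print("\n".join(ideal_pipeline_rows(6)))
```

In slot 4 (0-based), for instance, the first instruction is in WB while the third is in EX, which is the overlap the diagram illustrates.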

Hazards Prevent Ideal Flow
The pipeline can be stalled by:
  Structural hazards: arise from resource conflicts that occur when the hardware cannot support all possible combinations of instructions in simultaneous, overlapped execution
  Data hazards: arise when an instruction depends on the result of a previous instruction in a way that is exposed by the overlapping of instructions in the pipeline
  Control hazards: arise from the pipelining of branches and other instructions that change the PC


Slots Requiring Multiple Cycles
Execution time of Instruction 1: 4 cycles
IF stage takes two cycles
MA stage takes three cycles

Pipeline and Flow Control
Because of the pipeline, instructions are fetched before previous instructions have fully completed.
Program timing is affected when fetched instructions are not used; this situation can occur for:
  Conditional branching
  Unconditional branching
Understand the difference between:
  Virtual time for instruction execution (ideal case)
  Total time for an instruction to execute (actual case)
SuperH compilers organize code to minimize pipeline stalls.

Conditional Branch Taken
Conditional branch instructions:
  BF <destination>
  BT <destination>
When the branch is taken:
  BR Instr.       IF ID EX
  Next Instr.        IF ID             (fetched, but discarded)
  Next Instr.           IF             (fetched, but discarded)
  BR Destination           IF ID EX MA
Figure callout: the destination address is known at this point (the destination fetch follows the branch's EX stage).

Conditional Branch Not Taken
Conditional branch instructions:
  BF <destination>
  BT <destination>
When the branch is NOT taken, execution continues normally:
  BR Inst    IF ID EX MA WB
  Next Inst     IF ID EX MA WB
  Next Inst        IF ID EX MA WB
  Next Inst           IF ID EX MA WB

Conditional Delayed Branch
Instructions:
  BF/S <destination>
  BT/S <destination>
When the branch is taken, a delay is introduced:
  BR Instr.       IF ID EX
  Next Instr.        IF -- ID EX MA WB   (-- = delay)
  BR Destination           IF ID EX MA WB
                              IF ID EX MA
Delayed branch / instruction re-order:
  Traditional CPU:
    ADD.W R1,R0
    BGT target_address
  SuperH CPU:
    BF/S target_address
    ADD.W R1,R0
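The branch diagrams suggest a simple cost model, sketched here under stated assumptions: a branch that is not taken occupies one slot, a taken BF/BT costs three slots (the branch plus two fetched-but-discarded instructions), and a taken BF/S or BT/S costs two slots (the branch plus one delay, with the delay-slot instruction doing useful work). These counts are read off the diagrams; the function names are hypothetical, and memory wait states are ignored.

```python
def branch_issue_slots(delayed: bool, taken: bool) -> int:
    """Slots consumed by one conditional branch, as read from the diagrams."""
    if not taken:
        return 1  # execution continues normally
    # Taken: two discarded fetches for BF/BT, one delay for BF/S and BT/S.
    return 2 if delayed else 3


def loop_branch_slots(iterations: int, delayed: bool) -> int:
    """Slots spent on a loop-closing branch that is taken on every pass but the last."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    taken_cost = (iterations - 1) * branch_issue_slots(delayed, taken=True)
    return taken_cost + branch_issue_slots(delayed, taken=False)
```

For a ten-iteration loop, this model gives 28 branch slots with BF/BT and 19 with the delayed forms, which is why filling the delay slot with useful work pays off.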


Programming Tips
  Use local variables wherever possible to improve execution (locals use less ROM and RAM)
  Use modular programming to reduce far branches
  Be careful with constants; use 8-bit constants wherever possible
  Avoid operations on the MAC that will stall the pipeline
  Place functions that call each other close together
  Try to align instructions on 32-bit boundaries, especially load/store instructions
  Convert byte and word values to signed long integers, the most efficient data type for the SH-2 architecture
  Make sure that the instruction immediately following a load from memory does not use the load's destination register
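The last tip, keeping the loaded register out of the instruction right after a load, can be checked mechanically. The sketch below uses a made-up instruction record (`Instr`) rather than a real assembler or disassembler; the register lists in the example are filled in by hand.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    text: str                    # assembly text, for display only
    is_load: bool                # True if the instruction loads from memory
    dest: Optional[str]          # register written by the instruction
    regs_used: Tuple[str, ...]   # registers the instruction reads or writes


def find_load_use_conflicts(program: List[Instr]) -> List[int]:
    """Indexes of instructions that touch the register loaded just before them."""
    conflicts = []
    for i in range(len(program) - 1):
        current, following = program[i], program[i + 1]
        if current.is_load and current.dest in following.regs_used:
            conflicts.append(i + 1)
    return conflicts


# A load followed immediately by a use of R0 is flagged.
stalling = [
    Instr("MOV.L @R1,R0", True, "R0", ("R1", "R0")),
    Instr("ADD R0,R2", False, "R2", ("R0", "R2")),
]
# Moving an independent instruction in between removes the conflict.
reordered = [
    Instr("MOV.L @R1,R0", True, "R0", ("R1", "R0")),
    Instr("ADD R3,R4", False, "R4", ("R3", "R4")),
    Instr("ADD R0,R2", False, "R2", ("R0", "R2")),
]
```

Here `find_load_use_conflicts(stalling)` reports index 1, while the reordered sequence reports no conflicts.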

Course Summary
  Error detection
  Address errors
  Pipeline
  Branching
  Recommendations for coding