East Tennessee State University Department of Computer and Information Sciences CSCI 4717 Computer Architecture TEST 2 for Fall Semester, 2007

Similar documents
East Tennessee State University Department of Computer and Information Sciences CSCI 4717 Computer Architecture TEST 3 for Fall Semester, 2005

East Tennessee State University Department of Computer and Information Sciences CSCI 4717 Computer Architecture TEST 3 for Fall Semester, 2006

Read this before starting!

Read this before starting!

Read this before starting!

Section 001. Read this before starting! You may use one sheet of scrap paper that you will turn in with your test.

Section 001. Read this before starting!

Read this before starting!

Read this before starting!

Read this before starting!

Read this before starting!

Read this before starting!

Section 001. Read this before starting!

Read this before starting!

Section 002. Read this before starting!

Read this before starting!

Section 001. Read this before starting!

Read this before starting!

Section 001 & 002. Read this before starting!

East Tennessee State University Department of Computer and Information Sciences CSCI 1710/Section 001 Web Design I TEST 1 for Fall Semester, 2000

Read this before starting!

Read this before starting!

If a b or a c, then a bc If a b and b c, then a c Every positive integer n > 1 can be written uniquely as n = p 1 k1 p 2 k2 p 3 k3 p 4 k4 p s

Read this before starting!

Read this before starting!

EECS 470 Final Exam Fall 2015

CPU Structure and Function

SUPERSCALAR AND VLIW PROCESSORS

BEng (Hons.) Telecommunications. BSc (Hons.) Computer Science with Network Security

Chapter 18 Parallel Processing

Module 4c: Pipelining

William Stallings Computer Organization and Architecture

Main Points of the Computer Organization and System Software Module

Computer Organization. Chapter 16

CS 351 Final Review Quiz

UNIT V: CENTRAL PROCESSING UNIT

Lecture 18: Coherence Protocols. Topics: coherence protocols for symmetric and distributed shared-memory multiprocessors (Sections

What is Superscalar? CSCI 4717 Computer Architecture. Why the drive toward Superscalar? What is Superscalar? (continued) In class exercise

Organisasi Sistem Komputer

Pipelining, Branch Prediction, Trends

CPU Structure and Function

Week 11: Assignment Solutions

CSCI 4717 Computer Architecture

University of Toronto Faculty of Applied Science and Engineering

Parallel Processing. Computer Architecture. Computer Architecture. Outline. Multiple Processor Organization

EN164: Design of Computing Systems Topic 06.b: Superscalar Processor Design

Faculty of Science FINAL EXAMINATION

2 MARKS Q&A 1 KNREDDY UNIT-I

ECE 3055: Final Exam

DC57 COMPUTER ORGANIZATION JUNE 2013

OPEN BOOK, OPEN NOTES. NO COMPUTERS, OR SOLVING PROBLEMS DIRECTLY USING CALCULATORS.

Multi-Processor / Parallel Processing

PART A (22 Marks) 2. a) Briefly write about r's complement and (r-1)'s complement. [8] b) Explain any two ways of adding decimal numbers.

Chapter 18. Parallel Processing. Yonsei University

RISC & Superscalar. COMP 212 Computer Organization & Architecture. COMP 212 Fall Lecture 12. Instruction Pipeline no hazard.

Reader's Guide Outline of the Book A Roadmap For Readers and Instructors Why Study Computer Organization and Architecture Internet and Web Resources

Pipelining. Parts of these slides are from the support material provided by W. Stallings

Computer Organization

Do not start the test until instructed to do so!

ASSEMBLY LANGUAGE MACHINE ORGANIZATION

UNIVERSITY OF MORATUWA CS2052 COMPUTER ARCHITECTURE. Time allowed: 2 Hours 10 min December 2018

University of Toronto Faculty of Applied Science and Engineering

Advanced Parallel Architecture Lessons 5 and 6. Annalisa Massini /2017

Computer and Hardware Architecture I. Benny Thörnberg Associate Professor in Electronics

CS 351 Final Exam Solutions

Computer Organization and Technology Processor and System Structures

Department of Computer Science Duke University Ph.D. Qualifying Exam. Computer Architecture 180 minutes

CS311 Lecture: Pipelining, Superscalar, and VLIW Architectures revised 10/18/07

Computer Organization and Architecture (CSCI-365) Sample Final Exam

UNIT I BASIC STRUCTURE OF COMPUTERS Part A( 2Marks) 1. What is meant by the stored program concept? 2. What are the basic functional units of a

Real instruction set architectures. Part 2: a representative sample

Understand the factors involved in instruction set

a number of pencil-and-paper(-and-calculator) questions two Intel assembly programming questions

Multiprocessors & Thread Level Parallelism

Lecture 29 Review" CPU time: the best metric" Be sure you understand CC, clock period" Common (and good) performance metrics"

ECE 485/585 Microprocessor System Design

Chapter 12. CPU Structure and Function. Yonsei University

Honorary Professor Supercomputer Education and Research Centre Indian Institute of Science, Bangalore

6.004 Tutorial Problems L22 Branch Prediction

CSCI-375 Operating Systems

Computer Architecture

6x86 PROCESSOR Superscalar, Superpipelined, Sixth-generation, x86 Compatible CPU

Multiprocessors and Thread-Level Parallelism. Department of Electrical & Electronics Engineering, Amrita School of Engineering

WAYLAND BAPTIST UNIVERSITY VIRTUAL CAMPUS SCHOOL OF BUSINESS SYLLABUS

Characteristics of Mult l ip i ro r ce c ssors r

1. (10) True or False: (1) It is possible to have a WAW hazard in a 5-stage MIPS pipeline.

CSE 378 Final Exam 3/14/11 Sample Solution

COMPUTER ORGANISATION CHAPTER 1 BASIC STRUCTURE OF COMPUTERS

What is Pipelining? RISC remainder (our assumptions)

EECS 470 Final Exam Fall 2013

CS430 Computer Architecture

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING UNIT-1

Design with Microprocessors

Interfacing. Introduction. Introduction Addressing Interrupt DMA Arbitration Advanced communication architectures. Vahid, Givargis

DEPARTMENT OF ELECTRONICS & COMMUNICATION ENGINEERING QUESTION BANK

CS Mid-Term Examination - Fall Solutions. Section A.

Lecture 9: MIMD Architectures

CS252 Spring 2017 Graduate Computer Architecture. Lecture 12: Cache Coherence

SIDDHARTH GROUP OF INSTITUTIONS :: PUTTUR Siddharth Nagar, Narayanavanam Road QUESTION BANK (DESCRIPTIVE) UNIT-I

LECTURE 3: THE PROCESSOR

Transcription:

Points missed: Student's Name: Total score: /100 points East Tennessee State University Department of Computer and Information Sciences CSCI 4717 Computer Architecture TEST 2 for Fall Semester, 2007 Read this before starting! The total possible score for this test is 100 points. This test is closed book and closed notes No calculators may be used during the test Please turn off all cell phones & pagers during the test. You may use one sheet of scrap paper that you will turn in with your test. When possible, indicate final answers by drawing a box around them. This is to aid the grader (who might not be me!) Failure to do so might result in no credit for answer. Example: Fine print 1 point will be deducted per answer for missing or incorrect units when required. No assumptions will be made for hexadecimal versus decimal, so you should always include the base in your answer. If you perform work on the back of a page in this test, indicate that you have done so in case the need arises for partial credit to be determined. Academic Misconduct: Section 5.7 "Academic Misconduct" of the East Tennessee State University Faculty Handbook, October 21, 2005: "Academic misconduct will be subject to disciplinary action. Any act of dishonesty in academic work constitutes academic misconduct. This includes plagiarism, the changing of falsifying of any academic documents or materials, cheating, and the giving or receiving of unauthorized aid in tests, examinations, or other assigned school work. Penalties for academic misconduct will vary with the seriousness of the offense and may include, but are not limited to: a grade of 'F' on the work in question, a grade of 'F' of the course, reprimand, probation, suspension, and expulsion. For a second academic offense the penalty is permanent expulsion."

Input/Output 1. Which types of interrupt configurations depend on the physical (hardware) connections between the device and the CPU to determine the interrupt priority? Check all that apply. (2 points) a.) Separate interrupt pins for each device c.) Daisy chain/hardware poll b.) Software poll d.) Bus arbitration 2. Which types of interrupt configurations depend on the interrupting device s ability to place a vector on the bus to identify itself? Check all that apply. (2 points) a.) Separate interrupt pins for each device c.) Daisy chain/hardware poll b.) Software poll d.) Bus arbitration 3. Which of the following four I/O methods requires the CPU to check the I/O device regularly to see if it needs attention? (2 points) a.) programmed I/O b.) interrupt driven I/O c.) direct memory access d.) I/O channel 4. Which of the following four I/O methods requires the CPU to act as a bridge for moving data between the I/O device and main memory? (2 points) a.) programmed I/O b.) interrupt driven I/O c.) direct memory access d.) I/O channel 5. In a configuration where the DMA controller acts as a bridge between the I/O device it supports and the system bus, the DMA requires access(es) to the bus to transfer a single word from the I/O device to memory. (2 points) 6. Place a check in all of the boxes that truthfully complete the sentence, An I/O Channel... (3 points) - is the electrical connections that connect an I/O module to the I/O device - can execute a set of instructions given to it by the CPU - is an extension of the DMA concept and therefore performs the transfer of data on its own - is the bus arbiter for I/O devices 7. Aside from the general advantages of a cache, how does a cache support the use of DMA? (3 points)

Registers and Pipelining 8. The two instructions below are three-operand instructions. In the table below, write three short programs that do exactly the same thing, one with two-operand instructions, one with one-operand instructions, and one with zero-operand instructions. Note that Y is an intermediate result and does not need to be saved anywhere. If it fits the needs of the instruction, feel free to use register names R1, R2, etc. (5 points) ADD Y,A,B ;Y = A + B MULT Z,Y,C ;Z = Y C Two operand instructions One operand instructions Zero operand instructions 9. How does the number of operands allowed in an instruction affect the design of the instruction set? Be as specific as you can. (3 points) 10. Which of the following operations is performed by the zero-operand assembly language code shown to the right? (2 points) a.) Y = ((B C) + D) (A E) b.) Y = A (B (C + D) E) c.) Y = (A (B + C)) (D E) d.) Y = ((A B) + C) (D E) e.) Y = (((A + B ) C) D) E f.) None of the above ADD MULT SUB MULT POP A B C D E Y 11. Identify all of the stages such that 2 or more of them occurring at the same time could result in a bus resource conflict? Circle all that apply. (2 points) a.) Fetch instruction (FI) b.) Decode instruction (DI) c.) Calculate operands (CO) d.) Fetch operands (FO) e.) Execute Instruction (EI) f.) Write Operand (WO)

12. True or false: Just like a conditional branch, an interrupt will cause a pipeline flush. (2 points) 13. There were two factors discussed in class that prevent reaching the ideal speed up of a processor by increasing the number of pipeline stages. Describe one of them. (2 points) 14. Assume that we are executing 1 10 6 instructions using pipelined processor. If there is a 10% chance that an instruction will be a conditional branch and a 60% chance that a conditional branch will be taken, how many times will the pipeline be flushed if the system uses the branch prediction algorithm "predict never taken?" (3 points) The following three questions are based on a system that uses a single branch history bit to record whether a conditional branch instruction branched or not during its previous execution. When a branch instruction is encountered again, it predicts a branch or no-branch based on what happened during the last execution. The bit is initialized to the "did not branch" state. 15. How many incorrect predictions will occur for a for-loop during its first execution? (2 points) 16. True or false: The number of incorrect predictions will be different for the second execution of the for-loop than it was for the first execution. (2 points) 17. True or false: By increasing the number of bits used in the previous problem for branch history from 1 to 2, the number of incorrect predictions for the for-loop can be reduced. (2 points) 18. Describe why counting the number of conditional jumps contained in a section of machine code (static occurrence) is an ineffective statistic when determining the performance of a branch prediction algorithm. (2 points) RISC Processors 19. In a RISC architecture, a delayed load is required in order to: (2 points) a.) avoid flushing the pipeline. b.) avoid data dependency problems. c.) allow parallel instructions to complete before the next set of instructions is loaded. d.) There is no such thing as a delayed load.

The next two questions are based on the section of assembly language code shown below. L1: XOR AX, AX ;Clear AX L2: ADD AX, [BX] ;AX = AX + value pointed to by [BX] L3: MOV [BX], AX ;Store AX to addr pointed to by [BX] L4: SUB BX, 2 ;BX = BX - 2 L5: CMP BX, 0 ;Compare BX with 0 L6: JNE L2 ;If BX=0, go to L2 L7: MOV [final],ax ;Store AX to memory location final 20. In order to avoid flushing the pipeline of a RISC processor executing the above code, the compiler should insert a NOP instruction after which line. (2 points) a.) L2 b.) L3 c.) L4 d.) L5 e.) L6 f.) L7 g.) No NOP needed 21. If instead of inserting a NOP we wished to create an optimized delayed branch, which line(s) could replace the NOP in the previous problem? Circle all that apply. (2 points) a.) L2 b.) L3 c.) L4 d.) L5 e.) L6 f.) L7 g.) No lines can move 22. Find the absolute minimum number of registers required to execute the code below exactly as it is shown assuming that every variable in the code will need to be placed in a register to be used. (In other words, don t try to optimize the code, just the use of registers.) (2 points) a.) 2 b.) 3 c.) 4 d.) 5 e.) 6 f.) 7 g.) 8 int vara = 50; int varb << cin; // User input initializes varb if (vara < varb) { varc = vara % 5; } else if (vara > varb) { for (i = 0; i < 25; i++) varc += varb % i; } else { varc = 10; } 23. For the six architectural characteristics listed below, identify whether the statement more closely identifies a CISC architecture or a RISC architecture. (4 points) CISC RISC The execution time of an instruction is more consistent across all instructions Have a very limited number of addressing modes Tends to have more general purpose registers Use a fixed instruction length

Superscalar 24. Identify the write-read, write-write, and read-write dependencies in the instruction sequence below by entering each line pair with a dependency in the correct column of the table to the right. For example, if L1 and L4 had a write-write dependency (which they may or may not), you would enter L1-L4 in the column labeled write-write. (4 points) L1: R1 = R2 + 99 L2: R3 = R1 + 105 L3: R3 = R3 + R5 L4: R5 = R1 R2 L5: R1 = R1 + R3 write-read write-write read-write 25. In the code below, identify references to initial register values by adding the subscript 'a' to the register reference. Identify new allocations to registers with the next highest subscript and identify references to these new allocations using the same subscript. (3 points) R1 = R2 + 99 R3 = R1 + 105 R3 = R3 + R5 R5 = R1 R2 R1 = R1 + R3 The following five questions are based on the "in-order issue/in-order completion" execution sequence shown in the figure to the right. 26. Give a possible reason why I2 cannot enter the execute stage until cycle 3. (2 points) Decode Execute Write Cycle I1 I2 1 I1 I2 I1 2 I3 I4 I2 3 I3 I4 I2 4 I5 I6 I4 I3 I1 I2 5 I5 I6 I3 6 I6 I5 I3 I4 7 I6 8 I5 I6 9 27. Why doesn t I5 get written until cycle 9 when it actually is ready to be written during cycle 8? (2 points)

28. Assuming I5 and I6 do not share a data dependency, why does I6 stay in the decode stage of the pipeline for one more cycle than I5? (2 points) 29. Which instructions could have been written earlier if this had been an "in-order-issue/out-of-order completion" machine. (3 points) 30. In an out-of-order-issue/out-of-order-completion architecture, what is the earliest cycle I5 could enter the execute stage assuming there are no data dependencies? (2 points) 31. True or false: Delayed branches are used in applications compiled for a superscalar machine. (2 points) 32. Which two of the following dependencies exhibit similar behavior? Circle both. (2 points) a.) Data dependency b.) Procedural dependency c.) Resource conflict 33. There were two situations discussed in class that would cause a procedural dependency. Describe one of them. (2 points) SMP and Clusters 34. Which cache coherence protocol uses distributed control? (2 points) a.) directory protocol b.) snoopy protocol c.) both 35. Which SMP bus configuration is simplest due to its architecture being closest to the singleprocessor architecture? (2 points) a.) time-shared bus b.) multiport memory c.) central controller 36. Assume a multiprocessor system uses the MESI protocol. If the current state of a line in processor A's cache is exclusive and processor A modifies that data, but does not write the new value to main memory, what is the new state of that line in processor A's cache? (2 points) a.) modified b.) exclusive c.) shared d.) invalid e.) cannot be determined

37. Assume a multiprocessor system uses the MESI protocol. If the current state of a line in processor A's cache is exclusive and processor A modifies that data, and updates the new value in main memory, what is the new state of that line in processor A's cache? (2 points) a.) modified b.) exclusive c.) shared d.) invalid e.) cannot be determined 38. For the five characteristics listed below, identify whether the statement more closely identifies an SMP system or a cluster. (4 points) SMP Cluster Processors tightly coupled; communication usually through shared memory The newer of the two architectures Operation is closer to that of a single processor system Lower overall power consumption 39. Identify one of the three reasons discussed in class why the realized speed up of a vector processor is not a linear relation to the number of processing units, i.e., why n parallel units do not offer a speedup of n. (3 points) 40. In addition to instructions, a vector processor can also pipeline. (2 points) 41. Pick two of the following three applications and identify the parallel architecture that would best serve the application: superscalar uni-processor, symmetric multiprocessor, high availability cluster, load balancing cluster, high-performance computing cluster, a grid cluster, or a vector computer. Use a sentence or two to justify your answer. (6 points) a.) Atmospheric simulation b.) Encryption of military information c.) Using financial/economic indicators to predict stock market performance