CS6303 COMPUTER ARCHITECTURE QUESTION BANK

Stephen Griffin
5 years ago
UNIT I - OVERVIEW & INSTRUCTIONS - TWO MARKS
1. Define computer. A computer is a fast electronic calculating machine that accepts digitized data as input, processes the data, and produces information as output.
2. List the Eight Great Ideas in Computer Architecture. Design for Moore's Law; Use Abstraction to Simplify Design; Make the Common Case Fast; Performance via Parallelism; Performance via Pipelining; Performance via Prediction; Hierarchy of Memories; Dependability via Redundancy.
3. How will you improve performance via parallelism? Computer architects get more performance by performing operations in parallel. An analogy is the use of multiple jet engines on a plane.
4. What is the need for a hierarchy of memories? Programmers want memory to be fast, large, and cheap. The hierarchical arrangement of storage in current computer architectures is called the memory hierarchy. It is designed to take advantage of memory locality in computer programs. Each level of the hierarchy is faster (lower latency) and smaller than the levels below it.
5. List the components of a computer system. The underlying hardware in any computer performs the same basic functions: inputting data, outputting data, processing data, and storing data. The five classic components of a computer are input, output, memory, data path, and control (with the last two sometimes combined and called the processor).
6. What is meant by an operating system? An operating system interfaces between a user's program and the hardware and provides a variety of services and supervisory functions.
7. Define Moore's Law. Moore's law is the observation that, over the history of computing hardware, the number of transistors on integrated circuits doubles approximately every 18 to 24 months. The law is named after Intel co-founder Gordon E. Moore, who described the trend in his 1965 paper.
8. What is meant by redundancy? Redundancy is the duplication of critical components or functions of a system with the intention of increasing the reliability of the system.
9. Define pixel. A pixel is the smallest individual picture element. Screens are composed of hundreds of thousands to millions of pixels, organized in a matrix.
10. What is the use of data path and control? The processor logically comprises two main components: data path and control, the respective brawn and brain of the processor. The data path performs the arithmetic operations, and control tells the data path, memory, and I/O devices what to do according to the wishes of the instructions of the program.
11. What are the main functions of the memory unit and its types? The function of the memory unit is to store programs and data. There are two classes of storage: primary storage and secondary storage.
12. Define RAM. Memory in which any location can be reached in a short and fixed amount of time after specifying its address is called random-access memory (RAM).
13. What is memory access time? The time required to access one word is called the memory access time.
14. What is system software and what are its functions? System software is a collection of programs that are executed as needed to perform functions such as receiving and interpreting user commands, and entering and editing application programs and storing them as files in secondary storage devices.
15. List technologies for building processors and memory.
16. What is meant by a transistor? A transistor is simply an on/off switch controlled by electricity. The integrated circuit (IC) combined dozens to hundreds of transistors into a single chip.
17. Define the terms silicon and semiconductor. The manufacture of a chip begins with silicon, a substance found in sand. Because silicon does not conduct electricity well, it is called a semiconductor. With a special chemical process, it is possible to add materials to silicon that allow tiny areas to transform into one of three devices: excellent conductors of electricity (using either microscopic copper or aluminum wire); excellent insulators from electricity (like plastic sheathing or glass); areas that can conduct or insulate under special conditions (as a switch).
18. Describe the process of manufacturing integrated circuits. The process starts with a silicon crystal ingot, which looks like a giant sausage. Today, ingots are 8 to 12 inches in diameter and about 12 to 24 inches long. An ingot is finely sliced into wafers no more than 0.1 inches thick. These wafers then go through a series of processing steps, during which patterns of chemicals are placed on each wafer, creating the transistors, conductors, and insulators.
19. What is meant by a die? Dies are the individual rectangular sections that are cut from a wafer, more informally known as chips.
20. How will you calculate the cost of an integrated circuit?
21. Define the terms response time and throughput. Response time, also referred to as execution time, is the time between the start and completion of a task. Throughput, or bandwidth, is the total amount of work done in a given unit of time.
22. How will you maximize the performance of a computer? To maximize performance, we want to minimize response time or execution time for some task. Thus, for a computer X, performance and execution time are related by: Performance_X = 1 / Execution time_X.
23. Can you compare the performance of two different computers quantitatively? Yes. The performance of two computers X and Y can be related quantitatively as: Performance_X / Performance_Y = Execution time_Y / Execution time_X = n. We can then say that X is n times faster than Y.
24. Compare user CPU time and system CPU time. User CPU time is the CPU time spent in the program itself. System CPU time is the CPU time spent in the operating system performing tasks on behalf of the program.
25. What is the difference between a uniprocessor and a multiprocessor? Uniprocessor: an architecture based on a single computing unit. All operations (additions, multiplications, etc.) are done sequentially on that unit. Multiprocessor: an architecture based on multiple computing units. Some (not all) of the operations are done in parallel and the results are combined afterwards.
26. Define MIPS. MIPS (million instructions per second) is an instruction execution rate. Because it specifies performance inversely to execution time, faster computers have a higher MIPS rating.
27. Define the term word. A word is the natural unit of access in a computer, usually a group of 32 bits; it corresponds to the size of a register in the MIPS architecture.
28. What are data transfer instructions?

MIPS must include instructions that transfer data between memory and registers. Such instructions are called data transfer instructions. To access a word in memory, the instruction must supply the memory address. Memory is just a large, single-dimensional array, with the address acting as the index to that array, starting at 0.
29. What is the alignment restriction? In MIPS, words must start at addresses that are multiples of 4. This requirement is called an alignment restriction.
30. Give the format of a MIPS R-type instruction. The fields, in order, are op, rs, rt, rd, shamt, and funct. op: basic operation of the instruction, traditionally called the opcode. rs: the first register source operand. rt: the second register source operand. rd: the register destination operand; it gets the result of the operation. shamt: shift amount. funct: function; this field, often called the function code, selects the specific variant of the operation in the op field.

PART B
1. Explain the functional units of a computer.
2. Explain the Eight Great Ideas in Computer Architecture.
3. Describe the role of system software in improving the performance of a computer.
4. What are the special registers in a typical computer? Explain their purposes in detail.
5. What are addressing modes? Explain the various addressing modes with examples.
6. Briefly explain the technologies for building processors and memory.
7. Explain how the performance of a computer is measured and improved.
8. Describe the different types of operands in MIPS. Give examples.
9. Explain the various logical operations performed in MIPS with suitable examples.
10. Explain the various decision-making operations performed in MIPS with suitable examples.
11. Explain in detail the various MIPS immediate addressing modes.
12. Explain how to represent instructions in the computer with suitable examples.
13. List the advantages of a multiprocessor over a uniprocessor.

UNIT II - ARITHMETIC OPERATIONS - TWO MARKS
1. Define Arithmetic Logic Unit (ALU). Hardware that performs addition, subtraction, and usually logical operations such as AND and OR.
2. When can overflow occur in addition?

Overflow occurs when adding two positive numbers gives a negative sum, or vice versa. This spurious sum means a carry out occurred into the sign bit.
3. When will overflow not occur in addition? When adding operands with different signs, overflow cannot occur. The reason is that the sum must be no larger in magnitude than one of the operands. For example, -10 + 4 = -6. Since the operands fit in 32 bits and the sum is no larger than an operand, the sum must fit in 32 bits as well. Therefore, no overflow can occur when adding positive and negative operands.
4. Divide overflow is generated when (A) the sign of the dividend is different from that of the divisor; (B) the sign of the dividend is the same as that of the divisor; (C) the first part of the dividend is smaller than the divisor; (D) the first part of the dividend is greater than the divisor. Ans: (D). If the first part of the dividend is greater than the divisor, the quotient is longer than can be held in a register of the system, because registers are of fixed length in any processor.
5. The negative numbers in the binary system can be represented by (A) sign magnitude; (B) 1's complement; (C) 2's complement; (D) all of the above. Ans: (D).
6. Write the principle of Booth multiplication. Booth multiplication is the addition of properly shifted multiplicand patterns. The multiplier is recoded by the following steps: a) Start from the LSB and check each bit one by one. b) Change the first one to -1. c) Skip all succeeding ones (record them as zeros) until you see a zero. Change this zero to one. d) Continue to look for the next one without disturbing zeros, and proceed using rules b) and c).
7. List the advantages of the Booth algorithm. 1. It handles both positive and negative multipliers uniformly. 2. It achieves some efficiency in the number of additions required when the multiplier has a few large blocks of 1s.
8. Give the representation of a MIPS floating-point number. A floating-point number has the form (-1)^S x F x 2^E, where S is the sign bit, F involves the value in the fraction field, and E involves the value in the exponent field.
9. Give the representation of a double precision floating-point number.
10. What is meant by overflow and underflow in floating-point numbers? Overflow means that the exponent is too large to be represented in the exponent field. Overflow (floating-point): a situation in which a positive exponent becomes too large to fit in the exponent field.

Underflow (floating-point): a situation in which a negative exponent becomes too large to fit in the exponent field.
11. List the various floating-point instructions in MIPS. Floating-point addition, single (add.s) and double (add.d); floating-point subtraction, single (sub.s) and double (sub.d); floating-point multiplication, single (mul.s) and double (mul.d); floating-point division, single (div.s) and double (div.d); floating-point comparison, single (c.x.s) and double (c.x.d), where x may be equal (eq), not equal (neq), less than (lt), less than or equal (le), greater than (gt), or greater than or equal (ge); floating-point branch, true (bc1t) and false (bc1f). A floating-point comparison sets a bit to true or false, depending on the comparison condition, and a floating-point branch then decides whether or not to branch, depending on that bit.
12. What are the floating-point instructions in MIPS? MIPS supports the IEEE 754 single precision and double precision formats with these instructions: floating-point addition, subtraction, multiplication, division, comparison, and branch.
13. Define guard and round. Guard is the first of two extra bits kept on the right during intermediate calculations of floating-point numbers. It is used to improve rounding accuracy. Round is a method to make the intermediate floating-point result fit the floating-point format; the goal is typically to find the nearest number that can be represented in the format. IEEE 754, therefore, always keeps two extra bits on the right during intermediate additions, called guard and round, respectively.
14. Define ULP. Units in the last place (ULP) is the number of bits in error in the least significant bits of the significand between the actual number and the number that can be represented.
15. What is meant by the sticky bit? The sticky bit is a bit used in rounding, in addition to guard and round, that is set whenever there are nonzero bits to the right of the round bit. The sticky bit allows the computer to see the difference between 0.50...00 (base ten) and 0.50...01 (base ten) when rounding.
16. What is meant by sub-word parallelism? Given that the parallelism occurs within a wide word, the extensions are classified as sub-word parallelism. It is also classified under the more general name of data-level parallelism.

They have also been called vector or SIMD, for single instruction, multiple data. The rising popularity of multimedia applications led to arithmetic instructions that support narrower operations that can easily operate in parallel. For example, ARM added more than 100 instructions in the NEON multimedia instruction extension to support sub-word parallelism, which can be used either with ARMv7 or ARMv8.
17. What are the steps in floating-point addition? 1. Align the decimal point of the number that has the smaller exponent. 2. Add the significands. 3. Normalize the sum. 4. Round the result.
19. Define sub-word parallelism. Every microprocessor has special support so that bytes and half words take up less space when stored in memory, but because arithmetic operations on these data sizes are infrequent in typical integer programs, there was little support beyond data transfers. Architects recognized that many graphics and audio applications would perform the same operation on vectors of this data. By partitioning the carry chains within a 128-bit adder, a processor could use parallelism to perform simultaneous operations on short vectors of sixteen 8-bit operands, eight 16-bit operands, four 32-bit operands, or two 64-bit operands. The cost of such partitioned adders was small. Given that the parallelism occurs within a wide word, the extensions are classified as sub-word parallelism.
20. Define guard and round. Guard is the first of two extra bits kept on the right during intermediate calculations of floating-point numbers. It is used to improve rounding accuracy. Round is a method to make the intermediate floating-point result fit the floating-point format; the goal is typically to find the nearest number that can be represented in the format. IEEE 754, therefore, always keeps two extra bits on the right during intermediate additions, called guard and round, respectively.
21. Define ULP. Units in the last place (ULP) is the number of bits in error in the least significant bits of the significand between the actual number and the number that can be represented.
22. Write the add/subtract rule for floating-point numbers. 1) Choose the number with the smaller exponent and shift its mantissa right a number of steps equal to the difference in exponents. 2) Set the exponent of the result equal to the larger exponent. 3) Perform addition/subtraction on the mantissas and determine the sign of the result. 4) Normalize the resulting value, if necessary.
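The four steps of the add/subtract rule in question 22 can be sketched in Python. This is a minimal illustration and not IEEE 754 arithmetic: the (sign, mantissa, exponent) tuple layout, the names fp_add and fp_value, and the small mantissa width used in the example are all assumptions made for readability, and bits shifted out during alignment are simply chopped.

```python
def fp_add(a, b, bits=24):
    """Add two numbers given as (sign, mantissa, exponent) tuples.

    value = (-1)**sign * mantissa * 2**exponent, with an unsigned integer
    mantissa of at most `bits` bits. Hypothetical layout for illustration.
    """
    (s1, m1, e1), (s2, m2, e2) = a, b
    # Step 1: make operand 1 the one with the larger exponent, then shift the
    # other mantissa right by the exponent difference (lost bits are chopped).
    if e1 < e2:
        (s1, m1, e1), (s2, m2, e2) = (s2, m2, e2), (s1, m1, e1)
    m2 >>= e1 - e2
    # Step 2: the result starts with the larger exponent.
    exp = e1
    # Step 3: add the signed mantissas and determine the sign of the result.
    total = (-m1 if s1 else m1) + (-m2 if s2 else m2)
    sign = 1 if total < 0 else 0
    mag = abs(total)
    # Step 4: normalize so the mantissa again occupies exactly `bits` bits.
    while mag >= (1 << bits):
        mag >>= 1
        exp += 1
    while mag and mag < (1 << (bits - 1)):
        mag <<= 1
        exp -= 1
    return sign, mag, exp


def fp_value(x):
    """Convert a (sign, mantissa, exponent) tuple to a Python float."""
    sign, mantissa, exponent = x
    return (-1) ** sign * mantissa * 2.0 ** exponent


# 8 + 4 with a 4-bit mantissa: 1000 x 2^0 plus 1000 x 2^-1
print(fp_value(fp_add((0, 0b1000, 0), (0, 0b1000, -1), bits=4)))
```

A subtraction such as 8 - 6 goes through the same path: the operand is passed with sign 1, and step 4 shifts the small difference left until it is normalized again.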

23. Write the multiply rule for floating-point numbers. 1) Add the exponents and subtract 127. 2) Multiply the mantissas and determine the sign of the result. 3) Normalize the resulting value, if necessary.
24. What is the purpose of guard bits in floating-point arithmetic? Although the mantissas of the initial operands are limited to 24 bits, it is important to retain extra bits, called guard bits, during intermediate steps.
25. What are the ways to truncate the guard bits? There are several ways to truncate the guard bits: 1) chopping, 2) Von Neumann rounding, 3) rounding.
26. Define the carry-save addition (CSA) process. Instead of letting the carries ripple along the rows, they can be saved and introduced into the next row at the correct weighted position. The delay in CSA is less than the delay through a ripple-carry adder.
27. In conforming to the IEEE standard, mention any four situations under which a processor sets an exception flag. Underflow: in single precision, if the number requires an exponent less than -126, or in double precision, if the number requires an exponent less than -1022, to represent its normalized form, underflow occurs. Overflow: in single precision, if the number requires an exponent greater than +127, or in double precision, if the number requires an exponent greater than +1023, to represent its normalized form, overflow occurs. Divide by zero: it occurs when any number is divided by zero. Invalid: it occurs if operations such as 0/0 are attempted.
28. Why is a floating-point number more difficult to represent and process than an integer? A 32-bit integer value requires only half the memory space of an IEEE double precision floating-point value. Applications that use only integer arithmetic will therefore also have significantly smaller memory requirements. A floating-point operation is also usually slower than the equivalent integer operation.
29. Define chopping. Chopping is a simple way to truncate the guard bits: they are removed and no changes are made to the retained bits.
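The three truncation methods named in question 25 can be compared on an integer mantissa whose lowest bits are the guard bits. This is a sketch only: the function names are hypothetical, and the behavior of Von Neumann rounding (force the least significant retained bit to 1 if any removed bit is nonzero) and of rounding (add 1 at the retained LSB when the most significant guard bit is 1) follows the usual textbook definitions rather than anything stated in this question bank.

```python
def chop(value, guard_bits):
    """Chopping: drop the guard bits, leave the retained bits unchanged."""
    return value >> guard_bits


def von_neumann_round(value, guard_bits):
    """Von Neumann rounding: if any removed bit is 1, set the retained LSB to 1."""
    kept = value >> guard_bits
    removed = value & ((1 << guard_bits) - 1)
    return kept | 1 if removed else kept


def round_to_nearest(value, guard_bits):
    """Rounding: add 1 at the retained LSB when the top guard bit is 1.

    Assumes guard_bits >= 1.
    """
    return (value + (1 << (guard_bits - 1))) >> guard_bits


# Mantissa 1010 followed by three guard bits 011, 100 and 000.
for raw in (0b1010011, 0b1010100, 0b1010000):
    print(bin(raw), chop(raw, 3), von_neumann_round(raw, 3), round_to_nearest(raw, 3))
```

Chopping always errs toward zero, Von Neumann rounding spreads the error to both sides at almost no hardware cost, and rounding gives the smallest error but needs an addition that may itself require renormalization.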
30. Define half adder and full adder. The logic circuit that performs the addition of two binary digits is known as a half adder. The circuit that performs the addition of three binary digits is known as a full adder.

PART B
1. Briefly explain binary addition and subtraction in the MIPS architecture.
2. Explain the sequential version of the multiplication algorithm and hardware.
3. Compare the operations of the first version and the refined version of the multiplication hardware.
4. Explain the division algorithm and hardware with a neat diagram.
5. Compare the operations of the first version and the refined version of the division hardware.
6. Briefly explain floating-point operations.
7. Illustrate the Booth algorithm with an example.
8. Explain the representation of floating-point numbers with suitable examples.

UNIT III - PROCESSOR AND CONTROL UNIT - TWO MARKS
1. List the two steps involved in executing an instruction. Fetch the instruction; fetch the operands.
2. Define hazard and its types. Any condition that causes the pipeline to stall is called a hazard. Its types are: data hazard, instruction hazard, and structural hazard.
3. Define data hazard. A data hazard is any condition in which either the source or the destination operands of an instruction are not available at the time expected in the pipeline. In other words, it is a situation in which the pipeline is stalled because the data to be operated on are delayed for some reason.
4. Define instruction hazard or control hazard. A pipeline may also be stalled because of a delay in the availability of an instruction. This may be the result of a miss in the cache, requiring the instruction to be fetched from the main memory. Such hazards are often called control hazards.
5. Define structural hazard. A structural hazard occurs when two instructions require the use of a given hardware resource at the same time.
6. Define operand forwarding. The data are available at the output of the ALU once the Execute stage completes step E1. Hence, the delay can be reduced or possibly eliminated if we arrange for the result of instruction I1 to be forwarded directly for use in step E2. This is called operand forwarding.
7. Define branch penalty. The time lost as a result of a branch instruction is often referred to as the branch penalty.
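The effect of the branch penalty on performance can be illustrated with a short calculation. The sketch below assumes a simple model that is not taken from this question bank: each instruction ideally takes base_cpi cycles, and a given fraction of instructions are branches that each lose a fixed number of cycles. The function name, parameter names, and all numbers are hypothetical.

```python
def effective_cpi(base_cpi, branch_fraction, penalty_cycles, penalized_fraction=1.0):
    """Average cycles per instruction once branch stalls are included.

    branch_fraction:    share of executed instructions that are branches
    penalty_cycles:     cycles lost each time a branch pays the penalty
    penalized_fraction: share of branches that actually pay it
                        (1.0 means every branch stalls)
    """
    stall_cycles = branch_fraction * penalized_fraction * penalty_cycles
    return base_cpi + stall_cycles


# 20% branches with a 2-cycle penalty on every branch.
print(effective_cpi(1.0, 0.20, 2))
# Same program if only half of the branches pay the penalty.
print(effective_cpi(1.0, 0.20, 2, penalized_fraction=0.5))
```

Under this model, lowering either the penalty per branch or the fraction of branches that pay it moves the average back toward the ideal CPI.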

8. Define branch folding. The instruction fetch unit executes the branch instruction concurrently with the execution of other instructions. This technique is referred to as branch folding.
9. Define branch prediction. Branch prediction is a technique for reducing the branch penalty associated with conditional branches by attempting to predict whether or not a particular branch will be taken.
10. Define static branch prediction. The branch prediction decision is always the same every time a given instruction is executed. Any approach that has this characteristic is called static branch prediction.
11. Define dynamic branch prediction. The branch prediction decision may change depending on execution history. Any approach that has this characteristic is called dynamic branch prediction.
12. Define superscalar processor. Processors that are capable of achieving an instruction execution throughput of more than one instruction per cycle are known as superscalar processors.
13. What are the disadvantages of increasing the number of stages in pipelined processing? (Apr/May 2011) Pipelining has several disadvantages, though CPU and compiler designers use many techniques to overcome most of them. Common drawbacks are: 1. A non-pipelined processor is simpler in design and cheaper to manufacture, and it executes only a single instruction at a time. This prevents branch delays (in pipelining, every branch is delayed) as well as problems when serial instructions are executed concurrently. 2. In a pipelined processor, the insertion of flip-flops between modules increases the instruction latency compared to a non-pipelined processor. 3. A non-pipelined processor has a well-defined instruction throughput. The performance of a pipelined processor is much harder to predict and may vary widely between programs. 4. Many designs include pipelines as long as 7, 10, 20, 31 or even more stages; a disadvantage of a long pipeline is that when a program branches, the entire pipeline must be flushed (cleared).
14. Define data path element. A unit used to operate on or hold data within a processor. In the MIPS implementation, the data path elements include the instruction and data memories, the register file, the ALU, and adders.
15. Define branch target address. The address specified in a branch, which becomes the new program counter (PC) if the branch is taken. In the MIPS architecture the branch target is given by the sum of the offset field of the instruction and the address of the instruction following the branch.
16. Define register file. The processor's 32 general-purpose registers are stored in a structure called a register file. A register file is a collection of registers in which any register can be read or written by specifying the number of the register in the file. The register file contains the register state of the computer.
17. List the five stages of instruction execution. 1. IF: instruction fetch. 2. ID: instruction decode and register file read. 3. EX: execution or address calculation. 4. MEM: data memory access. 5. WB: write back.
18. Write about the branch prediction buffer. Also called the branch history table, it is a small memory that is indexed by the lower portion of the address of the branch instruction and that contains one or more bits indicating whether the branch was recently taken or not.
19. Define exception. An exception, also called an interrupt, is an unscheduled event that disrupts program execution; it is used, for example, to detect overflow.
20. Write about the classification of data hazards. A pair of instructions can produce a data hazard by reading or writing the same register or memory location. Assume that instruction i is executed before instruction j. The hazards can be classified as RAW, WAW, and WAR hazards. RAW hazard (read after write): instruction j tries to read a source operand before instruction i writes it. WAW hazard (write after write): instruction j tries to write an operand before instruction i writes it. WAR hazard (write after read): instruction j tries to write a destination before instruction i reads it.
21. How can data hazards be prevented in pipelining? Data hazards in instruction pipelining can be prevented by the following techniques: a) operand forwarding, b) software approach.
22. How do addressing modes affect instruction pipelining?
Degradation of performance in an instruction pipeline may be due to address dependency, where an operand address cannot be calculated without information needed by the addressing mode. For example, an instruction with register indirect mode cannot proceed to fetch the operand if the previous instruction is loading the address into the register. Hence operand access is delayed, degrading the performance of the pipeline.
23. List the methods used to improve system performance. 1. Processor clock. 2. Basic performance equation. 3. Pipelining. 4. Clock rate. 5. Instruction set. 6. Compiler.
24. Define pipelining. In order to reduce the overall processing time, several instructions are executed simultaneously in an overlapped manner. This process is termed pipelining.

PART B
1. Describe in detail the functional units and the basic implementation scheme of MIPS with a suitable diagram.
2. Explain how the instruction pipeline works. What are the various situations where an instruction pipeline can stall?
3. Examine the relationships between pipeline execution and addressing modes.
4. (i) Describe the role of cache memory in a pipelined system. (ii) Discuss the influence of pipelining on instruction set design.
5. What is an instruction hazard? Explain the methods for dealing with instruction hazards.
6. Describe the data path and control considerations for pipelining.
7. Describe the techniques for handling data and instruction hazards in pipelining.
8. Briefly explain exceptions.
9. Briefly explain creating a single data path with a neat diagram.
10. Explain the operations of the data path with necessary diagrams.
11. Compare single-cycle and pipelined performance.

UNIT IV - PARALLELISM - TWO MARKS
1. What are the different ways to achieve parallelism? Multiple functional units; multiple processors.
2. Define ILP. Instruction-level parallelism (ILP) is an architectural technique that allows the overlap of individual machine operations, that is, multiple operations execute in parallel.
3. What is meant by multiple issue in ILP? Multiple issue is a scheme whereby multiple instructions are launched in one clock cycle.
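Because a multiple-issue processor launches several instructions per clock cycle, its best-case CPI drops below 1. The following sketch works this out for a hypothetical 4 GHz, four-way issue processor; the function names and the numbers are illustrative assumptions, and real programs rarely sustain the peak rate because of dependences and stalls.

```python
def peak_instruction_rate(clock_hz, issue_width):
    """Upper bound on instructions per second: every issue slot filled every cycle."""
    return clock_hz * issue_width


def best_case_cpi(issue_width):
    """Best-case cycles per instruction when all issue slots are always used."""
    return 1 / issue_width


# Hypothetical 4 GHz processor that can issue up to 4 instructions per cycle.
print(peak_instruction_rate(4e9, 4))
print(best_case_cpi(4))
```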

4. Define speculation. Speculation is an approach that allows the compiler or the processor to guess about the properties of an instruction, so as to enable execution to begin for other instructions that may depend on the speculated instruction.
5. Give the compiler technique to get more performance from loops. Loop unrolling is a technique to get more performance from loops that access arrays, in which multiple copies of the loop body are made and instructions from different iterations are scheduled together.
6. Why do we need a superscalar processor? Superscalar is an advanced pipelining technique that enables the processor to execute more than one instruction per clock cycle by selecting them during execution.
7. What is meant by static multiple issue? An approach to implementing a multiple-issue processor where many decisions are made by the compiler before execution.
8. Define dynamic multiple issue. An approach to implementing a multiple-issue processor where many decisions are made during execution by the processor.
9. How does the processor determine how many instructions and which instructions can be issued in a given clock cycle? In most static issue processors, this process is at least partially handled by the compiler; in dynamic issue designs, it is normally dealt with at runtime by the processor, although the compiler will often have already tried to help improve the issue rate by placing the instructions in a beneficial order.
10. Define VLIW. A style of instruction set architecture that launches many operations that are defined to be independent in a single wide instruction, typically with many separate opcode fields.
11. Draw the three primary units of a dynamically scheduled pipeline.
12. What are the three major units of dynamic pipeline scheduling? The instruction fetch and issue unit, multiple functional units, and the commit unit.
13. What is the use of the commit unit? The commit unit is the unit in a dynamic or out-of-order execution pipeline that decides when it is safe to release the result of an operation to programmer-visible registers and memory.
14. What are reservation stations? A reservation station is a buffer within a functional unit that holds the operands and the operation.
15. Give the major limitation of the pipelining technique. If an instruction is stalled in the pipeline, no later instructions can proceed. Thus, if there is a dependency between two closely spaced instructions in the pipeline, it will stall.
16. How will you split the ID pipe stage to introduce out-of-order execution?

Issue: decode instructions and check for structural hazards. Read operands: wait until there are no data hazards, then read operands.
17. What is meant by register renaming? As instructions are issued, the register specifiers for pending operands are renamed to the names of the reservation stations in a process called register renaming. This combination of issue logic and reservation stations provides renaming and eliminates WAW and WAR hazards.
18. Consider a nonpipelined machine with 6 execution stages of lengths 50 ns, 50 ns, 60 ns, 60 ns, 50 ns, and 50 ns. Find the instruction latency on this machine. How much time does it take to execute 100 instructions? Instruction latency = 50 + 50 + 60 + 60 + 50 + 50 = 320 ns. Time to execute 100 instructions = 100 * 320 = 32000 ns.
19. What is Flynn's classification? In 1966, Michael Flynn proposed a classification for computer architectures based on the number of instruction streams and data streams (Flynn's taxonomy). Flynn uses the stream concept for describing a machine's structure. A stream simply means a sequence of items (data or instructions).
20. Give Flynn's classification of computer architectures. SISD: single instruction, single data; the classical von Neumann architecture. SIMD: single instruction, multiple data. MISD: multiple instructions, single data; nonexistent, just listed for completeness. MIMD: multiple instructions, multiple data; the most common and general parallel machine.
21. Describe SIMD. In SIMD (single instruction stream, multiple data streams), each instruction is executed on a different set of data by different processors, i.e., multiple processing units of the same type operate on multiple data streams. This group is dedicated to array processing machines. Sometimes, vector processors can also be seen as a part of this group.
22. Describe in-order issue with out-of-order completion. With out-of-order completion, a later instruction may complete before a previous instruction. Out-of-order completion is used in single-issue pipelined processors to improve the performance of long-latency operations such as divide. When using out-of-order completion, instruction issue is stalled when there is a resource conflict (e.g., for a functional unit) or when the instructions ready to issue need a result that has not yet been computed.
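In-order issue with out-of-order completion can be illustrated with a toy scheduler. The sketch below makes several simplifying assumptions that are not part of the question bank: one instruction issues per cycle in program order, an instruction stalls only until its source registers are ready, functional-unit conflicts and WAW/WAR hazards are ignored, and the instruction names, registers, and latencies are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    dest: str       # destination register
    srcs: tuple     # source registers
    latency: int    # cycles from issue to completion


def schedule(program):
    """Return (name, issue_cycle, completion_cycle) for each instruction."""
    ready = {}       # register -> cycle at which its new value is available
    next_issue = 0   # earliest cycle at which the next instruction may issue
    timeline = []
    for ins in program:
        # In-order issue: wait for the previous instruction to issue and for
        # every source operand to be computed (a RAW stall).
        issue = max([next_issue] + [ready.get(r, 0) for r in ins.srcs])
        done = issue + ins.latency
        ready[ins.dest] = done
        timeline.append((ins.name, issue, done))
        next_issue = issue + 1
    return timeline


program = [
    Instr("DIV", "r1", ("r2", "r3"), 10),  # long-latency operation
    Instr("ADD", "r4", ("r5", "r6"), 1),   # independent of the divide
    Instr("SUB", "r7", ("r1", "r4"), 1),   # needs the divide result
]
for entry in schedule(program):
    print(entry)
```

In this run the ADD issues after the DIV but completes before it, which is the out-of-order completion described in question 22, while the SUB stalls at issue until the divide result is available.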

15 23. Define Issue Slots and Issue Packet Issue slots are the positions from which instructions could be issued in a given clock cycle. By analogy, these correspond to positions at the starting blocks for a sprint. Issue packet is the set of instructions that issues together in one clock cycle; the packet may be determined statically by the compiler or dynamically by the processor. 24. Define VLIW Very Long Instruction Word (VLIW) is a style of instruction set architecture that launches many operations that are defined to be independent in a single wide instruction, typically with many separate opcode fields. 25. Define Superscalar Processor Superscalar is an advanced pipelining technique that enables the processor to execute more than one instruction per clock cycle by selecting them during execution. Dynamic multipleissue processors are also known as superscalar processors, or simply superscalars. 26. What is meant by loop unrolling? An important compiler technique to get more performance from loops is loop unrolling, where multiple copies of the loop body are made. After unrolling, there is more ILP available by overlapping instructions from different iterations. 27. What is meant by anti-dependence? How is it removed? Anti-dependence is an ordering forced by the reuse of a name, typically a register, rather than by a true dependence that carries a value between two instructions. It is also called as name dependence. Register renaming is the technique used to remove anti-dependence in which the registers are renamed by the compiler or hardware. 28. What is the use of reservation station and reorder buffer? Reservation station is a buffer within a functional unit that holds the operands and the operation. Reorder buffer is the buffer that holds results in a dynamically scheduled processor until it is safe to store the results to memory or a register. 29. Differentiate in-order execution from out-of-order execution. 
Out-of-order execution is a situation in pipelined execution in which an instruction that is blocked from executing does not cause the following instructions to wait. It preserves the data flow order of the program. In-order execution requires the instruction fetch and decode unit to issue instructions in order, which allows dependences to be tracked, and requires the commit unit to write results to registers and memory in program fetch order. This conservative mode is called in-order commit.

30. What is meant by hardware multithreading?
Hardware multithreading allows multiple threads to share the functional units of a single processor in an overlapping fashion to try to utilize the hardware resources efficiently. To permit this sharing, the processor must duplicate the independent state of each thread. It increases the utilization of a processor.

31. What are the two main approaches to hardware multithreading?

There are two main approaches to hardware multithreading. Fine-grained multithreading switches between threads on each instruction, resulting in interleaved execution of multiple threads. This interleaving is often done in a round-robin fashion, skipping any threads that are stalled at that clock cycle. Coarse-grained multithreading is an alternative to fine-grained multithreading. It switches threads only on costly stalls, such as last-level cache misses.

32. What is SMT?
Simultaneous multithreading (SMT) is a variation on hardware multithreading that uses the resources of a multiple-issue, dynamically scheduled pipelined processor to exploit thread-level parallelism. It also exploits instruction-level parallelism.

33. Differentiate SMT from hardware multithreading.
Since SMT relies on the existing dynamic mechanisms, it does not switch resources every cycle. Instead, SMT is always executing instructions from multiple threads, leaving it up to the hardware to associate instruction slots and renamed registers with their proper threads.

34. What are the three multithreading options?
The three multithreading options are:
1. A superscalar with coarse-grained multithreading
2. A superscalar with fine-grained multithreading
3. A superscalar with simultaneous multithreading

35. Write about Out-of-Order Issue with Out-of-Order Completion.
With in-order issue, the processor stops decoding instructions whenever a decoded instruction has a resource conflict or a data dependency on an issued but uncompleted instruction. The processor is not able to look beyond the conflicted instruction, even though more downstream instructions might have no conflicts and thus be issuable. With out-of-order issue, the processor fetches and decodes instructions beyond the conflicted one ("instruction window": Tetris), stores them in an instruction buffer (as long as there's room), and flags those instructions in the buffer that don't have resource conflicts or data dependencies. Flagged instructions are then issued from the buffer without regard to their program order.

PART B
1. Briefly explain instruction-level parallelism.
2. Explain the static two-issue data path with a neat diagram.
3. Describe loop unrolling for multiple-issue pipelines.
4. Explain dynamic multiple-issue processors.
5. Explain Flynn's classification.
6. Briefly explain hardware multithreading.
7. Write in detail about multicore processors.
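For the loop unrolling topic in this unit, the transformation can be sketched at source level. The function names and the unroll factor of 4 are hypothetical choices for illustration; a compiler performs the same rewrite on machine instructions, and Python itself does not issue these additions in parallel. The point is only that the four additions in one trip of the unrolled loop use separate accumulators and so do not depend on each other.

```python
def sum_rolled(values):
    """Original loop: every addition depends on the previous one."""
    total = 0
    for v in values:
        total += v
    return total


def sum_unrolled_by_4(values):
    """Same sum with the loop body copied four times (unroll factor 4)."""
    # Four separate accumulators, so the additions in one trip through the
    # loop are independent and could be overlapped by multiple-issue hardware.
    s0 = s1 = s2 = s3 = 0
    main = len(values) - len(values) % 4  # largest multiple of 4 <= length
    for i in range(0, main, 4):
        s0 += values[i]
        s1 += values[i + 1]
        s2 += values[i + 2]
        s3 += values[i + 3]
    # Clean-up loop for the leftover 0 to 3 elements.
    for v in values[main:]:
        s0 += v
    return s0 + s1 + s2 + s3


data = list(range(1, 11))
print(sum_rolled(data), sum_unrolled_by_4(data))
```

Using separate accumulators plays the same role as register renaming: it removes the name dependence on a single running total.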

UNIT V MEMORY AND I/O SYSTEMS-TWO MARKS

1. Which processors use big-endian and little-endian arrangements?
Big-endian: processor
Little-endian: Intel processor

2. What is the use of MAR and MDR?
MAR: Memory Address Register. MDR: Memory Data Register. Data transfer between the memory and the processor takes place through the use of two processor registers, called MAR and MDR. If MAR is k bits long and MDR is n bits long, then the memory unit may contain up to 2^k addressable locations.

3. Define memory access time.
The time that elapses between the initiation of an operation and the completion of that operation is called memory access time, e.g., the time between the Read and the MFC signals.

4. Define memory cycle time.
Memory cycle time is the minimum time delay required between the initiation of two successive memory operations, e.g., the time between two successive read operations.

5. What is RAM?
A memory unit is called random-access memory if any location can be accessed for a read or write operation in some fixed amount of time that is independent of the location's address.

6. What is cache memory?
Cache memory is a small, fast memory that is inserted between the larger, slower main memory and the processor. It holds the currently active segments of a program and their data.

7. Define virtual address.
The memory control circuitry translates the address specified by the program into an address that can be used to access the physical memory. In such cases the address generated by the processor is referred to as a virtual or logical address.

8. What are the uses of memory management unit?
The virtual address space is mapped onto the physical memory where data are actually stored. A special memory control circuit, often called the memory management unit, implements the mapping function.

9. Define word line.
Memory cells are usually organized in the form of an array. Each row of cells constitutes a memory word, and all cells of a row are connected to a common line called the word line.
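The relation between address width and memory size, and the way an address picks one word line, can be sketched as follows. The function names are hypothetical, and the one-hot list is only a software stand-in for the outputs of an address decoder.

```python
def addressable_locations(mar_bits):
    """A k-bit MAR can name up to 2**k distinct memory locations."""
    return 2 ** mar_bits


def word_line_select(address, address_bits):
    """Model an address decoder: exactly one word line (row) is driven to 1."""
    rows = 2 ** address_bits
    if not 0 <= address < rows:
        raise ValueError("address does not fit in the given number of bits")
    return [1 if row == address else 0 for row in range(rows)]


print(addressable_locations(16))  # number of locations for a 16-bit MAR
print(word_line_select(5, 4))     # 16 word lines, only row 5 is selected
```

With a 4-bit address the decoder has 16 outputs, one per row of cells, and each address activates exactly one of them.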

10. Define static memory.
Memories that consist of circuits capable of retaining their state as long as power is applied are known as static memories. Examples: SRAM, CMOS.

11. Write the advantages of SRAM.
SRAM can be accessed very quickly. Access times of a few nanoseconds are found in commercially available chips. It is used in applications where speed is of critical concern.

12. Which memory is called asynchronous DRAM?
A specialized memory controller circuit provides the necessary control signals, RAS (Row Address Strobe) and CAS (Column Address Strobe), that govern the timing. Such memories are called asynchronous DRAMs.

13. What is the role of mode register in SDRAM?
SDRAMs have several different modes of operation, which can be selected by writing control information into a mode register.

14. Define memory latency and bandwidth.
The term memory latency is used to refer to the amount of time it takes to transfer a word of data to or from the memory. If the performance measure is defined in terms of the number of bits or bytes that can be transferred in one second, then the measure is called memory bandwidth.

15. Define double-data-rate SDRAM.
If a device transfers data on both edges of the clock, its bandwidth is essentially doubled for long burst transfers. Such devices are called double-data-rate SDRAMs (DDR SDRAMs).

16. Define motherboard.
The main system printed circuit board that contains the processor is called the motherboard. A large memory can be built by placing DRAM chips directly on it.

17. What is the need of Rambus?
A very wide bus is expensive and requires a lot of space on a motherboard. An alternative approach is to implement a narrow bus that is much faster. This approach was used by Rambus Inc. to develop a proprietary design known as Rambus.

18. Define Rambus channel and RDRAM.
Rambus provides a complete specification for the design of communication links, called the Rambus channel.
Circuitry needed to interface to the Rambus channel is included on the chip. Such chips are called Rambus DRAMs (RDRAMs).

19. What are the types of packets?
Request
Acknowledge
Data

20. Difference between DDR SDRAM and RDRAM?
1. DDR SDRAM: It is an open standard. RDRAM: It is a proprietary design of Rambus Inc.
2. DDR SDRAM: It is cheaper than RDRAM. RDRAM: It is costlier.

3. DDR SDRAM: Performance is less adequate than RDRAM. RDRAM: Performance is adequate.

21. What is the use of EPROM chips?
ROM (Read Only Memory) compared with PROM (Programmable Read Only Memory):
1. ROM: It is not flexible and convenient. PROM: It provides flexibility and convenience.
2. ROM: It is expensive. PROM: It is less expensive.
In EPROM chips, the contents can be erased and reprogrammed. Erasure requires dissipating the charges trapped in the transistors of memory cells. Exposing the chip to UV light can do this.

22. What is the disadvantage of EPROM? (Or) Why is EEPROM needed?
In EPROM, the chip must be physically removed from the circuit for reprogramming, and its entire contents are erased by the UV light. It is possible to implement another version of erasable PROMs that can be both programmed and erased electrically. It is called EEPROM.

23. What are the advantages and disadvantages of flash drives?
Advantages:
1. They have shorter seek and access times, which results in faster response.
2. They have lower power consumption.
3. They are insensitive to vibration.
Disadvantages:
1. They have smaller capacity.
2. They are higher in cost per bit.

24. What is called locality of reference?
Many instructions in localized areas of the program are executed repeatedly during some time period, and the remainder of the program is accessed relatively infrequently. This is referred to as locality of reference.

25. Why is a replacement algorithm required?
When the cache is full and a memory word that is not in the cache is referenced, the cache control hardware must decide which block should be removed to create space for the new block that contains the referenced word. The collection of rules for making this decision constitutes the replacement algorithm.

26. Define associative mapping.
The tag bits of an address received from the processor are compared to the tag bits of each block of the cache to see if the desired block is present. This is called associative mapping.

27. What are the two ways of locality of reference?
Temporal: a recently executed instruction is likely to be executed again very soon.
Spatial: instructions in close proximity to a recently executed instruction are also likely to be executed soon.

28. What are types of mapping functions?
The correspondence between the main memory blocks and those in cache is

specified by a mapping function. Its types are:
Direct mapping
Associative mapping
Set-associative mapping

29. What is called the LRU block?
When a block is to be overwritten, it is sensible to overwrite the one that has gone the longest time without being referenced. This block is called the least recently used (LRU) block, and the technique is called the LRU replacement algorithm.

30. Define hit rate and miss rate.
A successful access to data in a cache is called a hit. The number of hits stated as a fraction of all attempted accesses is called the hit rate. The number of misses stated as a fraction of attempted accesses is called the miss rate.

31. What is miss penalty?
The extra time needed to bring the desired information into the cache is called the miss penalty.

32. Differentiate physical address from logical address.
A physical address is an address in main memory. A logical (or virtual) address is a CPU-generated address that corresponds to a location in virtual space and is translated by address mapping to a physical address when memory is accessed.

33. Define Page Fault.
A page fault is an event that occurs when an accessed page is not present in main memory.

34. Define page table and page frame.
Information about the main memory location of each page is kept in a page table. An area in the main memory that can hold one page is called a page frame.

35. What is system space and user space?
It is convenient to assemble the operating system routines into a virtual address space, called the system space, that is separate from the virtual space in which the user application programs reside. The latter space is called the user space.

36. What are the two states of the processor?
Supervisor state: the processor is placed in the supervisor state when the operating system routines are being executed.
User state: the processor is placed in the user state to execute the user programs.

37. What is Winchester technology?
In most modern disk units, the disks and the read/write heads are placed in a sealed, air-filtered enclosure. This approach is called Winchester technology.

38. What are the parts of a disk system?
The disk system consists of 3 parts. They are:
Disk: assembly of disk platters.
Disk drive: comprises the electromechanical mechanism that spins the disk

and moves the read/write heads.
Disk controller: electronic circuitry that controls the operation of the system.

39. What is the use of ECC?
Error correcting codes (ECC) are used to detect and correct errors that may have occurred in writing or reading of the 512 data bytes.

40. Define seek time, rotational time and access time.
Seek time is the time required to move the read/write head to the proper track. Rotational delay, or latency time, is the amount of time that elapses after the head is positioned over the correct track until the starting position of the addressed sector passes under the read/write head. The sum of these two delays is called the disk access time.

41. Write the controller's major functions.
Seek
Read
Write
Error checking

42. What is booting?
When the power is turned off, the contents of main memory are lost. When the power is turned on again, the OS has to be loaded into the main memory, which takes place as part of a process called booting.

43. Describe a floppy disk.
Floppy disks are smaller, simpler and cheaper disk units that consist of a flexible, removable, plastic diskette coated with magnetic material.

44. Define data striping.
A single large file is stored in several separate disk units by breaking the file up into a number of smaller pieces and storing these pieces on different disks. This is called data striping.

45. What is annealing?
This process will leave the alloy in a crystalline state that allows light to pass through.

46. Write about the laser powers used in CD-RW drives.
A CD-RW drive uses 3 different laser powers. The highest power is used to record the pits. The middle power is used to put the alloy into its crystalline state. It is called erase power. The lowest power is used to read the stored information.

47. What are cartridge tapes?
A tape system that uses an 8-mm video format tape housed in a cassette is called a cartridge tape.

48. What is pit and land?
A CD's bottom layer is polycarbonate plastic, which functions as a clear glass base. The surface of this plastic is programmed to store data by indenting it with pits. The unindented parts are called lands.
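For the magnetic disk questions earlier in this unit, the definition of disk access time as seek time plus rotational delay can be turned into a small worked calculation. The drive parameters below (7200 RPM, 9 ms average seek) are illustrative assumptions, not figures from the source, and the function names are hypothetical. The average rotational delay is taken as half of one rotation, which is the usual assumption.

```python
def rotation_time_ms(rpm):
    """Time for one full rotation, in milliseconds."""
    return 60_000.0 / rpm


def average_access_time_ms(seek_ms, rpm):
    """Disk access time = seek time + average rotational delay."""
    average_rotational_delay = rotation_time_ms(rpm) / 2  # half a rotation
    return seek_ms + average_rotational_delay


# Assumed example drive: 7200 RPM with a 9 ms average seek time.
example = average_access_time_ms(seek_ms=9.0, rpm=7200)
print(round(example, 2), "ms")
```

At 7200 RPM one rotation takes about 8.33 ms, so the average rotational delay is about 4.17 ms and the access time in this example is about 13.17 ms.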

49. Define TLB.
A translation-lookaside buffer (TLB) is a cache that keeps track of recently used address mappings to try to avoid an access to the page table.

50. What is memory interleaving? (Nov/Dec 2010) (May/June 2012)
The memory is partitioned into a number of modules connected to common memory address and data buses. A primary module is a memory array together with its own address and data registers. Figure shows a memory unit with four modules.

51. What is the use of EEPROM? (Apr/May 2011)
EEPROM (electrically erasable programmable read-only memory) is user-modifiable read-only memory (ROM) that can be erased and reprogrammed (written to) repeatedly through the application of higher than normal electrical voltage. Unlike EPROM chips, EEPROMs do not need to be removed from the computer to be modified.

52. State the hardware needed to implement the LRU in replacement algorithm. (Apr/May 2011)
The Least Recently Used (LRU) replacement policy is used to replace the cache line or page that is least recently used.

53. An address space is specified by 24 bits and the corresponding memory space by 16 bits. How many words are in the (a) virtual memory (b) main memory? (Apr/May 2011)
(a) A 24-bit address gives a virtual memory of 2^24 = 16M words. (b) A 16-bit address gives a main memory of 2^16 = 64K words.
We consider a unit of memory called a page. A page is typically rather large, for example 4 KB (4096 bytes). A page is the unit of memory that is transferred between disk and main memory. A virtual page is either in memory or on disk. Example:
32-bit virtual address space
Pages that are 4 KB large (4 KB = 2^12 bytes)
16 MB main memory (16 MB = 2^24 bytes)

54. What are the temporal and spatial localities of references?
Temporal locality (locality in time): if an item is referenced, it will tend to be referenced again soon.
Spatial locality (locality in space): if an item is referenced, items whose addresses are close by will tend to be referenced soon.

55. What are the various memory technologies?
The various memory technologies are:
1. SRAM semiconductor memory
2. DRAM semiconductor memory
3. Flash semiconductor memory
4. Magnetic disk

56. Define Rotational Latency.
Rotational latency, also called rotational delay, is the time required for the desired sector of a disk to rotate under the read/write head, usually assumed to be half the rotation time.

57. What is direct-mapped cache?
Direct-mapped cache is a cache structure in which each memory location is


CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS UNIT-I OVERVIEW & INSTRUCTIONS 1. What are the eight great ideas in computer architecture? The eight