B.H.GARDI COLLEGE OF MASTER OF COMPUTER APPLICATION


Introduction :-
An operating system (OS) exploits the hardware resources of one or more processors to provide a set of services to system users. The OS also manages secondary memory and I/O devices on behalf of its users. So it is necessary to have some understanding of the underlying computer system hardware.

Basic Elements :-
At a top level, a computer consists of processor, memory, and I/O components, with one or more modules of each type. These components are interconnected in some fashion to achieve the main function of the computer, which is to execute programs.

Processor :- Controls the operation of the computer and performs its data processing functions. When there is only one processor, it is often referred to as the Central Processing Unit (CPU).

Main Memory :- Stores data and programs. This memory is typically volatile; that is, when the computer is shut down, the contents of memory are lost. In contrast, the contents of disk memory are retained even when the computer system is shut down. Main memory is also referred to as real memory or primary memory.

I/O Modules :- Move data between the computer and its external environment. The external environment consists of a variety of devices, including secondary memory devices, communications equipment, and terminals.

System Bus :- Provides for communication among processors, main memory, and I/O modules.

One of the processor's functions is to exchange data with memory. For this purpose, it typically makes use of two internal registers:
1. Memory Address Register (MAR) :- Specifies the address in memory for the next read or write.
2. Memory Buffer Register (MBR) :- Contains the data to be written into memory, or receives the data read from memory.
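The roles of the MAR and MBR can be pictured with a small C sketch. This is only an illustrative model, not real hardware: the register names follow the text, while the memory array size and the helper functions are assumptions made for the example.

    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative model of MAR/MBR use; not a description of real hardware. */
    #define MEM_WORDS 4096              /* assumed memory size for the sketch */

    static uint16_t memory[MEM_WORDS];  /* main memory: stores data and programs   */
    static uint16_t MAR;                /* Memory Address Register: address to use */
    static uint16_t MBR;                /* Memory Buffer Register: data in transit */

    /* A memory read: the address is placed in MAR, the data arrives in MBR. */
    static uint16_t mem_read(uint16_t addr)
    {
        MAR = addr;
        MBR = memory[MAR];
        return MBR;
    }

    /* A memory write: MAR holds the address, MBR holds the data to store. */
    static void mem_write(uint16_t addr, uint16_t data)
    {
        MAR = addr;
        MBR = data;
        memory[MAR] = MBR;
    }

    int main(void)
    {
        mem_write(300, 0x1940);                      /* place a word at address 300 */
        printf("word at 300 = %04X\n", (unsigned)mem_read(300));
        return 0;
    }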

Similarly, an I/O Address Register (I/OAR) specifies a particular I/O device, and an I/O Buffer Register (I/OBR) is used for the exchange of data between an I/O module and the processor.

A memory module consists of a set of locations defined by sequentially numbered addresses. Each location contains a bit pattern that can be interpreted as either an instruction or data. An I/O module transfers data from external devices to the processor and memory, and vice versa. It contains internal buffers for temporarily holding data until they can be sent on.

Processor Registers :-
A processor includes a set of registers that provide memory that is faster and smaller than main memory. Processor registers serve two functions:

1. User-visible Registers: Enable the machine or assembly language programmer to minimize main memory references by optimizing register use. For high-level languages, an optimizing compiler will attempt to make intelligent choices of which variables to assign to registers and which to main memory locations. Some high-level languages, such as C, allow the programmer to suggest to the compiler which variables should be held in registers.

2. Control and Status Registers: Used by the processor to control the operation of the processor and by privileged OS routines to control the execution of programs.

The register set can be pictured as a small hierarchy: user-visible registers (data registers and address registers, the latter including the index register, segment pointer, and stack pointer) and control and status registers (such as the program counter and instruction register).

User-visible Registers :-
A user-visible register may be referenced by means of the machine language that the processor executes and is generally available to all programs, including application programs as well as system programs. Types of registers that are typically available are data, address, and condition code registers.

Data Registers :- Data registers can be assigned to a variety of functions by the programmer. In some cases, they are general purpose in nature and can be used with any machine instruction that performs operations on data. Often, however, there are restrictions. For example, there may be dedicated registers for floating-point operations and others for integer operations.

Address Registers :- Address registers contain main memory addresses of data and instructions, or they contain a portion of an address that is used in the calculation of the complete or effective address. There are different modes of addressing. For example:

Index Register: Indexed addressing is a common mode of addressing that involves adding an index to a base value to get the effective address.

Segment Pointer: With segmented addressing, memory is divided into segments, which are variable-length blocks of words (a word being an ordered set of bytes or bits). A memory reference consists of a reference to a particular segment plus an offset within it. In this mode of addressing, a register is used to hold the base address (starting location) of the segment. There may be multiple such registers; for example, one for the OS and one for the currently running application.

Stack Pointer: If there is user-visible stack addressing, then there is a dedicated register that points to the top of the stack. This allows the use of instructions that contain no address field, such as push and pop.

For some processors, when a procedure is called, the contents of all the user-visible registers are saved automatically. Saving and restoring is performed by the processor as part of the execution of the call and return instructions. This allows each procedure to use these registers independently. On other processors, the programmer must save the contents of the relevant user-visible registers prior to a procedure call, by including instructions for this purpose in the program. Thus, saving and restoring may be performed in either hardware or software, depending on the processor.

Control and Status Registers :-
A variety of processor registers are employed to control the operation of the processor. On most processors, most of these are not visible to the user. Some of them may be accessible only by machine instructions executed in a control or kernel mode. Different processors will have different register organizations and use different terminology.

Program Counter (PC): Contains the address of the next instruction to be fetched.
Instruction Register (IR): Contains the instruction most recently fetched.

All processor designs also include a register or set of registers, often known as the Program Status Word (PSW), that contains status information. The PSW typically contains condition codes plus other status information, such as an interrupt enable/disable bit and a kernel/user mode bit.

Condition codes (also known as flags) are bits typically set by the processor as the result of operations. For example, an arithmetic operation may produce a positive, negative, zero, or overflow result. In addition to the result itself being stored in a register or memory, a condition code is also set after the execution of the arithmetic instruction. The condition code may subsequently be tested as part of a conditional branch operation. Generally, machine instructions allow these bits to be read by implicit reference, but they cannot be altered by explicit reference, because they are intended for feedback regarding the results of instruction execution.
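The idea of a program status word holding condition codes can be illustrated with a small C sketch. The bit positions and flag names below are assumptions made for the example, not the layout of any particular processor.

    #include <stdint.h>
    #include <stdio.h>

    /* Assumed PSW bit layout, for illustration only. */
    #define PSW_ZERO       (1u << 0)   /* result was zero              */
    #define PSW_NEGATIVE   (1u << 1)   /* result was negative          */
    #define PSW_OVERFLOW   (1u << 2)   /* arithmetic overflow occurred */
    #define PSW_INT_ENABLE (1u << 3)   /* interrupts enabled           */
    #define PSW_KERNEL     (1u << 4)   /* kernel (vs. user) mode       */

    static uint32_t PSW;

    /* Condition codes are set implicitly, as a side effect of an arithmetic result. */
    static int16_t alu_add(int16_t a, int16_t b)
    {
        int32_t wide   = (int32_t)a + (int32_t)b;
        int16_t result = (int16_t)wide;

        PSW &= ~(PSW_ZERO | PSW_NEGATIVE | PSW_OVERFLOW);
        if (result == 0) PSW |= PSW_ZERO;
        if (result <  0) PSW |= PSW_NEGATIVE;
        if (wide > 32767 || wide < -32768) PSW |= PSW_OVERFLOW;
        return result;
    }

    int main(void)
    {
        alu_add(30000, 10000);          /* overflows a 16-bit signed result */
        printf("Z=%d N=%d V=%d\n",
               !!(PSW & PSW_ZERO), !!(PSW & PSW_NEGATIVE), !!(PSW & PSW_OVERFLOW));
        return 0;
    }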

Instruction Execution :-
A program to be executed by the processor consists of a set of instructions stored in memory. In its simplest form, instruction processing consists of two steps: the processor reads (fetches) instructions from memory one at a time and executes each instruction. Program execution consists of repeating the process of instruction fetch and instruction execution. Instruction execution may involve several operations and depends on the nature of the instruction. The processing required for a single instruction is called an instruction cycle.

BASIC INSTRUCTION CYCLE :-
START -> Fetch Next Instruction (fetch stage) -> Execute Instruction (execute stage) -> HALT

The two steps are referred to as the fetch stage and the execute stage. Program execution halts only if the processor is turned off, some sort of unrecoverable error occurs, or a program instruction that halts the processor is encountered.

Instruction Fetch and Execution :-
At the beginning of each instruction cycle, the processor fetches an instruction from memory. Typically, the program counter (PC) holds the address of the next instruction to be fetched. Unless instructed otherwise, the processor always increments the PC after each instruction fetch so that it will fetch the next instruction in sequence (i.e., the instruction located at the next higher memory address).

For example, consider a computer in which each instruction occupies one 16-bit word of memory. Assume that the program counter is set to location 300. At the start of the instruction cycle, the processor will fetch the instruction at location 300. On succeeding instruction cycles, it will fetch instructions from locations 301, 302, 303, and so on.

The fetched instruction is loaded into the Instruction Register (IR). The instruction contains bits that specify the action the processor is to take. The processor interprets the instruction and performs the required action. In general, these actions fall into four categories:

1. Processor-memory: Data may be transferred from processor to memory or from memory to processor.
2. Processor-I/O: Data may be transferred to or from a peripheral device by transferring between the processor and an I/O module.

3. Data processing: The processor may perform some arithmetic or logic operation on data.
4. Control: An instruction may specify that the sequence of execution be altered. For example, the processor may fetch an instruction from location 149 which specifies that the next instruction be taken from location 182. The processor sets the program counter to 182. Thus, on the next fetch stage, the instruction will be fetched from location 182 rather than 150.

Characteristics of a Hypothetical Machine :-
(a) Instruction format (16 bits): bits 0-3 opcode, bits 4-15 address.
(b) Integer format (16 bits): bit 0 sign (S), bits 1-15 magnitude.
(c) Internal CPU registers: Program Counter (PC) = address of the instruction; Instruction Register (IR) = instruction being executed; Accumulator (AC) = temporary storage.
(d) Partial list of opcodes: 0001 = Load AC from memory; 0010 = Store AC to memory; 0101 = Add to AC from memory.

An instruction execution may involve a combination of these actions. Consider a simple example using a hypothetical processor that includes the characteristics listed above. The processor contains a single data register, called the Accumulator (AC). Both instructions and data are 16 bits long, and memory is organized as a sequence of 16-bit words. The instruction format provides 4 bits for the opcode, allowing as many as 2^4 = 16 different opcodes, each represented by a single hexadecimal digit. The opcode defines the operation to be performed by the processor. With the remaining 12 bits of the instruction format, up to 2^12 = 4096 words of memory can be directly addressed, an address being denoted by three hexadecimal digits.

The example of program execution below illustrates a partial program execution, showing the relevant portions of memory and processor registers.

1. The program counter (PC) contains 300, the address of the first instruction. This instruction is loaded into the instruction register (IR) and the PC is incremented. This process involves the Memory Address Register (MAR) and the Memory Buffer Register (MBR).

2. The first 4 bits (the first hexadecimal digit) in the IR indicate that the AC is to be loaded from memory. The remaining 12 bits (three hexadecimal digits) specify the address, which is 940.
3. The next instruction (5941) is fetched from location 301 and the PC is incremented.
4. The old contents of the AC and the contents of location 941 are added and the result is stored in the AC.
5. The next instruction (2941) is fetched from location 302 and the PC is incremented.
6. The contents of the AC are stored at location 941.

In this example, three instruction cycles, each consisting of a fetch stage and an execute stage, are needed to add the contents of location 940 to the contents of location 941.

(Figure: Example of Program Execution)
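The three-instruction example above can be traced with a small C simulator of the hypothetical machine. The opcodes (1 = load AC, 2 = store AC, 5 = add to AC) and the program words 1940, 5941, and 2941 come from the text; the sketch treats all addresses as hexadecimal (consistent with the 3-hex-digit address field), and the data values placed at locations 940 and 941 are arbitrary values chosen for the demonstration.

    #include <stdint.h>
    #include <stdio.h>

    #define MEM_WORDS 4096                  /* 2^12 directly addressable words */

    static uint16_t memory[MEM_WORDS];
    static uint16_t PC, IR, AC;             /* program counter, instruction register, accumulator */

    static void run(uint16_t start)
    {
        PC = start;
        for (;;) {
            IR = memory[PC++];              /* fetch stage: read instruction, advance PC */
            uint16_t opcode  = IR >> 12;    /* top 4 bits  */
            uint16_t address = IR & 0x0FFF; /* low 12 bits */

            switch (opcode) {               /* execute stage */
            case 0x1: AC = memory[address];                  break; /* load AC from memory */
            case 0x2: memory[address] = AC;                  break; /* store AC to memory  */
            case 0x5: AC = (uint16_t)(AC + memory[address]); break; /* add to AC           */
            default:  return;               /* unknown opcode: halt the simulation */
            }
        }
    }

    int main(void)
    {
        memory[0x300] = 0x1940;             /* load AC from location 940  */
        memory[0x301] = 0x5941;             /* add contents of 941 to AC  */
        memory[0x302] = 0x2941;             /* store AC back to 941       */
        memory[0x940] = 3;                  /* sample data (assumed)      */
        memory[0x941] = 2;                  /* sample data (assumed)      */

        run(0x300);
        printf("AC = %u, memory[941h] = %u\n",
               (unsigned)AC, (unsigned)memory[0x941]);   /* expect 5 and 5 */
        return 0;
    }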

I/O Function :-
Data can be exchanged directly between an I/O module (e.g., a disk controller) and the processor. Just as the processor can initiate a read or write with memory, specifying the address of a memory location, the processor can also read data from or write data to an I/O module. In this latter case, the processor identifies a specific device that is controlled by a particular I/O module. Thus, an instruction sequence similar to the previous example could occur, with I/O instructions rather than memory-referencing instructions.

Interrupts :-
Virtually all computers provide a mechanism by which other modules (I/O, memory) may interrupt the normal sequencing of the processor. Interrupts are provided primarily as a way to improve processor utilization. For example, most I/O devices are much slower than the processor. Suppose the processor is transferring data to a printer using the instruction cycle scheme described above. After each write operation, the processor must pause and remain idle until the printer catches up. The length of this pause may be on the order of many thousands or even millions of instruction cycles. Clearly, this is a very wasteful use of the processor.

To give a specific example, consider a PC that operates at 1 GHz, which would allow roughly 10^9 instructions per second. A typical hard disk has a rotational speed of 7200 rpm for a half-track rotation time of 4 ms, which is 4 million times slower than the processor.

Classes of Interrupts :-
Program: Generated by some condition that occurs as a result of an instruction execution, such as arithmetic overflow, division by zero, an attempt to execute an illegal machine instruction, or a reference outside the allowed memory space.
Timer: Generated by a timer within the processor. This allows the operating system to perform certain functions on a regular basis.
I/O: Generated by an I/O controller, to signal normal completion of an operation or to signal a variety of error conditions.
Hardware failure: Generated by a failure, such as a power failure or a memory parity error.

Figure 1.5 shows a user program that performs a series of WRITE calls interleaved with processing. The solid lines represent segments of code in a program. Code segments 1, 2, and 3 refer to sequences of instructions that do not involve I/O. The WRITE calls are to an I/O program that is a system utility and that will perform the actual I/O operation. The I/O program consists of three sections:

1. A sequence of instructions, labeled 4 in the figure, to prepare for the actual I/O operation. This may include copying the data to be output into a special buffer and preparing the parameters for a device command.
2. The actual I/O command. Without the use of interrupts, once this command is issued, the program must wait for the I/O device to perform the requested function. The program might wait by simply repeatedly performing a test operation to determine if the I/O operation is done.
3. A sequence of instructions, labeled 5 in the figure, to complete the operation. This may include setting a flag indicating the success or failure of the operation.

The dashed line represents the path of execution followed by the processor; that is, this line shows the sequence in which instructions are executed. Thus, after the first WRITE instruction is encountered, the user program is interrupted and execution continues with the I/O program. After the I/O program has completed, execution resumes in the user program immediately following the WRITE instruction.

Because the I/O operation may take a relatively long time to complete, the I/O program is hung up waiting for the operation to complete; hence, the user program is stopped at the point of the WRITE call for some considerable period of time.

Interrupts and the Instruction Cycle :-
With interrupts, the processor can be engaged in executing other instructions while an I/O operation is in progress. After the first few I/O-preparation instructions have been executed, control returns to the user program. Meanwhile, the external device is busy accepting data from computer memory and printing it. When the external device becomes ready to be serviced, that is, when it is ready to accept more data from the processor, the I/O module for that external device sends an interrupt request signal to the processor. The processor responds by suspending operation of the current program, branching off to a routine that services that particular I/O device, known as an interrupt handler, and resuming the original execution after the device has been serviced.

From the point of view of the user program, an interrupt suspends the normal sequence of execution. When the interrupt processing is completed, execution resumes. Thus, the user program does not have to contain any special code to accommodate interrupts; the processor and the OS are responsible for suspending the user program and then resuming it at the same point.

To accommodate interrupts, an interrupt stage is added to the instruction cycle. In the interrupt stage, the processor checks to see whether any interrupts have occurred, indicated by the presence of an interrupt signal. If no interrupts are pending, the processor proceeds to the fetch stage and fetches the next instruction of the current program. If an interrupt is pending, the processor suspends execution of the current program and executes an interrupt-handler routine. The interrupt-handler routine is generally part of the OS.

INSTRUCTION CYCLE WITH INTERRUPTS :-
START -> Fetch Next Instruction (fetch stage) -> Execute Instruction (execute stage) -> Check for Interrupt / Process Interrupt (interrupt stage) -> back to the fetch stage, or HALT
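The instruction cycle with an added interrupt stage can be sketched as a loop in C. This is a schematic model only: interrupts_enabled, interrupt_pending, and the handler functions are assumed names standing in for hardware signals and OS routines.

    #include <stdbool.h>
    #include <stdio.h>

    /* Assumed stand-ins for hardware state and OS routines (illustration only). */
    static bool halted;
    static bool interrupts_enabled = true;
    static bool interrupt_pending;
    static unsigned pc;

    static unsigned fetch_next_instruction(void) { return pc++; }    /* fetch stage */

    static void execute_instruction(unsigned instr)                  /* execute stage */
    {
        printf("executing instruction %u\n", instr);
        if (instr == 2) interrupt_pending = true;    /* pretend a device raised a request */
        if (instr == 4) halted = true;               /* pretend a HALT instruction        */
    }

    static void run_interrupt_handler(void)          /* interrupt stage */
    {
        printf("  interrupt handler runs, then the program resumes\n");
    }

    int main(void)
    {
        while (!halted) {
            execute_instruction(fetch_next_instruction());
            /* Interrupt stage: checked once per instruction cycle. */
            if (interrupts_enabled && interrupt_pending) {
                /* Hardware would save the PC and PSW on the control stack here;
                   the handler saves any further registers it needs. */
                run_interrupt_handler();
                interrupt_pending = false;
            }
        }
        return 0;
    }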

Transfer of Control via Interrupts :-
To appreciate the gain in efficiency, consider the timing diagrams based on the flow of control in Figures 1.5a and 1.5b, and in Figures 1.5a and 1.5c.

Figures 1.5b and 1.8 assume that the time required for the I/O operation is relatively short. The more typical case, especially for a slow device such as a printer, is that the I/O operation will take much more time than executing a sequence of user instructions. In this case, the user program reaches the second WRITE call before the I/O operation spawned by the first WRITE call has completed, with the result that the user program is hung up at that point.

Interrupt Processing :-
An interrupt triggers a number of events, both in the processor hardware and in software. When an I/O device completes an I/O operation, the following sequence of hardware events occurs:

1. The device issues an interrupt signal to the processor.
2. The processor finishes execution of the current instruction before responding to the interrupt.
3. The processor tests for a pending interrupt request. If one is found, it sends an acknowledgement signal to the device that issued the interrupt. The acknowledgement allows the device to remove its interrupt signal.

4. The processor next needs to prepare to transfer control to the interrupt routine. To begin, it saves the information needed to resume the current program at the point of interrupt. The minimum information required is the program status word (PSW) and the location of the next instruction to be executed, which is contained in the program counter (PC). These can be pushed onto a control stack.
5. The processor then loads the program counter with the entry location of the interrupt-handling routine that will respond to this interrupt. Depending on the computer architecture and OS design, there may be a single program, one for each type of interrupt, or one for each device and each type of interrupt. If there is more than one interrupt-handling routine, the processor must determine which one to invoke. This information may have been included in the original interrupt signal, or the processor may have to issue a request to the device that issued the interrupt to get a response that contains the needed information.

Once the program counter has been loaded, the processor proceeds to the next instruction cycle, which begins with an instruction fetch.

6. At this point, the program counter and the PSW relating to the interrupted program have been saved on the control stack. However, there is other information that is considered part of the state of the executing program. In particular, the contents of the processor registers need to be saved, because these registers may be used by the interrupt handler. Typically, the interrupt handler begins by saving the contents of all registers on the stack.
7. The interrupt handler may now proceed to process the interrupt. This includes an examination of status information relating to the I/O operation or other event that caused the interrupt. It may also involve sending additional commands or acknowledgements to the I/O device.
8. When interrupt processing is complete, the saved register values are retrieved from the stack and restored to the registers.
9. The final act is to restore the PSW and program counter values from the stack. As a result, the next instruction to be executed will be from the previously interrupted program.

Multiple Interrupts :-
So far, we have discussed the occurrence of a single interrupt. Suppose, however, that one or more interrupts can occur while an interrupt is being processed. For example, a program may be receiving data from a communications line and printing results at the same time. The communications line controller will generate an interrupt every time a unit of data arrives. The unit could be a single character or a block, depending on the nature of the communications discipline. In any case, it is possible for a communications interrupt to occur while a printer interrupt is being processed.

How to Deal with Multiple Interrupts :-
Two approaches can be taken to dealing with multiple interrupts.

1. The first is to disable interrupts while an interrupt is being processed. Disabled interrupts simply means that the processor ignores any new interrupt request signal. If an interrupt occurs during this time, it generally remains pending and will be checked by the processor after the processor has re-enabled interrupts. Thus, when a user program is executing and an interrupt occurs, interrupts are disabled immediately. After the interrupt-handler routine completes, interrupts are re-enabled before resuming the user program, and the processor checks to see whether additional interrupts have occurred. This approach is simple, as interrupts are handled in strict sequential order.

The drawback to this approach is that it does not take into account relative priority or time-critical needs. For example, when input arrives from the communications line, it may need to be absorbed rapidly to make room for more input. If the first batch of input has not been processed before the second batch arrives, data may be lost because the buffer on the I/O device may fill and overflow.

2. A second approach is to define priorities for interrupts and to allow an interrupt of higher priority to cause a lower-priority interrupt handler to itself be interrupted. As an example of this second approach, consider a system with three I/O devices: a printer, a disk, and a communications line, with increasing priorities of 2, 4, and 5, respectively.
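The second approach can be sketched in C. The device names and priority values (printer = 2, disk = 4, communications = 5) follow the example in the text; the dispatch logic and function names are assumptions made for illustration, not a model of any real interrupt controller.

    #include <stdio.h>

    /* Priorities from the example: a higher number means higher priority. */
    enum { PRIO_NONE = 0, PRIO_PRINTER = 2, PRIO_DISK = 4, PRIO_COMM = 5 };

    static int current_priority = PRIO_NONE;   /* priority of the ISR now running */

    /* A higher-priority request preempts the running ISR; an equal or
       lower-priority request is held pending until the current ISR finishes. */
    static void interrupt_request(int prio, const char *name)
    {
        if (prio > current_priority) {
            int saved = current_priority;      /* state pushed onto the stack */
            current_priority = prio;
            printf("servicing %s ISR (priority %d)\n", name, prio);
            current_priority = saved;          /* state restored on return    */
        } else {
            printf("%s interrupt held pending (priority %d <= %d)\n",
                   name, prio, current_priority);
        }
    }

    int main(void)
    {
        interrupt_request(PRIO_PRINTER, "printer");   /* accepted                     */
        current_priority = PRIO_COMM;                 /* pretend the comm ISR is running */
        interrupt_request(PRIO_DISK, "disk");         /* held: lower priority          */
        current_priority = PRIO_NONE;                 /* comm ISR finishes             */
        interrupt_request(PRIO_DISK, "disk");         /* now accepted                  */
        return 0;
    }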

Figure 1.13 illustrates a possible sequence. A user program begins at t = 0. At t = 10, a printer interrupt occurs; user information is placed on the control stack and execution continues with the printer interrupt service routine (ISR). While this routine is still executing, at t = 15, a communications interrupt occurs. Because the communications line has higher priority than the printer, the interrupt request is honored. The printer ISR is interrupted, its state is pushed onto the stack, and execution continues with the communications ISR. While this routine is executing, a disk interrupt occurs at t = 20. Because this interrupt has lower priority than the communications device, it is simply held, and the communications ISR runs to completion. When the communications ISR completes at t = 25, the previous processor state is restored, which is the execution of the printer ISR. However, before even a single instruction in that routine can be executed, the processor honors the higher-priority disk interrupt and transfers control to the disk ISR. Only when that routine completes at t = 40 does control finally return to the user program.

Memory Hierarchy :-
The design constraints on a computer's memory can be summed up by three questions: How much? How fast? How expensive?

The question of how much is somewhat open ended: if the capacity is there, applications will likely be developed to use it. The question of how fast is, in a sense, easier to answer. To achieve the greatest performance, the memory must be able to keep up with the processor. That is, as the processor is executing instructions, we would not want it to have to pause waiting for instructions or operands. The final question, how expensive, must also be considered. For a practical system, the cost of memory must be reasonable in relation to that of the other components.

As might be expected, there is a trade-off among the three key characteristics of memory: namely, capacity, access time, and cost. The dilemma facing the designer is clear. The designer would like to use memory technologies that provide large-capacity memory, both because the capacity is needed and because the cost per bit is low. However, to meet performance requirements, the designer needs to use expensive, relatively lower-capacity memories with fast access times.

Characteristics of Memory :-
1. Capacity
2. Access Time
3. Cost

Going down the memory hierarchy, the following occur:
A. Decreasing cost per bit.
B. Increasing capacity.
C. Increasing access time.
D. Decreasing frequency of access to the memory by the processor.

Thus, smaller, more expensive, faster memories are supplemented by larger, cheaper, slower memories. The key to the success of this organization is item (D): decreasing frequency of access at the lower levels.

Suppose that the processor has access to two levels of memory. Level 1 contains 1,000 bytes and has an access time of 0.1 µs; level 2 contains 100,000 bytes and has an access time of 1 µs. If the byte to be accessed is in level 1, the processor accesses it directly. If it is in level 2, the byte is first transferred to level 1 and then accessed by the processor. For simplicity, we ignore the time required for the processor to determine whether the byte is in level 1 or in level 2.

The figure for this situation shows the general shape of the curve that models it: the average access time to a two-level memory as a function of the hit ratio H, where H is defined as the fraction of all memory accesses that are found in the faster memory (for example, the cache), T1 is the access time to level 1, and T2 is the access time to level 2. The average access time can be written as

Ts = H(T1) + (1 - H)(T1 + T2)

For high percentages of level 1 accesses, the average total access time is much closer to that of level 1 than to that of level 2. In our example, suppose 95% of the memory accesses are found in level 1 (H = 0.95). Then the average time to access a byte can be expressed as

(0.95)(0.1 µs) + (0.05)(0.1 µs + 1 µs) = 0.095 + 0.055 = 0.15 µs

The result is close to the access time of the faster memory. So the strategy of using two memory levels works in principle, but only if conditions (A) through (D) in the list above apply. The basis for the validity of condition (D) is a principle known as locality of reference. During the execution of a program, memory references by the processor, for both instructions and data, tend to cluster. Programs typically contain a number of iterative loops and subroutines. Once a loop or subroutine is entered, there are repeated references to a small set of instructions. Similarly, operations on tables and arrays involve access to a clustered set of data bytes. Accordingly, it is possible to organize data across the hierarchy such that the percentage of accesses to each successively lower level is substantially less than that of the level above.

Let level 2 memory contain all program instructions and data. The current clusters can be temporarily placed in level 1. From time to time, one of the clusters in level 1 will have to be swapped back to level 2 to make room for a new cluster coming into level 1. This principle can be applied across more than two levels of memory. The fastest, smallest, and most expensive type of memory consists of the registers internal to the processor. Each location in main memory has a unique address, and most machine instructions refer to one or more main memory addresses. Data are stored more permanently on external mass storage devices, of which the most common are the hard disk and removable media such as removable disks, tape, and optical storage. External, nonvolatile memory is also referred to as secondary memory or auxiliary memory. These are used to store program and data files and are usually visible to the programmer only in terms of files and records.

Additional levels can be effectively added to the hierarchy in software. For example, a portion of main memory can be used as a buffer to temporarily hold data that are to be read out to disk. Such a technique, sometimes referred to as a disk cache, improves performance in two ways:
1. Instead of many small transfers of data, we have a few large transfers of data. This improves disk performance and minimizes processor involvement.
2. Some data destined for write-out may be referenced by a program before the next dump to disk. In that case, the data are retrieved rapidly from the software cache rather than slowly from the disk.
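The two-level access-time calculation above can be checked with a few lines of C. The formula Ts = H(T1) + (1 - H)(T1 + T2) and the figures T1 = 0.1 µs, T2 = 1 µs, H = 0.95 all come from the text.

    #include <stdio.h>

    int main(void)
    {
        double t1  = 0.1;    /* level 1 (faster memory) access time, in microseconds */
        double t2  = 1.0;    /* level 2 (main memory) access time, in microseconds   */
        double hit = 0.95;   /* hit ratio H: fraction of accesses found in level 1   */

        /* A miss costs a level 1 check plus a level 2 access. */
        double avg = hit * t1 + (1.0 - hit) * (t1 + t2);

        printf("average access time = %.3f microseconds\n", avg);   /* prints 0.150 */
        return 0;
    }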

Cache Memory :-
Although cache memory is invisible to the OS, it interacts with other memory management hardware.

Motivation :- On every instruction cycle, the processor accesses memory at least once, to fetch the instruction, and often one or more additional times, to fetch operands and/or store results. The rate at which the processor can execute instructions is clearly limited by the memory cycle time. This limitation has been a significant problem because of the persistent mismatch between processor and main memory speeds. Ideally, main memory should be built with the same technology as that of the processor registers, giving memory cycle times comparable to processor cycle times. The solution is to exploit the principle of locality by providing a small, fast memory between the processor and main memory, namely the cache.

Cache Principles :-
(Figure: Cache and main memory. Byte or word transfers take place between the processor and the cache; block transfers take place between the cache and main memory.)

There is a relatively large and slow main memory together with a smaller, faster cache memory. The cache contains a copy of a portion of main memory. When the processor attempts to read a byte or word of memory, a check is made to determine whether the byte or word is in the cache. If so, the byte or word is delivered to the processor. If not, a block of main memory, consisting of some fixed number of bytes, is read into the cache and then the byte or word is delivered to the processor.

The structure of the cache/main-memory system is as follows. Main memory consists of up to 2^n addressable words, with each word having a unique n-bit address. For mapping purposes, this memory is considered to consist of a number of fixed-length blocks of K words each; that is, there are M = 2^n / K blocks. The cache contains C slots (also referred to as lines) of K words each, and the number of slots is considerably less than the number of blocks in main memory (C < M). Some subset of the blocks of main memory resides in the slots of the cache. If a word in a block of memory that is not in the cache is read, that block is transferred to one of the slots of the cache. Because the number of blocks in main memory is greater than the number of slots in the cache, an individual slot cannot be uniquely and permanently dedicated to a particular block. Therefore, each slot includes a tag that identifies which particular block is currently being stored.
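The slot-and-tag organization described above can be sketched in C. For concreteness the sketch assumes a direct-mapped placement (block number modulo the number of slots); the mapping functions actually used in caches are discussed under cache design below, and all sizes and names here are assumptions made for the example.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define WORDS_PER_BLOCK 4      /* K: words per block (assumed)           */
    #define NUM_SLOTS       8      /* C: cache slots/lines (assumed, C < M)  */
    #define MEM_WORDS       1024   /* 2^n addressable words (assumed)        */

    static uint16_t main_memory[MEM_WORDS];

    static struct {
        int      valid;                      /* does this slot hold a block?  */
        uint32_t tag;                        /* which block is currently here */
        uint16_t data[WORDS_PER_BLOCK];      /* copy of the block             */
    } cache[NUM_SLOTS];

    /* Read one word through the cache, loading the whole block on a miss. */
    static uint16_t cache_read(uint32_t addr)
    {
        uint32_t block  = addr / WORDS_PER_BLOCK;
        uint32_t offset = addr % WORDS_PER_BLOCK;
        uint32_t slot   = block % NUM_SLOTS;          /* direct mapping (assumed) */

        if (!cache[slot].valid || cache[slot].tag != block) {
            /* Miss: transfer the block from main memory into the slot. */
            memcpy(cache[slot].data, &main_memory[block * WORDS_PER_BLOCK],
                   sizeof(cache[slot].data));
            cache[slot].tag   = block;
            cache[slot].valid = 1;
            printf("miss on address %u (block %u -> slot %u)\n",
                   (unsigned)addr, (unsigned)block, (unsigned)slot);
        }
        return cache[slot].data[offset];     /* hit path: word comes from the cache */
    }

    int main(void)
    {
        main_memory[41] = 1234;
        printf("first read:  %u\n", (unsigned)cache_read(41));   /* miss, then value */
        printf("second read: %u\n", (unsigned)cache_read(41));   /* hit              */
        return 0;
    }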

When the processor generates the address (RA) of a word to be read, the word is delivered to the processor if it is contained in the cache. Otherwise, the block containing that word is loaded into the cache and the word is then delivered to the processor.

Cache Design :-
The key elements of cache design are briefly summarized here:
Cache size
Block size
Mapping function
Replacement algorithm
Write policy

We have already dealt with the issue of cache size. It turns out that reasonably small caches can have a significant impact on performance. Another size issue is that of block size: the unit of data exchanged between cache and main memory. As the block size increases, more useful data are brought into the cache, and the hit ratio will at first increase because of the principle of locality. The hit ratio will begin to decrease, however, as the block becomes even bigger and the probability of using the newly fetched data becomes less than the probability of reusing the data that must be moved out of the cache to make room for the new block.

When a new block of data is read into the cache, the mapping function determines which cache location the block will occupy. Two constraints affect the design of the mapping function. First, when one block is read in, another may have to be replaced. We would like to do this in such a way as to minimize the probability that we will replace a block that will be needed in the near future. The more flexible the mapping function, the more scope we have to design a replacement algorithm to maximize the hit ratio. Second, the more flexible the mapping function, the more complex is the circuitry required to search the cache to determine whether a given block is in the cache.

The replacement algorithm chooses, within the constraints of the mapping function, which block to replace when a new block is to be loaded into the cache and the cache already has all slots filled with other blocks. We would like to replace the block that is least likely to be needed again in the near future. Although it is impossible to identify such a block, a reasonably effective strategy is to replace the block that has been in the cache longest with no reference to it. This policy is referred to as the least-recently-used (LRU) algorithm.
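A minimal sketch of the LRU idea in C, assuming a small fully associative set of slots: each slot records when it was last referenced, and the victim is the slot with the oldest reference. Hardware caches only approximate this; the structure and names here are assumptions made for illustration.

    #include <stdio.h>

    #define NUM_SLOTS 4

    /* Illustrative fully associative set of slots (assumed organization). */
    static struct {
        int  valid;
        int  block;        /* which memory block the slot holds       */
        long last_used;    /* "time" of the most recent reference     */
    } slot[NUM_SLOTS];

    static long now;       /* simple logical clock, advanced per access */

    /* Return the index of the slot to replace: an empty slot if one exists,
       otherwise the least recently used one. */
    static int choose_victim(void)
    {
        int victim = 0;
        for (int i = 0; i < NUM_SLOTS; i++) {
            if (!slot[i].valid) return i;
            if (slot[i].last_used < slot[victim].last_used) victim = i;
        }
        return victim;
    }

    static void access_block(int block)
    {
        now++;
        for (int i = 0; i < NUM_SLOTS; i++) {
            if (slot[i].valid && slot[i].block == block) {
                slot[i].last_used = now;            /* hit: refresh its age   */
                return;
            }
        }
        int v = choose_victim();                    /* miss: replace LRU slot */
        if (slot[v].valid)
            printf("block %d replaces block %d in slot %d\n", block, slot[v].block, v);
        else
            printf("block %d loaded into empty slot %d\n", block, v);
        slot[v].valid = 1;
        slot[v].block = block;
        slot[v].last_used = now;
    }

    int main(void)
    {
        int refs[] = {1, 2, 3, 4, 1, 5};    /* block 2 is least recently used when 5 arrives */
        for (unsigned i = 0; i < sizeof refs / sizeof refs[0]; i++)
            access_block(refs[i]);
        return 0;
    }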

If the contents of a block in the cache have been altered, then it is necessary to write the block back to main memory before replacing it. The write policy dictates when the memory write operation takes place.

Communication Techniques :-
Three techniques are possible for I/O operations:
Programmed I/O
Interrupt-driven I/O
Direct memory access (DMA)

1. Programmed I/O :-
When the processor is executing a program and encounters an instruction relating to I/O, it executes that instruction by issuing a command to the appropriate I/O module. In the case of programmed I/O, the I/O module performs the requested action and then sets the appropriate bits in the I/O status register, but takes no further action to alert the processor. In particular, it does not interrupt the processor. Thus, after the I/O instruction is invoked, the processor must take some active role in determining when the I/O operation is complete. For this purpose, the processor periodically checks the status of the I/O module until it finds that the operation is complete.

With this technique, the processor is responsible for extracting data from main memory for output and storing data in main memory for input. Thus, the instruction set includes I/O instructions in the following categories:
Control: Used to activate an external device and tell it what to do.
Status: Used to test various status conditions associated with an I/O module.
Transfer: Used to read and/or write data between processor registers and external devices.
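A sketch of the programmed I/O pattern in C: issue a command, then poll a status register until the device reports completion. The register names, bit masks, and device model below are invented for the example; a real driver would use the registers defined by the actual device.

    #include <stdint.h>
    #include <stdio.h>

    /* Invented device model: a status register and a data register. */
    #define STATUS_BUSY  0x01
    #define STATUS_READY 0x02

    static uint8_t io_status;
    static uint8_t io_data;
    static int     poll_count;

    /* Pretend device: it becomes ready after a few status checks. */
    static uint8_t read_status_register(void)
    {
        if (++poll_count < 5) return STATUS_BUSY;
        io_data = 42;                         /* data produced by the device */
        return STATUS_READY;
    }

    static uint8_t programmed_io_read(void)
    {
        /* Control: issue the read command to the I/O module (omitted here). */
        /* Status: busy-wait until the module reports completion. */
        do {
            io_status = read_status_register();
        } while (!(io_status & STATUS_READY));   /* the processor is tied up here */

        /* Transfer: move the data from the module's register to the processor. */
        return io_data;
    }

    int main(void)
    {
        printf("read %u after %d status polls\n",
               (unsigned)programmed_io_read(), poll_count);
        return 0;
    }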

2. Interrupt-Driven I/O :-
With programmed I/O, the processor may have to wait a long time for the I/O module of concern to be ready for either reception or transmission of more data. The processor, while waiting, must repeatedly interrogate the status of the I/O module. As a result, the performance level of the entire system is degraded.

An alternative is for the processor to issue an I/O command to a module and then go on to do some other useful work. The I/O module will then interrupt the processor to request service when it is ready to exchange data with the processor. The processor then executes the data transfer, as before, and resumes its former processing.

Let us consider how this works, first from the point of view of the I/O module. For input, the I/O module receives a READ command from the processor. The I/O module then proceeds to read data in from an associated peripheral. Once the data are in the module's data register, the module signals an interrupt to the processor over a control line. The module then waits until its data are requested by the processor. When the request is made, the module places its data on the data bus and is then ready for another I/O operation.

From the processor's point of view, the action for input is as follows. The processor issues a READ command. It then goes off and does something else. At the end of each instruction cycle, the processor checks for interrupts. When the interrupt from the I/O module occurs, the processor saves the context of the program it is currently executing and begins to execute an interrupt-handling program that processes the interrupt.

Interrupt-driven I/O is more efficient than programmed I/O because it eliminates needless waiting. However, interrupt-driven I/O still consumes a lot of processor time, because every word of data that goes from memory to the I/O module, or from the I/O module to memory, must pass through the processor. There are multiple I/O modules in a computer system, so mechanisms are needed to determine which device caused the interrupt and to decide, in the case of multiple interrupts, which one to handle first.
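A sketch of the interrupt-driven pattern from the processor's point of view, in C: issue the READ command, continue with other work, and move the data only when the module's interrupt arrives. The interrupt is simulated here with an ordinary function call, and all names are assumptions made for the example.

    #include <stdint.h>
    #include <stdio.h>

    static uint8_t  module_data_register;   /* data waiting in the I/O module     */
    static volatile int data_available;     /* set by the simulated "interrupt"   */
    static uint8_t  received;               /* where the processor puts the data  */

    static void issue_read_command(void)    /* processor: start the I/O, then go on */
    {
        printf("READ issued; processor continues with other work\n");
    }

    /* Interrupt handler: invoked when the I/O module signals completion.
       Here it is simulated by a direct call from main(). */
    static void io_interrupt_handler(void)
    {
        received = module_data_register;    /* transfer: data register -> memory */
        data_available = 1;
    }

    int main(void)
    {
        issue_read_command();

        /* ... other useful instructions execute here ... */

        module_data_register = 7;           /* device finishes and raises its interrupt */
        io_interrupt_handler();

        if (data_available)
            printf("interrupt serviced, data = %u\n", (unsigned)received);
        return 0;
    }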

3. Direct Memory Access (DMA) :-
Interrupt-driven I/O, though more efficient than simple programmed I/O, still requires the active intervention of the processor to transfer data between memory and an I/O module, and any data transfer must traverse a path through the processor. Thus, both of these forms of I/O suffer from two drawbacks:
1. The I/O transfer rate is limited by the speed with which the processor can test and service a device.
2. The processor is tied up in managing an I/O transfer; a number of instructions must be executed for each I/O transfer.

When large volumes of data are to be moved, a more efficient technique is required: direct memory access (DMA). The DMA function can be performed by a separate module on the system bus, or it can be incorporated into an I/O module. When the processor wishes to read or write a block of data, it issues a command to the DMA module by sending the following information to the DMA module:
Whether a read or write is requested.
The address of the I/O device involved.
The starting location in memory to read data from or write data to.
The number of words to be read or written.

The processor then continues with other work. It has delegated this I/O operation to the DMA module, and that module will take care of it. The DMA module transfers the entire block of data, one word at a time, directly to or from memory without going through the processor. When the transfer is complete, the DMA module sends an interrupt signal to the processor. Thus, the processor is involved only at the beginning and end of the transfer.

The DMA module needs to take control of the bus to transfer data to and from memory. Because of this competition for bus usage, there may be times when the processor needs the bus and must wait for the DMA module. Note that this is not an interrupt; the processor does not save a context and do something else. Rather, the processor simply pauses for one bus cycle.
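The four pieces of information handed to the DMA module can be captured in a small C structure. The structure layout and field names below are illustrative assumptions, not the register map of any real DMA controller.

    #include <stdio.h>

    /* Illustrative DMA command block: what the processor tells the DMA module. */
    struct dma_command {
        int      write;           /* 0 = read from the device, 1 = write to the device */
        unsigned device_address;  /* which I/O device is involved                      */
        unsigned memory_start;    /* starting location in memory                       */
        unsigned word_count;      /* number of words to transfer                       */
    };

    /* Stand-in for handing the command block to the DMA module; a real driver
       would write these values into the controller's registers. */
    static void dma_start(const struct dma_command *cmd)
    {
        printf("DMA %s: device %u, memory %u, %u words\n",
               cmd->write ? "write" : "read",
               cmd->device_address, cmd->memory_start, cmd->word_count);
        /* The DMA module now transfers the block and interrupts when finished;
           the processor is free to do other work in the meantime. */
    }

    int main(void)
    {
        struct dma_command cmd = { 0, 3, 0x2000, 512 };   /* read 512 words from device 3 */
        dma_start(&cmd);
        return 0;
    }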