Summer 2003 Lecture 18 07/09/03

NEW HOMEWORK

Instruction Execution Times:

The 8088 CPU is a synchronous machine that operates at a particular clock frequency. In the case of the original IBM PC, that clock frequency is 4.77 MHz. Each clock cycle lasts:

  1/(<Clock Freq> * 10^6) seconds for a frequency given in MHz
  1/(<Clock Freq> * 10^9) seconds for a frequency given in GHz

  Nanosecond (ns)    1 billionth of a second, or 1.0 * 10^-9 sec
  Microsecond (us)   1 millionth of a second, or 1.0 * 10^-6 sec
  Millisecond (ms)   1 thousandth of a second, or 1.0 * 10^-3 sec

For the IBM PC: 1/4,770,000 = 209.64 ns

Each instruction in the instruction set takes a specific number of clock cycles to execute. The table beginning on page 643 in the text shows the execution times for each instruction.

For example, the instruction MOV BX,VAR[SI][BX] appears in the table under Data Transfer Instructions on page 646. Specifically, this instruction is of the form MOV reg,mem. The table shows that this instruction takes 8+EA clock cycles and involves 1 memory transfer.

The time for effective address calculations (EA) is shown at the bottom of page 648. The addressing mode of this instruction (mov bx,var[si][bx]) is of the form BX+SI+DISP. The table shows that this effective address calculation takes 11 clock cycles. Additionally, the note at the bottom of page 648 indicates that each word memory transfer requires an additional 4 clock cycles.

Therefore, on an 8088 CPU, this instruction would take 8+11+4 = 23 clock cycles to execute. A clock with a frequency of 4.77 MHz has a period of about 209.64 ns, so this instruction would take 23 * 209.64 ns = 4.8217 microseconds to execute.
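The arithmetic above can be checked with a short Python sketch. The function names are hypothetical and chosen only for illustration; the cycle counts (8 base cycles, 11 for the effective address, 4 per word transfer) are the ones quoted from the text's tables. Because the sketch does not round the clock period to 209.64 ns first, its result may differ from the hand calculation in the last digit.

```python
CLOCK_HZ = 4.77e6  # clock frequency of the original IBM PC


def clock_period_ns(freq_hz):
    """Length of one clock cycle in nanoseconds."""
    return 1e9 / freq_hz


def instruction_time_us(base_cycles, ea_cycles, word_transfers, freq_hz=CLOCK_HZ):
    """Return (total clock cycles, execution time in microseconds) on an 8088.

    Each word memory transfer adds 4 clock cycles, as described in the notes.
    """
    total_cycles = base_cycles + ea_cycles + 4 * word_transfers
    return total_cycles, total_cycles * clock_period_ns(freq_hz) / 1000.0


# MOV BX,VAR[SI][BX]: 8+EA cycles, EA = 11 for BX+SI+DISP, one word transfer
cycles, micros = instruction_time_us(8, 11, 1)
print(cycles, "cycles, about", round(micros, 2), "microseconds")
```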

The reason for the additional 4 clock cycles per word memory transfer is the difference in external data bus width between the 8088 and the 8086. The 8088 has an 8 bit external data bus, as opposed to the 16 bit external data bus on the 8086. For an 8088 to do a word transfer to or from memory, two 8 bit memory cycles must be performed, while an 8086 can perform a word transfer in a single memory cycle. The additional 4 clock cycles on the 8088 account for the additional memory cycle that must be performed for a word transfer into or out of the 8 bit memory used externally by the 8088.

The 8086 completes a word transfer in a single memory cycle only when the word operand is word aligned. If the word operand is misaligned, the 8086 also has to perform two memory cycles to get the word data, and it also incurs the additional 4 clock cycles of execution time overhead.

Types of Memory Devices

Memory devices can be considered within several different categories:

  Read/Write Memory vs. Read Only Memory
  Static Memory vs. Dynamic Memory
  Random Access vs. Sequential Access
  Parallel Memory vs. Serial Memory

Read/Write Memory vs. Read Only Memory

Read/write memory is memory that can both be read from and written to. Virtually all computers contain some read/write memory. At the very least, some read/write memory is needed for variable storage. In most computers, the majority of the memory is read/write memory, used for storing the program instructions as well as variable data. Read/write memory is generally called RAM (random access memory), which is something of a misnomer, as most memory is randomly accessible.

Read only memory (ROM) is memory that can only be read from and not written to. A true ROM has its contents determined when it is manufactured. The mask used to manufacture the chip defines the pattern of 1s and 0s contained in the memory cells.
In addition to true ROM, there are a number of other types of memories that are called ROM, although they can be written to in special ways.

PROM is programmable read only memory. These chips are made using a technology that allows the memory cells of the device to be programmed once in the field. Special programming hardware is used to program the device, and once programmed it cannot be changed.

EPROM, erasable programmable read only memory, is a type of ROM built using a technology that allows the device to be programmed and erased. Typically a special programmer is used to program the memories, and they can be erased by exposure to short wave ultraviolet light. This requires the memory chip to be enclosed in a special package with a transparent quartz window so that the UV light can reach the chip to erase it.

EEPROM, electrically erasable programmable read only memory, is similar to EPROM, except that rather than needing ultraviolet light to erase the memory cells, they can be erased using an electrical signal.

Flash memory is another type of memory built using the same technology. In an EEPROM, each memory cell can be individually erased and reprogrammed. A Flash memory is divided into blocks, and an entire block must be erased before the cells in it can be reprogrammed. Limiting the erasure to blocks of cells reduces the total amount of erase electronics required, so Flash memory can be built with higher density than EEPROM.

Static vs Dynamic Memory

There are two general ways of building the memory cell in a RAM memory device. In static memory, each memory cell is built using a circuit that is essentially a flip-flop. This requires six transistors per memory cell. A flip-flop is a stable circuit, and once a value has been written to it, it will hold that value as long as power is provided to the device.

It is also possible to build a memory cell using a single transistor. By using an insulated gate FET with the gate connected to a capacitor, it is possible to use the charge on the capacitor to hold a bit of information.
The presence of charge on the capacitor will turn the transistor on and store a 0 bit. The absence of charge on the capacitor will leave the transistor turned off, and represent a 1 bit. By using a floating gate, it is possible to simply use stray capacitance on the chip for the capacitor and not use an explicit capacitor at all. This is called a dynamic RAM cell.

Unfortunately, this arrangement is not stable. The capacitor is not a perfect capacitor and has leakage. Eventually, the charge will leak away, and the 0 bits will turn into 1 bits. For a memory such as this to work properly, its contents need to be refreshed periodically: each cell must be read and re-written before enough time has elapsed for the charge to leak away.
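The effect of leakage and refresh can be illustrated with a toy Python model. This is only a sketch: the class name, the leak rate, and the sense threshold are arbitrary assumptions made for illustration, not properties of any real device. It follows the convention used in these notes, where stored charge represents a 0 bit.

```python
class DramCell:
    """Toy dynamic RAM cell: charge present = 0 bit, no charge = 1 bit."""

    THRESHOLD = 0.5  # assumed sense threshold (arbitrary)
    LEAK = 0.9       # assumed fraction of charge left after each time step (arbitrary)

    def __init__(self):
        self.charge = 0.0

    def write(self, bit):
        # Writing a 0 charges the capacitor; writing a 1 discharges it.
        self.charge = 1.0 if bit == 0 else 0.0

    def read(self):
        return 0 if self.charge > self.THRESHOLD else 1

    def tick(self):
        # One time step of leakage.
        self.charge *= self.LEAK

    def refresh(self):
        # Refresh = read the cell and write the same value back.
        self.write(self.read())


def hold_zero(steps, refresh_every=None):
    """Store a 0 bit, let time pass, and return the bit read back."""
    cell = DramCell()
    cell.write(0)
    for t in range(1, steps + 1):
        cell.tick()
        if refresh_every and t % refresh_every == 0:
            cell.refresh()
    return cell.read()


print(hold_zero(20))                    # no refresh: the 0 leaks away
print(hold_zero(20, refresh_every=5))   # refreshed in time: the 0 survives
print(hold_zero(20, refresh_every=10))  # refreshed too late: the 0 is already lost
```

In this model a refresh that arrives after the charge has dropped below the threshold simply re-writes the wrong value, which is why the refresh interval has to be shorter than the time the charge takes to leak away.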

The advantage of dynamic RAM is that for a given size of transistor, a dynamic RAM cell will be about 1/6 as large as an equivalent static RAM cell, thus allowing significantly higher memory density. The disadvantage is the need to periodically refresh the memory. Dynamic RAM devices have logic built into them to simplify the task of refreshing the RAM array. This logic takes up some space, reducing the density slightly, and the refresh takes up some time, reducing the average speed of the memory slightly. The advantages of dynamic RAM so outweigh the disadvantages, though, that the main memory in most computer systems is primarily made from dynamic RAMs.

Random Access vs Sequential Access

Random access means that any given memory location can be accessed for read or write without first having to access any of the previous memory locations. A tape memory is an example of a sequential access memory. A single track within a disk drive is also an example of sequential access. The main memory used in all modern computers is random access memory.

Memory Device Signals and Timing

Memory devices are built with the memory cells contained in a rectangular array containing some specific number of cells. Each cell can store one bit of information. The data from this memory array will be brought out as a 1 bit wide word, 4 bit wide word, 8 bit wide word, or wider. There will be as many i/o pins on the chip as there are bits in the word size of the device (i.e. a 1 bit wide device will have one i/o pin, an 8 bit wide device will have eight i/o pins, etc.).

To select one of the locations within the device, there will be some number of address inputs, depending on the number of different locations contained within the device. If the memory device contains 8K locations, it will require 13 address inputs (2^13 = 8192, or 8K). A 32K device would require 15 address inputs.

A common static memory device is the 62256, made by a number of different manufacturers.
The 62256 is an 8 bit wide memory device with 32K memory locations. This is generally called a 32K X 8 SRAM. This memory device has the following signals:

  A14-A0 - 15 address input lines
  D7-D0 - 8 data input/output lines
  CS - active low chip select
  OE - active low output enable
  WR - active low write enable

The address inputs (A14-A0) are used to select a specific location within the chip to be read or written.

The data lines (D7-D0) are used to bring data into or out of the memory array on the chip. When doing a read, the current contents of a memory location will appear on these pins. When doing a write, the value to be written into an internal location within the chip is presented here.

The CS (chip select) input is used to enable the device. The device responds to its other inputs only while CS is asserted. Because CS is an active low input, that means only while the CS line is low.

The OE (output enable) input is used to cause the memory device to place the contents of the location specified by the address inputs onto the data pins. This is used when reading from the memory device.

The WR (write enable) input is used to cause the memory device to write the internal location specified by the address inputs with the data currently appearing on the data pins. This is used when writing to the memory device. This is an active low signal, and the actual write occurs on the rising edge, when WR goes from low back to the high state.

Static RAM Timing Diagrams

The following diagrams show the timing of the above signals for read and write cycles:

[Figure: Read Cycle. Signals shown: Address, CS, OE, Data Out (High Impedance, then Data Out Valid)]

[Figure: Write Cycle. Signals shown: Address, CS, WR, OE, Data I/O (Data In Valid)]
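The behaviour of the control signals described above can be sketched as a small Python model. This is a purely behavioural illustration under simplifying assumptions: the class and method names are hypothetical, setup and hold times are ignored, and a return value of None stands in for the high impedance state of the data pins.

```python
class Sram62256:
    """Toy behavioural model of a 32K x 8 static RAM. Control inputs are active low."""

    SIZE = 32 * 1024

    def __init__(self):
        self.cells = bytearray(self.SIZE)
        self.prev_wr = 1  # WR idles high

    def step(self, addr, cs, oe, wr, data=None):
        """Apply one set of pin levels.

        Returns the byte driven onto the data pins, or None when the
        outputs are in the high impedance state.
        """
        addr &= self.SIZE - 1  # only A14-A0 reach the chip
        out = None
        if cs == 0:
            # The write takes place on the rising edge of WR (low -> high).
            if self.prev_wr == 0 and wr == 1 and data is not None:
                self.cells[addr] = data & 0xFF
            # The outputs are driven only during a read (OE low, WR high).
            if oe == 0 and wr == 1:
                out = self.cells[addr]
            self.prev_wr = wr
        else:
            # Chip not selected: ignore the other inputs entirely.
            self.prev_wr = 1
        return out


ram = Sram62256()
ram.step(0x1234, cs=0, oe=1, wr=0, data=0xAB)   # WR low: write cycle in progress
ram.step(0x1234, cs=0, oe=1, wr=1, data=0xAB)   # WR rises: the value is stored
print(hex(ram.step(0x1234, cs=0, oe=0, wr=1)))  # read cycle returns the stored byte
print(ram.step(0x1234, cs=1, oe=0, wr=1))       # CS high: outputs stay high impedance
```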

Memory Decoding

Memory decoding logic is responsible for recognizing that a memory bus cycle is occurring on the system bus and for causing the correct memory chip in the system to respond to the bus cycle. The selection of the correct memory chip is the responsibility of the address decoder.

The purpose of an address decoder is to recognize a particular pattern on one or more of the address lines and generate one or more chip select signals to enable the appropriate memory device or devices. One chip select will be generated for each memory device in the memory bank and will enable the memory device that is to appear at that particular address in the memory map of the system.

In addition to decoding the address inputs, the decoder must decode additional control lines from the CPU to determine the type of bus cycle, so that it responds only to the correct type of bus cycle.

IO/M

In an 8088 based system, the IO/M signal will be high when the current bus cycle is an i/o cycle, and low when the current cycle is a memory cycle. A memory decoder would need to include the IO/M signal in the decoding so that a chip select is generated only when the current bus cycle is a memory cycle.

A typical computer system will have multiple banks of memory. A memory bank is a contiguous range of memory addresses assigned to a set of memory devices. When doing memory decoding for a system such as this, the address lines can be divided into three groups. The highest order address lines are used to decode the bank address and generate a bank select signal that enables the chip select decoder. The middle address lines are used to decode and generate chip selects for the individual memory devices in the selected bank. The low order address lines are fed directly to the memory devices and used to select the specific memory location inside the currently selected memory device.
The number of address lines assigned to each of these three functions depends on the sizes of the memory banks and the number and size of devices making up the memory banks.

For example, 1 MB of memory built from 62256 static RAM chips:

  32K per chip - 15 address bits
  4 chips per bank - 2 bits
  8 banks - 3 bits

This gives 15 + 2 + 3 = 20 address bits in total (2^20 = 1 MB).
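The division of the address lines in this example can be sketched in Python. The names below are hypothetical; the sizes (32K locations per chip, 4 chips per bank, 8 banks) are the ones from the example, and the sketch assumes the three fields are packed in the order described above, with the bank bits highest and the in-chip offset lowest.

```python
CHIP_LOCATIONS = 32 * 1024  # locations in one 62256
CHIPS_PER_BANK = 4
BANKS = 8

# Number of address bits needed to count each (power-of-two) quantity.
OFFSET_BITS = (CHIP_LOCATIONS - 1).bit_length()  # low order lines, fed to the chips
CHIP_BITS = (CHIPS_PER_BANK - 1).bit_length()    # middle lines, chip select
BANK_BITS = (BANKS - 1).bit_length()             # high order lines, bank select


def split_address(addr):
    """Split a 20 bit address into (bank, chip, offset within chip)."""
    offset = addr & (CHIP_LOCATIONS - 1)
    chip = (addr >> OFFSET_BITS) & (CHIPS_PER_BANK - 1)
    bank = (addr >> (OFFSET_BITS + CHIP_BITS)) & (BANKS - 1)
    return bank, chip, offset


print(OFFSET_BITS, CHIP_BITS, BANK_BITS)  # 15 2 3
print(split_address(0x12345))             # bank 0, chip 2, offset 0x2345
print(split_address(0xF8000))             # bank 7, chip 3, offset 0
```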

Generally, a decoder will have one or more enable inputs, one or more select inputs, and a number of outputs determined by the number of select inputs. For example, a two line to four line decoder would have two select inputs and four outputs. The binary pattern appearing on the select inputs would activate the one output corresponding to that binary value.

When designing a memory decoder, the higher order address lines and the IO/M signal would be combined logically to generate the enable going into the chip select decoder. The intermediate address lines would go to the select inputs of the chip select decoder, and the outputs of the chip select decoder would go to the chip select lines of the individual memory devices.

Partial Address Decoding vs Full Address Decoding

It is possible to design a memory decoder which ignores some of the address lines. When all address lines are used as input to the decoder, full address decoding results. When some address lines are ignored, partial address decoding results.
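A minimal Python sketch of such a decoder follows. It assumes the 1 MB layout from the earlier example (A19-A17 select the bank, A16-A15 select the chip), an active low enable, and active low outputs; these polarities, the function names, and the choice of A19 as the ignored line are all assumptions made for illustration. The second function shows one consequence of partial decoding: when an address line is ignored, the same chip responds at more than one address.

```python
def decoder_2to4(enable_n, select):
    """Two line to four line decoder with active low enable and outputs."""
    outputs = [1, 1, 1, 1]  # all outputs inactive (high)
    if enable_n == 0:
        outputs[select & 0b11] = 0
    return outputs


def chip_selects(addr, io_m, bank=0, ignore_a19=False):
    """Chip select lines for one bank of four chips.

    The bank is enabled only on a memory cycle (IO/M low) whose high
    order address lines match `bank`. With ignore_a19=True, A19 is left
    out of the comparison, which models partial address decoding.
    """
    bank_field = (addr >> 17) & 0b111  # A19-A17
    if ignore_a19:
        match = (bank_field & 0b011) == (bank & 0b011)
    else:
        match = bank_field == bank
    enable_n = 0 if (match and io_m == 0) else 1
    return decoder_2to4(enable_n, (addr >> 15) & 0b11)  # A16-A15


print(chip_selects(0x08000, io_m=0))                   # memory cycle: chip 1 selected
print(chip_selects(0x08000, io_m=1))                   # i/o cycle: nothing selected
print(chip_selects(0x88000, io_m=0))                   # full decoding: different bank
print(chip_selects(0x88000, io_m=0, ignore_a19=True))  # partial decoding: chip 1 again
```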