Computer Memory varady.geza@mik.pte.hu
Memories - storages: Trade-offs between speed, price and capacity. Memory is volatile; storage is non-volatile.
Memory: [Figure: CPU with control unit, Arithmetic Logic Unit (ALU) and registers, connected to main memory and to I/O devices such as disk and printer] Main memory holds programs and data. Without it there are no stored-program machines.
Memory: The standard unit is the bit (binary digit): 0 or 1 (binary number system). BCD (Binary Coded Decimal) stores each decimal digit in 4 bits. 4 bits (a nybble) give 16 possibilities, so 6 combinations are not used. 2006 in BCD: 0010 0000 0000 0110. 2006 in binary: 0000 0111 1101 0110. BCD makes decimal conversion simple and is used in embedded systems.
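The difference between the two encodings of 2006 can be checked with a small sketch. The function names `to_bcd` and `to_binary_nybbles` are made up for this illustration, and the 16-bit width is an assumption chosen to match the slide.

```python
def to_bcd(number: int) -> str:
    """Encode a non-negative integer in BCD: one nybble per decimal digit."""
    if number < 0:
        raise ValueError("this sketch handles non-negative integers only")
    return " ".join(format(int(digit), "04b") for digit in str(number))


def to_binary_nybbles(number: int, width: int = 16) -> str:
    """Encode the integer in plain binary, grouped into nybbles for comparison."""
    bits = format(number, f"0{width}b")
    return " ".join(bits[i:i + 4] for i in range(0, len(bits), 4))


print(to_bcd(2006))             # 0010 0000 0000 0110
print(to_binary_nybbles(2006))  # 0000 0111 1101 0110
```

Each BCD nybble can be read off as one decimal digit, which is why the conversion to and from decimal is simple.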
Addressing modes: Operational (main) memory consists of cells (like a long paper strip). Each cell has an address: a serial number that uniquely identifies the cell. The number of cells determines the address width; for k-bit cells the address width is independent of k. $2^{10}$ cells of 8 bits need a 10-bit address; $2^{10}$ cells of 128 bits also need a 10-bit address.
Addressing modes: [Figure: three memory organizations with address widths a: 4 bit, b: 3 bit, c: 3 bit. Source: Tanenbaum]
Addressing modes: The cell is the smallest addressable unit. On the IBM PC it is 8 bits (1 byte, or octet). Today's quasi-standard is the 8-bit cell.
Computer              Cell size (bits)
Burroughs B1700       1
IBM PC                8
DEC PDP-8             12
IBM 1130              16
DEC PDP-15            18
XDS 940               24
Electrologica X8      27
XDS Sigma 9           32
Honeywell 6180        36
CDC 3600              48
CDC Cyber             60
Addressing modes: A word consists of cells (bytes); a 32-bit word is 4 bytes. Most instructions operate on words (a 32-bit computer uses 32-bit words, a 64-bit computer uses 64-bit words).
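Byte order: Bytes of a word numbered from left to right (high-order byte first): big endian (SPARC, IBM). Bytes of a word numbered from right to left (low-order byte first): little endian (Intel). Source: Tanenbaum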
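A short sketch of how the same 32-bit word is laid out in memory under the two byte orders. The value `0x12345678` and the helper name `word_bytes` are assumptions chosen only for illustration.

```python
def word_bytes(value: int, order: str) -> str:
    """Return the 4 bytes of a 32-bit word in memory order, as hex pairs."""
    return value.to_bytes(4, byteorder=order).hex(" ")


value = 0x12345678  # example word, not taken from the slides

print(word_bytes(value, "big"))     # 12 34 56 78  (high-order byte at the lowest address)
print(word_bytes(value, "little"))  # 78 56 34 12  (low-order byte at the lowest address)
```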
Error detection codes: Memories can fail, e.g. because of current peaks (lightning, cosmic radiation). Error detection with a parity bit: bits can flip, and we have to detect that. With even parity we append one bit so that the number of 1s becomes even: 0101100 -> 01011001, 0101000 -> 01010000.
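The even-parity rule can be sketched in a few lines. The function names are hypothetical; the two inputs are the examples from the slide.

```python
def add_even_parity(bits: str) -> str:
    """Append a parity bit so that the total number of 1s is even."""
    parity = bits.count("1") % 2
    return bits + str(parity)


def parity_ok(codeword: str) -> bool:
    """Return True if the codeword still has an even number of 1s."""
    return codeword.count("1") % 2 == 0


print(add_even_parity("0101100"))  # 01011001
print(add_even_parity("0101000"))  # 01010000
print(parity_ok("01011011"))       # False: one bit of 01011001 has flipped
```

A single parity bit detects any odd number of flipped bits, but two flips cancel out and go unnoticed.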
Error detection codes: After detecting an error, the data is resent. What is an error? Bits that differ in some positions. A codeword consists of m effective (data) bits + r redundant bits, giving an n-bit codeword. The number of bit positions in which two codewords differ is their Hamming distance, e.g. 1001 and 1010 have Hamming distance 2.
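As a minimal sketch, the Hamming distance of two equal-length bit strings can be counted position by position. The function name is an assumption for this example.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions in which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("codewords must have the same length")
    return sum(x != y for x, y in zip(a, b))


print(hamming_distance("1001", "1010"))  # 2
```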
Error detection codes: One codeword turns into another through as many one-bit errors as their Hamming distance. m effective bits give $2^m$ variations; n bits give $2^n$ possible words, of which only $2^m$ are valid codewords. If a received word is not a valid codeword, an error has occurred. The Hamming distance of a code (a set of codewords) is the lowest Hamming distance between any two of its codewords.
Error detecting codes: To detect d one-bit errors we need a code with Hamming distance d+1. For d = 2 the distance is 2+1 = 3, e.g. 000000, 000111, 111000, 111111. If 2 bits change, we can detect it (the word cannot turn into another codeword). But take 011111: was the original 000111 (2 errors) or 111111 (1 error)?
Error Correcting Codes (ECC): To correct d one-bit errors we need a code with Hamming distance 2d+1. For d = 2 the distance is 2*2+1 = 5, e.g. 0000000000, 0000011111, 1111100000, 1111111111. A 2-bit error is still correctable: which codeword is the closest?
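The "closest codeword" idea can be sketched as a brute-force search over the four codewords of the slide. The helper names and the damaged word `0000000111` are assumptions made for this illustration; real ECC hardware does not search exhaustively.

```python
from itertools import combinations


def distance(a: str, b: str) -> int:
    """Hamming distance of two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))


def code_distance(code: list[str]) -> int:
    """Lowest Hamming distance between any two codewords of the code."""
    return min(distance(a, b) for a, b in combinations(code, 2))


def nearest_codeword(word: str, code: list[str]) -> str:
    """Decode by picking the valid codeword closest to the received word."""
    return min(code, key=lambda candidate: distance(word, candidate))


CODE = ["0000000000", "0000011111", "1111100000", "1111111111"]

d = code_distance(CODE)
print(d)                                     # 5
print(d - 1, (d - 1) // 2)                   # 4 2  (errors detectable, errors correctable)
print(nearest_codeword("0000000111", CODE))  # 0000011111 (two bits were flipped)
```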
Error Correcting Codes (ECC): Let's assume we create a code of length n that can correct 1-bit errors, with m effective bits and r parity bits, so $n = m + r$. There are $2^m$ valid codewords. Each of them has n neighbours with a 1-bit error (Hamming distance 1) and 1 word without error, so $(n+1) \cdot 2^m$ words are needed for the 1-bit-error and error-free cases.
Error Correcting Codes (ECC): $(n+1) \cdot 2^m$ words cover the 1-bit errors (plus the error-free case). The number of all n-bit words is $2^n$, so $(n+1) \cdot 2^m \le 2^n$. With $n = m + r$: $(m+r+1) \cdot 2^m \le 2^{m+r}$, hence $m + r + 1 \le 2^r$. Since m is given, this yields a lower bound for r.
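The lower bound can be evaluated numerically. This is a minimal sketch that searches for the smallest r satisfying m + r + 1 <= 2^r; the function name and the chosen word sizes are assumptions for the example.

```python
def min_check_bits(m: int) -> int:
    """Smallest r with m + r + 1 <= 2**r, for m data bits."""
    r = 0
    while m + r + 1 > 2 ** r:
        r += 1
    return r


for m in (8, 16, 32, 64):
    r = min_check_bits(m)
    print(m, r, m + r)  # data bits, check bits, total codeword length
# 8 4 12
# 16 5 21
# 32 6 38
# 64 7 71
```

The relative overhead shrinks as the word gets longer: 50% extra bits for an 8-bit word, but only about 11% for a 64-bit word.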
Error Correcting Codes (ECC) How many parity bits (check bits) do we need to be able to correct 1-bit errors? Tanenbaum
Error Correcting Codes (ECC): Richard Hamming found a method that reaches the lower bound. Basic idea: overlapping parity bits that check each other. Example: encoding 1100 with overlapping partitions, e.g. AB, AC, AD, ABC. [Figure. Source: Tanenbaum]
Error Correcting Codes (ECC), Hamming code: Bit numbering starts from 1 (not 0). Parity bits sit at the positions that are powers of 2. All data bits go in between the parity bits. The parity bits check: p1: 1,3,5,7,9,11,... (from the first parity position, every second bit); p2: 2,3,6,7,10,11,... (from the second parity position: 2 yes, 2 no); p3: 4,5,6,7,12,13,14,15,... (from the third parity position: 4 yes, 4 no).
Error Correction Code (ECC), Hamming code: In general, bit b is checked by those parity bits $b_1, b_2, b_3, \ldots, b_j$ whose positions sum to b. E.g. bit 7 is checked by parity bits 1, 2 and 4, since 1+2+4 = 7. [Figure: bit positions 1 to 15 with the parity positions 1, 2, 4, 8 marked]
Error Correcting Codes (ECC), Hamming code example (position 1 is the rightmost bit, even parity): Word: 100101. Code: 10p010p1pp -> 10p010p1p1 -> 10p010p111 -> 10p0101111 -> 1010101111
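The encoding steps of this example can be sketched as follows. The sketch assumes even parity and writes the codeword with the highest position on the left and position 1 on the right, which is the layout that reproduces the slide's result; the function name `hamming_encode` is made up for the illustration.

```python
def hamming_encode(data: str) -> str:
    """Encode a bit string with even-parity Hamming check bits.

    Positions are 1-based; parity bits sit at powers of two.
    The result lists position n first and position 1 last.
    """
    m = len(data)
    r = 0
    while m + r + 1 > 2 ** r:  # smallest r with m + r + 1 <= 2**r
        r += 1
    n = m + r

    code = [0] * (n + 1)  # index 0 is unused
    bits = iter(int(ch) for ch in data)
    # Fill the data positions from the highest position downward.
    for pos in range(n, 0, -1):
        if pos & (pos - 1):  # non-zero means pos is not a power of two
            code[pos] = next(bits)

    # The parity bit at position p covers every position whose number has bit p set.
    for i in range(r):
        p = 1 << i
        covered = (code[pos] for pos in range(1, n + 1) if pos & p and pos != p)
        code[p] = sum(covered) % 2

    return "".join(str(code[pos]) for pos in range(n, 0, -1))


print(hamming_encode("100101"))  # 1010101111
```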
Error Correcting Codes (ECC), Hamming code: Generate the parity bits again for the word that was read. If the read and the generated parity bits differ, there is an error; the sum of the positions of the differing parity bits gives the position of the faulty bit. (Excel example)
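A minimal sketch of this check, assuming even parity and a codeword written with position 1 as the rightmost bit. The function names are hypothetical, and the sketch handles single-bit errors only.

```python
def hamming_syndrome(code: str) -> int:
    """Sum of the positions of the failing parity checks (0 means no error found)."""
    n = len(code)
    # 1-based positions; position 1 is the rightmost character.
    bit_at = {pos: int(code[n - pos]) for pos in range(1, n + 1)}
    syndrome = 0
    p = 1
    while p <= n:
        group = sum(bit_at[pos] for pos in range(1, n + 1) if pos & p)
        if group % 2:  # even parity is violated in this group
            syndrome += p
        p <<= 1
    return syndrome


def hamming_correct(code: str) -> str:
    """Flip the bit the syndrome points at, assuming at most one bit is wrong."""
    pos = hamming_syndrome(code)
    if pos == 0:
        return code
    if pos > len(code):
        raise ValueError("syndrome points outside the codeword: more than one error")
    index = len(code) - pos
    flipped = "1" if code[index] == "0" else "0"
    return code[:index] + flipped + code[index + 1:]


print(hamming_syndrome("1010101111"))  # 0: valid codeword
print(hamming_syndrome("1010111111"))  # 5: parity checks 1 and 4 fail, 1 + 4 = 5
print(hamming_correct("1010111111"))   # 1010101111
```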
Memories - cache: The CPU is faster than the memory. Development aims at faster CPUs and bigger memories, so the speed difference between CPU and memory grows. Possible solutions: the CPU waits for the memory (e.g. NOP instructions); we could build very fast RAM, but it is very expensive; we can integrate RAM into the CPU, but only in a limited size. We need a compromise between size and speed: the cache.
Memories - cache: The cache's logical position. [Figure]
Memories - cache: Size is mostly in the KB to MB range. Parts of the slow-access RAM are stored in a faster memory. Locality principle (temporal locality of reference): we read a group of words from memory at once, and further accesses go to the cache. CPUs cache the main memory, but the concept works with other peripherals (e.g. hard disk drives) as well. [Figure: CPU, fast and expensive cache, main memory]
Memories - cache, speeds: c is the cache access time, m is the memory access time, h is the hit ratio (the fraction of reads/writes served from the cache). E.g. if we read a word k times, once from the slow memory and k-1 times from the cache, then $h = (k-1)/k$. The miss ratio is $1-h$. Average access time: $c + (1-h) \cdot m$. For $h \to 0$ it is $c + m$; for $h \to 1$ it is $c$. [Figure: CPU, cache (c), main memory (m)]
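The formula can be tried with concrete numbers. The access times below (2 ns cache, 50 ns memory) are made-up values for illustration only, and the function names are hypothetical.

```python
def hit_ratio(k: int) -> float:
    """Hit ratio when a word is read k times: one miss, k - 1 hits."""
    return (k - 1) / k


def average_access_time(c: float, m: float, h: float) -> float:
    """Average access time c + (1 - h) * m for cache time c, memory time m, hit ratio h."""
    return c + (1 - h) * m


c, m = 2.0, 50.0  # assumed access times in ns

for k in (1, 2, 10, 100):
    h = hit_ratio(k)
    print(k, h, round(average_access_time(c, m, h), 2))
```

With k = 1 every access misses and the average is c + m; as k grows, the average approaches c.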
Memories - cache: Unified cache (von Neumann): data and instructions are in the same memory; simpler to implement; data and instruction flow is balanced. Split cache (Harvard architecture): data and instructions are separated; because of pipelines, instruction fetches and operand loads can proceed in parallel; parallel operation is possible (with a unified memory it is much harder).
Memories - registers: [Figure: CPU with ALU, registers, decoder and control unit, internal bus, bus controller, address generator] Registers are the internal, temporary storage of the CPU. Their bit width is the same as the CPU's (a 64-bit CPU has 64-bit registers). Instructions always work with registers.
Memories - registers: Registers affect the speed of the CPU, so they should be fast. They hold operands, instructions and state bits. They are volatile memory: without power the content is erased.
Memories - main memory: Primary or operative memory; RAM (Random Access Memory). Typical size: 128 MB to 8 GB. Accessible without I/O channels. Volatile. Modular extension, e.g. a 256 MB module. At first, extension was done with individual chips; nowadays the chips sit on a module: SIMM (Single Inline Memory Module, the contacts on both sides are the same) and DIMM (Dual Inline Memory Module, the contacts on the two sides are different). Error Correction Code (ECC) is possible, but costs considerable time.
Memories - main memory: DIP (Dual Inline Package): e.g. with the 8086 and 286. SIPP (Single Inline Pin Package): in some 286 PCs; fragile (the pins can break easily).
Memories - main memory, 32-bit data path: 30- and 72-pin SIMM (Single Inline Memory Module). The pins on both sides are the same. 30 pins: 8 or 9 bits; 72 pins: 32 or 36 bits.
Memories - main memory, 64-bit data path: 168- and 184-pin DIMMs (Dual Inline Memory Module). The pins on the two sides are different (note the pin count!). 168 pins: SDRAM; 184 pins: DDR RAM.
Memories - main memory, 32- or 64-bit data path: SO-DIMM (Small Outline DIMM), used in notebooks, routers, printers and mini mainboards.
Memories - main memory: Task: storing programs and data (von Neumann). Writable and readable (RAM, Random Access: we don't have to read it sequentially). Types: static and dynamic.
Memories - static RAM: Static RAM (SRAM) keeps its content as long as it is powered; no refresh is needed. Access time: a few ns. It is built from bit cells, e.g. an SR latch made of NOR gates (the input S = R = 1 is forbidden).
Memories - static RAM: Low density, so it is not ideal for big capacities; it is also expensive. Power consumption rises with speed; when used as CPU cache it is very high. Simple implementation and design (no need for refresh).
Memories - dynamic RAM: Dynamic RAM (DRAM) is a bit-cell array in which every cell is a transistor and a capacitor. Charged / discharged ~ 1 / 0. The charge leaks, so refresh is needed (200-500 times per second). High density is possible (only two components per cell). Mainly used for main memory (SRAM is used for registers and cache). Access time: a few tens of ns.
Memories - dynamic RAM: FPM (Fast Page Mode) DRAM: cells in a matrix; IN: row and column address, OUT: the cell's value; maximum speed ~176 MBps; asynchronous (address and data lines are on different clocks). EDO (Extended Data Output) DRAM: a second address reference is possible before the first data output (parallelism); with rapid reads and writes the throughput rises; maximum speed ~264 MBps; asynchronous.
Memories - dynamic RAM: SDRAM (Synchronous DRAM): a hybrid of SRAM and DRAM. Synchronous: requests and reads are synchronized by a clock. The answer arrives after a fixed number of clock cycles (latency). By 2000, every PC used this type.
Memories - dynamic RAM: RDRAM (Rambus DRAM): 3x faster than SDRAM (400 MHz). Every module has a memory controller (2x-3x more expensive). Higher latency (45 ns instead of 7.5 ns). Higher heat dissipation. Only used in pairs (with a CRIMM module if needed). In 2002, dual-channel DDR knocked it out of the market.
Memories - DDR SDRAM: DDR (Double Data Rate) SDRAM: 2x data rate, because data is transferred on both the rising and the falling edge of the clock. Lower voltage (SDRAM: 3.3 V, DDR: 2.5 V).
Memories - DDR2 SDRAM: DDR2 (Double Data Rate 2) SDRAM: higher clock frequency, lower voltage (1.8 V).
Memories - DDR3 SDRAM: DDR3 (Double Data Rate 3) SDRAM. Not the same as GDDR3 (graphics memory which, despite its name, derives from DDR2). Lower voltage (1.5 V). More channels (up to 8) in parallel. MT/s: megatransfers per second; 2 MT/s = 1 MHz clock. Prefetch: single address, multiple words; 8n = 8 data words per access. [Table source: Transcend-info.com]
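The relation between clock, MT/s and peak bandwidth can be sketched as simple arithmetic. The sketch assumes the 64-bit DIMM data path mentioned earlier and uses an 800 MHz clock as an illustrative example; the function names are hypothetical.

```python
def megatransfers_per_second(clock_mhz: float) -> float:
    """Double data rate: two transfers per clock cycle (rising and falling edge)."""
    return 2 * clock_mhz


def peak_bandwidth_mb_per_s(mt_per_s: float, data_path_bits: int = 64) -> float:
    """Peak bandwidth = transfers per second * bytes moved per transfer."""
    return mt_per_s * data_path_bits / 8


rate = megatransfers_per_second(800)  # assumed 800 MHz clock
print(rate)                           # 1600 (MT/s)
print(peak_bandwidth_mb_per_s(rate))  # 12800.0 (MB/s)
```

This is a theoretical peak; sustained throughput is lower because of latency and refresh.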
Memories - ROM: ROM (Read Only Memory) is only readable. The bit pattern is burned in at manufacture, and the content remains forever. Application: the basic program (boot program) of machines. Cheaper than RAM. Drawback: a long time passes between order and manufacture.
Memories - (E)(E)PROM: PROM (Programmable Read Only Memory): writable once, afterwards only readable; makes user-designed ROMs possible; programmed by burning fuses. EPROM (Erasable PROM): erasable by UV radiation; programmable with an EPROM burner. EEPROM (Electrically Erasable PROM): erased by electric pulses; programmable electrically; 1/10 the speed and 1/100 the capacity of SRAM and DRAM; 1/2 the speed and 1/64 the capacity of EPROM.
Memories - Flash ROM: Similar to EEPROM, but with lower access time (~50 ns) and cheaper. Can be written and read in blocks. Very resistant (heat, pressure). ~100,000 write/erase cycles. Many devices use it (e.g. SSDs).