ECE 486/586 Computer Architecture Lecture # 2 Spring 2015 Portland State University
Recap of Last Lecture Old view of computer architecture: Instruction Set Architecture (ISA) design Real computer architecture: Design to maximize performance within constraints: cost, power, and availability Includes ISA, microarchitecture, hardware Tasks of a computer architect: Determine features needed by a market and incorporate those features in the computer Identify important technology scaling trends and adapt computer design accordingly
Lecture Topics Trends in Semiconductor Technology Logic, DRAM and Storage Technology Bandwidth and Latency scaling Transistors and Wires Trends in power and Energy Trends in Cost Impact of time, volume and commoditization Cost of an Integrated Circuit Reference: Chapter 1: Sections 1.4, 1.5 and 1.6 (Pages 17-32)
Technology Trends: Transistor Counts Transistor density increases by ~ 35% per year Die size increases are less predictable, ranging from 10% to 20% per year Combined effect: On-chip transistor count increases by 40% to 55% per year (2x increase every 18 to 24 months) Popularly known as Moore's Law What can we do with all these extra transistors? More complex processors (e.g., deeper pipelines, SIMD units) More general-purpose cores on a chip More special-purpose cores (e.g., graphics) on a chip Larger on-chip caches
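The link between an annual growth rate and a doubling period can be checked with a short calculation. The following is a minimal sketch, assuming constant compound growth; the function name `doubling_time_years` is illustrative and not part of the lecture.

```python
import math

def doubling_time_years(annual_growth):
    """Years needed to double at a constant compound annual growth rate.

    annual_growth is a fraction, e.g. 0.40 for 40% per year.
    """
    return math.log(2) / math.log(1 + annual_growth)

# Transistor count growth range quoted on the slide: 40% to 55% per year.
for rate in (0.40, 0.55):
    months = 12 * doubling_time_years(rate)
    print(f"{rate:.0%} per year -> doubles in about {months:.0f} months")
```

Under this assumption, the 40% to 55% range corresponds to a doubling period of roughly one and a half to two years, in line with the figure quoted on the slide.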
Technology Trends: DRAM In the 1990s, DRAM capacity was increasing by 60% per year (quadrupling every 3 years) More recently, the DRAM capacity increase has slowed down to 25%-40% per year (doubling every 2 to 3 years) Capacity increases will continue to slow down due to charge storage limits of the DRAM capacitors DRAM may stop scaling before the end of this decade Impact on Architecture: Need to explore other memory technologies to replace DRAM as the main memory
Technology Trends: Storage Flash Memory: Capacity increasing by 50 to 60% per year (doubling in < 2 years) 15-20X cheaper/bit than DRAM Increasing use as SSD in laptops and as the only storage device in tablets and smartphones Magnetic Disk Technology: Capacity increasing by 40% per year (doubling every 2 to 3 years) 15-25X cheaper/bit than Flash 300-500X cheaper/bit than DRAM Most widely used in server and warehouse-scale storage
Why care about Technology Trends? Architect needs to be aware of technology trends to make the correct design choices and trade-offs Product design begins 3-5 years before the product is expected to hit the market Consistent technology trends help in knowing how a design would behave in the future
Bandwidth vs. Latency Bandwidth or throughput Total work done in a given time 10,000-25,000X improvement for processors 300-1200X improvement for memory and disks Latency or response time Time between start and completion of an event 30-80X improvement for processors 6-8X improvement for memory and disks
Bandwidth vs. Latency Bandwidth improvements significantly outpace latency improvements
Technology Trends: Transistors Scaling refers to reduction in integrated circuit feature size (in x or y dimension) 1971: 10 micrometers 2011: 32 nanometers Transistor density (count per unit area) increases quadratically with feature size reductions (e.g. 2x density increase from 32 nm to 22 nm process technology) Transistor speed typically increases linearly with decreasing feature size, though the actual relationship is complex for multiple reasons, including the decrease in supply voltage
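The quadratic relationship between feature size and density can be illustrated numerically. This is a minimal sketch under an idealized assumption that both the x and y dimensions shrink by the same factor and nothing else changes; `density_gain` is a hypothetical helper name.

```python
def density_gain(old_feature_nm, new_feature_nm):
    """Idealized density ratio when feature size shrinks equally in x and y."""
    return (old_feature_nm / new_feature_nm) ** 2

# One process step from the slide: 32 nm -> 22 nm.
print(f"32 nm -> 22 nm: {density_gain(32, 22):.2f}x density")

# The full span from the slide: 10 micrometers (1971) -> 32 nm (2011).
print(f"10 um -> 32 nm: {density_gain(10_000, 32):,.0f}x density")
```

The first ratio comes out slightly above 2, which matches the "2x" figure on the slide. Real transistor counts also depend on die size and design choices, so the second number is only an upper-level illustration of the trend.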
Technology Trends: Wires Wire delays do not scale well with technology Delay proportional to Resistance * Capacitance Wire lengths decrease with feature size reductions But, resistance and capacitance per unit length increase Poor wire delay scaling compared to transistor scaling creates design challenges Larger fractions of processor clock cycle consumed by signal propagation delay on wires
Power & Energy: A Systems Perspective System Needs: Power needs to be brought in and distributed around the chip Power dissipated as heat needs to be removed Considerations: Peak Power Power supply system needs to provision for peak power needs If a processor attempts to draw more current than the supply can provide, there is a voltage droop, which leads to malfunction Thermal Design Power (TDP) Characterizes sustained power consumption Determined by the capabilities of the cooling system Lower than peak power, higher than average power consumption Energy Efficiency Considers power and execution time (Energy = Power * Time)
Power vs. Energy Power poses operating constraints: can only execute fast enough to max out the power delivery or the cooling solution Energy is the ultimate metric: it measures the true cost of performing a task and has a direct impact on battery life (portable devices) and electricity bills (servers) Example: If processor A consumes 20% more power than processor B but finishes the task in 30% less time, its relative energy is 1.2 * 0.7 = 0.84. Processor A is better, provided its higher power can be supported by the power delivery and cooling systems
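The processor A vs. processor B comparison follows directly from Energy = Power * Time. A minimal sketch of that arithmetic, with `relative_energy` as an illustrative name:

```python
def relative_energy(power_ratio, time_ratio):
    """Energy = Power * Time, so the energy ratio is the product of the two ratios."""
    return power_ratio * time_ratio

# Processor A relative to processor B:
# 20% more power (1.2x) and 30% less execution time (0.7x).
ratio = relative_energy(1.2, 0.7)
print(f"A uses {ratio:.2f}x the energy of B")
```

A ratio below 1 means A spends less energy on the task even though it draws more power while running.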
Dynamic Energy and Power
Dynamic energy (transistor switches from 0 -> 1 or 1 -> 0): $\frac{1}{2} \times \text{Capacitive load} \times \text{Voltage}^2$
Dynamic power: $\frac{1}{2} \times \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency switched}$
Reducing clock rate reduces power, not energy. Reducing voltage reduces both power and energy. Voltages have dropped from 5V to < 1V in 20 years.
Example Question: Consider a processor which can operate in two different modes: (i) Mode-1: a high voltage mode (1V, 3 GHz), and (ii) Mode-2: a low voltage mode (0.75V, 2 GHz). How much savings in dynamic energy and dynamic power will the processor achieve by operating in Mode-2 as compared to Mode-1?
Answer:
$\text{Energy}_{mode2} / \text{Energy}_{mode1} = (0.75)^2 / (1)^2 = 0.56$ => 44% energy savings
$\text{Power}_{mode2} / \text{Power}_{mode1} = \left((0.75)^2 \times 2\right) / \left((1)^2 \times 3\right) = 0.375$ => 62.5% power savings
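The two-mode example can be reproduced with the dynamic energy and power expressions. This is a minimal sketch; it assumes the capacitive load is the same in both modes (so it cancels in the ratios), and the function names are illustrative.

```python
def dynamic_energy(cap_load, voltage):
    """Energy per transition: 1/2 * capacitive load * voltage^2."""
    return 0.5 * cap_load * voltage ** 2

def dynamic_power(cap_load, voltage, freq_hz):
    """Power: 1/2 * capacitive load * voltage^2 * switching frequency."""
    return 0.5 * cap_load * voltage ** 2 * freq_hz

C = 1.0  # assumed: identical capacitive load in both modes, cancels in ratios

# Mode-2 (0.75 V, 2 GHz) relative to Mode-1 (1 V, 3 GHz).
e_ratio = dynamic_energy(C, 0.75) / dynamic_energy(C, 1.0)
p_ratio = dynamic_power(C, 0.75, 2e9) / dynamic_power(C, 1.0, 3e9)

print(f"energy ratio {e_ratio:.4f}, savings {1 - e_ratio:.1%}")
print(f"power ratio  {p_ratio:.4f}, savings {1 - p_ratio:.1%}")
```

The energy ratio depends only on the voltages, while the power ratio also picks up the 2/3 frequency factor, which is why the power savings are larger than the energy savings.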
Power Wall Until 2003, increases in transistor count and frequency dominated reductions in voltage => net increase in power Intel 80386 consumed ~ 2 W 3.3 GHz Intel Core i7 consumes 130 W Heat must be dissipated from 1.5 x 1.5 cm chip This is the limit of what can be cooled by air Result: Clock speeds became stagnant from 2003 onwards
Techniques to Reduce Power Do nothing well: Turn off the clock of inactive modules, e.g., idle FP units or cores Dynamic Voltage-Frequency Scaling: Multiple operating modes with different voltages and frequencies In periods of low activity, switch to lower voltage (frequency) Low power memory states DRAMs have a series of increasingly lower power modes Switching from a low power mode to active mode consumes extra latency Overclocking Turbo mode in Intel processors
Static Power Power consumed when the system is idle: $\text{Current}_{static} \times \text{Voltage}$ Proportional to the number of transistors Increasing rapidly with larger on-chip SRAM caches To cut down static power: Need to turn off the power supply to inactive modules (power gating)
Cost and Price Trends Impacted by time, volume, and commodification Cost decreases with time due to: Learning curve resulting in improved yields Recognized opportunities for cost reductions Cost decreases with volume due to: Learning curve reached much faster Purchasing and manufacturing efficiency Reduction in amortized per unit cost of R&D Cost decreases with commodification due to: Multiple vendors producing large volumes of mostly identical products, leading to competition and price reduction Intense competition results in lower profit margins and reduced prices
Intel Pentium 4 and Pentium M Pricing Price reduces with time, as the manufacturing process matures and volume increases
IC Manufacturing Integrated circuit manufacturing starts from the production of silicon wafers Silicon ingot is sliced into silicon wafers Wafers go through multiple processing steps Patterned wafers are tested and bad wafers are removed from the population Good wafers are chopped into dies which go through another testing process Good dies are packaged, re-tested and then sent to customers
Example: Intel Sandy Bridge 280 dies/300 mm wafer, 32nm process technology
Cost of an Integrated Circuit
Cost of an Integrated Circuit
Bose-Einstein formula: $\text{Die yield} = \text{Wafer yield} \times \frac{1}{(1 + \text{Defects per unit area} \times \text{Die area})^{N}}$
Wafer yield: accounts for wafers that are completely bad
Defects per unit area: measure of random manufacturing defects, 0.016-0.057 defects per square cm (2010)
N (process complexity factor): measure of manufacturing difficulty, 11.5-15.5 (40 nm, 2010)
Yield Example
Problem: Assume wafer yield = 100%, defect density = 0.031 per cm$^2$, and N = 13.5. Compare die yields for a die that is 1.5 cm on a side with a die that is 1 cm on a side.
Solution:
Die A: 1.5 cm on a side. Area = $1.5^2 = 2.25\ \text{cm}^2$. Die yield = $100\% \times \frac{1}{(1 + 0.031 \times 2.25)^{13.5}} = 0.40$
Die B: 1 cm on a side. Area = $1^2 = 1\ \text{cm}^2$. Die yield = $100\% \times \frac{1}{(1 + 0.031 \times 1)^{13.5}} = 0.66$
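The die yield calculation above can be checked with a few lines of code. This is a minimal sketch of the yield expression used in the solution; the function and parameter names are illustrative.

```python
def die_yield(wafer_yield, defects_per_cm2, die_area_cm2, n):
    """Die yield = wafer yield / (1 + defect density * die area)^N."""
    return wafer_yield / (1 + defects_per_cm2 * die_area_cm2) ** n

# Values from the problem statement: 100% wafer yield,
# 0.031 defects per cm^2, process complexity factor N = 13.5.
for side_cm in (1.5, 1.0):
    y = die_yield(1.0, 0.031, side_cm ** 2, 13.5)
    print(f"{side_cm} cm die: area {side_cm ** 2:.2f} cm^2, yield {y:.2f}")
```

Because the die area sits inside a term raised to the power N, a modest increase in area causes a disproportionately large drop in yield.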
Yield Example (cont.)
Problem: For the problem on the previous page, compare the number of good dies for a 30 cm wafer in each case.
Solution:
Die A: 1.5 cm on a side. Dies per 30 cm wafer = $\pi \times \left( \frac{(30/2)^2}{(1.5)^2} - \frac{30}{\sqrt{2 \times (1.5)^2}} \right) = 270$. Die yield = 0.4. Number of good dies per wafer = 270 * 0.4 = 108
Die B: 1 cm on a side. Dies per 30 cm wafer = $\pi \times \left( \frac{(30/2)^2}{(1)^2} - \frac{30}{\sqrt{2 \times (1)^2}} \right) = 640$. Die yield = 0.66. Number of good dies per wafer = 640 * 0.66 = 422
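The dies-per-wafer estimate used in this solution is the wafer area divided by the die area, minus a term for partial dies lost along the wafer edge. The following is a minimal sketch of that calculation; it takes the rounded die yields (0.40 and 0.66) from the previous example as inputs, and the function name is illustrative.

```python
import math

def dies_per_wafer(wafer_diameter_cm, die_area_cm2):
    """Wafer area / die area, minus an edge-loss term for partial dies."""
    whole = math.pi * (wafer_diameter_cm / 2) ** 2 / die_area_cm2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2 * die_area_cm2)
    return whole - edge_loss

# (die side in cm, rounded die yield from the previous example)
for side_cm, yield_ in ((1.5, 0.40), (1.0, 0.66)):
    dies = round(dies_per_wafer(30, side_cm ** 2))
    good = dies * yield_
    print(f"{side_cm} cm die: {dies} dies per wafer, about {good:.0f} good dies")
```

Using unrounded yields would shift the good-die counts slightly, so small differences from the slide values are expected if the inputs are changed.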
Cost Summary Total cost is determined by many factors: raw material cost, wafer yield, defect density, die size, testing costs, packaging costs, etc. Manufacturing process dictates: wafer cost, wafer yield, defect density. Architect/designer controls: die size (larger die => fewer dies per wafer => higher cost; larger die => lower die yield => higher cost) and package (pins, I/Os).
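The two die-size effects compound, which a rough cost estimate makes visible. This is a minimal sketch that divides wafer cost by the number of good dies, using the dies-per-wafer and yield figures from the yield example earlier in the lecture. The wafer cost of 5000 is a hypothetical value chosen only for illustration, and testing and packaging costs are ignored.

```python
def cost_per_good_die(wafer_cost, dies_per_wafer, die_yield):
    """Wafer cost spread over the dies that actually work."""
    return wafer_cost / (dies_per_wafer * die_yield)

WAFER_COST = 5000.0  # hypothetical wafer cost, not from the lecture

# (label, dies per 30 cm wafer, die yield) from the earlier yield example
large = cost_per_good_die(WAFER_COST, 270, 0.40)
small = cost_per_good_die(WAFER_COST, 640, 0.66)

print(f"1.5 cm die: {large:.2f} per good die")
print(f"1.0 cm die: {small:.2f} per good die")
print(f"cost ratio: {large / small:.2f}x for a 2.25x larger die area")
```

The cost ratio is independent of the assumed wafer cost, and it comes out well above the 2.25x area ratio because the larger die loses on both dies per wafer and yield.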
Cost vs. Price For commodity products (DRAM, disks, embedded processors), price closely tracks cost (low profit margin): lots of competitors; higher volumes => manufacturing costs easier to amortize. For more specialized products (HPC, servers), price is significantly higher than cost (high profit margins): fewer competitors; lower volumes. With the proliferation of low-end consumer devices (such as smartphones), profit margins will continue to shrink in the near future.