A Low Power 720p Motion Estimation Processor with 3D Stacked Memory
Shuping Zhang, Jinjia Zhou, Dajiang Zhou and Satoshi Goto
Graduate School of Information, Production and Systems, Waseda University, 2-7 Hibikino, Kitakyushu, Japan

Abstract: In this paper, a motion estimation processor (MEP) with a 3D stacked memory architecture is proposed to 1) reduce the memory and core power consumption and 2) provide higher bandwidth. Firstly, a memory die is designed and stacked with the MEP die. By adding face-to-face (F2F) pad and through silicon via (TSV) definitions, 2D electronic design automation (EDA) tools are extended to support the proposed 3D stacking architecture. Moreover, a novel memory controller is applied to control the data transmission and the timing between the memory die and the MEP die. Finally, 3D physical design is completed for the whole system, including TSV/F2F placement, floor plan optimization, power network generation, etc. Compared with 2D technology, the number of IO pins is reduced by 77%. After optimizing the floor plans of the MEP die and the memory die, the routing wire length is reduced by 13.4% and 50% respectively. The simulation results show that the max bandwidth is more than 14GB/s and that the whole design can support real-time 720p@60fps encoding at 8MHz with less than 65mW, which is only one sixth of the state-of-the-art MEP.

Keywords: 3DIC design; motion estimation processor; low power design; memory stacking

I. INTRODUCTION

With the development of semiconductor technology, portable devices are becoming more and more powerful. Meanwhile, the cameras integrated in portable devices offer higher and higher performance. The most obvious sign is that the capture resolution has improved from 0.3 megapixels to 10 megapixels. With such powerful cameras, 720p and 1080p video recording and playback have become common functions in new portable devices.
Benefiting from the portability of smart phones and the expressiveness of video, more and more users tend to record their lives on video. At the same time, however, power consumption has become the bottleneck of portable devices: many devices need to be charged once or even twice per day, which makes for a bad user experience. Users prefer portable devices with a long battery life. In short, video capture and playback on portable devices are increasingly popular, so a low power video codec is required.

This research was supported by the regional innovation strategy support program of MEXT and Waseda University Graduate Program for Embodiment Informatics (FY2013-FY2019).

Many studies have focused on reducing the power consumption of the video codec itself and have achieved quite good results. V. Sze et al. [1] implemented a full real-time 720p H.264 decoder in 65nm; by using a variety of techniques such as multiple voltage and frequency domains and frame-level dynamic voltage and frequency scaling, the video decoder core power is reduced to 1.8mW. Y. Lin et al. [2] implemented a 1080p@30fps H.264/AVC (advanced video coding) encoder in 130nm; by applying several techniques including complexity reduction and cross-stage hardware sharing, the encoder core power is optimized to 242mW. However, although the video encoder/decoder core power is reduced significantly, the total power consumption is still high when the core works with an external dynamic random access memory (DRAM).

Many works have focused on reducing the DRAM power by decreasing the DRAM bandwidth requirement [3][4]. But conventional 2D integrated circuit process technology has reached a bottleneck. Many researchers are now trying to solve the DRAM problem with 3D large scale integration (LSI). A good example, which exploits the regularity of the DRAM architecture, is the industrial high-performance 8Gb 3D DDR3 memory developed in [5].
Moreover, Samsung has applied 3D-TSV technology to its 30nm-class DRAM products to keep pace with Moore's Law and industry projections. Much research also focuses on wide IO memory [6] and the hybrid memory cube (HMC) [7] to improve performance. Beyond the memory area, many studies focus on 3D LSI design. A 64-core processor with stacked memory was designed in [8], with a max throughput of up to 63.8GB/s. T. Zhang et al. [9] introduced a 5-tier stacked H.264 application with on-chip DRAM stacking. Although the memory power is not given in [9], it can be estimated with [10] from the characteristics given in [9]. The resulting memory power is 492.5mW, which is still too high.

In this paper, a motion estimation processor (MEP) with a 3D stacked memory architecture is proposed to reduce the memory power and provide higher memory bandwidth. The MEP is a key encoding component of almost all modern video coding standards. As profiled in [11], the MEP takes more than 50% of the total computation time in an H.264/AVC encoder when configured to use single-direction full search and a search range (SR) of 32. The MEP used in this design has an SR of 211 and 2 reference frames, which is even more costly. This work focuses on reducing the MEP power by 3D integration technology. Firstly, a memory die is designed and stacked with our previous MEP die. By adding face-to-face (F2F) pad and through silicon via (TSV) definitions, 2D electronic design automation (EDA) tools are extended to support the proposed 3D stacking architecture. Moreover, a novel memory controller is designed to control the data transmission and the timing between the memory die and the MEP die. Finally, 3D physical design is completed for the whole system, including TSV/F2F placement, floor plan optimization of the two dies, power network generation, and so on. As a result, compared with the 2D technology based
MEP, the number of input/output (IO) pins is reduced by 77%. After optimizing the floor plans of the processor die and the memory die, the routing wire length is reduced by 13.4% and 50% respectively. The simulation results show that the power consumption of the whole design is 64.85mW and the max bandwidth is 14.06GB/s, which compares favorably with the state-of-the-art works.

II. ARCHITECTURE DESIGN

A. 2-Die Stacked 3D Architecture

Fig. 1 shows the side view of the stacked dies. Two dies (MEP and memory) of the same size are stacked face to face. The MEP die is put on the top for the following reasons. Firstly, all the IO cells are in the MEP die and are connected by TSVs to the landing pads on the backside of the MEP die (the upper surface of the 3D chip). The landing pads are connected to the leads by wire bonding. Secondly, the MEP consumes more power than the memory, so the MEP die generates more heat than the memory die. Placing the MEP on top therefore gives it better cooling. With this arrangement, IO pins are not needed on the memory die, since all data transmission and power delivery to the memory die go through the F2F bonding pads. Therefore, without the limitation of IO pins, a memory with 128/256 bit wide IO can be applied in this design.

Fig. 1. Side view of the stacked dies

B. MEP Architecture

The top-level block diagram of the MEP is shown in Fig. 2. Based on our previous work [3], the MEP contains a hierarchical integer motion estimation (IME) component and a fractional motion estimation (FME) component. The IME component is separated into an IME coarse search engine (IMEC) and an IME refinement search engine (IMER). IMEC, IMER and FME work in parallel in a macroblock (MB)-level pipeline, and a memory scheduler component issues data requests to the static random access memory (SRAM) interfaces and schedules MB-level tasks. Two independent memory interfaces are employed to provide connectivity to the memory die. SRAM (A) is a 256 bit interface for buffering the reference frames, while SRAM (B) is a 128 bit interface for buffering the source frame and motion vectors (MVs). Data is stored to SRAM (A) after frame compression [4]. Two caches implemented with on-chip SRAMs serve reference frame data to IMEC and IMER [3]. The 24KB IMEC cache consists of 16 data memory banks with independent read addresses. Each bank is implemented with a 1R1W SRAM of 256x48 bits. The 512KB IMER cache is also composed of 16 data memory banks. Each bank is implemented with a 1RW SRAM of 2048x128 bits. The MEP can provide a max throughput of 1.59Gpixels/s. Therefore, real-time 720p@60fps video encoding can be supported at 8MHz.

Fig. 2. Top-level block diagram of MEP

C. Memory Architecture

The MEP requires two memories for data access, so 2 memories are designed in this work. Each memory is composed of many normal SRAM blocks. Because the chip size is limited and the SRAM blocks are large, the capacity is limited. Memory (A) is designed to be 14.25Mb because one 720p frame requires 7.032Mb and two reference frames are stored in memory (A). Memory (B) is designed to be 8Mb because one source frame and some MVs are stored in memory (B).

Fig. 3(a) shows the block diagram of memory (A), which buffers the reference frames. It consists of three 4Mb banks and one 2.25Mb bank. In a 4Mb bank, eight 32 bit 512Kb sub-banks are combined to form a 256 bit wide memory bank. The small 2.25Mb bank, which includes four 64 bit 512Kb sub-banks and two 128 bit 128Kb sub-banks, is also a 256 bit wide memory bank. The 4 banks share the same data bus, while all sub-banks share the same address bus.

Fig. 3(b) shows the block diagram of memory (B), which buffers the source frame and MVs. It consists of four 2Mb banks. Each 2Mb bank contains four 32 bit 512Kb sub-banks making up a 128 bit wide memory bank. These 4 banks share another data bus, and all sub-banks in memory (B) share another address bus.

Fig. 3. Architecture of memory: (a) block diagram of SRAM memory (A), (b) block diagram of SRAM memory (B)

To reduce the power consumption of the memory die, every memory bank stays in standby mode automatically until there is a data access to that bank. This approach saves 13.89% of the power. Table I summarizes the specification of the 2 memories. The address width of both memories is 16 bit. The synthesis results show that the memory die can run at 300MHz, so the max bandwidth can be up to 14GB/s.

TABLE I. Specification of memories
                        Memory (A)                     Memory (B)
Memory type             SRAM                           SRAM
Capacity                14.25Mb                        8Mb
Bit-width               256 bit                        128 bit
Address                 16 bit                         16 bit
Capacity of sub-bank    512Kb(I), 512Kb(II), 128Kb     512Kb

Fig. 4. Architecture of memory controller (controller A connects the memory scheduler of the ME processor to memory A, the reference frame buffer; controller B connects it to memory B, the source frame buffer)

Fig. 5. Backend design flow (inputs: processor and memory netlists, SDC files, technology file, timing library, physical library; steps: manual floor plan, power/ground network generation, manual TSV and F2F pad placement, auto placement and routing with CTS and global/detail routing, fixing violations, inserting decap/fill cells, verification; outputs: processor and memory DEF files)

D. Architecture of Memory Controller

A double-data-rate (DDR) memory controller is integrated in the original MEP [3]. In this work, a novel SRAM memory controller is designed to take the place of the DDR memory controller. The memory controller has two main functions. Firstly, it controls the data transmission between the MEP and the two memories.
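As a rough cross-check of the memory figures in Section II.C and Table I, the sketch below recomputes the capacities and the peak bandwidth from the bank organization. It is only an illustration under stated assumptions that are not spelled out in the paper: 8 bits per pixel (luma only) for the frame size, both interfaces delivering one full word per cycle at 300MHz, and a unit convention of 1GB = 1024MB with 1MB = 10^6 bytes, which happens to reproduce the quoted 14.06GB/s. All function names are hypothetical.

```python
KB_PER_MB = 1024  # assumption: 1 Mb = 1024 Kb


def memory_a_capacity_mb() -> float:
    """Capacity of memory (A): three 4Mb banks plus one smaller bank."""
    full_bank_kb = 8 * 512             # eight 32-bit 512Kb sub-banks
    small_bank_kb = 4 * 512 + 2 * 128  # four 64-bit 512Kb + two 128-bit 128Kb sub-banks
    return (3 * full_bank_kb + small_bank_kb) / KB_PER_MB


def memory_b_capacity_mb() -> float:
    """Capacity of memory (B): four banks of four 32-bit 512Kb sub-banks."""
    bank_kb = 4 * 512
    return 4 * bank_kb / KB_PER_MB


def frame_size_mb(width: int = 1280, height: int = 720, bits_per_pixel: int = 8) -> float:
    """Size of one frame in Mb; 8 bits per pixel is an assumption (luma only)."""
    return width * height * bits_per_pixel / (1024 * 1024)


def peak_bandwidth_gb_per_s(freq_mhz: float = 300, bus_widths_bits=(256, 128)) -> float:
    """Peak bandwidth if every interface transfers one word per cycle (assumption)."""
    bytes_per_cycle = sum(bus_widths_bits) / 8
    mb_per_s = freq_mhz * bytes_per_cycle  # MHz x bytes = 10^6 bytes/s
    return mb_per_s / 1024                 # assumed convention: 1 GB = 1024 MB


if __name__ == "__main__":
    print("Memory (A):", memory_a_capacity_mb(), "Mb")
    print("Memory (B):", memory_b_capacity_mb(), "Mb")
    print("One 720p frame:", frame_size_mb(), "Mb")
    print("Peak bandwidth:", peak_bandwidth_gb_per_s(), "GB/s")
```

Under these assumptions the sketch gives 14.25Mb and 8Mb for the two memories, about 7.03Mb per 720p frame (so two reference frames fit in memory (A)), and about 14.06GB/s of peak bandwidth.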
Moreover, it controls the timing of the MEP and the two memories. The novel memory controller includes 2 independent controllers (controller A and controller B), as shown in Fig. 4. It responds to the data access requests from the memory scheduler in the MEP. Controller A handles the data transmission between memory A and the MEP, while memory B and the MEP are connected by controller B. Both memories support burst mode with a burst length of 8. This design has some other benefits. Firstly, all the interfaces between the MEP die and the memory die are contained in this module, so if the interfaces need to change, only this module has to be modified. Secondly, the MEP can be integrated into this design easily.

III. 3D PHYSICAL DESIGN

This section introduces the proposed 3D physical design. A design flow is also introduced in [9], but cell definition is not included there. We first define the F2F pad and the TSV by modifying the library exchange format (LEF) file and the timing library. Then the floor plans of the memory die and the MEP die, the placement of F2F pads and TSVs, and the power network are presented.

A. Backend Design Flow

Fig. 5 shows the backend design flow. The netlist file and the timing constraint file generated in front end design are imported into the EDA tool. Cadence Encounter is used for backend design. The backend design is run separately for the MEP die and the memory die, so the generated netlist must be separated into 2 netlist files before backend design begins. Several other files, including the 65nm technology file, the timing library and the physical library, must also be prepared beforehand. In the backend design flow, we optimize the floor plan so that the EDA tools can obtain good results in subsequent steps such as placement, clock tree synthesis (CTS) and routing. After the floor plan, the power network needs to be designed.
Even though this is a low frequency, low power 3D LSI design, a strong power network is built to ensure the power delivery. Thirdly, the TSVs and F2F pads are placed manually. TSVs are used in the IO area to connect the IO cells to the landing pads. Consequently, TSV placement is only run on the MEP die. The remaining steps can be done automatically by the EDA tools. In the auto routing step, some special settings are applied to reserve the top metal layer for F2F bonding.

Fig. 6. (a) Side view of TSV, (b) Shape of F2F pad

B. F2F Pad and TSV Definitions

The side view of the defined TSV is shown in Fig. 6(a). The diameter of the TSV is 2um while the diameter of the landing
pad on the first metal layer is 5um. A large landing pad tolerates a large misalignment of the TSV, which improves the yield. Fig. 6(b) illustrates the shape of the customized F2F pad. The diameter of the F2F pad is 3.4um and the F2F pad has 2 pins. The pin on the top metal layer connects to the other die, while the pin on the lower metal layer connects to the signal, power or ground (P/G) net.

Fig. 7. Floor plan of the memory die (annotations: the blocks belonging to SRAM (A) are placed together; the sub-bank 0 blocks of SRAM (B) share the low 32 bits of the data bus; a triangle indicates the orientation of each block and the locations of its pins)

Fig. 9. (a) Locations of the F2F pads, (b) F2F signal pads, (c) F2F P/G pads

C. Floor Plan of Memory Die

Both memory A and memory B are integrated in the memory die. Memory B includes 16 memory blocks while memory A includes 30 memory blocks, i.e. there are 46 working memory blocks in total. In addition, a backup block is placed in the memory die. These 47 blocks are placed in a 4384um by 4640um core area, as shown in Fig. 7. To minimize the chip size, the blocks are placed as close together as possible.

The floor plan is a key step in the backend design flow. It directly and strongly affects the auto placement and routing results. Several ideas are proposed to optimize the floor plan of the memory die. Firstly, to minimize the wire length, the blocks that share the same address bus and data bus are grouped together. As described in Section II.C, all the banks of the same SRAM (SRAM A or B) share the address bus, so placing these banks together greatly reduces the wire length. As shown in Fig.
7, the blocks enclosed by the white box are placed together, since they all belong to SRAM A and share the address bus. Secondly, the triangle in the corner of each block indicates the orientation of the block and the locations of its pins. The location and the order of the pins in a block are fixed, but the orientation of the block can be set by the user. By placing the pins of the blocks as close together as possible and in the same order, the routing congestion can be reduced. Finally, within each SRAM (SRAM A or B), the blocks which share the same part of the data bus are put together to reduce the length of the routing wire. For example, as described in Section II.C, the 4 blocks enclosed by the red box in Fig. 7 are sub-bank 0 of each bank in SRAM B; they share the low 32 bits of the data bus, so they are placed next to each other. Consequently, after optimizing the floor plan, the total routing wire length is reduced by more than 50%.

D. F2F Pads and TSVs Placement

In this design, TSVs are used in the IO area of the MEP die. Fig. 8 shows the TSVs inserted at the location of an IO pad. To improve the yield, redundant TSVs (32 per IO pad) are placed to connect each IO pad. The pitch of the redundant TSVs is 10um. There are 10 input cells, 14 output cells and 32 P/G cells in the processor die, i.e., 56 IO pads. So the total number of TSVs is 56 x 32 = 1792. Table II lists the TSV parameters.

TABLE II. TSV parameters
Diameter        2um
Pitch           10um
Depth           6um
# per IO pad    32
Total number    1792

Fig. 8. TSVs inserted in an IO pad

The locations of the F2F pads are decided by the floor plan of the memory die, as shown in Fig. 9(a). The F2F signal pads are placed in the white boxes, close to where the pins of the memory blocks are gathered. The order of the F2F signal pads is the same as that of the pins in the blocks, so that routing congestion can be reduced. Fig. 9(b) shows the F2F signal pads, whose pitch is 5um. There are 803 F2F signal pads in total. The F2F P/G pads are located on the core power ring and enclosed by the red boxes. The enlarged view of the F2F P/G pads is shown in Fig. 9(c). The pitch of the F2F P/G pads is 10um, and there are 500 power pads and 500 ground pads. Table III lists the F2F pad parameters in one die.

TABLE III. F2F pad parameters
Diameter        3.4um
Pitch           Signal: 5um; P/G: 10um
Total number    Signal: 803; Power: 500; Ground: 500

E. Floor Plan of Processor

Fig. 10 shows the floor plan of the MEP die. Since the locations of the F2F pads are decided by the memory die, the floor plan of the MEP is optimized around the fixed F2F pads. As described in Section II.B, the IMEC cache is composed of 16 memory banks and the IMER cache also consists of 16 memory banks. Each bank is composed of 2 RAM blocks. To reduce the routing wire length, these 2 blocks are put together. 2.2M logic gates and 72 cache blocks are placed in a 3340um by 3400um core area. The blocks of the IMER cache are placed at the bottom while the blocks of the IMEC cache are placed at the top. The blocks are placed from the periphery to the interior with proper orientation. Compared to the floor plan automatically generated by Encounter, the routing wire length is reduced by 13.4%. The IO cells are mainly placed on the left and right sides of the IO area, since the top and bottom sides are occupied by F2F P/G pads. There are only 24 control signal cells and 32 P/G cells in the processor die, because most of the pins connecting to the memory die are replaced by the F2F signal pads. Compared to [3], the number of IO pins is reduced by 77%.

Fig. 10. Floor plan of MEP die

F. Power Network

A strong power delivery network ensures reliable operation of the circuits on a chip, especially in a 3D IC. Fig. 11 shows the power network of the processor die. The core area is surrounded by a wide P/G core ring connecting inside and outside. Inside the core area, horizontal power rails connected to the power ring supply power to the standard cells. Furthermore, vertical power stripes connected to the power ring and the power rails are added to reduce the IR-drop.
To enhance the power supply to the RAM blocks, block rings are also added, which are not shown in the figure. The F2F P/G pads deliver power from the MEP die to the memory die, and they are connected to the power ring directly. The power network of the memory die is similar to that of the processor die, except that there are no P/G cells and the F2F P/G pads are located on the power ring. The power network of the memory die is therefore not described again.

IV. SIMULATION RESULTS AND COMPARISON

The proposed architecture is synthesized with Synopsys Design Compiler using a 65 G standard cell library. Cadence Encounter is then used for backend design, including floor plan, power network generation, TSV and F2F placement, CTS, auto placement and routing, etc. Fig. 12 shows the layouts of the MEP die and the memory die, and the physical characteristics of the whole design are summarized in Table IV. The simulations are based on these layouts. Since this is a low frequency and low power design, verification steps such as signal integrity (SI) analysis and thermal simulation are not considered necessary. The results of the 3D power analysis and the 3D IR-drop simulation are given in the following parts.

TABLE IV. Physical characteristics of this design
Process technology            65nm
Chip size                     5000um x 5000um
Core size (processor die)     3340um x 3400um
Core size (memory die)        4384um x 4640um
Frequency@voltage             8MHz@1.2V
# of TSV                      1792
# of F2F signal/P/G pads      803/500/500
# of signal/P/G IO cells      24/32

A. 3D Power Analysis

3D power analysis is performed with Cadence Encounter Power System (EPS). The physical library, timing library and technology file, as well as the netlist file, design exchange format (DEF) file, standard delay format file, etc., are imported into EPS. Then the analysis method is set to static and the corner mode is set to normal (1.2V at 25°C). The frequency is read from the timing constraint file and the toggle rate is set. Finally, EPS reports a total power of 64.85mW.
The power of the processor die and the memory die is 37.67mW and 27.18mW respectively.

Fig. 11. Power network of the processor die (labels: IO cell, F2F pad, power ring, ground ring)

Fig. 12. Layout of processor die (left) and memory die (right)

B. 3D IR-Drop Simulation

IR-drop simulation is also done with EPS. Firstly, a cell list is prepared to create the power grid library. Then the analysis method is
set to static mode, the temperature is set to 25°C and the locations of the power sources are set. Here the power source of the processor die is the power cell and the power source of the memory die is the F2F power pad. Thirdly, the result files of the power analysis are imported. After these steps, EPS analyzes the IR-drop automatically. The worst IR-drop of the processor die is 5.478mV, while the maximum IR-drop of the memory die is only 1.555mV. Note that the voltage of the power net is 1.05V, so there is plenty of margin for the 3D stacked case.

C. Comparison

We do not compare this design with [1] and [2] because their functions are different. Ref. [1] is a video decoder, which has no MEP. Ref. [2] is a whole encoder, which includes not only an MEP but also other parts. An MEP with a system-in-silicon architecture is introduced in [12], whose core power and memory power are 2383mW and 190mW respectively, for a total power of 2573mW. After normalization, its power and energy efficiency are 432.5mW and 6.952nJ/pixel respectively.

TABLE V. Design specification and power comparison
                                          [12]                                 This design
Memory type                               System-in-Silicon DRAM / on chip     3D stacked memory / on chip
Technology                                180nm/110nm                          65nm
Frequency                                 200MHz/25MHz@1.8V                    8MHz@1.2V
Throughput                                1080p@30fps                          720p@60fps
Core power / norm. core power (a)         2383mW / 382.5mW                     37.67mW
Memory power / norm. memory power (a)     190mW / 50mW                         27.18mW
Norm. total power                         432.5mW                              64.85mW
Energy efficiency (b)                     6.952nJ/pixel                        1.173nJ/pixel
(a) Power normalized to 65nm (P65 = P180 / 6.23 = P110 / 3.8)
(b) Energy efficiency = Norm. total power / throughput

TABLE VI. Bandwidth comparison
                     [9]                  [6]          This design
Footprint            12.3mm x 21.8mm      -            5mm x 5mm
# of tier/die
Max frequency        133MHz               -            300MHz
Working frequency    60MHz                200MHz       8MHz
Data width
Max bandwidth        8.5GB/s              12.8GB/s     14GB/s

Table V shows the specification of this design and the power comparison with [12].
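The normalized power and energy-efficiency entries of Table V can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: it assumes, based on the table footnotes, that the core of [12] is normalized from 180nm (factor 6.23) and its memory from 110nm (factor 3.8), and it also derives the minimum clock that 720p@60fps would need at the 9 pixels per cycle reported in this section. The function and constant names are hypothetical.

```python
NORM_FACTOR_180NM = 6.23  # from Table V footnote (a): P65 = P180 / 6.23
NORM_FACTOR_110NM = 3.8   # from Table V footnote (a): P65 = P110 / 3.8


def pixel_rate(width: int, height: int, fps: int) -> int:
    """Throughput in pixels per second."""
    return width * height * fps


def energy_nj_per_pixel(power_mw: float, pixels_per_s: float) -> float:
    """Footnote (b): energy efficiency = normalized total power / throughput."""
    return power_mw * 1e-3 / pixels_per_s * 1e9


def min_clock_mhz(pixels_per_s: float, pixels_per_cycle: float) -> float:
    """Lowest clock frequency that sustains the given pixel rate."""
    return pixels_per_s / pixels_per_cycle / 1e6


# Reference design [12]: assumed 180nm core and 110nm memory, 1080p@30fps.
ref_core_mw = 2383 / NORM_FACTOR_180NM
ref_memory_mw = 190 / NORM_FACTOR_110NM
ref_total_mw = ref_core_mw + ref_memory_mw
ref_energy = energy_nj_per_pixel(ref_total_mw, pixel_rate(1920, 1080, 30))

# This design: already at 65nm, so no normalization is applied; 720p@60fps.
this_total_mw = 37.67 + 27.18
this_rate = pixel_rate(1280, 720, 60)
this_energy = energy_nj_per_pixel(this_total_mw, this_rate)

if __name__ == "__main__":
    print(f"[12] normalized total power: {ref_total_mw:.1f} mW")
    print(f"[12] energy efficiency: {ref_energy:.3f} nJ/pixel")
    print(f"This design total power: {this_total_mw:.2f} mW")
    print(f"This design energy efficiency: {this_energy:.3f} nJ/pixel")
    print(f"Energy ratio ([12] / this design): {ref_energy / this_energy:.2f}")
    print(f"Minimum clock at 9 pixels/cycle: {min_clock_mhz(this_rate, 9):.3f} MHz")
```

With these inputs the energy ratio comes out a little under 6, consistent with the "one sixth" claim, and the minimum clock of about 6.1MHz sits below the 8MHz operating frequency.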
Thanks to the algorithm optimization, the MEP can process 9 pixels per cycle on average, so it can encode 720p@60fps video sequences with 2 reference frames at 8MHz. The power consumption of the MEP die and the memory die at this frequency is as shown in Table V. Thanks to the frequency reduction and the 3D integration, the energy efficiency of this design is 1.173nJ/pixel, which is only about one sixth of that of [12]. The bandwidth comparison is given in Table VI. A 3D implementation of an H.264 encoder is introduced in [9], whose max bandwidth is 8.5GB/s. The max bandwidth of this design is 14GB/s, which is about 1.64 times that of [9]. The max bandwidth of this design is also a little higher than that of the wide IO single data rate memory [6], whose bandwidth is 12.8GB/s.

V. CONCLUSION

In this paper, an MEP with a 3D stacked memory architecture is proposed to reduce the memory power and provide higher bandwidth. By adding F2F pad and TSV definitions, 2D EDA tools are extended to support the proposed 3D stacking architecture. Furthermore, a novel memory controller is designed to control the data transmission and the timing between the memory die and the MEP die. Finally, 3D physical design is completed for the whole system, including floor plan optimization of the two dies, TSV/F2F placement, power network generation, etc. As a result, compared with the 2D technology based MEP, the number of IO pins is reduced by 77%. After optimizing the floor plans of the processor die and the memory die, the routing wire length is reduced by 13.4% and 50% respectively. The simulation results show that the power consumption and the max bandwidth of the whole design are 64.85mW and 14GB/s respectively, which compares favorably with the state of the art.

REFERENCES

[1] Sze, Vivienne, et al. "A 0.7-V 1.8-mW H.264/AVC 720p Video Decoder". Solid-State Circuits, IEEE Journal of (2009):
[2] Yu-Kun Lin, et al. "A 242mW 10mm2 1080p H.264/AVC High-Profile Encoder Chip".
Solid-State Circuits Conference, ISSCC Digest of Technical Papers. IEEE International
[3] Zhou, Jinjia, et al. "A 1.59 Gpixel/s Motion Estimation Processor with -211-to-211 Search Range for UHDTV Video Encoder". VLSI Circuits (VLSIC), 2013 Symposium on. IEEE, C286-C287.
[4] Zhou, Dajiang, et al. "A 530 Mpixels/s 4096x2160@60fps H.264/AVC High Profile Video Decoder Chip". Solid-State Circuits, IEEE Journal of 46.4 (2011):
[5] Uksong Kang, et al. "8 Gb 3-D DDR3 DRAM using through-silicon-via Technology". Solid-State Circuits, IEEE Journal of 45.1 (2010):
[6] Kim, Jung-Sik, et al. "A 1.2 V 12.8 GB/s 2 Gb Mobile Wide-I/O DRAM with I/Os using TSV Based Stacking". Solid-State Circuits, IEEE Journal of 47.1 (2012):
[7] Jeddeloh, Joe, and Brent Keeth. "Hybrid Memory Cube New DRAM Architecture Increases Density and Performance". VLSI Technology (VLSIT), 2012 Symposium on. IEEE,
[8] Healy, M. B., et al. "Design and Analysis of 3D-MAPS: A Many-Core 3D Processor with Stacked Memory". Custom Integrated Circuits Conference (CICC), 2010 IEEE
[9] Tao Zhang, et al. "A 3D SoC Design for H.264 Application with On-Chip DRAM Stacking". 3D Systems Integration Conference (3DIC), 2010 IEEE International
[10] ator/ddr3_power_calc.xlsm
[11] Woong IL Choi, Byeungwoo Jeon, and Jechang Jeong. "Fast Motion Estimation with Modified Diamond Search for Variable Motion Block Sizes". Image Processing, ICIP Proceedings International Conference on II vol.3.
[12] Kumagai, K., et al. "System-in-Silicon Architecture and its Application to H.264/AVC Motion Estimation for 1080HDTV". Solid-State Circuits Conference, ISSCC Digest of Technical Papers. IEEE International
Xilinx SSI Technology Concept to Silicon Development Overview Shankar Lakka Aug 27 th, 2012 Agenda Economic Drivers and Technical Challenges Xilinx SSI Technology, Power, Performance SSI Development Overview
More informationA SCALABLE COMPUTING AND MEMORY ARCHITECTURE FOR VARIABLE BLOCK SIZE MOTION ESTIMATION ON FIELD-PROGRAMMABLE GATE ARRAYS. Theepan Moorthy and Andy Ye
A SCALABLE COMPUTING AND MEMORY ARCHITECTURE FOR VARIABLE BLOCK SIZE MOTION ESTIMATION ON FIELD-PROGRAMMABLE GATE ARRAYS Theepan Moorthy and Andy Ye Department of Electrical and Computer Engineering Ryerson
More informationCadence On-Line Document
Cadence On-Line Document 1 Purpose: Use Cadence On-Line Document to look up command/syntax in SoC Encounter. 2 Cadence On-Line Document An on-line searching system which can be used to inquire about LEF/DEF
More informationThree DIMENSIONAL-CHIPS
IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) ISSN: 2278-2834, ISBN: 2278-8735. Volume 3, Issue 4 (Sep-Oct. 2012), PP 22-27 Three DIMENSIONAL-CHIPS 1 Kumar.Keshamoni, 2 Mr. M. Harikrishna
More informationIMEC CORE CMOS P. MARCHAL
APPLICATIONS & 3D TECHNOLOGY IMEC CORE CMOS P. MARCHAL OUTLINE What is important to spec 3D technology How to set specs for the different applications - Mobile consumer - Memory - High performance Conclusions
More informationECE 486/586. Computer Architecture. Lecture # 2
ECE 486/586 Computer Architecture Lecture # 2 Spring 2015 Portland State University Recap of Last Lecture Old view of computer architecture: Instruction Set Architecture (ISA) design Real computer architecture:
More informationFast frame memory access method for H.264/AVC
Fast frame memory access method for H.264/AVC Tian Song 1a), Tomoyuki Kishida 2, and Takashi Shimamoto 1 1 Computer Systems Engineering, Department of Institute of Technology and Science, Graduate School
More information3D TECHNOLOGIES: SOME PERSPECTIVES FOR MEMORY INTERCONNECT AND CONTROLLER
3D TECHNOLOGIES: SOME PERSPECTIVES FOR MEMORY INTERCONNECT AND CONTROLLER CODES+ISSS: Special session on memory controllers Taipei, October 10 th 2011 Denis Dutoit, Fabien Clermidy, Pascal Vivet {denis.dutoit@cea.fr}
More informationPhysical Implementation
CS250 VLSI Systems Design Fall 2009 John Wawrzynek, Krste Asanovic, with John Lazzaro Physical Implementation Outline Standard cell back-end place and route tools make layout mostly automatic. However,
More informationJapanese two Samurai semiconductor ventures succeeded in near 3D-IC but failed the business, why? and what's left?
Japanese two Samurai semiconductor ventures succeeded in near 3D-IC but failed the business, why? and what's left? Liquid Design Systems, Inc CEO Naoya Tohyama Overview of this presentation Those slides
More informationASIC Physical Design Top-Level Chip Layout
ASIC Physical Design Top-Level Chip Layout References: M. Smith, Application Specific Integrated Circuits, Chap. 16 Cadence Virtuoso User Manual Top-level IC design process Typically done before individual
More informationFRAME-LEVEL QUALITY AND MEMORY TRAFFIC ALLOCATION FOR LOSSY EMBEDDED COMPRESSION IN VIDEO CODEC SYSTEMS
FRAME-LEVEL QUALITY AD MEMORY TRAFFIC ALLOCATIO FOR LOSSY EMBEDDED COMPRESSIO I VIDEO CODEC SYSTEMS Li Guo, Dajiang Zhou, Shinji Kimura, and Satoshi Goto Graduate School of Information, Production and
More informationCalibrating Achievable Design GSRC Annual Review June 9, 2002
Calibrating Achievable Design GSRC Annual Review June 9, 2002 Wayne Dai, Andrew Kahng, Tsu-Jae King, Wojciech Maly,, Igor Markov, Herman Schmit, Dennis Sylvester DUSD(Labs) Calibrating Achievable Design
More informationMultilevel Memories. Joel Emer Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology
1 Multilevel Memories Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology Based on the material prepared by Krste Asanovic and Arvind CPU-Memory Bottleneck 6.823
More informationUCLA 3D research started in 2002 under DARPA with CFDRC
Coping with Vertical Interconnect Bottleneck Jason Cong UCLA Computer Science Department cong@cs.ucla.edu http://cadlab.cs.ucla.edu/ cs edu/~cong Outline Lessons learned Research challenges and opportunities
More informationLaboratory 6. - Using Encounter for Automatic Place and Route. By Mulong Li, 2013
CME 342 (VLSI Circuit Design) Laboratory 6 - Using Encounter for Automatic Place and Route By Mulong Li, 2013 Reference: Digital VLSI Chip Design with Cadence and Synopsys CAD Tools, Erik Brunvand Background
More informationPackaging Technology for Image-Processing LSI
Packaging Technology for Image-Processing LSI Yoshiyuki Yoneda Kouichi Nakamura The main function of a semiconductor package is to reliably transmit electric signals from minute electrode pads formed on
More informationOVERCOMING THE MEMORY WALL FINAL REPORT. By Jennifer Inouye Paul Molloy Matt Wisler
OVERCOMING THE MEMORY WALL FINAL REPORT By Jennifer Inouye Paul Molloy Matt Wisler ECE/CS 570 OREGON STATE UNIVERSITY Winter 2012 Contents 1. Introduction... 3 2. Background... 5 3. 3D Stacked Memory...
More information6T- SRAM for Low Power Consumption. Professor, Dept. of ExTC, PRMIT &R, Badnera, Amravati, Maharashtra, India 1
6T- SRAM for Low Power Consumption Mrs. J.N.Ingole 1, Ms.P.A.Mirge 2 Professor, Dept. of ExTC, PRMIT &R, Badnera, Amravati, Maharashtra, India 1 PG Student [Digital Electronics], Dept. of ExTC, PRMIT&R,
More informationUnleashing the Power of Embedded DRAM
Copyright 2005 Design And Reuse S.A. All rights reserved. Unleashing the Power of Embedded DRAM by Peter Gillingham, MOSAID Technologies Incorporated Ottawa, Canada Abstract Embedded DRAM technology offers
More informationISSCC 2001 / SESSION 9 / INTEGRATED MULTIMEDIA PROCESSORS / 9.2
ISSCC 2001 / SESSION 9 / INTEGRATED MULTIMEDIA PROCESSORS / 9.2 9.2 A 80/20MHz 160mW Multimedia Processor integrated with Embedded DRAM MPEG-4 Accelerator and 3D Rendering Engine for Mobile Applications
More informationDevelopment of Low Power ISDB-T One-Segment Decoder by Mobile Multi-Media Engine SoC (S1G)
Development of Low Power ISDB-T One-Segment r by Mobile Multi-Media Engine SoC (S1G) K. Mori, M. Suzuki *, Y. Ohara, S. Matsuo and A. Asano * Toshiba Corporation Semiconductor Company, 580-1 Horikawa-Cho,
More informationOutline. SoC Encounter Flow. Typical Backend Design Flow. Digital IC-Project and Verification. Place and Route. Backend ASIC Design flow
Outline Digital IC-Project and Verification Deepak Dasalukunte Backend ASIC Design flow General steps Input files Floorplanning Placement Clock-synthesis Routing Typical Backend Design Flow SoC Encounter
More informationChapter 5B. Large and Fast: Exploiting Memory Hierarchy
Chapter 5B Large and Fast: Exploiting Memory Hierarchy One Transistor Dynamic RAM 1-T DRAM Cell word access transistor V REF TiN top electrode (V REF ) Ta 2 O 5 dielectric bit Storage capacitor (FET gate,
More informationThermal Analysis on Face-to-Face(F2F)-bonded 3D ICs
1/16 Thermal Analysis on Face-to-Face(F2F)-bonded 3D ICs Kyungwook Chang, Sung-Kyu Lim School of Electrical and Computer Engineering Georgia Institute of Technology Introduction Challenges in 2D Device
More informationHigh-Density Integration of Functional Modules Using Monolithic 3D-IC Technology
High-Density Integration of Functional Modules Using Monolithic 3D-IC Technology Shreepad Panth 1, Kambiz Samadi 2, Yang Du 2, and Sung Kyu Lim 1 1 Dept. of Electrical and Computer Engineering, Georgia
More information8Kb Logic Compatible DRAM based Memory Design for Low Power Systems
8Kb Logic Compatible DRAM based Memory Design for Low Power Systems Harshita Shrivastava 1, Rajesh Khatri 2 1,2 Department of Electronics & Instrumentation Engineering, Shree Govindram Seksaria Institute
More informationSYNTHESIS FOR ADVANCED NODES
SYNTHESIS FOR ADVANCED NODES Abhijeet Chakraborty Janet Olson SYNOPSYS, INC ISPD 2012 Synopsys 2012 1 ISPD 2012 Outline Logic Synthesis Evolution Technology and Market Trends The Interconnect Challenge
More informationCAD Technology of the SX-9
KONNO Yoshihiro, IKAWA Yasuhiro, SAWANO Tomoki KANAMARU Keisuke, ONO Koki, KUMAZAKI Masahito Abstract This paper outlines the design techniques and CAD technology used with the SX-9. The LSI and package
More informationLecture Content. 1 Adam Teman, 2018
Lecture Content 1 Adam Teman, 2018 Digital VLSI Design Lecture 6: Moving to the Physical Domain Semester A, 2018-19 Lecturer: Dr. Adam Teman December 24, 2018 Disclaimer: This course was prepared, in its
More informationProASIC PLUS FPGA Family
ProASIC PLUS FPGA Family Key Features Reprogrammable /Nonvolatile Flash Technology Low Power Secure Single Chip/Live at Power Up 1M Equivalent System Gates Cost Effective ASIC Alternative ASIC Design Flow
More informationChapter 0 Introduction
Chapter 0 Introduction Jin-Fu Li Laboratory Department of Electrical Engineering National Central University Jhongli, Taiwan Applications of ICs Consumer Electronics Automotive Electronics Green Power
More informationA LOW-COMPLEXITY AND LOSSLESS REFERENCE FRAME ENCODER ALGORITHM FOR VIDEO CODING
2014 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP) A LOW-COMPLEXITY AND LOSSLESS REFERENCE FRAME ENCODER ALGORITHM FOR VIDEO CODING Dieison Silveira, Guilherme Povala,
More informationMultimedia in Mobile Phones. Architectures and Trends Lund
Multimedia in Mobile Phones Architectures and Trends Lund 091124 Presentation Henrik Ohlsson Contact: henrik.h.ohlsson@stericsson.com Working with multimedia hardware (graphics and displays) at ST- Ericsson
More informationMonolithic 3D IC Design for Deep Neural Networks
Monolithic 3D IC Design for Deep Neural Networks 1 with Application on Low-power Speech Recognition Kyungwook Chang 1, Deepak Kadetotad 2, Yu (Kevin) Cao 2, Jae-sun Seo 2, and Sung Kyu Lim 1 1 School of
More informationedram to the Rescue Why edram 1/3 Area 1/5 Power SER 2-3 Fit/Mbit vs 2k-5k for SRAM Smaller is faster What s Next?
edram to the Rescue Why edram 1/3 Area 1/5 Power SER 2-3 Fit/Mbit vs 2k-5k for SRAM Smaller is faster What s Next? 1 Integrating DRAM and Logic Integrate with Logic without impacting logic Performance,
More informationEfficient VLSI Huffman encoder implementation and its application in high rate serial data encoding
LETTER IEICE Electronics Express, Vol.14, No.21, 1 11 Efficient VLSI Huffman encoder implementation and its application in high rate serial data encoding Rongshan Wei a) and Xingang Zhang College of Physics
More informationDFT-3D: What it means to Design For 3DIC Test? Sanjiv Taneja Vice President, R&D Silicon Realization Group
I N V E N T I V E DFT-3D: What it means to Design For 3DIC Test? Sanjiv Taneja Vice President, R&D Silicon Realization Group Moore s Law & More : Tall And Thin More than Moore: Diversification Moore s
More informationOn the Design of Ultra-High Density 14nm Finfet based Transistor-Level Monolithic 3D ICs
2016 IEEE Computer Society Annual Symposium on VLSI On the Design of Ultra-High Density 14nm Finfet based Transistor-Level Monolithic 3D ICs Jiajun Shi 1,2, Deepak Nayak 1,Motoi Ichihashi 1, Srinivasa
More informationCentip3De: A 64-Core, 3D Stacked, Near-Threshold System
1 1 1 Centip3De: A 64-Core, 3D Stacked, Near-Threshold System Ronald G. Dreslinski David Fick, Bharan Giridhar, Gyouho Kim, Sangwon Seo, Matthew Fojtik, Sudhir Satpathy, Yoonmyung Lee, Daeyeon Kim, Nurrachman
More informationISSN Vol.05, Issue.12, December-2017, Pages:
ISSN 2322-0929 Vol.05, Issue.12, December-2017, Pages:1174-1178 www.ijvdcs.org Design of High Speed DDR3 SDRAM Controller NETHAGANI KAMALAKAR 1, G. RAMESH 2 1 PG Scholar, Khammam Institute of Technology
More informationReduce Your System Power Consumption with Altera FPGAs Altera Corporation Public
Reduce Your System Power Consumption with Altera FPGAs Agenda Benefits of lower power in systems Stratix III power technology Cyclone III power Quartus II power optimization and estimation tools Summary
More informationDesign and Implementation of High Performance DDR3 SDRAM controller
Design and Implementation of High Performance DDR3 SDRAM controller Mrs. Komala M 1 Suvarna D 2 Dr K. R. Nataraj 3 Research Scholar PG Student(M.Tech) HOD, Dept. of ECE Jain University, Bangalore SJBIT,Bangalore
More informationA 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications
A 50Mvertices/s Graphics Processor with Fixed-Point Programmable Vertex Shader for Mobile Applications Ju-Ho Sohn, Jeong-Ho Woo, Min-Wuk Lee, Hye-Jung Kim, Ramchan Woo, Hoi-Jun Yoo Semiconductor System
More informationA Low Power DDR SDRAM Controller Design P.Anup, R.Ramana Reddy
A Low Power DDR SDRAM Controller Design P.Anup, R.Ramana Reddy Abstract This paper work leads to a working implementation of a Low Power DDR SDRAM Controller that is meant to be used as a reference for
More informationTSV Test. Marc Loranger Director of Test Technologies Nov 11 th 2009, Seoul Korea
TSV Test Marc Loranger Director of Test Technologies Nov 11 th 2009, Seoul Korea # Agenda TSV Test Issues Reliability and Burn-in High Frequency Test at Probe (HFTAP) TSV Probing Issues DFT Opportunities
More informationBANDWIDTH REDUCTION SCHEMES FOR MPEG-2 TO H.264 TRANSCODER DESIGN
BANDWIDTH REDUCTION SCHEMES FOR MPEG- TO H. TRANSCODER DESIGN Xianghui Wei, Wenqi You, Guifen Tian, Yan Zhuang, Takeshi Ikenaga, Satoshi Goto Graduate School of Information, Production and Systems, Waseda
More informationA Review Paper on Reconfigurable Techniques to Improve Critical Parameters of SRAM
IJSRD - International Journal for Scientific Research & Development Vol. 4, Issue 09, 2016 ISSN (online): 2321-0613 A Review Paper on Reconfigurable Techniques to Improve Critical Parameters of SRAM Yogit
More informationPicoServer : Using 3D Stacking Technology To Enable A Compact Energy Efficient Chip Multiprocessor
PicoServer : Using 3D Stacking Technology To Enable A Compact Energy Efficient Chip Multiprocessor Taeho Kgil, Shaun D Souza, Ali Saidi, Nathan Binkert, Ronald Dreslinski, Steve Reinhardt, Krisztian Flautner,
More informationNoC Round Table / ESA Sep Asynchronous Three Dimensional Networks on. on Chip. Abbas Sheibanyrad
NoC Round Table / ESA Sep. 2009 Asynchronous Three Dimensional Networks on on Chip Frédéric ric PétrotP Outline Three Dimensional Integration Clock Distribution and GALS Paradigm Contribution of the Third
More informationTutorial 2 Automatic Placement & Routing
Tutorial 2 Automatic Placement & Routing Please follow the instructions found under Setup on the CADTA main page before starting this tutorial. 1.1. Start Encounter Log on to a VLSI server using your EE
More informationDigital IC- Project 1. Place and Route. Oskar Andersson. Oskar Andersson, EIT, LTH, Digital IC project and Verifica=on
Digital IC- Project 1 Oskar Andersson Outline Backend ASIC Design flow (Physical Design) General steps Input files Floorplanning Placement ClockTree- synthesis Rou=ng Typical Backend Design Flow Synthesis
More informationAbbas El Gamal. Joint work with: Mingjie Lin, Yi-Chang Lu, Simon Wong Work partially supported by DARPA 3D-IC program. Stanford University
Abbas El Gamal Joint work with: Mingjie Lin, Yi-Chang Lu, Simon Wong Work partially supported by DARPA 3D-IC program Stanford University Chip stacking Vertical interconnect density < 20/mm Wafer Stacking
More information3D-IC is Now Real: Wide-IO is Driving 3D-IC TSV. Samta Bansal and Marc Greenberg, Cadence EDPS Monterey, CA April 5-6, 2012
3D-IC is Now Real: Wide-IO is Driving 3D-IC TSV Samta Bansal and Marc Greenberg, Cadence EDPS Monterey, CA April 5-6, 2012 What the fuss is all about * Source : ECN Magazine March 2011 * Source : EDN Magazine
More informationESE 570 Cadence Lab Assignment 2: Introduction to Spectre, Manual Layout Drawing and Post Layout Simulation (PLS)
ESE 570 Cadence Lab Assignment 2: Introduction to Spectre, Manual Layout Drawing and Post Layout Simulation (PLS) Objective Part A: To become acquainted with Spectre (or HSpice) by simulating an inverter,
More informationOUTLINE Introduction Power Components Dynamic Power Optimization Conclusions
OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions 04/15/14 1 Introduction: Low Power Technology Process Hardware Architecture Software Multi VTH Low-power circuits Parallelism
More informationAn Automated System for Checking Lithography Friendliness of Standard Cells
An Automated System for Checking Lithography Friendliness of Standard Cells I-Lun Tseng, Senior Member, IEEE, Yongfu Li, Senior Member, IEEE, Valerio Perez, Vikas Tripathi, Zhao Chuan Lee, and Jonathan
More informationIntroduction 1. GENERAL TRENDS. 1. The technology scale down DEEP SUBMICRON CMOS DESIGN
1 Introduction The evolution of integrated circuit (IC) fabrication techniques is a unique fact in the history of modern industry. The improvements in terms of speed, density and cost have kept constant
More informationSMAFTI Package Technology Features Wide-Band and Large-Capacity Memory
SMAFTI Package Technology Features Wide-Band and Large-Capacity Memory KURITA Yoichiro, SOEJIMA Koji, KAWANO Masaya Abstract and NEC Corporation have jointly developed an ultra-compact system-in-package
More informationEECS 598: Integrating Emerging Technologies with Computer Architecture. Lecture 10: Three-Dimensional (3D) Integration
1 EECS 598: Integrating Emerging Technologies with Computer Architecture Lecture 10: Three-Dimensional (3D) Integration Instructor: Ron Dreslinski Winter 2016 University of Michigan 1 1 1 Announcements
More informationAn Infrastructural IP for Interactive MPEG-4 SoC Functional Verification
International Journal on Electrical Engineering and Informatics - Volume 1, Number 2, 2009 An Infrastructural IP for Interactive MPEG-4 SoC Functional Verification Trio Adiono 1, Hans G. Kerkhoff 2 & Hiroaki
More informationHotChips An innovative HD video and digital image processor for low-cost digital entertainment products. Deepu Talla.
HotChips 2007 An innovative HD video and digital image processor for low-cost digital entertainment products Deepu Talla Texas Instruments 1 Salient features of the SoC HD video encode and decode using
More informationDesign of Low Power Wide Gates used in Register File and Tag Comparator
www..org 1 Design of Low Power Wide Gates used in Register File and Tag Comparator Isac Daimary 1, Mohammed Aneesh 2 1,2 Department of Electronics Engineering, Pondicherry University Pondicherry, 605014,
More informationDIRECT Rambus DRAM has a high-speed interface of
1600 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 34, NO. 11, NOVEMBER 1999 A 1.6-GByte/s DRAM with Flexible Mapping Redundancy Technique and Additional Refresh Scheme Satoru Takase and Natsuki Kushiyama
More informationMemory Technologies for the Multimedia Market
Memory Technologies for the Multimedia Market Hitachi Review Vol. 50 (), No. 2 33 Katsuyuki Sato, Ph.D. Yoshikazu Saito Hitoshi Miwa Yasuhiro Kasama OVERVIEW: Different mobile multimedia-oriented products
More informationPOWER REDUCTION IN CONTENT ADDRESSABLE MEMORY
POWER REDUCTION IN CONTENT ADDRESSABLE MEMORY Latha A 1, Saranya G 2, Marutharaj T 3 1, 2 PG Scholar, Department of VLSI Design, 3 Assistant Professor Theni Kammavar Sangam College Of Technology, Theni,
More informationPhysical Placement with Cadence SoCEncounter 7.1
Physical Placement with Cadence SoCEncounter 7.1 Joachim Rodrigues Department of Electrical and Information Technology Lund University Lund, Sweden November 2008 Address for correspondence: Joachim Rodrigues
More informationLow Power using Match-Line Sensing in Content Addressable Memory S. Nachimuthu, S. Ramesh 1 Department of Electrical and Electronics Engineering,
Low Power using Match-Line Sensing in Content Addressable Memory S. Nachimuthu, S. Ramesh 1 Department of Electrical and Electronics Engineering, K.S.R College of Engineering, Tiruchengode, Tamilnadu,
More informationTHE latest generation of microprocessors uses a combination
1254 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 30, NO. 11, NOVEMBER 1995 A 14-Port 3.8-ns 116-Word 64-b Read-Renaming Register File Creigton Asato Abstract A 116-word by 64-b register file for a 154 MHz
More informationISSN Vol.05,Issue.09, September-2017, Pages:
WWW.IJITECH.ORG ISSN 2321-8665 Vol.05,Issue.09, September-2017, Pages:1693-1697 AJJAM PUSHPA 1, C. H. RAMA MOHAN 2 1 PG Scholar, Dept of ECE(DECS), Shirdi Sai Institute of Science and Technology, Anantapuramu,
More informationTutorial for Cadence SOC Encounter Place & Route
Tutorial for Cadence SOC Encounter Place & Route For Encounter RTL-to-GDSII System 13.15 T. Manikas, Southern Methodist University, 3/9/15 Contents 1 Preliminary Setup... 1 1.1 Helpful Hints... 1 2 Starting
More informationSystem Verification of Hardware Optimization Based on Edge Detection
Circuits and Systems, 2013, 4, 293-298 http://dx.doi.org/10.4236/cs.2013.43040 Published Online July 2013 (http://www.scirp.org/journal/cs) System Verification of Hardware Optimization Based on Edge Detection
More informationRethinking On-chip DRAM Cache for Simultaneous Performance and Energy Optimization
Rethinking On-chip DRAM Cache for Simultaneous Performance and Energy Optimization Fazal Hameed and Jeronimo Castrillon Center for Advancing Electronics Dresden (cfaed), Technische Universität Dresden,
More informationA COST-EFFICIENT RESIDUAL PREDICTION VLSI ARCHITECTURE FOR H.264/AVC SCALABLE EXTENSION
A COST-EFFICIENT RESIDUAL PREDICTION VLSI ARCHITECTURE FOR H.264/AVC SCALABLE EXTENSION Yi-Hau Chen, Tzu-Der Chuang, Chuan-Yung Tsai, Yu-Jen Chen, and Liang-Gee Chen DSP/IC Design Lab., Graduate Institute
More informationDigital system (SoC) design for lowcomplexity. Hyun Kim
Digital system (SoC) design for lowcomplexity multimedia processing Hyun Kim SoC Design for Multimedia Systems Goal : Reducing computational complexity & power consumption of state-ofthe-art technologies
More informationMore Course Information
More Course Information Labs and lectures are both important Labs: cover more on hands-on design/tool/flow issues Lectures: important in terms of basic concepts and fundamentals Do well in labs Do well
More informationFPGA Provides Speedy Data Compression for Hyperspectral Imagery
FPGA Provides Speedy Data Compression for Hyperspectral Imagery Engineers implement the Fast Lossless compression algorithm on a Virtex-5 FPGA; this implementation provides the ability to keep up with
More information3D Memory Stacking for Fast Checkpointing/Restore Applications
3D Memory Stacking for Fast Checkpointing/Restore Applications Jing Xie, Xiangyu Dong, Yuan Xie Pennsylvania State University Computer Science and Engineering Department University Park, PA, 682, USA Abstract
More informationMOSAID Semiconductor
MOSAID Semiconductor Fabr-IC (A Single-Chip Gigabit Ethernet Switch With Integrated Memory) @Hot Chips Dave Brown Chief Architect July 4, 2001 Fabr-IC Feature summary 2 Gig ports 1 gig port for stacking
More information