Clock Skew Optimization Considering Complicated Power Modes
Chiao-Ling Lung 1,2, Zi-Yi Zeng 1, Chung-Han Chou 1, Shih-Chieh Chang 1
1 National Tsing-Hua University, HsinChu, Taiwan
2 Industrial Technology Research Institute, HsinChu, Taiwan
cllung0608@gmail.com, zen.ziyi@gmail.com, u942518@oz.nthu.edu.tw, scchang@cs.nthu.edu.tw

Abstract

To conserve energy, designs that use multiple power modes have been widely adopted. However, when a design has many different power modes, clock tree optimization (CTO) becomes very difficult. In this paper, we propose a two-level power-mode-aware CTO methodology. Across all power modes, the chip-level CTO globally reduces clock skew among modules, whereas the module-level CTO reduces clock skew within a single module. Our experimental results show that the power-mode-aware CTO can achieve significant improvement in the worst-case condition with only a minor penalty in area.

Keywords: power modes, clock tree, clock skew

I. INTRODUCTION

Due to technology scaling, the ITRS roadmap 2008 Update [24] predicts that, by 2015, high performance integrated circuits will work with on-chip local clock frequencies up to 8.5 GHz. However, in synchronous design, the performance is limited not only by the speed capability of devices but also by the synchronization ability of data signals. The clock skew, the maximum difference among the clock arrival times of sequential elements, imposes important constraints on the system performance.

Power Mode | MPU  | DSP1 | DSP2
Full Speed | 1.2V | 1.2V | 1.2V
Active1    | 1.2V | 1.2V | 1.0V
Active2    | 1.2V | 1.0V | 1.2V
Suspend    | 1.0V | 1.0V | 1.0V
Inactive   | 1.0V | 0V   | 0V
Figure 1. Industrial example.

Many previous works have concentrated on the problem of clock skew minimization. In [2] [3], clock trees are constructed by zero- or bounded-skew routing. To achieve further skew control, buffer and wire-sizing techniques have been proposed by [4] [6] [18-19].
In order to consider process variation issues, a statistical timing model is used for clock tree optimization [9] [12]. Some researchers [8] [10] use intentional useful-skew scheduling to improve system performance. Special structures such as hybrid structures and clock meshes have been studied in [7] [14-15] [17]. To lower power consumption, [13] suggests a low-power clock scheme that distributes the clock signal at a lower voltage and translates it to a higher voltage at the utilization points. A type-matching method is proposed by [5] to consider the impact of clock gating. Chip-level clock tree synthesis is presented by [16] to construct a clock tree for SoC. A novel clock distribution methodology is presented by [11] to perform dynamic de-skewing during the operation of the chip.

Despite many studies on clock tree optimization, clock skew minimization is still difficult to achieve in advanced power-saving methodologies where many different power modes are used. Take the industrial case shown in Figure 1 as an example. The design has over 40 modules, some of which may operate at 1.2V or 1.0V, or may be completely shut down. The design has a total of 64 power modes to fit various operating requirements. Some power modes are shown in Figure 1. Since the operating voltage has great influence on the delay of a clock buffer, the clock arrival times of FF sinks in a module may vary greatly when the module operates at a different voltage. As a result, it is extremely difficult to implement a single clock network that satisfies the clock skew constraints in all possible power modes.

The difficulty of generating a single clock tree to satisfy clock skew constraints in multiple power modes has been pointed out in several industrial publications [21-23]. One way to resolve the clock skew problem is to adopt the asynchronous design style.
However, an asynchronous design is difficult to verify and requires an additional synchronizer circuit to handle data synchronization. The previous work [21] uses a delay-locked loop (DLL) to synchronize the clock between power domains. As far as we know, none of the previous works have proposed solutions to the problem of clock skew minimization under complicated power modes in a synchronous way.

In this paper, we propose a Power-Mode-Aware CTO framework to resolve the skew issue under complicated power modes. Our framework consists of two major subcomponents: the chip-level CTO and the module-level CTO. The chip-level CTO attempts to reduce the global clock skew in a design among all possible power modes. In contrast, the module-level CTO tries to reduce the local clock skew within a module among all different operating voltages.

In the chip-level CTO, we propose novel power-mode-aware buffers (PMABs) which are inserted into a chip-level clock tree to balance the clock skew among various modules of differing voltage modes. The PMAB is a super buffer with
mode-selection capability. The delays of a PMAB can be adjusted under various mode conditions. In this paper, we present two different ways of implementing a PMAB, both of which attempt to reduce inter-module clock skew. In the module-level CTO, we follow the popular approach [7] [19] of using linear programming to reduce the clock skew. We have used an industrial 65nm technology library to perform a set of experiments and the results are very promising.

The major contributions of this paper are summarized as follows.
- We propose to resolve the clock skew problem due to complicated power modes by using a PMAB whose propagation delay is selected by the voltage mode.
- To reduce the area penalty of a PMAB, we explore the flexibility of designing a PMAB.
- Our methodology can cope with the current design flow.

The rest of this paper is organized as follows. Section II introduces the chip-level CTO. Section III describes the module-level CTO. Then Section IV demonstrates how to implement our framework with a commercial design flow to achieve one-pass clock skew optimization. In Section V, we show experimental results on benchmark circuits. Section VI summarizes our findings to conclude the paper.

II. CHIP-LEVEL CTO

Figure 2. An example of clock tree with PMABs.

This section describes the design of a chip-level CTO which inserts PMABs to balance clock skew among modules. An example of a clock tree with PMABs is given in Figure 2, where triangles stand for PMABs, solid lines represent clock signals and dotted lines are selection signals generated from the power mode controller. In this section, we first present a possible implementation of a PMAB design and then present important lemmas relating to a PMAB. After that, we propose a modified PMAB which has less area cost and better efficiency in clock latency than the original one.

A. PMAB Design

First, we would like to clarify the terms voltage mode and power mode. Throughout this paper, the term voltage mode describes different operating voltages for a module, whereas the term power mode describes different configurations of the operating voltages of modules. For example, in Figure 1 we have five power modes for the design: Full Speed, Active 1, Active 2, Suspend and Inactive. However, for all modules including MPU, DSP1 and DSP2, we have only two voltage modes, 1.2V and 1.0V.

Table I. An example design with four power modes (pm1 to pm4). For each of the modules M1, M2 and M3, the table lists V: Voltage; L: Latest latency; E: Earliest latency.

We now describe the steps for designing a PMAB. In the first step, we analyze and record the global latest clock latency, called $L_{global}$, among all modules of possible voltage modes. Consider the example in Table I where the design has three modules (M1, M2, M3), four power modes (pm1, pm2, pm3, pm4) and two voltage modes (1.2V and 1.0V). In power mode pm4, module M1 operates in voltage mode 1.0V with the latest clock latency of 14. Similarly, we have a latest latency of 9 for M2 operating in 1.2V and a latest latency of 13 for M3 in 1.0V. Among all modules in all voltage modes, the global latest clock latency is $L_{global} = 14$, which is the latest clock latency of module M1 operating in 1.0V. In addition, there is a clock skew of 7 between the latest latency of M1 in 1.0V (14) and the earliest latency of M2 in 1.2V (7). The clock skew of 7 is called the global clock skew of the design and is denoted as $Skew_{global}$.

Figure 3. An example of a PMAB and a clock tree with PMABs: (a) original clock tree; (b) alignment delay; (c) clock tree with PMABs; (d) an example of a PMAB.

Next, for each voltage mode of a module, we calculate the delay to align its latest clock latency with $L_{global}$. The delay to align the clock latency with $L_{global}$ for a module m in a voltage mode v is called the alignment delay of module m in voltage mode v and is denoted as $\Delta_{m,v}$.
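The alignment-delay definition above can be checked with a few lines of Python. This is only an illustrative sketch: the latency values are the ones quoted in the text for Table I (the entry for M3 at 1.2V is derived from its stated alignment delay of 3), while the function name `alignment_delays` and the dictionary layout are assumptions made for this example.

```python
# Latest clock latency per (module, voltage mode), in the paper's abstract
# delay units. Values are those quoted in the text for Table I; the
# ("M3", "1.2V") entry is derived from its stated alignment delay (14 - 3).
latest_latency = {
    ("M1", "1.0V"): 14,
    ("M2", "1.2V"): 9,
    ("M3", "1.0V"): 13,
    ("M3", "1.2V"): 11,
}


def alignment_delays(latest):
    """Return (L_global, delays) where delays[key] = L_global - latest[key]."""
    l_global = max(latest.values())
    return l_global, {key: l_global - lat for key, lat in latest.items()}


l_global, delays = alignment_delays(latest_latency)
print("L_global =", l_global)
for (module, voltage), delay in sorted(delays.items()):
    print(module, voltage, "alignment delay =", delay)
```

Each delay is non-negative by construction, because $L_{global}$ is the maximum over all entries.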
In the same example, the latest clock latency of module M2 in 1.2V is 9. To align with $L_{global}$ (14), we need a delay of 5 (= 14 - 9) so that the latest clock latency will be the same as $L_{global}$. Therefore, we say that the
alignment delay of 1.2V for module M2 is $\Delta_{M2,1.2} = 5$. As another example, the alignment delay of 1.0V for module M3 is $\Delta_{M3,1.0} = 1$ (= 14 - 13).

For a module, we can calculate the alignment delays of all its voltage modes. Then, we design the PMAB of a module as a tunable delay element which uses the voltage mode as the select signal to select the corresponding alignment delay. In the same example, the PMAB for module M3 has two voltage modes, 1.2V and 1.0V. The alignment delay of 1.2V is $\Delta_{M3,1.2} = 3$ and the alignment delay of 1.0V is $\Delta_{M3,1.0} = 1$. The PMAB of module M3 can be designed using a MUX which has the voltage mode as the select signal and two delay buffers, with a delay of 3 for 1.2V and a delay of 1 for 1.0V, as shown in Figure 3(b) and Figure 3(d). After PMAB insertion, we can reduce $Skew_{global}$ from 7 to 4. Figure 3(a) shows the original clock tree and Figure 3(c) shows a clock tree with PMABs.

B. Characteristics of a PMAB

With the insertion of the PMAB for a module, we can align the latest clock latency of the module in each voltage mode to $L_{global}$. As a result, after inserting PMABs, we have the important property that the latest clock latency of each module is equal to $L_{global}$ for any given voltage mode. We have the following lemmas.

Lemma 1: After inserting a PMAB, the clock latencies within a module vary at the same pace; in other words, the clock skew within a module does not change.

Informal proof: Since the sequential elements in a module are driven through the same PMAB, no matter how much delay is padded by the PMAB, the clock latencies in the same module increase by the same quantity every time. Q.E.D.

Lemma 2: After inserting PMABs, we can obtain the optimal global clock skew of a design. The optimal global clock skew equals the maximal local clock skew of all modules among voltage modes.

Informal proof: According to Lemma 1, the local clock skew of a module cannot be improved by a PMAB.
As a result, the best possible global clock skew which can be achieved is the largest local clock skew. Q.E.D.

The above lemmas state that the use of PMABs allows us to neglect the inter-module clock skew. Thus, we need only to focus on the reduction of clock skew within a module. In the same example in Table I, among all local clock skews, the largest local clock skew is 4 when module M1 operates in 1.0V. In general, without PMABs, the global clock skew can be larger than the largest local clock skew of 4. However, Lemma 2 states that after inserting PMABs, the global clock skew is equal to the largest local clock skew of 4.

Table II. Symbols Definition
Symbol          | Description                                                                    | Example (from Table I)
$L_{global}$    | The maximal latest clock latency among all modules of possible voltage modes   | $L_{global} = 14$
$E_{global}$    | The minimal earliest clock latency among all modules of possible voltage modes | $E_{global} = 7$
$Skew_{global}$ | The difference between $L_{global}$ and $E_{global}$                           | $Skew_{global} = 7$
$Skew_{local}$  | The maximal local skew within a module                                         | $Skew_{local} = 4$
$L_{local}$     | The latest clock latency of the module with $Skew_{local}$                     | $L_{local} = 14$
$E_{local}$     | The earliest clock latency of the module with $Skew_{local}$                   | $E_{local} = 10$

Figure 4. The alignment delays of the case listed in Table I: (a) original; (b) PMAB; (c) modified PMAB.

C. A modified PMAB design

The PMAB design described above tries to align the latest clock latencies of all modules to $L_{global}$. In this section, we show that despite the simplicity of a PMAB design, the restriction of aligning only to $L_{global}$ is unnecessary in certain power modes and may cause a large area penalty. We now present a modified PMAB design that relaxes this restriction while still maintaining the good properties of Lemmas 1 and 2. Before the discussion of a modified PMAB, we need new definitions of symbols.
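The symbols summarized in Table II, and the claim of Lemma 2, can be reproduced numerically. The following Python sketch is an illustration under stated assumptions: it uses only the three (earliest, latest) latency pairs that the text quotes for Table I, and the helper names `symbols` and `insert_pmabs` are hypothetical, introduced here for the example.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]    # (module, voltage mode)
Range = Tuple[int, int]  # (earliest latency E, latest latency L)

# (E, L) pairs quoted in the text for the Table I example.
ranges: Dict[Key, Range] = {
    ("M1", "1.0V"): (10, 14),
    ("M2", "1.2V"): (7, 9),
    ("M3", "1.0V"): (11, 13),
}


def symbols(r: Dict[Key, Range]) -> Dict[str, int]:
    """Compute the Table II symbols from per-module latency ranges."""
    l_global = max(l for _, l in r.values())
    e_global = min(e for e, _ in r.values())
    # The module/voltage pair with the largest local skew defines the locals.
    worst = max(r, key=lambda k: r[k][1] - r[k][0])
    e_local, l_local = r[worst]
    return {
        "L_global": l_global, "E_global": e_global,
        "Skew_global": l_global - e_global,
        "L_local": l_local, "E_local": e_local,
        "Skew_local": l_local - e_local,
    }


def insert_pmabs(r: Dict[Key, Range]) -> Dict[Key, Range]:
    """Shift every range so that its latest latency equals L_global."""
    l_global = max(l for _, l in r.values())
    return {k: (e + (l_global - l), l_global) for k, (e, l) in r.items()}


before = symbols(ranges)
after = symbols(insert_pmabs(ranges))
print(before)
print(after)
```

Shifting a whole range by one constant leaves its width unchanged (Lemma 1), so after alignment the global skew collapses to the widest local range (Lemma 2).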
First, among all modules in all voltage modes, we say that the maximal local clock skew is $Skew_{local}$ and its corresponding earliest and latest clock latencies are $E_{local}$ and $L_{local}$, i.e., $Skew_{local} = L_{local} - E_{local}$. Then, as with $L_{global}$, we define a new symbol $E_{global}$, which is the global earliest clock latency among all modules of possible voltage modes. We summarize all symbols in Table II. For example, in Table I, the largest local clock skew within a module, $Skew_{local}$, is 4 when M1 operates in 1.0V with $E_{local} = 10$ and $L_{local} = 14$. In addition, $Skew_{global}$ is 7, where M2 operating in 1.2V gives $E_{global} = 7$ and M1 operating in 1.0V gives $L_{global} = 14$.

According to Lemma 2, after PMAB insertion, we have $Skew_{global} = Skew_{local}$, $E_{global} = E_{local}$, and $L_{global} = L_{local}$. As a
result, we need only to make sure all other clock latencies are located between $E_{global}$ and $L_{global}$. Based on this observation, we have the flexibility of assigning the delays of a PMAB to be anywhere within this range while still achieving the optimal clock skew. With the flexible delay assignment, we can reduce the area for designing a PMAB.

Figure 4 shows the clock latency and skew information for the example shown in Table I. The solid bar represents a range from the earliest clock latency to the latest clock latency, and the dashed bar represents the alignment delay for each module in each voltage mode. The double-headed arrow represents the global clock skew, and the dashed arrow represents the skew improvement. Figure 4(a) illustrates the original clock latency and skew information before PMAB insertion, and Figure 4(b) shows the result after PMAB insertion, where all latest clock latencies have been aligned to $L_{global}$ of 14. A modified PMAB may have the clock latencies shown in Figure 4(c). All of them are within the range but do not align to the latest one. Take module M3 in 1.0V in Figure 4(a) as an example: the latest clock latency of 13 is less than $L_{local}$ of 14, and the earliest clock latency of 11 is greater than $E_{local}$ of 10. For a modified PMAB, we can assign $\Delta_{M3,1.0} = 0$ and keep the clock skew unchanged. A delay of $\Delta_{M3,1.0} = 0$ means that there is no need for a delay buffer. On the other hand, the delay $\Delta_{M1,1.2}$ can be anywhere in the range from 1 to 3 without affecting the optimal clock skew.

We now show that under different conditions among $L_{global}$, $L_{local}$, $E_{global}$ and $E_{local}$, we need to use different formulations to calculate the flexibility of alignment delays. Since $L_{local} \le L_{global}$ and $E_{local} \ge E_{global}$ always hold, the possible conditions fall into exactly four types:

Type 1. $L_{local} = L_{global}$ and $E_{local} = E_{global}$
Type 2. $L_{local} < L_{global}$ and $E_{local} > E_{global}$
Type 3. $L_{local} < L_{global}$ and $E_{local} = E_{global}$
Type 4. $L_{local} = L_{global}$ and $E_{local} > E_{global}$

    delay_assignment {
      case (Type = 1)
        do nothing
      case (Type = 2 or 3) {
        Δ_local = L_global - L_local
        E_local = E_local + Δ_local
        foreach (module m)
          foreach (operating voltage v)
            if (E_m,v < E_local) then
              Δ_m,v = E_local - E_m,v
      }
      case (Type = 4) {
        foreach (module m)
          foreach (operating voltage v)
            if (E_m,v < E_local) then
              Δ_m,v = E_local - E_m,v
      }
    }
Figure 5. Pseudo code of delay assignment.

The procedures to calculate alignment delays for each type are described in Figure 5 and the complexity is O(kN), where k is the number of voltage modes and N is the number of modules.

III. MODULE-LEVEL CTO

The purpose of the module-level CTO is to build a clock tree which has the smallest skew possible within a module. In our framework, we utilize a linear programming methodology similar to [7] [19], which is commonly used for clock skew minimization. We derive an LP formulation whose goal is to minimize the maximum clock skew within a module. Our LP formulation consists of two categories of LP constraints: clock path constraints and clock skew constraints. The clock path constraints describe the delay of a clock path by summing up the delays of buffers and wires on the clock path. The clock skew constraints calculate the maximum clock skew.

Inputs:
1. An initial buffered clock tree topology $T$
2. $d_i$, the delay of $b_i$, $i \in \{1,\dots,N\}$
3. $w_i$, the delay of the wire between $b_i$ and its parent, $i \in \{1,\dots,N\}$
4. $P_j$, the set of buffers from the clock source to $s_j$, $j \in \{1,\dots,M\}$
Decision variables: $\Delta d_i$, $i \in \{1,\dots,N\}$
Objective function: minimize $skew$
Subject to:
$$a_j = \sum_{i \in P_j} (w_i + d_i + \Delta d_i), \quad j \in \{1,\dots,M\} \qquad \text{(clock path constraints)}$$
$$a_{max} \ge a_j, \quad a_{min} \le a_j, \quad j \in \{1,\dots,M\} \qquad \text{(clock skew constraints)}$$
$$skew = a_{max} - a_{min}$$
Outputs:
1. Optimal latency $at_j$ of $s_j$, $j \in \{1,\dots,M\}$
2. Optimal delay $dt_i$ of $b_i$, $i \in \{1,\dots,N\}$
Figure 6. LP formulation.

Given an initial clock tree $T$ with $N$ buffers and $M$ sinks, the LP formulation can be stated as in Figure 6, where $b_i$ and $s_j$ denote the i-th buffer and the j-th sink on the clock tree; $d_i$ and $dt_i$ are the delay and target delay of $b_i$; $a_j$ and $at_j$ are the clock latency and target clock latency of $s_j$; $a_{max}$ and $a_{min}$ are the maximum and minimum clock latency; and $skew$ is the maximum skew.

Although the LP formulation can provide an optimal clock skew, an exact solution requires a rich set of delay buffers with various delay values. However, only a limited range of buffer sizes is available in a library. Traditionally, a mapping stage has been required to map a delay solution from an LP to a buffer with the closest delay. We found that the optimal delay for those buffers whose positions are not on the critical paths can be stated as a range that still achieves the optimal clock skew. This observation provides more flexibility when mapping the LP's solution to library cells.
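To make the clock path and clock skew constraints of Figure 6 concrete, the Python sketch below evaluates them for a given choice of the decision variables $\Delta d_i$. It is not an LP solver and not the authors' implementation; the three-buffer tree, its delay values, and the function names `arrival_times` and `skew` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical tree: root buffer b1 drives b2 and b3; sink s1 hangs off b2,
# sink s2 hangs off b3. All numbers are made-up delays in arbitrary units.
paths: Dict[str, List[int]] = {"s1": [1, 2], "s2": [1, 3]}  # P_j
d: Dict[int, float] = {1: 2.0, 2: 3.0, 3: 1.0}             # buffer delays d_i
w: Dict[int, float] = {1: 0.5, 2: 0.5, 3: 1.0}             # wire delays w_i


def arrival_times(paths, d, w, delta):
    """Clock path constraints: a_j = sum over i in P_j of (w_i + d_i + delta_i)."""
    return {
        sink: sum(w[i] + d[i] + delta.get(i, 0.0) for i in buffers)
        for sink, buffers in paths.items()
    }


def skew(arrivals):
    """Clock skew constraints: skew = a_max - a_min."""
    return max(arrivals.values()) - min(arrivals.values())


initial = arrival_times(paths, d, w, delta={})
padded = arrival_times(paths, d, w, delta={3: 1.5})  # slow down b3 by 1.5
print(initial, skew(initial))
print(padded, skew(padded))
```

An LP solver would search over the `delta` values instead of taking them as input. The evaluator is still useful on its own for checking a mapped solution after the LP delays have been rounded to library buffers.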
IV. OUR FRAMEWORK

To automate the methodology, our framework is built to work with a commercial design flow. We use PrimeTime as the static timing analysis engine. In addition, since interconnect delay becomes an increasingly large component of the total delay in advanced technologies, it should also be considered. The Standard Parasitic Exchange Format (SPEF) [25], the widely adopted format which records wire resistance and capacitance, is used in our framework to take interconnect delay into account.

Figure 7. Experimental Flow.

Our experimental flow is shown in Figure 7. First, the clock trees of all modules are generated by the tool SOC Encounter with level shifters inserted. Second, we extract the clock tree structure and the interconnect information generated by SOC Encounter, where the interconnect information is recorded in SPEF with wire resistance and capacitance.

The module-level CTO is performed as follows. We use PrimeTime to extract clock latency and skew information, and to generate linear programming constraints. The linear programming constraints are solved by lpsolve_5.5. Our delay-mapping algorithm uses the result of the LP to generate the final clock tree for each module.

After finishing the module-level CTO, we then perform the chip-level CTO. We insert a PMAB for each module. Utilizing the clock latency information, we determine the alignment delays. During the construction of a PMAB, an alignment delay is formed by a buffer chain in which the buffers have been selected from industrial technology libraries. Finally, we generate the new design with PMABs inserted, and the report of clock information.

V. EXPERIMENTAL RESULTS

We have implemented our approach as shown in Figure 7, and applied the approach to a large industrial design with more than 56 power modes. To test more designs, we also created a set of new designs consisting of two or three modules instantiated from ISCAS89 benchmark circuits.
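The alignment-delay step of the chip-level CTO follows the delay assignment of Figure 5. A minimal Python sketch of that procedure is given below, assuming each module and voltage mode is described by its (earliest, latest) clock latency; the function names and data layout are hypothetical, and the input values are the ones quoted in the text for the Table I example.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]    # (module, voltage mode)
Range = Tuple[int, int]  # (earliest latency E, latest latency L)


def delay_assignment(ranges: Dict[Key, Range]) -> Tuple[int, Dict[Key, int]]:
    """Return (type, alignment delays) following the four cases of Figure 5."""
    l_global = max(l for _, l in ranges.values())
    e_global = min(e for e, _ in ranges.values())
    # The pair with the largest local skew defines E_local and L_local.
    worst = max(ranges, key=lambda k: ranges[k][1] - ranges[k][0])
    e_local, l_local = ranges[worst]

    if l_local == l_global:
        kind = 1 if e_local == e_global else 4
    else:
        kind = 3 if e_local == e_global else 2

    delays = {k: 0 for k in ranges}
    if kind == 1:
        return kind, delays  # every range already lies inside the window
    if kind in (2, 3):
        e_local += l_global - l_local  # slide the window up to L_global
    for k, (e, _) in ranges.items():
        if e < e_local:
            delays[k] = e_local - e  # pad only ranges that start too early
    return kind, delays


def global_skew(ranges: Dict[Key, Range], delays: Dict[Key, int]) -> int:
    """Global skew after each range is shifted by its alignment delay."""
    shifted = [(e + delays[k], l + delays[k]) for k, (e, l) in ranges.items()]
    return max(l for _, l in shifted) - min(e for e, _ in shifted)


pm4 = {("M1", "1.0V"): (10, 14), ("M2", "1.2V"): (7, 9), ("M3", "1.0V"): (11, 13)}
kind, delays = delay_assignment(pm4)
print(kind, delays, global_skew(pm4, delays))
```

For these inputs only M2 needs padding, so the total inserted delay is smaller than when every latest latency is aligned to $L_{global}$, while the resulting global skew is the same.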
Each new circuit is assumed to have two voltage modes, 1.32V and 0.9V. The initial clock tree given to our approach is constructed as follows. We first use Design Compiler to map all circuits to an industrial 65nm technology library and use SOC Encounter to perform placement, clock tree synthesis and routing. We obtain the initial clock tree from SOC Encounter assuming that all modules operate at the high voltage, because timing is normally critical in this power mode. We ran all experiments on a Linux workstation with a 2.8 GHz CPU and 4 GB of memory.

The experimental results are shown in Table III. Columns one to three show the name of the circuit, the total number of sequential elements (FF), and the number of power modes (PM) in a circuit, respectively. Columns four to seven show the worst clock skew over all power modes of SOC Encounter (SOCE), PMAB, and modified PMAB (mpmab), and the skew improvement of mpmab compared with SOCE (in %), respectively. Columns eight to eleven show the average clock skew over all power modes of SOCE, PMAB, and mpmab, and the skew improvement of mpmab compared with SOCE (in %), respectively. Columns twelve to fifteen show the worst clock latency of SOCE, PMAB, and mpmab, and the latency overhead of mpmab compared with SOCE, respectively. Columns sixteen to eighteen show the area overhead of PMAB and mpmab, and the area overhead improvement of mpmab compared with PMAB. Finally, column nineteen shows the runtime of mpmab.

For the case of IND1 in Table III, after applying modified PMAB the worst clock skew becomes 163.7ps. Compared with SOC Encounter, our approach achieves a 66.94% improvement in the worst clock skew and a 65.75% improvement in the average clock skew. In this case, the clock latency penalty due to the PMAB is 36.78ps and the area overhead of the PMAB is only 0.05% of the total cell area, which does not include the routing overhead.
In addition, compared with PMAB, modified PMAB reduces the area overhead by 16.67% while keeping the skew unchanged. The clock latency distributions after applying PMAB and after applying modified PMAB are shown in Figure 8.

On average, both PMAB and modified PMAB improve the worst clock skew by 74%, whereas the average worst latency penalty of modified PMAB is 39.26ps. Furthermore, the average area overheads of PMAB and modified PMAB are 0.16% and 0.12%. Although the worst clock skew and worst clock latency of modified PMAB are as good as those of PMAB, the average latency overhead and area overhead of modified PMAB are lower. Our experimental results show that, compared with PMAB, modified PMAB reduces the latency overhead by 16.41% and the area overhead by 25.61% on average. Furthermore, there are minor differences between a PMAB and the corresponding modified PMAB in the columns for the worst clock skew, the average clock skew and the worst latency. These differences are caused by mapping inaccuracy, since the target delays must be implemented with available delay buffers.

VI. CONCLUSIONS

In this paper, we have proposed efficient ways to optimize clock skew considering the complicated power modes in an SoC design. Our methodology consists of a chip-level CTO and a module-level CTO. We also present a flow that adapts our methodology to a current design flow. Our experiments show that both the PMAB and modified PMAB approaches dramatically improve
the clock skew while incurring very little additional area overhead for designs with complicated power modes. Compared with PMAB, the modified PMAB approach utilizes less area and latency, while still maintaining the quality of results.

Table III. Experimental results
Circuit | Area overhead (mpmab) | Area overhead improvement over PMAB | Runtime (s)
IND1    | 0.05%                 | 16.67%                              | 1934
case1   | 0.10%                 | 37.50%                              | 33
case2   | 0.09%                 | 25.00%                              | 27
case3   | 0.08%                 | 27.27%                              | 31
case4   | 0.09%                 | 25.00%                              | 26
case5   | 0.28%                 | 22.22%                              | 7
Avg.    | 0.12%                 | 25.61%                              | 343
Average improvement of mpmab over SOCE in average clock skew: 63.92%.

Figure 8. The experimental result of IND1: (a) original; (b) PMAB; (c) modified PMAB.

REFERENCES

[1] P. Ampadu, "Ultra-low voltage VLSI: are we there yet?", in Proc. of ISCAS, 2006.
[2] K. D. Boese and A. B. Kahng, "Zero-skew clock routing trees with minimum wirelength," in Proc. of IEEE 5th Int. ASIC Conf.
[3] T. H. Chao, Y. C. Hsu, J. M. Ho, K. D. Boese and A. B. Kahng, "Zero skew clock routing with minimum wire length," in IEEE Trans. on Circuits Systems, vol. 39.
[4] C. C. N. Chu and D. F. Wong, "An efficient and optimal algorithm for simultaneous buffer and wire sizing," in IEEE Trans. on Computer-Aided Design, vol. 18, Sept.
[5] C. M. Chang, S. H. Huang, Y. K. Ho, J. Z. Lin, H. P. Wang and Y. S. Lu, "Type-matching clock tree for zero skew clock gating," in Proc. of DAC, 2008.
[6] J. Cong and K. S. Leung, "Optimal wiresizing under the distributed Elmore delay model," in IEEE Trans. on CAD, vol. 14, Mar.
[7] M. P. Desai, R. Cvijetic, and J. Jensen, "Sizing of clock distribution networks for high performance CPU chips," in Proc. of DAC.
[8] E. G. Friedman, "Clock distribution networks in synchronous digital integrated circuits," in Proc. IEEE, vol. 89, May.
[9] M. Hashimoto, T. Yamamoto, and H. Onodera, "Statistical analysis of clock skew variation in H-tree structure," in Proc. of ISQED.
[10] J. L. Neves and E. G. Friedman, "Optimal clock skew scheduling tolerant to process variations," in Proc. of DAC, June 1996.
[11] P. Mahoney, E. Fetzer, B. Doyle and S. Naffziger, "Clock distribution on a dual-core, multi-threaded Itanium-family processor," in IEEE ISSCC.
[12] U. Padmanabhan, Janet M. Wang, J. Hu, "Statistical clock tree routing for robustness to process variations," in Proc. of ISPD.
[13] J. Pangjun, S. S. Sapatnekar, "Low-power clock distribution using multiple voltages and reduced swings," in IEEE Trans. on VLSI, vol. 10, Jun.
[14] S. Pullela, N. Menezes and L. T. Pillage, "Reliable non-zero skew clock tree using wire width optimization," in Proc. of DAC.
[15] A. Rajaram, J. Hu, R. Mahapatra, "Reducing clock skew variability via cross links," in Proc. of DAC, June 2004.
[16] A. Rajaram and D. Z. Pan, "Robust chip-level clock tree synthesis for SOC designs," in Proc. of DAC, 2008.
[17] H. Su and S. S. Sapatnekar, "Hybrid structured clock network construction," in Proc. of ICCAD, 2001.
[18] J. L. Tsai, T. H. Chen, and C. C. Chen, "Zero skew clock-tree optimization with buffer insertion/sizing and wire sizing," in IEEE Trans. on CAD, vol. 23, April.
[19] K. Wang, Y. Ran, and M. Marek-Sadowska, "General skew constrained clock network sizing based on sequential linear programming," in IEEE Trans. on CAD, vol. 24, May.
[20] Q. Zhu and W. W. M. Dai, "High-speed clock network sizing optimization based on distributed RC and lossy RLC interconnect models," in IEEE Trans. on CAD, vol. 15, Sep.
[21] pdf
[22] "A practical guide to low-power design," Power Forward Initiative (PFI).
[23] pdf
[24] International Technology Roadmap for Semiconductors (ITRS), 2007 Edition.
[25] IEEE 1481 Standard for Integrated Circuit (IC) Delay and Power Calculation System.
More information[14] M. A. B. Jackson, A. Srinivasan and E. S. Kuh, Clock routing for high-performance ICs, 27th ACM
Journal of High Speed Electronics and Systems, pp65-81, 1996. [14] M. A. B. Jackson, A. Srinivasan and E. S. Kuh, Clock routing for high-performance ICs, 27th ACM IEEE Design AUtomation Conference, pp.573-579,
More informationParallel-computing approach for FFT implementation on digital signal processor (DSP)
Parallel-computing approach for FFT implementation on digital signal processor (DSP) Yi-Pin Hsu and Shin-Yu Lin Abstract An efficient parallel form in digital signal processor can improve the algorithm
More informationA Novel Performance-Driven Topology Design Algorithm
A Novel Performance-Driven Topology Design Algorithm Min Pan, Chris Chu Priyadarshan Patra Electrical and Computer Engineering Dept. Intel Corporation Iowa State University, Ames, IA 50011 Hillsboro, OR
More informationFloorplan considering interconnection between different clock domains
Proceedings of the 11th WSEAS International Conference on CIRCUITS, Agios Nikolaos, Crete Island, Greece, July 23-25, 2007 115 Floorplan considering interconnection between different clock domains Linkai
More informationOn GPU Bus Power Reduction with 3D IC Technologies
On GPU Bus Power Reduction with 3D Technologies Young-Joon Lee and Sung Kyu Lim School of ECE, Georgia Institute of Technology, Atlanta, Georgia, USA yjlee@gatech.edu, limsk@ece.gatech.edu Abstract The
More informationMulti-Corner Multi-Voltage Domain Clock Mesh Design
Multi-Corner Multi-Voltage Domain Clock Mesh Design Can Sitik Electrical and Computer Engineering Drexel University Philadelphia, PA, 19104 USA E-mail: as3577@drexel.edu Baris Taskin Electrical and Computer
More informationCrosstalk Noise Optimization by Post-Layout Transistor Sizing
Crosstalk Noise Optimization by Post-Layout Transistor Sizing Masanori Hashimoto hasimoto@i.kyoto-u.ac.jp Masao Takahashi takahasi@vlsi.kuee.kyotou.ac.jp Hidetoshi Onodera onodera@i.kyoto-u.ac.jp ABSTRACT
More informationCircuit Model for Interconnect Crosstalk Noise Estimation in High Speed Integrated Circuits
Advance in Electronic and Electric Engineering. ISSN 2231-1297, Volume 3, Number 8 (2013), pp. 907-912 Research India Publications http://www.ripublication.com/aeee.htm Circuit Model for Interconnect Crosstalk
More informationSimultaneous OPC- and CMP-Aware Routing Based on Accurate Closed-Form Modeling
Simultaneous OPC- and CMP-Aware Routing Based on Accurate Closed-Form Modeling Shao-Yun Fang, Chung-Wei Lin, Guang-Wan Liao, and Yao-Wen Chang March 26, 2013 Graduate Institute of Electronics Engineering
More informationA Novel Framework for Multilevel Full-Chip Gridless Routing
A Novel Framework for Multilevel Full-Chip Gridless Routing Tai-Chen Chen Yao-Wen Chang Shyh-Chang Lin Graduate Institute of Electronics Engineering Graduate Institute of Electronics Engineering SpringSoft,
More informationInterconnect Delay and Area Estimation for Multiple-Pin Nets
Interconnect Delay and Area Estimation for Multiple-Pin Nets Jason Cong and David Z. Pan UCLA Computer Science Department Los Angeles, CA 90095 Sponsored by SRC and Avant!! under CA-MICRO Presentation
More informationFine-Grained Sleep Transistor Sizing Algorithm for Leakage Power Minimization
6.1 Fine-Grained Sleep Transistor Sizing Algorithm for Leakage Power Minimization De-Shiuan Chiou, Da-Cheng Juan, Yu-Ting Chen, and Shih-Chieh Chang Department of CS, National Tsing Hua University, Hsinchu,
More informationS 1 S 2. C s1. C s2. S n. C sn. S 3 C s3. Input. l k S k C k. C 1 C 2 C k-1. R d
Interconnect Delay and Area Estimation for Multiple-Pin Nets Jason Cong and David Zhigang Pan Department of Computer Science University of California, Los Angeles, CA 90095 Email: fcong,pang@cs.ucla.edu
More informationDESIGN AND PERFORMANCE ANALYSIS OF CARRY SELECT ADDER
DESIGN AND PERFORMANCE ANALYSIS OF CARRY SELECT ADDER Bhuvaneswaran.M 1, Elamathi.K 2 Assistant Professor, Muthayammal Engineering college, Rasipuram, Tamil Nadu, India 1 Assistant Professor, Muthayammal
More informationPostgrid Clock Routing for High Performance Microprocessor Designs
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 31, NO. 2, FEBRUARY 2012 255 Postgrid Clock Routing for High Performance Microprocessor Designs Haitong Tian, Wai-Chung
More informationThree DIMENSIONAL-CHIPS
IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) ISSN: 2278-2834, ISBN: 2278-8735. Volume 3, Issue 4 (Sep-Oct. 2012), PP 22-27 Three DIMENSIONAL-CHIPS 1 Kumar.Keshamoni, 2 Mr. M. Harikrishna
More informationTiming-Constrained I/O Buffer Placement for Flip- Chip Designs
Timing-Constrained I/O Buffer Placement for Flip- Chip Designs Zhi-Wei Chen 1 and Jin-Tai Yan 2 1 College of Engineering, 2 Department of Computer Science and Information Engineering Chung-Hua University,
More informationFast Dual-V dd Buffering Based on Interconnect Prediction and Sampling
Based on Interconnect Prediction and Sampling Yu Hu King Ho Tam Tom Tong Jing Lei He Electrical Engineering Department University of California at Los Angeles System Level Interconnect Prediction (SLIP),
More informationOptimal Prescribed-Domain Clock Skew Scheduling
Optimal Prescribed-Domain Clock Skew Scheduling Li Li, Yinghai Lu, Hai Zhou Electrical Engineering and Computer Science Northwestern University 6B-4 Abstract Clock skew scheduling is an efficient technique
More informationEfficient Test Compaction for Combinational Circuits Based on Fault Detection Count-Directed Clustering
Efficient Test Compaction for Combinational Circuits Based on Fault Detection Count-Directed Clustering Aiman El-Maleh, Saqib Khurshid King Fahd University of Petroleum and Minerals Dhahran, Saudi Arabia
More informationA Novel Design of High Speed and Area Efficient De-Multiplexer. using Pass Transistor Logic
A Novel Design of High Speed and Area Efficient De-Multiplexer Using Pass Transistor Logic K.Ravi PG Scholar(VLSI), P.Vijaya Kumari, M.Tech Assistant Professor T.Ravichandra Babu, Ph.D Associate Professor
More informationEfficient Static Timing Analysis Using a Unified Framework for False Paths and Multi-Cycle Paths
Efficient Static Timing Analysis Using a Unified Framework for False Paths and Multi-Cycle Paths Shuo Zhou, Bo Yao, Hongyu Chen, Yi Zhu and Chung-Kuan Cheng University of California at San Diego La Jolla,
More informationAn Interconnect-Centric Design Flow for Nanometer Technologies
An Interconnect-Centric Design Flow for Nanometer Technologies Jason Cong UCLA Computer Science Department Email: cong@cs.ucla.edu Tel: 310-206-2775 URL: http://cadlab.cs.ucla.edu/~cong Exponential Device
More informationRetiming and Clock Scheduling for Digital Circuit Optimization
184 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 21, NO. 2, FEBRUARY 2002 Retiming and Clock Scheduling for Digital Circuit Optimization Xun Liu, Student Member,
More informationA Global-Local Optimization Framework for Simultaneous Multi-Mode Multi-Corner Clock Skew Variation Reduction
A Global-Local Optimization Framework for Simultaneous Multi-Mode Multi-Corner Clock Skew Variation Reduction Kwangsoo Han, Andrew B. Kahng, Jongpil Lee, Jiajia Li and Siddhartha Nath CSE and ECE Departments,
More information8D-3. Experiences of Low Power Design Implementation and Verification. Shi-Hao Chen. Jiing-Yuan Lin
Experiences of Low Power Design Implementation and Verification Shi-Hao Chen Global Unichip Corp. Hsin-Chu Science Park, Hsin-Chu, Taiwan 300 +886-3-564-6600 hockchen@globalunichip.com Jiing-Yuan Lin Global
More informationNanometer technologies enable higher-frequency designs
By Ron Press & Jeff Boyer Easily Implement PLL Clock Switching for At-Speed Test By taking advantage of pattern-generation features, a simple logic design can utilize phase-locked-loop clocks for accurate
More informationAn Efficient Algorithm For RLC Buffer Insertion
An Efficient Algorithm For RLC Buffer Insertion Zhanyuan Jiang, Shiyan Hu, Jiang Hu and Weiping Shi Texas A&M University, College Station, Texas 77840 Email: {jerryjiang, hushiyan, jianghu, wshi}@ece.tamu.edu
More informationDUE to the high computational complexity and real-time
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 15, NO. 3, MARCH 2005 445 A Memory-Efficient Realization of Cyclic Convolution and Its Application to Discrete Cosine Transform Hun-Chen
More informationINTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY
IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY Clock Power Reduction Using Merged Flip Flops Technique S.Murugan ME VLSI Design, SCAD College of Engineering and Technology,
More informationHow Much Logic Should Go in an FPGA Logic Block?
How Much Logic Should Go in an FPGA Logic Block? Vaughn Betz and Jonathan Rose Department of Electrical and Computer Engineering, University of Toronto Toronto, Ontario, Canada M5S 3G4 {vaughn, jayar}@eecgutorontoca
More informationAlgorithms for Non-Hanan-Based Optimization for VLSI Interconnect under a Higher-Order AWE Model
446 IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 19, NO. 4, APRIL 2000 Algorithms for Non-Hanan-Based Optimization for VLSI Interconnect under a Higher-Order AWE
More informationIterative-Constructive Standard Cell Placer for High Speed and Low Power
Iterative-Constructive Standard Cell Placer for High Speed and Low Power Sungjae Kim and Eugene Shragowitz Department of Computer Science and Engineering University of Minnesota, Minneapolis, MN 55455
More informationA 256-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology
http://dx.doi.org/10.5573/jsts.014.14.6.760 JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.14, NO.6, DECEMBER, 014 A 56-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology Sung-Joon Lee
More informationWITH the development of the semiconductor technology,
Dual-Link Hierarchical Cluster-Based Interconnect Architecture for 3D Network on Chip Guang Sun, Yong Li, Yuanyuan Zhang, Shijun Lin, Li Su, Depeng Jin and Lieguang zeng Abstract Network on Chip (NoC)
More informationSolving MIPI D-PHY Receiver Test Challenges
Stefan Walther and Yu Hu Verigy stefan.walther@verigy.com yu.hu@verigy.com Abstract MIPI stands for the Mobile Industry Processor Interface, which provides a flexible, low-cost, high-speed interface solution
More informationA General Sign Bit Error Correction Scheme for Approximate Adders
A General Sign Bit Error Correction Scheme for Approximate Adders Rui Zhou and Weikang Qian University of Michigan-Shanghai Jiao Tong University Joint Institute Shanghai Jiao Tong University, Shanghai,
More informationA Survey on Buffered Clock Tree Synthesis for Skew Optimization
A Survey on Buffered Clock Tree Synthesis for Skew Optimization Anju Rose Tom 1, K. Gnana Sheela 2 1, 2 Electronics and Communication Department, Toc H Institute of Science and Technology, Kerala, India
More informationCombinatorial Algorithms for Fast Clock Mesh Optimization
Combinatorial Algorithms for Fast Clock Mesh Optimization Ganesh Venkataraman, Zhuo Feng, Jiang Hu, Peng Li Dept. of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843
More informationDesign of Low-Power and Low-Latency 256-Radix Crossbar Switch Using Hyper-X Network Topology
JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.15, NO.1, FEBRUARY, 2015 http://dx.doi.org/10.5573/jsts.2015.15.1.077 Design of Low-Power and Low-Latency 256-Radix Crossbar Switch Using Hyper-X Network
More informationGated-Demultiplexer Tree Buffer for Low Power Using Clock Tree Based Gated Driver
Gated-Demultiplexer Tree Buffer for Low Power Using Clock Tree Based Gated Driver E.Kanniga 1, N. Imocha Singh 2,K.Selva Rama Rathnam 3 Professor Department of Electronics and Telecommunication, Bharath
More informationAsia and South Pacific Design Automation Conference
Asia and South Pacific Design Automation Conference Authors: Kuan-Yu Lin, Hong-Ting Lin, and Tsung-Yi Ho Presenter: Hong-Ting Lin chibli@csie.ncku.edu.tw http://eda.csie.ncku.edu.tw Electronic Design Automation
More informationBuffered Steiner Trees for Difficult Instances
Buffered Steiner Trees for Difficult Instances C. J. Alpert 1, M. Hrkic 2, J. Hu 1, A. B. Kahng 3, J. Lillis 2, B. Liu 3, S. T. Quay 1, S. S. Sapatnekar 4, A. J. Sullivan 1, P. Villarrubia 1 1 IBM Corp.,
More informationMaking Fast Buffer Insertion Even Faster Via Approximation Techniques
1A-3 Making Fast Buffer Insertion Even Faster Via Approximation Techniques Zhuo Li 1,C.N.Sze 1, Charles J. Alpert 2, Jiang Hu 1, and Weiping Shi 1 1 Dept. of Electrical Engineering, Texas A&M University,
More informationFloorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence
Floorplan and Power/Ground Network Co-Synthesis for Fast Design Convergence Chen-Wei Liu 12 and Yao-Wen Chang 2 1 Synopsys Taiwan Limited 2 Department of Electrical Engineering National Taiwan University,
More informationProcessor and DRAM Integration by TSV- Based 3-D Stacking for Power-Aware SOCs
Processor and DRAM Integration by TSV- Based 3-D Stacking for Power-Aware SOCs Shin-Shiun Chen, Chun-Kai Hsu, Hsiu-Chuan Shih, and Cheng-Wen Wu Department of Electrical Engineering National Tsing Hua University
More informationArchitecture-Level Synthesis for Automatic Interconnect Pipelining
Architecture-Level Synthesis for Automatic Interconnect Pipelining Jason Cong, Yiping Fan, Zhiru Zhang Computer Science Department University of California, Los Angeles, CA 90095 {cong, fanyp, zhiruz}@cs.ucla.edu
More informationINTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 9 /Issue 3 / OCT 2017
Design of Low Power Adder in ALU Using Flexible Charge Recycling Dynamic Circuit Pallavi Mamidala 1 K. Anil kumar 2 mamidalapallavi@gmail.com 1 anilkumar10436@gmail.com 2 1 Assistant Professor, Dept of
More informationHAI ZHOU. Evanston, IL Glenview, IL (847) (o) (847) (h)
HAI ZHOU Electrical and Computer Engineering Northwestern University 2535 Happy Hollow Rd. Evanston, IL 60208-3118 Glenview, IL 60025 haizhou@ece.nwu.edu www.ece.nwu.edu/~haizhou (847) 491-4155 (o) (847)
More informationDesign and Implementation of CVNS Based Low Power 64-Bit Adder
Design and Implementation of CVNS Based Low Power 64-Bit Adder Ch.Vijay Kumar Department of ECE Embedded Systems & VLSI Design Vishakhapatnam, India Sri.Sagara Pandu Department of ECE Embedded Systems
More informationAbbas El Gamal. Joint work with: Mingjie Lin, Yi-Chang Lu, Simon Wong Work partially supported by DARPA 3D-IC program. Stanford University
Abbas El Gamal Joint work with: Mingjie Lin, Yi-Chang Lu, Simon Wong Work partially supported by DARPA 3D-IC program Stanford University Chip stacking Vertical interconnect density < 20/mm Wafer Stacking
More informationImplementation of Asynchronous Topology using SAPTL
Implementation of Asynchronous Topology using SAPTL NARESH NAGULA *, S. V. DEVIKA **, SK. KHAMURUDDEEN *** *(senior software Engineer & Technical Lead, Xilinx India) ** (Associate Professor, Department
More informationBARP-A Dynamic Routing Protocol for Balanced Distribution of Traffic in NoCs
-A Dynamic Routing Protocol for Balanced Distribution of Traffic in NoCs Pejman Lotfi-Kamran, Masoud Daneshtalab *, Caro Lucas, and Zainalabedin Navabi School of Electrical and Computer Engineering, The
More informationTree Structure and Algorithms for Physical Design
Tree Structure and Algorithms for Physical Design Chung Kuan Cheng, Ronald Graham, Ilgweon Kang, Dongwon Park and Xinyuan Wang CSE and ECE Departments UC San Diego Outline: Introduction Ancestor Trees
More informationFPGA IMPLEMENTATION OF FLOATING POINT ADDER AND MULTIPLIER UNDER ROUND TO NEAREST
FPGA IMPLEMENTATION OF FLOATING POINT ADDER AND MULTIPLIER UNDER ROUND TO NEAREST SAKTHIVEL Assistant Professor, Department of ECE, Coimbatore Institute of Engineering and Technology Abstract- FPGA is
More information6T- SRAM for Low Power Consumption. Professor, Dept. of ExTC, PRMIT &R, Badnera, Amravati, Maharashtra, India 1
6T- SRAM for Low Power Consumption Mrs. J.N.Ingole 1, Ms.P.A.Mirge 2 Professor, Dept. of ExTC, PRMIT &R, Badnera, Amravati, Maharashtra, India 1 PG Student [Digital Electronics], Dept. of ExTC, PRMIT&R,
More informationVdd Programmability to Reduce FPGA Interconnect Power
Vdd Programmability to Reduce FPGA Interconnect Power Fei Li, Yan Lin and Lei He Electrical Engineering Department University of California, Los Angeles, CA 90095 ABSTRACT Power is an increasingly important
More informationSynthesizable FPGA Fabrics Targetable by the VTR CAD Tool
Synthesizable FPGA Fabrics Targetable by the VTR CAD Tool Jin Hee Kim and Jason Anderson FPL 2015 London, UK September 3, 2015 2 Motivation for Synthesizable FPGA Trend towards ASIC design flow Design
More informationPhysical Design of Digital Integrated Circuits (EN0291 S40) Sherief Reda Division of Engineering, Brown University Fall 2006
Physical Design of Digital Integrated Circuits (EN029 S40) Sherief Reda Division of Engineering, Brown University Fall 2006 Lecture 08: Interconnect Trees Introduction to Graphs and Trees Minimum Spanning
More informationSF-LRU Cache Replacement Algorithm
SF-LRU Cache Replacement Algorithm Jaafar Alghazo, Adil Akaaboune, Nazeih Botros Southern Illinois University at Carbondale Department of Electrical and Computer Engineering Carbondale, IL 6291 alghazo@siu.edu,
More informationNoCIC: A Spice-based Interconnect Planning Tool Emphasizing Aggressive On-Chip Interconnect Circuit Methods
1 NoCIC: A Spice-based Interconnect Planning Tool Emphasizing Aggressive On-Chip Interconnect Circuit Methods V. Venkatraman, A. Laffely, J. Jang, H. Kukkamalla, Z. Zhu & W. Burleson Interconnect Circuit
More informationMulticycle-Path Challenges in Multi-Synchronous Systems
Multicycle-Path Challenges in Multi-Synchronous Systems G. Engel 1, J. Ziebold 1, J. Cox 2, T. Chaney 2, M. Burke 2, and Mike Gulotta 3 1 Department of Electrical and Computer Engineering, IC Design Research
More informationObstacle-Aware Longest-Path Routing with Parallel MILP Solvers
, October 20-22, 2010, San Francisco, USA Obstacle-Aware Longest-Path Routing with Parallel MILP Solvers I-Lun Tseng, Member, IAENG, Huan-Wen Chen, and Che-I Lee Abstract Longest-path routing problems,
More informationOn the Decreasing Significance of Large Standard Cells in Technology Mapping
On the Decreasing Significance of Standard s in Technology Mapping Jae-sun Seo, Igor Markov, Dennis Sylvester, and David Blaauw Department of EECS, University of Michigan, Ann Arbor, MI 48109 {jseo,imarkov,dmcs,blaauw}@umich.edu
More informationStatic Compaction Techniques to Control Scan Vector Power Dissipation
Static Compaction Techniques to Control Scan Vector Power Dissipation Ranganathan Sankaralingam, Rama Rao Oruganti, and Nur A. Touba Computer Engineering Research Center Department of Electrical and Computer
More informationWhitespace-Aware TSV Arrangement in 3D Clock Tree Synthesis
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. XX, NO. XX, XXXX 1 Whitespace-Aware TSV Arrangement in 3D Clock Tree Synthesis Wulong Liu, Student Member, IEEE, Yu Wang, Senior Member,
More informationLi Minqiang Institute of Systems Engineering Tianjin University, Tianjin , P.R. China
Multi-level Genetic Algorithm (MLGA) for the Construction of Clock Binary Tree Nan Guofang Tianjin University, Tianjin 07, gfnan@tju.edu.cn Li Minqiang Tianjin University, Tianjin 07, mqli@tju.edu.cn Kou
More informationA Novel Pseudo 4 Phase Dual Rail Asynchronous Protocol with Self Reset Logic & Multiple Reset
A Novel Pseudo 4 Phase Dual Rail Asynchronous Protocol with Self Reset Logic & Multiple Reset M.Santhi, Arun Kumar S, G S Praveen Kalish, Siddharth Sarangan, G Lakshminarayanan Dept of ECE, National Institute
More informationFPGA Clock Network Architecture: Flexibility vs. Area and Power
FPGA Clock Network Architecture: Flexibility vs. Area and Power Julien Lamoureux and Steven J.E. Wilton Department of Electrical and Computer Engineering University of British Columbia Vancouver, B.C.,
More informationDesign Compiler Graphical Create a Better Starting Point for Faster Physical Implementation
Datasheet Create a Better Starting Point for Faster Physical Implementation Overview Continuing the trend of delivering innovative synthesis technology, Design Compiler Graphical streamlines the flow for
More informationDesign and Analysis of Kogge-Stone and Han-Carlson Adders in 130nm CMOS Technology
Design and Analysis of Kogge-Stone and Han-Carlson Adders in 130nm CMOS Technology Senthil Ganesh R & R. Kalaimathi 1 Assistant Professor, Electronics and Communication Engineering, Info Institute of Engineering,
More informationInterconnect Design for Deep Submicron ICs
! " #! " # - Interconnect Design for Deep Submicron ICs Jason Cong Lei He Kei-Yong Khoo Cheng-Kok Koh and Zhigang Pan Computer Science Department University of California Los Angeles CA 90095 Abstract
More informationCATALYST: Planning Layer Directives for Effective Design Closure
CATALYST: Planning Layer Directives for Effective Design Closure Yaoguang Wei 1, Zhuo Li 2, Cliff Sze 2 Shiyan Hu 3, Charles J. Alpert 2, Sachin S. Sapatnekar 1 1 Department of Electrical and Computer
More informationThe Design and Implementation of a Low-Latency On-Chip Network
The Design and Implementation of a Low-Latency On-Chip Network Robert Mullins 11 th Asia and South Pacific Design Automation Conference (ASP-DAC), Jan 24-27 th, 2006, Yokohama, Japan. Introduction Current
More informationCalibrating Achievable Design GSRC Annual Review June 9, 2002
Calibrating Achievable Design GSRC Annual Review June 9, 2002 Wayne Dai, Andrew Kahng, Tsu-Jae King, Wojciech Maly,, Igor Markov, Herman Schmit, Dennis Sylvester DUSD(Labs) Calibrating Achievable Design
More informationLOGIC EFFORT OF CMOS BASED DUAL MODE LOGIC GATES
LOGIC EFFORT OF CMOS BASED DUAL MODE LOGIC GATES D.Rani, R.Mallikarjuna Reddy ABSTRACT This logic allows operation in two modes: 1) static and2) dynamic modes. DML gates, which can be switched between
More informationAn Efficient Routing Tree Construction Algorithm with Buffer Insertion, Wire Sizing and Obstacle Considerations
An Efficient Routing Tree Construction Algorithm with uffer Insertion, Wire Sizing and Obstacle Considerations Sampath Dechu Zion Cien Shen Chris C N Chu Physical Design Automation Group Dept Of ECpE Dept
More informationTestability Optimizations for A Time Multiplexed CPLD Implemented on Structured ASIC Technology
ROMANIAN JOURNAL OF INFORMATION SCIENCE AND TECHNOLOGY Volume 14, Number 4, 2011, 392 398 Testability Optimizations for A Time Multiplexed CPLD Implemented on Structured ASIC Technology Traian TULBURE
More informationarxiv: v1 [cs.ar] 14 May 2017
Fast Statistical Timing Analysis for Circuits with Post-Silicon Tunable Clock Buffers Bing Li, Ning Chen, Ulf Schlichtmann Institute for Electronic Design Automation, Technische Universitaet Muenchen,
More information