Ting-Sheng Chen, Ding-Yuan Lee, Tsung-Te Liu, and An-Yeu Wu, Fellow, IEEE

Size: px
Start display at page:

Download "Ting-Sheng Chen, Ding-Yuan Lee, Tsung-Te Liu, and An-Yeu Wu, Fellow, IEEE"

Transcription

1 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 63, NO. 10, OCTOBER Dynamic Reconfigurable Ternary Content Addressable Memory for OpenFlow-Compliant Low-Power Packet Processing Ting-Sheng Chen, Ding-Yuan Lee, Tsung-Te Liu, and An-Yeu Wu, Fellow, IEEE Abstract OpenFlow, the main protocol for software-defined networking (SDN) advocated by Google, requires multiple flow tables with long and variable rule lengths. For fast packet forwarding, using ternary content-addressable memory (TCAM) as implementation of lookup-table has displayed superior performance in executing the conventional packet classification algorithms. However, applying traditional TCAM designs to an OpenFlow switch leads to problems of insufficient word length, inflexible table configuration, and high-power consumptions. This paper presents a low-power dynamic reconfigurable TCAM design for OpenFlowcompliant packet processing. It can support long and variable word lengths for different flow tables, with less than 5% of area overhead. To reduce dynamic power consumption without sacrificing throughput, the proposed hybrid-pipelined structure based on NAND-NOR CAM and pipelined-tcam, can achieve lower energy-delay products than traditional TCAM. Moreover, the proposed self-power gating technique controlled by the mask data optimizes both NAND and NOR core cells, which lowers power consumption without additional control and routing overhead. Finally, the low-power reconfigurable TCAM is implemented in TSMC 40 nm CMOS technology. The implementation results show that the energy metric of the proposed reconfigurable TCAM is only fj/bit/search, at a 416 MHz operating frequency. Thus, the proposed TCAM design is very suitable for the emerging high performance packet processing in future SDN applications that need long and variable word lengths. Index Terms High-throughput, low-power, OpenFlow, packet classification, reconfigurable VLSI design, software-defined networking (SDN), ternary content -addressable memory (TCAM). I. INTRODUCTION WITH growing global data traffic, the backbone network of cloud services is facing more challenging and severe requirements. To meet the demands of the backbone network, Software-Defined Networking (SDN) was proposed in 2009 [1]. SDN has since become one of the most promising archi- Manuscript received February 21, 2016; revised June 9, 2016; accepted June 22, Date of publication September 13, 2016; date of current version September 27, This work was supported in part by the Ministry of Science and Technology, ROC, under Grant E , E , and E MY2. This paper was recommended by Associate Editor R. Azarderakhsh. The authors are with the Graduate Institute of Electronics Engineering and Department of Electrical Engineering, National Taiwan University, Taipei 106, Taiwan ( tim@access.ee.ntu.edu.tw; dan@access.ee.ntu.edu.tw; ttliu@ ntu.edu.tw; andywu@ntu.edu.tw). Color versions of one or more of the figures in this paper are available online at Digital Object Identifier /TCSI Fig. 1. A simplified block diagram of OpenFlow switch. tectures for managing large-scale complex networks. Currently, OpenFlow, which is advocated by Google [2], [3], is the most popular SDN protocol/standard and has been standardized [4]. As shown in Fig. 1, to achieve programmability, an OpenFlow switch requests multiple flow tables with various match-field. Rule lengths in each table ranging from 32 to 568 bits. Recently, Ternary Content Addressable Memory (TCAM) has been applied to packet processing to forward packets rapidly [5]. Owing to the parallel architecture, TCAM displays superior performance in executing packet classification algorithm [6]. Although TCAM design had been intensively studied for computer architecture applications [7] [10], three major issues are yet to be addressed when using TCAM to implement a lookup table in OpenFlow-compliant switches, as shown in Fig. 1: 1) Insufficient word length: The current version of the OpenFlow switch must support at least 14 match fields, with the rule lengths in a flow table may reach 568 bits. However, the word lengths of most current TCAM designs are insufficient. Moreover, TCAM with long word lengths cause a long delay path, even resulting in the worse noise margin. 2) Inflexible table configuration: Multiple-table processing enables packet classification to be programmable and stable. Rule lengths in each table are different. However, traditional TCAM designs with fixed word lengths are not suitable for variable rule lengths. Applying traditional TCAM designs to OpenFlow switches will lead to inflexible table configuration and wastage of precious TCAM resources IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See for more information.

2 1662 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 63, NO. 10, OCTOBER 2016 Fig. 2. Proposed dynamic-reconfigurable TCAM for OpenFlow-compliant switches. Fig. 3. Example of OpenFlow multiple-table processing. 3) High-power consumption: OpenFlow switches need greater number of flow tables than traditional network switches. TCAM is very power-hungry, consuming the largest portion (as much as 25%) of total power in a highend network switch, as discussed in [28]. Multiple-table processing causes more substantial static/dynamic power consumptions. In this paper, we aim to develop a dynamic-reconfigurable lowpower TCAM for OpenFlow-compliant switches. As depicted in Fig. 2, the main contributions of this paper are as follows: 1) To reduce static power consumption, we optimize both NAND and NOR core cells. Without additional control and routing overheads, the proposed self-power gating technique controlled by mask cells can reduce static power by 42% and dynamic power consumption by 15% at the cell level. 2) To lower the dynamic power consumption without sacrificing the throughput rate, we propose a hybrid-pipelined structure by combining both NAND-NOR TCAM and pipeline TCAM. Furthermore, we analyze the appropriate size for the NAND-type TCAM, further minimizing the energy-delay product by 45% at the structural level. 3) We propose a dynamic reconfigurable TCAM to handle long and variable word lengths. The proposed Filter-Based Multi-Mode Cascaded (FBMMC) architecture can flexibly support word lengths ranging from 144 to 576 bits. Moreover, the area and timing overheads of this reconfigurable architecture are relatively low, achieving less than 5% of area overhead. To validate our design, we implement the dynamic reconfigurable low-power TCAM in TSMC 40-nm CMOS technology. The simulation results show that the energy metric is fj/ Bit/Search at a 416 MHz operating frequency. The rest of this paper is organized as follows. Section II briefly introduces the principles of the OpenFlow switch and traditional TCAM architecture. Section III presents the self-power gating technique and hybrid-pipelined structure for low-power TCAM operation. We illustrate the proposed dynamic reconfigurable TCAM in Section IV. Section V compares this work to other related works. Section V concludes this paper. II. ARCHITECTURE OF TCAM MACRO IN NETWORK SWITCHES A. Operations of OpenFlow Switch Fig. 1 shows a simplified block diagram of an OpenFlow switch. It utilizes the OpenFlow protocol and channel to communicate with OpenFlow controllers. It can adjust the routing policy of switches, depending on the network topology and traffic status, by adding, modifying, or removing the rules of flow tables. There are multiple flow tables in an OpenFlow switch. Multiple-table processing enables packet classification to be programmable and stable. As shown in Fig. 3, when receiving a packet, the packet header is sent to flow-table_0, which is typically an initial classifier. It decides which table can further process this packet header, and then forwards it to that flow table. The Current Openflow switch (V1.5) support at least 14 fields, as shown in Table I. Each table can focus on the selected tuples as the match field, with each field having mask ability. The lengths of the forwarding rules in each table are different. The rule lengths in each flow table range from 32 bits (e.g., IPv4_dst.) to 568 bits (14 match fields). In general, a flow table with a longer rule length gives rough classification, which means that it has fewer entries [11]. On the other hand, a flow table with less tuples (e.g., a firewall) has more entries [12]. Compared with the conventional switch, an OpenFlow switch can process multiple layers (L1 to L4) of a packet header. However, the large-sized and multiple tables cause considerable power consumption. Thus, flow tables for OpenFlow switch with reconfigurable, high-performance, and low-power design is essential for the backbone network in the future. B. Traditional TCAM Architecture Using TCAM as implementation of lookup-table displays superior performance in executing conventional packet classification algorithm. Owing to the parallel architecture, TCAM can finish searching and comparison of a packet header in one or a few clock cycles. Fig. 4 shows the block diagram of CAM. A search word is broadcasted onto SearchLines (SL) of the table of stored words. The size of the words is L bits and the number of entries is typically from a few hundred to a few thousand. Each word has a MatchLine (ML) that indicates whether the

3 CHEN et al.: DYNAMIC RECONFIGURABLE TCAM FOR OPENFLOW-COMPLIANT LOW-POWER PACKET PROCESSING 1663 TABLE I EXAMPLE OF OPENFLOW 14-FIELD PACKET CLASSIFICATION RULE SET TABLE II SUMMARY OF NAND AND NOR-TYPE TCAM Fig. 4. Block diagram of content addressable memory (CAM). Unlike the NAND-type TCAM that pulls ML down when all match occurs, the NOR-type TCAM discharges the ML to the ground in case of any miss, as shown in Fig. 5(b). The mask bit can turn off the pull-down transistor of the NOR cell. Table II summarizes the differences between NAND- and NOR-type TCAMs. Because of the long serial pass transistor delay and threshold drop, the NAND-type TCAM requires more evaluation time than the NOR-type TCAM at the same word length. In contrast, the NOR-type TCAM can provide excellent performance owing to its short critical path delay. Dynamic power consumption comprises ML and SL power consumption. SL power consumption is related to the word length and input pattern. As for ML power consumption, owing to the low probability of rule matching and load capacitance, the switched capacitance derived from multiplying switching activity and load capacitance is also low, resulting in minimal dynamic power consumption. On the other hand, because of high switching activity and load capacitance, the NOR-type TCAM consumes more power consumption than the NANDtype TCAM. Fig. 5. Conventional L-bit TCAM: (a) NAND-type TCAM; (b) NOR-type TCAM with a matched case; (c) NOR-type TCAM with 1-bit mismatched case. search data and the stored word are the same. The priority encoder outputs the matched location. The CAM with mask ability is called ternary CAM (TCAM). The masked bit of the TCAM always matches the search bit, regardless of the content of the search bit. There are two types of TCAM: NAND- and NOR-type TCAM.Fig.5(a)showstheNAND-type TCAM. When MLPre is logic 1, ML_NAND is pre-charged to high voltage by PMOS P 2. Search data are sent to the XNOR cell, and the NMOS N1 is turned off to avoid short current to the ground. When MLPRe is logic 0, PMOS P 2 stops charging the ML. If the input data match the stored rule, then the ML will be discharged to the ground. The mask bit can always turn on the pass transistor of the NAND cell, regardless of the match result of the XNOR cell. C. Discontinuous Distribution of Mask Data Because the OpenFlow switch gathers multiple match fields in a flow table, the continuous feature of mask data rarely appears in many-field classification. Therefore, traditional TCAM designs for single-field classification [14], [16], [23] (e.g., IPv6 destination) that utilize the continuous feature of mask data are unsuitable for OpenFlow switch or many-field packet classification with discontinuous distribution of mask data. Priority encoder is not suitable for multiple-field classification, because the priority of a rule is no longer consistent with the proportion of mask data itself. Furthermore, the use of priority encoder results in increased energy consumption of pattern updates and search operations. Hence, for multiple-field classification, the concept of Priority Decision in Memory [13] and the priority field shown in Table I can be utilized to decide the match result. D. Challenge of Long-Word Lengths TCAM Recently, a TCAM with butterfly matchline (Butterfly- TCAM) was presented by [14] with low energy consumption

4 1664 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 63, NO. 10, OCTOBER 2016 solutions take advantage of the continuous feature of mask data, which can be applied only to single-field packet classification, but not to many-field packet classification. Fig. 7 shows the proposed self-power gating technique (S-PG) for both the NAND- andnor-type TCAM. The match results of NAND and NOR-type TCAM, Match NAND and Match NOR is given by: Fig. 6. Power gating technique utilizes a continuous feature of mask data, with additional control and routing overhead proposed in [14]. design. A TCAM with Low-swing MatchLine Sense amplifier (LVMLSS-TCAM) was developed by [15]. However, both Butterfly and LVMLSS TCAMs did not consider long word lengths and reconfigurable architecture. TCAM with long word lengths will result in larger ML capacitance, which causes greater dynamic power consumption than traditional TCAM. Furthermore, TCAM with longer word lengths also affects search operation. Shown in Fig. 5(b) and (c), the total ML currents of an L-bit NOR-type TCAM for the cases of match and 1-bit mismatch can be written as follows: I ML_match = L I OFF (1) I ML_1bit_mismatch =(L 1) I OFF + I ON (2) where I ON and I OFF are the currents of the matched and mismatched cells, respectively. Equations (1) and (2) show that if the ratio I ON /I OFF decreases or the word length L increases, then distinguishing a matched case from a 1-bit mismatch case would be difficult. As the result, when designing TCAM with long word length, we should reduce the leakage current and avoid lengthening the word length directly. III. LOW-POWER TCAM DESIGN FOR OPENFLOW To reduce power consumption, we propose a self-power gating technique at the cell level and a hybrid-pipelined TCAM at the structural level. These techniques are the fundamental processing elements of the proposed low-power dynamic reconfigurable TCAM for OpenFlow-compliant switch. A. Self-Power Gating for NAND and NOR Cells Most power reduction techniques in the literature focus on dynamic power reduction. However, the evolution of semiconductor process technology has resulted in leakage power playing an important role in the total power consumption. As mentioned before, the leakage component of the TCAM also affects the search operation. Therefore, it is important to lower the leakage component of TCAM circuit. To reduce static power consumption, power gating is a popular technique at the circuit level. However, the area overhead of the power gating technique due to global signal routing can be substantial. Fig. 6 shows a super-cut off power gating technique with complex control scheme was proposed in [14]. A twoside self-gating technique to reduce leakage power of the mask cell and control overhead was proposed in [16]. However, these Match NAND = ( (SL 1 D 1 + M 1 ) (SL n D n + M n ) ) (3) Match NOR = ( (SL 1 D 1 ) M 1 + +(SL n D n ) M n ) where SL i, D i,andm i are search data, stored data, and mask bit, respectively. If a mask bit is logic 1, the SL i D i in (3) and SL i D i in (4) are not important for the final match result. Owing to the mask cells work as usual, the search delay caused by this power gating technique is very small, which is consistent with the results of [14] and [16]. Therefore, S-PG technique turns off the XNOR cell in the NAND-type TCAM and the XOR cell in the NOR-type TCAM depending on the mask data of each bit. The additional PMOS and NMOS transistors help reduce the leakage current of the 6 T SRAM storage cell. Moreover, unlike the dynamic power source technique proposed in [17] that unable to turn-off the binary cell completely, the proposed technique lowers the storage data D and D to 0 V, further reducing the gate leakage of NMOS N1 and NMOS N2, and switching activities of NMOS N3, as shown in Table III. On the other hand, if the mask bit M in Fig. 7 is logic 0, then the XNOR and XOR cells work like conventional TCAM. B. Hybrid-Pipelined TCAM in OpenFlow Operations Many power reduction techniques are proposed to reduce TCAM power consumption [5], such as low-swing ML [15] and low-swing SL [18]. For dynamic power consumption, hierarchical ML is a very popular architecture to reduce high ML power consumption, such as pipelined-structure TCAM proposed in [19] and pulse NAND-NOR CAM (PNN-CAM) proposed in [20]. PNN CAM takes advantage of both NAND CAM and NOR TCAM simultaneously. The NAND-type CAM as a rule filter, consumes less dynamic power than the NORtype TCAM. On the other hand, the NOR-type TCAM provides excellent performance as the main-search component. Although the filter can lower the activation of main search components, the main drawback of the NAND ML should be well considered. The quadratic delay dependence comes from the fact that adding a NAND cell to a NAND ML adds both a series resistance and a capacitance to the ground because of the NMOS diffusion capacitance. Hence, most NAND-type TCAMs limit the bits of the NAND ML cells to 8 16 in order to limit the quadratic degradation in speed [5]. The additional delay caused by the filter result in throughput degradation. When traffic congestion occurs in the Internet, throughput degradation can be serious and may even lead to packet dropping, due to the packet buffer running out. Although the Early-Predict (4)

5 CHEN et al.: DYNAMIC RECONFIGURABLE TCAM FOR OPENFLOW-COMPLIANT LOW-POWER PACKET PROCESSING 1665 Fig. 7. Proposed self-power gating technique, without additional control and routing overhead: (a) NAND-type TCAM; (b) NOR-type TCAM. TABLE III TWO MODES OF THE SELF-POWER GATING TECHNIQUES Late-Correct (ELPC) scheme proposed in [21] can improve throughputs of hierarchical ML TCAM, it causes extra dynamic power consumption in the main-search component in the case of a miss-prediction. To improve the throughput, 1-bit pipeline registers are inserted between the NAND- and NOR-type TCAMs, as shown in Fig. 8. Owing to the discontinuous feature of the mask data in an OpenFlow switch, i.e., the mask data may appear in the rule filters, we replace the NAND-CAM with NAND-TCAM. Unlike the EPLC scheme, which complete search operation in one clock cycle, the pipeline registers can divide the search operation into two stages. The activation of PMOS P 3 in Fig. 8, EN,isgivenby EN = MLPre Match (5) where Match is the match information of the previous stages. If match occurs in the first stage, Match is logic 0 and this match information is stored in the left part of the pipeline register. It will be passed to the right part of the pipeline register and PMOS P 3 will be turned on to pre-charge the NOR ML in the subsequent clock cycle. If no match occurs, then PMOS P 3 will be turned off until a match occurs again in the pre-search filter. Owing to the body effect, the threshold voltage of NMOS N2 in Fig. 8 is greater than that of NMOS N3. Match arrives slower than MLPre because of the propagation delay of the pipeline register. To avoid the glitch that results in unnecessary pre-charge power consumption shown in Fig. 9, MLPre is connected to NMOS N2, and Match is connected to NMOS N3.To ensurethe functionalcorrectness, if ML_NAND transmits missinformation, the ML_NOR will be discharge to the ground by NMOS N4 in Fig. 8 in the next clock cycle. Fig. 10 presents a ML Sense Amplifier (MLSA) with a gated clock design. If a match occurs in the first stage, then ML_NOR will be pre-charged to VDD. Meanwhile, the MLSA will be turned on and ML_SENSE will be set to logic 1. If a mismatch occurs in the main-search component, then the MLSA senses the mismatch condition and pulls ML_SENSE to the ground once ML_NOR is discharged to V tp. On the other hand, the MLSA will not be turned on if a mismatch case occurs in the first stage, thus reduce unnecessary power consumption. C. Analysis of Hybrid-Pipelined TCAM Before inserting pipeline registers, the search latency T delay and the search frequency f search of the NAND-NOR TCAM needs to satisfy the following conditions: T delay >T NAND + T NOR, f search < 1 T NAND + T NOR (6) where T NAND and T NOR are the delay time of the NANDand NOR-type TCAMs, respectively, and are dependent on the

6 1666 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 63, NO. 10, OCTOBER 2016 Fig. 8. Pipeline register with selective pre-charge logic is inserted between a 9-bit NAND-type TCAM and a 135-bit NOR-type TCAM. The precharge PMOS P3 of NOR matchline is turned on only when the filter match in the last cycle. order to achieve maximum power reduction, we analyze the expected value of power consumption P EX, given as follows: ( ) k 3 P EX = (NANDmatch(k)+NORdyn(L K)) 4 ( ( ) ) k (NAND miss (k)+nor leak (L K)) (8) 4 Fig. 9. Glitch resulting in unnecessary power consumption. Fig. 10. MLSA with gated clock design. where NAND miss (k) and NAND match (k) represent power consumption of mismatch and match cases, respectively, occurring at the K-bit NAND-type TCAM, and NOR dyn (L K) and NOR leak (L K) are the dynamic and static power consumption of mismatch and match cases, respectively, occurring at the K-bit NAND-type TCAM, and NOR dyn (L K) and NOR leak (L K) are the dynamic and static power consumptions, respectively, of the (L K) bit. In this study, we consider L = 144 for IPv6 lookup tables [22]. The value of k is scanned from 8 to 20 bits, and K =9 is considered as it minimizes P EX. The modified NAND-NOR TCAM can maintain low power consumption and high throughput rate. number of bits. After pipeline registers are inserted, the search frequency f search_pipe is given by: ( ) 1 1 f search_pipe < min, (7) T su + T cq + T NOR T NAND where T su and T cq are the setup time and the propagation delay of the pipeline registers, typically much smaller than T NAND and T NOR. Pipeline registers would almost double throughput when T NAND nearly equal to T NOR. The pipeline overheads are a few latencies, area and power caused by pipeline itself, these overheads are acceptable since only 1-bit pipeline register per ML are inserted. Except for throughput degradation, the filter with long word lengths consumes more power in the first stage. The content of the stored data is typically around 50% don t care data and 50% logic 0 and 1, and the content stored in the filter is only the left k-bit of the table. If the matching probability of each bit in the NAND-type TCAM is 0.75, then the matching probability of the filter is (3/4) k. Due to the low probability of main-search activation, the average bit of the activated power-demanding main search is reduced from (L K) to (L K)(3/4) k.in D. Layout and Simulation Results Self-Power Gating Technique: The proposed S-PG technique and hybrid pipelined TCAM is implemented by Cadence Virtuoso with TSMC 40 nm library CMOS technology at 0.9 V supply voltage. The search delay is 2.4 ns. All of the parameters, including interconnects and parasitic, are extracted from the circuit layout. The post-simulation is done by HSPICE simulation tool. Layout views of the NAND and NOR-type TCAM core cell are shown in Fig. 11(a) and (b), respectively. BL and SL are separated for lower switching capacitance. Without additional control and routing overheads, the area overhead of the S-PG technique is reasonable. We constructed a 256-word TCAM structure with 135-bit NOR-type and 9-bit NAND-type TCAMs. The static power consumption is evaluated for the cases when the mask bit is on and off, while the dynamic power consumption is evaluated for the mismatch case. The simulated dynamic power consumption is normalized to an energy metric, measured in femtojoules per bit per search. Table IV shows the simulated power results with the proposed S-PG technique. We can see a significant power reduction with the proposed S-PG technique when the mask bit is

7 CHEN et al.: DYNAMIC RECONFIGURABLE TCAM FOR OPENFLOW-COMPLIANT LOW-POWER PACKET PROCESSING 1667 TABLE V COMPARISONS WITH OHER RELATED WORKS: POWER GATING TECHNIQUES Fig. 11. Layout: (a) NAND core cell with SPG. (b) NOR core Cell with SPG. TABLE IV POWER REDUCTION OF THE SELF-POWER GATING TECHNIQUE Fig. 12. Post-layout waveforms of TCAM with self-power gating technique. enabled. Only a slight increase (5%) in the leakage power when the mask bit is disabled. The shutdown of the XOR and XNOR cells reduces the static power consumption by nearly half. The proposed S-PG technique also lowers the dynamic power consumption because NMOS N3 is disabled. If the mask data is logic 1, then traditional TCAM without the S-PG technique would consume more dynamic power from unnecessary N3 switching, as shown in Fig. 12. Table V shows a comparison of the proposed S-PG technique and other techniques [14], [16]. Our technique not only effectively reduces dynamic and static power consumption of both the NAND- and NOR-type TCAMs, but serves well for discontinuous and discrete mask data in multiple-field packet classification and other TCAM applications. It is well suited for OpenFlow-compliant packet processing and the hybridpieplined TCAM, which will be discussed in detail below. Hybrid-Pipelined TCAM Structure Based on S-PG Cell: Fig. 13 shows the results of a performance simulation for the hybrid-pipelined TCAM. The timing stability of the TCAM circuit is related to the critical path: the ML needs to be pulled down in time when a match and a mismatch case occur in the NAND- andnor-type TCAMs, respectively. The low-power sense amplifier is enabled only when a match occurs in the filter, which accelerates the sensing time of the NOR ML. Table VI indicates the throughput improvement by the proposed hybrid-pipelined TCAM. Compared with the NAND- NOR TCAM without a pipelined structure, the hybrid-pipelined TCAM can almost double the throughput rate. Although inserting the pipelines would cause slight latency overhead, it can be ignored for the following two reasons. First, the packet classification in an SDN environment can have the shortest or the best routing path, depending on the network topology and traffic conditions. Thus, a slight additional latency in a packet classification application is acceptable. Second, the TCAM itself has displayed superior performance compared with the other packet classification algorithms. Therefore, a slight latency caused by pipeline registers can be ignored, as the overall search latency of the TCAM is still considerably low. Table VII shows the energy-delay products of even and hybrid-pipelined TCAMs. Because the 9-bit NAND-type filters are enough to filter the rules efficiently, the main-search components are seldom activated, and hence, only static power is consumed. The average energy metric is lower than that of the even-pipelined TCAM. Although hybrid-pipelined TCAM requires more evaluation time compared with the even-pipelined TCAM, it has 45% lower and near-optimal energy-delay product. Both the filter and main-search component not only consume very little dynamic power but also maintain reasonable throughput. This hybrid-pipelined TCAM as a basic processing element is well suited to the following dynamic reconfigurable TCAM. IV. FILTER-BASED MULTI-MODE CASCADED TCAM The proposed FBMMC-TCAM for OpenFlow-compliant packet processing can achieve both long and variable word lengths. Furthermore, the power consumption is relatively low, with low area and timing overhead.

8 1668 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 63, NO. 10, OCTOBER 2016 Fig. 13. Post-layout waveforms of the hybrid-pipelined TCAM based on S-PG cell. TABLE VI THROUGHPUT IMPROVEMENT OF HYBRID-PIPELINED TCAM TABLE VII ENERGY-DELAY PRODUCT: EVEN-PIPELINED V.S. HYBRID-PIPELINED Fig. 14. A TCAM with flexible search mode proposed in [24]. A. Background and Motivation As explained earlier, the lengths of the forwarding rules in each table of an OpenFlow switch are different. The rule length in each flow table range from 32 bits (e.g., IPv4_dst.) to 568 bits (14 tuples). Most traditional TCAM architectures can neither support a long rule length with low power nor reconfigure the rule length if required. Cisco was also aware of the problem of variable matching sizes and therefore proposed Depth Cascading technique [29]. This technique can realize various matching sizes such as 72, 144, and 288 bits by cascading several individual CAM in serial. However, it suffers from huge search latency. Moreover, this technique leads to various search latencies with different matching sizes, causing complex control overheads. On the other hand, the proposed FBMMC architecture takes advantage of parallel architecture to configure different word lengths, therefore realizing constant search latency and lower control overheads. A TCAM with 32-, 64-, and 128-bit word lengths that takes advantage of the continuous feature of mask data to achieve storage efficiency has been proposed [23]. However, this design is only suitable for IPv6 applications, which do not work in multiple-field classification or an application without continuous mask data. Fig. 14 presents a TCAM utilizes flexible priority encoder to achieve flexible table option that was implemented in [24]. If the rule length is 320 bits, then a match occurs at all four parts of the 80-bit TCAM, which implies that a match of 320 bits occurs. This search mechanism can extend rule lengths instead of lengthening the word length directly. However, without any rule filter, the energy metric (fj/bit/search) is ten times more than the butterfly TCAM [14]. In addition, the use of priority encoder also results in increased energy consumption of pattern updates and search operations, furthermore, priority encoder is not suitable for multiple-field classification. Therefore, we propose a FBMMC-TCAM architecture, which features low power consumption and area overhead for OpenFlow-compliant switches. The proposed dynamic reconfigurable TCAM has a flexible triple-search mode. A single search mode that stores 256 rules and 144 bits of word lengths per rule, a dual-search mode that stores 128 rules and 288 bits of word lengths per rule, and a quad-search mode that stores up to 64 rules and 576 bits of word length per rule are all achieved. B. Operations and Architecture Fig. 15 shows the architecture of the proposed FBMMC- TCAM. Four hybrid-pipelined TCAMs are grouped as a multi-mode processing element. The multiplexer between the

9 CHEN et al.: DYNAMIC RECONFIGURABLE TCAM FOR OPENFLOW-COMPLIANT LOW-POWER PACKET PROCESSING 1669 Fig. 15. Architecture of the proposed FBMMC-TCAM. TABLE VIII WORD LENGTHS OF DIFFERENT MODE SETTINGS NAND-type TCAM and the pipeline can transmit match information depending on the mode setting. Table VIII indicates the different mode settings of the proposed reconfigurable TCAM. Mode 1 has 144 bits per rule, and the match information of each filters are independent of each other. In Fig. 15, mode 1 represents four independent rules, and the output shifter shifts four output results to ML_out. When in mode 2 (dual-search mode), as shown in Fig. 16, the first and second hybrid-pipelined TCAMs are grouped together, and the third and fourth hybrid-pipelined TCAMs are also cascaded to accomplish a dual-search mode. As a result, mode2 has = 288 bits per rule, including 9 2=18bits for the rule filter. To prevent the pre-charge circuit of the NOR-type TCAM from false activation (e.g., when the search key matches a 9-bit filter but misses the other 9-bit filter, this match information will cause false activation of the main search component), a1-bitnor gate is inserted to check whether the search key matches both the 9-bit filters. False activation would results in unnecessary power consumption but also a false match result. The output shifter shifts two match results to ML_out. In mode 3 (quad-search mode), four hybrid-pipelined TCAMs are cascaded. Because 540 bits of main-search components have greater power consumption than in the case of mode 1 mode 2, four rule filters are grouped to provide stricter rule filtering. The block diagram of the designed FBMMC-TCAM is shown in Fig. 17. The operation is divided into three phases: in the first phase, a control unit and an IO circuit are used to handle mode setting, rule searching, and rule updating. Because of the dualport design, the second phase can simultaneously perform rule searching and updating. The last phase outputs a match result by 4 64 shift registers. These phases can work simultaneously. C. Layout and Simulation Result Fig. 18 shows the layout of the proposed FBMMC-TCAM array, where four NAND- and NOR-type TCAMs are well grouped by multi-mode cascaded units. The proposed FBMMC- TCAM is implemented with TSMC 40 nm 1P9M CMOS technology at 25 C in a typical design-corner. The post-layout result of the FBMMC-TCAM shows that the energy metric is only fj/bit/search, with 50% mask data, and toggled search patterns. Additional logic gates, global routing, buffers increase the energy metric compared with the single-search mode of the hybrid-pipelined TCAM. The maximum search frequency is 416 MHz. Because longer length of TCAM consumes more power consumption, the proposed FBMMC-TCAM can adjust the length of the filter according to the requirement of the mainsearch component. Moreover, the match information of each filter and the main-search component is only 1 bit and the mode information is shared by the flow table specification when required, resulting in low area overhead. In this design, the proposed FBMMC-TCAM can achieve a multiple-table option without wasting much of the unused area. The area and timing overheads of the proposed FBMMC architecture, shown in Table IX, are relatively low. In addition, every hybrid-pipelined TCAM is properly utilized without wasting any storage and search components. This design concept enables easy extension to any mode higher than the triple-search mode, with the use of additional multiplexers and basic logic gates. However, a triple-search mode is sufficient for current OpenFlow specifications. Such an area-efficient reconfigurable design is well suited for multiple-table processing engines. V. C OMPARISON WITH OTHER RELATED WORKS When applying traditional TCAM designs with word length insufficient for multiple-field packet classification, we need to apply an additional packet algorithm to handle multiple match fields. Most packet classification algorithms used in TCAM-based solutions are decision-tree-based [25], [26] and decomposition-based [11], [27] algorithms. Decision-treebased approaches cut the search space recursively into smaller

10 1670 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 63, NO. 10, OCTOBER 2016 Fig. 16. Mode-2 of the proposed FBMMC-TCAM. TABLE IX AREA AND TIMING OVERHEAD OF FBMMC-TCAM Fig. 17. Block diagram of the proposed FBMMC-TCAM module. Fig. 18. Layout of the FBMMC TCAM array. subspaces according to a rule set. However, the performance of decision-tree-based approaches also depends on the rule set. A cut in one match field leads to duplicated rules in other match fields, resulting in rule set expansion [26]. Decomposition-based approaches first search each header field individually, and then merge partial results into the final result. This approach requires an additional hardware module to resolve hash collisions. The above techniques split a flow table with long word lengths into several flow tables with shorter word lengths. In general, a table with a 568-bit rule length is split into at least 4 to 12 flow tables [25], which causes a several times greater search latency than in case of single flow tables. Table X shows a comparison of the proposed TCAM with other TCAM architectures. The normalized energy metric defined in [14] is used to have a fair comparison as follows: ( E = E 40 Technology ) ( 0.9 V DD ) 2 (9) where E represents energy metric (fj/bit/search) and E denotes the energy metric normalized for 40 nm CMOS technology and a supply voltage of 0.9 V. When applying related TCAM designs to an OpenFlow switch, although power consumption of Butterfly [14] and LVMLSS TCAMs [15] is low, they did not consider long word lengths and reconfigurable architecture; resulting in not only high latency coming from split table but inflexible table configuration. Since the proposed dynamic reconfigurable TCAM can support long and variable word lengths, the search latency is several times less than other TCAM designs that are insufficient for long and variable word lengths. Moreover, this is achieved without the need to split the table or an additional hardware module to deal with multiple split tables. Although 4-parallel TCAM design [24] can support long and variable word lengths with flexible priority encoder, however, priority encoder is not suitable for many-field classification; moreover, the energy metric is ten times more than the FBMMC-TCAM. The proposed FBMMC-TCAM takes advantage of hierarchical MLs and a S-PG technique to significantly reduce power consumption. Our design can achieve both long and variable word lengths, with low static and dynamic power consumption and high throughput rate. The proposed TCAM

11 CHEN et al.: DYNAMIC RECONFIGURABLE TCAM FOR OPENFLOW-COMPLIANT LOW-POWER PACKET PROCESSING 1671 TABLE X SUMMARY OF DIFFERENT TERNARY CONTENT-ADDRESSABLE MEMORY design is very suitable for the emerging high performance packet processing in future SDN applications that need long and variable word lengths. VI. CONCLUSION We developed a dynamic reconfigurable TCAM for OpenFlow-compliant low-power packet processing. The proposed FBMMC architecture can configure variable word lengths using relatively low area and timing overheads. The proposed hybrid-pipelined structure can achieve lower dynamic power consumption and high throughput rate. Moreover, the proposed self-power gating technique lowers static power consumption without additional control and global routing overheads. Compared with other reconfigurable architectures, the proposed FBMMC-TCAM enables a considerable reduction in power consumption. The proposed TCAM design is very suitable for the emerging high performance packet processing in future SDN applications that need long and variable word lengths. ACKNOWLEDGMENT The authors would like to thank National Chip Implementation Center (CIC), National Applied Research Laboratories (NARL), Taiwan for technical support in layout. REFERENCES [1] N. Mckeown, Software-Defined Networking, INFOCOM Keynote Talk, Rio de Janeiro, Brazil, Apr [2] S. Jain et al., B4: Experience with a globally-deployed software defined wan, in Proc. ACM SIGCOMM 2013, pp. 3 14, [3] F. Hu, Q. Hao, and K. Bao, A survey on software defined networking (SDN) and OpenFlow: From concept to implementation, IEEE Commun. Surv. Tut., vol. 16, no. 4, pp , 4th Quart [4] ONF: OpenFlow Switch Specification Version [Online]. Available: [5] K. Pagiamtzis and A. Sheikholeslami, Content-addressable memory (CAM) circuits and architectures: A tutorial and survey, IEEE J. Solid- State Circuits, vol. 41, no. 3, pp , Mar [6] P. Gupta and N. McKeown, Algorithms for packet classification, IEEE Network, vol. 15, no. 2, pp , Mar./Apr [7] D. J. Craft, A fast hardware data compression algorithm and some algorithmic extensions, IBM J. Res. Develop., vol. 42, no. 6, pp , Nov [8] M. Nakanishi and T. Ogura, Real-time CAM-based Hough transform and its performance evaluation, Mach. Vis. Appl., vol. 12, no. 2, pp , Aug [9] P.-F. Lin and J. Kuo, A 1-V 128-kb four-way set-associative cmos cache memory using wordline-oriented tag-compare (WLOTC) structure with the content-addressable-memory (CAM) 10-transistor tag cell, IEEE J. Solid-State Circuits, vol. 36, no. 4, pp , Apr [10] C.-C. Wang, C.-J. Cheng, T.-F. Chen, and J.-S. Wang, An adaptively dividable dual-port BiTCAM for virus-detection processors in mobile devices, IEEE J. Solid-State Circuits, vol. 44, no. 5, pp , May [11] Y. Qu, S. Zhou, and V. Prasanna, Scalable many-field packet classification on multi-core processors, in Proc. Int. Symp. Comput. Archit. High Perform. Comput.(SBAC-PAD), Oct. 2013, pp [12] Z. Cai, Z. Wang, K. Zheng, and J. CAO, A distributed TCAMcoprocessor architecture for integrated longest prefix matching, policy filtering, and content filtering, IEEE Trans. Comput., vol. 62, no. 3, pp , Mar [13] H.-J. Tsai, K.-H. Yang, Y.-C. Peng, C.-C. Lin, Y.-H. Tsao, M.-F. Chang, and T.-F. Chen, Energy-efficient non-volatile TCAM search engine design using priority-decision in memory technology for DPI, in Proc. DAC, 2015, pp [14] P.-T. Huang and W. Hwang, A 65 nm fj/bit/search 256x144TCAM macro design for IPv6 lookup tables, IEEE J. Solid-State Circuits, vol. 46, no. 2, pp , Feb [15] I. Hayashi, T. Amano, N. Watanabe, Y. Yano, Y. Kuroda, M. Shirata, K. Dosaka, K. Nii, H. Noda, and H. Kawai, A 250-MHz 18-Mb full ternary CAM with low-voltage matchline sensing scheme in 65-nm CMOS, IEEE J. Solid-State Circuits, vol. 48, no. 11, pp , Nov [16] Y.-J. Chang, K.-L. Tsai, and H.-J. Tsai, Low leakage TCAM for IP lookup using two-side self-gating, IEEE Trans. Circuits Sys. I, Reg. Papers, vol. 60, no. 6, pp , Jun

12 1672 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 63, NO. 10, OCTOBER 2016 [17] Y.-J. Chang, Using the dynamic power source technique to reduce TCAM leakage power, IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 57, no. 11, pp , Nov [18] B.-D. Yang, Y.-K. Lee, S.-W. Sung, J.-J. Min, J.-M. Oh, and H.-J. Kang, A low power content addressable memory using low swing search lines, IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 58, no. 12, pp , Dec [19] K. Pagiamtzis and A. Sheikholeslami, A low-power content-addressable memory (CAM) using pipelined hierarchical search scheme, IEEE J. Solid-State Circuits, vol. 39, no. 9, pp , Sep [20] B.-D. Yang and L.-S. Kim, A low-power CAM using pulsed NAND- NOR match-line and charge-recycling search-line driver, IEEE J. Solid- State Circuits, vol. 55, no. 6, pp , Aug [21] I. Arsovski, T. Hebig, D. Dobson, and R. Wisort, A 32 nm 0.58-fJ/Bit/ search 1-GHz ternary content addressable memory compiler using siliconaware early-predict late-correct sensing with embedded deep-trench capacitor noise mitigation, IEEE J. Solid-State Circuits, vol. 48, no. 4, pp , Apr [22] Deering and R. Hinden, RFC 1883: Internet Protocol Version 6 (IPv6) Specification, Internet Engineering Task Force (IETF), Dec [23] B.-D. Yang, Low-power effective memory-size expanded TCAM using data-relocation scheme, IEEE J. Solid-State Circuits, vol. 50, no. 10, pp , Oct [24] K. Nii, T. Amano, N. Watanabe, M. Yamawaki, K. Yoshinaga, M. Wada, and I. Hayashi, A 28 nm 400 MHz 4-parallel 1.6 Gsearch/s 80 Mb ternary CAM, in 2014 IEEE ISSCC Dig. Tech. Papers, Feb. 2014, pp [25] W. Jiang and V. K. Prasanna, Scalable packet classification on FPGA, IEEE Trans. VLSI Syst., vol. 20, no. 9, pp , Sep [26] P. Gupta and N. McKeown, Classifying packets with hierarchical intelligent cuttings, IEEE Micro, vol. 20, no. 1, pp , Jan./Feb [27] V. Pus and J. Korenek, Fast and scalable packet classification using perfect hash functions, in Proc. ACM/SIGDA Int. Symp. FPGA, 2009, pp [28] P. T. Congdon, P. Mohapatra, M. Farrens, and V. Akella, Simultaneously reducing latency and power consumption in OpenFlow switches, IEEE/ACM Trans. Networking, vol. 22, no. 3, pp , Jun [29] M. Ross, Content Addressable Memory (CAM) With Accesses to Multiple CAM Arrays Used to Generate Result for Various Matching Sizes, U.S. Patent , Feb. 25, Ting-Sheng Chen received the B.S. degree in electrical engineering from National Chiao Tung University, Hsinchu, Taiwan, R.O.C., in He is currently working toward the Ph.D. degree from the Graduate Institute of Electronics Engineering, National Taiwan University, Taipei. His research interests are in the areas of softwaredefined networking, low-power circuit design, and VLSI implementation of DSP algorithms. Ding-Yuan Lee received the B.S. degree in electronics engineering from National Chiao Tung University, Hsinchu, Taiwan, in He is currently working toward the Ph.D. degree in the Graduate Institute of Electronics Engineering, National Taiwan University, Taipei. His research interests are in the areas of softwaredefined networking, arithmetic unit design, and VLSI implementation of DSP algorithms. Tsung-Te Liu received the B.S. and M.S. degrees from the National Taiwan University, Taipei, Taiwan, R.O.C., in 2002 and 2004, respectively, and the Ph.D. degree from the University of California, Berkeley, in 2012, all in electrical engineering. From 2004 to 2005, he was with MediaTek Inc., Taiwan, involved in circuit and system design for wireless communications. From 2005 to 2012, he was a member of the Berkeley Wireless Research Center (BWRC), University of California, Berkeley. From 2012 to 2014, he was with Interuniversity Microelectronics Centre (IMEC), Belgium, where he conducted research on circuit development for advanced CMOS technology. In 2014, he joined the faculty of the National Taiwan University, where he is currently an Assistant Professor of the Graduate Institute of Electronics Engineering and the Department of Electrical Engineering. His research interests involve energy-efficient circuit and system designs. He is the recipient of several design and teaching awards. An-Yeu (Andy) Wu (M 96 SM 12 F 15) received the B.S. degree from National Taiwan University, Taipei, Taiwan, R.O.C., in 1987, and the M.S. and Ph.D. degrees from the University of Maryland, College Park, in 1992 and 1995, respectively, all in electrical engineering. In August 2000, he joined the faculty of the Department of Electrical Engineering and the Graduate Institute of Electronics Engineering, National Taiwan University (NTU), where he is currently a Professor. His research interests include low-power/ high-performance VLSI architectures for DSP and communication applications, adaptive/multirate signal processing, reconfigurable broadband access systems and architectures, and System-on-Chip (SoC)/Network-on-Chip (NoC) platform for software/hardware co-design. He has published more than 190 refereed journal and conference papers in above research areas, together with five book chapters and 16 granted U.S. patents. From August 2007 to December 2009, he was on leave from NTU and served as the Deputy General Director of SoC Technology Center (STC), Industrial Technology Research Institute (ITRI), Hsinchu, Taiwan, supervising Parallel Core Architecture (PAC) VLIW DSP Processor and Multicore/Android SoC platform projects. Dr. Wu received the Outstanding Electrical Engineering Professor Award in 2010 from The Chinese Institute of Electrical Engineering (CIEE), Taiwan. He became an IEEE Fellow in 2015 for his contributions to DSP algorithms and VLSI designs for communication IC/SoC.

Filter-Based Dual-Voltage Architecture for Low-Power Long-Word TCAM Design

Filter-Based Dual-Voltage Architecture for Low-Power Long-Word TCAM Design Filter-Based Dual-Voltage Architecture for Low-Power Long-Word TCAM Design Ting-Sheng Chen, Ding-Yuan Lee, Tsung-Te Liu, and An-Yeu (Andy) Wu Graduate Institute of Electronics Engineering, National Taiwan

More information

Ternary Content Addressable Memory Types And Matchline Schemes

Ternary Content Addressable Memory Types And Matchline Schemes RESEARCH ARTICLE OPEN ACCESS Ternary Content Addressable Memory Types And Matchline Schemes Sulthana A 1, Meenakshi V 2 1 Student, ME - VLSI DESIGN, Sona College of Technology, Tamilnadu, India 2 Assistant

More information

Low Power using Match-Line Sensing in Content Addressable Memory S. Nachimuthu, S. Ramesh 1 Department of Electrical and Electronics Engineering,

Low Power using Match-Line Sensing in Content Addressable Memory S. Nachimuthu, S. Ramesh 1 Department of Electrical and Electronics Engineering, Low Power using Match-Line Sensing in Content Addressable Memory S. Nachimuthu, S. Ramesh 1 Department of Electrical and Electronics Engineering, K.S.R College of Engineering, Tiruchengode, Tamilnadu,

More information

A Hybrid Approach to CAM-Based Longest Prefix Matching for IP Route Lookup

A Hybrid Approach to CAM-Based Longest Prefix Matching for IP Route Lookup A Hybrid Approach to CAM-Based Longest Prefix Matching for IP Route Lookup Yan Sun and Min Sik Kim School of Electrical Engineering and Computer Science Washington State University Pullman, Washington

More information

POWER REDUCTION IN CONTENT ADDRESSABLE MEMORY

POWER REDUCTION IN CONTENT ADDRESSABLE MEMORY POWER REDUCTION IN CONTENT ADDRESSABLE MEMORY Latha A 1, Saranya G 2, Marutharaj T 3 1, 2 PG Scholar, Department of VLSI Design, 3 Assistant Professor Theni Kammavar Sangam College Of Technology, Theni,

More information

Power Gated Match Line Sensing Content Addressable Memory

Power Gated Match Line Sensing Content Addressable Memory International Journal of Embedded Systems, Robotics and Computer Engineering. Volume 1, Number 1 (2015), pp. 1-6 International Research Publication House http://www.irphouse.com Power Gated Match Line

More information

Design of high speed low power Content Addressable Memory (CAM) using parity bit and gated power matchline sensing

Design of high speed low power Content Addressable Memory (CAM) using parity bit and gated power matchline sensing Design of high speed low power Content Addressable Memory (CAM) using parity bit and gated power matchline sensing T. Sudalaimani 1, Mr. S. Muthukrishnan 2 1 PG Scholar, (Applied Electronics), 2 Associate

More information

A Low Power Content Addressable Memory Implemented In Deep Submicron Technology

A Low Power Content Addressable Memory Implemented In Deep Submicron Technology A Low Power ontent Addressable Memory Implemented In Deep Submicron Technology Divyashree.M, Bhagya. P P.G. Student, Dept. of E&, Don Bosco Institute of Technology, Bangalore, Karnataka, India Associate

More information

Resource Efficient Multi Ported Sram Based Ternary Content Addressable Memory

Resource Efficient Multi Ported Sram Based Ternary Content Addressable Memory IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 PP 11-18 www.iosrjen.org Resource Efficient Multi Ported Sram Based Ternary Content Addressable Memory S.Parkavi (1) And S.Bharath

More information

RECENTLY, researches on gigabit wireless personal area

RECENTLY, researches on gigabit wireless personal area 146 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 55, NO. 2, FEBRUARY 2008 An Indexed-Scaling Pipelined FFT Processor for OFDM-Based WPAN Applications Yuan Chen, Student Member, IEEE,

More information

Tree-Based Minimization of TCAM Entries for Packet Classification

Tree-Based Minimization of TCAM Entries for Packet Classification Tree-Based Minimization of TCAM Entries for Packet Classification YanSunandMinSikKim School of Electrical Engineering and Computer Science Washington State University Pullman, Washington 99164-2752, U.S.A.

More information

International Journal of Advance Engineering and Research Development LOW POWER AND HIGH PERFORMANCE MSML DESIGN FOR CAM USE OF MODIFIED XNOR CELL

International Journal of Advance Engineering and Research Development LOW POWER AND HIGH PERFORMANCE MSML DESIGN FOR CAM USE OF MODIFIED XNOR CELL Scientific Journal of Impact Factor (SJIF): 5.71 e-issn (O): 2348-4470 p-issn (P): 2348-6406 International Journal of Advance Engineering and Research Development Volume 5, Issue 04, April -2018 LOW POWER

More information

Implementation of Efficient Ternary Content Addressable Memory by Using Butterfly Technique

Implementation of Efficient Ternary Content Addressable Memory by Using Butterfly Technique International journal of scientific and technical research in engineering (IJSTRE) www.ijstre.com Volume 1 Issue 5 ǁ August 2016. Implementation of Efficient Ternary Content Addressable Memory by Using

More information

VERY large scale integration (VLSI) design for power

VERY large scale integration (VLSI) design for power IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 7, NO. 1, MARCH 1999 25 Short Papers Segmented Bus Design for Low-Power Systems J. Y. Chen, W. B. Jone, Member, IEEE, J. S. Wang,

More information

Design of low power Precharge-Free CAM design using parity based mechanism

Design of low power Precharge-Free CAM design using parity based mechanism Design of low power Precharge-Free CAM design using parity based mechanism S.A.Sivakumar 1, A.Swedha 2 Assistant Professor, ECE Department, Info Institute of Engineering, Coimbatore, India 1 PG Scholar,

More information

A Low Power 32 Bit CMOS ROM Using a Novel ATD Circuit

A Low Power 32 Bit CMOS ROM Using a Novel ATD Circuit International Journal of Electrical and Computer Engineering (IJECE) Vol. 3, No. 4, August 2013, pp. 509~515 ISSN: 2088-8708 509 A Low Power 32 Bit CMOS ROM Using a Novel ATD Circuit Sidhant Kukrety*,

More information

Optimized CAM Design

Optimized CAM Design Vol. 3, Issue. 5, Sep - Oct. 2013 pp-2640-2645 ISSN: 2249-6645 Optimized CAM Design S. Haroon Rasheed 1, M. Anand Vijay Kamalnath 2 Department of ECE, AVR & SVR E C T, Nandyal, India Abstract: Content-addressable

More information

A Survey on Design of Content Addressable Memories

A Survey on Design of Content Addressable Memories A Survey on Design of Content Addressable Memories R.Prashidha, G.Shanthi, K.Priyadharshini Abstract We survey the recent developments in the design of large-capacity content addressable memory (CAM).

More information

Content Addressable Memory performance Analysis using NAND Structure FinFET

Content Addressable Memory performance Analysis using NAND Structure FinFET Global Journal of Pure and Applied Mathematics. ISSN 0973-1768 Volume 12, Number 1 (2016), pp. 1077-1084 Research India Publications http://www.ripublication.com Content Addressable Memory performance

More information

DESIGN OF PARAMETER EXTRACTOR IN LOW POWER PRECOMPUTATION BASED CONTENT ADDRESSABLE MEMORY

DESIGN OF PARAMETER EXTRACTOR IN LOW POWER PRECOMPUTATION BASED CONTENT ADDRESSABLE MEMORY DESIGN OF PARAMETER EXTRACTOR IN LOW POWER PRECOMPUTATION BASED CONTENT ADDRESSABLE MEMORY Saroja pasumarti, Asst.professor, Department Of Electronics and Communication Engineering, Chaitanya Engineering

More information

A 256-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology

A 256-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology http://dx.doi.org/10.5573/jsts.014.14.6.760 JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.14, NO.6, DECEMBER, 014 A 56-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology Sung-Joon Lee

More information

THE latest generation of microprocessors uses a combination

THE latest generation of microprocessors uses a combination 1254 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 30, NO. 11, NOVEMBER 1995 A 14-Port 3.8-ns 116-Word 64-b Read-Renaming Register File Creigton Asato Abstract A 116-word by 64-b register file for a 154 MHz

More information

DESIGN ANDIMPLEMENTATION OF EFFICIENT TERNARY CONTENT ADDRESSABLE MEMORY

DESIGN ANDIMPLEMENTATION OF EFFICIENT TERNARY CONTENT ADDRESSABLE MEMORY DESIGN ANDIMPLEMENTATION OF EFFICIENT TERNARY CONTENT ADDRESSABLE MEMORY Gangadhar Akurathi 1, Suneel kumar Guntuku 2 and K.Babulu 3 1 Department of ECE, JNTUK-UCEV, Vizianagaram, Andhra Pradesh, India

More information

Packet Classification Using Dynamically Generated Decision Trees

Packet Classification Using Dynamically Generated Decision Trees 1 Packet Classification Using Dynamically Generated Decision Trees Yu-Chieh Cheng, Pi-Chung Wang Abstract Binary Search on Levels (BSOL) is a decision-tree algorithm for packet classification with superior

More information

Content Addressable Memory Using Automatic Charge Balancing with Self-Control Mechanism and Master-Slave Match Line Design

Content Addressable Memory Using Automatic Charge Balancing with Self-Control Mechanism and Master-Slave Match Line Design Circuits and Systems, 2016, 7, 597-611 Published Online May 2016 in SciRes. http://www.scirp.org/journal/cs http://dx.doi.org/10.4236/cs.2016.76051 Content Addressable Memory Using Automatic Charge Balancing

More information

DESIGN AND SIMULATION OF 1 BIT ARITHMETIC LOGIC UNIT DESIGN USING PASS-TRANSISTOR LOGIC FAMILIES

DESIGN AND SIMULATION OF 1 BIT ARITHMETIC LOGIC UNIT DESIGN USING PASS-TRANSISTOR LOGIC FAMILIES Volume 120 No. 6 2018, 4453-4466 ISSN: 1314-3395 (on-line version) url: http://www.acadpubl.eu/hub/ http://www.acadpubl.eu/hub/ DESIGN AND SIMULATION OF 1 BIT ARITHMETIC LOGIC UNIT DESIGN USING PASS-TRANSISTOR

More information

Research Scholar, Chandigarh Engineering College, Landran (Mohali), 2

Research Scholar, Chandigarh Engineering College, Landran (Mohali), 2 IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY Optimize Parity Encoding for Power Reduction in Content Addressable Memory Nisha Sharma, Manmeet Kaur 1 Research Scholar, Chandigarh

More information

Unique Journal of Engineering and Advanced Sciences Available online: Research Article

Unique Journal of Engineering and Advanced Sciences Available online:   Research Article ISSN 2348-375X Unique Journal of Engineering and Advanced Sciences Available online: www.ujconline.net Research Article A POWER EFFICIENT CAM DESIGN USING MODIFIED PARITY BIT MATCHING TECHNIQUE Karthik

More information

Implementation of SCN Based Content Addressable Memory

Implementation of SCN Based Content Addressable Memory IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p- ISSN: 2278-8735.Volume 12, Issue 4, Ver. II (Jul.-Aug. 2017), PP 48-52 www.iosrjournals.org Implementation of

More information

Content Addressable Memory with Efficient Power Consumption and Throughput

Content Addressable Memory with Efficient Power Consumption and Throughput International journal of Emerging Trends in Science and Technology Content Addressable Memory with Efficient Power Consumption and Throughput Authors Karthik.M 1, R.R.Jegan 2, Dr.G.K.D.Prasanna Venkatesan

More information

NOISE ELIMINATION USING A BIT CAMS

NOISE ELIMINATION USING A BIT CAMS International Journal of VLSI Design, 2(2), 2011, pp. 97-101 NOISE ELIMINATION USING A BIT CAMS Sundar Srinivas Kuchibhotla 1 & Naga Lakshmi Kalyani Movva 2 1 Department of Electronics & Communication

More information

EMDBAM: A Low-Power Dual Bit Associative Memory with Match Error and Mask Control

EMDBAM: A Low-Power Dual Bit Associative Memory with Match Error and Mask Control Abstract: EMDBAM: A Low-Power Dual Bit Associative Memory with Match Error and Mask Control A ternary content addressable memory (TCAM) speeds up the search process in the memory by searching through prestored

More information

Design of a High Speed FPGA-Based Classifier for Efficient Packet Classification

Design of a High Speed FPGA-Based Classifier for Efficient Packet Classification Design of a High Speed FPGA-Based Classifier for Efficient Packet Classification V.S.Pallavi 1, Dr.D.Rukmani Devi 2 PG Scholar 1, Department of ECE, RMK Engineering College, Chennai, Tamil Nadu, India

More information

Design of Low Power Wide Gates used in Register File and Tag Comparator

Design of Low Power Wide Gates used in Register File and Tag Comparator www..org 1 Design of Low Power Wide Gates used in Register File and Tag Comparator Isac Daimary 1, Mohammed Aneesh 2 1,2 Department of Electronics Engineering, Pondicherry University Pondicherry, 605014,

More information

INTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 9 /Issue 3 / OCT 2017

INTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 9 /Issue 3 / OCT 2017 Design of Low Power Adder in ALU Using Flexible Charge Recycling Dynamic Circuit Pallavi Mamidala 1 K. Anil kumar 2 mamidalapallavi@gmail.com 1 anilkumar10436@gmail.com 2 1 Assistant Professor, Dept of

More information

CALCULATION OF POWER CONSUMPTION IN 7 TRANSISTOR SRAM CELL USING CADENCE TOOL

CALCULATION OF POWER CONSUMPTION IN 7 TRANSISTOR SRAM CELL USING CADENCE TOOL CALCULATION OF POWER CONSUMPTION IN 7 TRANSISTOR SRAM CELL USING CADENCE TOOL Shyam Akashe 1, Ankit Srivastava 2, Sanjay Sharma 3 1 Research Scholar, Deptt. of Electronics & Comm. Engg., Thapar Univ.,

More information

DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech)

DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech) DYNAMIC CIRCUIT TECHNIQUE FOR LOW- POWER MICROPROCESSORS Kuruva Hanumantha Rao 1 (M.tech) K.Prasad Babu 2 M.tech (Ph.d) hanumanthurao19@gmail.com 1 kprasadbabuece433@gmail.com 2 1 PG scholar, VLSI, St.JOHNS

More information

DUE to the high computational complexity and real-time

DUE to the high computational complexity and real-time IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 15, NO. 3, MARCH 2005 445 A Memory-Efficient Realization of Cyclic Convolution and Its Application to Discrete Cosine Transform Hun-Chen

More information

Design of a Low Power Content Addressable Memory (CAM)

Design of a Low Power Content Addressable Memory (CAM) Design of a Low Power Content Addressable Memory (CAM) Scott Beamer, Mehmet Akgul Department of Electrical Engineering & Computer Science University of California, Berkeley {sbeamer, akgul}@eecs.berkeley.edu

More information

Implementation of Boundary Cutting Algorithm Using Packet Classification

Implementation of Boundary Cutting Algorithm Using Packet Classification Implementation of Boundary Cutting Algorithm Using Packet Classification Dasari Mallesh M.Tech Student Department of CSE Vignana Bharathi Institute of Technology, Hyderabad. ABSTRACT: Decision-tree-based

More information

THE orthogonal frequency-division multiplex (OFDM)

THE orthogonal frequency-division multiplex (OFDM) 26 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 57, NO. 1, JANUARY 2010 A Generalized Mixed-Radix Algorithm for Memory-Based FFT Processors Chen-Fong Hsiao, Yuan Chen, Member, IEEE,

More information

High Performance Memory Read Using Cross-Coupled Pull-up Circuitry

High Performance Memory Read Using Cross-Coupled Pull-up Circuitry High Performance Memory Read Using Cross-Coupled Pull-up Circuitry Katie Blomster and José G. Delgado-Frias School of Electrical Engineering and Computer Science Washington State University Pullman, WA

More information

INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET)

INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET) INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECNOLOGY (IJECET) International Journal of Electronics and Communication Engineering & Technology (IJECET), ISSN 976 6464(Print), ISSN

More information

LOW POWER SRAM CELL WITH IMPROVED RESPONSE

LOW POWER SRAM CELL WITH IMPROVED RESPONSE LOW POWER SRAM CELL WITH IMPROVED RESPONSE Anant Anand Singh 1, A. Choubey 2, Raj Kumar Maddheshiya 3 1 M.tech Scholar, Electronics and Communication Engineering Department, National Institute of Technology,

More information

Design of Low-Power and Low-Latency 256-Radix Crossbar Switch Using Hyper-X Network Topology

Design of Low-Power and Low-Latency 256-Radix Crossbar Switch Using Hyper-X Network Topology JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.15, NO.1, FEBRUARY, 2015 http://dx.doi.org/10.5573/jsts.2015.15.1.077 Design of Low-Power and Low-Latency 256-Radix Crossbar Switch Using Hyper-X Network

More information

6T- SRAM for Low Power Consumption. Professor, Dept. of ExTC, PRMIT &R, Badnera, Amravati, Maharashtra, India 1

6T- SRAM for Low Power Consumption. Professor, Dept. of ExTC, PRMIT &R, Badnera, Amravati, Maharashtra, India 1 6T- SRAM for Low Power Consumption Mrs. J.N.Ingole 1, Ms.P.A.Mirge 2 Professor, Dept. of ExTC, PRMIT &R, Badnera, Amravati, Maharashtra, India 1 PG Student [Digital Electronics], Dept. of ExTC, PRMIT&R,

More information

MTJ-Based Nonvolatile Logic-in-Memory Architecture

MTJ-Based Nonvolatile Logic-in-Memory Architecture 2011 Spintronics Workshop on LSI @ Kyoto, Japan, June 13, 2011 MTJ-Based Nonvolatile Logic-in-Memory Architecture Takahiro Hanyu Center for Spintronics Integrated Systems, Tohoku University, JAPAN Laboratory

More information

Content Addressable Memory Architecture based on Sparse Clustered Networks

Content Addressable Memory Architecture based on Sparse Clustered Networks IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 6, Issue 6, Ver. III (Nov. - Dec. 2016), PP 21-28 e-issn: 2319 4200, p-issn No. : 2319 4197 www.iosrjournals.org Content Addressable Memory

More information

DESIGN AND PERFORMANCE ANALYSIS OF CARRY SELECT ADDER

DESIGN AND PERFORMANCE ANALYSIS OF CARRY SELECT ADDER DESIGN AND PERFORMANCE ANALYSIS OF CARRY SELECT ADDER Bhuvaneswaran.M 1, Elamathi.K 2 Assistant Professor, Muthayammal Engineering college, Rasipuram, Tamil Nadu, India 1 Assistant Professor, Muthayammal

More information

International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering

International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering IP-SRAM ARCHITECTURE AT DEEP SUBMICRON CMOS TECHNOLOGY A LOW POWER DESIGN D. Harihara Santosh 1, Lagudu Ramesh Naidu 2 Assistant professor, Dept. of ECE, MVGR College of Engineering, Andhra Pradesh, India

More information

Design and Simulation of Low Power 6TSRAM and Control its Leakage Current Using Sleepy Keeper Approach in different Topology

Design and Simulation of Low Power 6TSRAM and Control its Leakage Current Using Sleepy Keeper Approach in different Topology Vol. 3, Issue. 3, May.-June. 2013 pp-1475-1481 ISSN: 2249-6645 Design and Simulation of Low Power 6TSRAM and Control its Leakage Current Using Sleepy Keeper Approach in different Topology Bikash Khandal,

More information

Survey on Stability of Low Power SRAM Bit Cells

Survey on Stability of Low Power SRAM Bit Cells International Journal of Electronics Engineering Research. ISSN 0975-6450 Volume 9, Number 3 (2017) pp. 441-447 Research India Publications http://www.ripublication.com Survey on Stability of Low Power

More information

INTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 9 /Issue 5 / JAN 2018

INTERNATIONAL JOURNAL OF PROFESSIONAL ENGINEERING STUDIES Volume 9 /Issue 5 / JAN 2018 Implementation of an Efficient Content Addressable Memory Based on Sparse Cluster Network Basumala Chandram 1 Dr. P.Sreenivasulu 2 Dr. P. Prasanna Muralikrishna 3 chandu6240@gmail.com 1 psrinivas.vlsi@gmail.com

More information

Memory Design I. Array-Structured Memory Architecture. Professor Chris H. Kim. Dept. of ECE.

Memory Design I. Array-Structured Memory Architecture. Professor Chris H. Kim. Dept. of ECE. Memory Design I Professor Chris H. Kim University of Minnesota Dept. of ECE chriskim@ece.umn.edu Array-Structured Memory Architecture 2 1 Semiconductor Memory Classification Read-Write Wi Memory Non-Volatile

More information

Low Power Circuits using Modified Gate Diffusion Input (GDI)

Low Power Circuits using Modified Gate Diffusion Input (GDI) IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 4, Issue 5, Ver. II (Sep-Oct. 2014), PP 70-76 e-issn: 2319 4200, p-issn No. : 2319 4197 Low Power Circuits using Modified Gate Diffusion Input

More information

An Area-Efficient BIRA With 1-D Spare Segments

An Area-Efficient BIRA With 1-D Spare Segments 206 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 26, NO. 1, JANUARY 2018 An Area-Efficient BIRA With 1-D Spare Segments Donghyun Kim, Hayoung Lee, and Sungho Kang Abstract The

More information

120 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 61, NO. 2, FEBRUARY 2014

120 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 61, NO. 2, FEBRUARY 2014 120 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 61, NO. 2, FEBRUARY 2014 VL-ECC: Variable Data-Length Error Correction Code for Embedded Memory in DSP Applications Jangwon Park,

More information

ISSCC 2006 / SESSION 22 / LOW POWER MULTIMEDIA / 22.1

ISSCC 2006 / SESSION 22 / LOW POWER MULTIMEDIA / 22.1 ISSCC 26 / SESSION 22 / LOW POWER MULTIMEDIA / 22.1 22.1 A 125µW, Fully Scalable MPEG-2 and H.264/AVC Video Decoder for Mobile Applications Tsu-Ming Liu 1, Ting-An Lin 2, Sheng-Zen Wang 2, Wen-Ping Lee

More information

A 65nm Parallel Prefix High Speed Tree based 64-Bit Binary Comparator

A 65nm Parallel Prefix High Speed Tree based 64-Bit Binary Comparator A 65nm Parallel Prefix High Speed Tree based 64-Bit Binary Comparator Er. Deepak sharma Student M-TECH, Yadavindra College of Engineering, Punjabi University, Guru Kashi Campus,Talwandi Sabo. Er. Parminder

More information

A Review Paper on Reconfigurable Techniques to Improve Critical Parameters of SRAM

A Review Paper on Reconfigurable Techniques to Improve Critical Parameters of SRAM IJSRD - International Journal for Scientific Research & Development Vol. 4, Issue 09, 2016 ISSN (online): 2321-0613 A Review Paper on Reconfigurable Techniques to Improve Critical Parameters of SRAM Yogit

More information

/$ IEEE

/$ IEEE IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 56, NO. 1, JANUARY 2009 81 Bit-Level Extrinsic Information Exchange Method for Double-Binary Turbo Codes Ji-Hoon Kim, Student Member,

More information

International Journal of Scientific & Engineering Research, Volume 5, Issue 2, February ISSN

International Journal of Scientific & Engineering Research, Volume 5, Issue 2, February ISSN International Journal of Scientific & Engineering Research, Volume 5, Issue 2, February-2014 938 LOW POWER SRAM ARCHITECTURE AT DEEP SUBMICRON CMOS TECHNOLOGY T.SANKARARAO STUDENT OF GITAS, S.SEKHAR DILEEP

More information

Analysis of 8T SRAM with Read and Write Assist Schemes (UDVS) In 45nm CMOS Technology

Analysis of 8T SRAM with Read and Write Assist Schemes (UDVS) In 45nm CMOS Technology Analysis of 8T SRAM with Read and Write Assist Schemes (UDVS) In 45nm CMOS Technology Srikanth Lade 1, Pradeep Kumar Urity 2 Abstract : UDVS techniques are presented in this paper to minimize the power

More information

STUDY OF SRAM AND ITS LOW POWER TECHNIQUES

STUDY OF SRAM AND ITS LOW POWER TECHNIQUES INTERNATIONAL JOURNAL OF ELECTRONICS AND COMMUNICATION ENGINEERING & TECHNOLOGY (IJECET) International Journal of Electronics and Communication Engineering & Technology (IJECET), ISSN ISSN 0976 6464(Print)

More information

Minimizing Power Dissipation during Write Operation to Register Files

Minimizing Power Dissipation during Write Operation to Register Files Minimizing Power Dissipation during Operation to Register Files Abstract - This paper presents a power reduction mechanism for the write operation in register files (RegFiles), which adds a conditional

More information

Multi-core Implementation of Decomposition-based Packet Classification Algorithms 1

Multi-core Implementation of Decomposition-based Packet Classification Algorithms 1 Multi-core Implementation of Decomposition-based Packet Classification Algorithms 1 Shijie Zhou, Yun R. Qu, and Viktor K. Prasanna Ming Hsieh Department of Electrical Engineering, University of Southern

More information

The Serial Commutator FFT

The Serial Commutator FFT The Serial Commutator FFT Mario Garrido Gálvez, Shen-Jui Huang, Sau-Gee Chen and Oscar Gustafsson Journal Article N.B.: When citing this work, cite the original article. 2016 IEEE. Personal use of this

More information

Area Efficient Z-TCAM for Network Applications

Area Efficient Z-TCAM for Network Applications Area Efficient Z-TCAM for Network Applications Vishnu.S P.G Scholar, Applied Electronics, Coimbatore Institute of Technology. Ms.K.Vanithamani Associate Professor, Department of EEE, Coimbatore Institute

More information

554 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 16, NO. 5, MAY 2008

554 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 16, NO. 5, MAY 2008 554 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 16, NO. 5, MAY 2008 Ternary CAM Power and Delay Model: Extensions and Uses Banit Agrawal, Student Member, IEEE, and Timothy Sherwood,

More information

Implementation of Asynchronous Topology using SAPTL

Implementation of Asynchronous Topology using SAPTL Implementation of Asynchronous Topology using SAPTL NARESH NAGULA *, S. V. DEVIKA **, SK. KHAMURUDDEEN *** *(senior software Engineer & Technical Lead, Xilinx India) ** (Associate Professor, Department

More information

Design and Implementation of High Performance Application Specific Memory

Design and Implementation of High Performance Application Specific Memory Design and Implementation of High Performance Application Specific Memory - 고성능 Application Specific Memory 의설계와구현 - M.S. Thesis Sungdae Choi Dec. 20th, 2002 Outline Introduction Memory for Mobile 3D Graphics

More information

Survey on Content Addressable Memory and Sparse Clustered Network

Survey on Content Addressable Memory and Sparse Clustered Network Survey on Content Addressable Memory and Sparse Clustered Network 1 Aswathy K.S., 2 K. Gnana Sheela 1,2 Department of Electronics and Communication, Toc H Institute of Science and Technology Kerala, India

More information

LOGIC EFFORT OF CMOS BASED DUAL MODE LOGIC GATES

LOGIC EFFORT OF CMOS BASED DUAL MODE LOGIC GATES LOGIC EFFORT OF CMOS BASED DUAL MODE LOGIC GATES D.Rani, R.Mallikarjuna Reddy ABSTRACT This logic allows operation in two modes: 1) static and2) dynamic modes. DML gates, which can be switched between

More information

A Low Power SRAM Base on Novel Word-Line Decoding

A Low Power SRAM Base on Novel Word-Line Decoding Vol:, No:3, 008 A Low Power SRAM Base on Novel Word-Line Decoding Arash Azizi Mazreah, Mohammad T. Manzuri Shalmani, Hamid Barati, Ali Barati, and Ali Sarchami International Science Index, Computer and

More information

Selective Boundary Cutting For Packet Classification SOUMYA. K 1, CHANDRA SEKHAR. M 2

Selective Boundary Cutting For Packet Classification SOUMYA. K 1, CHANDRA SEKHAR. M 2 ISSN 2319-8885 Vol.04,Issue.34, August-2015, Pages:6786-6790 www.ijsetr.com SOUMYA. K 1, CHANDRA SEKHAR. M 2 1 Navodaya Institute of Technology, Raichur, Karnataka, India, E-mail: Keerthisree1112@gmail.com.

More information

IMPLEMENTATION OF LOW POWER AREA EFFICIENT ALU WITH LOW POWER FULL ADDER USING MICROWIND DSCH3

IMPLEMENTATION OF LOW POWER AREA EFFICIENT ALU WITH LOW POWER FULL ADDER USING MICROWIND DSCH3 IMPLEMENTATION OF LOW POWER AREA EFFICIENT ALU WITH LOW POWER FULL ADDER USING MICROWIND DSCH3 Ritafaria D 1, Thallapalli Saibaba 2 Assistant Professor, CJITS, Janagoan, T.S, India Abstract In this paper

More information

Power Analysis for CMOS based Dual Mode Logic Gates using Power Gating Techniques

Power Analysis for CMOS based Dual Mode Logic Gates using Power Gating Techniques Power Analysis for CMOS based Dual Mode Logic Gates using Power Gating Techniques S. Nand Singh Dr. R. Madhu M. Tech (VLSI Design) Assistant Professor UCEK, JNTUK. UCEK, JNTUK Abstract: Low power technology

More information

Analysis of 8T SRAM Cell Using Leakage Reduction Technique

Analysis of 8T SRAM Cell Using Leakage Reduction Technique Analysis of 8T SRAM Cell Using Leakage Reduction Technique Sandhya Patel and Somit Pandey Abstract The purpose of this manuscript is to decrease the leakage current and a memory leakage power SRAM cell

More information

A Comparative Study of Power Efficient SRAM Designs

A Comparative Study of Power Efficient SRAM Designs A Comparative tudy of Power Efficient RAM Designs Jeyran Hezavei, N. Vijaykrishnan, M. J. Irwin Pond Laboratory, Department of Computer cience & Engineering, Pennsylvania tate University {hezavei, vijay,

More information

A Normal I/O Order Radix-2 FFT Architecture to Process Twin Data Streams for MIMO

A Normal I/O Order Radix-2 FFT Architecture to Process Twin Data Streams for MIMO 2402 IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 24, NO. 6, JUNE 2016 A Normal I/O Order Radix-2 FFT Architecture to Process Twin Data Streams for MIMO Antony Xavier Glittas,

More information

Gated-Demultiplexer Tree Buffer for Low Power Using Clock Tree Based Gated Driver

Gated-Demultiplexer Tree Buffer for Low Power Using Clock Tree Based Gated Driver Gated-Demultiplexer Tree Buffer for Low Power Using Clock Tree Based Gated Driver E.Kanniga 1, N. Imocha Singh 2,K.Selva Rama Rathnam 3 Professor Department of Electronics and Telecommunication, Bharath

More information

Integrated Circuits & Systems

Integrated Circuits & Systems Federal University of Santa Catarina Center for Technology Computer Science & Electronics Engineering Integrated Circuits & Systems INE 5442 Lecture 23-1 guntzel@inf.ufsc.br Semiconductor Memory Classification

More information

Semiconductor Memory Classification. Today. ESE 570: Digital Integrated Circuits and VLSI Fundamentals. CPU Memory Hierarchy.

Semiconductor Memory Classification. Today. ESE 570: Digital Integrated Circuits and VLSI Fundamentals. CPU Memory Hierarchy. ESE 57: Digital Integrated Circuits and VLSI Fundamentals Lec : April 4, 7 Memory Overview, Memory Core Cells Today! Memory " Classification " ROM Memories " RAM Memory " Architecture " Memory core " SRAM

More information

Near Optimal Repair Rate Built-in Redundancy Analysis with Very Small Hardware Overhead

Near Optimal Repair Rate Built-in Redundancy Analysis with Very Small Hardware Overhead Near Optimal Repair Rate Built-in Redundancy Analysis with Very Small Hardware Overhead Woosung Lee, Keewon Cho, Jooyoung Kim, and Sungho Kang Department of Electrical & Electronic Engineering, Yonsei

More information

THE turbo code is one of the most attractive forward error

THE turbo code is one of the most attractive forward error IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS, VOL. 63, NO. 2, FEBRUARY 2016 211 Memory-Reduced Turbo Decoding Architecture Using NII Metric Compression Youngjoo Lee, Member, IEEE, Meng

More information

ELECTROSTATIC discharge (ESD) is a transient process

ELECTROSTATIC discharge (ESD) is a transient process 320 IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING, VOL. 18, NO. 2, MAY 2005 SCR Device Fabricated With Dummy-Gate Structure to Improve Turn-On Speed for Effective ESD Protection in CMOS Technology Ming-Dou

More information

Novel low power CAM architecture

Novel low power CAM architecture Rochester Institute of Technology RIT Scholar Works Theses Thesis/Dissertation Collections 8-1-2008 Novel low power CAM architecture Ka Fai Ng Follow this and additional works at: http://scholarworks.rit.edu/theses

More information

Power-Mode-Aware Buffer Synthesis for Low-Power Clock Skew Minimization

Power-Mode-Aware Buffer Synthesis for Low-Power Clock Skew Minimization This article has been accepted and published on J-STAGE in advance of copyediting. Content is final as presented. IEICE Electronics Express, Vol.* No.*,*-* Power-Mode-Aware Buffer Synthesis for Low-Power

More information

Column decoder using PTL for memory

Column decoder using PTL for memory IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p- ISSN: 2278-8735. Volume 5, Issue 4 (Mar. - Apr. 2013), PP 07-14 Column decoder using PTL for memory M.Manimaraboopathy

More information

Performance Improvement of Hardware-Based Packet Classification Algorithm

Performance Improvement of Hardware-Based Packet Classification Algorithm Performance Improvement of Hardware-Based Packet Classification Algorithm Yaw-Chung Chen 1, Pi-Chung Wang 2, Chun-Liang Lee 2, and Chia-Tai Chan 2 1 Department of Computer Science and Information Engineering,

More information

Analysis and Design of Low Voltage Low Noise LVDS Receiver

Analysis and Design of Low Voltage Low Noise LVDS Receiver IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p- ISSN: 2278-8735.Volume 9, Issue 2, Ver. V (Mar - Apr. 2014), PP 10-18 Analysis and Design of Low Voltage Low Noise

More information

OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions

OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions OUTLINE Introduction Power Components Dynamic Power Optimization Conclusions 04/15/14 1 Introduction: Low Power Technology Process Hardware Architecture Software Multi VTH Low-power circuits Parallelism

More information

Simulation and Analysis of SRAM Cell Structures at 90nm Technology

Simulation and Analysis of SRAM Cell Structures at 90nm Technology Vol.1, Issue.2, pp-327-331 ISSN: 2249-6645 Simulation and Analysis of SRAM Cell Structures at 90nm Technology Sapna Singh 1, Neha Arora 2, Prof. B.P. Singh 3 (Faculty of Engineering and Technology, Mody

More information

Parallel-computing approach for FFT implementation on digital signal processor (DSP)

Parallel-computing approach for FFT implementation on digital signal processor (DSP) Parallel-computing approach for FFT implementation on digital signal processor (DSP) Yi-Pin Hsu and Shin-Yu Lin Abstract An efficient parallel form in digital signal processor can improve the algorithm

More information

A Single Ended SRAM cell with reduced Average Power and Delay

A Single Ended SRAM cell with reduced Average Power and Delay A Single Ended SRAM cell with reduced Average Power and Delay Kritika Dalal 1, Rajni 2 1M.tech scholar, Electronics and Communication Department, Deen Bandhu Chhotu Ram University of Science and Technology,

More information

Design of local ESD clamp for cross-power-domain interface circuits

Design of local ESD clamp for cross-power-domain interface circuits This article has been accepted and published on J-STAGE in advance of copyediting. Content is final as presented. IEICE Electronics Express, Vol.* No.*,*-* Design of local ESD clamp for cross-power-domain

More information

DESIGN AND IMPLEMENTATION OF OPTIMIZED PACKET CLASSIFIER

DESIGN AND IMPLEMENTATION OF OPTIMIZED PACKET CLASSIFIER International Journal of Computer Engineering and Applications, Volume VI, Issue II, May 14 www.ijcea.com ISSN 2321 3469 DESIGN AND IMPLEMENTATION OF OPTIMIZED PACKET CLASSIFIER Kiran K C 1, Sunil T D

More information

A CORDIC Algorithm with Improved Rotation Strategy for Embedded Applications

A CORDIC Algorithm with Improved Rotation Strategy for Embedded Applications A CORDIC Algorithm with Improved Rotation Strategy for Embedded Applications Kui-Ting Chen Research Center of Information, Production and Systems, Waseda University, Fukuoka, Japan Email: nore@aoni.waseda.jp

More information

Very Large Scale Integration (VLSI)

Very Large Scale Integration (VLSI) Very Large Scale Integration (VLSI) Lecture 8 Dr. Ahmed H. Madian ah_madian@hotmail.com Content Array Subsystems Introduction General memory array architecture SRAM (6-T cell) CAM Read only memory Introduction

More information

FAST FOURIER TRANSFORM (FFT) and inverse fast

FAST FOURIER TRANSFORM (FFT) and inverse fast IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 39, NO. 11, NOVEMBER 2004 2005 A Dynamic Scaling FFT Processor for DVB-T Applications Yu-Wei Lin, Hsuan-Yu Liu, and Chen-Yi Lee Abstract This paper presents an

More information