Novel IP Address Lookup Algorithm for Inexpensive Hardware Implementation


KARI SEPPÄNEN
Information Technology Research Institute
Technical Research Centre of Finland (VTT)
PO Box 1202, FIN-02044 VTT, Finland

Abstract: The key factor defining the efficiency of IP routers is the speed of the forwarding operation, that is, the speed of determining the next-hop destination for each packet. The operation is not simple because IP addresses are unstructured and the destination subnetworks can overlap. This requires a so-called longest match lookup operation. In this paper I propose a simple and very fast address lookup algorithm that can be easily implemented in hardware. It is designed for inexpensive systems and thus requires only standard SRAM and FPGA devices. However, its performance exceeds even the requirements of today's backbone routers, and it allows for incremental forwarding table updates.

Key-Words: IP routing, address lookup, Gigabit routers

1 Introduction

The operation of IP networks is based on connectionless datagram routing performed hop-by-hop from the source host to the destination host. While it is possible to define the route explicitly at the source end and include that information in the datagram, the existing networks operate solely in hop-by-hop routing mode to avoid excess overhead. The hop-by-hop routing is based on determining the next-hop destination according to the destination address included in each datagram. The next hop is defined by a forwarding table that is maintained in each network node doing routing operations, that is, an IP router. A forwarding table contains a set of IP subnetwork definitions and, for each subnetwork, the address of the desired next-hop destination.

One of the key factors determining the efficiency of an IP router is the speed of the forwarding operation, that is, the time it takes to resolve the next-hop address based on the destination address of a packet. What makes it complicated is the fact that IP version 4 (IPv4) addresses are unstructured (the so-called classless interdomain routing, CIDR, scheme) and thus simple lookup algorithms are not suitable. Moreover, the route specifications can be overlapping, that is, there can be smaller address ranges specified inside a larger address range. These address ranges are called prefixes, which are composed of a network address and its length. So there can be overlapping prefix definitions such as 138.0.0.0/8 (meaning addresses whose first 8 bits equal 138) and a longer /10 prefix inside that range. The IP routing policy is defined so that the longest prefix definition matching the destination address always defines the next hop. This is the so-called longest match principle [1].

An additional constraint in designing an efficient forwarding algorithm is the dynamic nature of routing information. In principle all routing information changes, such as new routes and new subnetworks, are visible to all backbone routers. These routers have to process the changes on the fly and adjust the forwarding table accordingly. However, if updates require extensive computation, a vast number of memory accesses, or even total reconstruction of the search structure, they can degrade the performance of a router considerably. This could be quite serious in certain abnormal, but not rare, situations such as a failure of an important backbone router or in the case of route flapping [2, p. 233]. All this requires a lookup structure that can be updated with a reasonable workload and without stopping datagram forwarding.
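As a concrete illustration of the longest match principle (an illustration only, not code from the paper), the following small C program keeps the longest matching prefix found in a linear scan of a two-entry table; the 138.0.0.0/8 prefix, the /10 inside it, and the next-hop numbers are made-up example values.

    #include <stdint.h>
    #include <stdio.h>

    /* A prefix is a network address plus its length in bits. */
    struct prefix { uint32_t net; int len; int next_hop; };

    /* Do the first 'len' bits of 'addr' equal 'net'? */
    static int matches(uint32_t addr, uint32_t net, int len)
    {
        uint32_t mask = len ? 0xFFFFFFFFu << (32 - len) : 0;
        return (addr & mask) == (net & mask);
    }

    /* Longest prefix match by linear scan: keep the longest matching prefix. */
    static int lookup(const struct prefix *tbl, int n, uint32_t addr)
    {
        int best_len = -1, best_hop = -1;
        for (int i = 0; i < n; i++)
            if (matches(addr, tbl[i].net, tbl[i].len) && tbl[i].len > best_len) {
                best_len = tbl[i].len;
                best_hop = tbl[i].next_hop;
            }
        return best_hop;   /* -1 means no match */
    }

    int main(void)
    {
        /* Hypothetical overlapping prefixes: 138.0.0.0/8 and 138.192.0.0/10. */
        struct prefix tbl[] = {
            { 0x8A000000u,  8, 1 },   /* 138.0.0.0/8    -> next hop 1 */
            { 0x8AC00000u, 10, 2 },   /* 138.192.0.0/10 -> next hop 2 */
        };
        uint32_t addr = 0x8AC10203u;  /* 138.193.2.3: both match, /10 wins */
        printf("next hop %d\n", lookup(tbl, 2, addr));
        return 0;
    }

Real routers cannot afford the linear scan, which is exactly why the lookup structures discussed below exist; the sketch only pins down what "longest match" means.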
Until now there have been only two real alternatives for implementing the IP address lookup mechanism: either to use a general-purpose CPU with a software-based algorithm or to use an application-specific integrated circuit (ASIC) with a hard-wired algorithm.

Both approaches have their weaknesses, such as the poor performance of software algorithms or the long design period and large NRE costs of ASICs. However, the ongoing development of field programmable gate arrays (FPGA) and various fast and flexible memory devices, such as zero bus turnaround static random access memory (ZBT-SRAM), has created a third alternative approach. This combination offers performance on par with an ASIC as well as the rapid and versatile development process of software. Furthermore, a cleanly designed submodule implementing the lookup algorithm can be easily integrated into any FPGA or ASIC design requiring its functionality. In this paper I first give a short overview of the existing algorithms and point out some of their weaknesses in the light of inexpensive hardware implementation. Then I describe the proposed algorithm and show how it could be implemented efficiently using only simple general-purpose hardware components. To conclude, I present some results from performance simulations showing, e.g., memory consumption, search structure construction time, and average search times.

2 Existing Algorithms

There are many excellent articles on the classification and efficiency of different address lookup algorithms, such as [1, 3]. In this paper the previous work is not redone but is taken advantage of to find out how suitable these algorithms could be for an FPGA implementation. There are many ways to divide the lookup algorithms into groups, but I have used a quite crude method: they are divided into software and hardware based ones. The reason for this division is that the constraints given by the implementation environment differ considerably in those two groups.

2.1 Software algorithms

A classical way to represent prefixes is a tree-based data structure called a trie, where the bits of the prefixes are used to direct branching. A simple binary trie is straightforward to generate and it allows for easy incremental updates. However, there are some critical problems: the worst-case search time is long (32 steps for 32-bit IPv4 addresses) and the memory efficiency is not very good. The traditional way to overcome these problems has been path compression techniques (PATRICIA and the BSD trie). However, path compression does not guarantee short search times; actually the worst-case search time remains the same or is even doubled if backtracking is used. Furthermore, the memory efficiency of these algorithms decreases as more prefixes are added, i.e. as the trie gets denser. There are two basic ways to achieve better performance: to use either multibit tries or prefix range search. However, the latter alternative does not guarantee short worst-case search times and thus it is not considered further in this paper. A multibit trie operates on multiple bits simultaneously. The set of bits inspected in one step is called a stride (see the sketch below). Depending on the algorithm, all the strides in a trie can have the same size, i.e. a fixed-stride multibit trie, or the strides can have different sizes, i.e. a variable-stride multibit trie. There are trade-offs in selecting a suitable stride size: a large stride gives short search times as the trie has fewer levels, but at the same time there will be lots of empty entries, resulting in a large memory image and harder updates. The existing multibit trie algorithms use several improvements to the basic scheme to achieve better memory efficiency. One method is to use path compression with multibit tries as in the level compressed (LC) trie [4].
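To make the stride terminology concrete, here is a minimal sketch (mine, not the paper's) of how a fixed-stride trie consumes an IPv4 address: with 8-bit strides the address splits into four chunks and each chunk indexes one trie level, which is where the four-step worst case of an 8-bit fixed-stride trie comes from.

    #include <stdint.h>
    #include <stdio.h>

    /* Extract the k-bit chunk used at a given trie level (level 0 uses the
     * most significant bits).  With k = 8 an IPv4 address yields at most
     * four levels and therefore at most four memory references. */
    static unsigned stride_index(uint32_t addr, int level, int k)
    {
        int shift = 32 - (level + 1) * k;
        return (addr >> shift) & ((1u << k) - 1);
    }

    int main(void)
    {
        uint32_t addr = 0xC0A80A01u;              /* 192.168.10.1 */
        for (int level = 0; level < 4; level++)   /* 8-bit strides -> 4 levels */
            printf("level %d index %u\n", level, stride_index(addr, level, 8));
        return 0;
    }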
Another solution is to use compression with fixed-stride tries, e.g. as in the Lulea algorithm [5]. There are also algorithms that rely on determining optimal stride sizes using different optimisation methods. However, these methods usually sacrifice the possibility of incremental updates: the LC-trie is very hard to update, the Lulea trie is impossible to update and due to compression requires excess memory references, and incremental updates degrade the efficiency of the methods based on optimisation [1]. All the high-performance software algorithms are designed to have as small a memory footprint as possible to take full advantage of fast cache memories. The primary motivation for this is the large penalty caused by a cache miss, often dozens of clock cycles. On the other hand, these algorithms can take advantage of the complex operations provided by a general-purpose instruction set. Carefully tuned algorithms can reach quite respectable performance levels: estimates as high as 87 million lookups per second have been reported [6]. However, the efficiency of the most aggressive cache-based algorithms relies on the locality of the traffic, which

cannot be guaranteed in backbone routers. In addition to the problems with incremental updates, the whole class of software-based algorithms has some drawbacks that reduce their attractiveness. First of all, the I/O performance of general-purpose computers simply does not match the requirements of a Gigabit router. On the other hand, embedding a high-performance general-purpose CPU into a custom-built router card is neither simple nor inexpensive due to the required support circuitry. Furthermore, the effectiveness is threatened from two directions: the increasing number of prefixes may result in a trie that is larger than the cache of the target system, and better memory architectures in the future may turn the complex compression methods from an advantage into a burden.

2.2 Hardware algorithms

Using a hardware-based address lookup algorithm has some obvious advantages: the memory architecture can be tailored to the requirements of the algorithm and complex bit manipulation operations can be implemented. Furthermore, explicit concurrency can be taken advantage of, instead of the implicit, CPU-architecture-specific kind. However, the implementor of a hardware-based system cannot rely on an existing apparatus like large on-chip caches or multiple pipelined execution units. Thus it is desirable to keep HW-based algorithms as simple as possible. This is very important especially with FPGA-based implementations because it is quite impossible to implement, e.g., large and fast on-chip caches.

One straightforward implementation alternative is to use a multibit trie with only a couple of levels. The basic scheme uses a two-level trie: a 24-bit stride at the first level and 8-bit strides at the second level [7]. Because the number of prefixes longer than 24 bits is still quite low (1806 in NLANR data from March 2001), most of the lookups are performed in just a single memory reference. The problem with this kind of trie is that in some cases updating it can take a long time, e.g. if a prefix of length 8 changes we must update at least 2^(24-8) = 2^16 entries. Furthermore, the memory requirements, about 33 MB, make it very costly to use the fastest SRAM devices, e.g. it would require 17 x 18 Mbit chips. Using synchronous DRAM devices is not a realistic option, as truly random memory references can take, e.g., 5 cycles to complete, reducing the effective memory cycle speed to something like 30 MHz. A realistic option would be slow, large-capacity SRAM modules, but in that case the memory cycle speed is considerably lower. An alternative HW-based trie algorithm takes advantage of compression methods [8]. However, the ease of incremental updates is in danger again. While the authors of [8] try to belittle this problem by claiming that updates are required only once in every few seconds, this is a real problem as their method seems to make incremental updates impossible. The memory requirement of this scheme is said to be small, but the results were obtained using a random prefix set.

Standard content addressable memories (CAMs) cannot be applied directly to the longest match operation. The reason for this is that the length of the prefix cannot be determined from the IP address without doing the longest match operation. The only direct way to take advantage of the properties of CAMs is to use them in implementing some fixed-structure multibit trie algorithm. In this case the memory consumption can be quite modest even with a simple search structure. However, CAM devices are more expensive, slower, and offer less capacity than ordinary random access memories. Special ternary CAMs, which can store three values in each bit (0, 1, and * = don't care), are suitable for the longest match operation. However, they are even more expensive and have even less capacity than ordinary CAM devices [9, 10].
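Returning to the two-level 24/8 scheme [7] discussed above, the following is a rough software model of it under assumed table layouts (the IS_BLOCK flag, the 16-bit entries and the helper names are my own, not the scheme's actual hardware); it shows both the single-memory-reference common case and why a short prefix change touches 2^16 first-level entries.

    #include <stdint.h>
    #include <stdlib.h>

    /* First-level table: 2^24 entries indexed by the top 24 bits of the address.
     * If the top bit of an entry is 0, the remaining bits are the next hop index;
     * if it is 1, they point to a 256-entry second-level block. */
    #define L1_SIZE   (1u << 24)
    #define IS_BLOCK  0x8000u

    static uint16_t *l1;          /* 2^24 x 16 bit (32 MB)  */
    static uint16_t *l2;          /* blocks of 256 x 16 bit */

    static uint16_t lookup_24_8(uint32_t addr)
    {
        uint16_t e = l1[addr >> 8];                    /* one memory reference    */
        if (!(e & IS_BLOCK))
            return e;                                  /* next hop found directly */
        uint32_t block = e & (IS_BLOCK - 1);
        return l2[block * 256 + (addr & 0xFF)];        /* second reference        */
    }

    /* Inserting a short prefix is the expensive part: a /8 route change touches
     * 2^(24-8) = 65536 first-level entries, as noted in the text. */
    static void set_prefix_24(uint32_t net, int len, uint16_t nexthop)
    {
        uint32_t first = net >> 8;
        uint32_t count = 1u << (24 - len);
        for (uint32_t i = 0; i < count; i++)
            l1[first + i] = nexthop;
    }

    int main(void)
    {
        l1 = calloc(L1_SIZE, sizeof *l1);
        l2 = calloc(256 * 1024, sizeof *l2);           /* room for 1024 blocks */
        if (!l1 || !l2) return 1;
        set_prefix_24(0x8A000000u, 8, 7);              /* 138.0.0.0/8 -> hop 7 */
        return lookup_24_8(0x8A010203u) == 7 ? 0 : 1;  /* 138.1.2.3            */
    }

The 2^24 16-bit first-level entries alone already account for about 32 MB, which is in line with the 33 MB figure quoted above.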
3 Proposed Algorithm

The requirements for an efficient and versatile HW-based address lookup algorithm are:

1. It requires only a small number of memory references per lookup operation.
2. It allows for a parallel and/or pipelined implementation.
3. It can be incrementally generated and updated.
4. It has only a modest memory footprint.

Requirements 1-2 are directly related to performance: external memories, even the fastest ones, cannot match the speed of internal L1 or even L2 caches, and thus each memory access is more expensive. On the other hand, it is quite a challenging task to get an FPGA design

to run at the speeds of the fastest SRAM devices, and therefore it is important that parallelism and pipelining can be used. While it may be true that instant route updates are not a strict requirement for current IP routers, it is not certain that the situation will remain so. Thus it is important to retain the possibility of fast incremental updates. As a HW implementation is not tied to a limited CPU cache, memory consumption is not such an important issue. However, if we would like to take advantage of the latest and fastest memory technology, a realistic memory space today is approximately 4-16 MB.

The proposed algorithm is designed to be implemented using standard FPGA and SRAM devices. Using leading-edge FPGA devices there should not be any difference in performance compared to an ASIC implementation. The reason for this is that the performance depends on the speed of the memory access, which, in turn, depends on the memory device. The reason to use fast SRAM devices instead of some type of high-bandwidth DRAM devices is the efficiency and simplicity of the memory access cycle. Current high-bandwidth DRAM devices have many constraints on memory cycles that have to be considered if good performance is desired. In contrast, devices like ZBT-SRAM or quad data rate SRAM offer totally penalty-free freedom in how the memory accesses are done. However, there is a drawback: SRAM devices have less capacity and they are more expensive than DRAM. This requires careful search structure design to minimise the memory usage without performance penalties.

3.1 Compact Stride Multibit Trie (CS Trie)

The proposed algorithm is basically a fixed-stride multibit trie using 8-bit strides at each level. This results in a 4-level trie with a worst-case search time of 4 steps. However, due to the large number of bits in each stride, which requires large trie nodes, this kind of trie would need a large amount of memory. The CS-trie reduces both memory consumption and search time by using conditional entries and compacted strides. While the generation of the trie becomes more complicated than in the basic scheme, this is compensated for by the fewer memory references required for each update. The only real drawbacks are the larger entries in trie leaves, which reduce the memory savings, and the conditional masking machinery, which makes the FPGA circuitry more complex.

The structure of the CS-trie is such that there is an uncompacted root node containing an 8-bit (256-entry) stride. All other nodes and leaves, i.e. those at levels 2-4 of the trie, are compacted 8-bit strides. Each n-bit node contains 2^n entries with 1-6 fields each, depending on the entry type. The entry types (shown in Figure 1) are the following:

EMPTY: No destination network is defined for this address, resulting in no match. Contains only the type field.

DIRECT: Direct match; contains the type and next hop index (nhi) fields.

EXCEPT: Includes a prefix and a prefix length; if the address matches the prefix, the next hop is defined by pnhi, otherwise by nhi.

LINK: As EXCEPT, but if the address matches the prefix, pointer defines the memory address of the next-level stride and nsize defines the size of that next stride. If the address does not match the prefix, nhi defines the next hop index directly.

Figure 1: Entry types.

Besides the trie there is an additional table, the next hop table, defining the information (port number, MAC address, etc.) required for datagram forwarding. The nhi field is used to address this table. There are two fixed nhi values: 0 defines the local address, i.e. the network processor of the system, and all ones defines an invalid destination. The last value allows for deciding "no match" with EXCEPT and LINK entries.
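A hedged C sketch of these entry types follows; it reflects my reading of the description and of the field widths quoted at the end of Section 3.1, and the bit positions chosen in pack_link are one possible packing, not the one used in the actual hardware.

    #include <stdint.h>
    #include <stdio.h>

    /* Entry type tag: 2 bits in the hardware encoding. */
    enum entry_type { EMPTY = 0, DIRECT = 1, EXCEPT = 2, LINK = 3 };

    /* Reserved next hop index values mentioned in the text (10-bit field). */
    enum { NHI_LOCAL = 0, NHI_INVALID = 0x3FF };

    /* Field widths quoted at the end of Section 3.1:
     *   type 2, nhi 10, pnhi 10, prefix 24, plen 5, pointer 19, nsize 3.
     * The longest entry (LINK: type+nhi+prefix+plen+pointer+nsize) is 63 bits,
     * so every entry is stored in a 64-bit word.  One possible packing: */
    static uint64_t pack_link(uint32_t nhi, uint32_t prefix, uint32_t plen,
                              uint32_t pointer, uint32_t nsize)
    {
        return  (uint64_t)LINK
              | (uint64_t)(nhi     & 0x3FF)    << 2
              | (uint64_t)(prefix  & 0xFFFFFF) << 12
              | (uint64_t)(plen    & 0x1F)     << 36
              | (uint64_t)(pointer & 0x7FFFF)  << 41
              | (uint64_t)(nsize   & 0x7)      << 60;
    }

    int main(void)
    {
        /* A hypothetical LINK entry: no-match next hop invalid, a /17 prefix
         * (only bits 23..0 are stored, the first octet being implicit),
         * child node at block 42, child node size 2^1 = 2 entries. */
        uint64_t e = pack_link(NHI_INVALID, 0x0B8000, 17, 42, 1);
        printf("packed LINK entry: 0x%016llx\n", (unsigned long long)e);
        return 0;
    }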
An important feature of the CS-trie is its compacted strides. The basic idea is to have a trie node that is just large enough to make all the entries in that node separable, e.g. if one third-level leaf contains definitions for only two networks (say two /17 prefixes), we can have a size-2 trie node instead of a full 256-entry node. Having EXCEPT entries and masked LINK entries makes this even more efficient: now we can have very compact strides even if the length of the prefixes would otherwise require large nodes, e.g. if there are three subnetworks within 11.11/16 (a /20, a /26, and a /27), we can still have size-2 nodes at both L3 and L4 (L3[0] = EXCEPT /20, L3[1] = LINK /25, L4[0] = EXCEPT /26, L4[1] = EXCEPT /27).

The trie is traversed using the following algorithm:

    entry <- rootnode[addr[31:24]]
    level <- 1
    while (true)
        if entry.type == EMPTY
            return no_match, 0
        if entry.type == DIRECT
            return match, entry.nhi
        if entry.type == EXCEPT
            if match(entry.prefix, addr)
                return match, entry.pnhi
            else if entry.nhi == all_ones        # invalid destination
                return no_match, 0
            else
                return match, entry.nhi
        if entry.type == LINK
            if match(entry.prefix, addr)
                # index the compacted next-level node by the top nsize bits
                # of the next 8-bit address chunk
                entry <- trie[entry.pointer + (indx(addr, level) >> (8 - entry.nsize))]
                level <- level + 1
            else if entry.nhi == all_ones        # invalid destination
                return no_match, 0
            else
                return match, entry.nhi

The sizes of the fields are the following: the type field is 2 bits (enough to identify the 4 types), both next hop index fields are 10 bits, allowing for 1022 next-hop destinations, the prefix field is 24 bits (the first 8 bits are defined implicitly by the root node), and the prefix length field is 5 bits. The size of the pointer field has been chosen to be 19 bits, which allows us to address 8 MB of memory in 128-bit blocks. The size of the next size field is determined by the maximum node size (256 entries): as node sizes are always powers of 2, 3 bits are enough. This means that the length of the longest entry (LINK) is 63 bits and thus 64 bits have to be used for each entry.
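The traversal can be mirrored in software. The following C model is my interpretation of the pseudocode above; the prefix-match helper, the compacted-stride indexing by the top nsize bits of the next 8-bit chunk, and the toy trie in main are assumptions, not the FPGA implementation itself.

    #include <stdint.h>

    enum entry_type { EMPTY, DIRECT, EXCEPT, LINK };
    enum { NHI_INVALID = 0x3FF };            /* all-ones 10-bit next hop index */

    struct cs_entry {
        enum entry_type type;
        uint32_t nhi, pnhi;     /* next hop indices (10 bits each)              */
        uint32_t prefix;        /* prefix bits 23..0 (the first 8 are implicit) */
        uint32_t plen;          /* prefix length, assumed 8..32 here            */
        uint32_t pointer;       /* index of the next-level node in 'trie'       */
        uint32_t nsize;         /* log2 of the next node's size (0..8)          */
    };

    /* Do the first plen bits of addr match the entry's prefix?  Only address
     * bits 23..0 are compared; the root already matched bits 31..24. */
    static int prefix_match(const struct cs_entry *e, uint32_t addr)
    {
        uint32_t mask = (0xFFFFFFFFu << (32 - e->plen)) & 0x00FFFFFFu;
        return ((addr ^ e->prefix) & mask) == 0;
    }

    /* The 8-bit chunk inspected after the current level (level 1 = root). */
    static unsigned chunk(uint32_t addr, int level)
    {
        return (addr >> (32 - 8 * (level + 1))) & 0xFF;
    }

    /* Returns the next hop index, or -1 for no match. */
    static int cs_lookup(const struct cs_entry *root, const struct cs_entry *trie,
                         uint32_t addr)
    {
        struct cs_entry e = root[addr >> 24];
        int level = 1;
        for (;;) {
            switch (e.type) {
            case EMPTY:  return -1;
            case DIRECT: return (int)e.nhi;
            case EXCEPT:
                if (prefix_match(&e, addr)) return (int)e.pnhi;
                return e.nhi == NHI_INVALID ? -1 : (int)e.nhi;
            case LINK:
                if (!prefix_match(&e, addr))
                    return e.nhi == NHI_INVALID ? -1 : (int)e.nhi;
                /* Compacted stride: a 2^nsize node is indexed by the top
                 * nsize bits of the next 8-bit address chunk. */
                e = trie[e.pointer + (chunk(addr, level) >> (8 - e.nsize))];
                level++;
                break;
            default: return -1;
            }
        }
    }

    int main(void)
    {
        /* Toy trie: the root entry for 138.x.x.x links to a compacted size-2
         * second-level node; all other values are made up for the example. */
        static struct cs_entry root[256];
        static struct cs_entry l2[2];
        root[138] = (struct cs_entry){ .type = LINK, .nhi = NHI_INVALID,
                                       .prefix = 0, .plen = 8,
                                       .pointer = 0, .nsize = 1 };
        l2[0] = (struct cs_entry){ .type = DIRECT, .nhi = 4 };   /* 138.0/9   */
        l2[1] = (struct cs_entry){ .type = DIRECT, .nhi = 5 };   /* 138.128/9 */
        return cs_lookup(root, l2, 0x8A800001u) == 5 ? 0 : 1;    /* 138.128.0.1 */
    }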

3.2 On-the-fly Updates

The CS-trie does not rely on aggressive compression methods nor on a rigid calculation of the next entry address. This gives us an opportunity to use simple incremental updates, just as with the PATRICIA trie. A hardware-based CS-trie has two separate data structures: one for generating the trie and one for performing address lookups (Figure 2). The former data structure is used by a network processor responsible for processing routing information messages and calculating forwarding table updates. The latter one is located inside the forwarding engine. This allows for storing extra information in the former data structure to make it easier to calculate the updates.

Figure 2: Processing routing information and updating the forwarding table.

The basic idea with on-the-fly updates is to use a stack for insertions and deletions. When a new prefix is added into the generating trie at the network processor, the changes are pushed onto a stack. After the insertion is complete, the changes are popped from the stack and updated into the lookup trie. In this way the update caused by an insertion is done from the leaves to the root while the actual calculation advances from the root to the leaves. This procedure ensures that a concurrent lookup operation will never run into an incomplete subtree. A prefix deletion proceeds from the leaves to the root and the changes are again first pushed onto a stack and then popped and updated into the lookup trie. In this way the deletion updates proceed in the opposite direction and, again, a concurrent lookup is not interfered with.

One very important feature enabling low-workload incremental updates is dynamic memory management. As nodes can be added, deleted, and resized on the fly, a monolithic single-table trie cannot be used. Instead, the memory is divided into pages large enough to contain one full-sized node (256 entries, i.e. 2 kB), and each page can be divided further into subpages depending on the node size. To make the memory management a bit easier, one page is always divided into same-sized subpages. There are list structures in the network processor to manage, i.e. allocate and deallocate, free pages and subpages. The dynamic memory management guarantees that changes in the trie structure have only local impacts.

3.3 Pipelined Implementation

The pipelined implementation (Figure 3) takes advantage of the small internal memory blocks provided by FPGA devices. The L1 node (root node) and the next hop table are placed into these memory blocks instead of the external memory. This enables concurrent memory accesses to these tables and to the external memory containing the rest of the trie (L2-L4). This reduces the number of required external memory accesses by two per address match operation. It should be noted that the possibility of using the internal memory was the main reason to use an 8-bit root stride: a larger stride, e.g. a 16-bit one, would be far too large to fit in.

Figure 3: HW architecture of the proposed implementation of the CS-trie based address match system.

The operation of the address match system is the following:

1. The first 8 bits of the IP address are used to access the L1 table; the L1-match unit does the first round of the trie traversal and forwards the result (match + nhi, or pointer + address) to one of the next two units.

2. The current pointer is used to address the external memory; the L2-4-match unit performs the corresponding round of the trie traversal and either updates the pointer (only with a LINK entry) or forwards the result to the next stage.

3. The result dispatcher gets results from the L1- and L2-4-match units and accesses the next hop table according to the results.

4. The memory arbitrator gets addresses from the match units and trie updates from the control unit (not shown). It schedules the memory cycles for the requests from the different units and hides memory latencies.

All these tasks are performed concurrently. If the match units cannot perform entry processing at the rate of the memory cycles, they can be pipelined too. In Figure 4 one possible parallel and pipelined architecture is shown. At the first stage the address is XORed with the prefix, a mask is created according to the prefix length, and the index of the entry at the next-level stride is calculated (of course these values have no meaning for DIRECT or EMPTY entries). At the second stage the results of the XOR operation together with the mask are used to check for a prefix match, and the index and pointer are used to calculate the address of the next-level entry. At the last stage the value of the entry type field is used to select the results to be used for the decision of the correct outcome.

Figure 4: An example of a highly parallel pipelined implementation of the match unit.
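To restate the update ordering of Section 3.2 in code form, here is a minimal sketch with invented names (trie_change, push_change, commit_changes): changes computed while walking the generating trie from the root towards the leaves are pushed onto a stack and then popped into the lookup trie, so a new leaf is always written before the LINK entry that will point to it.

    #include <stdint.h>
    #include <stddef.h>

    /* One pending write to the lookup trie in the forwarding engine:
     * 64-bit entry 'value' goes to trie word 'index'. */
    struct trie_change { uint32_t index; uint64_t value; };

    #define MAX_CHANGES 1024
    static struct trie_change stack[MAX_CHANGES];
    static size_t top;

    /* Called by the network processor while it walks the generating trie from
     * the root towards the leaves, computing the new entries for an insertion. */
    static void push_change(uint32_t index, uint64_t value)
    {
        if (top < MAX_CHANGES)
            stack[top++] = (struct trie_change){ index, value };
    }

    /* Called once the whole insertion has been computed: the changes are popped
     * and written to the lookup trie leaves-first, so a concurrent lookup never
     * follows a pointer into an incomplete subtree.  For a deletion the changes
     * are computed leaves-first and therefore get applied root-first, which is
     * again safe for concurrent lookups. */
    static void commit_changes(volatile uint64_t *lookup_trie)
    {
        while (top > 0) {
            struct trie_change c = stack[--top];
            lookup_trie[c.index] = c.value;
        }
    }

    int main(void)
    {
        static uint64_t lookup_trie[4096];
        push_change(100, 0x1);    /* new LINK in the parent node (computed first) */
        push_change(2048, 0x2);   /* new leaf entry (computed last)               */
        commit_changes(lookup_trie);   /* the leaf at 2048 is written before 100  */
        return lookup_trie[2048] == 0x2 && lookup_trie[100] == 0x1 ? 0 : 1;
    }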

3.4 Performance estimates

To get some realistic performance estimates, a test trie was generated using a real-world routing table, and then a set of address matches was carried out. The routing table used in these tests was obtained from the NLANR Measurement and Network Analysis Group web site and is dated March 2001. The resulting trie required approximately 4.3 MB of 2 kB pages. The compacted strides saved a considerable amount of memory (57%), as there were 5091 nodes in the trie. It took 530 ms to generate the trie on a 480 MHz Sun UltraSPARC-II, about 5.1 µs per entry. However, I believe that the generation time could be made substantially shorter if the trie generation were carefully optimised. Compared to the LC-trie [4] the memory consumption was quite competitive: an LC-trie generated from the same data required 2.5 MB (trie nodes at 4 bytes each, base vector 96712 x 16 bytes, prefix vector 7803 x 12 bytes). The time to build the LC-trie was 490 ms. I was unable to test the CS-trie with the same data as was used in [4]; the data provided by Nilsson contained invalid addresses (I am afraid that these errors may affect the results reported in [4]). I think that these results are encouraging, as the CS-trie is not designed for minimum memory consumption, and the incremental CS-trie construction program already takes into account the time required to synchronise the memory contents between the network processor and the forwarding engine.

For the longest match performance estimates, a large set (10^8) of uniformly distributed random IP addresses was generated. The reason for this approach was simply the lack of a suitable traffic trace. The estimates are shown in Table 1.

Table 1: The average level of the trie where a match is found, for different prefix lengths and for the no-match case. (Table values not preserved.)

It can be noted that declaring a no-match takes a very short time. However, this result may be misleading, as it is unlikely that the addresses of non-existing destinations are uniformly distributed. More results for different routing table sizes are shown in Table 2.

Table 2: Trie size and generation time with different routing table sizes. The routing information is obtained from NLANR and is dated November 8, except for year 2001, which is from March 16. (Columns: year; routing entries; trie size in MB; memory per entry in B; prefixes of length > 24; L4 strides; total generation time in ms; generation time per entry in µs. Table values not preserved.)

It seems that the memory requirements and the construction complexity grow in a linear manner. Furthermore, it can be noted that the number of L4 leaves remains minimal while the number of long prefixes has grown tenfold over the years.

Updating the search structure at the forwarding engine by copying it from the network processor does not take too long, nor does it reserve a high portion of the memory cycles. Let's consider a simple situation where the forwarding engine and the network processor are interconnected by a 33 MHz, 32-bit PCI bus. If we could use the bus with an efficiency of 60%, a total rewrite of, e.g., a 4 MB search structure takes only 53 ms. Furthermore, at the same time only 7.4% of the memory cycles at the forwarding engine are required for the update, if 133 MHz, 64-bit ZBT-SRAM devices are used. Thus, any forwarding table updates are carried out in a short time and without noticeable impact on the forwarding performance.

3.5 IP version 6

At this point one may wonder how the proposed table lookup algorithm could be upgraded to support long IPv6 addresses. However, the whole question is more or less absurd: IPv6 addressing is hierarchical [11]. One of the key ideas of adopting the 128-bit address format was not only to guarantee an address space that is more than adequate but also to get rid of the cumbersome CIDR addressing. This means that with IPv6 there is no need for an algorithm that performs well in longest match operations. One exception is the IPv4-compatible addressing mode, but then the addresses are 32-bit and thus they can be handled with a standard IPv4 address lookup. I think that the best way to upgrade an existing router architecture to support IPv6 is to add a separate IPv6 module. This module can take advantage of efficient hierarchical search methods. Furthermore, the search structures are likely to be quite small, as the fine-grained network topology can be efficiently hidden. In other words, prefix aggregation should really work with IPv6 addressing.

4 Future work

In the future our team is planning to create a detailed VHDL description of the pipelined forwarding unit to be able to simulate its performance. By using exact timing information fed back from the place-and-route process, quite accurate estimates can be obtained. However, based on our previous experience, I am quite sure that the implementation can run at the required clock rates in our target system (a 6 Mgate Xilinx Virtex-II). Our final goal is to include this design in our distributed router system, which is also under development.

One obvious target for improvement in the CS-trie is the memory consumption: the size of an entry is defined by the size of the longest entry type. However, over 90% of the entries are of either EMPTY or DIRECT type. If some kind of split memory space or dual memory scheme could be used, considerable amounts of memory could be saved. Quick approximate estimates show that the EMPTY and DIRECT entries could be stored in 16 bits instead of 64 bits.
In this way the size of the memory required by the trie could be reduced to about one third of the original, e.g. from 4.3 MB to 1.4 MB. However, this requires further study; it must be made sure that the performance does not degrade and that the basic structure does not become too complex.
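The transfer-time figures of Section 3.4 and the one-third memory estimate above can be reproduced with simple arithmetic; the sketch below just restates the stated assumptions (33 MHz, 32-bit PCI at 60% efficiency; 133 MHz, 64-bit ZBT-SRAM; over 90% of entries shrinking from 64 to 16 bits).

    #include <stdio.h>

    int main(void)
    {
        /* Copying a 4 MB search structure over a 33 MHz, 32-bit PCI bus
         * used at 60% efficiency (figures from Section 3.4). */
        double bytes     = 4.0 * 1024 * 1024;
        double pci_rate  = 33e6 * 4 * 0.60;          /* bytes per second */
        double copy_time = bytes / pci_rate;         /* ~0.053 s = 53 ms */

        /* Fraction of 133 MHz memory cycles spent on the 64-bit writes
         * at the forwarding engine during that copy. */
        double writes    = bytes / 8;                /* 64-bit words     */
        double cycles    = 133e6 * copy_time;
        double fraction  = writes / cycles;          /* ~0.074 = 7.4%    */

        /* Memory saving if about 90% of entries (EMPTY/DIRECT) use 16 bits
         * instead of 64: the average entry shrinks to roughly one third. */
        double avg_bits  = 0.9 * 16 + 0.1 * 64;      /* 20.8 bits        */
        double ratio     = avg_bits / 64.0;          /* ~0.33            */

        printf("copy time  %.1f ms\n", copy_time * 1e3);
        printf("cycle load %.1f %%\n", fraction * 100);
        printf("size ratio %.2f (4.3 MB -> %.1f MB)\n", ratio, 4.3 * ratio);
        return 0;
    }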

There are also other possibilities for improving the CS-trie, such as using arbitrary bit masks with LINKs and adding extra prefixes to EXCEPT entries at the lowest level. However, it is unclear whether these modifications would have any measurable impact on performance, and thus further studies are again required.

5 Conclusions

A novel address lookup method, the CS-trie, was described and its efficiency was demonstrated. I have shown that it is possible to have a highly efficient HW-based trie without sacrificing the possibility of incremental updates. Furthermore, it was shown that by using a few simple features, i.e. EXCEPT entries and compacted strides, it is possible to obtain large memory savings as well as improved lookup performance. I have also introduced the principle of two tries (a generating trie and a lookup trie), which makes it much easier to calculate incremental updates. What is more, the results show that the memory consumption and generation time of the CS-trie grow in a linear manner. An example of an inexpensive hardware implementation was also given. A high-performance CS-trie based system can be realised using standard FPGA and SRAM devices. Such a system could easily perform approximately 60-95 million address lookup operations per second, which is more than adequate for current bit rates (60 Mlookups/s with 40-byte packets = 19.2 Gbit/s and 95 Mlookups/s with 250-byte packets = 190 Gbit/s).

Acknowledgements

I would like to thank the National Science Foundation (Cooperative Agreement No. ANI), the National Laboratory for Applied Network Research, and its Measurement and Network Analysis Group for kindly providing public access to the routing information used in this work.

References

[1] Miguel Á. Ruiz-Sánchez, Ernst W. Biersack, and Walid Dabbous. Survey and taxonomy of IP address lookup algorithms. IEEE Network, 15(2):8-23, 2001.

[2] Christian Huitema. Routing in the Internet. Prentice Hall, 2nd edition.

[3] Henry Hong-Yi Tzeng and Tony Przygienda. On fast address-lookup algorithms. IEEE Journal on Selected Areas in Communications, 17(6), 1999.

[4] Stefan Nilsson and Gunnar Karlsson. IP-address lookup using LC-tries. IEEE Journal on Selected Areas in Communications, 17(6), 1999.

[5] Mikael Degermark, Andrej Brodnik, Svante Carlsson, and Stephen Pink. Small forwarding tables for fast routing lookups. In Proceedings of ACM SIGCOMM '97, Cannes, France, 1997.

[6] Tzi-cker Chiueh and Prashant Pradhan. High-performance IP routing table lookup using CPU caching. In Proceedings of IEEE INFOCOM 1999, volume 3, 1999.

[7] P. Gupta, S. Lin, and N. McKeown. Routing lookups in hardware at memory access speeds. In Proceedings of IEEE INFOCOM 1998, 1998.

[8] Nen-Fu Huang and Shi-Ming Zhao. A novel IP-routing lookup scheme and hardware architecture for multigigabit switching routers. IEEE Journal on Selected Areas in Communications, 17(6), 1999.

[9] Marcel Waldvogel, George Varghese, Jon Turner, and Bernhard Plattner. Scalable high speed IP routing lookups. In Proceedings of ACM SIGCOMM '97, 1997.

[10] Zhongchao Yu, Jianping Wu, Ke Xu, and Mingwei Xu. A fast IP classification algorithm applying to multiple fields. In Proceedings of IEEE ICC 2001, 2001.

[11] Steve King, Ruth Fax, Dimitry Haskin, Wenken Ling, Tom Meehan, Robert Fink, and Charles E. Perkins. The case for IPv6. Internet-Draft draft-ietf-iab-case-for-ipv6-06.txt.


More information

Tree-Based Minimization of TCAM Entries for Packet Classification

Tree-Based Minimization of TCAM Entries for Packet Classification Tree-Based Minimization of TCAM Entries for Packet Classification YanSunandMinSikKim School of Electrical Engineering and Computer Science Washington State University Pullman, Washington 99164-2752, U.S.A.

More information

EECS 122: Introduction to Computer Networks Switch and Router Architectures. Today s Lecture

EECS 122: Introduction to Computer Networks Switch and Router Architectures. Today s Lecture EECS : Introduction to Computer Networks Switch and Router Architectures Computer Science Division Department of Electrical Engineering and Computer Sciences University of California, Berkeley Berkeley,

More information

1. Memory technology & Hierarchy

1. Memory technology & Hierarchy 1 Memory technology & Hierarchy Caching and Virtual Memory Parallel System Architectures Andy D Pimentel Caches and their design cf Henessy & Patterson, Chap 5 Caching - summary Caches are small fast memories

More information

Multiprocessors and Thread-Level Parallelism. Department of Electrical & Electronics Engineering, Amrita School of Engineering

Multiprocessors and Thread-Level Parallelism. Department of Electrical & Electronics Engineering, Amrita School of Engineering Multiprocessors and Thread-Level Parallelism Multithreading Increasing performance by ILP has the great advantage that it is reasonable transparent to the programmer, ILP can be quite limited or hard to

More information

Large-Scale Network Simulation Scalability and an FPGA-based Network Simulator

Large-Scale Network Simulation Scalability and an FPGA-based Network Simulator Large-Scale Network Simulation Scalability and an FPGA-based Network Simulator Stanley Bak Abstract Network algorithms are deployed on large networks, and proper algorithm evaluation is necessary to avoid

More information

Mapping Multi-Million Gate SoCs on FPGAs: Industrial Methodology and Experience

Mapping Multi-Million Gate SoCs on FPGAs: Industrial Methodology and Experience Mapping Multi-Million Gate SoCs on FPGAs: Industrial Methodology and Experience H. Krupnova CMG/FMVG, ST Microelectronics Grenoble, France Helena.Krupnova@st.com Abstract Today, having a fast hardware

More information

CMSC 411 Computer Systems Architecture Lecture 13 Instruction Level Parallelism 6 (Limits to ILP & Threading)

CMSC 411 Computer Systems Architecture Lecture 13 Instruction Level Parallelism 6 (Limits to ILP & Threading) CMSC 411 Computer Systems Architecture Lecture 13 Instruction Level Parallelism 6 (Limits to ILP & Threading) Limits to ILP Conflicting studies of amount of ILP Benchmarks» vectorized Fortran FP vs. integer

More information

PUSHING THE LIMITS, A PERSPECTIVE ON ROUTER ARCHITECTURE CHALLENGES

PUSHING THE LIMITS, A PERSPECTIVE ON ROUTER ARCHITECTURE CHALLENGES PUSHING THE LIMITS, A PERSPECTIVE ON ROUTER ARCHITECTURE CHALLENGES Greg Hankins APRICOT 2012 2012 Brocade Communications Systems, Inc. 2012/02/28 Lookup Capacity and Forwarding

More information

ADDRESS LOOKUP SOLUTIONS FOR GIGABIT SWITCH/ROUTER

ADDRESS LOOKUP SOLUTIONS FOR GIGABIT SWITCH/ROUTER ADDRESS LOOKUP SOLUTIONS FOR GIGABIT SWITCH/ROUTER E. Filippi, V. Innocenti and V. Vercellone CSELT (Centro Studi e Laboratori Telecomunicazioni) Via Reiss Romoli 274 Torino, 10148 ITALY ABSTRACT The Internet

More information

Inter-networking. Problem. 3&4-Internetworking.key - September 20, LAN s are great but. We want to connect them together. ...

Inter-networking. Problem. 3&4-Internetworking.key - September 20, LAN s are great but. We want to connect them together. ... 1 Inter-networking COS 460 & 540 2 Problem 3 LAN s are great but We want to connect them together...across the world Inter-networking 4 Internet Protocol (IP) Routing The Internet Multicast* Multi-protocol

More information

Homework 1 Solutions:

Homework 1 Solutions: Homework 1 Solutions: If we expand the square in the statistic, we get three terms that have to be summed for each i: (ExpectedFrequency[i]), (2ObservedFrequency[i]) and (ObservedFrequency[i])2 / Expected

More information

The Nios II Family of Configurable Soft-core Processors

The Nios II Family of Configurable Soft-core Processors The Nios II Family of Configurable Soft-core Processors James Ball August 16, 2005 2005 Altera Corporation Agenda Nios II Introduction Configuring your CPU FPGA vs. ASIC CPU Design Instruction Set Architecture

More information

Chapter 5. Large and Fast: Exploiting Memory Hierarchy

Chapter 5. Large and Fast: Exploiting Memory Hierarchy Chapter 5 Large and Fast: Exploiting Memory Hierarchy Memory Technology Static RAM (SRAM) 0.5ns 2.5ns, $2000 $5000 per GB Dynamic RAM (DRAM) 50ns 70ns, $20 $75 per GB Magnetic disk 5ms 20ms, $0.20 $2 per

More information

Reducing Hit Times. Critical Influence on cycle-time or CPI. small is always faster and can be put on chip

Reducing Hit Times. Critical Influence on cycle-time or CPI. small is always faster and can be put on chip Reducing Hit Times Critical Influence on cycle-time or CPI Keep L1 small and simple small is always faster and can be put on chip interesting compromise is to keep the tags on chip and the block data off

More information

Introduction. Router Architectures. Introduction. Introduction. Recent advances in routing architecture including

Introduction. Router Architectures. Introduction. Introduction. Recent advances in routing architecture including Introduction Router Architectures Recent advances in routing architecture including specialized hardware switching fabrics efficient and faster lookup algorithms have created routers that are capable of

More information

Master Course Computer Networks IN2097

Master Course Computer Networks IN2097 Chair for Network Architectures and Services Prof. Carle Department for Computer Science TU München Chair for Network Architectures and Services Prof. Carle Department for Computer Science TU München Master

More information

Master Course Computer Networks IN2097

Master Course Computer Networks IN2097 Chair for Network Architectures and Services Prof. Carle Department for Computer Science TU München Master Course Computer Networks IN2097 Prof. Dr.-Ing. Georg Carle Christian Grothoff, Ph.D. Chair for

More information

Memory Systems IRAM. Principle of IRAM

Memory Systems IRAM. Principle of IRAM Memory Systems 165 other devices of the module will be in the Standby state (which is the primary state of all RDRAM devices) or another state with low-power consumption. The RDRAM devices provide several

More information

Chapter 8 & Chapter 9 Main Memory & Virtual Memory

Chapter 8 & Chapter 9 Main Memory & Virtual Memory Chapter 8 & Chapter 9 Main Memory & Virtual Memory 1. Various ways of organizing memory hardware. 2. Memory-management techniques: 1. Paging 2. Segmentation. Introduction Memory consists of a large array

More information

RAID SEMINAR REPORT /09/2004 Asha.P.M NO: 612 S7 ECE

RAID SEMINAR REPORT /09/2004 Asha.P.M NO: 612 S7 ECE RAID SEMINAR REPORT 2004 Submitted on: Submitted by: 24/09/2004 Asha.P.M NO: 612 S7 ECE CONTENTS 1. Introduction 1 2. The array and RAID controller concept 2 2.1. Mirroring 3 2.2. Parity 5 2.3. Error correcting

More information

Scalable High-Speed Prefix Matching

Scalable High-Speed Prefix Matching Scalable High-Speed Prefix Matching Marcel Waldvogel Washington University in St. Louis and George Varghese University of California, San Diego and Jon Turner Washington University in St. Louis and Bernhard

More information

DATABASE PERFORMANCE AND INDEXES. CS121: Relational Databases Fall 2017 Lecture 11

DATABASE PERFORMANCE AND INDEXES. CS121: Relational Databases Fall 2017 Lecture 11 DATABASE PERFORMANCE AND INDEXES CS121: Relational Databases Fall 2017 Lecture 11 Database Performance 2 Many situations where query performance needs to be improved e.g. as data size grows, query performance

More information

Fast and Scalable IP Address Lookup with Time Complexity of Log m Log m (n)

Fast and Scalable IP Address Lookup with Time Complexity of Log m Log m (n) 58 JOURNAL OF ADVANCES IN INFORMATION TECHNOLOGY, VOL. 5, NO. 2, MAY 214 Fast and Scalable IP Address Lookup with Time Complexity of Log m Log m (n) Abhishant Prakash Motilal Nehru National Institute of

More information