Fast Evaluation of Protocol Processor Architectures for IPv6 Routing


Johan Lilius, Dragos Truscan, Seppo Virtanen
Turku Centre for Computer Science, Embedded Systems Laboratory
Lemminkäisenkatu 14A, Turku, Finland
{Johan.Lilius, Dragos.Truscan,

Abstract

In this paper we present a design case study in configuring our protocol processor architecture to meet the performance requirements of IPv6 routing at gigabit speeds. Our methodology makes it possible to analyze the problem quickly and reliably at a high level and to find its key bottlenecks and design constraints. Based on the analyses we suggest architectural configurations for the target application. The best configurations can then be further analyzed in more detailed system-level simulations and physical estimations.

1. Introduction

In recent years, addressing the conflicting requirements on network applications has become an important challenge for hardware designers. On the one hand there is a need for faster time-to-market, a goal that has traditionally been achieved by using general purpose processors. On the other hand one would like to work with ASICs to obtain an optimal solution in terms of performance. One proposed solution to this problem has been the adoption of network or protocol processors: special programmable processors that are tailored towards network and protocol processing. Such a processor is an attempt to combine the processing speed of ASICs with the programmability of general purpose processors for optimal protocol processing speed.

In our protocol processor design framework TACO (Tools for Application-specific hardware/software COdesign) [11] we have developed tools and methods that help the designer specify, simulate, evaluate and synthesize a certain type of TTA (Transport Triggered Architecture, [2, 9]) based protocol processors, called TACO processors (introduced in [13]).
A TTA processor is formed of functional units (FUs) that communicate via an interconnection network of data buses, controlled by an interconnection network controller unit. The FUs connect to the buses through modules called sockets. Each functional unit in a TACO processor performs a specific protocol processing task, and each FU has been designed to complete the execution of its function in one clock cycle. TTAs are in essence one-instruction processors, as instructions only specify data moves between functional units. Thus, the instruction word of any TTA processor consists mostly of source and destination addresses. The maximum number of instructions (i.e. data transports) that can be carried out in one clock cycle is equal to the number of data buses in the interconnection network.

In this paper we present a case study in which we apply the TACO design methodology [6, 12] to configure TACO protocol processors for efficient IPv6 routing. We have previously described smaller case studies from the system level through logic synthesis in e.g. [6, 13]. Implementing an IPv6 router that uses the Routing Information Protocol (RIPng) is a considerably more complex application of our methodology. The contribution of this paper is to show that within a short time frame we can efficiently and reliably analyze the problem, its key bottlenecks and constraints, and provide solutions to them at abstraction levels on and above the system level. All this is done prior to proceeding to detailed system level simulations and physical estimations.

In our methodology, a processor configuration is obtained by deciding what functionality should be implemented in hardware (as FUs) and what in software (as data move operations between the FUs). This process involves identifying the operations to be performed by the router.
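The transport-triggered execution model described above can be illustrated with a small sketch. It is not the TACO simulator: the `AdderFU` class, the register names and the `"fu.port"` addressing notation are hypothetical and were chosen only to show that an instruction is a set of data moves, that a write to a trigger register starts the operation, and that one instruction carries at most as many moves as there are buses.

```python
class AdderFU:
    """Hypothetical functional unit with operand, trigger and result registers."""

    def __init__(self):
        self.operand = 0
        self.result = 0

    def write(self, port, value):
        if port == "operand":
            self.operand = value
        elif port == "trigger":
            # Writing the trigger register starts the operation.
            self.result = self.operand + value
        else:
            raise ValueError(f"unknown port: {port}")


def read(src, fus, regs):
    """Return the value of an immediate, an FU result register or a register."""
    if isinstance(src, int):
        return src
    if "." in src:
        fu_name, port = src.split(".")
        if port != "result":
            raise ValueError(f"only result registers can be read: {src}")
        return fus[fu_name].result
    return regs[src]


def run(program, fus, regs, num_buses):
    """Execute one instruction (a list of moves) per cycle; return the cycle count."""
    for moves in program:
        if len(moves) > num_buses:
            raise ValueError("an instruction cannot hold more moves than buses")
        # All sources are read before any destination is written.
        transfers = [(read(src, fus, regs), dst) for src, dst in moves]
        # Within a cycle, operand writes are applied before trigger writes.
        transfers.sort(key=lambda t: t[1].endswith(".trigger"))
        for value, dst in transfers:
            if "." in dst:
                fu_name, port = dst.split(".")
                fus[fu_name].write(port, value)
            else:
                regs[dst] = value
    return len(program)


fus = {"add": AdderFU()}
regs = {}
program = [
    [(5, "add.operand"), (7, "add.trigger")],  # two moves in one cycle
    [("add.result", "r3")],                    # one move in the next cycle
]
cycles = run(program, fus, regs, num_buses=2)
print(cycles, regs["r3"])
```

With a single bus the same computation would have to be written as three one-move instructions, which is the kind of trade-off explored when the number of buses is varied.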
In order to reach a good balance between the router's performance and its physical characteristics (such as power and area use), we explore different architectural configurations by varying the number of FUs of each required type and the number of buses in the interconnection network. These configurations are simulated and their physical characteristics are estimated. In the end we select for synthesis a configuration that is able to perform the target application within given power and area constraints.

Related work

With a trend towards increasing system complexities, abstract specification based modelling flows are constantly gaining in popularity. The idea is to start with a very abstract system description and then to incrementally refine this specification to contain more and more architectural details. In [1] an OCAPI [10] description of system behavior is used as a starting point in a design methodology for an Ethernet packet decoder. In [5] a Y-chart based design methodology and its use for design space exploration is presented. In the Y-chart methodology the performance of a selected architecture is analyzed for a given set of applications. As a result of this analysis the designer receives performance data, based on which decisions and design choices can be made for the architecture. The process is repeated iteratively until a satisfactory architecture for the target application is found.

As one might expect, there are similarities between the flows outlined above and the TACO flow. In TACO we work in a specific problem domain, namely protocol processing. In the TACO flow the hardware architecture is also represented as a specification and simulation model written in a high level language, but only prior to synthesis. Lastly, from the high level simulations we obtain performance data such as clock cycle requirements and module utilization (in addition to the verification of correct processor functionality). However, our approach is very much library-based and allows extensive component re-use for both simulation and synthesis. Since we do not develop all modules from scratch every time, the actual hardware design times in our flow can be quite short.
As a major difference to the mentioned design flows, we develop TACO processor models in three different development environments at the same time: we have a model for system-level simulations written in SystemC, a model for estimating physical parameters (e.g. processor area and power consumption) at the system level written in Matlab, and a model for synthesizing architectures written in VHDL. The models are highly parameterizable, and hence top-level description files for a given architecture can be automatically generated for all three models using a single hardware design tool [14].

2. Design Flow and Methodology

There are two distinct parts in designing applications for TACO processors. The first part specifies and analyzes the problem requirements using UML, in order to decide the functionality that has to be implemented by the processor. The second part evaluates and simulates different design configurations and implements the chosen one in hardware.

The UML design flow consists of incremental refinement steps of the problem specification until the level of detail allows the necessary architectural requirements to be gathered. We combine the functional specification with domain-based knowledge in order to provide component reuse and fast identification of resources. Since we target the TTA architecture, our goal is to find the basic operations the processor should be able to perform, and to implement them using dedicated TACO FUs. More information can be found in [6].

System-level design flow. When the recommended hardware module types have been determined, we start evaluating different hardware configurations, i.e. architecture instances. Architecture instances are constructed by varying the number of modules of the same type in the processor as well as varying the internal data transport capacity of the instances.
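The construction of architecture instances can be pictured as enumerating a small parameter grid. The sketch below is only an illustration, not the TACO design tool [14]: the function name, the dictionary layout and the choice of candidate counts are assumptions, and the three FU types listed are simply ones that this paper varies later on.

```python
from itertools import product

# FU types whose count is varied (an illustrative selection).
FU_TYPES = ("matcher", "counter", "comparer")


def architecture_instances(bus_counts, fu_counts):
    """Yield one dict per combination of bus count and per-type FU count."""
    per_type = [fu_counts] * len(FU_TYPES)
    for buses, *counts in product(bus_counts, *per_type):
        yield {"buses": buses, **dict(zip(FU_TYPES, counts))}


instances = list(architecture_instances(bus_counts=(1, 3), fu_counts=(1, 3)))
print(len(instances))
print(instances[0])
print(instances[-1])
```

Even this tiny grid produces 16 candidates, which suggests why pruning at the system level, before synthesis, is worthwhile.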
The evaluation of the architecture instances consists of two parts: system level simulations [11, 15] in SystemC and system level estimation of physical characteristics [8] in Matlab. The application code needs to be tuned for each instance separately. The simulations yield functional correctness information as well as the total cycle count of the application running on the particular architecture instance. The physical estimation yields estimates of required area and power as well as of the clock frequency of the final product. By co-analyzing the results from SystemC and Matlab the designer is able to determine at the system level whether the architecture instance is suitable for the target protocol processing application. After an architecture instance that fulfills the design constraints has been found, it can be synthesized and its characteristics can be verified using the TACO VHDL synthesis model.

3. The IPv6 Router Architecture

An IPv6 router should be able to receive IPv6 datagrams from the connected networks, check their validity (correct addressing and fields), look up in the routing table the interface(s) on which they should be forwarded, and send the datagrams on the appropriate interface. Additionally a router should build and maintain a routing table that contains information about network topology. The router builds up the routing table by listening for specific datagrams broadcast by the adjacent routers, in order to learn the topology of the network. At regular intervals, the routing table information is broadcast to the adjacent routers to inform them about changes in topology. The IPv6 protocol is described in e.g. [3] and [7].

Routers have to handle two types of Internet traffic: one that updates the routing tables and one that has to be forwarded on adjacent networks. The forwarding process has to search the routing table for the matching network prefix with the longest possible prefix length. Since a routing table can consist of thousands of entries, finding the matching prefix can require a long computation time. The current bandwidth demands of Internet networks put high pressure on the routing table look-up speed. To meet these demands, router implementations need to use fast searching algorithms and dedicated hardware in order to improve the forwarding throughput.

[Figure 1. Generic router. Labels: #1, #2, #3, #4, TACO processor, Switching fabric.]

Our router is composed of a TACO processor and a number of line cards, one for each connected network interface of the router. We are only interested in the design and performance of the TACO processor for implementing routing and forwarding tasks. The line cards can be chosen from the products available on the market (Intel IFX18103, Cisco GigE 12000, etc.). The interface between the cards and the processor depends on the products used. Each network card contains a set of independent input and output registers that can be read and written by the processor. The line cards deal with implementing the protocol and its specific tasks, provide fully assembled, decapsulated IPv6 datagrams to the processor, take care of fragmentation and encapsulation of outgoing datagrams, and also resolve ARP/RARP requests.

The TACO processor is in charge of deciding how the forwarded datagrams are to be routed between the line cards and takes care of building and maintaining its routing table. It scans the input ports of the line cards for pending datagrams, which are transferred into the main memory of the processor. Usually, existing router implementations split Internet datagrams into header and payload, and only the header is stored in the main memory for further processing. The payload is only analyzed for datagrams addressed to the router.
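The longest-prefix search performed by the forwarding process can be sketched as follows. This is a plain linear scan written with Python's standard `ipaddress` module, meant only to make the matching rule concrete; the table layout (prefix string, interface number), the function name and the example prefixes are assumptions and do not reflect how the TACO routing table unit stores its entries.

```python
import ipaddress


def longest_prefix_match(table, destination):
    """Return the interface of the longest prefix containing destination, or None."""
    address = ipaddress.IPv6Address(destination)
    best_network = None
    best_interface = None
    for prefix, interface in table:
        network = ipaddress.IPv6Network(prefix)
        if address in network and (
            best_network is None or network.prefixlen > best_network.prefixlen
        ):
            best_network = network
            best_interface = interface
    return best_interface


# Hypothetical routing table: (prefix, output interface).
routing_table = [
    ("2001:db8::/32", 1),
    ("2001:db8:1::/48", 2),
    ("::/0", 0),  # default route
]

print(longest_prefix_match(routing_table, "2001:db8:1::42"))
print(longest_prefix_match(routing_table, "2001:db8:2::1"))
```

Both the /32 and the /48 entries contain the first address, and the /48 entry wins because it is the longer prefix. The scan touches every entry, which is why the organization of the table dominates look-up time.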
In our design, we choose to transfer the entire datagram into the main memory because in IPv6 the IP header can be accompanied by a variable number of extension headers that also have to be taken into consideration. Once saved in the memory, the datagrams are processed one at a time: the header is updated and then the entire datagram is saved in the output buffer of the corresponding line card.

[Figure 2. TACO architecture. Blocks shown: Program Memory, Network Controller, Matcher, Comparator, Counter, Checksum, Shifter, Masker, Registers, Data Memory, Memory Management Unit, Routing Table Unit, Local Info Unit, ippu, oppu.]

One important design feature of a TTA processor is the modularity of its architecture. Each FU computes independently of the interconnection network and other units. The performance of the processor is therefore determined by the number of transports on the buses and implicitly by the time in which each operand becomes available in the output registers of the functional units. In order to decrease the waiting time, FUs have to complete their computation in as few cycles as possible. In this sense a balance should exist between the complexity and the response time of functional units.

The Preprocessing Unit (ippu) scans the input buffers for new datagrams. If a datagram is pending, it is stored in the main memory. A pointer to the memory address where the datagram was stored is saved in a queue, along with the interface identifier of the input buffer. The ippu is connected to the buses in the processor's interconnection network through one trigger and two result registers. It also provides a 1-bit signal connected to the Interconnection Network Controller (see figure 2) to notify it of new entries pending in the queue.

The PostProcessing Unit (oppu) manages the output traffic of the router. The unit contains an internal queue in which pointers to the memory addresses of the datagrams to be sent are stored along with the output interface identifier. The oppu interrogates its internal queue and for each entry it moves the corresponding datagram from the data memory to the specified output buffer.

The Counter Unit performs arithmetical operations (increment, decrement, addition, subtraction) and counting (upwards or downwards from a start value to a stop value). When the stop value has been reached, a result signal directly connected to the Network Controller is enabled. For comparing operands with a given value a Comparer Unit has been designed. The result of a comparison is signaled to the Network Controller via a result signal. The Matcher and the Masker are bitstring manipulation FUs that process only parts of their input operands according to a given mask. The Matcher reports its result to the Interconnection Network Controller by means of a result bit signal directly connected between them. The Masker sets the bits of a register according to a given mask and a given value. In addition to logical shifting, a Shifter can also be used for arithmetical multiplication by 2.

[Figure 3. TACO Code Optimization Process. Example shown: the expression a = (b * 2 + c) / 4; its register-level form R1 = b, R2 = 2, R3 = c, R4 = 4, R5 = R1 * R2, R6 = R5 + R3, R7 = R6 / R4, a = R7; and the corresponding operation list Mov (b, R1), Mov (2, R2), Mov (c, R3), Mov (4, R4), Mul2 (R1, R2, R5), Add (R5, R3, R6), Div2 (R6, R4, R7), Mov (R7, a).]

From the programmer's point of view, programming TACO processors is a matter of moving data from output to input registers.
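Because a program is a sequence of data transports, its cycle count depends on how many transports fit on the buses in each cycle. The sketch below applies a simple greedy list scheduler to the operation list of Figure 3. It is a generic textbook-style scheduler, not the TACO tool chain, and it makes two simplifying assumptions: each listed operation occupies one bus for one cycle, and a value can be used in the cycle after it is produced.

```python
def schedule(ops, num_buses):
    """Greedily pack operations into cycles, at most num_buses per cycle.

    ops is a list of (destination, sources) pairs in program order.
    Returns a list of cycles, each a list of the operations issued in it.
    """
    produced_by_ops = {dest for dest, _ in ops}
    ready = set()           # values produced in earlier cycles
    pending = list(ops)
    cycles = []
    while pending:
        issued = []
        for op in pending:
            if len(issued) == num_buses:
                break
            _, sources = op
            # External inputs and constants are always available.
            if all(s in ready or s not in produced_by_ops for s in sources):
                issued.append(op)
        if not issued:
            raise ValueError("operations have a cyclic dependency")
        for op in issued:
            pending.remove(op)
        ready.update(dest for dest, _ in issued)
        cycles.append(issued)
    return cycles


# Operation list from Figure 3 for a = (b * 2 + c) / 4.
figure3_ops = [
    ("R1", ("b",)),
    ("R2", (2,)),
    ("R3", ("c",)),
    ("R4", (4,)),
    ("R5", ("R1", "R2")),   # Mul2
    ("R6", ("R5", "R3")),   # Add
    ("R7", ("R6", "R4")),   # Div2
    ("a", ("R7",)),
]

for buses in (1, 3):
    print(buses, "bus(es):", len(schedule(figure3_ops, buses)), "cycles")
```

Under these assumptions the dependency chain Mul2, Add, Div2, Mov limits the benefit of extra buses: three buses shorten the sequence from eight cycles to five, not to three.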
Using registers for FUs enables optimization techniques such as moving operands from an output register to an input register without additional temporary storage (bypassing), using the same output register or general purpose register for multiple data transports (operand sharing), easy removal of registers that are no longer in use, etc. All these techniques help in reducing code size by reducing the number of transports on buses. Some general compiler optimizations, such as sinking and loop unrolling, can also be performed on TACO assembly code.

Code optimization for TACO processors in fact reduces to the well-known bus scheduling and register allocation problems. We have to schedule move instructions on the buses and allocate registers to the operands of the instructions. Scheduling and allocation policies have been widely discussed in the literature and we are not suggesting any new methods. A compiler can do the necessary allocation and scheduling, along with some final optimizations.

4. Results

Previously we specified and implemented a set of resources (FUs and buses) that TACO processors should use for processing Internet datagrams. These resources can be used as a test bench for specifying the final configuration of the TACO processor used in the router. To be able to do this, we still have to decide on several implementation options for the routing table.

The routing table implementation is the most important aspect of a router's performance, so we decided to create a dedicated functional unit for it. Different routing table implementations have been suggested in the literature [4] depending on the target application. For instance, in routers that have a relatively small number of entries in the routing table, a fast memory can be used. For larger routing tables the cost of a fast memory chip would be too high, so software-based algorithms are needed.
Among the proposed hardware solutions, using Content Addressable Memories (CAM) seems very tempting because of the fast match time (tens of nanoseconds) they provide. On the other hand the price of this type of memory is very high. Also, most of the existing solutions provide support for the IPv4 protocol and very few for IPv6. The difference between the two is the size of the address to be matched in the routing tables (128 bits for IPv6 and only 32 bits for IPv4).

Table 1. Estimated minimum clock frequencies, processor areas and average power consumption for different processor architectures. NA (not available) indicates an architecture that was not estimated due to its high clock frequency requirement. The CAM estimates do not include the area and power used by the CAM chip.

Routing table       Architecture            Required   Bus util.  Area     Avg. power
implementation      configuration           speed      [%]        [mm^2]   [W]
Sequential access   1BUS/1FU                6 GHz      100        NA       NA
                    3BUS/1FU                2 GHz      100        NA       NA
                    3BUS/3CNT, 3CMP, 3M     1 GHz
Balanced tree       1BUS/1FU                1.2 GHz    100        NA       NA
                    3BUS/1FU                600 MHz
                    3BUS/3CNT, 3CMP, 3M     250 MHz
Content             1BUS/1FU                118 MHz
Addressable         3BUS/1FU                40 MHz
Memory              3BUS/3CNT, 3CMP, 3M     35 MHz

We simulate and estimate different TACO processor configurations to evaluate their performance with respect to IPv6 routing throughput, silicon area and power consumption. Our goal is to choose the configuration (and implicitly the routing table implementation) that is the best fit for the performance requirements and constraints of the router. Our system-level estimation model has been verified in previous work [12] to give quite precise results when compared to post-synthesis results.

As the first case we implemented the routing table using a cache memory in which the entries are organized sequentially. As the second case we simulated a balanced tree structure. For both cases we tested different TACO architecture configurations that we obtained by varying the number of buses and the number of functional units used. Each of these configurations has to be able to achieve the 10 Gbps Ethernet throughput with a maximum size of 100 entries in the routing table.
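The throughput constraint fixes a time budget per datagram, and a cycle count per datagram then translates into a clock frequency. The sketch below shows that arithmetic only. The datagram size and cycle count are assumptions made for illustration and are not taken from this paper: the worst case is taken to be minimum-size Ethernet frames (64 bytes plus 20 bytes of preamble and inter-frame gap on the wire), and the figure of 100 cycles per datagram is hypothetical.

```python
def datagrams_per_second(line_rate_bps, bytes_on_wire):
    """Datagram arrival rate when the link is saturated with equal-size frames."""
    return line_rate_bps / (bytes_on_wire * 8)


def min_clock_hz(cycles_per_datagram, line_rate_bps, bytes_on_wire):
    """Lowest clock frequency that processes every datagram within its time slot."""
    return cycles_per_datagram * datagrams_per_second(line_rate_bps, bytes_on_wire)


LINE_RATE_BPS = 10e9        # 10 Gbps, as required in the text
BYTES_ON_WIRE = 84          # assumption: 64-byte frame + 20 bytes of overhead
CYCLES_PER_DATAGRAM = 100   # hypothetical cycle count, for illustration only

rate = datagrams_per_second(LINE_RATE_BPS, BYTES_ON_WIRE)
clock = min_clock_hz(CYCLES_PER_DATAGRAM, LINE_RATE_BPS, BYTES_ON_WIRE)
print(f"{rate / 1e6:.2f} million datagrams/s, {clock / 1e9:.3f} GHz")
```

The required frequency scales linearly with the cycle count, so any reduction in cycles per datagram, whether from more buses, more FUs or a faster table search, lowers the clock requirement by the same factor.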
Based on these constraints we calculated the minimum clock frequencies for the TACO processor configurations. This is done by taking into account the number of clock cycles the datagram forwarding process takes to complete in each case.

In the case of sequential routing table organization we calculated that with one bus and one functional unit of each type the clock speed of the processor should be 6 GHz (see results in table 1). This exceeds the capabilities of the 0.18 µm standard cell library that we currently use. We estimate that the upper limit for TACO clock frequencies using this technology is near 1 GHz. By configuring the TACO processor to use three buses we observed a required clock frequency of 2 GHz, which is also beyond the capabilities of our current implementation technology. We then tried to optimize the program code by performing data and control analysis. Based on this analysis we configured the router to use 3 matchers, 3 counters and 3 comparers. The increase in performance brought the clock frequency into the vicinity of 1 GHz, meaning that this configuration could be implemented in the 0.18 µm technology. However, as seen in table 1, at this clock speed the average power consumed by the architecture is not acceptable. The high power consumption follows from the fact that larger gate sizes had to be used in order to reach the 1 GHz clock speed. This naturally also had an effect on the total processor area estimate.

With sequential organization of the routing table entries the search time grows linearly with the number of entries. In order to get a faster search time we implemented a balanced tree structure, which offers logarithmic search time. However, the insertion and deletion operations become much more complex. Still, this does not influence the throughput of the router significantly.
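The difference between linear and logarithmic search time can be made concrete for the 100-entry table size used here. The sketch counts worst-case entry visits only; it assumes one visit per table entry or tree node and ignores the extra work that longest-prefix matching may need at each node, so it is an idealized comparison and not a model of the simulated implementations.

```python
def sequential_worst_case(entries):
    """Worst case for a sequential table: every entry is visited."""
    return entries


def balanced_tree_worst_case(entries):
    """Worst case for a balanced binary tree: one visit per level.

    A balanced binary tree with n nodes has ceil(log2(n + 1)) levels,
    which for a positive integer n equals n.bit_length().
    """
    return entries.bit_length()


TABLE_SIZE = 100  # maximum routing table size stated in the text

print(sequential_worst_case(TABLE_SIZE), balanced_tree_worst_case(TABLE_SIZE))
```

For 100 entries this gives 100 visits against 7.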
Statistics show that when the topology of the network stabilizes, routing table updates appear once in 2 minutes, which does not require much computational effort. The simulations and estimations show an evident gain in performance. The two faster architectures qualify as possible solutions, as seen in table 1; the slowest one would require even more power and area than the last case in sequential routing, and was thus not estimated.

Finally we evaluated a hardware-based solution for the routing table. We used a 136-bit wide content addressable memory (CAM) and a commercially available SRAM chip. By combining these two circuits we calculated that the routing table search time would be 40 ns. This is a major boost in router performance at the expense of a high implementation cost. By using the mentioned industrial IP blocks, we transformed the TACO processor into a system-on-chip design solution. As we can see in table 1, the speed requirements for the TACO processors drop dramatically, especially when using 3 buses and one functional unit of each needed type. Multiplying the number of functional units no longer seems to offer a considerable increase in routing table access performance; instead, it causes the power and area requirements to increase.

It is also important to realize that the power and area required by the Content Addressable Memory (CAM) chip are not included in the estimates in table 1. As an example, the Micron Harmony 1Mb CAM consumes an average power of 1.5 to 2 Watts when operated at 133 MHz. Therefore, the total power consumed when using a CAM to handle routing table searches is approximately the same as when using only a TACO processor for them. On the other hand, in the CAM case the total footprint area required by the two circuits is of course larger than the area required by just a TACO processor.

5. Conclusions

The increased complexity of network applications constantly raises the requirements on device design efforts in the areas of optimization, testing and validation. Dealing with the requirements and constraints at the system level allows us to address more complex designs efficiently. Simulation and estimation in the early phases of the design decrease the development time of the products by allowing early verification and performance evaluation. In addition to shortening the development time, this also significantly reduces the cost of the final product. In this paper we experimented with our design methodology to quickly evaluate different architectural alternatives for an IPv6 routing protocol processor.
By simulating and estimating different architectural configurations at the system level we obtained a fast turn-around time for finding well-suited configurations to match the target application and its constraints.

Our future work includes the full implementation and synthesis of an IPv6 router. At the same time we would like to develop a tool that automates the design space exploration phase and, based on some heuristics, suggests good solutions with respect to performance requirements and physical constraints.

References

[1] M. Attia and I. Verbauwhede. Programmable Gigabit packet processor design methodology. In Proceedings of the European Conference on Circuit Theory and Design (ECCTD'01), pages III: , Espoo, Finland, August
[2] H. Corporaal. Microprocessor Architectures - from VLIW to TTA. John Wiley and Sons Ltd., Chichester, West Sussex, England,
[3] S. Deering and R. Hinden. Internet protocol, version 6 (IPv6) specification. RFC 2460,
[4] S. Keshav and R. Sharma. Issues and trends in router design. IEEE Communications Magazine, pages , May
[5] B. Kienhuis, E. F. Deprettere, P. van der Wolf, and K. Vissers. A Methodology to Design Programmable Embedded Systems, volume 2268 of LNCS, pages Springer-Verlag,
[6] J. Lilius and D. Truscan. UML-driven TTA-based protocol processor design. In Proceedings of Forum for Design Languages'02 (FDL'02), Marseille, France,
[7] M. A. Miller. Implementing IPv6. M&T Books,
[8] T. Nurmi, S. Virtanen, J. Isoaho, and H. Tenhunen. Physical modeling and system level performance characterization of a protocol processor architecture. In Proceedings of the 18th IEEE NORCHIP Conference, pages , Turku, Finland, November
[9] D. Tabak and G. J. Lipovski. MOVE architecture in digital controllers. IEEE Transactions on Computers, 29(2): , February
[10] S. Vernalde, P. Schaumont, and I. Bolsens. An Object Oriented Programming Approach for Hardware Design. In IEEE Computer Society Workshop on VLSI'99, Orlando, USA,
[11] S. Virtanen and J. Lilius. The TACO protocol processor simulation environment. In Proceedings of the 9th International Symposium on Hardware/Software Codesign,
[12] S. Virtanen, J. Lilius, T. Nurmi, and T. Westerlund. TACO: Rapid design space exploration for protocol processors. In the Ninth IEEE/DATC Electronic Design Processes Workshop Notes, Monterey, CA, USA, April
[13] S. Virtanen, J. Lilius, and T. Westerlund. A processor architecture for the TACO protocol processor development framework. In Proceedings of the 18th IEEE NORCHIP Conference, pages , Turku, Finland, November
[14] S. Virtanen, T. Lundström, and J. Lilius. A processor design tool for the TACO framework. In Proceedings of 2002 IEEE Norchip Conference, November
[15] S. Virtanen, D. Truscan, and J. Lilius. SystemC based object oriented system design. In Proceedings of the Fourth International Forum on Design Languages (FDL'01),

Acknowledgements

The authors wish to thank M.Sc. Tero Nurmi, University of Turku, for providing physical estimates for the processor architectures discussed in this paper.


More information

Digital Design Methodology

Digital Design Methodology Digital Design Methodology Prof. Soo-Ik Chae Digital System Designs and Practices Using Verilog HDL and FPGAs @ 2008, John Wiley 1-1 Digital Design Methodology (Added) Design Methodology Design Specification

More information

A distributed architecture of IP routers

A distributed architecture of IP routers A distributed architecture of IP routers Tasho Shukerski, Vladimir Lazarov, Ivan Kanev Abstract: The paper discusses the problems relevant to the design of IP (Internet Protocol) routers or Layer3 switches

More information

The Network Layer and Routers

The Network Layer and Routers The Network Layer and Routers Daniel Zappala CS 460 Computer Networking Brigham Young University 2/18 Network Layer deliver packets from sending host to receiving host must be on every host, router in

More information

Last Lecture: Network Layer

Last Lecture: Network Layer Last Lecture: Network Layer 1. Design goals and issues 2. Basic Routing Algorithms & Protocols 3. Addressing, Fragmentation and reassembly 4. Internet Routing Protocols and Inter-networking 5. Router design

More information

Chapter 4: network layer. Network service model. Two key network-layer functions. Network layer. Input port functions. Router architecture overview

Chapter 4: network layer. Network service model. Two key network-layer functions. Network layer. Input port functions. Router architecture overview Chapter 4: chapter goals: understand principles behind services service models forwarding versus routing how a router works generalized forwarding instantiation, implementation in the Internet 4- Network

More information

PARALLEL ALGORITHMS FOR IP SWITCHERS/ROUTERS

PARALLEL ALGORITHMS FOR IP SWITCHERS/ROUTERS THE UNIVERSITY OF NAIROBI DEPARTMENT OF ELECTRICAL AND INFORMATION ENGINEERING FINAL YEAR PROJECT. PROJECT NO. 60 PARALLEL ALGORITHMS FOR IP SWITCHERS/ROUTERS OMARI JAPHETH N. F17/2157/2004 SUPERVISOR:

More information

CSE 3214: Computer Network Protocols and Applications Network Layer

CSE 3214: Computer Network Protocols and Applications Network Layer CSE 314: Computer Network Protocols and Applications Network Layer Dr. Peter Lian, Professor Department of Computer Science and Engineering York University Email: peterlian@cse.yorku.ca Office: 101C Lassonde

More information

Hardware, Software and Mechanical Cosimulation for Automotive Applications

Hardware, Software and Mechanical Cosimulation for Automotive Applications Hardware, Software and Mechanical Cosimulation for Automotive Applications P. Le Marrec, C.A. Valderrama, F. Hessel, A.A. Jerraya TIMA Laboratory 46 Avenue Felix Viallet 38031 Grenoble France fphilippe.lemarrec,

More information

A Pipelined IP Address Lookup Module for 100 Gbps Line Rates and beyond

A Pipelined IP Address Lookup Module for 100 Gbps Line Rates and beyond A Pipelined IP Address Lookup Module for 1 Gbps Line Rates and beyond Domenic Teuchert and Simon Hauger Institute of Communication Networks and Computer Engineering (IKR) Universität Stuttgart, Pfaffenwaldring

More information

Unit 2: High-Level Synthesis

Unit 2: High-Level Synthesis Course contents Unit 2: High-Level Synthesis Hardware modeling Data flow Scheduling/allocation/assignment Reading Chapter 11 Unit 2 1 High-Level Synthesis (HLS) Hardware-description language (HDL) synthesis

More information

Latches. IT 3123 Hardware and Software Concepts. Registers. The Little Man has Registers. Data Registers. Program Counter

Latches. IT 3123 Hardware and Software Concepts. Registers. The Little Man has Registers. Data Registers. Program Counter IT 3123 Hardware and Software Concepts Notice: This session is being recorded. CPU and Memory June 11 Copyright 2005 by Bob Brown Latches Can store one bit of data Can be ganged together to store more

More information

CMSC 332 Computer Networks Network Layer

CMSC 332 Computer Networks Network Layer CMSC 332 Computer Networks Network Layer Professor Szajda CMSC 332: Computer Networks Where in the Stack... CMSC 332: Computer Network 2 Where in the Stack... Application CMSC 332: Computer Network 2 Where

More information

Introduction to TCP/IP Offload Engine (TOE)

Introduction to TCP/IP Offload Engine (TOE) Introduction to TCP/IP Offload Engine (TOE) Version 1.0, April 2002 Authored By: Eric Yeh, Hewlett Packard Herman Chao, QLogic Corp. Venu Mannem, Adaptec, Inc. Joe Gervais, Alacritech Bradley Booth, Intel

More information

Chapter 4. Computer Networking: A Top Down Approach 5 th edition. Jim Kurose, Keith Ross Addison-Wesley, sl April 2009.

Chapter 4. Computer Networking: A Top Down Approach 5 th edition. Jim Kurose, Keith Ross Addison-Wesley, sl April 2009. Chapter 4 Network Layer A note on the use of these ppt slides: We re making these slides freely available to all (faculty, students, readers). They re in PowerPoint form so you can add, modify, and delete

More information

The Central Processing Unit

The Central Processing Unit The Central Processing Unit All computers derive from the same basic design, usually referred to as the von Neumann architecture. This concept involves solving a problem by defining a sequence of commands

More information

Introduction to Routers and LAN Switches

Introduction to Routers and LAN Switches Introduction to Routers and LAN Switches Session 3048_05_2001_c1 2001, Cisco Systems, Inc. All rights reserved. 3 Prerequisites OSI Model Networking Fundamentals 3048_05_2001_c1 2001, Cisco Systems, Inc.

More information

EEC-484/584 Computer Networks

EEC-484/584 Computer Networks EEC-484/584 Computer Networks Lecture 13 wenbing@ieee.org (Lecture nodes are based on materials supplied by Dr. Louise Moser at UCSB and Prentice-Hall) Outline 2 Review of lecture 12 Routing Congestion

More information

Routers: Forwarding EECS 122: Lecture 13

Routers: Forwarding EECS 122: Lecture 13 Input Port Functions Routers: Forwarding EECS 22: Lecture 3 epartment of Electrical Engineering and Computer Sciences University of California Berkeley Physical layer: bit-level reception ata link layer:

More information

Chapter 4 Network Layer

Chapter 4 Network Layer Chapter 4 Network Layer Computer Networking: A Top Down Approach Featuring the Internet, 3 rd edition. Jim Kurose, Keith Ross Addison-Wesley, July 2004. Network Layer 4-1 Chapter 4: Network Layer Chapter

More information

Network Layer: outline

Network Layer: outline Network Layer: outline 1 introduction 2 virtual circuit and datagram networks 3 what s inside a router 4 IP: Internet Protocol datagram format IPv4 addressing ICMP IPv6 5 routing algorithms link state

More information

Overview. Implementing Gigabit Routers with NetFPGA. Basic Architectural Components of an IP Router. Per-packet processing in an IP Router

Overview. Implementing Gigabit Routers with NetFPGA. Basic Architectural Components of an IP Router. Per-packet processing in an IP Router Overview Implementing Gigabit Routers with NetFPGA Prof. Sasu Tarkoma The NetFPGA is a low-cost platform for teaching networking hardware and router design, and a tool for networking researchers. The NetFPGA

More information

A Stream-based Reconfigurable Router Prototype

A Stream-based Reconfigurable Router Prototype A Stream-based Reconfigurable Router Prototype David C. Lee, Scott J. Harper, Peter M. Athanas, and Scott F. Midkiff Bradley Department of Electrical and Computer Engineering Virginia Polytechnic Institute

More information

Lecture 16: Network Layer Overview, Internet Protocol

Lecture 16: Network Layer Overview, Internet Protocol Lecture 16: Network Layer Overview, Internet Protocol COMP 332, Spring 2018 Victoria Manfredi Acknowledgements: materials adapted from Computer Networking: A Top Down Approach 7 th edition: 1996-2016,

More information

Topics for Today. Network Layer. Readings. Introduction Addressing Address Resolution. Sections 5.1,

Topics for Today. Network Layer. Readings. Introduction Addressing Address Resolution. Sections 5.1, Topics for Today Network Layer Introduction Addressing Address Resolution Readings Sections 5.1, 5.6.1-5.6.2 1 Network Layer: Introduction A network-wide concern! Transport layer Between two end hosts

More information

OPTIMIZATION OF IPV6 PACKET S HEADERS OVER ETHERNET FRAME

OPTIMIZATION OF IPV6 PACKET S HEADERS OVER ETHERNET FRAME OPTIMIZATION OF IPV6 PACKET S HEADERS OVER ETHERNET FRAME 1 FAHIM A. AHMED GHANEM1, 2 VILAS M. THAKARE 1 Research Student, School of Computational Sciences, Swami Ramanand Teerth Marathwada University,

More information

DESIGN OF TRANSPORT TRIGGERED ARCHITECTURE PROCESSOR FOR DISCRETE COSINE TRANSFORM

DESIGN OF TRANSPORT TRIGGERED ARCHITECTURE PROCESSOR FOR DISCRETE COSINE TRANSFORM DESG F ASP GGEED ACHECUE PCESS F DSCEE CSE ASF Jari Heikkinen, Jaakko Sertamo, ino autiainen, and Jarmo akala ampere University of echnology, P..B. 553, F-33101 ampere, Finland ABSAC he trend in programmable

More information

Network Layer PREPARED BY AHMED ABDEL-RAOUF

Network Layer PREPARED BY AHMED ABDEL-RAOUF Network Layer PREPARED BY AHMED ABDEL-RAOUF Network layer transport segment from sending to receiving host on sending side encapsulates segments into datagrams on receiving side, delivers segments to transport

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION CHAPTER 1 INTRODUCTION Rapid advances in integrated circuit technology have made it possible to fabricate digital circuits with large number of devices on a single chip. The advantages of integrated circuits

More information

Lecture 3. The Network Layer (cont d) Network Layer 1-1

Lecture 3. The Network Layer (cont d) Network Layer 1-1 Lecture 3 The Network Layer (cont d) Network Layer 1-1 Agenda The Network Layer (cont d) What is inside a router? Internet Protocol (IP) IPv4 fragmentation and addressing IP Address Classes and Subnets

More information

Digital Design Methodology (Revisited) Design Methodology: Big Picture

Digital Design Methodology (Revisited) Design Methodology: Big Picture Digital Design Methodology (Revisited) Design Methodology Design Specification Verification Synthesis Technology Options Full Custom VLSI Standard Cell ASIC FPGA CS 150 Fall 2005 - Lec #25 Design Methodology

More information

Network Layer: Control/data plane, addressing, routers

Network Layer: Control/data plane, addressing, routers Network Layer: Control/data plane, addressing, routers CS 352, Lecture 10 http://www.cs.rutgers.edu/~sn624/352-s19 Srinivas Narayana (heavily adapted from slides by Prof. Badri Nath and the textbook authors)

More information

Experience with the NetFPGA Program

Experience with the NetFPGA Program Experience with the NetFPGA Program John W. Lockwood Algo-Logic Systems Algo-Logic.com With input from the Stanford University NetFPGA Group & Xilinx XUP Program Sunday, February 21, 2010 FPGA-2010 Pre-Conference

More information

Routers: Forwarding EECS 122: Lecture 13

Routers: Forwarding EECS 122: Lecture 13 Routers: Forwarding EECS 122: Lecture 13 epartment of Electrical Engineering and Computer Sciences University of California Berkeley Router Architecture Overview Two key router functions: run routing algorithms/protocol

More information

Users Guide: Fast IP Lookup (FIPL) in the FPX

Users Guide: Fast IP Lookup (FIPL) in the FPX Users Guide: Fast IP Lookup (FIPL) in the FPX Gigabit Kits Workshop /22 FIPL System Design Each FIPL Engine performs a longest matching prefix lookup on a single 32-bit IPv4 destination address FIPL Engine

More information

COMP211 Chapter 4 Network Layer: The Data Plane

COMP211 Chapter 4 Network Layer: The Data Plane COMP211 Chapter 4 Network Layer: The Data Plane All material copyright 1996-2016 J.F Kurose and K.W. Ross, All Rights Reserved Computer Networking: A Top Down Approach 7 th edition Jim Kurose, Keith Ross

More information

Introduction to Electronic Design Automation. Model of Computation. Model of Computation. Model of Computation

Introduction to Electronic Design Automation. Model of Computation. Model of Computation. Model of Computation Introduction to Electronic Design Automation Model of Computation Jie-Hong Roland Jiang 江介宏 Department of Electrical Engineering National Taiwan University Spring 03 Model of Computation In system design,

More information

Routers Technologies & Evolution for High-Speed Networks

Routers Technologies & Evolution for High-Speed Networks Routers Technologies & Evolution for High-Speed Networks C. Pham Université de Pau et des Pays de l Adour http://www.univ-pau.fr/~cpham Congduc.Pham@univ-pau.fr Router Evolution slides from Nick McKeown,

More information

Router Construction. Workstation-Based. Switching Hardware Design Goals throughput (depends on traffic model) scalability (a function of n) Outline

Router Construction. Workstation-Based. Switching Hardware Design Goals throughput (depends on traffic model) scalability (a function of n) Outline Router Construction Outline Switched Fabrics IP Routers Tag Switching Spring 2002 CS 461 1 Workstation-Based Aggregate bandwidth 1/2 of the I/O bus bandwidth capacity shared among all hosts connected to

More information

Networking for Data Acquisition Systems. Fabrice Le Goff - 14/02/ ISOTDAQ

Networking for Data Acquisition Systems. Fabrice Le Goff - 14/02/ ISOTDAQ Networking for Data Acquisition Systems Fabrice Le Goff - 14/02/2018 - ISOTDAQ Outline Generalities The OSI Model Ethernet and Local Area Networks IP and Routing TCP, UDP and Transport Efficiency Networking

More information

Lecture 13: Interconnection Networks. Topics: lots of background, recent innovations for power and performance

Lecture 13: Interconnection Networks. Topics: lots of background, recent innovations for power and performance Lecture 13: Interconnection Networks Topics: lots of background, recent innovations for power and performance 1 Interconnection Networks Recall: fully connected network, arrays/rings, meshes/tori, trees,

More information

ISSN: [Bilani* et al.,7(2): February, 2018] Impact Factor: 5.164

ISSN: [Bilani* et al.,7(2): February, 2018] Impact Factor: 5.164 IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY A REVIEWARTICLE OF SDRAM DESIGN WITH NECESSARY CRITERIA OF DDR CONTROLLER Sushmita Bilani *1 & Mr. Sujeet Mishra 2 *1 M.Tech Student

More information

ECE 587 Hardware/Software Co-Design Lecture 23 Hardware Synthesis III

ECE 587 Hardware/Software Co-Design Lecture 23 Hardware Synthesis III ECE 587 Hardware/Software Co-Design Spring 2018 1/28 ECE 587 Hardware/Software Co-Design Lecture 23 Hardware Synthesis III Professor Jia Wang Department of Electrical and Computer Engineering Illinois

More information

Towards a Dynamically Reconfigurable System-on-Chip Platform for Video Signal Processing

Towards a Dynamically Reconfigurable System-on-Chip Platform for Video Signal Processing Towards a Dynamically Reconfigurable System-on-Chip Platform for Video Signal Processing Walter Stechele, Stephan Herrmann, Andreas Herkersdorf Technische Universität München 80290 München Germany Walter.Stechele@ei.tum.de

More information

Topic & Scope. Content: The course gives

Topic & Scope. Content: The course gives Topic & Scope Content: The course gives an overview of network processor cards (architectures and use) an introduction of how to program Intel IXP network processors some ideas of how to use network processors

More information

4. Hardware Platform: Real-Time Requirements

4. Hardware Platform: Real-Time Requirements 4. Hardware Platform: Real-Time Requirements Contents: 4.1 Evolution of Microprocessor Architecture 4.2 Performance-Increasing Concepts 4.3 Influences on System Architecture 4.4 A Real-Time Hardware Architecture

More information

CS 356: Computer Network Architectures. Lecture 14: Switching hardware, IP auxiliary functions, and midterm review. [PD] chapter 3.4.1, 3.2.

CS 356: Computer Network Architectures. Lecture 14: Switching hardware, IP auxiliary functions, and midterm review. [PD] chapter 3.4.1, 3.2. CS 356: Computer Network Architectures Lecture 14: Switching hardware, IP auxiliary functions, and midterm review [PD] chapter 3.4.1, 3.2.7 Xiaowei Yang xwy@cs.duke.edu Switching hardware Software switch

More information

Understanding Cisco Express Forwarding

Understanding Cisco Express Forwarding Understanding Cisco Express Forwarding Document ID: 47321 Contents Introduction Prerequisites Requirements Components Used Conventions Overview CEF Operations Updating the GRP's Routing Tables Packet Forwarding

More information

Network on Chip Architecture: An Overview

Network on Chip Architecture: An Overview Network on Chip Architecture: An Overview Md Shahriar Shamim & Naseef Mansoor 12/5/2014 1 Overview Introduction Multi core chip Challenges Network on Chip Architecture Regular Topology Irregular Topology

More information

A Reconfigurable Crossbar Switch with Adaptive Bandwidth Control for Networks-on

A Reconfigurable Crossbar Switch with Adaptive Bandwidth Control for Networks-on A Reconfigurable Crossbar Switch with Adaptive Bandwidth Control for Networks-on on-chip Donghyun Kim, Kangmin Lee, Se-joong Lee and Hoi-Jun Yoo Semiconductor System Laboratory, Dept. of EECS, Korea Advanced

More information

Tutorial 9. SOLUTION Since the number of supported interfaces is different for each subnet, this is a Variable- Length Subnet Masking (VLSM) problem.

Tutorial 9. SOLUTION Since the number of supported interfaces is different for each subnet, this is a Variable- Length Subnet Masking (VLSM) problem. Tutorial 9 1 Router Architecture Consider a router with a switch fabric, 2 input ports (A and B) and 2 output ports (C and D). Suppose the switch fabric operates at 1.5 times the line speed. a. If, for

More information

The Design and Implementation of a Low-Latency On-Chip Network

The Design and Implementation of a Low-Latency On-Chip Network The Design and Implementation of a Low-Latency On-Chip Network Robert Mullins 11 th Asia and South Pacific Design Automation Conference (ASP-DAC), Jan 24-27 th, 2006, Yokohama, Japan. Introduction Current

More information

CC312: Computer Organization

CC312: Computer Organization CC312: Computer Organization 1 Chapter 1 Introduction Chapter 1 Objectives Know the difference between computer organization and computer architecture. Understand units of measure common to computer systems.

More information

Hardware/Software Partitioning for SoCs. EECE Advanced Topics in VLSI Design Spring 2009 Brad Quinton

Hardware/Software Partitioning for SoCs. EECE Advanced Topics in VLSI Design Spring 2009 Brad Quinton Hardware/Software Partitioning for SoCs EECE 579 - Advanced Topics in VLSI Design Spring 2009 Brad Quinton Goals of this Lecture Automatic hardware/software partitioning is big topic... In this lecture,

More information

Automatic compilation framework for Bloom filter based intrusion detection

Automatic compilation framework for Bloom filter based intrusion detection Automatic compilation framework for Bloom filter based intrusion detection Dinesh C Suresh, Zhi Guo*, Betul Buyukkurt and Walid A. Najjar Department of Computer Science and Engineering *Department of Electrical

More information

INTEL Architectures GOPALAKRISHNAN IYER FALL 2009 ELEC : Computer Architecture and Design

INTEL Architectures GOPALAKRISHNAN IYER FALL 2009 ELEC : Computer Architecture and Design INTEL Architectures GOPALAKRISHNAN IYER FALL 2009 GBI0001@AUBURN.EDU ELEC 6200-001: Computer Architecture and Design Silicon Technology Moore s law Moore's Law describes a long-term trend in the history

More information

VLSI Design Automation. Calcolatori Elettronici Ing. Informatica

VLSI Design Automation. Calcolatori Elettronici Ing. Informatica VLSI Design Automation 1 Outline Technology trends VLSI Design flow (an overview) 2 IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing

More information

Position of IP and other network-layer protocols in TCP/IP protocol suite

Position of IP and other network-layer protocols in TCP/IP protocol suite Position of IP and other network-layer protocols in TCP/IP protocol suite IPv4 is an unreliable datagram protocol a best-effort delivery service. The term best-effort means that IPv4 packets can be corrupted,

More information

AN ASSOCIATIVE TERNARY CACHE FOR IP ROUTING. 1. Introduction. 2. Associative Cache Scheme

AN ASSOCIATIVE TERNARY CACHE FOR IP ROUTING. 1. Introduction. 2. Associative Cache Scheme AN ASSOCIATIVE TERNARY CACHE FOR IP ROUTING James J. Rooney 1 José G. Delgado-Frias 2 Douglas H. Summerville 1 1 Dept. of Electrical and Computer Engineering. 2 School of Electrical Engr. and Computer

More information

VLSI Design Automation

VLSI Design Automation VLSI Design Automation IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing Programmable PLA, FPGA Embedded systems Used in cars,

More information

Last time. BGP policy. Broadcast / multicast routing. Link virtualization. Spanning trees. Reverse path forwarding, pruning Tunneling

Last time. BGP policy. Broadcast / multicast routing. Link virtualization. Spanning trees. Reverse path forwarding, pruning Tunneling Last time BGP policy Broadcast / multicast routing Spanning trees Source-based, group-shared, center-based Reverse path forwarding, pruning Tunneling Link virtualization Whole networks can act as an Internet

More information

Computer Architecture. R. Poss

Computer Architecture. R. Poss Computer Architecture R. Poss 1 ca01-10 september 2015 Course & organization 2 ca01-10 september 2015 Aims of this course The aims of this course are: to highlight current trends to introduce the notion

More information

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN. 5 th Edition. The Hardware/Software Interface. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

Lecture: Interconnection Networks

Lecture: Interconnection Networks Lecture: Interconnection Networks Topics: Router microarchitecture, topologies Final exam next Tuesday: same rules as the first midterm 1 Packets/Flits A message is broken into multiple packets (each packet

More information

II. Principles of Computer Communications Network and Transport Layer

II. Principles of Computer Communications Network and Transport Layer II. Principles of Computer Communications Network and Transport Layer A. Internet Protocol (IP) IPv4 Header An IP datagram consists of a header part and a text part. The header has a 20-byte fixed part

More information

Design of Embedded DSP Processors

Design of Embedded DSP Processors Design of Embedded DSP Processors Unit 3: Microarchitecture, Register file, and ALU 9/11/2017 Unit 3 of TSEA26-2017 H1 1 Contents 1. Microarchitecture and its design 2. Hardware design fundamentals 3.

More information

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor

COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. 5 th. Edition. Chapter 4. The Processor COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface 5 th Edition Chapter 4 The Processor Introduction CPU performance factors Instruction count Determined by ISA and compiler CPI and Cycle

More information

Controller Synthesis for Hardware Accelerator Design

Controller Synthesis for Hardware Accelerator Design ler Synthesis for Hardware Accelerator Design Jiang, Hongtu; Öwall, Viktor 2002 Link to publication Citation for published version (APA): Jiang, H., & Öwall, V. (2002). ler Synthesis for Hardware Accelerator

More information

Integrated Services. Integrated Services. RSVP Resource reservation Protocol. Expedited Forwarding. Assured Forwarding.

Integrated Services. Integrated Services. RSVP Resource reservation Protocol. Expedited Forwarding. Assured Forwarding. Integrated Services An architecture for streaming multimedia Aimed at both unicast and multicast applications An example of unicast: a single user streaming a video clip from a news site An example of

More information

This document provides an overview of buffer tuning based on current platforms, and gives general information about the show buffers command.

This document provides an overview of buffer tuning based on current platforms, and gives general information about the show buffers command. Contents Introduction Prerequisites Requirements Components Used Conventions General Overview Low-End Platforms (Cisco 1600, 2500, and 4000 Series Routers) High-End Platforms (Route Processors, Switch

More information

System Level Design with IBM PowerPC Models

System Level Design with IBM PowerPC Models September 2005 System Level Design with IBM PowerPC Models A view of system level design SLE-m3 The System-Level Challenges Verification escapes cost design success There is a 45% chance of committing

More information

1.3 Data processing; data storage; data movement; and control.

1.3 Data processing; data storage; data movement; and control. CHAPTER 1 OVERVIEW ANSWERS TO QUESTIONS 1.1 Computer architecture refers to those attributes of a system visible to a programmer or, put another way, those attributes that have a direct impact on the logical

More information

Development of Integrated Hard- and Software Systems: Tasks and Processes

Development of Integrated Hard- and Software Systems: Tasks and Processes TECHNISCHE UNIVERSITÄT ILMENAU Development of Integrated Hard- and Software Systems: Tasks and Processes Integrated Communication Systems http://www.tu-ilmenau.de/iks General Development Tasks Analysis

More information

Development of Integrated Hard- and Software Systems: Tasks and Processes

Development of Integrated Hard- and Software Systems: Tasks and Processes TECHNISCHE UNIVERSITÄT ILMENAU Development of Integrated Hard- and Software Systems: Tasks and Processes Integrated Hard- and Software Systems http://www.tu-ilmenau.de/ihs System Development Poor Process

More information

Computer Networks. Instructor: Niklas Carlsson

Computer Networks. Instructor: Niklas Carlsson Computer Networks Instructor: Niklas Carlsson Email: niklas.carlsson@liu.se Notes derived from Computer Networking: A Top Down Approach, by Jim Kurose and Keith Ross, Addison-Wesley. The slides are adapted

More information

Code Compression for DSP

Code Compression for DSP Code for DSP Charles Lefurgy and Trevor Mudge {lefurgy,tnm}@eecs.umich.edu EECS Department, University of Michigan 1301 Beal Ave., Ann Arbor, MI 48109-2122 http://www.eecs.umich.edu/~tnm/compress Abstract

More information

Table of Contents. Cisco Buffer Tuning for all Cisco Routers

Table of Contents. Cisco Buffer Tuning for all Cisco Routers Table of Contents Buffer Tuning for all Cisco Routers...1 Interactive: This document offers customized analysis of your Cisco device...1 Introduction...1 Prerequisites...1 Requirements...1 Components Used...1

More information

CSE398: Network Systems Design

CSE398: Network Systems Design CSE398: Network Systems Design Instructor: Dr. Liang Cheng Department of Computer Science and Engineering P.C. Rossin College of Engineering & Applied Science Lehigh University April 04, 2005 Outline Recap

More information

Chapter 4: network layer

Chapter 4: network layer Chapter 4: network layer chapter goals: understand principles behind network layer services: network layer service models forwarding versus routing how a router works routing (path selection) broadcast,

More information

Multi processor systems with configurable hardware acceleration

Multi processor systems with configurable hardware acceleration Multi processor systems with configurable hardware acceleration Ph.D in Electronics, Computer Science and Telecommunications Ph.D Student: Davide Rossi Ph.D Tutor: Prof. Roberto Guerrieri Outline Motivations

More information

FABRICATION TECHNOLOGIES

FABRICATION TECHNOLOGIES FABRICATION TECHNOLOGIES DSP Processor Design Approaches Full custom Standard cell** higher performance lower energy (power) lower per-part cost Gate array* FPGA* Programmable DSP Programmable general

More information

ARM ARCHITECTURE. Contents at a glance:

ARM ARCHITECTURE. Contents at a glance: UNIT-III ARM ARCHITECTURE Contents at a glance: RISC Design Philosophy ARM Design Philosophy Registers Current Program Status Register(CPSR) Instruction Pipeline Interrupts and Vector Table Architecture

More information