WITH the development of the semiconductor technology,

Similar documents
Noc Evolution and Performance Optimization by Addition of Long Range Links: A Survey. By Naveen Choudhary & Vaishali Maheshwari

Network on Chip Architectures BY JAGAN MURALIDHARAN NIRAJ VASUDEVAN

Routing Algorithms, Process Model for Quality of Services (QoS) and Architectures for Two-Dimensional 4 4 Mesh Topology Network-on-Chip

MinRoot and CMesh: Interconnection Architectures for Network-on-Chip Systems

Network on Chip Architecture: An Overview

BARP-A Dynamic Routing Protocol for Balanced Distribution of Traffic in NoCs

Multi-path Routing for Mesh/Torus-Based NoCs

Fault-Tolerant Multiple Task Migration in Mesh NoC s over virtual Point-to-Point connections

Escape Path based Irregular Network-on-chip Simulation Framework

High Performance Interconnect and NoC Router Design

FPGA BASED ADAPTIVE RESOURCE EFFICIENT ERROR CONTROL METHODOLOGY FOR NETWORK ON CHIP

on Chip Architectures for Multi Core Systems

Modeling the Traffic Effect for the Application Cores Mapping Problem onto NoCs

Bursty Communication Performance Analysis of Network-on-Chip with Diverse Traffic Permutations

Oblivious Routing Design for Mesh Networks to Achieve a New Worst-Case Throughput Bound

SURVEY ON LOW-LATENCY AND LOW-POWER SCHEMES FOR ON-CHIP NETWORKS

Fault Tolerant and Secure Architectures for On Chip Networks With Emerging Interconnect Technologies. Mohsin Y Ahmed Conlan Wesson

STG-NoC: A Tool for Generating Energy Optimized Custom Built NoC Topology

Evaluation of NOC Using Tightly Coupled Router Architecture

A Literature Review of on-chip Network Design using an Agent-based Management Method

A 256-Radix Crossbar Switch Using Mux-Matrix-Mux Folded-Clos Topology

Network-on-chip (NOC) Topologies

FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC)

A Layer-Multiplexed 3D On-Chip Network Architecture Rohit Sunkam Ramanujam and Bill Lin

HARDWARE IMPLEMENTATION OF PIPELINE BASED ROUTER DESIGN FOR ON- CHIP NETWORK

Design and Implementation of Buffer Loan Algorithm for BiNoC Router

Available online at ScienceDirect. Procedia Computer Science 89 (2016 )

NEtwork-on-Chip (NoC) [3], [6] is a scalable interconnect

Mapping and Configuration Methods for Multi-Use-Case Networks on Chips

A Strategy for Interconnect Testing in Stacked Mesh Network-on- Chip

Lecture 2: Topology - I

Dynamic Stress Wormhole Routing for Spidergon NoC with effective fault tolerance and load distribution

CAD System Lab Graduate Institute of Electronics Engineering National Taiwan University Taipei, Taiwan, ROC

High Throughput and Low Power NoC

Implementation of Ant Colony Optimization Adaptive Network- On-Chip Routing Framework Using Network Information Region

Networks-on-Chip Router: Configuration and Implementation

Improving Routing Efficiency for Network-on-Chip through Contention-Aware Input Selection

FT-Z-OE: A Fault Tolerant and Low Overhead Routing Algorithm on TSV-based 3D Network on Chip Links

Demand Based Routing in Network-on-Chip(NoC)

A Survey of Techniques for Power Aware On-Chip Networks.

Trading hardware overhead for communication performance in mesh-type topologies

SONA: An On-Chip Network for Scalable Interconnection of AMBA-Based IPs*

A Novel Energy Efficient Source Routing for Mesh NoCs

A NEW ROUTER ARCHITECTURE FOR DIFFERENT NETWORK- ON-CHIP TOPOLOGIES

Performance- and Cost-Aware Topology Generation Based on Clustering for Application-Specific Network on Chip

A Flexible Design of Network on Chip Router based on Handshaking Communication Mechanism

Design of Low-Power and Low-Latency 256-Radix Crossbar Switch Using Hyper-X Network Topology

Transaction Level Model Simulator for NoC-based MPSoC Platform

FCUDA-NoC: A Scalable and Efficient Network-on-Chip Implementation for the CUDA-to-FPGA Flow

An Energy Efficient Topology Augmentation Methodology Using Hash Based Smart Shortcut Links in 2-D Mesh

Design and Implementation of a Packet Switched Dynamic Buffer Resize Router on FPGA Vivek Raj.K 1 Prasad Kumar 2 Shashi Raj.K 3

STLAC: A Spatial and Temporal Locality-Aware Cache and Networkon-Chip

Design and Implementation of Low Complexity Router for 2D Mesh Topology using FPGA

Clustering-Based Topology Generation Approach for Application-Specific Network on Chip

ISSN Vol.04,Issue.01, January-2016, Pages:

Exploring Optimal Topology and Routing Algorithm for 3D Network on Chip

Design of a router for network-on-chip. Jun Ho Bahn,* Seung Eun Lee and Nader Bagherzadeh

A MULTI-PATH ROUTING SCHEME FOR TORUS-BASED NOCS 1. Abstract: In Networks-on-Chip (NoC) designs, crosstalk noise has become a serious issue

Power and Performance Efficient Partial Circuits in Packet-Switched Networks-on-Chip

AREA-EFFICIENT DESIGN OF SCHEDULER FOR ROUTING NODE OF NETWORK-ON-CHIP

Modeling NoC Traffic Locality and Energy Consumption with Rent s Communication Probability Distribution

Energy-efficient Custom Topology-based Dynamic Voltage-frequency Island-enabled Network-on-chip Design

LOW POWER REDUCED ROUTER NOC ARCHITECTURE DESIGN WITH CLASSICAL BUS BASED SYSTEM

Efficient And Advance Routing Logic For Network On Chip

POLYMORPHIC ON-CHIP NETWORKS

DLABS: a Dual-Lane Buffer-Sharing Router Architecture for Networks on Chip

NETWORKS-ON-CHIP (NoC) plays an important role in

A DRAM Centric NoC Architecture and Topology Design Approach

Application Specific Buffer Allocation for Wormhole Routing Networks-on-Chip

Lecture 12: Interconnection Networks. Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E)

A Fault Tolerant NoC Architecture for Reliability Improvement and Latency Reduction

On the Physicl Layout of PRDT-Based NoCs

A NEW DEADLOCK-FREE FAULT-TOLERANT ROUTING ALGORITHM FOR NOC INTERCONNECTIONS

Design and Implementation of Multistage Interconnection Networks for SoC Networks

Design of a System-on-Chip Switched Network and its Design Support Λ

Partitioning Methods for Multicast in Bufferless 3D Network on Chip

Design And Verification of 10X10 Router For NOC Applications

Performance of Multihop Communications Using Logical Topologies on Optical Torus Networks

AC : HOT SPOT MINIMIZATION OF NOC USING ANT-NET DYNAMIC ROUTING ALGORITHM

Application-Specific Network-on-Chip Architecture Customization via Long-Range Link Insertion

Interconnection Networks: Topology. Prof. Natalie Enright Jerger

SoC Design Lecture 13: NoC (Network-on-Chip) Department of Computer Engineering Sharif University of Technology

Lecture: Interconnection Networks

Evaluation of Effect of Packet Injection Rate and Routing Algorithm on Network-on-Chip Performance

SERVICE ORIENTED REAL-TIME BUFFER MANAGEMENT FOR QOS ON ADAPTIVE ROUTERS

A Thermal-aware Application specific Routing Algorithm for Network-on-chip Design

Implementation of PNoC and Fault Detection on FPGA

OVERVIEW: NETWORK ON CHIP 3D ARCHITECTURE

NED: A Novel Synthetic Traffic Pattern for Power/Performance Analysis of Network-on-chips Using Negative Exponential Distribution

International Journal of Research and Innovation in Applied Science (IJRIAS) Volume I, Issue IX, December 2016 ISSN

A Multicast Routing Algorithm for 3D Network-on-Chip in Chip Multi-Processors

Interconnection Networks

Temperature and Traffic Information Sharing Network in 3D NoC

Mapping and Configuration Methods for Multi-Use-Case Networks on Chips

Parallelized Network-on-Chip-Reused Test Access Mechanism for Multiple Identical Cores

Lecture 7: Flow Control - I

Lecture 18: Communication Models and Architectures: Interconnection Networks

4. Networks. in parallel computers. Advances in Computer Architecture

Topologies. Maurizio Palesi. Maurizio Palesi 1

The (Low) Power of Less Wiring: Enabling Energy Efficiency in Many-Core Platforms Through Wireless NoC (Invited Paper)

Transcription:

Dual-Link Hierarchical Cluster-Based Interconnect Architecture for 3D Network on Chip Guang Sun, Yong Li, Yuanyuan Zhang, Shijun Lin, Li Su, Depeng Jin and Lieguang zeng Abstract Network on Chip (NoC) has emerged as a promising on chip communication infrastructure. Three Dimensional Integrate Circuit (3D IC) provides small interconnection length between layers and the interconnect scalability in the third dimension, which can further improve the performance of NoC. Therefore, in this paper, a hierarchical cluster-based interconnect architecture is merged with the 3D IC. This interconnect architecture significantly reduces the number of long wires. Since this architecture only has approximately a quarter of routers in 3D mesh-based architecture, the average number of hops is smaller, which leads to lower latency and higher throughput. Moreover, smaller number of routers decreases the area overhead. Meanwhile, some dual links are inserted into the bottlenecks of communication to improve the performance of NoC. Simulation results demonstrate our theoretical analysis and show the advantages of our proposed architecture in latency, throughput and area, when compared with 3D mesh-based architecture. Keywords Network on Chip (NoC), interconnect architecture, performance, area, Three Dimensional Integrate Circuit (3D IC). I. INTRODUCTION WITH the development of the semiconductor technology, large quantities of transistors are available on a single chip, which allows designers to integrate numerous processors together with large amounts of embedded memory [1][2]. In order to alleviate the complex communication issues which occur as the number of on-chip components increases, Network on Chip (NoC) architecture has been recently proposed as a promising communication paradigm to replace global interconnects [3][4][5]. NoC provides lower power consumption and better performance, flexibility and scalability compared to previous solutions for on-chip communication [3][6][7]. The ability of the network to efficiently disseminate information depends largely on the underlying topology architecture [3]. The simplicity and regularity of grid structures make design approaches based on such a modular topology (e.g., mesh and torus) very attractive [3]. High radix networks like the flattened butterfly [8] reduce latency and power by reducing the number of intermediate routers. However, they increase the number of long wires. Three Dimensional Integrate Circuit (3D IC) emerges as an attractive option, for the reduction of interconnection length and the added interconnect scalability in the third dimension offer an opportunity to further improve the performance of NoC. As show in Fig.1, 3D mesh-based NoC provides the interconnect scalability in Guang Sun is with State Key Laboratory on Microwave and Digital Communications, Tsinghua National Laboratory for Information Science and Technology, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China, Email: sung08@mails.thu.edu.cn, jindp@mail.tsinghua.edu.cn. the third dimension. Moreover, the length of the through-via interconnection between layers ranges from 5 μm to 50 μm [3], which is much smaller than the intra-layer wiring length. Fig. 1. The interconnect architecture of the 3D mesh-based NoC. In this paper, a hierarchical cluster-based interconnect architecture is merged with the 3D IC. Compared with 3D meshbased topology, this topology reduces the number of intermediate routers and leads to lower latency and area. Moreover, the added interconnect scalability in the third dimension and the small interconnection length between layers reduce the number of long wires. Meanwhile, in order to improve the throughput of the network, we insert some dual links into the bottlenecks of communication. The rest of this paper is organized as follows. Section II describes our proposed interconnect architecture. Section III gives the performance analysis. In Section IV we show the simulation results compared with 3D mesh-based topology. Finally, in Section V we draw conclusions. II. PROPOSED INTERCONNECT ARCHITECTURE In this section, we first describe the hierarchical clusterbased topology for 2D NoC, and then expand it to 3D NoC. A. Hierarchical Cluster-Based Topology for 2D NoC The hierarchical cluster-based interconnect architecture for 2D NoC is showed in Fig. 2. Each local router is connected with four IPs and meanwhile each local router is connected to higher hierarchy router, namely global router. Thus, the communications from local router to global router become the bottlenecks of the whole network, and we insert dual links to improve the throughput. This topology contains 4 n Intelligence Properties (IPs) and approximately 4 n 1 routers, where n is the number of the hierarchies. Therefore, there are small number of intermediate routers in this topology. However, the interconnect length from the local router to the global router gets longer when the number of IPs increasing. 1388

(a) (b) Single Link Dual Link IP Models Global Router Local Router Fig. 2. The Dual-Link Hierarchical Cluster-Based Topology for 2D NoC with different number of IPs. (a)4 IPs (b) 16 IPs B. Hierarchical Cluster-Based Topology for 3D NoC Fig. 3 shows the hierarchical cluster-based interconnect architecture for 3D NoC. In this topology, each local router not only connects to the global router in the intra-layer, but also connects to the directly upper and below local routers in the adjacent layers. Moreover, we insert dual links into these connections due to the bottlenecks of communication in the network. Different with the situation in 2D NoC, the small interconnection length between layers in 3D NoC increases the interconnect scalability in the third dimension, reduces the number of long wires and improves the performance of network. The added interconnect scalability in the third dimension leads to small number of IPs in each layer (usually 16 IPs). Therefore, the length of long wires is acceptant. Meanwhile, in order to reduce the impact of long wires on performance, repeaters and pipeline registers can be inserted into the long wires. Fig. 3. Single Link Dual Link IP Models Global Router Local Router The Dual-Link Hierarchical Cluster-Based Topology for 3D NoC III. PERFORMANCE ANALYSIS With the knowledge of the proposed interconnect architecture, next, we will analyze its performance. First, we calculate and compare the average number of hops in our proposed topology and 3D mesh-based topology. The decrease of the average number of hops usually leads to a decrease of average latency and an increase of throughput. And then we discuss the impact of multi-link on performance (e.g. latency and throughput) and area. A. Average Number of Hops Intuitively, our proposed topology has smaller average number of hops than 3D mesh-based topology, because our topology only has approximately a quarter of routers in 3D mesh-based topology. Next, we will calculate and compare the average number of hops in the two topologies in which each layer has 16 IPs (4 4). In our discussion, it is assumed that the destination addresses of the generated messages are uniformly distributed across all of the IP cores and meanwhile each IP core doesn t send messages to itself. In our proposed topology, the close form expression of the average number of hops, denoted by H our, is given by, H our = 16k2 +72k 16 3(16k 1) where k is the number of layers in 3D NoC. The proof of equation (1) is given in the Appendix A. In 3D mesh-based topology, the average number of hops, denoted by H 3D mesh, is given by [9] H 3D mesh = 16k2 + 120k 16 3(16k 1) Compared with 3D mesh-based topology, the average number of hops in our topology is smaller, which is showed in Fig.4. Moreover, the decrement, denoted by H 3Dmesh our,is given by H 3Dmesh our = H 3D mesh H our = 16k (3) 16k 1 The Average Number Of Hops 7 6 5 4 3 2 3D Mesh-Based Topology Our Proposed Topology 1 0 2 4 6 8 10 12 The Number Of Layers Fig. 4. The average number of hops in our proposed topology and 3D mesh-based topology B. Single Link Versus Multiple Links One of the key differences between on chip and inter-chip interconnects is that there are more wire resources on chip, while inter-chip connections are normally limited by available chip IO pins [10]. One way is to increase the width of the links in NoC to improve performance [11]. we explore another option of increasing the number of links in the bottlenecks of (1) (2) 1389

the communication. The difference between multi-link architecture and virtual channels is that virtual channels increase the number of buffers and utilization of links, while multilink architecture increases the number of connecting links and then improves the bandwidth of communication. Therefore, inserting some links into the bottlenecks of the communication can improve the performance (e.g. latency and throughput) efficiently. However, multi-link architecture increases the area overhead. For the tradeoff between performance and area, we only insert some dual links in the bottlenecks of communication in NoC. IV. PERFORMANCE EVALUATION In order to evaluate the performance of our proposed interconnect architecture, we compare it with 3D mesh-based architecture. VHDL language is used to design our performance simulation platform, because it is believed that platform designed by hardware description language is more similar to realistic on-chip network [3]. We set up a standard simulation model as follows: 1) Both interconnect architectures have 32 IP cores (4 4 2). 2) Fixed-length messages are broken into 8 flits, and each flit is 32 bits wide. 3) IP cores independently generate messages and follow a Poisson process. 4) Message destinations are uniformly distributed across all the IP cores. 5) Each physical link has 4 virtual channels. 6) Wormhole switch and shortest path routing is used. 7) Buffers in the source IP core have infinite capacity. The project is synthesized in Stratix EP1S80F1508C5. First, we clarify the definitions of the latency, throughput and area overhead discussed in this section [12]. The latency, which is measured by cycles, is defined as the length of time elapses between the occurrence of the message header at the source IP core and the reception of the message tail at the destination IP core. The throughput, denoted by TP, is defined by TP = M L (4) I T where M is the number of messages that successfully arrive at their destination IPs. L denotes the message length measured by flits. I is the number of IP cores in NoC. T refers to the time (in cycles) from the occurrence of the first message generation and the last message reception. Therefore, throughput is measured as the maximal load that the network is capable of physically handling. Area overhead is the area required by all routers. Next, we evaluate the performance of the two interconnect architectures. Fig.5 shows the comparison of the average latency. Table I compares the maximal throughput and area overhead. It is observed in Fig.5 that compared with 3D mesh-based architecture, the average latency in our architecture decreases by 17% at most. In Table I, the maximal throughput increases by 8.5% and the area overhead in logical elements(les) decreases by 25% mainly caused by smaller number of routers. Fig. 5. Average Message Latency(cycles) 80 70 60 50 40 30 20 10 Our Architecture 3D Mesh-Based Architecture 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Injection Load(flits/cycle/IP) Comparison of the average latency versus injection load TABLE I COMPARISON OF THE THROUGHPUT AND AREA OVERHEAD. Maximal throughput Area overhead 3D mesh-based architecture 0.71 flits/cycle/ip 748715 LEs Our architecture 0.77 flits/cycle/ip 561240 LEs V. CONCLUSION In this paper, we propose a dual-link hierarchical clusterbased interconnect architecture for 3D NoC. It only has approximately a quarter of routers in 3D mesh-based architecture, which results in low area overhead and small average number of hops. The decrease of the average number of hops leads to a decrease of average latency and an increase of throughput. Moreover, in order to improve performance, we insert dual links into some connections which are the bottlenecks of communication in the network. Simulation results show that compared with 3D mesh-based architecture, the average latency in our architecture decreases by 17%, the maximal throughput increases by 8.5% and the area overhead decreases by 25%. APPENDIX A PROOF OF EQUATION (1) Assume k is the number of layers in 3D NoC and each layer has 16 IPs. H ij denotes the total hops from one source IP which is in layer i to destination IPs which are in layer j. Thus, we can get the results as follow: H ii =0 3+2 12 = 24, i [1,k] (5) H ij = j i 16 + 0 4+2 12 = j i 16 + 24, i, j [1,k] The total hops from one source IP which is in layer i to destination IPs which are in all layers (the number is 16k 1), denoted by T i,isgivenby T i = (6) k H ij = H i1 + H i2 +... + H ik (7) j=1 1390

Thus, we can get the results as follow:...... T 1 =H 11 + H 12 + H 13 +... + H 1k =24+(24+16 1) + (24 + 16 2) +... +(24+16 (k 1)) =24k +(1+2+... +(k 1)) 16 T 2 =H 21 + H 22 + H 23 +... + H 2k =(24 + 16 1) + 24 + (24 + 16 1) +... +(24+16 (k 2)) =24k +(1+1+2+... +(k 2)) 16 T k 1 =H (k 1)1 + H (k 1)2 + H (k 1)3 +... + H (k 1)k =(24 + 16 (k 2)) + (24 + 16 (k 3))+ (24 + 16 (k 4))... +(24+16 1) (8) (9) =24k +((k 2) + (k 3) + (k 4) +...1+1) 16 (10) T k =H k1 + H k2 + H k3 +... + H kk =(24 + 16 (k 1)) + (24 + 16 (k 2))+ (24 + 16 (k 3))... +24 =24k +((k 1) + (k 2) + (k 3) +...1) 16 (11) Next, we can calculate the average number of hops, denoted by H our, in our architecture as follow: k H our = T i /(k (16k 1)) i=1 =(24k 2 +16k(k 1)( k +1 ))/(k (16k 1)) 3 =(16k 2 +72k 16)/(3 (16k 1)) (12) Until now, we have proved equation (1). REFERENCES [1] Hu. Jingcao and R. Marculescu, Energy-aware mapping for tile-based NoC architectures under performance constraints, Proceedings of the ASP-DAC 2003, pp.233-239, 2003. [2] L. Benini and G. Micheli, Networks on chips: a new SoC paradigm, IEEE Comput, vol.35, 2002. [3] R. Marculescu, U.Y. Ogras, Peh. Li-Shiuan, N.E. Jerger, and Y. Hoskote, Outstanding research problems in NoC design: system, microarchitecture, and circuit perspectives, IEEE Transactions on Computer- Aided Design of Integrated Circuits and Systems, vol.28, pp. 3-21, 2009. [4] C. Marcon and A. Borin, Time and energy efficient mapping of embedded applications onto NoCs, ASP-DAC 2005, pp.33-38, 2005. [5] S. Murali, M. Coenen, A. Radulescu, K. Goossens and G. De Micheli, A methodology for mapping multiple use-cases onto networks on chips, Proc. Des. Autom. Test Eur. Conf 2006, pp.118-123, 2006. [6] H. G. Lee, N. Chang, U. Y. Ogras and R. Marculescu, On-chip communication architecture exploration: a quantitative evaluation of point-to-point, bus, and network-on-chip approaches, ACM Trans. Des. Autom.Electron. Syst, vol.12, pp.20-40, 2007. [7] M. Horowitz, R. Ho, and K. Mai, The future of wires, Proc. IEEE,vol.89, pp.490-504, 2001. [8] J. Kim, W.J. Dally and D. Abts, Flattened butterfly: a cost-efficient topology for high-radix networks, Proceedings of the 34th annual international symposium on Computer architecture, pp.126-137,2007. [9] V.F. Pavlidis and E.G. Friedman, 3-D topologies for networks-on-chip, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol.15, no.10, pp.1081-1090, 2007. [10] Z. Yu, and B.M. Baas, A Low-Area Multi-Link Interconnect Architecture for GALS Chip Multiprocessors, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol.18, no.5, pp.750-762, 2010. [11] WJ. Dally and B. Towles, Route packets, not wires: On-chip interconnection networks, Design Automation Conference, 2001. Proceedings, pp.684 689, 2001. [12] P.P. Pande, C. Grecu, M. Jones, A. Ivanov and R. Saleh, Performance evaluation and design trade-offs for network-on-chip interconnect architectures, IEEE Transactions on Computers, vol.54, no.8, pp.1025 1040, 2005. Guang Sun received the B.S. from Xiamen University, Xiamen, China in 2008 in the department of electronics engineering. Now he is pursuing the Ph.D. degree in Tsinghua University. His research interests include on-chip network, message-passing multiprocessor and NoC architecture. Yong Li received the B.S. from Huazhong University of Science and Technology, Wuhan, China in 2007 in the department of electronics and information engineering. Now he is pursuing the Ph.D. degree in Tsinghua University. His research interests include wireless networking, mobility management and future internet architecture Yuanyuan Zhang received the B.S. from Shandong University, Jinan, China in 2007 in the department of electronics engineering. Now she is pursuing the Ph.D. degree in Tsinghua University. Her research interests include Network-on-Chip network, Multiprocessor System-on-Chip and Chip Multiprocessor. Shijun Lin received the B.S. degree from Xiamen University, Xiamen, China in 2005 and Ph.D. degree from Tsinghua University, Beijing, China in 2010. Now he is an assistant professor at Xiamen University. His research interests include On-Chip interconnect, high-speed networks and ASIC design. 1391

Li Su received the B.S. degree from Nankai University, Tianjin, China, in 1999 and Ph.D. degree from Tsinghua University, Beijing, China in 2007 respectively both in electronics engineering. Now he is a research associate with Department of Electronic Engineering, Tsinghua University. His research interests include telecommunications, future internet architecture and on-chip network. Depeng Jin received the B.S. and Ph.D. degrees from Tsinghua University, Beijing, China, in 1995 and 1999 respectively both in electronics engineering. Now he is an associate professor at Tsinghua University and vice chair of Department of Electronic Engineering. Dr. Jin was awarded National Scientific and Technological Innovation Prize (Second Class) in 2002. His research fields include telecommunications, high-speed networks, ASIC design and future internet architecture. Lieguang Zeng received the B.S. and Ph.D. degrees from Tsinghua University, Beijing, China, in 1995 and 1999 respectively both in electronics engineering. Now he is an associate professor at Tsinghua University and vice chair of Department of Electronic Engineering. Dr. Jin was awarded National Scientific and Technological Innovation Prize (Second Class) in 2002. His research fields include telecommunications, high-speed networks, ASIC design and future internet architecture. 1392