An Overload-Free Data-Driven Ultra-Low-Power Networking Platform Architecture


Shuji SANNOMIYA 1, Yukikuni NISHIDA 2, Makoto IWATA 3, and Hiroaki NISHIKAWA 1
1 Faculty of Engineering, Information and Systems, University of Tsukuba, Tsukuba Science City, Ibaraki, Japan
2 Graduate School of Systems and Information Engineering, University of Tsukuba, Tsukuba Science City, Ibaraki, Japan
3 School of Information, Kochi University of Technology, Kami, Kochi, Japan

Abstract

To enhance the sustainability of communication, especially in times of disaster, both low power consumption and tolerance for the traffic increase caused by emergency communication must be realized urgently. Our previous study presented ULP-DDNS (Ultra-Low-Power Data-Driven Networking System), which extends the lifetime of battery-operated devices so that they can form an ad-hoc network providing a communication environment in areas where fixed and wired networks are disabled by a disaster. This paper presents a networking platform architecture with a runtime overload-avoidance mechanism that dynamically keeps the processing load within the design target, giving the ULP-DDNS tolerance for increased traffic. The mechanism exploits the positive correlation between processing load and consumption current that is characteristic of data-driven processors realized by self-timed pipelines: when the current increases, it raises the throughput by runtime voltage scaling in order to reduce the processing load.

Keywords: data-driven processor, protocol handling, real-time multiprocessing, self-timed pipeline

1. Introduction

Enhancing the sustainability of communication is an urgent issue in emergency situations, especially in times of disaster.
We have already proposed ULP-DDNS (Ultra-Low-Power Data-Driven Networking System) [1] to achieve the ultra-low power consumption that is indispensable for extending the lifetime of the battery-operated mobile devices forming an ad-hoc network, which can provide a communication environment in areas where fixed and wired networks are disabled by a disaster. To ensure connectivity over the ULP-DDNS, it must tolerate the traffic increase caused by emergency communication for safety confirmation, information gathering, and so forth. Concretely, protocol processing should be guaranteed on every platform (network node) even when traffic increases.

However, a platform may become inoperative when incoming traffic increases, because the increased traffic may raise the number of packets processed concurrently in the platform beyond the design target. In other words, the pipeline occupancy, defined as the ratio of the number of valid data to the number of pipeline stages, may exceed the design target. To free the platforms from such overload, both observability and controllability of the pipeline occupancy are indispensable.

Unfortunately, the pipeline occupancy of current mainstream processors cannot be observed accurately, because the number of valid data may change at runtime depending on unpredictable branches and interrupts. In contrast, data-driven processors realized by self-timed pipelines provide direct observability of their pipeline occupancy: the localized data transfer of the self-timed pipeline drives only the pipeline stages holding valid data, so the consumption current of the pipeline is proportional to the runtime pipeline occupancy. That is, the pipeline occupancy can be observed externally through the amount of consumption current. Moreover, the throughput of the self-timed pipeline can be controlled in real time by changing the supply voltage with a DVS (Dynamic Voltage Scaling) technique [2].
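The definition of pipeline occupancy and its observation through the consumption current can be illustrated with a minimal sketch. The linear calibration below (`CurrentModel`, with an idle offset and a slope) and all numeric values are assumptions made for illustration; the paper states only that the current is proportional to the occupancy and gives no calibration constants.

```python
from dataclasses import dataclass


def pipeline_occupancy(valid_data: int, stages: int) -> float:
    """Occupancy = number of valid data / number of pipeline stages."""
    if stages <= 0:
        raise ValueError("stages must be positive")
    if not 0 <= valid_data <= stages:
        raise ValueError("valid_data must lie between 0 and stages")
    return valid_data / stages


@dataclass(frozen=True)
class CurrentModel:
    """Hypothetical calibration: current = idle_ma + slope_ma * occupancy."""

    idle_ma: float   # assumed current drawn by an empty pipeline (mA)
    slope_ma: float  # assumed additional current at 100% occupancy (mA)

    def occupancy_from_current(self, current_ma: float) -> float:
        """Invert the linear model and clamp the estimate to [0, 1]."""
        raw = (current_ma - self.idle_ma) / self.slope_ma
        return min(1.0, max(0.0, raw))


def exceeds_target(occupancy: float, design_target: float) -> bool:
    """True when the occupancy is above the design target."""
    return occupancy > design_target


if __name__ == "__main__":
    model = CurrentModel(idle_ma=2.0, slope_ma=10.0)  # made-up numbers
    estimate = model.occupancy_from_current(7.0)
    print(f"estimated occupancy: {estimate:.2f}")
    print("overloaded:", exceeds_target(estimate, design_target=0.40))
```

With the figures reported in Section 3.3 (at most 5 concurrent operations in a 13-stage ULP-CUE and a 40% design target), `pipeline_occupancy(5, 13)` stays below the target, whereas 6 concurrent operations would exceed it.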
Consequently, the pipeline occupancy can be kept within the design target by using DVS to increase the pipeline throughput when the consumption current rises due to increased traffic.

In this paper, an overload-free data-driven networking platform architecture is proposed based on this direct observability and controllability of the pipeline occupancy of the self-timed pipeline. Changing the throughput by DVS takes time because of both signal propagation in the control circuit and parasitic capacitance in the circuit. The fluctuation of the pipeline occupancy should therefore be temporally smoothed and reduced, so that the occupancy stays within the design target until the throughput reaches its target value. The key idea of the proposed architecture is to temporally smooth and lower the pipeline occupancy at runtime by changing the parallelism of the target protocol handling, based on the real-time multiprocessing capability of the data-driven processor realized by the self-timed pipeline. The feasibility of the proposed architecture is discussed based on measurements of the latest version of the data-driven processors realized by the self-timed pipeline [3].

Fig. 1: Self-timed (clockless) pipeline.

2. ULP-DDNS platform

To realize the overload-free networking platform, both observability and controllability of the pipeline occupancy are indispensable. Fortunately, both can be provided in the data-driven networking platform of the ULP-DDNS. This section explains how these indispensable features are provided and discusses a basic technique for exploiting them to achieve an overload-free networking platform.

2.1 Ultra-low-power data-driven networking processor

The platform of the proposed ULP-DDNS is realized by both an ad-hoc networking scheme that reduces redundant traffic and a data-driven processor that handles communication protocols with low power. The proposed ad-hoc networking scheme forms an ad-hoc network over mobile devices in areas where the existing fixed and wired network infrastructure has become inoperative due to a fault or disaster, and it reduces the redundant traffic caused by conventional simple flooding (broadcasting) when delivering urgent information all over the ad-hoc network [4]. Our evaluation showed that the proposed ad-hoc networking scheme reduces the traffic to 1/10 [1]. This traffic reduction directly decreases the number of packets sent and received by every node (platform) in the ad-hoc network, and thus it contributes to lowering the power consumption of every platform.

In addition to the ad-hoc networking scheme, a data-driven networking processor is proposed to lower the power consumption required to handle the protocols for sending and receiving each packet.
The proposed data-driven networking processor, named ULP-DDCMP (Ultra-Low-Power Data-Driven Chip MultiProcessor), is realized with an optimized circular pipeline that makes it possible to bypass the firing-control pipeline stages, which detect the arrival of a pair of operands, when unary operations are executed [3]. Each processor core of the ULP-DDCMP is named ULP-CUE (Ultra-Low-Power CUE), as a successor of the CUE series of data-driven processors [3]. The ultra-low power consumption resulting from the synergy between the traffic reduction by the ad-hoc networking scheme and the low-power protocol handling by the ULP-DDCMP has been demonstrated with simulators and a prototype VLSI chip of the ULP-DDCMP, showing that the ULP-DDNS can reduce power consumption to one several-hundredth of that of an existing network system [1].

2.2 Real-time observability and controllability

One of the main contributors to the ultra-low power consumption is the localized data transfer of the self-timed pipeline (STP) used to realize the ULP-DDCMP. The localized data transfer also provides both a strong positive correlation between the pipeline occupancy and the consumption current and real-time adaptability to dynamic voltage scaling.

In the STP, only the pipeline stages holding valid data are driven, as a consequence of the localized data transfer called handshake. Figure 1 shows the basic structure of the STP, in which each stage consists of a data latch (DL), functional logic (FL), and a transfer control unit (C). The STP is a kind of asynchronous bundled-data pipeline, and it employs a four-phase handshake [5]. Based on the four-phase handshake, valid data in the STP are transferred between adjacent stages as follows.

Reset: After the assertion of the reset signal, the C negates both its send signal, representing a transfer request, and its ack signal, representing an acknowledgement.
Step 1: The C asserts its ack signal after its send signal is asserted.
Step 2: After the assertion of the ack signal, the preceding C negates its send signal.
Step 3: After the negation of the send signal, the C asserts both its gate open signal (cp) and its send signal, and concurrently negates its ack signal, only if the ack signal is negated. As a result, the data is latched in the stage to which the C belongs.
Step 4: The succeeding C repeats the above steps in the same way as the C.

This handshake not only concentrates the dynamic consumption current in the pipeline stages holding valid data but also eliminates global clocks. In general, a clock-synchronized circuit requires a PLL (Phase-Locked Loop) circuit to change the clock frequency according to the supply voltage, and the PLL takes several tens of microseconds to change the clock frequency. That is, the supply voltage must be held constant during those several tens of microseconds.
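The effect of the handshake described above, namely that only stages holding valid data are active, can be sketched at the token level. This is a simplified abstraction and not the paper's circuit: it does not model the individual send, ack, and cp signal edges, it assumes the consumer after the last stage is always ready, and the function name `handshake_round` is hypothetical.

```python
from typing import List, Optional, Tuple

Token = Optional[str]  # None marks an empty stage


def handshake_round(
    stages: List[Token], incoming: Token = None
) -> Tuple[List[Token], Token, int, bool]:
    """Advance every token that can move during one round of handshakes.

    A token moves only if the next stage was empty at the start of the round,
    which stands in for "the succeeding stage's ack is negated".
    Returns (new_stages, token_leaving_the_pipeline, transfers, accepted).
    """
    n = len(stages)
    new = list(stages)
    out: Token = None
    transfers = 0
    for i in range(n - 1, -1, -1):
        if stages[i] is None:
            continue  # an empty stage performs no handshake and stays idle
        if i == n - 1:
            out, new[i] = stages[i], None  # assumed always-ready consumer
            transfers += 1
        elif stages[i + 1] is None:
            new[i + 1], new[i] = stages[i], None
            transfers += 1
    accepted = incoming is not None and stages[0] is None
    if accepted:
        new[0] = incoming
        transfers += 1
    return new, out, transfers, accepted


if __name__ == "__main__":
    pipe: List[Token] = [None] * 4
    for packet in ["p0", None, "p1", None, None, None]:
        pipe, left, transfers, _ = handshake_round(pipe, packet)
        print(pipe, "left:", left, "transfers this round:", transfers)
```

In this model the number of transfers per round, a rough stand-in for dynamic activity, depends on how many tokens are in flight and is zero for an empty pipeline, which mirrors the correlation between occupancy and consumption current.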

In contrast, no PLL is required in the STP, and the delay times of the DL, FL, and C change at an equal rate with the supply voltage. Therefore, the supply voltage of the STP can be scaled at runtime as long as the rate of change of the voltage is moderate enough to guarantee transistor switching; that is, the throughput of the ULP-DDCMP can be changed while target protocols are being handled.

Fig. 2: Direct observability on pipeline occupancy. (Panel (a) plots normalized throughput against pipeline occupancy; both panels mark the design target on the pipeline occupancy axis.)

In the ULP-DDCMP, both the occupancy and the throughput increase as the number of packets processed concurrently increases. Figure 2 shows the characteristics measured with the existing ULP-DDCMP chip. As shown in Figure 2(a), the throughput stays at its maximum value regardless of the pipeline occupancy once the occupancy exceeds the design target. Therefore, the ULP-DDCMP may become inoperative due to overflow of the STP if the input traffic continues to exceed the design target. That is, the pipeline occupancy should be kept within the design target to realize the overload-free networking platform. As shown in Figure 2(b), the pipeline occupancy correlates with the consumption current of the STP; that is, the statically unpredictable pipeline occupancy can be observed at runtime from the consumption current. Consequently, overload can be avoided by increasing the pipeline throughput, so as to keep the pipeline occupancy within the design target, whenever the pipeline occupancy increases.

3. Runtime overload-avoidance mechanism

Based on the direct observability and controllability, the throughput of the protocol handling in the ULP-DDCMP can be changed when input traffic increases. This section discusses the platform architecture that realizes this runtime load control for overload avoidance.
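The control idea stated above (raise the supply voltage, and with it the throughput, when the occupancy estimated from the current exceeds the design target) can be sketched as a single update step. The proportional rule with a slew-rate limit, the function name `next_supply_voltage`, and every numeric value are assumptions for illustration; the paper does not specify the control law. The sketch also ignores that a real current-to-occupancy calibration would likely depend on the supply voltage itself.

```python
def next_supply_voltage(
    v_now: float,
    occupancy_estimate: float,
    *,
    design_target: float,
    gain: float,
    max_step: float,
    v_min: float,
    v_max: float,
) -> float:
    """Return the next supply-voltage set point (hypothetical control law).

    The voltage is raised when the estimated occupancy is above the design
    target and lowered when it is below. `max_step` bounds the change per
    update so that the voltage varies moderately.
    """
    error = occupancy_estimate - design_target
    step = max(-max_step, min(max_step, gain * error))
    return min(v_max, max(v_min, v_now + step))


if __name__ == "__main__":
    voltage = 0.8  # made-up starting point (V)
    for occupancy in [0.30, 0.45, 0.60, 0.60, 0.42, 0.35]:
        voltage = next_supply_voltage(
            voltage, occupancy,
            design_target=0.40, gain=1.0, max_step=0.05, v_min=0.6, v_max=1.2,
        )
        print(f"occupancy={occupancy:.2f} -> supply set point {voltage:.3f} V")
```

Bounding the step reflects the requirement that the rate of change of the voltage stay moderate; the delay before the new throughput takes effect is what motivates the occupancy smoothing discussed in Section 3.1.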
3.1 Networking platform architecture

As already described, the observation of the pipeline occupancy through the consumption current and the control of the effective throughput by DVS can both be realized at runtime. Unfortunately, after the pipeline occupancy changes, some delay is introduced before the effective throughput reaches its target value, because of the signal propagation delay through the control circuits and their parasitic capacitance. Therefore, the fluctuation of the pipeline occupancy should be temporally moderate, to provide enough time for changing the effective throughput.

To make the fluctuation of the pipeline occupancy temporally smooth without any runtime overhead, the data-driven programs of the target protocols are modified to reduce the variation in the number of operations executed concurrently. As illustrated in Figure 3, programs for data-driven processors are defined as data-flow graphs (DFGs). A DFG consists of nodes and arcs: each node describes an operation, and each arc represents the data dependency between two successive operations. The data dependencies between operations naturally represent the ILP (Instruction Level Parallelism) inherent in the programs, so describing a target program as a DFG extracts its ILP. In the data-driven processors, each operand is executed independently of the other operands, and the execution time of each operand is also independent of that of the others, as a result of the real-time multiprocessing [6]. Based on this feature, the number of operations executed concurrently can be changed by postponing the execution timing of the operations on non-critical paths, as shown in Figure 3. This program modification can temporally smooth the number of operations executed concurrently without any overhead on the execution time of the operations on the critical path of the target programs.

Fig. 3: Temporally-smoothing the number of operations executed concurrently.

Figure 4 shows the basic architecture for realizing an overload-free networking platform based on the techniques discussed above. To enhance the throughput of the protocol handling when the input traffic increases, a runtime overload-avoidance mechanism is introduced that increases the supply voltage according to the increased consumption current. This mechanism can be implemented with the runtime voltage scaling technique [2] for the self-timed pipeline.

This kind of load control in the platform should not increase the traffic in the ad-hoc network, because increased traffic leads to network congestion. From this standpoint, the throughput of the protocol handling for receiving packets should be kept constant to guarantee packet reception, because retransmission caused by the denial of packet reception increases the traffic in the ad-hoc network. Therefore, the receiving protocol handling at the link layer is excluded from the throughput control, as shown in Figure 4. On the other hand, the throughput of the protocol handling for sending packets is enhanced by increasing the supply voltage, in order to reduce the pipeline occupancy under the increased traffic.

Based on this basic architecture, the pipeline occupancy arising from the protocol handling up to the network layer can be reduced under increased traffic. However, the pipeline occupancy depends not only on the protocol handling up to the network layer but also on the internal processing, including the upper-layer protocol handling and the application processing.
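The temporal smoothing described in this section, postponing operations on non-critical paths so that fewer operations run concurrently while the critical path is untouched, can be sketched on a small data-flow graph. The unit-latency assumption, the least-slack-first list scheduling heuristic, and the example graph `EXAMPLE_DFG` are all illustrative assumptions; the paper does not state how its programs were transformed.

```python
from typing import Dict, List

Preds = Dict[str, List[str]]  # operation -> operations it depends on


def asap_schedule(preds: Preds) -> Dict[str, int]:
    """Earliest time step of each unit-latency operation."""
    start: Dict[str, int] = {}

    def visit(node: str) -> int:
        if node not in start:
            start[node] = 1 + max((visit(p) for p in preds[node]), default=-1)
        return start[node]

    for node in preds:
        visit(node)
    return start


def alap_schedule(preds: Preds, asap: Dict[str, int]) -> Dict[str, int]:
    """Latest time step of each operation that keeps the critical path length."""
    last = max(asap.values())
    succs: Dict[str, List[str]] = {n: [] for n in preds}
    for node, ps in preds.items():
        for p in ps:
            succs[p].append(node)
    alap: Dict[str, int] = {}
    # Successors have larger ASAP steps, so visit nodes in decreasing ASAP order.
    for node in sorted(preds, key=lambda n: asap[n], reverse=True):
        alap[node] = min((alap[s] for s in succs[node]), default=last + 1) - 1
    return alap


def capped_schedule(preds: Preds, cap: int) -> Dict[str, int]:
    """List scheduling with at most `cap` operations per step, least slack first."""
    if cap < 1:
        raise ValueError("cap must be at least 1")
    alap = alap_schedule(preds, asap_schedule(preds))
    start: Dict[str, int] = {}
    step = 0
    while len(start) < len(preds):
        ready = [n for n in preds if n not in start
                 and all(start.get(p, step) < step for p in preds[n])]
        for node in sorted(ready, key=lambda n: (alap[n], n))[:cap]:
            start[node] = step
        step += 1
    return start


def smoothed_schedule(preds: Preds) -> Dict[str, int]:
    """Schedule with the smallest cap that does not lengthen the critical path."""
    depth = max(asap_schedule(preds).values())
    for cap in range(1, len(preds) + 1):
        candidate = capped_schedule(preds, cap)
        if max(candidate.values()) == depth:
            return candidate
    return asap_schedule(preds)  # fallback; not reached for a valid DAG


def concurrency_profile(start: Dict[str, int]) -> List[int]:
    """Number of operations executed at each time step."""
    counts = [0] * (max(start.values()) + 1)
    for step in start.values():
        counts[step] += 1
    return counts


# Hypothetical DFG: critical path a -> b -> c -> d, plus x, y, z feeding d.
EXAMPLE_DFG: Preds = {
    "a": [], "b": ["a"], "c": ["b"],
    "x": [], "y": [], "z": [],
    "d": ["c", "x", "y", "z"],
}

if __name__ == "__main__":
    print("ASAP profile:    ", concurrency_profile(asap_schedule(EXAMPLE_DFG)))
    print("smoothed profile:", concurrency_profile(smoothed_schedule(EXAMPLE_DFG)))
```

In the example, the operations on the critical path keep their original time steps, while the independent operations are spread over later steps, lowering the peak number of concurrent operations without delaying the final operation.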
3.2 Runtime parallelism transformation

As for the internal processing, enhancing the throughput does not necessarily reduce the pipeline occupancy, because some of the internal processing may be resident. For example, a GUI (Graphical User Interface) manager continues to run while the display device is lit. To reduce the pipeline occupancy arising from such internal processing, the number of data (tokens, in the data-driven processors) flowing through the STP should be reduced. However, tokens derived from different programs are processed concurrently at different stages of the STP without any distinction between types of processing, so it is difficult to selectively remove the flowing tokens of a particular processing type.

Fortunately, the processing time constraints of the upper-layer protocol handling and the application processing are often loose in comparison with those of the link-level protocol handling. For instance, the response time of the MAC (Media Access Control) protocol handling is strictly determined on the microsecond time scale by the specification of the physical layer hardware, whereas a delay of several seconds in a mailer application can be accepted or ignored. By utilizing such slack time in some of the internal processing, the pipeline occupancy can be temporally smoothed and reduced in the data-driven processors.

By utilizing the real-time multiprocessing feature, the number of operations executed simultaneously can be reduced, as already shown in Figure 3. For internal processing with slack time, the number of concurrently executing operations can be reduced further at the expense of an increase in the processing time. In the extreme case, it can be reduced to 1, as shown in Figure 5, as long as the increased time is acceptable. Consequently, the pipeline occupancy arising from internal processing with slack time can be reduced by transforming the parallelism of the programs.
To realize such a transformation of the parallelism, any overhead on the processing time of the running programs should be avoided, in order to satisfy the required processing time constraints. In this paper, a runtime parallelism transformation with no overhead on the processing time is introduced by exploiting the real-time multiprocessing capability of the data-driven processor realized by the STP. The runtime parallelism transformation is realized by switching the program at runtime; that is, an internal processing program with high throughput (parallelism) is switched to an alternative version with low parallelism when the pipeline occupancy increases. It is difficult to switch a running program to the alternative version, because the tokens of the running program are spread over the STP. Therefore, the switching should take place at the beginning of the execution of the program or of an iteration. This switching should be coordinated with the change in the pipeline occupancy, and thus a switch operation is introduced to realize a branch on the pipeline occupancy. As shown in Figure 5, the switch operation changes the data flow at runtime according to the direction input externally from the runtime overload-avoidance mechanism. In the runtime overload-avoidance mechanism, the direction of the switch operation is determined according to the input consumption current, which represents the pipeline occupancy. As a result of the control by the runtime overload-avoidance mechanism, when the input traffic increases, the pipeline occupancy can be reduced both by enhancing the throughput of the protocol handling for sending packets and by decreasing the number of operations executed concurrently.

Fig. 4: Networking platform with runtime overload-avoidance mechanism.

Fig. 5: Runtime parallelism transformation by switching DFG.

3.3 Preliminary evaluation

The proposed architecture depends entirely not only on the previously proposed runtime DVS technique [2] but also on both the parallelism transformation of the target protocol handling program and the real-time processing capability of the data-driven processors realized by the STP. As a preliminary evaluation of the feasibility of the proposed architecture, both the parallelism transformation and the real-time multiprocessing are verified by using the ULP-DDCMP chip, which is the latest data-driven processor realized by the STP.
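The switch operation introduced in Section 3.2 can be sketched as a small routing object whose direction is set from outside and consulted only when a new iteration starts. The class and method names below (`SwitchOperation`, `set_direction`, `route`) and the simple threshold rule are hypothetical; the paper says only that the direction is determined from the consumption current representing the pipeline occupancy.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class Variant(Enum):
    HIGH_PARALLELISM = "high"  # original program version
    LOW_PARALLELISM = "low"    # alternative version with fewer concurrent operations


@dataclass
class SwitchOperation:
    """Routes the first token of each iteration to one of two DFG variants."""

    direction: Variant = Variant.HIGH_PARALLELISM
    log: List[Tuple[int, Variant]] = field(default_factory=list)

    def set_direction(self, occupancy_estimate: float, design_target: float) -> None:
        """Called by the (hypothetical) overload-avoidance controller."""
        self.direction = (
            Variant.LOW_PARALLELISM
            if occupancy_estimate > design_target
            else Variant.HIGH_PARALLELISM
        )

    def route(self, iteration_id: int) -> Variant:
        """Pick the variant for a new iteration; running iterations are untouched."""
        self.log.append((iteration_id, self.direction))
        return self.direction


if __name__ == "__main__":
    switch = SwitchOperation()
    for iteration, occupancy in enumerate([0.30, 0.55, 0.45, 0.20]):
        switch.set_direction(occupancy, design_target=0.40)
        print(iteration, switch.route(iteration).name)
```

Because the direction is read only in `route`, an iteration that has already started keeps the variant it was given, which matches the requirement that switching happen at the beginning of an execution or iteration rather than while tokens are spread over the pipeline.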
As a concrete protocol, UDP/IP is chosen because its connectionless packet transfer results in the low power consumption indispensable in ad-hoc networking; that is, it is one of the protocols expected to be used in ad-hoc networking.

As shown in Figure 6(a), the ULP-DDCMP chip houses four ULP-CUEs interconnected by a multi-stage token router realized by the STP. In the design of this chip, the circular STP realizing each ULP-CUE is finely divided in order to eliminate the pipeline bottleneck. As a result of this pipeline division, the number of stages of each ULP-CUE is 13. The chip is fabricated in a 65 nm CMOS 7-metal-layer process technology. The ULP-DDCMP is mounted on an evaluation board with two FPGAs: one FPGA realizes the runtime DVS with PID (Proportional Integral Derivative) control to stabilize the supply voltage at a target value, and the other realizes logging of the performance and power consumption. The evaluation board is shown in Figure 6(b).

Fig. 6: ULP-DDCMP chip and its evaluation board.

The ULP-DDCMP provides an instruction set sufficient to describe the UDP/IP handling program, and the data-driven program for UDP/IP handling was in fact written with this instruction set. The program performs the checksum calculation and the generation of the UDP/IP header: packets containing a pseudo header and payload are input to the program, and the program outputs IP datagrams. The number of operations executed simultaneously in the originally described program varies from 1 to 5, and thus the maximum pipeline occupancy is approximately 38% (= 5/13). This means that one UDP/IP handling can be executed in one ULP-CUE within the design target, because the design target of each ULP-CUE is 40%, as shown in Figure 2.

To verify the temporal smoothing of the number of concurrently executed operations, an alternative version of the UDP/IP handling is derived from the original version by using the introduced scheme, as shown in Figure 5. In the derived alternative version, the number of operations executed concurrently is reduced to almost 1. That is, it is verified that the parallelism can be changed by modifying the program.

By using the alternative version, the real-time processing capability is verified. The processing time required to process one packet is measured with the logging function on the evaluation board while the number of input packets is increased, i.e.
the multiplicity is increased. Figure 7 shows the measured result. In sequential processing, the processing time per packet is proportional to the multiplicity. In contrast, the real-time processing capability of the ULP-DDCMP keeps the processing time per packet approximately constant regardless of the multiplicity, as shown in the result. In addition, the processing time per packet was measured for different input timings of the packets, and the same results were obtained. That is, the processing time of a program is independent of that of the other programs.

Fig. 7: Processing time for one packet.

Admittedly, the processing time per packet increases by approximately 10% when the multiplicity is 4, in comparison with the other results. The cause of this increase is the elastic capability of the STP: the STP can maintain its maximum throughput even when the pipeline occupancy exceeds the design target, as shown in Figure 2. The number of operations executed concurrently in the alternative version is not exactly 1 and temporarily becomes 2; therefore, the pipeline occupancy temporarily exceeds the design target when the multiplicity is 4. In other words, the STP naturally provides tolerance for temporary overload. If the increased processing time is not acceptable, the processing time can be kept constant by limiting the multiplicity to at most 3 or by pipelining the STP more deeply.

4. Conclusion

In this paper, a data-driven networking platform architecture with a runtime overload-avoidance mechanism is presented in order to realize an overload-free networking platform, which is indispensable for a sustainable networking environment. Based on the direct observability and controllability of the pipeline occupancy, which is the processing load of the platform, the overload-avoidance mechanism makes it possible to dynamically keep the pipeline occupancy within the design target. Concretely, the pipeline occupancy is observed through the consumption current, and it is reduced by increasing the pipeline throughput with the runtime DVS when the input traffic increases. Moreover, a runtime parallelism transformation is proposed to make the control delay inherent in the DVS circuit negligible. As a preliminary evaluation, the feasibility of the newly proposed runtime parallelism transformation is verified through measurements of the latest version of the data-driven processors. We are now developing a simulator [7] for the comprehensive evaluation of the ad-hoc network environment realized by the proposed architecture, and the evaluation results will be reported soon.

Acknowledgement

Although it is impossible to give credit individually to all those who organized and supported our project, the authors would like to express their sincere appreciation to all the colleagues in the project. This research work was supported in part by Core Research for Evolutional Science and Technology (CREST), Japan Science and Technology Agency (JST).

References

[1] Kazuhiro Aoki, Hiroshi Ishii, Makoto Iwata, and Hiroaki Nishikawa, A Comprehensive Evaluation of ULP-DDNS by Platform Simulator, in Proc. of PDPTA, pp , July
[2] Kei Miyagi, Shuji Sannomiya, Makoto Iwata, and Hiroaki Nishikawa, Low-Powered Self-Timed Pipeline with Runtime Fine-Grain Power Supply, in Proc. of PDPTA, pp , July
[3] Shuji Sannomiya, Kazuhiro Aoki, Makoto Iwata, and Hiroaki Nishikawa, Power-Performance Verification of Ultra-Low-Power Data-Driven Networking Processor: ULP-CUE, in Proc. of PDPTA, pp , July
[4] Hiroshi Ishii, Keisuke Utsu, and Hiroaki Nishikawa, Integrated Evaluation on Effectiveness of ULP-DDNS Networking Layer, in Proc. of PDPTA, pp , July
[5] C. J. Myers, Asynchronous circuit design, Univ. of Utah John Wiley & Sons, Inc.,
[6] Hiroaki Nishikawa, Design Philosophy of a Networking-Oriented Data-Driven Processor: CUE, IEICE Transactions on Electronics, Vol.E89-C No.3, pp , Mar
[7] Kazuhiro Aoki, Shuji Sannomiya, Makoto Iwata, Hiroshi Ishii, and Hiroaki Nishikawa, An Implementation of Platform Simulator for Congestion-Free Ultra-Low-Power Data-Driven Networking System, in Proc. of PDPTA, PDP2081, July 2013.


More information

Impact of Divided Static Random Access Memory Considering Data Aggregation for Wireless Sensor Networks

Impact of Divided Static Random Access Memory Considering Data Aggregation for Wireless Sensor Networks APSITT8/Copyright 8 IEICE 7SB8 Impact of Divided Static Random Access Considering Aggregation for Wireless Sensor Networks Takashi Matsuda, Shintaro Izumi, Takashi Takeuchi, Hidehiro Fujiwara Hiroshi Kawaguchi,

More information

Fault Tolerant and Secure Architectures for On Chip Networks With Emerging Interconnect Technologies. Mohsin Y Ahmed Conlan Wesson

Fault Tolerant and Secure Architectures for On Chip Networks With Emerging Interconnect Technologies. Mohsin Y Ahmed Conlan Wesson Fault Tolerant and Secure Architectures for On Chip Networks With Emerging Interconnect Technologies Mohsin Y Ahmed Conlan Wesson Overview NoC: Future generation of many core processor on a single chip

More information

9th Slide Set Computer Networks

9th Slide Set Computer Networks Prof. Dr. Christian Baun 9th Slide Set Computer Networks Frankfurt University of Applied Sciences WS1718 1/49 9th Slide Set Computer Networks Prof. Dr. Christian Baun Frankfurt University of Applied Sciences

More information

Reconfigurable PLL for Digital System

Reconfigurable PLL for Digital System International Journal of Engineering Research and Technology. ISSN 0974-3154 Volume 6, Number 3 (2013), pp. 285-291 International Research Publication House http://www.irphouse.com Reconfigurable PLL for

More information

VLSI Testing. Virendra Singh. Bangalore E0 286: Test & Verification of SoC Design Lecture - 7. Jan 27,

VLSI Testing. Virendra Singh. Bangalore E0 286: Test & Verification of SoC Design Lecture - 7. Jan 27, VLSI Testing Fault Simulation Virendra Singh Indian Institute t of Science Bangalore virendra@computer.org E 286: Test & Verification of SoC Design Lecture - 7 Jan 27, 2 E-286@SERC Fault Simulation Jan

More information

Distributed Scheduling for the Sombrero Single Address Space Distributed Operating System

Distributed Scheduling for the Sombrero Single Address Space Distributed Operating System Distributed Scheduling for the Sombrero Single Address Space Distributed Operating System Donald S. Miller Department of Computer Science and Engineering Arizona State University Tempe, AZ, USA Alan C.

More information

A Proposal for a High Speed Multicast Switch Fabric Design

A Proposal for a High Speed Multicast Switch Fabric Design A Proposal for a High Speed Multicast Switch Fabric Design Cheng Li, R.Venkatesan and H.M.Heys Faculty of Engineering and Applied Science Memorial University of Newfoundland St. John s, NF, Canada AB X

More information

4. Hardware Platform: Real-Time Requirements

4. Hardware Platform: Real-Time Requirements 4. Hardware Platform: Real-Time Requirements Contents: 4.1 Evolution of Microprocessor Architecture 4.2 Performance-Increasing Concepts 4.3 Influences on System Architecture 4.4 A Real-Time Hardware Architecture

More information

FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC)

FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC) FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC) D.Udhayasheela, pg student [Communication system],dept.ofece,,as-salam engineering and technology, N.MageshwariAssistant Professor

More information

What Is Congestion? Effects of Congestion. Interaction of Queues. Chapter 12 Congestion in Data Networks. Effect of Congestion Control

What Is Congestion? Effects of Congestion. Interaction of Queues. Chapter 12 Congestion in Data Networks. Effect of Congestion Control Chapter 12 Congestion in Data Networks Effect of Congestion Control Ideal Performance Practical Performance Congestion Control Mechanisms Backpressure Choke Packet Implicit Congestion Signaling Explicit

More information

Lecture 13: Interconnection Networks. Topics: lots of background, recent innovations for power and performance

Lecture 13: Interconnection Networks. Topics: lots of background, recent innovations for power and performance Lecture 13: Interconnection Networks Topics: lots of background, recent innovations for power and performance 1 Interconnection Networks Recall: fully connected network, arrays/rings, meshes/tori, trees,

More information

Problem Formulation. Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets.

Problem Formulation. Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets. Clock Routing Problem Formulation Specialized algorithms are required for clock (and power nets) due to strict specifications for routing such nets. Better to develop specialized routers for these nets.

More information

Basic Low Level Concepts

Basic Low Level Concepts Course Outline Basic Low Level Concepts Case Studies Operation through multiple switches: Topologies & Routing v Direct, indirect, regular, irregular Formal models and analysis for deadlock and livelock

More information

Curriculum 2013 Knowledge Units Pertaining to PDC

Curriculum 2013 Knowledge Units Pertaining to PDC Curriculum 2013 Knowledge Units Pertaining to C KA KU Tier Level NumC Learning Outcome Assembly level machine Describe how an instruction is executed in a classical von Neumann machine, with organization

More information

ECE519 Advanced Operating Systems

ECE519 Advanced Operating Systems IT 540 Operating Systems ECE519 Advanced Operating Systems Prof. Dr. Hasan Hüseyin BALIK (10 th Week) (Advanced) Operating Systems 10. Multiprocessor, Multicore and Real-Time Scheduling 10. Outline Multiprocessor

More information

Unicast Routing in Mobile Ad Hoc Networks. Dr. Ashikur Rahman CSE 6811: Wireless Ad hoc Networks

Unicast Routing in Mobile Ad Hoc Networks. Dr. Ashikur Rahman CSE 6811: Wireless Ad hoc Networks Unicast Routing in Mobile Ad Hoc Networks 1 Routing problem 2 Responsibility of a routing protocol Determining an optimal way to find optimal routes Determining a feasible path to a destination based on

More information

COMPUTER ORGANISATION CHAPTER 1 BASIC STRUCTURE OF COMPUTERS

COMPUTER ORGANISATION CHAPTER 1 BASIC STRUCTURE OF COMPUTERS Computer types: - COMPUTER ORGANISATION CHAPTER 1 BASIC STRUCTURE OF COMPUTERS A computer can be defined as a fast electronic calculating machine that accepts the (data) digitized input information process

More information

A Low-Power Field Programmable VLSI Based on Autonomous Fine-Grain Power Gating Technique

A Low-Power Field Programmable VLSI Based on Autonomous Fine-Grain Power Gating Technique A Low-Power Field Programmable VLSI Based on Autonomous Fine-Grain Power Gating Technique P. Durga Prasad, M. Tech Scholar, C. Ravi Shankar Reddy, Lecturer, V. Sumalatha, Associate Professor Department

More information

POWER consumption has become one of the most important

POWER consumption has become one of the most important 704 IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 39, NO. 4, APRIL 2004 Brief Papers High-Throughput Asynchronous Datapath With Software-Controlled Voltage Scaling Yee William Li, Student Member, IEEE, George

More information

Implementation of Asynchronous Topology using SAPTL

Implementation of Asynchronous Topology using SAPTL Implementation of Asynchronous Topology using SAPTL NARESH NAGULA *, S. V. DEVIKA **, SK. KHAMURUDDEEN *** *(senior software Engineer & Technical Lead, Xilinx India) ** (Associate Professor, Department

More information

A Low-Latency DMR Architecture with Efficient Recovering Scheme Exploiting Simultaneously Copiable SRAM

A Low-Latency DMR Architecture with Efficient Recovering Scheme Exploiting Simultaneously Copiable SRAM A Low-Latency DMR Architecture with Efficient Recovering Scheme Exploiting Simultaneously Copiable SRAM Go Matsukawa 1, Yohei Nakata 1, Yuta Kimi 1, Yasuo Sugure 2, Masafumi Shimozawa 3, Shigeru Oho 4,

More information

Wave-Pipelining the Global Interconnect to Reduce the Associated Delays

Wave-Pipelining the Global Interconnect to Reduce the Associated Delays Wave-Pipelining the Global Interconnect to Reduce the Associated Delays Jabulani Nyathi, Ray Robert Rydberg III and Jose G. Delgado-Frias Washington State University School of EECS Pullman, Washington,

More information

NETWORK TOPOLOGIES. Application Notes. Keywords Topology, P2P, Bus, Ring, Star, Mesh, Tree, PON, Ethernet. Author John Peter & Timo Perttunen

NETWORK TOPOLOGIES. Application Notes. Keywords Topology, P2P, Bus, Ring, Star, Mesh, Tree, PON, Ethernet. Author John Peter & Timo Perttunen Application Notes NETWORK TOPOLOGIES Author John Peter & Timo Perttunen Issued June 2014 Abstract Network topology is the way various components of a network (like nodes, links, peripherals, etc) are arranged.

More information

3. Quality of Service

3. Quality of Service 3. Quality of Service Usage Applications Learning & Teaching Design User Interfaces Services Content Process ing Security... Documents Synchronization Group Communi cations Systems Databases Programming

More information

Networks-on-Chip Router: Configuration and Implementation

Networks-on-Chip Router: Configuration and Implementation Networks-on-Chip : Configuration and Implementation Wen-Chung Tsai, Kuo-Chih Chu * 2 1 Department of Information and Communication Engineering, Chaoyang University of Technology, Taichung 413, Taiwan,

More information

Growth. Individual departments in a university buy LANs for their own machines and eventually want to interconnect with other campus LANs.

Growth. Individual departments in a university buy LANs for their own machines and eventually want to interconnect with other campus LANs. Internetworking Multiple networks are a fact of life: Growth. Individual departments in a university buy LANs for their own machines and eventually want to interconnect with other campus LANs. Fault isolation,

More information

Admission Control in Time-Slotted Multihop Mobile Networks

Admission Control in Time-Slotted Multihop Mobile Networks dmission ontrol in Time-Slotted Multihop Mobile Networks Shagun Dusad and nshul Khandelwal Information Networks Laboratory Department of Electrical Engineering Indian Institute of Technology - ombay Mumbai

More information

Advantages and disadvantages

Advantages and disadvantages Advantages and disadvantages Advantages Disadvantages Asynchronous transmission Simple, doesn't require synchronization of both communication sides Cheap, timing is not as critical as for synchronous transmission,

More information

IMPLICATIONS OF RELIABILITY ENHANCEMENT ACHIEVED BY FAULT AVOIDANCE ON DYNAMICALLY RECONFIGURABLE ARCHITECTURES

IMPLICATIONS OF RELIABILITY ENHANCEMENT ACHIEVED BY FAULT AVOIDANCE ON DYNAMICALLY RECONFIGURABLE ARCHITECTURES 20 21st International Conference on Field Programmable Logic and Applications IMPLICATIONS OF RELIABILITY ENHANCEMENT ACHIEVED BY FAULT AVOIDANCE ON DYNAMICALLY RECONFIGURABLE ARCHITECTURES Hiroaki KONOURA,

More information

OVERHEADS ENHANCEMENT IN MUTIPLE PROCESSING SYSTEMS BY ANURAG REDDY GANKAT KARTHIK REDDY AKKATI

OVERHEADS ENHANCEMENT IN MUTIPLE PROCESSING SYSTEMS BY ANURAG REDDY GANKAT KARTHIK REDDY AKKATI CMPE 655- MULTIPLE PROCESSOR SYSTEMS OVERHEADS ENHANCEMENT IN MUTIPLE PROCESSING SYSTEMS BY ANURAG REDDY GANKAT KARTHIK REDDY AKKATI What is MULTI PROCESSING?? Multiprocessing is the coordinated processing

More information

THE TRANSPORT LAYER UNIT IV

THE TRANSPORT LAYER UNIT IV THE TRANSPORT LAYER UNIT IV The Transport Layer: The Transport Service, Elements of Transport Protocols, Congestion Control,The internet transport protocols: UDP, TCP, Performance problems in computer

More information

Subject: Adhoc Networks

Subject: Adhoc Networks ISSUES IN AD HOC WIRELESS NETWORKS The major issues that affect the design, deployment, & performance of an ad hoc wireless network system are: Medium Access Scheme. Transport Layer Protocol. Routing.

More information

Local Area Network Overview

Local Area Network Overview Local Area Network Overview Chapter 15 CS420/520 Axel Krings Page 1 LAN Applications (1) Personal computer LANs Low cost Limited data rate Back end networks Interconnecting large systems (mainframes and

More information

CHAPTER 6 PILOT/SIGNATURE PATTERN BASED MODULATION TRACKING

CHAPTER 6 PILOT/SIGNATURE PATTERN BASED MODULATION TRACKING CHAPTER 6 PILOT/SIGNATURE PATTERN BASED MODULATION TRACKING 6.1 TRANSMITTER AND RECEIVER Each modulated signal is preceded by a unique N bit pilot sequence (Manton, JH 2001). A switch in the transmitter

More information

VLSI Testing. Fault Simulation. Virendra Singh. Indian Institute of Science Bangalore

VLSI Testing. Fault Simulation. Virendra Singh. Indian Institute of Science Bangalore VLSI Testing Fault Simulation Virendra Singh Indian Institute of Science Bangalore virendra@computer.org E0 286: Test & Verification of SoC Design Lecture - 4 Jan 25, 2008 E0-286@SERC 1 Fault Model - Summary

More information

Analysis of Different Multiplication Algorithms & FPGA Implementation

Analysis of Different Multiplication Algorithms & FPGA Implementation IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 4, Issue 2, Ver. I (Mar-Apr. 2014), PP 29-35 e-issn: 2319 4200, p-issn No. : 2319 4197 Analysis of Different Multiplication Algorithms & FPGA

More information

High Performance Computer Architecture Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

High Performance Computer Architecture Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur High Performance Computer Architecture Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture - 18 Dynamic Instruction Scheduling with Branch Prediction

More information

Future Gigascale MCSoCs Applications: Computation & Communication Orthogonalization

Future Gigascale MCSoCs Applications: Computation & Communication Orthogonalization Basic Network-on-Chip (BANC) interconnection for Future Gigascale MCSoCs Applications: Computation & Communication Orthogonalization Abderazek Ben Abdallah, Masahiro Sowa Graduate School of Information

More information

Computer and Hardware Architecture II. Benny Thörnberg Associate Professor in Electronics

Computer and Hardware Architecture II. Benny Thörnberg Associate Professor in Electronics Computer and Hardware Architecture II Benny Thörnberg Associate Professor in Electronics Parallelism Microscopic vs Macroscopic Microscopic parallelism hardware solutions inside system components providing

More information

Implementation of Software-based EPON-OLT and Performance Evaluation

Implementation of Software-based EPON-OLT and Performance Evaluation This article has been accepted and published on J-STAGE in advance of copyediting. Content is final as presented. IEICE Communications Express, Vol.1, 1 6 Implementation of Software-based EPON-OLT and

More information

ASSEMBLY LANGUAGE MACHINE ORGANIZATION

ASSEMBLY LANGUAGE MACHINE ORGANIZATION ASSEMBLY LANGUAGE MACHINE ORGANIZATION CHAPTER 3 1 Sub-topics The topic will cover: Microprocessor architecture CPU processing methods Pipelining Superscalar RISC Multiprocessing Instruction Cycle Instruction

More information

Unit 2: High-Level Synthesis

Unit 2: High-Level Synthesis Course contents Unit 2: High-Level Synthesis Hardware modeling Data flow Scheduling/allocation/assignment Reading Chapter 11 Unit 2 1 High-Level Synthesis (HLS) Hardware-description language (HDL) synthesis

More information

AN EFFICIENT MAC PROTOCOL FOR SUPPORTING QOS IN WIRELESS SENSOR NETWORKS

AN EFFICIENT MAC PROTOCOL FOR SUPPORTING QOS IN WIRELESS SENSOR NETWORKS AN EFFICIENT MAC PROTOCOL FOR SUPPORTING QOS IN WIRELESS SENSOR NETWORKS YINGHUI QIU School of Electrical and Electronic Engineering, North China Electric Power University, Beijing, 102206, China ABSTRACT

More information

Fairness Example: high priority for nearby stations Optimality Efficiency overhead

Fairness Example: high priority for nearby stations Optimality Efficiency overhead Routing Requirements: Correctness Simplicity Robustness Under localized failures and overloads Stability React too slow or too fast Fairness Example: high priority for nearby stations Optimality Efficiency

More information

Real-time and smooth scalable video streaming system with bitstream extractor intellectual property implementation

Real-time and smooth scalable video streaming system with bitstream extractor intellectual property implementation LETTER IEICE Electronics Express, Vol.11, No.5, 1 6 Real-time and smooth scalable video streaming system with bitstream extractor intellectual property implementation Liang-Hung Wang 1a), Yi-Mao Hsiao

More information

Connection-oriented (virtual circuit) Reliable Transfer Buffered Transfer Unstructured Stream Full Duplex Point-to-point Connection End-to-end service

Connection-oriented (virtual circuit) Reliable Transfer Buffered Transfer Unstructured Stream Full Duplex Point-to-point Connection End-to-end service 최양희서울대학교컴퓨터공학부 Connection-oriented (virtual circuit) Reliable Transfer Buffered Transfer Unstructured Stream Full Duplex Point-to-point Connection End-to-end service 1 2004 Yanghee Choi 2 Addressing: application

More information

Delay Time Analysis of Reconfigurable. Firewall Unit

Delay Time Analysis of Reconfigurable. Firewall Unit Delay Time Analysis of Reconfigurable Unit Tomoaki SATO C&C Systems Center, Hirosaki University Hirosaki 036-8561 Japan Phichet MOUNGNOUL Faculty of Engineering, King Mongkut's Institute of Technology

More information

Implementation of ALU Using Asynchronous Design

Implementation of ALU Using Asynchronous Design IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) ISSN: 2278-2834, ISBN: 2278-8735. Volume 3, Issue 6 (Nov. - Dec. 2012), PP 07-12 Implementation of ALU Using Asynchronous Design P.

More information

CH : 15 LOCAL AREA NETWORK OVERVIEW

CH : 15 LOCAL AREA NETWORK OVERVIEW CH : 15 LOCAL AREA NETWORK OVERVIEW P. 447 LAN (Local Area Network) A LAN consists of a shared transmission medium and a set of hardware and software for interfacing devices to the medium and regulating

More information

OPTIMIZATION OF IPV6 PACKET S HEADERS OVER ETHERNET FRAME

OPTIMIZATION OF IPV6 PACKET S HEADERS OVER ETHERNET FRAME OPTIMIZATION OF IPV6 PACKET S HEADERS OVER ETHERNET FRAME 1 FAHIM A. AHMED GHANEM1, 2 VILAS M. THAKARE 1 Research Student, School of Computational Sciences, Swami Ramanand Teerth Marathwada University,

More information

Mark Sandstrom ThroughPuter, Inc.

Mark Sandstrom ThroughPuter, Inc. Hardware Implemented Scheduler, Placer, Inter-Task Communications and IO System Functions for Many Processors Dynamically Shared among Multiple Applications Mark Sandstrom ThroughPuter, Inc mark@throughputercom

More information

7. TCP 최양희서울대학교컴퓨터공학부

7. TCP 최양희서울대학교컴퓨터공학부 7. TCP 최양희서울대학교컴퓨터공학부 1 TCP Basics Connection-oriented (virtual circuit) Reliable Transfer Buffered Transfer Unstructured Stream Full Duplex Point-to-point Connection End-to-end service 2009 Yanghee Choi

More information

Concurrent/Parallel Processing

Concurrent/Parallel Processing Concurrent/Parallel Processing David May: April 9, 2014 Introduction The idea of using a collection of interconnected processing devices is not new. Before the emergence of the modern stored program computer,

More information

Network Interface Architecture and Prototyping for Chip and Cluster Multiprocessors

Network Interface Architecture and Prototyping for Chip and Cluster Multiprocessors University of Crete School of Sciences & Engineering Computer Science Department Master Thesis by Michael Papamichael Network Interface Architecture and Prototyping for Chip and Cluster Multiprocessors

More information

Address InterLeaving for Low- Cost NoCs

Address InterLeaving for Low- Cost NoCs Address InterLeaving for Low- Cost NoCs Miltos D. Grammatikakis, Kyprianos Papadimitriou, Polydoros Petrakis, Marcello Coppola, and Michael Soulie Technological Educational Institute of Crete, GR STMicroelectronics,

More information

TriScale Clustering Tech Note

TriScale Clustering Tech Note TriScale Clustering Tech Note www.citrix.com Table of Contents Expanding Capacity with TriScale Clustering... 2 How Clustering Works... 2 Cluster Communication... 3 Cluster Configuration and Synchronization...

More information

SUMMERY, CONCLUSIONS AND FUTURE WORK

SUMMERY, CONCLUSIONS AND FUTURE WORK Chapter - 6 SUMMERY, CONCLUSIONS AND FUTURE WORK The entire Research Work on On-Demand Routing in Multi-Hop Wireless Mobile Ad hoc Networks has been presented in simplified and easy-to-read form in six

More information

Routing. Information Networks p.1/35

Routing. Information Networks p.1/35 Routing Routing is done by the network layer protocol to guide packets through the communication subnet to their destinations The time when routing decisions are made depends on whether we are using virtual

More information

Computer Architecture Crash course

Computer Architecture Crash course Computer Architecture Crash course Frédéric Haziza Department of Computer Systems Uppsala University Summer 2008 Conclusions The multicore era is already here cost of parallelism is dropping

More information

A Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding

A Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding A Low Power Asynchronous FPGA with Autonomous Fine Grain Power Gating and LEDR Encoding N.Rajagopala krishnan, k.sivasuparamanyan, G.Ramadoss Abstract Field Programmable Gate Arrays (FPGAs) are widely

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION CHAPTER 1 INTRODUCTION Rapid advances in integrated circuit technology have made it possible to fabricate digital circuits with large number of devices on a single chip. The advantages of integrated circuits

More information

CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS

CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS CS6303 Computer Architecture Regulation 2013 BE-Computer Science and Engineering III semester 2 MARKS UNIT-I OVERVIEW & INSTRUCTIONS 1. What are the eight great ideas in computer architecture? The eight

More information

A Network Storage LSI Suitable for Home Network

A Network Storage LSI Suitable for Home Network 258 HAN-KYU LIM et al : A NETWORK STORAGE LSI SUITABLE FOR HOME NETWORK A Network Storage LSI Suitable for Home Network Han-Kyu Lim*, Ji-Ho Han**, and Deog-Kyoon Jeong*** Abstract Storage over (SoE) is

More information

A distributed architecture of IP routers

A distributed architecture of IP routers A distributed architecture of IP routers Tasho Shukerski, Vladimir Lazarov, Ivan Kanev Abstract: The paper discusses the problems relevant to the design of IP (Internet Protocol) routers or Layer3 switches

More information

Multiprocessors and Thread-Level Parallelism. Department of Electrical & Electronics Engineering, Amrita School of Engineering

Multiprocessors and Thread-Level Parallelism. Department of Electrical & Electronics Engineering, Amrita School of Engineering Multiprocessors and Thread-Level Parallelism Multithreading Increasing performance by ILP has the great advantage that it is reasonable transparent to the programmer, ILP can be quite limited or hard to

More information

Evaluating the Impact of Signal to Noise Ratio on IEEE PHY-Level Packet Loss Rate

Evaluating the Impact of Signal to Noise Ratio on IEEE PHY-Level Packet Loss Rate 21 13th International Conference on Network-Based Information Systems Evaluating the Impact of Signal to Noise Ratio on IEEE 82.15.4 PHY-Level Packet Loss Rate MGoyal, S Prakash,WXie,YBashir, H Hosseini

More information

Reducing SpaceWire Time-code Jitter

Reducing SpaceWire Time-code Jitter Reducing SpaceWire Time-code Jitter Barry M Cook 4Links Limited The Mansion, Bletchley Park, Milton Keynes, MK3 6ZP, UK Email: barry@4links.co.uk INTRODUCTION Standards ISO/IEC 14575[1] and IEEE 1355[2]

More information

A Synthesizable RTL Design of Asynchronous FIFO Interfaced with SRAM

A Synthesizable RTL Design of Asynchronous FIFO Interfaced with SRAM A Synthesizable RTL Design of Asynchronous FIFO Interfaced with SRAM Mansi Jhamb, Sugam Kapoor USIT, GGSIPU Sector 16-C, Dwarka, New Delhi-110078, India Abstract This paper demonstrates an asynchronous

More information

Extended Correspondent Registration Scheme for Reducing Handover Delay in Mobile IPv6

Extended Correspondent Registration Scheme for Reducing Handover Delay in Mobile IPv6 Extended Correspondent Registration Scheme for Reducing Handover Delay in Mobile IPv6 Ved P. Kafle Department of Informatics The Graduate University for Advanced Studies Tokyo, Japan Eiji Kamioka and Shigeki

More information

Basic Processing Unit: Some Fundamental Concepts, Execution of a. Complete Instruction, Multiple Bus Organization, Hard-wired Control,

Basic Processing Unit: Some Fundamental Concepts, Execution of a. Complete Instruction, Multiple Bus Organization, Hard-wired Control, UNIT - 7 Basic Processing Unit: Some Fundamental Concepts, Execution of a Complete Instruction, Multiple Bus Organization, Hard-wired Control, Microprogrammed Control Page 178 UNIT - 7 BASIC PROCESSING

More information

Next-Generation Switching Systems

Next-Generation Switching Systems Next-Generation Switching Systems Atsuo Kawai Shin ichi Iwaki Haruyoshi Kiyoku Keizo Kusaba ABSTRACT: Today, the telecommunications industry is encountering a new wave of multimedia communications for

More information

High Performance Interconnect and NoC Router Design

High Performance Interconnect and NoC Router Design High Performance Interconnect and NoC Router Design Brinda M M.E Student, Dept. of ECE (VLSI Design) K.Ramakrishnan College of Technology Samayapuram, Trichy 621 112 brinda18th@gmail.com Devipoonguzhali

More information

CHAPTER 3 ENHANCEMENTS IN DATA LINK LAYER

CHAPTER 3 ENHANCEMENTS IN DATA LINK LAYER 32 CHAPTER 3 ENHANCEMENTS IN DATA LINK LAYER This proposed work describes the techniques used in the data link layer to improve the performance of the TCP in wireless networks and MANETs. In the data link

More information

Parallel graph traversal for FPGA

Parallel graph traversal for FPGA LETTER IEICE Electronics Express, Vol.11, No.7, 1 6 Parallel graph traversal for FPGA Shice Ni a), Yong Dou, Dan Zou, Rongchun Li, and Qiang Wang National Laboratory for Parallel and Distributed Processing,

More information

Informal Quiz #01: SOLUTIONS

Informal Quiz #01: SOLUTIONS ECSE-6600: Internet Protocols Informal Quiz #01: SOLUTIONS : GOOGLE: Shiv RPI shivkuma@ecse.rpi.edu 1 Review of Networking Concepts (I): Informal Quiz SOLUTIONS For each T/F question: Replace the appropriate

More information

Ultra Depedable VLSI by Collaboration of Formal Verifications and Architectural Technologies

Ultra Depedable VLSI by Collaboration of Formal Verifications and Architectural Technologies Ultra Depedable VLSI by Collaboration of Formal Verifications and Architectural Technologies CREST-DVLSI - Fundamental Technologies for Dependable VLSI Systems - Masahiro Fujita Shuichi Sakai Masahiro

More information

STUDY, DESIGN AND SIMULATION OF FPGA BASED USB 2.0 DEVICE CONTROLLER

STUDY, DESIGN AND SIMULATION OF FPGA BASED USB 2.0 DEVICE CONTROLLER STUDY, DESIGN AND SIMULATION OF FPGA BASED USB 2.0 DEVICE CONTROLLER 1 MS. PARUL BAHUGUNA CD 1 M.E. [VLSI & Embedded System Design] Student, Gujarat Technological University PG School, Ahmedabad, Gujarat.

More information

Flexible Platform for Neural Network Based on Data Flow Principles

Flexible Platform for Neural Network Based on Data Flow Principles Flexible Platform for Neural Network Based on Data Flow Principles Liberios Vokorokos, Norbert Ádám, Anton Baláž Department of Computers and Informatics, Faculty of Electrical Engineering and Informatics,

More information