ORION3.0: A Comprehensive NoC Router Estimation Tool
IEEE EMBEDDED SYSTEMS LETTERS, VOL. XX, NO. Y, MONTH XX

Andrew B. Kahng, Fellow, IEEE, Bill Lin, Senior Member, IEEE, and Siddhartha Nath, Student Member, IEEE

Abstract: Networks-on-Chip (NoCs) are increasingly used in many-core architectures. ORION2.0 [9] is a widely adopted NoC power and area estimation tool, but its estimation models can have large errors (up to 185%) versus actual implementation. We present ORION3.0, a tool that implements parametric and non-parametric modeling methodologies. These methodologies differ fundamentally from the logic template-based approach of ORION2.0 in that the estimation models are derived from actual physical implementation data. When compared with actual implementations, our modeling methodologies achieve average estimation errors of no more than 9.8% across microarchitecture, implementation, and operational parameters as well as multiple router RTL generators. These methodologies are implemented in the ORION3.0 distribution [20].

I. INTRODUCTION

Networks-on-Chip (NoCs) have proven to be highly scalable and low-latency interconnection fabrics in the era of many-core architectures, as evidenced by commercial chips such as the Intel 80-core [16] and IBM Blue Gene [15] processors. NoCs are part of the uncore in the new 2013 International Technology Roadmap for Semiconductors (ITRS) [17] MPU model. Because of their growing importance, NoC implementations must be optimized for latency and power [11]. To facilitate early design-space exploration, tools that can accurately estimate NoC power and area are required. This paper describes ORION3.0, a comprehensive NoC router estimation tool that embodies both parametric and non-parametric models.
(Author footnote: A. B. Kahng is with the Departments of Computer Science and Engineering, and of Electrical and Computer Engineering, University of California at San Diego, La Jolla, CA (abk@ucsd.edu). B. Lin is with the Department of Electrical and Computer Engineering, University of California at San Diego, La Jolla, CA (billlin@ece.ucsd.edu). S. Nath is with the Department of Computer Science and Engineering, University of California at San Diego, La Jolla, CA (sinath@ucsd.edu). Copyright (c) 2013 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending an email to pubs-permissions@ieee.org.)

We include new models of router component blocks using parametric [5] and non-parametric [6] modeling methodologies. These differ fundamentally from approaches like ORION2.0 [9] in that the estimation models are derived from actual post place-and-route (P&R) data that correspond to the actual RTL generator and target cell library. Following this paradigm, we describe two approaches that are implemented in ORION3.0.

The first approach is based on parametric modeling. Our work in [5] is a substantial departure from the ORION2.0 approach because no logic template is assumed for any router component block. Instead, for each component block in the router RTL, appropriate parametric models are derived from the post-synthesis netlists by observing how instance counts change with microarchitectural, implementation, and operational parameters. We call these models ORION NEW. We perform least-squares regression (LSQR) with actual post-P&R power and area data to refine these ORION NEW models. The resulting parametric models achieve worst-case errors significantly better than those of ORION2.0. Our parametric modeling methodology does not require the architect or developer to understand how the architectural components are implemented.
Rather, the methodology relies on a one-time characterization of post-synthesis data to derive parametric models of component blocks, and on automatic fitting of these models to post-P&R data using parametric regression.

The second approach is based on non-parametric modeling. In this approach, estimation models are also derived from actual post-P&R layout power and area data that correspond to the actual RTL generator and target cell library. The non-parametric modeling approach is powerful in that it can automatically derive accurate estimation models from a sample set of post-P&R results. In ORION3.0, we extend the ideas of [8], [3] by incorporating four other metamodeling techniques for automatic model generation: Radial Basis Functions (RBF), Kriging (KG), Multivariate Adaptive Regression Splines (MARS) with linear and cubic splines, and Support Vector Machine (SVM) regression. The non-parametric modeling approach also does not require the architect or developer to understand how the architectural components are implemented.

For backward compatibility, ORION3.0 also includes the logic template-based models of ORION2.0. Depending on the required modeling accuracy and on the availability of training and testing data for regression, users can choose among the modeling methodologies. For example, when training and testing data for a technology or tool flow are not available, users may use ORION NEW models with scaled technology parameters.

Our main contributions are as follows:
1) We describe a new parametric modeling methodology that derives accurate parametric models from actual post-synthesis netlists by observing how instance counts change with microarchitectural, implementation, and operational parameters. The post-synthesis netlists accurately capture contributions from both the control and data paths.
2) We demonstrate that popular non-parametric regression techniques, namely RBF, KG, MARS, and SVM, can be highly accurate (worst-case errors less than 20%) in NoC power and area estimation.
3) We have released ORION3.0 on the web for download. There have been over 100 downloads since the webpage opened for download in February.

The remainder of this paper is organized as follows. Section II presents the ORION NEW models and describes our parametric modeling methodology, and Section III describes our non-parametric modeling methodology. Section IV describes the software architecture of ORION3.0, possible extensions with newer user-defined models as well as training and testing datasets, and details on the ORION3.0 distribution and tool citations. Section V concludes this work.

II. PARAMETRIC MODELING

Figure 1 shows an example of a modern on-chip network router with input and output buffers, switch and virtual channel arbiter, and the crossbar. ORION2.0 uses logic template models for these router blocks. However, these models can be inaccurate because of mismatches between the actual RTL and the templates assumed. Moreover, typical design flows involve sophisticated design steps that have complex interactions among them, making their effects difficult to characterize. To illustrate the degree of inaccuracy, we show in Figures 2(a) and (b) the power and instance count estimation errors at 65nm for ORION2.0 when compared to two router RTL generators, Netmaker [19] from Cambridge and the Stanford NoC router [22], as a function of the number of input ports in the router. The maximum errors are greater than 100% and 1000%, respectively. For these two RTL generators, [5] reports significant improvements in power, area, and instance count estimation using ORION NEW models.

Fig. 1. Router architecture [11].

Fig. 2. Poor estimations by ORION2.0 [9]. (a) Power and (b) instance counts of Netmaker and Stanford NoC vs. ORION2.0 at 65nm as a function of #ports.

A. Model Enhancements

The ORION NEW router block models in ORION3.0 model the instances (or gates) in each router block, which our studies show to be required for accurate estimation of area and power. The microarchitecture parameters used are #Ports (P), #VCs (V), #Buffers (B) and Flit-width (F).

Crossbar (XBAR) Model. ORION NEW models comprehend modern router RTL implementations, which use smaller crossbars instead of the traditional matrix [11] and multiplexer tree [9] implementations in ORION2.0.
The XBAR uses tri-state buffers (modeled as 2:1 MUXes) to control each flit. Hence, the total number of such MUXes required is $P \times P \times F$.

Switch and VC Arbiter (SWVC) Model. ORION NEW removes the default overhead factor of 30% used by ORION2.0 because our analysis indicates that this overhead is not needed with frequencies in the range 400MHz-900MHz for process nodes from 45nm to 130nm. SWVC is modeled as $9 \times (P \times (P \times (V^2 + 1) + (V^2 - 1)))$. The constant factor 9 arises because six 2-input NOR gates, two inverters, and one D flip-flop are used to generate one grant signal on each path.

Input Buffer (InBUF) Model. ORION NEW models take into account the control signals and housekeeping logic that are required to decode flits and manage VCs. ORION2.0 models lack these components and are hence inaccurate. In ORION NEW, FIFO buffers are modeled as $2 \times P \times V \times B \times F$, and control signals and housekeeping logic are modeled as $180 \times P \times V + 2 \times P^2 \times V \times B + 3 \times P \times V \times B + 5 \times P^2 \times B + P^2 + F \times P + 15 \times P$. The constant factors are derived by using linear regression to fit the number of instances in the post-synthesis netlists to the microarchitecture parameters.

Output Buffer (OutBUF) Model. ORION NEW models take into account hybrid output buffer implementations in modern router RTLs as well as the control signals per port and VC associated with each buffer. Therefore, instead of modeling output buffers in exactly the same way as input buffers, ORION NEW models OutBUF as $P \times (80 \times V + 25)$. The constant factors are derived by using linear regression to fit the number of instances in the post-synthesis netlists to the microarchitecture parameters.

Clock and Control Logic (CLKCTRL) Model. ORION NEW models the clock buffers and clock routing resources as frequency scales. ORION2.0 does not model the impact of frequency scaling on these resources. ORION NEW models these resources as 2% of the sum of instances in the SWVC, InBUF and OutBUF component blocks.

Frequency Derating Model.
ORION2.0 models are agnostic to implementation parameters such as clock frequency, which results in large estimation errors at high frequencies. We first find the frequency below which instance counts change by less than 1%. We derate instance counts by a multiplier Instance that is based on this frequency as: $Instance = Frequency \times ConstantFactor$. The constant factors are derived by using linear regression to fit the number of instances in the post-synthesis netlists to frequency and the microarchitecture parameters.

B. Modeling Methodology

Figure 3 shows our flow to derive the new parametric models, ORION NEW, from post-synthesis netlists, and the subsequent refinement process that fits the models to post-P&R area, power, and instance count data. We use the Netmaker and the Stanford NoC router RTL generators, and a range of values of microarchitecture parameters (#Ports (P), #VCs (V), #Buffers (B) and Flit-width (F)) and implementation parameters (clock frequency and technology node) to configure the router. We synthesize the router RTLs using Synopsys Design Compiler vf sp4-64 (DC) [23] and Cadence RTL Compiler vedi10.1 (RC) [13], with options to preserve module hierarchy after synthesis because we analyze each router component block. We then analyze the post-synthesis netlists to derive the ORION NEW models. To refine these models, we generate post-P&R power and area data. We place and route the synthesized netlists using Cadence SOC Encounter vedi10.1 (SOCE) with a die utilization of 0.75 and a die aspect ratio of 1.0. We use Synopsys PrimeTime-PX (PT-PX) [23] to run power analysis based on the post-P&R netlist, SPEF [23] and SDC [1]. Finally, we use the MATLAB [18] function lsqnonneg to fit the models to post-P&R data.

Fig. 3. High-level flow used to develop the ORION NEW and fitted models using post-P&R data. (Flow elements: NoC router RTL generators; µarch params P, V, B, F; implementation params: clock frequency; synthesis and P&R with DC/RC and SOCE; analysis of blocks XBAR, SW & VC arbiter, input and output buffers; ORION NEW models for each component block; post-P&R area, power, instances; LSQR; new fitted models.)

C. Results

We compare the area and power estimation errors of ORION2.0 to those of ORION NEW fitted using our regression approach. Figures 4(a) and (b) plot estimation errors for power and area, respectively, at the 45nm and 65nm technology nodes. The ORION NEW estimates are very close to actual implementation (average error of 9.8% in estimating Netmaker power at 45nm) and are robust across multiple microarchitecture and implementation parameters and router RTLs.

Fig. 4. Least-squares regression fit vs. ORION2.0: (a) power estimation error and (b) area estimation error.

III. NON-PARAMETRIC MODELING

Non-parametric regression techniques provide another approach to estimate NoC power and area [8], [3]. Such techniques can considerably reduce modeling effort because they require only the microarchitectural, implementation, and operational parameters as input variables. The models determine the interactions between these input variables and how they affect the output (or response). This alleviates the effort needed to model a NoC, as details of architecture-level implementations are avoided. At the same time, non-parametric regression approaches are scalable across multiple router RTLs, technology libraries and commercial tool flows. In ORION3.0, we implement four popular non-parametric regression or metamodeling techniques: RBF, KG, MARS, and SVM. Detailed descriptions of these techniques are in [2].

A. Modeling Methodology

We derive NoC area and power models by performing a non-parametric fit of post-P&R data using 65nm and 45nm technology libraries. We first perform synthesis of the Netmaker [19] router RTL using Synopsys Design Compiler [23], followed by place and route using Cadence SOC Encounter [13]. Next, we generate area and power reports of these designs to use as training and test data points. Figure 5 shows our synthesis and P&R flow. We use two sampling methodologies to generate the training sets: a modified Latin Hypercube Sampling (LHS) [4], and a restricted sampling methodology. The restricted sampling methodology does not include higher values of the microarchitectural parameters in the training set. More precisely, the resulting training sets omit all values of {B = 7}, or of {P = 9}, or of {V = 7}, or of {F = 64}.

Fig. 5. Non-parametric regression flow.

B. Results

We use 256 data points of post-P&R power and area values obtained with 45nm and 65nm technology libraries to generate training and test data points. The input variables to all the models are P, V, B and F, and the responses are post-P&R power and area. We use two training set sizes: sparse and restricted with 50 data points, and sparse only with 64 data points. We test the accuracy of the models in estimating area and power for input parameters that are beyond the range of values used for training. In each experiment, model generation takes around 3s and response estimation takes around 1.88s. We repeat all experiments 10 times for each training set size, and report the averages of all the error values across the 10 trials.

Figures 6(a) and (b) show the percentage errors observed in estimating standard-cell area and power, respectively, at 45nm and 65nm by training models using sparse and restricted training sets. The average errors are less than 10% in area and 20% in power. We observe that RBF performs better than the other techniques at both technologies across area and power estimations.
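As a rough illustration of how an RBF metamodel maps the inputs P, V, B and F to a response, the following Python sketch fits a Gaussian RBF interpolant to a handful of samples. It is not the MATLAB RBF toolbox [21] used in ORION3.0. The kernel width `eps`, the `SCALE` constants, the function names, and the training responses (generated from a made-up formula rather than post-P&R data) are all assumptions chosen only to keep the example self-contained.

```python
import math

# Hypothetical upper bounds, used only to scale (P, V, B, F) into [0, 1].
SCALE = (9.0, 7.0, 7.0, 64.0)


def normalize(point):
    """Scale a raw (P, V, B, F) tuple so every input has a similar range."""
    return tuple(value / bound for value, bound in zip(point, SCALE))


def kernel(a, b, eps):
    """Gaussian radial basis function of the distance between a and b."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-(eps ** 2) * sq_dist)


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("training points give a singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_rbf(points, responses, eps=2.0):
    """Return one weight per training point so the model interpolates them."""
    xs = [normalize(p) for p in points]
    gram = [[kernel(a, b, eps) for b in xs] for a in xs]
    return solve_linear(gram, responses)


def predict_rbf(points, weights, query, eps=2.0):
    """Evaluate the fitted model at a new (P, V, B, F) configuration."""
    q = normalize(query)
    return sum(w * kernel(normalize(p), q, eps) for p, w in zip(points, weights))


# Synthetic training set: the responses come from a toy formula, NOT measured data.
TRAIN_X = [(3, 2, 3, 16), (5, 2, 5, 32), (5, 4, 3, 64),
           (7, 4, 5, 16), (7, 6, 3, 32), (9, 6, 5, 64)]
TRAIN_Y = [float(p * v * b * f) for p, v, b, f in TRAIN_X]
WEIGHTS = fit_rbf(TRAIN_X, TRAIN_Y)

if __name__ == "__main__":
    # Estimate the response at a configuration that was not in the training set.
    print(predict_rbf(TRAIN_X, WEIGHTS, (5, 4, 5, 32)))
```

An interpolating RBF reproduces its training responses exactly (up to rounding), so its accuracy on unseen configurations depends on how well the sampled points cover the parameter space.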
The maximum estimation error for RBF is less than 20%, and RBF can be up to 3x more accurate than KG, MARS and SVM. Figures 7(a) and (b) show similar plots using only sparse training sets. We observe that RBF is more accurate than the other techniques, but the difference in accuracy relative to KG and MARS is not as significant as in the sparse and restricted case. The improvement in accuracy for SVM is less than 23% for power, but 100% for area at 45nm. Across all training set sizes used in our experiments, we observe that area and power estimation errors are the smallest for RBF and the highest for SVM. In general, these techniques have smaller estimation errors at 65nm than at 45nm.

Fig. 6. Estimation accuracy of (a) area and (b) power at 45nm with sparse and restricted training dataset (i.e., 50 data points for training).

Fig. 7. Estimation accuracy of (a) area and (b) power at 65nm with only sparse training dataset (i.e., 64 data points for training).

IV. ORION3.0 DISTRIBUTION

We now describe the ORION3.0 software architecture, its extensibility with new models and training/testing datasets, and details of the ORION3.0 software distribution.

A. Software Architecture

ORION3.0 uses a modular software architecture and is written in C and MATLAB. Figure 8 shows the high-level software architecture, how the router configuration is read by the tool, and how models get invoked. ORION3.0 adds several new command line options for the user to choose (1) ORION3.0 vs. ORION2.0 models, (2) ORION3.0 modeling techniques, namely basic, lsqr, rbf, kg, mars, and svm, and (3) training and (4) testing data points when the modeling technique is from the set {lsqr, rbf, kg, mars, svm}. As with ORION2.0, users can configure microarchitecture parameters such as flit-width, #input and output buffers, #virtual channels, #pipeline stages, type of crossbar, #ports, etc. in the SIM_port.h file. These parameters are used when the user chooses either the ORION2.0 models or the ORION3.0 basic model. Furthermore, we updated the technology files with accurate leakage and internal power data from the TSMC 45nm technology libraries for all cell types used in modeling. Users can specify implementation and operational parameters such as technology library, size of MUXes in the crossbar, and the input load on the router from the link. The tool uses ORION3.0 basic models by default when no options are specified by the user.
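A sketch of what the default basic models compute, taking them to be the ORION NEW instance-count expressions of Section II-A, is given below for one router configuration. The function names and the example configuration are hypothetical and are not taken from the ORION3.0 sources. The SWVC expression and the squared terms follow one reading of the printed formulas (implied multiplication, with a digit after P or V read as an exponent), and frequency derating is omitted.

```python
def xbar_instances(p, f):
    """Crossbar: P x P x F tri-state buffers, each modeled as a 2:1 MUX."""
    return p * p * f


def swvc_instances(p, v):
    """Switch and VC arbiter. ASSUMED reading of the printed expression:
    9 x (P x (P x (V^2 + 1) + (V^2 - 1)))."""
    return 9 * (p * (p * (v ** 2 + 1) + (v ** 2 - 1)))


def inbuf_instances(p, v, b, f):
    """Input buffer: FIFO storage plus control and housekeeping logic."""
    fifo = 2 * p * v * b * f
    # ASSUMED reading: "P 2" in the printed text is taken as P squared.
    control = (180 * p * v + 2 * p ** 2 * v * b + 3 * p * v * b
               + 5 * p ** 2 * b + p ** 2 + f * p + 15 * p)
    return fifo + control


def outbuf_instances(p, v):
    """Output buffer: P x (80 x V + 25)."""
    return p * (80 * v + 25)


def clkctrl_instances(swvc, inbuf, outbuf):
    """Clock and control logic: 2% of the SWVC, InBUF and OutBUF instances."""
    return 0.02 * (swvc + inbuf + outbuf)


def router_instances(p, v, b, f):
    """Per-block instance counts for one (P, V, B, F) configuration."""
    counts = {
        "XBAR": xbar_instances(p, f),
        "SWVC": swvc_instances(p, v),
        "InBUF": inbuf_instances(p, v, b, f),
        "OutBUF": outbuf_instances(p, v),
    }
    counts["CLKCTRL"] = clkctrl_instances(
        counts["SWVC"], counts["InBUF"], counts["OutBUF"])
    return counts


# Hypothetical configuration: 5 ports, 2 VCs, 4 buffers, 32-bit flits.
EXAMPLE = router_instances(p=5, v=2, b=4, f=32)

if __name__ == "__main__":
    for block, count in EXAMPLE.items():
        print(block, count)
```

Keeping one function per component block mirrors the per-block structure of the models, so a fitted coefficient or a derating multiplier could later be applied to an individual block without touching the others.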
The basic models refer to the ORION NEW models described in Section II. All models report area in µm², power in mW, and energy in J.

Fig. 8. Software architecture of ORION3.0.

B. Tool Extensibility

When the user chooses lsqr or any of the non-parametric regression modeling techniques, namely rbf, kg, mars, or svm, they have the option of providing their own training and testing data points. ORION3.0 does basic validation of the data points to make sure that the input can be converted to a non-singular matrix. The user also has the option to not provide training or test data points; in that case the tool uses default training and testing data points based on the technology configured in the PARM_TECH_POINT field of the SIM_port.h file. In addition, users can develop their own regression models in MATLAB and integrate them with ORION3.0. They need to invoke a wrapper in orion_router.c and invoke the model from a shell script. The shell script is invoked from orion_router.c using the system function in C.

C. Software Distribution

ORION3.0 is available for web download from [20] and is referenceable as [5], [6]. We provide academic MATLAB toolboxes for RBF [21], KG [10], MARS [12], and SVM [14] under the same copyright and license agreements as available in their distributions. We provide doxygen-based documentation of all functions and structures implemented as part of the distribution.

V. CONCLUSION

Accurate modeling for NoC area and power estimation is critical to successful early design-space exploration in the era of many-core computing. ORION2.0, while very popular, has large errors versus actual implementation. This is because there is often a mismatch between the actual router RTL and the templates assumed. Also, typical design flows involve sophisticated optimizations that are difficult to characterize. We present ORION3.0, a tool that incorporates comprehensive parametric and non-parametric modeling techniques to accurately estimate NoC power and area. Our parametric models, ORION NEW, explicitly account for control and data path resources. We further refine these parametric models by performing least-squares regression (LSQR) analysis on post-P&R data. Our non-parametric models use four popular techniques: RBF, KG, MARS, and SVM. Our results show that these techniques can be low-overhead and highly accurate in estimating NoC power and area, and that RBF can be more accurate than the others for sparse and restricted training sets. ORION3.0 is now available for web download and distribution [20].

ACKNOWLEDGMENTS

This work was supported in part by the NSF grant SHF, the MARCO GSRC focus center, and the SRC. We also thank Mr. Jeremiah Fong for developing the front-end interfaces for ORION3.0.
REFERENCES

[1] J. Bhasker and R. Chadha, Static Timing Analysis for Nanometer Designs: A Practical Approach, Springer.
[2] T. Hastie, R. Tibshirani and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer.
[3] K. Jeong, A. B. Kahng, B. Lin and K. Samadi, Accurate Machine Learning-Based On-Chip Router Modeling, IEEE ESL, 2(3) (2010).
[4] R. Jin, W. Chen and T. W. Simpson, Comparative Studies of Metamodeling Techniques Under Multiple Modeling Criteria, Trans. Struct. Multidiscip. Optim. 23 (2001).
[5] A. B. Kahng, B. Lin and S. Nath, Explicit Modeling of Control and Data for Improved NoC Router Estimation, Proc. DAC, 2012.
[6] A. B. Kahng, B. Lin and S. Nath, Comprehensive Modeling Methodologies for NoC Router Estimation, University of California San Diego Technical Report CS, 2012.
[7] A. B. Kahng, B. Lin and S. Nath, Enhanced Metamodeling Techniques for High-Dimensional IC Design Estimation Problems, Proc. DATE, 2013.
[8] A. B. Kahng, B. Lin and K. Samadi, Improved On-Chip Router Analytical Power and Area Modeling, Proc. ASP-DAC, 2010.
[9] A. B. Kahng, B. Li, L.-S. Peh and K. Samadi, ORION 2.0: A Fast and Accurate NoC Power and Area Model for Early-Stage Design Space Exploration, Proc. DATE, 2009.
[10] S. N. Lophaven, H. B. Nielsen and J. Sondergaard, Aspects of the MATLAB Toolbox DACE, Technical University of Denmark Technical Report IMM-REP.
[11] H.-S. Wang, L.-S. Peh and S. Malik, Orion: A Power-Performance Simulator for Interconnection Networks, Proc. MICRO, 2002.
[12] ARESLab.
[13] Cadence Design Implementation.
[14] LIBSVM. cjlin/libsvm
[15] IBM Blue Gene processor.
[16] Intel 80-Core Report.
[17] ITRS Edition Reports and Ordering.
[18] MATLAB.
[19] Netmaker. rdm34/wiki
[20] ORION
[21] RBF2 Manual. mjo/rbf.html
[22] Stanford NoC.
[23] Synopsys Tools.
Dear Editor and Reviewers:

We are submitting "ORION3.0: A Comprehensive NoC Router Estimation Tool". In this paper we describe ORION3.0, recently released on the web for download, which makes significant improvements on ORION2.0 (2009). Quality of the release is supported by over 100 downloads during the past several months. Improvements to NoC parametric modeling are described in our paper, "Explicit Modeling of Control and Data for NoC Router Estimation", that appeared in Proc. ACM IEEE EDAC Design Automation Conference, June.

We made the following enhancements in ORION3.0.

We significantly improve the accuracy of the parametric models of NoC router blocks of ORION2.0 (2009) by implementing the ORION NEW models described in our DAC12 paper. These models are derived from analysis of post-synthesis netlists of multiple router RTL generators, synthesized using multiple commercial tools. We also implement automatic fitting of post-layout data to the ORION NEW models using least-squares regression.

We add non-parametric models derived by applying well-known techniques such as Radial Basis Functions (RBF), Kriging (KG), Multivariate Adaptive Regression Splines (MARS) and Support Vector Machines (SVM). These models automatically fit post-layout data that is specific to a commercial SP&R tool flow and technology library. We provide our training and testing sets of data points from NoC routers implemented using TSMC 45nm and 65nm technologies.

We describe the software architecture and user interface of ORION3.0. A user can choose from the following modeling options: (1) ORION2.0, (2) ORION NEW models from DAC12, (3) ORION NEW fitted with post-layout data using least-squares regression, (4) RBF, (5) KG, (6) MARS, and (7) SVM.

We demonstrate that the ORION3.0 interfaces are highly flexible. These interfaces allow users to add new non-parametric NoC models and new training and testing sets of data points to ORION3.0.

Please contact me with any questions concerning the submission. Thank you for your consideration.

Sincerely,
Siddhartha Nath (on behalf of co-authors Andrew B. Kahng and Bill Lin)
Department of Computer Science and Engineering
University of California at San Diego
La Jolla, CA
sinath@ucsd.edu
Datasheet Create a Better Starting Point for Faster Physical Implementation Overview Continuing the trend of delivering innovative synthesis technology, Design Compiler Graphical streamlines the flow for
More informationRe-Examining Conventional Wisdom for Networks-on-Chip in the Context of FPGAs
This work was funded by NSF. We thank Xilinx for their FPGA and tool donations. We thank Bluespec for their tool donations. Re-Examining Conventional Wisdom for Networks-on-Chip in the Context of FPGAs
More informationIntroduction 1. GENERAL TRENDS. 1. The technology scale down DEEP SUBMICRON CMOS DESIGN
1 Introduction The evolution of integrated circuit (IC) fabrication techniques is a unique fact in the history of modern industry. The improvements in terms of speed, density and cost have kept constant
More informationStratix vs. Virtex-II Pro FPGA Performance Analysis
White Paper Stratix vs. Virtex-II Pro FPGA Performance Analysis The Stratix TM and Stratix II architecture provides outstanding performance for the high performance design segment, providing clear performance
More informationPower-Mode-Aware Buffer Synthesis for Low-Power Clock Skew Minimization
This article has been accepted and published on J-STAGE in advance of copyediting. Content is final as presented. IEICE Electronics Express, Vol.* No.*,*-* Power-Mode-Aware Buffer Synthesis for Low-Power
More informationORION 2.0: A Power-Area Simulator for Interconnection Networks REFERENCES
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 20, NO. 1, JANUARY 2012 191 according to the simulation results. In order to mitigate the impact two techniques may be used: 1) design
More informationMore Course Information
More Course Information Labs and lectures are both important Labs: cover more on hands-on design/tool/flow issues Lectures: important in terms of basic concepts and fundamentals Do well in labs Do well
More informationFast Flexible FPGA-Tuned Networks-on-Chip
This work was funded by NSF. We thank Xilinx for their FPGA and tool donations. We thank Bluespec for their tool donations. Fast Flexible FPGA-Tuned Networks-on-Chip Michael K. Papamichael, James C. Hoe
More informationPrimeTime: Introduction to Static Timing Analysis Workshop
i-1 PrimeTime: Introduction to Static Timing Analysis Workshop Synopsys Customer Education Services 2002 Synopsys, Inc. All Rights Reserved PrimeTime: Introduction to Static 34000-000-S16 Timing Analysis
More informationA Survey of Techniques for Power Aware On-Chip Networks.
A Survey of Techniques for Power Aware On-Chip Networks. Samir Chopra Ji Young Park May 2, 2005 1. Introduction On-chip networks have been proposed as a solution for challenges from process technology
More informationDesign of a Low Density Parity Check Iterative Decoder
1 Design of a Low Density Parity Check Iterative Decoder Jean Nguyen, Computer Engineer, University of Wisconsin Madison Dr. Borivoje Nikolic, Faculty Advisor, Electrical Engineer, University of California,
More informationLithography Simulation-Based Full-Chip Design Analyses
Lithography Simulation-Based Full-Chip Design Analyses Puneet Gupta a, Andrew B. Kahng a, Sam Nakagawa a,saumilshah b and Puneet Sharma c a Blaze DFM, Inc., Sunnyvale, CA; b University of Michigan, Ann
More informationDesign Tools for 100,000 Gate Programmable Logic Devices
esign Tools for 100,000 Gate Programmable Logic evices March 1996, ver. 1 Product Information Bulletin 22 Introduction The capacity of programmable logic devices (PLs) has risen dramatically to meet the
More informationDesign of Low-Power and Low-Latency 256-Radix Crossbar Switch Using Hyper-X Network Topology
JOURNAL OF SEMICONDUCTOR TECHNOLOGY AND SCIENCE, VOL.15, NO.1, FEBRUARY, 2015 http://dx.doi.org/10.5573/jsts.2015.15.1.077 Design of Low-Power and Low-Latency 256-Radix Crossbar Switch Using Hyper-X Network
More informationAn overview of standard cell based digital VLSI design
An overview of standard cell based digital VLSI design Implementation of the first generation AsAP processor Zhiyi Yu and Tinoosh Mohsenin VCL Laboratory UC Davis Outline Overview of standard cellbased
More informationIMPROVES. Initial Investment is Low Compared to SoC Performance and Cost Benefits
NOC INTERCONNECT IMPROVES SOC ECONO CONOMICS Initial Investment is Low Compared to SoC Performance and Cost Benefits A s systems on chip (SoCs) have interconnect, along with its configuration, verification,
More informationDesign of Adaptive Communication Channel Buffers for Low-Power Area- Efficient Network-on. on-chip Architecture
Design of Adaptive Communication Channel Buffers for Low-Power Area- Efficient Network-on on-chip Architecture Avinash Kodi, Ashwini Sarathy * and Ahmed Louri * Department of Electrical Engineering and
More informationPerformance Explorations of Multi-Core Network on Chip Router
Performance Explorations of Multi-Core Network on Chip Router U.Saravanakumar Department of Electronics and Communication Engineering PSG College of Technology Coimbatore, India saran.usk@gmail.com R.
More informationNEtwork-on-Chip (NoC) [3], [6] is a scalable interconnect
1 A Soft Tolerant Network-on-Chip Router Pipeline for Multi-core Systems Pavan Poluri and Ahmed Louri Department of Electrical and Computer Engineering, University of Arizona Email: pavanp@email.arizona.edu,
More informationA VERIOG-HDL IMPLEMENTATION OF VIRTUAL CHANNELS IN A NETWORK-ON-CHIP ROUTER. A Thesis SUNGHO PARK
A VERIOG-HDL IMPLEMENTATION OF VIRTUAL CHANNELS IN A NETWORK-ON-CHIP ROUTER A Thesis by SUNGHO PARK Submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment of the requirements
More informationEE-382M VLSI II. Early Design Planning: Front End
EE-382M VLSI II Early Design Planning: Front End Mark McDermott EE 382M-8 VLSI-2 Page Foil # 1 1 EDP Objectives Get designers thinking about physical implementation while doing the architecture design.
More informationDLABS: a Dual-Lane Buffer-Sharing Router Architecture for Networks on Chip
DLABS: a Dual-Lane Buffer-Sharing Router Architecture for Networks on Chip Anh T. Tran and Bevan M. Baas Department of Electrical and Computer Engineering University of California - Davis, USA {anhtr,
More informationDESIGN AND PERFORMANCE EVALUATION OF ON CHIP NETWORK ROUTERS
DESIGN AND PERFORMANCE EVALUATION OF ON CHIP NETWORK ROUTERS 1 U.SARAVANAKUMAR, 2 R.RANGARAJAN 1 Asst Prof., Department of ECE, PSG College of Technology, Coimbatore, INDIA 2 Professor & Principal, Indus
More informationKiloCore: A 32 nm 1000-Processor Array
KiloCore: A 32 nm 1000-Processor Array Brent Bohnenstiehl, Aaron Stillmaker, Jon Pimentel, Timothy Andreas, Bin Liu, Anh Tran, Emmanuel Adeagbo, Bevan Baas University of California, Davis VLSI Computation
More informationDESIGN OF EFFICIENT ROUTING ALGORITHM FOR CONGESTION CONTROL IN NOC
DESIGN OF EFFICIENT ROUTING ALGORITHM FOR CONGESTION CONTROL IN NOC 1 Pawar Ruchira Pradeep M. E, E&TC Signal Processing, Dr. D Y Patil School of engineering, Ambi, Pune Email: 1 ruchira4391@gmail.com
More informationA Software LDPC Decoder Implemented on a Many-Core Array of Programmable Processors
A Software LDPC Decoder Implemented on a Many-Core Array of Programmable Processors Brent Bohnenstiehl and Bevan Baas Department of Electrical and Computer Engineering University of California, Davis {bvbohnen,
More informationNISC Application and Advantages
NISC Application and Advantages Daniel D. Gajski Mehrdad Reshadi Center for Embedded Computer Systems University of California, Irvine Irvine, CA 92697-3425, USA {gajski, reshadi}@cecs.uci.edu CECS Technical
More informationClocked and Asynchronous FIFO Characterization and Comparison
Clocked and Asynchronous FIFO Characterization and Comparison HoSuk Han Kenneth S. Stevens Electrical and Computer Engineering University of Utah Abstract Heterogeneous blocks, IP reuse, network-on-chip
More informationECE/CS 757: Advanced Computer Architecture II Interconnects
ECE/CS 757: Advanced Computer Architecture II Interconnects Instructor:Mikko H Lipasti Spring 2017 University of Wisconsin-Madison Lecture notes created by Natalie Enright Jerger Lecture Outline Introduction
More informationAiyar, Mani Laxman. Keywords: MPEG4, H.264, HEVC, HDTV, DVB, FIR.
2015; 2(2): 201-209 IJMRD 2015; 2(2): 201-209 www.allsubjectjournal.com Received: 07-01-2015 Accepted: 10-02-2015 E-ISSN: 2349-4182 P-ISSN: 2349-5979 Impact factor: 3.762 Aiyar, Mani Laxman Dept. Of ECE,
More informationTOPIC : Verilog Synthesis examples. Module 4.3 : Verilog synthesis
TOPIC : Verilog Synthesis examples Module 4.3 : Verilog synthesis Example : 4-bit magnitude comptarator Discuss synthesis of a 4-bit magnitude comparator to understand each step in the synthesis flow.
More informationHigh Quality, Low Cost Test
Datasheet High Quality, Low Cost Test Overview is a comprehensive synthesis-based test solution for compression and advanced design-for-test that addresses the cost challenges of testing complex designs.
More informationHigh Throughput and Low Power NoC
IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 5, o 3, September 011 www.ijcsi.org 431 High Throughput and Low Power oc Magdy El-Moursy 1, Member IEEE and Mohamed Abdelgany 1 Mentor
More informationCHAPTER 6 FPGA IMPLEMENTATION OF ARBITERS ALGORITHM FOR NETWORK-ON-CHIP
133 CHAPTER 6 FPGA IMPLEMENTATION OF ARBITERS ALGORITHM FOR NETWORK-ON-CHIP 6.1 INTRODUCTION As the era of a billion transistors on a one chip approaches, a lot of Processing Elements (PEs) could be located
More informationRuntime Adaptation of Application Execution under Thermal and Power Constraints in Massively Parallel Processor Arrays
Runtime Adaptation of Application Execution under Thermal and Power Constraints in Massively Parallel Processor Arrays Éricles Sousa 1, Frank Hannig 1, Jürgen Teich 1, Qingqing Chen 2, and Ulf Schlichtmann
More informationLecture 11 Logic Synthesis, Part 2
Lecture 11 Logic Synthesis, Part 2 Xuan Silvia Zhang Washington University in St. Louis http://classes.engineering.wustl.edu/ese461/ Write Synthesizable Code Use meaningful names for signals and variables
More informationA Sensor-Assisted Self-Authentication Framework for Hardware Trojan Detection
A Sensor-Assisted Self-Authentication Framework for Hardware Trojan Detection Min Li, Azadeh Davoodi, and Mohammad Tehranipoor Department of Electrical and Computer Engineering University of Wisconsin
More informationLOW POWER REDUCED ROUTER NOC ARCHITECTURE DESIGN WITH CLASSICAL BUS BASED SYSTEM
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 5, May 2015, pg.705
More informationUCLA 3D research started in 2002 under DARPA with CFDRC
Coping with Vertical Interconnect Bottleneck Jason Cong UCLA Computer Science Department cong@cs.ucla.edu http://cadlab.cs.ucla.edu/ cs edu/~cong Outline Lessons learned Research challenges and opportunities
More informationOpenSMART: An Opensource Singlecycle Multi-hop NoC Generator
OpenSMART: An Opensource Singlecycle Multi-hop NoC Generator Hyoukjun Kwon and Tushar Krishna Georgia Institute of Technology Synergy Lab (http://synergy.ece.gatech.edu) OpenSMART (https://tinyurl.com/get-opensmart)
More informationDeadlock-free XY-YX router for on-chip interconnection network
LETTER IEICE Electronics Express, Vol.10, No.20, 1 5 Deadlock-free XY-YX router for on-chip interconnection network Yeong Seob Jeong and Seung Eun Lee a) Dept of Electronic Engineering Seoul National Univ
More informationVCS AMS. Mixed-Signal Verification Solution. Overview. testing with transistor-level accuracy. Introduction. Performance. Multicore Technology
DATASHEET VCS AMS Mixed-Signal Verification Solution Scalable mixedsignal regression testing with transistor-level accuracy Overview The complexity of mixed-signal system-on-chip (SoC) designs is rapidly
More informationClock Skew Optimization Considering Complicated Power Modes
Clock Skew Optimization Considering Complicated Power Modes Chiao-Ling Lung 1,2, Zi-Yi Zeng 1, Chung-Han Chou 1, Shih-Chieh Chang 1 National Tsing-Hua University, HsinChu, Taiwan 1 Industrial Technology
More informationReNoC: A Network-on-Chip Architecture with Reconfigurable Topology
1 ReNoC: A Network-on-Chip Architecture with Reconfigurable Topology Mikkel B. Stensgaard and Jens Sparsø Technical University of Denmark Technical University of Denmark Outline 2 Motivation ReNoC Basic
More informationOCB-Based SoC Integration
The Present and The Future 黃俊達助理教授 Juinn-Dar Huang, Assistant Professor March 11, 2005 jdhuang@mail.nctu.edu.tw Department of Electronics Engineering National Chiao Tung University 1 Outlines Present Why
More informationDesign and Implementation of A Reconfigurable Arbiter
Proceedings of the 7th WSEAS International Conference on Signal, Speech and Image Processing, Beijing, China, September 15-17, 2007 100 Design and Implementation of A Reconfigurable Arbiter YU-JUNG HUANG,
More informationCopyright 2007 Society of Photo-Optical Instrumentation Engineers. This paper was published in Proceedings of SPIE (Proc. SPIE Vol.
Copyright 2007 Society of Photo-Optical Instrumentation Engineers. This paper was published in Proceedings of SPIE (Proc. SPIE Vol. 6937, 69370N, DOI: http://dx.doi.org/10.1117/12.784572 ) and is made
More informationImplementing Tile-based Chip Multiprocessors with GALS Clocking Styles
Implementing Tile-based Chip Multiprocessors with GALS Clocking Styles Zhiyi Yu, Bevan Baas VLSI Computation Lab, ECE Department University of California, Davis, USA Outline Introduction Timing issues
More informationSoft-Core Embedded Processor-Based Built-In Self- Test of FPGAs: A Case Study
Soft-Core Embedded Processor-Based Built-In Self- Test of FPGAs: A Case Study Bradley F. Dutton, Graduate Student Member, IEEE, and Charles E. Stroud, Fellow, IEEE Dept. of Electrical and Computer Engineering
More informationAn Efficient Design of Serial Communication Module UART Using Verilog HDL
An Efficient Design of Serial Communication Module UART Using Verilog HDL Pogaku Indira M.Tech in VLSI and Embedded Systems, Siddhartha Institute of Engineering and Technology. Dr.D.Subba Rao, M.Tech,
More informationA Dynamic Memory Management Unit for Embedded Real-Time System-on-a-Chip
A Dynamic Memory Management Unit for Embedded Real-Time System-on-a-Chip Mohamed Shalan Georgia Institute of Technology School of Electrical and Computer Engineering 801 Atlantic Drive Atlanta, GA 30332-0250
More informationNetSpeed ORION: A New Approach to Design On-chip Interconnects. August 26 th, 2013
NetSpeed ORION: A New Approach to Design On-chip Interconnects August 26 th, 2013 INTERCONNECTS BECOMING INCREASINGLY IMPORTANT Growing number of IP cores Average SoCs today have 100+ IPs Mixing and matching
More informationHardware Modeling using Verilog Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur
Hardware Modeling using Verilog Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture 01 Introduction Welcome to the course on Hardware
More informationAn Asynchronous NoC Router in a 14nm FinFET Library: Comparison to an Industrial Synchronous Counterpart
An Asynchronous NoC Router in a 14nm FinFET Library: Comparison to an Industrial Synchronous Counterpart Weiwei Jiang Columbia University, USA Gabriele Miorandi University of Ferrara, Italy Wayne Burleson
More informationNetwork on Chip Architecture: An Overview
Network on Chip Architecture: An Overview Md Shahriar Shamim & Naseef Mansoor 12/5/2014 1 Overview Introduction Multi core chip Challenges Network on Chip Architecture Regular Topology Irregular Topology
More informationProcessor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP
Processor Architectures At A Glance: M.I.T. Raw vs. UC Davis AsAP Presenter: Course: EEC 289Q: Reconfigurable Computing Course Instructor: Professor Soheil Ghiasi Outline Overview of M.I.T. Raw processor
More informationA Robust Compressed Sensing IC for Bio-Signals
A Robust Compressed Sensing IC for Bio-Signals Jun Kwang Oh Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2014-114 http://www.eecs.berkeley.edu/pubs/techrpts/2014/eecs-2014-114.html
More informationMapping real-life applications on run-time reconfigurable NoC-based MPSoC on FPGA. Singh, A.K.; Kumar, A.; Srikanthan, Th.; Ha, Y.
Mapping real-life applications on run-time reconfigurable NoC-based MPSoC on FPGA. Singh, A.K.; Kumar, A.; Srikanthan, Th.; Ha, Y. Published in: Proceedings of the 2010 International Conference on Field-programmable
More informationAsynchronous on-chip Communication: Explorations on the Intel PXA27x Peripheral Bus
Asynchronous on-chip Communication: Explorations on the Intel PXA27x Peripheral Bus Andrew M. Scott, Mark E. Schuelein, Marly Roncken, Jin-Jer Hwan John Bainbridge, John R. Mawer, David L. Jackson, Andrew
More informationarxiv: v1 [cs.lg] 5 Mar 2013
GURLS: a Least Squares Library for Supervised Learning Andrea Tacchetti, Pavan K. Mallapragada, Matteo Santoro, Lorenzo Rosasco Center for Biological and Computational Learning, Massachusetts Institute
More information1 Introduction Data format converters (DFCs) are used to permute the data from one format to another in signal processing and image processing applica
A New Register Allocation Scheme for Low Power Data Format Converters Kala Srivatsan, Chaitali Chakrabarti Lori E. Lucke Department of Electrical Engineering Minnetronix, Inc. Arizona State University
More informationSTARTING in the 1990s, much work was done to enhance
1048 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS, VOL. 57, NO. 5, MAY 2010 A Low-Complexity Message-Passing Algorithm for Reduced Routing Congestion in LDPC Decoders Tinoosh Mohsenin, Dean
More informationManish Vachharajani Fall 2004
ECEN 5003-001: Evaluating Processor Microarchitectures Manish Vachharajani Fall 2004 August 23, 2004 ECEN 5003-001 Lecture 4 1 Contribution of Microarchitecture Graph courtesy Sanjay Patel, data courtesy
More informationVLSI Design Automation
VLSI Design Automation IC Products Processors CPU, DSP, Controllers Memory chips RAM, ROM, EEPROM Analog Mobile communication, audio/video processing Programmable PLA, FPGA Embedded systems Used in cars,
More informationNetworks-on-Chip Router: Configuration and Implementation
Networks-on-Chip : Configuration and Implementation Wen-Chung Tsai, Kuo-Chih Chu * 2 1 Department of Information and Communication Engineering, Chaoyang University of Technology, Taichung 413, Taiwan,
More informationLogic and Physical Synthesis Methodology for High Performance VLIW/SIMD DSP Core
for High Performance VLIW/SIMD DSP Core Jagesh Sanghavi, Helene Deng, Tony Lu Tensilica, Inc sanghavi@tensilica.com ABSTRACT We describe logic and physical synthesis methodology to achieve timing closure
More informationDesign and Simulation of Router Using WWF Arbiter and Crossbar
Design and Simulation of Router Using WWF Arbiter and Crossbar M.Saravana Kumar, K.Rajasekar Electronics and Communication Engineering PSG College of Technology, Coimbatore, India Abstract - Packet scheduling
More informationDSP Flow for SmartFusion2 and IGLOO2 Devices - Libero SoC v11.6 TU0312 Quickstart and Design Tutorial
DSP Flow for SmartFusion2 and IGLOO2 Devices - Libero SoC v11.6 TU0312 Quickstart and Design Tutorial Table of Contents Introduction... 3 Tutorial Requirements... 3 Synphony Model Compiler ME (Microsemi
More informationThree-Dimensional Integrated Circuits: Performance, Design Methodology, and CAD Tools
Three-Dimensional Integrated Circuits: Performance, Design Methodology, and CAD Tools Shamik Das, Anantha Chandrakasan, and Rafael Reif Microsystems Technology Laboratories Massachusetts Institute of Technology
More information