RETARGETABLE CODE GENERATION FOR DIGITAL SIGNAL PROCESSORS

Rainer LEUPERS
University of Dortmund
Department of Computer Science
Dortmund, Germany

Springer-Science+Business Media, B.V.

A C.I.P. Catalogue record for this book is available from the Library of Congress

ISBN 978-1-4419-5181-6
ISBN 978-1-4757-2570-4 (ebook)
DOI 10.1007/978-1-4757-2570-4

Printed on acid-free paper

All Rights Reserved
1997 Springer Science+Business Media Dordrecht
Originally published by Kluwer Academic Publishers in 1997
Softcover reprint of the hardcover 1st edition 1997

No part of the material protected by this copyright notice may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without written permission from the copyright owner.

CONTENTS

FOREWORD
PREFACE

1 INTRODUCTION
  1.1 Design automation for VLSI
  1.2 HW/SW codesign of embedded systems
  1.3 Embedded software development
  1.4 DSP algorithms and architectures
  1.5 Problems and solution approach
  1.6 Overview of related work
  1.7 Goals and outline of the book

2 PROCESSOR MODELLING
  2.1 MIMOLA language elements
  2.2 The MSSQ compiler
  2.3 Application studies

3 INSTRUCTION-SET EXTRACTION
  3.1 Processor description styles
  3.2 Analysis of control signals
  3.3 Binary decision diagrams
  3.4 Instruction-set model
  3.5 Internal processor model
  3.6 Behavioral analysis
  3.7 Structural analysis
  3.8 Postprocessing
  3.9 Experimental results
  3.10 ISE as a validation procedure

4 CODE GENERATION
  4.1 Target architecture styles
  4.2 Program representations
  4.3 Related work
  4.4 The code generation procedure
  4.5 DFL language elements
  4.6 Intermediate representation
  4.7 Code selection by tree parsing
  4.8 RT scheduling

5 INSTRUCTION-LEVEL PARALLELISM
  5.1 Address generation in DSPs
  5.2 Generic AGU model
  5.3 Addressing scalar variables
  5.4 Arrays and delay lines
  5.5 Code compaction

6 THE RECORD COMPILER
  6.1 Retargetability
  6.2 Code quality

7 CONCLUSIONS
  7.1 Contributions of this book
  7.2 Future research

REFERENCES
INDEX

FOREWORD

According to market analysts, the market for consumer electronics will continue to grow at a rate higher than that of electronic systems in general. The consumer market can be characterized by rapidly growing complexities of applications and a rather short market window. As a result, more and more complex designs have to be completed in shrinking time frames. A key concept for coping with such stringent requirements is re-use. Since the re-use of completely fixed large hardware blocks is limited to subproblems of system-level applications (for example MPEG-2), flexible, programmable processors are being used as building blocks for more and more designs. Processors provide a unique combination of features: they provide flexibility and re-use.

The processors used in consumer electronics are, however, in many cases different from those that are used for screen and keyboard-based equipment, such as PCs. For the consumer market in particular, efficiency of the product plays a dominating role. Hence, processor architectures for these applications are usually highly optimized and tailored towards a certain application domain. Efficiency is more important than other characteristics, such as regularity or orthogonality. For example, multiply-accumulate instructions are very important for digital signal processing (DSP). Most DSP processors use a special register (the accumulator) for storing partial sums. Orthogonality would require any register to be usable for partial sums, but this would possibly require another register field in the instruction and potentially lengthen the critical path for that instruction.

In the light of the importance of efficiency, compiler techniques should be extended to handle features that contribute to the efficiency of embedded processors. The contribution by Rainer Leupers is one of the few recent contributions aiming at providing efficient (in the sense of efficient code) compilers for embedded processors. These contributions show that efficient compilation is feasible, if the special characteristics of certain application domains (such as DSP) are exploited.

In this contribution, Rainer Leupers describes how code generators for compiling DSP programs onto DSP processors can be built. The work includes a number of optimization algorithms which aim at making high-level programming for DSP applications practical. It can be expected that these and similar optimizations will be included in many future compilers for DSP processors. It can also be expected that the whole area of optimizing techniques will receive enormous attention in the future, possibly also stimulated by Leupers' contribution.

A key aspect of the approach developed by Rainer Leupers is the automatic generation of code generators from hardware descriptions of the target processor. Hence, retargetability is achieved. Retargetability is the ability to quickly generate new tools for new target processors. Why is retargetability desirable? Due to the interest in efficient processors, domain-specific or even application-specific processors are of interest. This means that a significant number of different architectures and instruction sets is expected to exist. Retargetability makes the generation of compilers for these architectures economically feasible.

In the approach developed by Leupers, compilers can be generated from a description in a hardware description language. This approach is hence the first that really bridges the gap between the ECAD and compiler worlds. This reduces the number of component models that are required and enables new applications such as analyzing the effect of hardware changes with respect to different cost and performance metrics. This also enables a better understanding of the software/hardware interface.

According to common belief, retargetable compilers are not as efficient as target-specific compilers. Nevertheless, Leupers' RECORD compiler outperforms a target-specific compiler for a standard DSP on the majority of the benchmark examples.

The design of the compiler generation system required background knowledge about similar previous work. Some of this knowledge was available in the department Leupers was affiliated with. Much of this knowledge came from earlier projects on compiler generators. Some key concepts, such as instruction conflicts and alternative code covers, could be kept. Many others are new. It therefore makes me happy to know that these solutions will be generally available in the form of this book.

Dortmund, April 1997
P. Marwedel

PREFACE

This book responds to the demand for a new class of CAD tools for embedded VLSI systems, located at the edge between software compilation and hardware design. Compilation of efficient machine code for embedded processors cannot be done without a close view of the underlying hardware architecture. In particular this holds for digital signal processors (DSPs), which show highly specialized architectures and instruction sets. Very little attention has been paid so far to the development of high-level language compilers for DSPs, which is also expressed in the following statement by Hennessy and Patterson [HePa90]:

"Digital signal processors are not derived from the traditional model of computing, and tend to look like horizontal microprogrammed machines or VLIW machines. They tend to solve real-time problems, essentially having an infinite-input data stream. There has been little emphasis on compiling from programming languages such as C, but that is starting to change. As DSPs bend to the demands of programming languages, it will be interesting to see how they differ from traditional microprocessors."

As a result, the code quality achieved by commercial compilers for DSPs is still far from satisfactory. In addition, current compiler technology does not support frequent changes of the target processor, which would be necessary for effective hardware-software codesign environments.

In this book, new techniques for retargetable and optimizing compilers for DSPs are presented and put into context with related work. The whole compilation process is covered, including target processor capture, intermediate code generation, code selection, register allocation, scheduling and optimizations for parallelism.

This research monograph is a revised version of my doctoral thesis, which was submitted to the Department of Computer Science at the University of Dortmund (Germany) in November 1996. I would like to thank my advisor Prof. Dr. Peter Marwedel and my co-referee Prof. Dr. Ernst-Erich Doberkat for their efforts and valuable comments. Furthermore, I would like to thank my colleagues at the "LS XII" division of the Computer Science department at the University of Dortmund. Particular support has been provided by Steven Bashford, Birger Landwehr, Wolfgang Schenk, and Ingolf Markhof. Dr. Detlef Sieling gave important hints concerning binary decision diagrams. I also gratefully acknowledge the help of Dr. Jef van Meerbergen (Philips Research Labs, Eindhoven, The Netherlands) and Vojin Zivojnovic (Technical University of Aachen, Germany), who contributed material for application studies.

Last but not least, I would like to thank my family for providing some technical and so much mental support and encouragement. I dedicate this book to my ladies Bettina, Helen, and Paulina.

Dortmund, April 1997
Rainer Leupers