Engineer-to-Engineer Note

Engineer-to-Engineer Note EE-328

Technical notes on using Analog Devices DSPs, processors and development tools. Visit our Web resources http://www.analog.com/ee-notes and http://www.analog.com/processors or e-mail processor.support@analog.com or processor.tools.support@analog.com for technical support.

Migrating from 2106x/2116x to 2126x/2136x/2137x SHARC Processors

Contributed by Divya Sunkara
Rev 1, July 27, 2007

Introduction

This EE-Note highlights relevant details when migrating a system design from 2106x or 2116x SHARC processors to 2126x, 2136x, or 2137x SHARC processors. While all SHARC processors are code compatible and have similar core and peripheral architecture, some key differences in the processor cores, internal memory operation, external memory access, and peripherals configuration can provide challenges to a successful migration.

It is important to note that this document identifies migration issues, provides guidelines for resolution, and refers to product documentation (data sheets, processor hardware reference books, and tools manuals) for detailed information. To implement the guidelines in this EE-Note, you will need to use the product documentation.

Table 1 compares processor features. Some of these feature comparisons identify easy-to-spot differences between processors (such as different operating voltages) that must be accommodated when migrating the system design. These types of migration issues are not the focus of this note. Instead, this note focuses on more subtle, less obvious feature differences (such as different pipeline depths, which relate to stall issues) that must be addressed when optimizing the system design migration.

This EE-Note addresses the following migration issues:

- Internal Memory Access
- Pipeline Depth
- SISD/SIMD Program Execution
- PLL Configuration
- External Memory Access
- External Port Throughput
- SPORT Feature Differences
- DAI/SRU Programming
- DMA/IOP Usage
- Interrupt Vector Table Setup
- Power Dissipation Calculations

Copyright 2007, Analog Devices, Inc. All rights reserved. Analog Devices assumes no responsibility for customer product design or the use or application of customers' products or for any infringements of patents or rights of others which may result from Analog Devices assistance. All trademarks and logos are property of their respective holders. Information furnished by Analog Devices applications and development tools engineers is believed to be accurate and reliable, however no responsibility is assumed by Analog Devices regarding technical accuracy and topicality of the content provided in Analog Devices Engineer-to-Engineer Notes.

Feature                       21060/     21065L     21160/     21261      21262      21266[1]   21362/     21367[1]/  21371/
                              21061/                21161                                       21363/     21368/     21375
                              21062                                                             21364/     21369
                                                                                                21365[1]/
                                                                                                21366[1]
Max Freq. (MHz)               40         66         100        150        200        150/200    333        400        266
Core Voltage (3.3V I/O)       3.3[2]     3.3        1.8        1.2        1.2        1.2        1.2        1.3        1.2
Dual-/Single-Ported RAM       Dual       Dual       Dual       Dual       Dual       Dual       Single     Single     Single
Int. Mem. Mbits (RAM/ROM)     4/0        0.5/0      1/0        1/3        2/4        2/4        3/4        2/6        1/4, 0.5/2
Pipeline Depth                3          3          3          3          3          3          5          5          5
SISD/SIMD                     SISD       SISD       SIMD       SIMD       SIMD       SIMD       SIMD       SIMD       SIMD
PLL Config.                   XTAL only  XTAL only  H/W only   H/W+S/W    H/W+S/W    H/W+S/W    H/W+S/W    H/W+S/W    H/W+S/W
Ext. Port (A/D)               32/48      24/32      24/32      n/a        n/a        n/a        n/a        24/32      24/32[3]
Ext./Par. Port Throughput[4]  160M       264M       200M       66M        66M        66M        55M        222M       176M[3]
Execute External              Yes        Yes        Yes        n/a        n/a        n/a        n/a        n/a        Yes
Parallel Port (muxed A/D)     n/a        n/a        n/a        16         16         16         16         n/a        n/a
MP/Shared Memory              Yes        Yes        Yes        n/a        n/a        n/a        n/a        Yes[5]     n/a
SDRAM Controller              n/a        Yes        Yes        n/a        n/a        n/a        n/a        Yes        Yes
SPORTs (duplex)               2 (full)   2 (full)   4 (full)   4 (half)   6 (half)   6 (half)   6 (half)   8 (half)   8 (half)[3]
I2S Support                   n/a        Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes
Link Ports                    up to 6    n/a        up to 4    n/a        n/a        n/a        n/a        n/a        n/a
BGA (Balls)                   225        225        225        136        136        136        136        256        n/a
LQFP (leads)                  n/a        n/a        n/a        144        144        144        144        208        208

Table 1. SHARC processor feature comparison

[1] Processor includes audio-specific peripherals and on-chip factory-programmed ROM. IP holder license agreement required.
[2] The 21060/21061/21062 are also available in 5.0V versions.
[3] Value shown applies to the 21371. The 21375 external data bus is 16 bits wide, the 21375 throughput is 88 Mbytes/s, and the 21375 has four SPORTs.
[4] External Port throughput is estimated for data accesses over a 32-bit-wide data bus. See External Port Throughput for details.
[5] Shared memory is available on the 21368. See External Memory Access for details.

Internal Memory Access

The Dual-/Single-Ported RAM row in Table 1 identifies an important difference between 2106x/2116x and 2126x/2136x/2137x SHARC processors. This memory access architecture difference can greatly influence the success of a design migration.

Access on Legacy SHARC Processors

The internal memory of 2106x/2116x SHARC processors (referred to as legacy SHARC processors) has a dual-ported memory structure with two (2) memory blocks accessible by any two (2) of the program memory (PM), data memory (DM), and I/O buses in the same memory cycle. On legacy SHARC processors, a PM access and a DM access to the same block rely on the instruction cache to provide single-cycle throughput after the first iteration of a looped instruction. The PM or DM and the I/O bus can access either of the two blocks in the same core clock cycle. The I/O bus used by the DMA controller provides legacy SHARC processors with core access to memory-mapped IOP registers used to control peripherals. These processors allow mixing code and data segments in both blocks, with core stalls for PM and DM memory block conflicts.

Access on Newer SHARC Processors

The internal memory of 2126x/2136x/2137x SHARC processors (referred to as newer SHARC processors) permits similar mixing of code and data segments across any internal memory block and allows DMA and core access of any block, as is available on legacy SHARC processors. A crucial difference between the two architectures is that the dual-ported internal memory blocks in the legacy SHARC processors, which prevent memory block conflicts between the core (PM or DM bus) and the IOP (I/O bus), are not available in the newer SHARC processors. Instead, newer SHARC processors provide four single-ported memory blocks. Because the memory blocks are single ported, there is an additional memory block conflict when the core and IOP attempt to access the same memory block in the same cycle. The extra two blocks of memory, as compared to legacy SHARC processors, are intended to help avoid this type of memory block conflict.

The SHARC processors handle memory block conflicts as follows:

- On all SHARC processors (legacy and newer), a conflict between DM and PM access is always resolved in favor of DM, with the PM access occurring in the second cycle.
- On newer SHARC processors, a conflict between DM/PM and I/O is resolved in favor of I/O accesses. Because the I/O bus runs at half the core clock frequency (CCLK), I/O accesses are requested at a maximum rate of once in two core clock cycles. This provides fair sharing of memory access to the core and I/O buses.

The I/O bus is used in core accesses of memory-mapped IOP registers, used to configure peripherals, and by the DMA controller to transfer data to/from memory and peripherals. The I/O bus is also used by the DMA controller to access Transfer Control Blocks (TCBs) for DMA chaining, as TCBs are stored in internal memory.

Despite the potential for block access conflicts, with some forethought and analysis, system designers can use memory with full performance by following these guidelines:

- Use the default linker description file (.LDF) as the starting point for describing system memory and placing program/data.
- If performance becomes an issue due to conflict-caused stalls, do the following:
  - Place code and data in separate blocks whenever possible.
  - Allow DMA (data buffers and TCBs if applicable) to use a block not being used by the core.
  - Allow DMA to ping-pong between memory blocks instead of within a block.
  - Use the PM bus for instructions only.

The Memory chapter of the 2136x SHARC Processor Programming Reference includes information on these conflict-caused stalls and provides diagrams describing the use of the buses with the internal memory blocks.

Pipeline Depth

The Pipeline Depth row in Table 1 identifies a difference among SHARC processors that affects program execution performance. To accommodate faster memory and processor core speeds, the 2136x and 2137x SHARC processors changed from a 3-stage pipeline to a 5-stage pipeline. Table 2 shows a comparison.

Stage   2106x, 2116x, 2126x   2136x, 2137x
1       Fetch                 Fetch 1
2       Decode                Fetch 2
3       Execute               Decode
4       N/A                   Address
5       N/A                   Execute

Table 2. Pipeline structure comparison

This increase in pipeline depth introduces some slight changes in behavior. This behavior is seen only in short loops and in the latency in interrupts and branches. For developers porting code from the 3-stage pipeline SHARC processors, it is important to note the following migration issues relating to Stalls, Hardware Loops, and Latencies, which stem from increased pipeline depth.

Stalls

Potential sources of stalls include:

- DAG register load to usage in address generation:
    M0 = 1; DM(I2,M0) = R1;   /* 2 cycle stall */
- DAG register load to usage in indirect jump/call:
    M0 = 1; JUMP (M0,I1);   /* 2 cycle stall */
- Post- to pre-modify using the same index register:
    dm(i0,m1) = R1; R2 = dm(-1,i0);   /* 1 cycle stall */
- Ureg load to start of H/W counter-based loop:
    USTAT1 = 0x5; LCNTR = USTAT1, do ( ) until LCE;   /* 1 cycle stall */
- Compute to usage of generated condition:
    R0 = R0 - 1; if ne jump BEGIN_OF_LOOP;   /* 1 cycle stall */

Hardware Loops

The following are additional cases of short loops that incur stalls. To achieve no-overhead loops (eliminate all stall cycles), apply these guidelines:

- A loop of length one must iterate at least four times.
- A loop of length two must iterate at least two times.
- A loop of length three must iterate at least two times.
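When reviewing ported code, the short-loop guidelines above can be applied mechanically. The following Python sketch is only an illustration of that check: the names `MIN_ITERATIONS` and `loop_incurs_stall` are hypothetical, and the note gives no guideline for loops longer than three instructions, so the sketch reports those as "not covered" rather than guessing.

```python
# Minimum iteration counts for stall-free short hardware loops on the
# 5-stage-pipeline processors, copied from the guidelines above.
# Keys are loop lengths in instructions.
MIN_ITERATIONS = {1: 4, 2: 2, 3: 2}


def loop_incurs_stall(length, iterations):
    """Classify a counter-based loop against the short-loop guidelines.

    Returns True if the loop iterates fewer times than the guideline
    requires, False if it meets the guideline, and None when the loop
    is longer than three instructions (not covered by the guidelines).
    """
    if length < 1 or iterations < 1:
        raise ValueError("length and iterations must be positive")
    minimum = MIN_ITERATIONS.get(length)
    if minimum is None:
        return None
    return iterations < minimum


if __name__ == "__main__":
    # Hypothetical loops found in code being ported: (length, iterations).
    for length, iterations in [(1, 3), (1, 4), (2, 2), (3, 1), (8, 1)]:
        print(length, iterations, loop_incurs_stall(length, iterations))
```

A loop flagged `True` is a candidate for unrolling or for raising its iteration count; the Hardware Reference remains the authority on the exact stall behavior.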

Latencies

Interrupts and jumps/calls have different latencies due to pipeline lengthening. For interrupts, the lengthened pipeline causes some response latency:

- 5 cycles if forced by a write to IRPTL
- 6 cycles if generated by hardware

For jumps/calls, there also are some related latency issues:

- Immediate branch: 3 cycles
- Delayed branch: 1 cycle
- Conditional branch: 4 if taken, 0 if not taken

SISD/SIMD Program Execution

One of the issues that can greatly improve performance during migration is updating the system to SIMD execution. The SISD/SIMD row in Table 1 identifies the SHARC processors that support SIMD.

2106x SHARC processors are single-instruction, single-data (SISD) machines. Their single processing element and SHARC architecture provide up to 66 MIPS, 66 MMACS, and 132 MFLOPS of performance.

2116x, 2126x, 2136x, and 2137x SHARC processors support single-instruction, multiple-data (SIMD) execution. This architecture enhancement provides a second identical processing element, which can effectively double performance. For example, when combined with the increased (266 MHz) core instruction rate, 21375 processors can perform at 266 MIPS, 533 MMACS, and 1.596 GFLOPS.

Several EE-Notes describe how to implement SIMD operation, including:

- Extended-Precision Fixed-Point Arithmetic on SIMD SHARC Processors (EE-270)
- Implementing In-Place FFTs on SISD and SIMD SHARC Processors (EE-267)

PLL Configuration

With the increase of processor speed when moving from legacy SHARC processors to newer SHARC processors, support for greater flexibility in clock and phase-locked loop (PLL) configuration became more important. The PLL Config. row in Table 1 identifies the SHARC processors that support PLL configuration through clock crystal input (XTAL only), clock crystal input plus clock ratio selection pins (H/W only), or clock crystal input with clock ratio selection pins plus software configuration (H/W+S/W).

Clock Control on Earliest SHARC Processors

On 2106x SHARC processors, the CLKIN input frequency from a microprocessor-grade clock crystal provides the clock input for processor core operation directly. One member of this processor family (the 21065L processor) doubles the input frequency, running the processor core at twice the frequency of the CLKIN input. Unlike 2116x, 2126x, 2136x, and 2137x SHARC processors, 2106x SHARC processors do not have clock configuration pins to program an internal phase-locked loop (PLL) and thus the core clock rate.

Clock Control on Later SHARC Processors

On 2116x SHARC processors, an on-chip PLL provides a clean clock for the processor core. The ratio between the CLKIN input and the PLL output (to the processor core) is controlled by setting external clock configuration (CLKCFG) pins. The state of these CLKCFG pins defines an effective multiply ratio, yielding a desired core clock (CCLK) rate from a slower, readily-available crystal or crystal oscillator (XO). The PLL locks to the CLKIN source and provides the requested CCLK rate just after startup. The CLKCFG pins can only be selected while the SHARC processor is in a reset state. One member of this processor family (21161 processors) includes a clock doubling (CLKDBL) pin, which multiplies the CLKIN input by a factor of 2 before the 2x clock source passes through the internal PLL.

Clock Control on Newer SHARC Processors

On 2126x, 2136x, and 2137x SHARC processors, a software-configurable PLL is available, offering greater flexibility in core clock frequency control. In addition to the CLKCFG pins, the PLL on these processors is configurable using software, permitting a choice between relying on the CLKCFG pin settings or applying an additional set of multipliers and divisors, giving a wider range of granularity than the three (3) ratios supplied by the CLKCFG pins alone. This software clock control can be applied during an initialization routine or anytime the processor is operating (not in a reset state).

Clock Control Guidelines

It is important to understand some details about PLL programmability to ease migration from earlier SHARC processors:

- PLL headroom limit (this limit affects 2126x, 2136x, and 2137x SHARC processors): Use the initial divisor (INDIV) bit in PMCTL to divide the CLKIN source by two (2) before passing it to the PLL input. When the INDIV bit is cleared, CLKIN * PLLM should be <400 MHz. When the INDIV bit is set, CLKIN * PLLM should be <800 MHz.
- Use the divisor enable (DIVEN) bit to cause the PLL to lock using the PLLD divisor value that has been entered. Remember to clear the DIVEN bit when setting the PLL in bypass mode.

Refer to Managing the Core PLL on Third-Generation SHARC Processors (EE-290) and/or the System Design chapter of the appropriate Hardware Reference for the SHARC processor being used.
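The headroom limit in the guidelines above is easy to check before committing a clock plan. The Python sketch below is a planning aid under stated assumptions, not part of any Analog Devices tool: the function name `pll_headroom_ok` and the sample CLKIN and PLLM values are hypothetical, and the sketch checks only the two limits quoted in this note. Other constraints (valid PLLM ranges, maximum CCLK for the speed grade) must still be taken from the data sheet.

```python
# Headroom limits on CLKIN * PLLM quoted in the guidelines above (MHz).
LIMIT_INDIV_CLEARED_MHZ = 400.0
LIMIT_INDIV_SET_MHZ = 800.0


def pll_headroom_ok(clkin_mhz, pllm, indiv_set):
    """Return True if CLKIN * PLLM is strictly below the quoted limit.

    clkin_mhz -- CLKIN frequency in MHz
    pllm      -- PLL multiplier programmed in PMCTL
    indiv_set -- True if the INDIV bit (CLKIN divide-by-two) is set
    """
    if clkin_mhz <= 0 or pllm <= 0:
        raise ValueError("clkin_mhz and pllm must be positive")
    limit = LIMIT_INDIV_SET_MHZ if indiv_set else LIMIT_INDIV_CLEARED_MHZ
    return clkin_mhz * pllm < limit


if __name__ == "__main__":
    # Illustrative values only, not recommended settings.
    print(pll_headroom_ok(24.576, 16, indiv_set=False))
    print(pll_headroom_ok(25.0, 16, indiv_set=False))
    print(pll_headroom_ok(25.0, 16, indiv_set=True))
```

With a 25 MHz CLKIN and a multiplier of 16, the product is exactly 400 MHz, which fails the strict limit with INDIV cleared and passes with INDIV set.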
External Memory Access

Depending on the amount of data a system stores in external memory and whether the system executes instructions directly from external memory, external memory access can be one of the most challenging issues for system migration. Looking at Table 1, the following rows identify external memory access features that affect system migration:

- Ext. Port (Add/Data): width of address and data buses extended off-chip
- Execute Ext.: support for program execution from external memory
- Parallel Port (muxed A/D): width of multiplexed external address and data bus
- MP/shared memory: support for multiprocessor and shared memory access
- SDRAM Controller: glueless support for external SDRAM in a system

Access on Legacy SHARC Processors

Legacy SHARC processors (2106x and 2116x processors) have external ports with dedicated data and address pins, which extend the 32-bit data bus and the 24-bit address bus off-chip. These legacy SHARC processors have a variety of feature support in their external ports. One key feature that influences migration is support for data packing (automatically accommodating differences between internal on-chip bus width and external off-chip bus width). These processors can be configured to work with 16-, 32-, or (in some cases) 64-bit external data buses. For example, 21160 processors have a 64-bit data bus, but can be configured to operate with a 16-, 32-, or 64-bit external bus with data packing. Another example of flexible external port support is that 21161 processors can apply unused link port data pins (when the link ports are disabled) to widen their external data bus to 48 bits, allowing direct execution from external space without the need for instruction packing. Among legacy SHARC processors, note that the 21065L, 21160, and 21161 processors can execute instructions from external memory space.

Combined with some memory control signals (/RD, /WR, ACK, and /MSx), the external port on legacy SHARC processors allows system-friendly connectivity to common SRAM devices as well as parallel DACs and ADCs.

Some legacy SHARC processors (21065L and 21161 processors) include SDRAM controller functionality within their external port. This functionality adds external control signals for SDRAM devices (/RAS, /CAS, DQM, /SDWE, SDCLK0-1, SDCKE, and SDA10) and internal logic to manage the startup and refresh needs of the SDRAM device.

Multiprocessor-based clustering and host processor support are available on most legacy SHARC processors. These features consist of additional external port signals (/HBR, /HBG, ID2-0, and /BRx). These signals allow multiple SHARC processors to arbitrate for access to common external memory and/or specified segments of each other's internal memory, referred to as multiprocessor memory space. These multiprocessor control signals also allow a host processor to access each SHARC processor's memory-mapped I/O processor space.

Access on Newer SHARC Processors

There are two types of external ports on newer SHARC processors. Some of the newer SHARC processors (2126x and 21362/3/4/5/6) have a parallel port (simpler support than a full external port), which uses fewer external pins. Other newer SHARC processors (21367/8/9 and 21371/5) have full external ports. It is important to understand the difference between these two types of ports and how the difference in support may affect system migration.

A parallel port on a newer SHARC processor is 16 bits wide and uses a multiplex scheme to share the address and data signals on the same external pins (AD15-0). This parallel port feature is important because:

- Multiplexing of address and data pins on the port requires an external latch (glue logic) to address external memory (SRAM).
- The port cannot support SDRAM usage.
- The port cannot support execution of instructions from external memory space.
- The port cannot support host or multiprocessor access.
- The port cannot support processor core access directly to external memory. All port access to external memory is accomplished using DMA. The port's control and DMA setup registers can be read/written by the core, permitting the DMA through the parallel port.

Because of this last limitation on parallel port support, the software development tools provide a separate external memory window (External Data (DM) Byte Memory) to display external memory data. Starting with the VisualDSP++ 4.5 development tools, the DMA ONLY specifier was introduced for external memory segments on these processors.

The 21367/8/9 and 21371/5 SHARC processors have a more robust external port that returns to much of the functionality provided on 2106x and 2116x processors. Specifically, the external port on these newer SHARC processors has the dedicated address and data pins, and the external port control signals (/RD, /WR, ACK, and /MSx). Note that the external port data bus width on these newer SHARC processors is 32 bits, except on the 21375 processor, which has a 16-bit data bus. While the external port on these newer SHARC processors provides no host or multiprocessor functionality, these processors do include an SDRAM controller.

There is some variety in the external port features among these processors. For example, 21371/5 processors support program execution from external memory (Bank 0 only). Also, 21368 processors provide some shared memory support: a common bank of SRAM or SDRAM among four 21368 SHARC processors. Additional external signals (ID1-0 and /BRx) arbitrate among the 21368 processors for the bus.

Unlike legacy SHARC processors with SDRAM control, the 21367/8/9 processors' SDRAM controller does not have pins to drive the DQM pins on typical SDRAM devices. The SDRAM's DQM pin can be tied active when connected to a 21367/8/9 processor. For more information on using SDRAM with newer SHARC processors, see the appropriate SHARC processor and SDRAM device data sheets and see Interfacing SDRAM Memory to 21368 and 2137x SHARC Processors (EE-286).

External Port Throughput

Because data throughput using the external port can greatly influence system design, migration planning should include a comparison of this feature of legacy SHARC processors and newer SHARC processors. Table 1 lists this feature in the Ext./Par. Port Throughput row.

Data Throughput on Legacy SHARC Processors

Calculations of external port throughput for legacy SHARC processors are defined using the speed at which the external port is timed. For example, a 21160 processor external port functions at the CLKIN rate (maximum rate of 50 MHz).
Calculating the data throughput for this processor (assuming a 32-bit-wide external bus) yields:

\[ \text{Throughput} = \frac{4\ \text{bytes}}{1} \times \frac{50 \times 10^{6}}{\text{s}} = 200\ \text{Mbytes/s} \]

It is important to note that 2116x processors also support a 64-bit external bus. Using the whole 64-bit data bus, this yields a throughput of 400 Mbytes/s.

Data Throughput on Newer SHARC Processors

Calculations of external port throughput for newer SHARC processors are not defined as simply as on legacy SHARC processors; the external port throughput does not scale with the increase in core clock speed as compared with the legacy SHARC processors. Looking at the throughput comparison in Table 1 shows that the newer SHARC processors' external port throughput does not differ greatly from legacy SHARC processors.

On 21369 processors, the throughput calculation for the external port stems from:

- Using a 32-bit bus running at 166 MHz (CCLK/2 = 333/2 = 166 MHz)
- External accesses that require three cycles to complete
- Calculating the effective external bus speed as 166/3 (55.3 MHz)

All of which provides a calculation of:

\[ \text{Throughput} = \frac{4\ \text{bytes}}{1} \times \frac{55.3 \times 10^{6}}{\text{s}} \approx 222\ \text{Mbytes/s} \]

On some of the newer SHARC processors (21367/8/9), asynchronous accesses occur at the speed of the external port logic, which is clocked by the SDCLK even though the accesses are asynchronous. At 333-MHz CCLK, the fastest SDCLK selectable is 166 MHz (CCLK/2).

If the system uses external SRAM only, the external port throughput for a 21367/8/9 at 400 MHz is slightly better. With an effective external bus speed of 200/3 (66.667 MHz), the calculation is:

\[ \text{Throughput} = \frac{4\ \text{bytes}}{1} \times \frac{66.67 \times 10^{6}}{\text{s}} \approx 266\ \text{Mbytes/s} \]

This technique is used in the 21367/8/9 SHARC processor data sheet to derive the throughput, but it is important to know that using SDRAM in the system (for example, in Bank 0) means this number is not realistic. If the system includes SDRAM, the above calculation cannot apply. Systems with SDRAM use the CCLK:SDCLK ratio, which is set to 2:1 for a 333-MHz CCLK and a 166-MHz SDCLK. This means that the asynchronous transfers (even in other banks) run at an external bus speed of 166/3 (55.3 MHz), yielding a throughput of approximately 220 Mbytes/s.

Data Throughput on Parallel Ports

2126x and 21362/3/4/5/6 processors have a simplified parallel port, rather than including a full external port. It is important to note the parallel port throughput if it affects migration planning. For example, throughput of the parallel port is 66 Mbytes/s on 2126x processors and 55 Mbytes/s on 21362/3/4/5/6 processors. The throughput difference (despite the later processors' faster speed grade) is due to a change in peripheral clock derivation. On the 2126x, the peripheral clock (and therefore the parallel port) runs at the CCLK rate; on 21362/3/4/5/6 SHARC processors, it runs at one half the CCLK rate.

Data Throughput Summary

Table 3 summarizes the best throughput for data access in the various SHARC processors. Note that 21367/8/9 processor SDRAM write throughput is dependent upon whether the core or DMA controller manages the access.
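The external port throughput estimates in this section all follow one pattern: bytes per access, times the port clock, divided by the number of port cycles each access takes. The Python sketch below restates that arithmetic so the figures can be recomputed for other clock rates. The function name `ext_port_throughput_mbytes` is hypothetical, and the model deliberately ignores SDRAM refresh, page misses, and bus arbitration, so it gives upper-bound estimates only.

```python
def ext_port_throughput_mbytes(bus_bits, port_clock_mhz, cycles_per_access=1):
    """Estimate external port throughput in Mbytes/s.

    bus_bits          -- external data bus width in bits (multiple of 8)
    port_clock_mhz    -- clock that times the external port, in MHz
    cycles_per_access -- port clock cycles needed for one access
    """
    if bus_bits <= 0 or bus_bits % 8 != 0:
        raise ValueError("bus_bits must be a positive multiple of 8")
    if port_clock_mhz <= 0 or cycles_per_access <= 0:
        raise ValueError("clock and cycle count must be positive")
    bytes_per_access = bus_bits // 8
    return bytes_per_access * port_clock_mhz / cycles_per_access


if __name__ == "__main__":
    # 21160: 32-bit bus timed at CLKIN = 50 MHz, one cycle per access.
    print(ext_port_throughput_mbytes(32, 50))
    # 21160 using the full 64-bit bus.
    print(ext_port_throughput_mbytes(64, 50))
    # 21369: 32-bit bus at CCLK/2 with CCLK = 333 MHz, three cycles per access.
    print(round(ext_port_throughput_mbytes(32, 333 / 2, 3), 1))
    # 21367/8/9 at 400 MHz with SRAM only: 200 MHz port clock, three cycles.
    print(round(ext_port_throughput_mbytes(32, 400 / 2, 3), 1))
```

Using CCLK/2 = 166.5 MHz without rounding reproduces the 222 Mbytes/s listed in Table 1; rounding the port clock down to 166 MHz first gives the slightly lower 55.3 MHz effective rate quoted in the text.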
2116x processors also support access to synchronous burst SRAM (SBSRAM) devices, yet 2126x and 2136x processors do not provide this support.

Processor      Access Type                 Oper.   Page   Throughput per EPort Cycles (not CCLK)
21065L/21161   Sequential, uninterrupted   Read    Same   1 word per cycle
                                           Write   Same   1 word per cycle
21367/8/9      Sequential, uninterrupted   Read    Same   32 words per 37 cycles
                                           Write   Same   Core: 1 word per cycle; DMA: 1 word per 2 cycles
2137x          Sequential, uninterrupted   Read    Same   32 words per 37 cycles
                                           Write   Same   1 word per cycle

Table 3. Data throughput comparison

Instruction Packing and Throughput

The external port's instruction packing lets the processor fetch instructions from external memory. Support for external execution is available on the 21065L, 21161, and 2137x processors only.

21065L and 21161 instruction packing features include:

- 48-bit instructions in 32-bit external memory
- Two (2) CLKIN or external port cycles per instruction; for example, two (2) 32-bit locations per instruction. This manner of packing instructions into 32-bit memory wastes 16 bits of memory per instruction.

21161 instruction packing features also include:

- Support for 48-bit instructions in 48-bit external memory (using unused link port data pins when link ports are disabled), or packed instructions in 32-bit external memory, 16-bit external memory, and 8-bit external memory
- One (1) CLKIN or external port cycle per instruction in 48-bit-wide memory
- Two (2) CLKIN or external port cycles per instruction in 32-bit-wide memory (wastes 16 bits/instruction)
- Four (4) CLKIN or external port cycles per instruction in 16-bit-wide memory (wastes 2 bytes/instruction)
- Eight (8) CLKIN or external port cycles per instruction in 8-bit-wide memory (wastes 2 bytes/instruction)

2137x instruction packing features include:

- 48-bit instructions in 32-bit external memory
- Three (3) SDCLK cycles per 2 instructions; for example, three (3) 32-bit locations per 2 instructions. This is a more efficient manner to pack instructions in 32-bit memory, since two 48-bit instructions fill three 32-bit locations exactly.

SPORT Feature Differences

Synchronous serial port (SPORT) features vary among SHARC processors. The differences stem from the steady increase in functionality and data throughput over the life span of the SHARC family. In Table 1, these differences are highlighted with the SPORTs (duplex) and I2S Support rows, but the differences in features are more subtle and detailed than these rows imply.

The SPORTs on legacy SHARC processors (2106x and 21160 processors) are capable of full-duplex operation. But with feature changes going to the 21161, system designs must use two SPORTs together to implement full-duplex operation. The SPORT data pins on the 21161 and newer SHARC processors (2126x, 2136x, and 2137x) support programmable direction as either inputs or outputs.
This feature is an enhancement over the SPORTs of the original 2106x and 21160 processors, which were fixed transmitters or receivers.

Another feature enhancement is I2S support. The SPORTs on 21065L, 21161, and newer SHARC processors provide support for I2S.

The TDM support within the SPORT has also changed over time. The 21065L and 21161 processors were the first whose SPORTs supported a channel B secondary Tx/Rx pair. On these legacy SHARC processors, the SPORT supports multichannel TDM mode on the primary A channel only. On newer SHARC processors, the SPORT supports TDM mode on the B channel as well, doubling TDM throughput over legacy SHARC processors on paired SPORTs.

TDM channel support has also broadened. The 2106x SPORTs support only up to 32 channels in TDM mode. On 2126x and 2136x SHARC processors, TDM support has been expanded to 128 channels per frame.

The SPORTs on the newest SHARC processors (21367/8/9 and 2137x) have no restrictions regarding which SPORTs are used in TDM mode. Earlier SHARC processors had a pairing scheme in which a particular SPORT was used for transmitting, and a corresponding SPORT had to be used for receiving or, if not needed for receiving, could not be used at all.

Framing error logic is available in the SPORTs of 21367/8/9 processors. This logic can detect frame syncs occurring early and assert an interrupt, even before the previous transmit or receive completes. On the interrupt, the SPERRSTAT register is polled to determine which SPORT suffered the framing error.

The maximum internally generated clock varies on SHARC processors. On the 2106x SHARC processors, the SPORTs ran at up to 50 MHz, with a divide value of 0 in the CLKDIV registers. 2116x processors restrict the maximum serial clock rate to CCLK divided by 2. 2126x processors restrict the maximum clock rate to CCLK divided by 4. 2136x processors restrict the maximum clock rate to CCLK divided by 8.

DAI/SRU Programming

When planning system migration, it is important not just to examine how features used in the earlier design have changed, but also to closely examine completely new features, identifying how these can improve performance in the new system. The DAI/SRU features in the new SHARC processors fall into this category.

With 2126x processors, a new way for peripherals to share pins was introduced: the Signal Routing Unit (SRU). This feature is fully described in the processor Hardware Reference, but it bears mention here because it is important to know that this is an easily programmed group of pins that allows very flexible use of the many peripherals. There are several ways of programming the signal routing, including a GUI plug-in for the VisualDSP++ development tools, manual register manipulation, and a software macro usable in C and assembly. The VisualDSP++ examples use the SRU macro. Note that the GUI plug-in does not come with the VisualDSP++ package. For more information, refer to Using the Expert DAI for 2126x, 2136x and 2137x SHARC Processors (EE-243).

In the past, when routing SPORT clocking signals as outputs, some system design choices have led to signal integrity issues.
Use the information available in the Serial Ports chapter of the 21368 SHARC Processor Hardware Reference to ensure that this issue does not occur during system migration.

DMA/IOP Usage

On the 2137x SHARC processors, the external DMA port has been enhanced to provide support for audio delay lines. Essentially, this feature consists of a chained DMA or block of audio data writes to external memory, followed by reads of samples (taps) for audio playback. This feature was implemented first on 21367/8/9 SHARC processors, but was limited to reading a single sample for each tap. Since each tap in external memory requires an entry in internal memory (the tap list), having an internal word to describe each external word is not space-efficient. This limitation was rectified in the later 2137x SHARC processors by allowing reads of multiple samples for each tap.

When porting chained DMA setup code, keep in mind that, due to the internal memory starting address changes between SHARC processors, the position of the PCI bit in the chain pointer register differs between newer SHARC processors and legacy SHARC processors. Porting DMA code without altering the PCI bit results in no interrupt being generated after the DMA completes.
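The PCI-bit pitfall described above can be illustrated with a small host-side model. This is a sketch only, not processor code: the helper names `make_chain_pointer` and `pci_is_set` are hypothetical, and the address-field widths (18 and 19 bits) are placeholder assumptions chosen for illustration. Take the real chain pointer layout from the Hardware Reference of each processor.

```python
# Host-side model of a chain pointer word: an address field with the
# PCI (interrupt-on-completion) flag in the bit directly above it.
# ASSUMPTION: both widths are illustrative placeholders, not data-sheet values.
LEGACY_ADDR_BITS = 18  # assumed address-field width on a legacy part
NEWER_ADDR_BITS = 19   # assumed address-field width on a newer part


def make_chain_pointer(next_tcb_addr: int, addr_bits: int, pci: bool = True) -> int:
    """Pack the next-TCB address and the PCI flag into one word."""
    addr_mask = (1 << addr_bits) - 1
    if next_tcb_addr < 0 or next_tcb_addr > addr_mask:
        raise ValueError("address does not fit in the address field")
    word = next_tcb_addr
    if pci:
        word |= 1 << addr_bits  # PCI sits just above the address field
    return word


def pci_is_set(chain_pointer: int, addr_bits: int) -> bool:
    """Report whether the PCI flag is set for a given field width."""
    return bool((chain_pointer >> addr_bits) & 1)


# A word built with the legacy layout, then interpreted by the newer layout:
legacy_word = make_chain_pointer(0x0C100, LEGACY_ADDR_BITS)
ported_unchanged = pci_is_set(legacy_word, NEWER_ADDR_BITS)

# The same chain pointer rebuilt for the newer layout:
ported_fixed = pci_is_set(
    make_chain_pointer(0x0C100, NEWER_ADDR_BITS), NEWER_ADDR_BITS
)

print(ported_unchanged, ported_fixed)
```

In this model, the legacy PCI flag lands inside the wider address field of the newer layout, so the newer part sees the flag as clear. This matches the symptom described above: the DMA completes but no interrupt is raised.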

Interrupt Vector Table Setup

In the list of items for migration planning, remember to check the interrupt vector table setup. When migrating between legacy and newer SHARC processors (particularly when porting legacy assembly code), the programmer should pay attention to the new mappings of peripheral vector addresses, which might have moved to different interrupt vector locations. Porting code without these modifications could result in an inability to service SPORT or other interrupts.

Power Dissipation Calculations

Power dissipation is a critical system feature that can greatly influence a successful system migration. Over the lifetime of SHARC processors, power dissipation has become more difficult to specify due to the larger proportion of leakage current and the increasing design attention on power consumption and heat dissipation.

Calculations for 21060L Processors

The power dissipation calculation for this processor is:

I_DDINHIGH @ 25 ns T_CK = 0.475 A
P_EXT = 0.074 W
P_INT = I_DDINHIGH * V_DD = 0.475 A * 3.3 V
P_TOTAL = 1.644 W

Also, the 21061L SHARC processor is unique in the addition of an idle16 instruction. This instruction executes a NOP while slowing the core clock to 1/16th of the original rate, resulting in savings of idle power: approximately 50 mA, versus 180 mA for IDLE and approximately 475 mA in the example above (representing I_DDINHIGH). For more details, refer to the appropriate 2106x SHARC processor data sheet.

Calculations for 21065L Processors

The power dissipation calculation for this processor is:

P_EXT = 0.068 W
P_INT = I_DDINHIGH * V_DD = 0.275 A * 3.3 V
P_TOTAL = 0.9755 W

Calculations for 21161N Processors

In the 21160M/N and 21161N processors, separate voltages for the core and I/O ring were introduced. The power dissipation calculation for the 21161N processor is:

P_EXT = 0.185 W
P_INT = I_DDINHIGH * V_DDINT = 0.660 A * 1.8 V
P_TOTAL = 1.373 W

Note: The 0.660 A value above includes 10 mA for AI_DD for the PLL supply.
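The three legacy calculations above share one form, P_TOTAL = P_EXT + I_DDINHIGH * V_DD. The following is a minimal sketch that reproduces them from the figures quoted in this note. The function name `total_power_w` and the `EXAMPLES` table are illustrative only and are not part of any Analog Devices tool.

```python
def total_power_w(i_ddin_high_a: float, v_dd_v: float, p_ext_w: float) -> float:
    """Return P_TOTAL in watts: internal power (I * V) plus external power."""
    p_int_w = i_ddin_high_a * v_dd_v
    return p_int_w + p_ext_w


# (I_DDINHIGH in A, core supply in V, P_EXT in W), as quoted in the text above.
EXAMPLES = {
    "21060L": (0.475, 3.3, 0.074),
    "21065L": (0.275, 3.3, 0.068),
    "21161N": (0.660, 1.8, 0.185),
}

for part, (i_a, v_v, p_ext_w) in EXAMPLES.items():
    print(f"{part}: P_TOTAL = {total_power_w(i_a, v_v, p_ext_w):.4f} W")
```

The 21065L and 21161N results match the totals stated above. For the 21060L, the unrounded sum is about 1.6415 W. The quoted 1.644 W is consistent with rounding P_INT to 1.57 W before adding P_EXT.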
Calculations for Newer SHARC Processors

In the newer SHARC processors (2126x processors and beyond), as leakage current became a more substantial portion of dissipated power, Analog Devices switched to communicating power dissipation calculation information through EE-Notes instead of data sheets. This change allows more information and explanation to be shared with system designers. With an EE-Note for each group of SHARC processors (2126x, 21362/3/4/5/6, 21367/8/9, and 21371/5), system designers can accurately predict P_EXT using an actual peripheral usage case, estimate P_INT based on both static (leakage) and dynamic (switching) components, and illustrate the effects of voltage, temperature, and frequency on power dissipation. The data in the EE-Note, similar to the data supplied in data sheets for earlier SHARC processors, is based on characterization data. The power dissipation calculation data provides valuable information for understanding power supply requirements and for estimating the power savings that may be achieved by managing the core clock rate using the programmable PLL.

Reference EE-Notes on SHARC processor power dissipation include:

- Estimating Power Dissipation for 21368 SHARC Processors (EE-299)
- Estimating Power for the 21362 SHARC Processors (EE-277)
- Estimating Power Dissipation for Industrial Grade 21262 SHARC Processors (EE-250)
- Estimating Power Dissipation for 21262S SHARC DSPs (EE-216)

Conclusion

Migrating a system design from legacy SHARC processors to newer SHARC processors is a manageable task when the differences between the processors are understood. The challenges to migration stem not just from obvious specification differences between the parts, but also from the more subtle, less obvious differences in performance features. The intent of this EE-Note is to provide clear information on these subtle feature differences to ease system migration. Using this note is only the beginning, though. The issues raised here should lead system designers to the relevant sections of the processor documentation affecting their migration planning.

References

[1] 2136x SHARC Processor Programming Reference. Rev 1.0, March 2007. Analog Devices, Inc.
[2] 21368 SHARC Processor Hardware Reference. Rev 1.0, September 2006. Analog Devices, Inc.
[3] 21160 SHARC DSP Instruction Set Reference. Rev 2.0, November 2003. Analog Devices, Inc.
[4] 2126x SHARC Processor Peripherals Manual. Rev 3.0, December 2005. Analog Devices, Inc.
[5] 2136x SHARC Processor Hardware Reference for the 21362/3/4/5/6 Processors. Rev 1.0, October 2005. Analog Devices, Inc.
[6] 21161 SHARC Processor Hardware Reference. Rev 4.0, February 2005. Analog Devices, Inc.
[7] 21160 SHARC DSP Hardware Reference. Rev 3.0, November 2003. Analog Devices, Inc.
[8] 2126x SHARC DSP Core Manual. Rev 2.0, February 2004. Analog Devices, Inc.
[9] 21065L User's Manual. Rev 2.0, July 2003. Analog Devices, Inc.
[10] 2106x SHARC User's Manual. Rev 2.1, March 2004. Analog Devices, Inc.
[11] Extended-Precision Fixed-Point Arithmetic on SIMD SHARC Processors (EE-270). Rev 1, July 2005. Analog Devices, Inc.
[12] Implementing In-Place FFTs on SISD and SIMD SHARC Processors (EE-267). Rev 1, March 2005. Analog Devices, Inc.
[13] Managing the Core PLL on Third-Generation SHARC Processors (EE-290). Rev 2, May 2007. Analog Devices, Inc.
[14] Interfacing SDRAM Memory to 21368 and 2137x SHARC Processors (EE-286). Rev 3, April 2007. Analog Devices, Inc.
[15] Using the Expert DAI for 2126x, 2136x and 2137x SHARC Processors (EE-243). Rev 3, June 2006. Analog Devices, Inc.
[16] Estimating Power Dissipation for 21368 SHARC Processors (EE-299). Rev 1, December 2006. Analog Devices, Inc.
[17] Estimating Power for the 21362 SHARC Processors (EE-277). Rev 1, January 2006. Analog Devices, Inc.
[18] Estimating Power Dissipation for Industrial Grade 21262 SHARC Processors (EE-250). Rev 1, May 2005. Analog Devices, Inc.
[19] Estimating Power Dissipation for 21262S SHARC DSPs (EE-216). Rev 1, December 2003. Analog Devices, Inc.
Document History

Revision: Rev 1, June 27, 2007, by Divya Sunkara
Description: Initial release