Engineer-to-Engineer Note

Similar documents
Engineer To Engineer Note

Engineer-to-Engineer Note

Engineer To Engineer Note

Engineer To Engineer Note

Engineer-to-Engineer Note

Engineer To Engineer Note

Enginner To Engineer Note

Engineer-to-Engineer Note

a Technical Notes on using Analog Devices' DSP components and development tools

Engineer-to-Engineer Note

Engineer To Engineer Note

Engineer-to-Engineer Note

Engineer To Engineer Note

Engineer-to-Engineer Note

MIPS I/O and Interrupt

Engineer-to-Engineer Note

Address/Data Control. Port latch. Multiplexer

Engineer-to-Engineer Note

05-247r2 SAT: Add 16-byte CDBs and PIO modes 1 September 2005

IZT DAB ContentServer, IZT S1000 Testing DAB Receivers Using ETI

pdfapilot Server 2 Manual

x )Scales are the reciprocal of each other. e

EasyMP Network Projection Operation Guide

Section 3.1: Sequences and Series

Unit #9 : Definite Integral Properties, Fundamental Theorem of Calculus

2 Computing all Intersections of a Set of Segments Line Segment Intersection

UT1553B BCRT True Dual-port Memory Interface

Engineer-to-Engineer Note

Overview. Network characteristics. Network architecture. Data dissemination. Network characteristics (cont d) Mobile computing and databases

vcloud Director Service Provider Admin Portal Guide vcloud Director 9.1

Passwords Passwords Changing Passwords... <New Passwords> 130 Setting UIM PIN... <UIM PIN/UIM PIN2> 130 Unlocking a Locked UIM...

Fall 2018 Midterm 1 October 11, ˆ You may not ask questions about the exam except for language clarifications.

NOTES. Figure 1 illustrates typical hardware component connections required when using the JCM ICB Asset Ticket Generator software application.

05-247r0 SAT: Add 16-byte CDBs and PIO modes 16 June 2005

Functor (1A) Young Won Lim 8/2/17

Allocator Basics. Dynamic Memory Allocation in the Heap (malloc and free) Allocator Goals: malloc/free. Internal Fragmentation

Section 10.4 Hyperbolas

II. THE ALGORITHM. A. Depth Map Processing

2014 Haskell January Test Regular Expressions and Finite Automata

Engineer-to-Engineer Note

CS201 Discussion 10 DRAWTREE + TRIES

TECHNICAL NOTE MANAGING JUNIPER SRX PCAP DATA. Displaying the PCAP Data Column

Functor (1A) Young Won Lim 10/5/17

Stack. A list whose end points are pointed by top and bottom

Complete Coverage Path Planning of Mobile Robot Based on Dynamic Programming Algorithm Peng Zhou, Zhong-min Wang, Zhen-nan Li, Yang Li

How to Design REST API? Written Date : March 23, 2015

PARALLEL AND DISTRIBUTED COMPUTING

Before We Begin. Introduction to Spatial Domain Filtering. Introduction to Digital Image Processing. Overview (1): Administrative Details (1):

Engineer-to-Engineer Note

Agilent Mass Hunter Software

EasyMP Network Projection Operation Guide

EasyMP Multi PC Projection Operation Guide

box Boxes and Arrows 3 true 7.59 'X' An object is drawn as a box that contains its data members, for example:

1 Quad-Edge Construction Operators

OPERATION MANUAL. DIGIFORCE 9307 PROFINET Integration into TIA Portal

Chapter 2 Sensitivity Analysis: Differential Calculus of Models

LINX MATRIX SWITCHERS FIRMWARE UPDATE INSTRUCTIONS FIRMWARE VERSION

What do all those bits mean now? Number Systems and Arithmetic. Introduction to Binary Numbers. Questions About Numbers

Epson Projector Content Manager Operation Guide

Today. Search Problems. Uninformed Search Methods. Depth-First Search Breadth-First Search Uniform-Cost Search

Union-Find Problem. Using Arrays And Chains. A Set As A Tree. Result Of A Find Operation

Parallel Square and Cube Computations

Small Business Networking

Geometric transformations

Small Business Networking

9.1 apply the distance and midpoint formulas

File Manager Quick Reference Guide. June Prepared for the Mayo Clinic Enterprise Kahua Deployment

Digital Design. Chapter 1: Introduction. Digital Design. Copyright 2006 Frank Vahid

Epson iprojection Operation Guide (Windows/Mac)

In the last lecture, we discussed how valid tokens may be specified by regular expressions.

10/9/2012. Operator is an operation performed over data at runtime. Arithmetic, Logical, Comparison, Assignment, Etc. Operators have precedence

PNC NC code PROGRAMMER'S MANUAL

An Efficient Divide and Conquer Algorithm for Exact Hazard Free Logic Minimization

Slides for Data Mining by I. H. Witten and E. Frank

Engineer-to-Engineer Note

CHAPTER III IMAGE DEWARPING (CALIBRATION) PROCEDURE

Preserving Constraints for Aggregation Relationship Type Update in XML Document

Computer Arithmetic Logical, Integer Addition & Subtraction Chapter

Fig.1. Let a source of monochromatic light be incident on a slit of finite width a, as shown in Fig. 1.

Revisiting the notion of Origin-Destination Traffic Matrix of the Hosts that are attached to a Switched Local Area Network

Unit 5 Vocabulary. A function is a special relationship where each input has a single output.

1. SEQUENCES INVOLVING EXPONENTIAL GROWTH (GEOMETRIC SEQUENCES)

MA1008. Calculus and Linear Algebra for Engineers. Course Notes for Section B. Stephen Wills. Department of Mathematics. University College Cork

A Formalism for Functionality Preserving System Level Transformations

View, evaluate, and publish assignments using the Assignment dropbox.

Fig.25: the Role of LEX

Chapter 7. Routing with Frame Relay, X.25, and SNA. 7.1 Routing. This chapter discusses Frame Relay, X.25, and SNA Routing. Also see the following:

CSCI 446: Artificial Intelligence

Small Business Networking

05-247r1 SAT: Add 16-byte CDBs and PIO modes 19 August 2005

Questions About Numbers. Number Systems and Arithmetic. Introduction to Binary Numbers. Negative Numbers?

Chapter 1: Introduction

Midterm 2 Sample solution

E201 USB Encoder Interface

Spring 2018 Midterm Exam 1 March 1, You may not use any books, notes, or electronic devices during this exam.

JCM TRAINING OVERVIEW DBV Series DBV-500 Banknote Validator

Agenda & Reading. Class Exercise. COMPSCI 105 SS 2012 Principles of Computer Science. Arrays

pdftoolbox Server 4 Manual

L. Yaroslavsky. Fundamentals of Digital Image Processing. Course

Small Business Networking

Transcription:

Engineer-to-Engineer Note EE-295 Technicl notes on using Anlog Devices DSPs, processors nd development tools Visit our Web resources http://www.nlog.com/ee-notes nd http://www.nlog.com/processors or e-mil processor.support@nlog.com or processor.tools.support@nlog.com for technicl support. Implementing Dely Lines on SHARC Processors Contributed by Divy Sunkr Rev 1 October 25, 2006 Introduction This EE-Note discusses the implementtion of dely line on the ADSP-2126x, ADSP-21362, ADSP-21363, ADSP-21364, ADSP-21365, nd ADSP-21366 SHARC series of processors, herefter representtively referred to s ADSP- 21262 processors. Becuse ADSP-21262 processors ccess externl memory vi the prllel port, specil considertions re required while implementing dely lines on these processors when compred to other SHARC processors. Discrete filters cn be implemented with smplebsed dely lines nd set of coefficients tht define the filter. Due to memory constrints, dely lines re usully implemented in externl memory using vrious techniques. This document discusses two dely-line implementtions using the prllel port. First, smple-bsed dely line using prllel port core ccess is discussed. Lter in the document, techniques used in block-bsed dely line implemented vi prllel port DMA re described. Smple-Bsed Dely Lines Dely effects re chieved by implementing finite impulse response (FIR) or infinite impulse response (IIR) filters. For smple-bsed dely lines, this document focuses on the implementtion of simple stereo dely effect lgorithm with cross feedbck. The left nd right chnnels re coupled by introducing crossfeedbck coefficients, so tht the delyed output of one chnnel is fed into the input of the other chnnel, creting stereo dely effect. For detils, refer to the smple-bsed dely line code for the ADSP-21262 processor locted in the ssocited.zip file. The implementtion of the bove lgorithm requires two dely lines: one for left chnnel, nd one for the right chnnel. The dely lines re implemented s circulr buffers in externl memory. The length of the dely line depends on the required dely nd the input smpling rte. For exmple, 0.5-second dely requires 24000 tp dely line for n 48-KHz input smpling rte. Since this implementtion requires feedbck from the delyed output, the totl dely length is 24001, which includes the previous smple for the dely. Since the externl memory is 8 bits wide, ech tp (32-bit word) requires four externl memory loctions. Hence, the totl dely-line buffer length is four times the number of tps. Due to its feedbck structure, this lgorithm requires red from the dely-line buffer followed by write. The red nd write occur in smple-bsed mnner. Becuse the core ccesses externl memory vi prllel port DMA registers (EIPP, EMPP, nd ECPP), the circulr buffers in externl memory re implemented by using the DAG registers. Even though the DAG registers cnnot ccess externl memory, the DAG registers' circulr ddress cpbilities re Copyright 2006, Anlog Devices, Inc. All rights reserved. Anlog Devices ssumes no responsibility for customer product design or the use or ppliction of customers products or for ny infringements of ptents or rights of others which my result from Anlog Devices ssistnce. All trdemrks nd logos re property of their respective holders. Informtion furnished by Anlog Devices pplictions nd development tools engineers is believed to be ccurte nd relible, however no responsibility is ssumed by Anlog Devices regrding technicl ccurcy nd topiclity of the content provided in Anlog Devices Engineer-to-Engineer Notes.

used to clculte the ddress to ccess the externl memory vi the prllel port. The DAG register pointing to externl circulr buffers re then ssigned to the prllel port externl index register (EIPP) in order for the core to ccess the dt. Initilly, the red pointer points to the buffer's lst tp nd the write pointer points to the first tp. Following ech red nd write, both pointers re decremented nd the trnsfers continue smple by smple. For smple-bsed processing, single output is trnsmitted t prticulr time. Core-driven trnsfers to the externl memory vi the prllel port cn be initited using four techniques s described in the ADSP-2126x SHARC Processors Peripherls Mnul s well s in the ADSP-2136x SHARC Processor Hrdwre Reference. The known-durtion ccess technique, which is the most efficient of ll the ccesses, is used for reds nd writes. With 8-bit externl memory, ech 32-bit ccess vi the prllel port requires known durtion of 15 cycles [(1 ALE-cycle * 3 CCLK) + (4 dtcycles * 3 CCLK)] with PPDUR (prllel port durtion setting) of 3. The progrm cn use the 15 cycles tht occur fter the dt is written to the trnsmit register (or red from the receive register of prllel port) efficiently for other purposes. Another considertion while ccessing dt vi the prllel port is the ddressing of externl memory. This is due to the fct tht the prllel port no longer ccepts logicl ddresses but requires physicl ddresses of the externl memory to trnsfer dt. Consecutive 32-bit vlues in externl memory re ddressed by n offset of 4 becuse ech 32-bit vlue occupies four-byte-wide loctions of externl memory. Figure 1. FIR Filter The impulse response of the filter is vilble in Figure 2. Figure 3, which illustrtes the FIR implementtion using block-bsed dely lines in externl memory, depicts the implementtion with block of eight smples, filter length of six, nd two tp ccesses (tp2 nd tp4). Figure 1 shows the FIR implementtion of single block of dt. The FIR implementtion of single block of input dt requires three processes: write DMA, red DMAs, nd the dely process (convolution of input nd coefficient vlues). Ech of these processes is discussed in detil below. Even though the block-bsed dely line code implements the filter shown in Figure 1 nd Figure 2, Figure 3 shows the implementtion of only two tps of the filter to simplify the illustrtion; the dditionl tps follow the sme templte. Block-Bsed Dely Lines For the block-bsed dely-line technique, this document discusses the implementtion of lowreflection setting FIR s shown in Figure 1. Figure 2. FIR Filter Response Implementing Dely Lines on SHARC Processors (EE-295) Pge 2 of 7

Figure 3. FIR Filter Implementtion Write DMA The smple dt rrives s block of NUM_SAMPLES. The dt in the input buffer contins the left nd right chnnel smples s shown in Figure 4. Therefore, two dely lines re required for ech of the chnnel. The processing on ech of the chnnels is identicl. The length of ech dely line is equl to: Filter Length + ((NUM_SAMPLES/2) -1) For block-bsed FIR implementtions, the processing strts with writing block of dt (write DMA) to the dely-line buffer in externl memory. Figure 4 illustrtes the write DMA process to the left nd right dely lines. With block length (NUM_SAMPLES) of eight nd filter length of six, the dely length of ech of the chnnels is nine. Ech of the nine loctions of the dely line shown in Figure 4 corresponds to four loctions in externl byte-wide memory. Therefore, the totl dely line length in externl memory is 28 (9 x 4). The write pointers for ech of the chnnels re initilized to the first loctions of the left nd right dely line buffers shown s WrLptr nd WrRptr in Figure 4. Two write DMAs tke plce, one for ech chnnel. The function used to perform write DMA for both chnnels is the sme (Listing 1). The prmeters pssed to this function vry, nd re bsed on prticulr chnnel. The DMA for the left chnnel tkes plce with the internl index register pointing to TrLptr nd the externl index register of prllel port pointing to WrLptr. Similrly, for the right chnnel, the internl index register points to TrRptr nd externl index register points to WrRptr. The internl modify register is set to 2 nd the externl modify register is set to -1, plcing ech chnnel dt in counter clockwise direction in externl dely line memory strting with the first smple locted t the strt of the dely line buffer s shown in Figure 4. void TrnsDMA(int *Tinptr, int Tcount, int *Textptr) *pppctl = 0; /* initite prllel port DMA registers for first DMA*/ *piipp = Tinptr; *pimpp = 2; *picpp = Tcount; *pempp = -1; *peipp = Textptr; /* For 8-bit externl memory, the Externl count is four times the internl count */ *pecpp = Tcount * 4; *pppctl = PPTRAN PPBHC PPDUR4; *pppctl = PDEN PPEN; // enlble prllel port DMA /*poll to ensure prllel port hs completed the trnsfer*/ do ; while( (*pppctl & (PPBS PPDS) )!= 0); *pppctl = 0; // disble prllel port // strt of second DMA Listing 1. TrnsmitDMA.c Implementing Dely Lines on SHARC Processors (EE-295) Pge 3 of 7

void WriteDMA( int *Inptr, int *Wrptr, int *TestDMA, int *BUF) int DMAcount1,DMAcount2,Inc; int*wrptr1,*wrptr2,*inptr1,*inptr2; if (Wrptr < TestDMA) DMAcount2 = (TestDMA-Wrptr)/4; DMAcount1 = ((NUM_SAMPLES)/2) - DMAcount2; Wrptr1 = Wrptr; // Pointer to the strt DMA Inc = -(DMAcount1 * 4); Wrptr2 = Wrptr1; // externl pointer for first DMA Wrptr2 = circptr(wrptr2, Inc, BUF, LENGTH); // externl pointer for second DMA Inptr1 = Inptr; // internl pointer for first DMA Inptr2 = Inptr1+ DMAcount1*2; // internl pointer for second DMA TrnsDMA(Inptr1, DMAcount1, Wrptr1); TrnsDMA(Inptr2, DMAcount2, Wrptr2); Figure 4. Write DMA Circulr Buffering Since the prllel port DMA prmeter registers do not tke into ccount the wrpping of circulr buffers, if the write pointer needs to wrp round the end, the DMA is split into two seprte DMA sections. One of the DMA sections strts t the write pointer nd the other DMA section begins t the end of circulr dely line buffer for tht chnnel. For exmple, if the DMA for left chnnel strts t WrLptr (see Figure 4), it would be split into two DMAs with one strting t WrLptr nd trnsferring one word (4 bytes), nd the other strting t the end of the left chnnel dely line buffer, trnsferring four words (16 bytes). The function tht tests for this wrpping ppers in Listing 2. In the function, the TestDMA vrible points to Num_Smple/2 loctions (the number of smples trnsferred in prticulr chnnel per DMA) offset from the strt of externl dely line of prticulr chnnel. else TrnsDMA(Inptr, NUM_SAMPLES/2, Wrptr); RdptrInit= Wrptr; // Next Red pointer for red DMA Inc =-((NUM_SAMPLES/2)*4); WrptrTemp = Wrptr; WrptrTemp = circptr(wrptrtemp,inc,buf,length); // Next write pointer for write DMA Listing 2. writedma.c Red DMA For ech write DMA to the externl dely line buffers, the number of red DMAs correspond to the number of tps ccessed in the FIR filter. Figure 5 depicts the red DMA from the left nd right dely line buffers. As described bove, two tps re ccessed in this illustrtion (tp2 nd tp4), nd there re two red DMAs for every write DMA. The length of the internl buffer to perform these red DMAs is clculted by multiplying the number of smples trnsferred per DMA times the number of tps ccessed. With regrd to Figure 5, the internl buffer length for red DMAs of ech of the chnnels is eight [4 (number of smples per DMA) * 2 (number of tps ccessed)]. The function Implementing Dely Lines on SHARC Processors (EE-295) Pge 4 of 7

tht performs the DMA from the dely line buffer to the internl buffer is given in Listing 3. The DMA tkes plce with the internl modify vlue set to number of tps nd the externl modify vlue set to -1. As shown in Figure 5, two red DMAs re performed for ech of the chnnels with the externl index register pointing to RdLptr1 nd RdLptr2 for the left chnnel (nd RdRptr1 nd Rdrptr2 for the right chnnel), nd the corresponding internl index register points to ReLptr1 nd ReLptr2 for left chnnel (nd RdRptr1 nd RdRptr2 for right chnnel). Ech smple trnsferred from externl memory requires corresponding loction in internl memory, bsed on the modify register vlues s shown in Figure 5. RdLptr nd RdRptr ct s the reference pointers, obtining the externl index register vlues for red DMAs for ech of the chnnels; they re lwys initilized to the current write DMA pointers (WrLptr nd WrRptr) of ech of the chnnels, respectively. The red pointer for ech of the tps is derived by offsetting the tp number from the reference. The internl index for ech successive red DMAs is incremented by one. The function tht sets the externl nd internl index pointers for ech of the red DMA is shown in Listing 4. This function lso checks for wrpping of circulr buffers similr to tht discussed in the Write DMA section of this document. The reference pointer for the red DMAs of the left nd right chnnels for next set of dt is set to the write pointers for the next set of dt, which re NextWrLptr (left chnnel write pointer) nd NextWrRptr (right chnnel write pointer) with respect to Figure 4. Figure 5. Red DMAs void RecivDMA(int *Rinptr, int Rcount, int *Rextptr) *pppctl = 0; *piipp = Rinptr; *pimpp = NUM_TAPS; *picpp = Rcount; *pempp = -1; *peipp = Rextptr; /* For 8-bit externl memory, the Externl count is four times the internl count */ *pecpp = Rcount * 4; *pppctl = PPBHC PPDUR4; /* Receive(Red) */ *pppctl = PPDEN PPEN; // Enble prllel port DMA /*poll to ensure prllel port hs completed the trnsfer*/ do ; while( (*pppctl & (PPBS PPDS) )!= 0); *pppctl = 0; // disble prllel port Listing 3. ReceiveDMA.c Implementing Dely Lines on SHARC Processors (EE-295) Pge 5 of 7

void RedDMA(int *Inptr, int *Rdptr, int *TestDMA, int *BUF) int DMAcount1,DMAcount2,Inc; int *Rdptr1,*Rdptr2,*Inptr1,*Inptr2; int *RdptrTemp,i,InptrTemp; int Tpno[NUM_TAPS] = 993,1315,2524,2700,3202,2700,3119,3123,3202,32 68,3321,3515; InptrTemp = Inptr; for(i=0; i<num_taps; i++) RdptrTemp = Rdptr; Inc = Tpno[i] *4; RdptrTemp = circptr(rdptrtemp, Inc,BUF, LENGTH); if (RdptrTemp < TestDMA) DMAcount2 = (TestDMA-RdptrTemp)/4; DMAcount1 = ((NUM_SAMPLES)/2) - DMAcount2; Rdptr1 = RdptrTemp; Inc = -(DMAcount1 *4); Rdptr2 = Rdptr1; Rdptr2 = circptr(rdptr2, Inc, BUF, LENGTH); Inptr1 = InptrTemp; Inptr2 = Inptr1+ DMAcount1*NUM_TAPS; // internl pointer for second DMA RecivDMA(Inptr1, DMAcount1, Rdptr1); RecivDMA(Inptr2, DMAcount2, Rdptr2); else RecivDMA(InptrTemp, NUM_SAMPLES/2, RdptrTemp); InptrTemp = Inptr+1; Listing 4. RedDMA.c Dely Process After the red DMAs for ll the required tps re completed, processing tkes plce on the vlues red into the internl memory. Figure 6 shows the dely process on the internl memory dt. This process is identicl for both chnnels. The tp vlues for ech of the chnnels re in consecutive loctions in internl memory s result of the red DMA. The dely process implements n FIR filter, which involves multiplying the tp dt vlue times its corresponding coefficient vlues nd dding the results to get the smple output. Consecutive smples re clculted in the sme mnner for both the left nd right chnnels. The processed left nd right smples re plced in lternte loctions to be trnsmitted out in I2S mode. DelyProcess(int*InBlockptr,int*InBufptr) int *TempInBufptr; flot ErlyRef, InSmple; int x,j,m,i,k,l; flot TpVlue[NUM_TAPS]=0.490,0.346,0.192,0.181,0.1 76,0.181,0.180,0.181,0.176, 0.142,0.167,0.134; for (k=0;k<(num_samples/2);k++) TempInBufptr = InBufptr; j = k*2; // increment for block pointer m = k*num_taps;// increment for Buffer pointer InSmple = *(InBlockptr + j)*0.33 ; ErlyRef = 0.0; for (i=0; i<num_taps; i++) x=m+i; ErlyRef = ErlyRef + (*(TempInBufptr+x) * TpVlue[i]); *(InBlockptr + j) = (int)(insmple + ErlyRef * 0.33); Listing 5. Delyprocessing.c Implementing Dely Lines on SHARC Processors (EE-295) Pge 6 of 7

Figure 6. Dely Process Conclusion This document discusses the methods used to implement dely lines for prllel port-bsed processors such s the ADSP-2126x, ADSP- 21362, ADSP-21363, ADSP-21364, ADSP- 21365 nd ADSP-21366 SHARC series of processors. Implementtion methods for smplebsed nd block-bsed dely line re described. The progrm code for ech of these techniques for ADSP-21262 processors is vilble in the ssocited.zip file. For n FIR filter implementtion with blockbsed dely line, n input block of 512 smples with filter length (dely) of 3520 nd 18 tp ccesses tkes 463,814 core clock cycles t clock rte of 200 MHz for ADSP-21262. Incresing the number of input block smples or the number of tp ccesses increses the processing time for implementing the filter. References [1] ADSP-2126x SHARC Processor Peripherls Mnul. Revision 3.0, December, 2005. Anlog Devices, Inc. [2] ADSP-2136x SHARC Processor Hrdwre Reference. Revision 1.0, October 2005. Anlog Devices, Inc. [3] Introduction to Signl Processing, S. J. Orfnidis, Prentice-Hll,1996 [4] About This Reverbertion Business, J. A. Moorer, Computer Music Journl, vol. 3(2), pp.13-18 (1979). Document History Revision Rev 1 October 25, 2006 by D.Sunkr Description Initil version Implementing Dely Lines on SHARC Processors (EE-295) Pge 7 of 7