KeyStone Training. Keystone Architecture Debug


Debug Overview
As devices become more complex and more IP is integrated into the device (SoC), the silicon must provide enough hooks to give the visibility needed to debug the system.
Multiple cores executing concurrently and asynchronously stress the memory system in a variable fashion during execution. It is critical to understand the performance of each core (or other processing component) in the full system context.
Multiple EDMA transactions move a very large amount of data through the chip concurrently, converging at memory and accelerator endpoints. It is critical to understand the bandwidth available to, and the latency seen by, each master in the full system context.

Core Visibility Features
Each core has visibility requirements critical for successful software development and debug in the field:
  Run-time execution control (e.g. halt, step, run)
  Accurate reflection of memory and register contents
  Visibility into system stalls
    Bank conflict
    Cache miss/coherency overhead
    Memory latency
    EDMA conflict
  Cache state (coherency with external memory)
  Visibility to processing
    PC, data, and timing trace (control through system events and core actions)
    Watch points on PC (execution zones of interest) and data (transactions of interest)
  Visibility to processing relative to system
    Event trace (with PC/timing)
    Ability to correlate between traces (processor, IP, EDMA)

Multicore Visibility Issues
In addition to the generic considerations for a processing core, there are additional 'core' visibility requirements in the multicore environment:
  Shared memory
    Cache state (coherency between cores)
    Conflict between cores (performance degradation)
  Shared atomic resources (memory structure, IP, etc.)
    Ownership through SEM/atomic monitors: view/force
    IPC flags: view/force
    Volatile shared structs: view/update
  Shared multi-channel resources (e.g. EDMA channels)
    View/change context without impacting other cores
  Cross triggering
    Local halt, step, run
    Global halt, step, run (synchronized)
    Local hardware BP
    Global software BP
    Global hardware BP (assert and receive)

EDMA Features
There are a large number of masters in the system that drive data transactions through the switch fabric. A certain minimum amount of visibility is required to balance priority and scheduling for successful system integration and debug.
  Global configuration view
    State information, control values (per core)
  Transaction trace
    Ability to 'see' the data transactions occurring by each master to each slave
    Logical start/end per transfer of interest (i.e. servicing a submitted transfer request)
      Min/max/average transaction times (latency)
      Min/max/average bandwidth
    Individual start/end per physical bus transaction (i.e. visibility of individual bus transactions)
    Arbitration with concurrent transactions to/through the same slave points
    Stall times (duration, blocking information: who, when, how long)
      Stalls should be reflective of stalls to the master, but should be available from any point of stall (i.e. intermediate bridge or endpoint).

EDMA Interaction Points of Interest
[Figure: masters M0 through M8 driving multiple slaves through the switch fabric, with numbered points of interaction.]
Multiple masters, multiple slaves, multiple points of interaction.

Memory Endpoint Features
  Visibility
    Local view (what a processor 'sees')
    Global view (true memory contents, coherency between cores)
  Memory protection
    R/W/X control
    Unified control for shared memory
  Statistics
    Access counters (GLOBAL, LOCAL, EXTERNAL, MasterID, PrivID)
      Count number of accesses, amount transferred
      Select r/w/x/any
    Stall counters (GLOBAL, LOCAL, EXTERNAL, MasterID, PrivID)

Cache Features
  Visibility
    Tag RAM view
      Address
      LRU status
      Dirty status
  Activity counters
    Hit (r/w/x) occurrences
    Miss (r/w/x) occurrences
      Total
      With writeback
    Miss (r/w/x) cycles (stall assertion)
  Events
    Hit (r/w/x) occurrences
    Miss (r/w/x) occurrences
      Any
      With writeback

Differences from 64x+
  Improved embedded trace functionality:
    Real-time trace capture and draining solved with TETB
    Ability to capture trace data to memory
    Ability to export trace data through a standard interface (e.g. SRIO, Ethernet)
    Able to drain trace buffers without affecting trace collection
  Improved chip-level visibility:
    EDMA data transaction trace highlighted as a critical need, with and without timestamp on the data switch fabric
    Circular buffer control captures chip-level traces as well as core PC traces
    Monitor and trace functionality added into the switch fabric

Emulation Features
  Host tooling can halt any or all of the cores on the device.
  Each core supports a direct connection to the JTAG interface.
  Emulation has full visibility of the DSP memory map.
  Real-time emulation allows the user to debug application code while interrupts designated as real-time continue to be serviced. The execution states are:
    Normal code execution: running code prior to a debug event. The processor runs code without a debug event halting execution, with the peripheral operating in a continuous fashion.
    Secondary code execution: running code related to the service of a real-time interrupt after a debug event has halted code execution.
    No code execution: not running code. A debug event halts code execution, and no real-time interrupt is being serviced after code execution is halted.

Advanced Event Triggering (AET)
  AET allows the user to identify events of interest.
    Uses instruction and data bus comparators, auxiliary event detection, sequencers/state machines, and event counters.
    Triggers manage breakpoints, trace acquisition, data collection via an interrupt, and timing measurement, and generate external triggers.
    Triggers control a state machine and counters used to create the intermediate events (loop counts and state machines). These events can be combined to create simple or complex triggers using modules called trigger builders.
  AET logic is provided for monitoring program, memory bus, and system event activity, remembering event sequences, counting event occurrences, or measuring the interval between events.
    Can perform range and identity comparisons
    Can detect exact transactions
    Can detect touching of a byte or range of bytes by memory references
  External event detectors provide a means to monitor external triggers or internal states of interest (e.g. cache miss).
  Four states allow for the identification of a sequence of triggers.
  Specific system activity can generate breakpoints, an interrupt used for the collection of system data, or the identification of program activity that is observed through trace.
  Any system event routed to a C64x+ core can be routed to the AET through software selection.

What Is AET?
AET is a set of logic that can generate various triggers based on combinations of application events.

Block Diagram
[Figure: chip-level debug block diagram. Each CorePac has its own ADTF and TETB connected over ATB and VBUSP; the debug subsystem groups JTAG, ICEPick, DAP, DRM, XTRIG, STM, and PTI on the debug configuration bus; CP_Monitor blocks attach to the DSP switch fabric.]
  Chip-level debug control subsystem
  ADTF and TETB integrated as independent IP per core
  CP_TRACERs provided to monitor and trace EDMA transactions through the chip

Trace Architecture
[Figure: trace data path. The processor's data address buses, data buses, and program bus feed AET, trace jobs, comparators, compressor, and cycle counter inside the C66x DSP. Trace leaves the target board over additional JTAG emulation pins to the XDS560T pod and recording unit, which connects to/from the host PC running CCStudio through the XDS560 emulator and XDS560 product cable.]
  XDS560T pod buffer size: 256K to 64Mb, user selectable in increments of 256K
  XDS560 hardware supports a 256K to 64Mb capture buffer
  Buffer configuration:
    Stop when full: first samples
    Circular buffer: last samples
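The two buffer configurations differ only in which samples survive once the buffer fills. The sketch below is a minimal host-side model of that behavior, not emulator code; the function name `capture` and the mode strings are hypothetical names chosen for illustration.

```python
from collections import deque


def capture(samples, capacity, mode):
    """Model which samples remain in a trace buffer of `capacity` entries.

    `mode` is a hypothetical label for the two configurations:
      "stop_when_full" keeps the first samples and then stops recording,
      "circular" keeps overwriting the oldest entry, so the last samples remain.
    """
    if mode == "stop_when_full":
        return list(samples[:capacity])
    if mode == "circular":
        return list(deque(samples, maxlen=capacity))
    raise ValueError(f"unknown mode: {mode}")


trace = list(range(10))                            # ten trace samples, oldest first
first = capture(trace, 4, "stop_when_full")        # first four samples
last = capture(trace, 4, "circular")               # last four samples
```

Stop-when-full suits capturing what happens right after a trigger; circular mode suits capturing what led up to a failure.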

TI Embedded Trace Buffer (TETB)
  Designed during TCI6484 development, enhanced from the ETB used in 6488 and 6472
  Dedicated 16K embedded buffer that can be used for capturing trace
  Local buffers per core
  Supports concurrent trace and EDMA draining to memory
  Supports trace concurrently on all cores
  XDS560 pod not needed (can use a standard XDS560 emulator cable)
  Can be used with either a 60-pin or a 14-pin header
  Data capture capability is significantly more limited

How Trace Works
  Address   Disassembly
  0x00088   MVK.L2    1,B4
  0x00089   B.S1      _SWI
  0x00090   SUB.L2    B15,8,B15
  ~~~~      ~~~~~
  0x0489C   MVKH.S2   0x0000,B6
  0x048A0   ADD.L2    8,B15,B15   ; _ISR2
  0x048A4   STH.D2T2  B4,*+B6[B5]
  0x048A8   NOP       2

  Trigger points:
    = 0x089    TRC ON
    = 0x48A0   TRC OFF
    = 0x0xx    N/A

[Figure: the DSP core feeds the Advanced Event Analysis Unit and trigger generator; trace data collection leaves the DSP chip through the debug port to the XDS560.]
Current address 0x088 (MVK). The trigger point is set to turn trace ON when PC = 0x089. No PC match, so no action is taken.

How Trace Works (continued)
Same disassembly listing and trigger points as on the previous slide.
Current address 0x089 (B.S1). The trigger matches the current PC, and the trigger generator enables trace.

How Trace Works (continued)
Same disassembly listing and trigger points as on the previous slide.
Trace data collection is enabled and trace export continues up to 0x0489C. A trigger is set up to turn trace OFF when PC = 0x48A0. No PC match, so no action is taken.

How Trace Works (continued)
Same disassembly listing and trigger points as on the previous slides.
Captured trace:
  0x00089   B.S1
  0x00090   SUB.L2
  ~~~~      ~~~~~
  0x0489C   MVKH.S2
  0x048A0   ADD.L2
Current address 0x48A0. The trigger matches the current PC, and the trigger generator disables trace.

Trace Subsystem (Simplified)
[Figure: CorePac 0 through CorePac n, each with its own embedded trace buffer (ETB0 through ETBn-1), plus other masters, connect through the TeraNet switch fabric to slaves 0 through m. CP_MONITOR_0 through CP_MONITOR_m feed STM, which logs to ETBn.]
  TETB trace buffers (same as 6484)
  Trace stream(s) optionally exported; pin selection same as C64x+
  One embedded trace buffer per CorePac
  DRM
  TeraNet SCR master/slave with monitor hooks
  CP_TRACER IP: VBUS command signals exported to CP_MONITORs, one CP_MONITOR per monitored slave endpoint
  STM logs generated through a dedicated TeraNet SCR
  One embedded trace buffer for the system (ETBn)
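The trace start/stop sequence walked through in the "How Trace Works" slides can be modeled in a few lines. This is an illustrative host-side simulation only, not AET register programming: the function name `collect_trace` is hypothetical, and the short PC values simply reuse the trigger addresses shown on the slides.

```python
TRACE_ON_PC = 0x089     # trigger point: turn trace ON
TRACE_OFF_PC = 0x48A0   # trigger point: turn trace OFF


def collect_trace(pc_stream, on_pc=TRACE_ON_PC, off_pc=TRACE_OFF_PC):
    """Return the PCs that would be exported while trace is enabled.

    The instruction that matches a trigger is itself captured, which mirrors
    the captured list on the final slide (it starts at 0x089, ends at 0x48A0).
    """
    tracing = False
    captured = []
    for pc in pc_stream:
        if pc == on_pc:
            tracing = True          # trigger generator enables trace
        if tracing:
            captured.append(pc)
        if pc == off_pc:
            tracing = False         # trigger generator disables trace
    return captured


# Hypothetical PC sequence following the slide listing.
pcs = [0x088, 0x089, 0x090, 0x489C, 0x48A0, 0x48A4, 0x48A8]
captured = collect_trace(pcs)
```

Every PC outside the window produces "no action", which is why the export pins only carry data for the region of interest.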

Pin Support for XDS560T
On-chip embedded trace buffers
  16 KB (core) / 64 KB (STM) on-chip receiver
  One ETB per core and one for STM
  Snapshot and circular buffer mode
Features
  Simultaneous write (sink) and read (drain) capability
  Can be used in CoreSight ETB mode

C66x DSP Trace: targets the debug of unstable code
  Provides for the recording of program flow, memory references, cache statistics, and application-specific data with a time stamp, for performance analysis and quality assurance.
  Bus snoopers collect and export trace data using hardware dedicated to the trace function.
  All or a percentage of the debug port pins can be allocated to trace for any of the cores (or a mix).
  Program flow and timing can be traced at the same rate generated by the processor.
  Event trace provides a log of user-selectable system events. It can also be used in conjunction with profiling tools.
  Data references must be restricted, however, because the export mechanism is limited to a number of pins that is insufficient to sustain tracing of all memory references.
  The Advanced Event Triggering facilities provide a means to restrict the exported trace data to data of interest, which maintains the non-intrusive aspect of trace.
  Error indications are embedded in the debug stream if the export logic is unable to keep up with the data rate generated by the collection logic.
  The user can optionally select the export of all specified trace data. In this case the processor is stalled to avoid the loss of trace data. The user is notified that trace stalls have occurred, although the number and location of the stalls are not recorded.

Chip Level Trace: provides visibility to chip-level data transactions
  Provides for the recording of transaction information for accesses to/from memory endpoints by system masters.
  Bus snoopers collect and export/record trace data using hardware dedicated to the trace function.
  Data may be recorded to an embedded trace buffer or exported through the trace pin interface.
  Transaction flow and timing can be traced across several endpoints concurrently.
  Monitored endpoints must be restricted, however, because the export/recording mechanism is limited.
  The CP_TRACER logic provides a means to restrict the number of monitored endpoints as well as the system masters being monitored at a given time.
Embedded statistics capture
  CP_TRACER logic can record statistics related to the memory transactions, rather than exporting a transaction trace. This provides information on memory utilization without the detailed transaction information and without consuming trace bandwidth.

KeyStone CP_Tracer Modules
[Figure: device block diagram showing CP_Tracer (CPT) placement on the TeraNet SCRs (CPU/2 256b, CPU/3 128b, CPU/3 32b, CPU/6 32b). Masters include the EDMA transfer controllers, SRIO, PCIe, PA/SA, AIF, FFTC, CorePacs, and the debug DAP; monitored endpoints include DDR3 (EMIF), shared RAM, L2, QM, and configuration-space peripherals; STM and TETB collect the output. Annotations: "Monitors transactions from AIF, SRIO, Core, TCs" and "Monitors transactions from AIF, TCs". Legend: bridge, wireless apps only, media apps only. Marked as preliminary information under NDA, subject to change.]

CP_Tracer: Embedded in TeraNet SCR
  Interfaces to be monitored are configured in the SCR config file during generation.
  All events originating from master interfaces to a selected slave interface in an SCR can be monitored, e.g. transactions from all master ports targeted towards the EMIF.
  All events are tagged with master ID, xid, and byte count (where appropriate).
  Events are one-cycle, assert-high pulses generated at the SCR clock frequency.

Statistics counters:
  Throughput counts represent the total number of bytes forwarded to the target slave during a specified time duration.
    The counter accumulates the byte count presented at the initiation of a new transfer.
    It can be used to calculate the effective throughput in bytes per second at a given memory slave interface.
    It can be used to track the bandwidth consumed by the system masters (#bytes/time).
    Each CP_Tracer provides two independent throughput counters. Each can be used to track the total number of bytes forwarded from a group of masters.
    Each system master can be assigned to either, both, or neither of the two master groups for throughput collection.
    CP_Tracer also provides address-range-based filtering and transaction-qualifier-based filtering to further narrow down the transactions of interest.
  Accumulated Wait Time Counter
    Provides an indication of how busy the bus is: how many cycles elapsed with at least one bus master waiting for access to the bus.
  Num Grant Counter
    Provides an indication of the number of bus grants.
    The average transaction size can be determined from throughput / num grant.

CP_TRACER (1/2)
  Event name                        Function
  Master requesting to slave (A)    Triggers when there is a new request from the master decoded to the slave.
  New request to slave (B)          Triggers when a transaction is sent to the slave.
  Last write data from master (C)   Triggers when the last write data from the master is accepted, thus completing the write burst.
  First read data to master (D)     Triggers when the first read data is returned to the master.
  Last read data to master (E)      Triggers when the last read data is returned to the master, thus completing the read burst.

CP_TRACER (2/2)
  Transaction trace (output to STM)
    Ability to 'see' the transactions for each master to selected slave interfaces through tracing of key transaction points:
      Arbitration won (Event B)
      Transaction complete (Event C, E)
    Two filtering functions for transaction traces to bring out the specific transactions:
      Transaction qualifier filtering: read/write
      Address-range-based filtering
  Sliding time window: specifies the measurement interval for all the CBA statistic counters implemented in the CP_Tracer module.
    When the sliding window timer expires, the counter values are loaded into the respective registers and the count starts again.
    If enabled, an interrupt is also generated when the sliding time window expires. The host processor and/or EDMA can read the statistics counters upon assertion of the interrupt.
    If enabled, the counter values can also be exported to STM automatically after the sliding time window expires.
  Cross trigger generation: can assert EMU0/1 when a qualified event occurs.
  External trigger to start/stop monitoring:
    The EMU0 trigger line is coupled to trace start. The EMU1 trigger line is coupled to trace stop.
    Both EMU0 and EMU1 can be sourced from any of the cores. They can also be controlled from an external source via the EMU0 and EMU1 pins on the device.
    The EMU0 trigger enables the EMU01_EnableStatus bit of the Transaction Qualifier register; the EMU1 trigger disables this bit.
  STM export enables:
    Status message
    Event message
    Statistics message
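The two derived quantities described for the statistics counters, effective throughput (#bytes/time) and average transaction size (throughput / num grant), can be worked through on one sliding-window snapshot. The sketch below is host-side arithmetic under stated assumptions: the function name, parameter names, and sample values are hypothetical, and the window length is assumed to be known in clock cycles.

```python
def window_stats(throughput_bytes, num_grants, window_cycles, clock_hz):
    """Derive bandwidth and average transaction size for one sliding window.

    throughput_bytes: value of a throughput counter at window expiry
    num_grants:       value of the num grant counter at window expiry
    window_cycles:    sliding time window length in clock cycles (assumed known)
    clock_hz:         frequency of the clock that times the window
    """
    # bytes / (window_cycles / clock_hz), reordered to stay exact for integers
    bytes_per_second = throughput_bytes * clock_hz / window_cycles
    # throughput / num grant; guard against an idle window with no grants
    avg_transaction_bytes = throughput_bytes / num_grants if num_grants else 0.0
    return bytes_per_second, avg_transaction_bytes


# Hypothetical snapshot: 4096 bytes in 64 grants over a 1000-cycle window at 1 MHz.
bandwidth, avg_size = window_stats(4096, 64, 1000, 1_000_000)
```

Because the counters restart on every window expiry, each snapshot can be treated independently; a host would typically collect one such tuple per window and plot it over time.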

Additional Chip Level Statistics: DMA Tooling
[Figure: an event (EVT) triggers a TR submit that reads and writes data; its completion code (TCC early or TCC true) chains to tooling TR submits that read and write statistics; submit and completion events are routed to AET for triggering/counting.]
  DMA can be used to record transaction data:
    The DMA channel servicing data can chain to a tooling channel.
    The tooling channel captures stats either at time of submit or at time of completion.
  DMA can route system events and submit or completion events to AET:
    AET can count or time DMA transactions.
    AET can trace DMA events.

Chip Level Statistics: XMC
  Counters for profiling based on prefetch logic:
    Four counters per core.
    Counting mode is configured through software.
    Several profiling data points can be derived from registers directly.

Chip Level Statistics: MSMC
MSMC profile registers allow a view into memory access arbitration between masters.
  Register   Description
  SMATH      Transmit header for profile data export through STM
  SMAH       Access counter to SRAM
  SMAMP      Priority elevation counter for MSMC memory arbiters and accumulated waiting cycles
  SMAEP      Priority elevation counter for MSMC memory arbiters and accumulated waiting cycles
  SMAC       CPU filter, PrivID filter, priority threshold filter, enable/clear
MSMC can export the state of the performance analysis registers with EDMA transfers through the SMS port.
  The SMATH register can be programmed with the transmit header appropriate for the trace export protocol used.
  A 32-byte EDMA read reads out all the performance analysis registers along with the header as a pre-formatted trace packet in a single data phase.
  This kind of transfer clears all the count registers while retaining the state of SMATH and SMAC, which restarts the counts for a new analysis window.

Chip Level Statistics: DDR3
  Two counters in the EMIF controller
  Can set a master ID mask to filter for specific requestors
  Can configure each counter to count events:

Embedded System Monitor
All preceding functionality is required in the development environment (i.e. with a JTAG debugger). Many of the features are needed as part of an embedded solution as well.
  Allows debug in the system test environment
    The host can communicate with the embedded application and control functions for debug during real system scenarios.
    Requires resident target software to service the host commands.
    Information can be taken from any device/core in a system without needing JTAG connectivity.
  Allows detailed failure analysis in a deployed system
    Allows detailed device activity to be recorded in memory and made available through a memory dump to analyze faults.
    Depending on accessibility through operator premises, it may be possible to collect information interactively, as in the system test environment.

Instrumentation
[Figure: instrumentation architecture. The client host (DSS scripts, CCS4, DVT, event logs, scriptable Java classes, stackable STM decoder framework, TCP/IP interface such as an http client, UUID metadata, XML endpoint description) performs remote debugging and monitoring over a transport such as Ethernet, with TCF for back-channel communications. The target device contains Ethernet PHY and MAC, a TCP/IP interface, TCF agent, UUID monitor, log server, DCI, AET library, STM library, ILogger with LogWrite calls from the application, OSAL and HAL, a C64x+ processor core, EDMA, memory, CP_Tracer modules, STM, multiple ETBs, and the global timestamp. Legend: control and status path; data path; interface of one or more device pins.]
  STM: System Trace Module
  UIA: Unified Instrumentation Architecture

Example Event Trace
Use of a common event allows correlation between CorePac event traces. Correlation is done by the host.

  Entry    CorePac 0               CorePac 1     CorePac 2
  Time00   TIME_EVENT              TIME_EVENT    TIME_EVENT
  Time01   PCxxxoffset + timing    EDMA_INT3     EDMA_INT6
  Time02   PCxx4offset + timing    EDMA_INT4     EDMA_INT7
  Time03   PCxxxoffset + timing    EDMA_INT5     EDMA_INT6
  Time04   TIME_EVENT              EDMA_INT3     EDMA_INT6
  Time05   EDMA_INT0               EDMA_INT4     EDMA_INT6
  Time06   EDMA_INT1               EDMA_INT5     TIME_EVENT
  Time07   EDMA_INT2               TIME_EVENT    EDMA_INT7
  Time08   TIME_EVENT              EDMA_INT3     EDMA_INT7
  Time09   EDMA_INT0               EDMA_INT4     EDMA_INT7
  Time10   EDMA_INT1               EDMA_INT5     EDMA_INT6
  Time11   EDMA_INT2               EDMA_INT3     EDMA_INT7
  Time12                           EDMA_INT4     EDMA_INT6
  Time13                           EDMA_INT5     EDMA_INT7

Example EDMA Log
Use a common event from the event trace to correlate the event, EDMA, and API logs. Correlation is done by the host.
  Assumes that TIME_EVENT is based on UMTS time.
  Assumes that the time recorded with API calls is the UMTS counter, related to the TIME_EVENT.
  Time_axx = actual time at API call (Time_axx / period = TIME_EVENT window)
  TIME_EVENT = periodic event based on time; defines the event trace window
  Countxx = processor count at time of event

  Core 0:
    (Count00) TINT0
    (Count01) EDMA_INT0
    (Count02) EDMA_INT1
    (Count03) EDMA_INT2
    (Count04) TIME_EVENT
    (Count05) EDMA_INT0
    (Count06) EDMA_INT1
    (Count07) EDMA_INT2
    (Count08) TIME_EVENT
    (Count09) EDMA_INT0
    (Count10) EDMA_INT1
    (Count11) EDMA_INT2

  EDMA LOG:
    TIME_EVENT0:
      UMTS Time 0
      Processor task log for time period 0
      DDR statistics for time period 0
      RAC statistics for time period 0
      AIF statistics for time period 0
    TIME_EVENT1:
      UMTS Time 1
      Processor task log for time period 1
      DDR statistics for time period 1
      RAC statistics for time period 1
      AIF statistics for time period 1

  API LOG A:
    (Time_a00) log_struct
    (Time_a01) API_call01
    (Time_a02) API_call02
    (Time_a03) API_call03
    (Time_a04) API_call04
    (Time_a05) API_call05
    (Time_a06) API_call06
    (Time_a07) API_call07
    (Time_a08) API_call08
    (Time_a09) API_call09
    (Time_a10) API_call10
    (Time_a11) API_call11
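Host-side correlation with a common event amounts to cutting each core's event list at every TIME_EVENT and lining up the resulting windows by index. The sketch below illustrates that idea only; the function names, the dictionary layout, and the shortened sample logs are assumptions made for the example and are not part of any TI host tool.

```python
def split_windows(events, marker="TIME_EVENT"):
    """Group one core's event names into windows, each opened by `marker`."""
    windows = []
    for name in events:
        if name == marker:
            windows.append([])          # a new TIME_EVENT window starts here
        elif windows:
            windows[-1].append(name)    # event belongs to the current window
    return windows


def correlate(logs):
    """Align per-core windows by index: window i on every core shares TIME_EVENT i."""
    per_core = {core: split_windows(events) for core, events in logs.items()}
    common = min(len(windows) for windows in per_core.values())
    return [{core: per_core[core][i] for core in per_core} for i in range(common)]


# Hypothetical, shortened logs in the style of the example event trace.
logs = {
    "CorePac0": ["TIME_EVENT", "EDMA_INT0", "EDMA_INT1", "TIME_EVENT", "EDMA_INT0"],
    "CorePac1": ["TIME_EVENT", "EDMA_INT3", "TIME_EVENT", "EDMA_INT3", "EDMA_INT4"],
}
aligned = correlate(logs)
```

The same windowing would apply to API logs by computing Time_axx / period to find the window index, as stated in the slide.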

Possible System Log Display
  The goal is to build a transaction timing diagram that shows transfer flow to the various endpoints.
  A per-slave timing diagram can show transaction points for each master under observation.
[Figure: per-slave timing diagram with one row per master (Master0, Master1, ..., Mastern) and transaction points labeled by event letter and transaction number, e.g. A0 B0 D0 E0, A1 B1 C1, A2 B2 D2 E2, A3 B3 D3, A4 B4.]

Statistical Profiling
  Goal: get a quick overall view of which functions in an application consume the most cycles.
  Sampling every program counter value in an application quickly consumes buffer bandwidth, preventing analysis of the entire application.
  This problem can be eliminated by capturing only a statistical sample of application execution.

Statistical Profiling Overview
  The program address is sampled at regular intervals.
  Statistical analysis is performed on the captured samples.
  As in any statistical analysis, the determinations made on the statistical sample can be related to the general population.
[Figure: sample distribution across functions Fxn1, Fxn2, Fxn3, Fxn4, Fxn5, ...]

Statistical Profiling Results
  Comma-separated value format
  Sorted from most intensive functions to least
  AET contains all of the hardware needed to capture trace samples at a specified interval.
  The interval should be carefully chosen so as not to coincide with a periodic function.
  Application instrumentation can switch AET off in locations that are not of interest.
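The post-processing step described above, mapping sampled program addresses to functions and emitting CSV sorted from most to least intensive, can be sketched as follows. The symbol table, sample addresses, and the `profile` function are hypothetical; a real flow would take function start addresses from the linker map or debug information.

```python
import bisect
from collections import Counter


def profile(samples, symbols):
    """Attribute sampled PCs to functions and return CSV rows, busiest first.

    symbols: list of (start_address, name), sorted by start address.
    Each row is "name,sample_count,percent_of_samples".
    """
    starts = [start for start, _ in symbols]
    counts = Counter()
    for pc in samples:
        index = bisect.bisect_right(starts, pc) - 1   # last function starting at or before pc
        if index >= 0:
            counts[symbols[index][1]] += 1
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{name},{count},{100.0 * count / total:.1f}" for name, count in ranked]


# Hypothetical symbol table and PC samples.
symbols = [(0x1000, "Fxn1"), (0x2000, "Fxn2"), (0x3000, "Fxn3")]
samples = [0x1004, 0x2010, 0x2020, 0x3000, 0x2FFC, 0x1100]
rows = profile(samples, symbols)
```

With a sampling interval that does not line up with any periodic function, the percentage column approximates each function's share of total cycles.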

Interrupt Profiling Overview
  Capture the program address and timestamp whenever the PC is within the interrupt vector table.
  Generate a cycle-accurate picture of when each interrupt starts executing.
  Goal: graphically display cycle-accurate interrupt servicing frequency.
[Figure: raw data and processed data views.]

Interrupt Profiling Log
[Table residue: columns PC and Cycles. The PC values begin with 0x00897C; the remaining digits and the cycle counts were lost in extraction.]
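Turning a PC/cycle log into interrupt servicing intervals is a small post-processing step. The sketch below assumes, for illustration only, a vector table of 16 entries with 0x20 bytes per vector at a hypothetical base address; the function name and the sample log are likewise invented for the example and do not reproduce the log on the slide.

```python
def interrupt_periods(log, ist_base, vector_bytes=0x20, num_vectors=16):
    """Return {vector_number: [cycles between successive entries]}.

    log: list of (pc, cycle) pairs captured while the PC is traced.
    Only the first address of each vector slot counts as an interrupt entry;
    other addresses inside the slot belong to the same servicing.
    """
    last_entry = {}
    periods = {}
    for pc, cycle in log:
        offset = pc - ist_base
        if not 0 <= offset < vector_bytes * num_vectors:
            continue                      # PC is outside the vector table
        if offset % vector_bytes:
            continue                      # not the first instruction of a vector
        vector = offset // vector_bytes
        periods.setdefault(vector, [])
        if vector in last_entry:
            periods[vector].append(cycle - last_entry[vector])
        last_entry[vector] = cycle
    return periods


# Hypothetical log: vector 4 fires every 1000 cycles, vector 5 fires once.
log = [(0x1080, 100), (0x1084, 101), (0x10A0, 250),
       (0x1080, 1100), (0x5000, 1500), (0x1080, 2100)]
periods = interrupt_periods(log, ist_base=0x1000)
```

Plotting each list of periods against time gives the servicing-frequency view that the slide names as the goal.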

Interrupt Profiling
[Figure: interrupt profiling display.]

Thread Aware Profiling
  Goal: generate a cycle-accurate execution graph of a thread/task-based application.
[Figure: execution graph by thread (name).]
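An execution graph of this kind can be built from nothing more than a time-ordered list of (timestamp, thread ID) records, one per task switch. The sketch below shows that reduction on the host side; the function names, the record layout, and the sample values are assumptions for illustration.

```python
def thread_intervals(writes, end_cycle):
    """Convert task-switch records into (thread_id, start, stop) intervals.

    writes:    time-ordered list of (cycle, thread_id), one per task switch
    end_cycle: cycle count at which the capture ended
    """
    boundaries = writes[1:] + [(end_cycle, None)]
    return [(thread_id, start, stop)
            for (start, thread_id), (stop, _) in zip(writes, boundaries)]


def cycles_per_thread(intervals):
    """Sum the cycles each thread spent running."""
    totals = {}
    for thread_id, start, stop in intervals:
        totals[thread_id] = totals.get(thread_id, 0) + (stop - start)
    return totals


# Hypothetical capture: thread 1 runs, then 2, then 1 again, then 3.
writes = [(0, 1), (400, 2), (1000, 1), (1300, 3)]
intervals = thread_intervals(writes, end_cycle=1500)
totals = cycles_per_thread(intervals)
```

The interval list drives the execution graph directly, and the per-thread totals give a quick load summary.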

Thread Aware Profiling
  Instrument the task/thread switch function to write the task/thread ID to a well-known location (a global variable). Operating systems typically provide hooks to insert functions in this location.
  Trace all of the writes to that location, and get a timestamp with each.

For More Information
  Advanced Event Triggering (AET):
  AETLib:
  Embedded Trace Buffer (ETB):
  Trace Buffer:
  Debugging with Trace:
  For questions regarding topics covered in this training, visit the support forums at the TI E2E Community website.

XDS560 Trace. Technology Showcase. Daniel Rinkes Texas Instruments

XDS560 Trace. Technology Showcase. Daniel Rinkes Texas Instruments XDS560 Trace Technology Showcase Daniel Rinkes Texas Instruments Agenda AET / XDS560 Trace Overview Interrupt Profiling Statistical Profiling Thread Aware Profiling Thread Aware Dynamic Call Graph Agenda

More information

XDS560 Trace. Advanced Use Cases for Profiling. Daniel Rinkes Texas Instruments

XDS560 Trace. Advanced Use Cases for Profiling. Daniel Rinkes Texas Instruments XDS560 Trace Advanced Use Cases for Profiling Daniel Rinkes Texas Instruments Agenda AET / XDS560Trace Overview Interrupt Profiling Statistical Profiling Thread Aware Profiling Thread Aware Dynamic Call

More information

SoC Overview. Multicore Applications Team

SoC Overview. Multicore Applications Team KeyStone C66x ulticore SoC Overview ulticore Applications Team KeyStone Overview KeyStone Architecture & Internal Communications and Transport External Interfaces and s Debug iscellaneous Application and

More information

KeyStone C66x Multicore SoC Overview. Dec, 2011

KeyStone C66x Multicore SoC Overview. Dec, 2011 KeyStone C66x Multicore SoC Overview Dec, 011 Outline Multicore Challenge KeyStone Architecture Reminder About KeyStone Solution Challenge Before KeyStone Multicore performance degradation Lack of efficient

More information

KeyStone C665x Multicore SoC

KeyStone C665x Multicore SoC KeyStone Multicore SoC Architecture KeyStone C6655/57: Device Features C66x C6655: One C66x DSP Core at 1.0 or 1.25 GHz C6657: Two C66x DSP Cores at 0.85, 1.0, or 1.25 GHz Fixed and Floating Point Operations

More information

KeyStone Training. Turbo Encoder Coprocessor (TCP3E)

KeyStone Training. Turbo Encoder Coprocessor (TCP3E) KeyStone Training Turbo Encoder Coprocessor (TCP3E) Agenda Overview TCP3E Overview TCP3E = Turbo CoProcessor 3 Encoder No previous versions, but came out at same time as third version of decoder co processor

More information

Keystone Architecture Inter-core Data Exchange

Keystone Architecture Inter-core Data Exchange Application Report Lit. Number November 2011 Keystone Architecture Inter-core Data Exchange Brighton Feng Vincent Han Communication Infrastructure ABSTRACT This application note introduces various methods

More information

TMS320C6678 Memory Access Performance

TMS320C6678 Memory Access Performance Application Report Lit. Number April 2011 TMS320C6678 Memory Access Performance Brighton Feng Communication Infrastructure ABSTRACT The TMS320C6678 has eight C66x cores, runs at 1GHz, each of them has

More information

C66x KeyStone Training HyperLink

C66x KeyStone Training HyperLink C66x KeyStone Training HyperLink 1. HyperLink Overview 2. Address Translation 3. Configuration 4. Example and Demo Agenda 1. HyperLink Overview 2. Address Translation 3. Configuration 4. Example and Demo

More information

KeyStone Training. Power Management

KeyStone Training. Power Management KeyStone Training Management Overview Domains Clock Domains States SmartReflex Agenda Overview Domains Clock Domains States SmartReflex C66x Overview New Management Features New features: Switchable Logic

More information

C66x KeyStone Training HyperLink

C66x KeyStone Training HyperLink C66x KeyStone Training HyperLink 1. HyperLink Overview 2. Address Translation 3. Configuration 4. Example and Demo Agenda 1. HyperLink Overview 2. Address Translation 3. Configuration 4. Example and Demo

More information

KeyStone Training. Multicore Navigator Overview

KeyStone Training. Multicore Navigator Overview KeyStone Training Multicore Navigator Overview What is Navigator? Overview Agenda Definition Architecture Queue Manager Sub-System (QMSS) Packet DMA () Descriptors and Queuing What can Navigator do? Data

More information

Technical Note on NGMP Verification. Next Generation Multipurpose Microprocessor. Contract: 22279/09/NL/JK

Technical Note on NGMP Verification. Next Generation Multipurpose Microprocessor. Contract: 22279/09/NL/JK NGP-EVAL-0013 Date: 2010-12-20 Page: 1 of 7 Technical Note on NGP Verification Next Generation ultipurpose icroprocessor Contract: 22279/09/NL/JK Aeroflex Gaisler AB EA contract: 22279/09/NL/JK Deliverable:

More information

FPQ6 - MPC8313E implementation

FPQ6 - MPC8313E implementation Formation MPC8313E implementation: This course covers PowerQUICC II Pro MPC8313 - Processeurs PowerPC: NXP Power CPUs FPQ6 - MPC8313E implementation This course covers PowerQUICC II Pro MPC8313 Objectives

More information

Multi-core microcontroller design with Cortex-M processors and CoreSight SoC

Multi-core microcontroller design with Cortex-M processors and CoreSight SoC Multi-core microcontroller design with Cortex-M processors and CoreSight SoC Joseph Yiu, ARM Ian Johnson, ARM January 2013 Abstract: While the majority of Cortex -M processor-based microcontrollers are

More information

file://c:\documents and Settings\degrysep\Local Settings\Temp\~hh607E.htm

file://c:\documents and Settings\degrysep\Local Settings\Temp\~hh607E.htm Page 1 of 18 Trace Tutorial Overview The objective of this tutorial is to acquaint you with the basic use of the Trace System software. The Trace System software includes the following: The Trace Control

More information

On-Chip Debugging of Multicore Systems

On-Chip Debugging of Multicore Systems Nov 1, 2008 On-Chip Debugging of Multicore Systems PN115 Jeffrey Ho AP Technical Marketing, Networking Systems Division of Freescale Semiconductor, Inc. All other product or service names are the property

More information

HyperLink Programming and Performance consideration

HyperLink Programming and Performance consideration Application Report Lit. Number July, 2012 HyperLink Programming and Performance consideration Brighton Feng Communication Infrastructure ABSTRACT HyperLink provides a highest-speed, low-latency, and low-pin-count

SEMICON Solutions. Bus Structure. Created by: Duong Dang Date: 20th Oct, 2010

SEMICON Solutions. Bus Structure. Created by: Duong Dang Date: 20th Oct, 2010 SEMICON Solutions Bus Structure Created by: Duong Dang Date: 20th Oct, 2010 Introduction Buses are the simplest and most widely used interconnection networks A number of modules is connected via a single

ARM Processors for Embedded Applications

ARM Processors for Embedded Applications ARM Processors for Embedded Applications Roadmap for ARM Processors ARM Architecture Basics ARM Families AMBA Architecture 1 Current ARM Core Families ARM7: Hard cores and Soft cores Cache with MPU or

TMS320C64x EDMA Architecture

TMS320C64x EDMA Architecture Application Report SPRA994 March 2004 TMS320C64x EDMA Architecture Jeffrey Ward Jamon Bowen TMS320C6000 Architecture ABSTRACT The enhanced DMA (EDMA) controller of the TMS320C64x device is a highly efficient

Heterogeneous Multi-Processor Coherent Interconnect

Heterogeneous Multi-Processor Coherent Interconnect Heterogeneous Multi-Processor Coherent Interconnect Kai Chirca, Matthew Pierson Processors, Texas Instruments Inc, Dallas TX Agenda TI KeyStoneII Architecture and MSMC (Multicore Shared Memory Controller)

KeyStone Training. Keystone Device Tooling

KeyStone Training. Keystone Device Tooling KeyStone Training Keystone Device Tooling Agenda Code Composer Studio v4 Keystone Architecture Simulator Multicore Application Deployment OpenMP Initiative Code Composer Studio v4 Code Composer Studio

Test and Verification Solutions. ARM Based SOC Design and Verification

Test and Verification Solutions. ARM Based SOC Design and Verification Test and Verification Solutions ARM Based SOC Design and Verification 7 July 2008 1 7 July 2008 14 March 2 Agenda System Verification Challenges ARM SoC DV Methodology ARM SoC Test bench Construction Conclusion

Intelop. *As new IP blocks become available, please contact the factory for the latest updated info.

Intelop. *As new IP blocks become available, please contact the factory for the latest updated info. A FPGA based development platform as part of an EDK is available to target intelop provided IPs or other standard IPs. The platform with Virtex-4 FX12 Evaluation Kit provides a complete hardware environment

Migrating RC3233x Software to the RC32434/5 Device

Migrating RC3233x Software to the RC32434/5 Device Migrating RC3233x Software to the RC32434/5 Device Application Note AN-445 Introduction By Harpinder Singh and Nebojsa Bjegovic Operating system kernels, board support packages, and other processor-aware

Unit 3 and Unit 4: Chapter 4 INPUT/OUTPUT ORGANIZATION

Unit 3 and Unit 4: Chapter 4 INPUT/OUTPUT ORGANIZATION Unit 3 and Unit 4: Chapter 4 INPUT/OUTPUT ORGANIZATION Introduction A general purpose computer should have the ability to exchange information with a wide range of devices in varying environments. Computers

Welcome to this presentation of the STM32 direct memory access controller (DMA). It covers the main features of this module, which is widely used to

Welcome to this presentation of the STM32 direct memory access controller (DMA). It covers the main features of this module, which is widely used to Welcome to this presentation of the STM32 direct memory access controller (DMA). It covers the main features of this module, which is widely used to handle the STM32 peripheral data transfers. 1 The Direct

Nexus Instrumentation architectures and the new Debug Specification

Nexus Instrumentation architectures and the new Debug Specification Nexus 5001 - Instrumentation architectures and the new Debug Specification Neal Stollon, HDL Dynamics Chairman, Nexus 5001 Forum neals@hdldynamics.com nstollon@nexus5001.org HDL Dynamics SoC Solutions

ARM's IP and OSCI TLM 2.0

ARM's IP and OSCI TLM 2.0 ARM's IP and OSCI TLM 2.0 Deploying Implementations of IP at the Programmer's View abstraction level via RealView System Generator ESL Marketing and Engineering System Design Division ARM Q108 1 Contents

Copyright 2016 Xilinx

Copyright 2016 Xilinx Zynq Architecture Zynq Vivado 2015.4 Version This material exempt per Department of Commerce license exception TSU Objectives After completing this module, you will be able to: Identify the basic building

The Challenges of System Design. Raising Performance and Reducing Power Consumption

The Challenges of System Design. Raising Performance and Reducing Power Consumption The Challenges of System Design Raising Performance and Reducing Power Consumption 1 Agenda The key challenges Visibility for software optimisation Efficiency for improved PPA 2 Product Challenge - Software

Performance Optimization for an ARM Cortex-A53 System Using Software Workloads and Cycle Accurate Models. Jason Andrews

Performance Optimization for an ARM Cortex-A53 System Using Software Workloads and Cycle Accurate Models. Jason Andrews Performance Optimization for an ARM Cortex-A53 System Using Software Workloads and Cycle Accurate Models Jason Andrews Agenda System Performance Analysis IP Configuration System Creation Methodology: Create,

TMS320C672x DSP Dual Data Movement Accelerator (dmax) Reference Guide

TMS320C672x DSP Dual Data Movement Accelerator (dmax) Reference Guide TMS320C672x DSP Dual Data Movement Accelerator (dmax) Reference Guide Literature Number: SPRU795D November 2005 Revised October 2007 2 SPRU795D November 2005 Revised October 2007 Contents Preface... 11

The S6000 Family of Processors

The S6000 Family of Processors The S6000 Family of Processors Today's Design Challenges The advent of software configurable processors In recent years, the widespread adoption of digital technologies has revolutionized the way in which

Designing Embedded Processors in FPGAs

Designing Embedded Processors in FPGAs Designing Embedded Processors in FPGAs 2002 Agenda Industrial Control Systems Concept Implementation Summary & Conclusions Industrial Control Systems Typically Low Volume Many Variations Required High

Debugging with System Analyzer. Todd Mullanix TI-RTOS Apps Manager Oct. 15, 2017

Debugging with System Analyzer. Todd Mullanix TI-RTOS Apps Manager Oct. 15, 2017 Debugging with System Analyzer Todd Mullanix TI-RTOS Apps Manager Oct. 15, 2017 Abstract In software engineering, tracing involves a specialized use of logging to record information about a program's execution.

Renesas 78K/78K0R/RL78 Family In-Circuit Emulation

Renesas 78K/78K0R/RL78 Family In-Circuit Emulation _ Technical Notes V9.12.225 Renesas 78K/78K0R/RL78 Family In-Circuit Emulation This document is intended to be used together with the CPU reference manual provided by the silicon vendor. This document

Introduction Electrical Considerations Data Transfer Synchronization Bus Arbitration VME Bus Local Buses PCI Bus PCI Bus Variants Serial Buses

Introduction Electrical Considerations Data Transfer Synchronization Bus Arbitration VME Bus Local Buses PCI Bus PCI Bus Variants Serial Buses Introduction Electrical Considerations Data Transfer Synchronization Bus Arbitration VME Bus Local Buses PCI Bus PCI Bus Variants Serial Buses 1 Most of the integrated I/O subsystems are connected to the

FPGA Adaptive Software Debug and Performance Analysis

FPGA Adaptive Software Debug and Performance Analysis white paper Intel Adaptive Software Debug and Performance Analysis Authors Javier Orensanz Director of Product Management, System Design Division ARM Stefano Zammattio Product Manager Intel Corporation

Using ARM ETB with TI CCS. CCS 3.3 with SR9 on TMS320DM6446

Using ARM ETB with TI CCS. CCS 3.3 with SR9 on TMS320DM6446 Using ARM ETB with TI CCS CCS 3.3 with SR9 on TMS320DM6446 1 ETB Usage Brief Tutorial 1. Setup CCS setup configuration to include the ETB. 2. Connect to the target (including the ETB) 3. Select the ETB

2 MARKS Q&A 1 KNREDDY UNIT-I

2 MARKS Q&A 1 KNREDDY UNIT-I 2 MARKS Q&A 1 KNREDDY UNIT-I 1. What is bus; list the different types of buses with its function. A group of lines that serves as a connecting path for several devices is called a bus; TYPES: ADDRESS BUS,

William Stallings Computer Organization and Architecture 10 th Edition Pearson Education, Inc., Hoboken, NJ. All rights reserved.

William Stallings Computer Organization and Architecture 10 th Edition Pearson Education, Inc., Hoboken, NJ. All rights reserved. + William Stallings Computer Organization and Architecture 10 th Edition 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved. 2 + Chapter 3 A Top-Level View of Computer Function and Interconnection

Fujitsu System Applications Support. Fujitsu Microelectronics America, Inc. 02/02

Fujitsu System Applications Support. Fujitsu Microelectronics America, Inc. 02/02 Fujitsu System Applications Support 1 Overview System Applications Support SOC Application Development Lab Multimedia VoIP Wireless Bluetooth Processors, DSP and Peripherals ARM Reference Platform 2 SOC

Choosing the Appropriate Simulator Configuration in Code Composer Studio IDE

Choosing the Appropriate Simulator Configuration in Code Composer Studio IDE Application Report SPRA864 November 2002 Choosing the Appropriate Simulator Configuration in Code Composer Studio IDE Pankaj Ratan Lal, Ambar Gadkari Software Development Systems ABSTRACT Software development

Buses. Maurizio Palesi. Maurizio Palesi 1

Buses. Maurizio Palesi. Maurizio Palesi 1 Buses Maurizio Palesi Maurizio Palesi 1 Introduction Buses are the simplest and most widely used interconnection networks A number of modules is connected via a single shared channel Microcontroller Microcontroller

It's not about the core, it's about the system

It's not about the core, it's about the system It's not about the core, it's about the system Gajinder Panesar, CTO, UltraSoC gajinder.panesar@ultrasoc.com RISC-V Workshop 18 19 July 2018 Chennai, India Overview Architecture overview Example Scenarios

2. System Interconnect Fabric for Memory-Mapped Interfaces

2. System Interconnect Fabric for Memory-Mapped Interfaces 2. System Interconnect Fabric for Memory-Mapped Interfaces QII54003-8.1.0 Introduction The system interconnect fabric for memory-mapped interfaces is a high-bandwidth interconnect structure for connecting

Accessing I/O Devices Interface to CPU and Memory Interface to one or more peripherals Generic Model of IO Module Interface for an IO Device: CPU checks I/O module device status I/O module returns status

Fast, Scalable and Energy Efficient IO Solutions: Accelerating infrastructure SoC time-to-market

Fast, Scalable and Energy Efficient IO Solutions: Accelerating infrastructure SoC time-to-market Fast, Scalable and Energy Efficient IO Solutions: Accelerating infrastructure SoC time-to-market Sridhar Valluru Product Manager ARM Tech Symposia 2016 Intelligent Flexible Cloud Scalability and Flexibility

April 4, 2001: Debugging Your C24x DSP Design Using Code Composer Studio Real-Time Monitor

April 4, 2001: Debugging Your C24x DSP Design Using Code Composer Studio Real-Time Monitor This presentation was part of TI's Monthly TMS320 DSP Technology Webcast Series April 4, 2001: Debugging Your C24x DSP Design Using Code Composer Studio Real-Time Monitor To view this 1-hour webcast

TMS320C642x DSP Peripheral Component Interconnect (PCI) User's Guide

TMS320C642x DSP Peripheral Component Interconnect (PCI) User's Guide TMS320C642x DSP Peripheral Component Interconnect (PCI) User's Guide Literature Number: SPRUEN3C May 2010 2 Preface... 8 1 Introduction... 9 1.1 Purpose of the Peripheral... 9 1.2 Features... 9 1.3 Features

The CoreConnect Bus Architecture

The CoreConnect Bus Architecture The CoreConnect Bus Architecture Recent advances in silicon densities now allow for the integration of numerous functions onto a single silicon chip. With this increased density, peripherals formerly attached

Hercules ARM Cortex-R4 System Architecture. Processor Overview

Hercules ARM Cortex-R4 System Architecture. Processor Overview Hercules ARM Cortex-R4 System Architecture Processor Overview What is Hercules? TI's 32-bit ARM Cortex-R4/R5 MCU family for Industrial, Automotive, and Transportation Safety Hardware Safety Features

Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan

Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan Processors Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan chanhl@mail.cgu.edu.tw General-purpose processor Control unit Controller Control/status Datapath ALU

Simplifying the Development and Debug of 8572-Based SMP Embedded Systems. Wind River Workbench Development Tools

Simplifying the Development and Debug of 8572-Based SMP Embedded Systems. Wind River Workbench Development Tools Simplifying the Development and Debug of 8572-Based SMP Embedded Systems Wind River Workbench Development Tools Agenda Introducing multicore systems Debugging challenges of multicore systems Development

_ V Renesas R8C In-Circuit Emulation. Contents. Technical Notes

_ V Renesas R8C In-Circuit Emulation. Contents. Technical Notes _ V9.12.225 Technical Notes Renesas R8C In-Circuit Emulation This document is intended to be used together with the CPU reference manual provided by the silicon vendor. This document assumes knowledge

Product Technical Brief S3C2413 Rev 2.2, Apr. 2006

Product Technical Brief S3C2413 Rev 2.2, Apr. 2006 Product Technical Brief S3C2413 Rev 2.2, Apr. 2006 Overview SAMSUNG's S3C2413 is a derivative product of S3C2410A. S3C2413 is designed to provide hand-held devices and general applications with cost-effective, low-power, and

Keystone ROM Boot Loader (RBL)

Keystone ROM Boot Loader (RBL) Keystone Bootloader Keystone ROM Boot Loader (RBL) RBL is a code used for the device startup. RBL also transfers application code from memory or host to high speed internal memory or DDR3 RBL code is burned

(Advanced) Computer Organization & Architechture. Prof. Dr. Hasan Hüseyin BALIK (3 rd Week)

(Advanced) Computer Organization & Architechture. Prof. Dr. Hasan Hüseyin BALIK (3 rd Week) + (Advanced) Computer Organization & Architechture Prof. Dr. Hasan Hüseyin BALIK (3 rd Week) + Outline 2. The computer system 2.1 A Top-Level View of Computer Function and Interconnection 2.2 Cache Memory

Module 12: I/O Systems

Module 12: I/O Systems Module 12: I/O Systems I/O Hardware Application I/O Interface Kernel I/O Subsystem Transforming I/O Requests to Hardware Operations Performance 12.1 I/O Hardware Incredible variety of I/O devices Common

TMS320C674x/OMAP-L1x Processor General-Purpose Input/Output (GPIO) User's Guide

TMS320C674x/OMAP-L1x Processor General-Purpose Input/Output (GPIO) User's Guide TMS320C674x/OMAP-L1x Processor General-Purpose Input/Output (GPIO) User's Guide Literature Number: SPRUFL8B June 2010 2 Preface... 7 1 Introduction... 9 1.1 Purpose of the Peripheral... 9 1.2 Features...

System Debug. This material exempt per Department of Commerce license exception TSU Xilinx, Inc. All Rights Reserved

System Debug. This material exempt per Department of Commerce license exception TSU Xilinx, Inc. All Rights Reserved System Debug This material exempt per Department of Commerce license exception TSU Objectives After completing this module, you will be able to: Describe GNU Debugger (GDB) functionality Describe Xilinx

The Nios II Family of Configurable Soft-core Processors

The Nios II Family of Configurable Soft-core Processors The Nios II Family of Configurable Soft-core Processors James Ball August 16, 2005 2005 Altera Corporation Agenda Nios II Introduction Configuring your CPU FPGA vs. ASIC CPU Design Instruction Set Architecture

KeyStone Training. Bootloader

KeyStone Training. Bootloader KeyStone Training Bootloader Overview Configuration Device Startup Summary Agenda Overview Configuration Device Startup Summary Boot Overview Boot Mode Details Boot is driven on a device reset. Initial

Interconnects, Memory, GPIO

Interconnects, Memory, GPIO Interconnects, Memory, GPIO Dr. Francesco Conti f.conti@unibo.it Slide contributions adapted from STMicroelectronics and from Dr. Michele Magno, others Processor vs. MCU Pipeline Harvard architecture Separate

With Fixed Point or Floating Point Processors!!

With Fixed Point or Floating Point Processors!! Product Information Sheet High Throughput Digital Signal Processor OVERVIEW With Fixed Point or Floating Point Processors!! Performance Up to 14.4 GIPS or 7.7 GFLOPS Peak Processing Power Continuous Input

KeyStone Training Serial RapidIO (SRIO) Subsystem

KeyStone Training Serial RapidIO (SRIO) Subsystem KeyStone Training Serial RapidIO (SRIO) Subsystem SRIO Overview SRIO Overview DirectIO Operation Message Passing Operation Other RapidIO Features Summary Introduction To RapidIO Two Basic Modes of Operation:

LEON4: Fourth Generation of the LEON Processor

LEON4: Fourth Generation of the LEON Processor LEON4: Fourth Generation of the LEON Processor Magnus Själander, Sandi Habinc, and Jiri Gaisler Aeroflex Gaisler, Kungsgatan 12, SE-411 19 Göteborg, Sweden Tel +46 31 775 8650, Email: {magnus, sandi, jiri}@gaisler.com

FPQ9 - MPC8360E implementation

FPQ9 - MPC8360E implementation Training MPC8360E implementation: This course covers PowerQUICC II Pro MPC8360E - PowerPC processors: NXP Power CPUs FPQ9 - MPC8360E implementation This course covers PowerQUICC II Pro MPC8360E Objectives

The control of I/O devices is a major concern for OS designers

The control of I/O devices is a major concern for OS designers Lecture Overview I/O devices I/O hardware Interrupts Direct memory access Device dimensions Device drivers Kernel I/O subsystem Operating Systems - June 26, 2001 I/O Device Issues The control of I/O devices

Section 6 Blackfin ADSP-BF533 Memory

Section 6 Blackfin ADSP-BF533 Memory Section 6 Blackfin ADSP-BF533 Memory 6-1 a ADSP-BF533 Block Diagram Core Timer 64 L1 Instruction Memory Performance Monitor JTAG/ Debug Core Processor LD0 32 LD1 32 L1 Data Memory SD32 DMA Mastered 32

Achieving UFS Host Throughput For System Performance

Achieving UFS Host Throughput For System Performance Achieving UFS Host Throughput For System Performance Yifei-Liu CAE Manager, Synopsys Mobile Forum 2013 Copyright 2013 Synopsys Agenda UFS Throughput Considerations to Meet Performance Objectives UFS Host

EE108B Lecture 17 I/O Buses and Interfacing to CPU. Christos Kozyrakis Stanford University

EE108B Lecture 17 I/O Buses and Interfacing to CPU. Christos Kozyrakis Stanford University EE108B Lecture 17 I/O Buses and Interfacing to CPU Christos Kozyrakis Stanford University http://eeclass.stanford.edu/ee108b 1 Announcements Remaining deliverables PA2.2. today HW4 on 3/13 Lab4 on 3/19

Negotiating the Maze Getting the most out of memory systems today and tomorrow. Robert Kaye

Negotiating the Maze Getting the most out of memory systems today and tomorrow. Robert Kaye Negotiating the Maze Getting the most out of memory systems today and tomorrow Robert Kaye 1 System on Chip Memory Systems Systems use external memory Large address space Low cost-per-bit Large interface

An Ultra High Performance Scalable DSP Family for Multimedia. Hot Chips 17 August 2005 Stanford, CA Erik Machnicki

An Ultra High Performance Scalable DSP Family for Multimedia. Hot Chips 17 August 2005 Stanford, CA Erik Machnicki An Ultra High Performance Scalable DSP Family for Multimedia Hot Chips 17 August 2005 Stanford, CA Erik Machnicki Media Processing Challenges Increasing performance requirements Need for flexibility &

PCI-4IPM Revision C. Second Generation Intelligent IP Carrier for PCI Systems Up to Four IndustryPack Modules Dual Ported SRAM, Bus Master DMA

PCI-4IPM Revision C. Second Generation Intelligent IP Carrier for PCI Systems Up to Four IndustryPack Modules Dual Ported SRAM, Bus Master DMA PCI-4IPM Revision C Second Generation Intelligent IP Carrier for PCI Systems Up to Four IndustryPack Modules Dual Ported SRAM, Bus Master DMA REFERENCE MANUAL 781-21-000-4000 Version 2.1 April 2003 ALPHI

Operating Systems: Internals and Design Principles, 7/E William Stallings. Chapter 1 Computer System Overview

Operating Systems: Internals and Design Principles, 7/E William Stallings. Chapter 1 Computer System Overview Operating Systems: Internals and Design Principles, 7/E William Stallings Chapter 1 Computer System Overview What is an Operating System? Operating system goals: Use the computer hardware in an efficient

Instruction Register. Instruction Decoder. Control Unit (Combinational Circuit) Control Signals (These signals go to register) The bus and the ALU

Instruction Register. Instruction Decoder. Control Unit (Combinational Circuit) Control Signals (These signals go to register) The bus and the ALU Hardwired and Microprogrammed Control For each instruction, the control unit causes the CPU to execute a sequence of steps correctly. In reality, there must be control signals to assert lines on various

Hardware Design. MicroBlaze 7.1. This material exempt per Department of Commerce license exception TSU Xilinx, Inc. All Rights Reserved

Hardware Design. MicroBlaze 7.1. This material exempt per Department of Commerce license exception TSU Xilinx, Inc. All Rights Reserved Hardware Design MicroBlaze 7.1 This material exempt per Department of Commerce license exception TSU Objectives After completing this module, you will be able to: List the MicroBlaze 7.1 Features List

UNIT II PROCESSOR AND MEMORY ORGANIZATION

UNIT II PROCESSOR AND MEMORY ORGANIZATION UNIT II PROCESSOR AND MEMORY ORGANIZATION Structural units in a processor; selection of processor & memory devices; shared memory; DMA; interfacing processor, memory and I/O units; memory management Cache

Chapter 5 Input/Output Organization. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan

Chapter 5 Input/Output Organization. Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan Chapter 5 Input/Output Organization Jin-Fu Li Department of Electrical Engineering National Central University Jungli, Taiwan Outline Accessing I/O Devices Interrupts Direct Memory Access Buses Interface

Chapter 13: I/O Systems

Chapter 13: I/O Systems Chapter 13: I/O Systems I/O Hardware Application I/O Interface Kernel I/O Subsystem Transforming I/O Requests to Hardware Operations Streams Performance Objectives Explore the structure of an operating

Modeling Performance Use Cases with Traffic Profiles Over ARM AMBA Interfaces

Modeling Performance Use Cases with Traffic Profiles Over ARM AMBA Interfaces Modeling Performance Use Cases with Traffic Profiles Over ARM AMBA Interfaces Li Chen, Staff AE Cadence China Agenda Performance Challenges Current Approaches Traffic Profiles Intro Traffic Profiles Implementation

TMS320VC5503/5507/5509/5510 DSP Direct Memory Access (DMA) Controller Reference Guide

TMS320VC5503/5507/5509/5510 DSP Direct Memory Access (DMA) Controller Reference Guide TMS320VC5503/5507/5509/5510 DSP Direct Memory Access (DMA) Controller Reference Guide Literature Number: January 2007 This page is intentionally left blank. Preface About This Manual Notational Conventions

Lecture 15: I/O Devices & Drivers

Lecture 15: I/O Devices & Drivers CS 422/522 Design & Implementation of Operating Systems Lecture 15: I/O Devices & Drivers Zhong Shao Dept. of Computer Science Yale University Acknowledgement: some slides are taken from previous versions

Programmable Logic Design Grzegorz Budzyń Lecture. 15: Advanced hardware in FPGA structures

Programmable Logic Design Grzegorz Budzyń Lecture. 15: Advanced hardware in FPGA structures Programmable Logic Design Grzegorz Budzyń Lecture 15: Advanced hardware in FPGA structures Plan Introduction PowerPC block RocketIO Introduction Introduction The larger the logical chip, the more additional

William Stallings Computer Organization and Architecture 8 th Edition. Chapter 18 Multicore Computers

William Stallings Computer Organization and Architecture 8 th Edition. Chapter 18 Multicore Computers William Stallings Computer Organization and Architecture 8 th Edition Chapter 18 Multicore Computers Hardware Performance Issues Microprocessors have seen an exponential increase in performance Improved

Speeding AM335x Programmable Realtime Unit (PRU) Application Development Through Improved Debug Tools

Speeding AM335x Programmable Realtime Unit (PRU) Application Development Through Improved Debug Tools Speeding AM335x Programmable Realtime Unit (PRU) Application Development Through Improved Debug Tools The hardware modules and descriptions referred to in this document are *NOT SUPPORTED* by Texas Instruments

RISC-V Core IP Products

RISC-V Core IP Products RISC-V Core IP Products An Introduction to SiFive RISC-V Core IP Drew Barbier September 2017 drew@sifive.com SiFive RISC-V Core IP Products This presentation is targeted at embedded designers who want

Course Introduction. Purpose: Objectives: Content: 27 pages 4 questions. Learning Time: 20 minutes

Course Introduction. Purpose: Objectives: Content: 27 pages 4 questions. Learning Time: 20 minutes Course Introduction Purpose: This course provides an overview of the Direct Memory Access Controller and the Interrupt Controller on the SH-2 and SH-2A families of 32-bit RISC microcontrollers, which are

EDBG. Description. Programmers and Debuggers USER GUIDE

EDBG. Description. Programmers and Debuggers USER GUIDE Programmers and Debuggers EDBG USER GUIDE Description The Atmel Embedded Debugger (EDBG) is an onboard debugger for integration into development kits with Atmel MCUs. In addition to programming and debugging

WS_CCESSH5-OUT-v1.01.doc Page 1 of 7

WS_CCESSH5-OUT-v1.01.doc Page 1 of 7 Course Name: Course Code: Course Description: System Development with CrossCore Embedded Studio (CCES) and the ADI ADSP- SC5xx/215xx SHARC Processor Family WS_CCESSH5 This is a practical and interactive

Product Technical Brief S3C2412 Rev 2.2, Apr. 2006

Product Technical Brief S3C2412 Rev 2.2, Apr. 2006 Product Technical Brief S3C2412 Rev 2.2, Apr. 2006 Overview SAMSUNG's S3C2412 is a Derivative product of S3C2410A. S3C2412 is designed to provide hand-held devices and general applications with cost-effective,

Chapter 13: I/O Systems

Chapter 13: I/O Systems Chapter 13: I/O Systems Chapter 13: I/O Systems I/O Hardware Application I/O Interface Kernel I/O Subsystem Transforming I/O Requests to Hardware Operations Streams Performance 13.2 Silberschatz, Galvin

CISC RISC. Compiler. Compiler. Processor. Processor

CISC RISC. Compiler. Compiler. Processor. Processor Q1. Explain briefly the RISC design philosophy. Answer: RISC is a design philosophy aimed at delivering simple but powerful instructions that execute within a single cycle at a high clock speed. The RISC

MCUXpresso IDE Instruction Trace Guide. Rev May, 2018 User guide

MCUXpresso IDE Instruction Trace Guide. Rev May, 2018 User guide MCUXpresso IDE Instruction Trace Guide User guide 14 May, 2018 Copyright 2018 NXP Semiconductors All rights reserved. ii 1. Trace Overview... 1 1.1. Instruction Trace Overview... 1 1.1.1. Supported Targets...

ARM Cortex-M4 Architecture and Instruction Set 1: Architecture Overview

ARM Cortex-M4 Architecture and Instruction Set 1: Architecture Overview ARM Cortex-M4 Architecture and Instruction Set 1: Architecture Overview M J Brockway January 25, 2016 UM10562 All information provided in this document is subject to legal disclaimers. NXP B.V. 2014. All

Assembling and Debugging VPs of Complex Cycle Accurate Multicore Systems. July 2009

Assembling and Debugging VPs of Complex Cycle Accurate Multicore Systems. July 2009 Assembling and Debugging VPs of Complex Cycle Accurate Multicore Systems July 2009 Model Requirements in a Virtual Platform Control initialization, breakpoints, etc Visibility PV registers, memories, profiling

Module 12: I/O Systems

Module 12: I/O Systems Module 12: I/O Systems I/O Hardware Application I/O Interface Kernel I/O Subsystem Transforming I/O Requests to Hardware Operations Performance Operating System Concepts 12.1 Silberschatz and Galvin c
