Lab-2: Profiling m4v_dec on GR-XC3S National Chiao Tung University Chun-Jen Tsai 3/28/2011

Similar documents
HW/SW Co-design Lecture 2: Lab Environment Setup

Lab-3: Simple Accelerator Design. National Chiao Tung University Chun-Jen Tsai 4/11/2011

HW/SW Co-design Lecture 4: Lab 2 Passive HW Accelerator Design

ESA Contract 18533/04/NL/JD

LEON4: Fourth Generation of the LEON Processor

Lab-1: Profiling/Optimizing Video Decoder Using ADS. National Chiao Tung University Chun-Jen Tsai 3/3/2011

Introduction to LEON3, GRLIB

AN OPEN-SOURCE VHDL IP LIBRARY WITH PLUG&PLAY CONFIGURATION

Next Generation Multi-Purpose Microprocessor

Processor and Peripheral IP Cores for Microcontrollers in Embedded Space Applications

COMPARISON BETWEEN GR740, LEON4-N2X AND NGMP

Altera Innovate Nordic 2007: Leon3 MP on Altera FPGA, Final Report. Altera Innovate Nordic 2007 Final Project Report 8/28/2007

Booting a LEON system over SpaceWire RMAP. Application note Doc. No GRLIB-AN-0002 Issue 2.1

VxWorks 5.4 LEON BSP 1.0.1

ASIC Logic. Speaker: Juin-Nan Liu. Adopted from National Chiao-Tung University IP Core Design

Xilinx ISE8.1 and Spartan-3 Tutorial EE3810

Designing Embedded Processors in FPGAs

Tutorial: ISE 12.2 and the Spartan3e Board v August 2010

Graduate Institute of Electronics Engineering, NTU. ASIC Logic. Speaker: Lung-Hao Chang 張龍豪 Advisor: Prof. Andy Wu 吳安宇教授.

Lab 6: Intro to FPGAs

Xilinx ISE8.1 and Spartan-3 Tutorial (Prepared by Kahraman Akdemir based on Professor Duckworth's Tutorials updated September 2006)

Rapid Prototyping.

Real Time Trace Solution for LEON/GRLIB System-on-Chip. Master of Science Thesis in Embedded Electronics System Design ALEXANDER KARLSSON

A Boot-Strap Loader and Monitor for SPARC LEON2/3/FT

Designing with ALTERA SoC Hardware

Core Peripherals. Speaker: Tzu-Wei Tseng. Adopted from National Chiao-Tung University IP Core Design. SOC Consortium Course Material

GRPCI Master/Target Abort Erratum. Technical note Doc. No GRLIB-TN-0003 Issue 1.1

Creating the AVS6LX9MBHP211 MicroBlaze Hardware Platform for the Spartan-6 LX9 MicroBoard Version

System-On-Chip Design with the Leon CPU The SOCKS Hardware/Software Environment

ECE 491 Laboratory 1 Introducing FPGA Design with Verilog September 6, 2004

Gaisler Research LEON SPARC Processor The past, present and future Jiri Gaisler Gaisler Research

User Manual. LPC-StickView V3.0. for LPC-Stick (LPC2468) LPC2478-Stick LPC3250-Stick. Contents

RiceNIC. Prototyping Network Interfaces. Jeffrey Shafer Scott Rixner

Copyright 2014 Xilinx

Developing a LEON3 template design for the Altera Cyclone-II DE2 board Master of Science Thesis in Integrated Electronic System Design

Fujitsu System Applications Support. Fujitsu Microelectronics America, Inc. 02/02

Contents ASIC Logic Overview Background Information Rapid Prototyping with Logic Module Basic Platforms: AHB and

Technical Note on NGMP Verification. Next Generation Multipurpose Microprocessor. Contract: 22279/09/NL/JK

GRLIDE. LEON IDE plugin for Eclipse User's Manual. The most important thing we build is trust GR-LIDE-UM. August 2016, Version 1.

TLL5000 Electronic System Design Base Module

Configuring the Xilinx Spartan-6 LX9 MicroBoard

User Manual. LPC-StickView V1.1. for LPC-Stick. Contents

University of Massachusetts Amherst Computer Systems Lab 1 (ECE 354) LAB 1 Reference Manual

Introduction to the OpenSSD Jasmine Platform

RCC User s Manual Version 1.1.1

System-on-Chip Architectures and Modelling

Migrating from the UT699 to the UT699E

ECEN 449: Microprocessor System Design Department of Electrical and Computer Engineering Texas A&M University

UT700 LEAP VxWorks 6.7 BSP Manual

CCSDS Unsegmented Code Transfer Protocol (CUCTP)

ML410 VxWorks BSP and System Image Creation for the BSB Design Using EDK 8.2i SP1. April

TLL5000 Electronic System Design Base Module. Getting Started Guide, Ver 3.4

SpartanMC. SpartanMC. Quick Guide

Programming Multicore Systems

ChipScope Inserter flow. To see the Chipscope added from XPS flow, please skip to page 21. For ChipScope within Planahead, please skip to page 23.

QSPI Flash Memory Bootloading In Standard SPI Mode with KC705 Platform

GRMON2 User's Manual

Fujitsu SOC Fujitsu Microelectronics America, Inc.

V8uC: Sparc V8 micro-controller derived from LEON2-FT

Intelop. *As new IP blocks become available, please contact the factory for the latest updated info.

Embedded Systems Laboratory Manual ARM 9 TDMI

Lecture 5: Computing Platforms. Asbjørn Djupdal ARM Norway, IDI NTNU 2013 TDT

5. On-chip Bus

ML410 VxWorks Workbench BSP and System Image Creation for the BSB Design Using EDK 8.2i SP2. April

An H.264/AVC Main Profile Video Decoder Accelerator in a Multimedia SOC Platform

Manual: SnapGear Linux for LEON

ML410 VxWorks BSP and System Image Creation for the BSB DDR2 Design Using EDK 8.2i SP1. April

Embedded Fingerprint Verification and Matching System

Spartan-6 LX9 MicroBoard Embedded Tutorial. Lab 6 Creating a MicroBlaze SPI Flash Bootloader

A SOC DESIGN METHODOLOGY FOR LEON2 ON FPGA

BCC - Bare-C Cross-Compiler User s Manual

Verification of Multiprocessor system using Hardware/Software Co-simulation

Banks, Jasmine Elizabeth (2011) The Spartan 3E Tutorial 1 : Introduction to FPGA Programming, Version 1.0. [Tutorial Programme]

RedBoot for Freescale i.mx27

Nexys 2/3 board tutorial (Decoder, ISE 13.2) Jim Duckworth, August 2011, WPI. (updated March 2012 to include Nexys2 board)

Multi-DSP/Micro-Processor Architecture (MDPA)

EE 367 Logic Design Lab #1 Introduction to Xilinx ISE and the ML40X Eval Board Date: 1/21/09 Due: 1/28/09

SP601 Standalone Applications

Project 1a: Hello World!

GigaX API for Zynq SoC

Practical Hardware Debugging: Quick Notes On How to Simulate Altera s Nios II Multiprocessor Systems Using Mentor Graphics ModelSim

Firmware for Embedded Computing. National Chiao Tung University Chun-Jen Tsai 3/10/2011

GRMON3 User's Manual

Using Serial Flash on the Xilinx Spartan-3E Starter Board. Overview. Objectives. Version 8.1 February 23, 2006 Bryan H. Fletcher

Designing with ALTERA SoC

Compute Node Design for DAQ and Trigger Subsystem in Giessen. Justus Liebig University in Giessen

SP605 Standalone Applications

EMUL-PPC-PC. Getting Started Guide. Version 1.0

Rad-Hard Microcontroller For Space Applications

ECEN 449: Microprocessor System Design Department of Electrical and Computer Engineering Texas A&M University

ARM Processors for Embedded Applications

DTK2410 Specification

Development Environment of Embedded System

Interfacing the RAD1419 A/D converter with the GRLIB IP core library

Avnet Zynq Mini Module Plus Embedded Design

Support for RISC-V. Lauterbach GmbH. Bob Kupyn Lauterbach Markus Goehrle - Lauterbach GmbH

Codewarrior for ColdFire (Eclipse) 10.0 Setup

IP Core Design. Lab TA:Tzung-Shian Yang

Overview The Microcontroller The Flex Board Expansion boards Multibus board Demo board How to: Compile demo Flash & Run Demos

LINUXBUILD User's Manual

Transcription:

Lab-2: Profiling m4v_dec on GR-XC3S-1500 National Chiao Tung University Chun-Jen Tsai 3/28/2011

Profiling with Real-time Timer Goal: Profiling m4v_vdec on GR-XC3S-1500 using a real-time timer Tasks: Install a real-time timer ISR in the video decoder Use tftp protocol to read/write video data Use the timer to measure the performance of your optimized video decoder from lab1 Give a demo to TAs and upload a report by the end of 4/8 2/27

GR-XC3S-1500 Development Board The system core IC is a Xilinx Spartan III FPGA Features FPGA: XC3S-1500-FG456-4C FPGA On-board memory 8 MB Flash 64 MB SDRAM On-board I/O interfaces 10/100 Ethernet PHY 24-bit VGA Video DAC USB 2.0 PHY Two UART Transceivers JTAG port 3/27

GRLIB IP Library The board supports a reusable IP library, GRLIB Designed for system-on-chip (SoC) development Based on AMBA bus protocol Standard IPs in GRLIB: RS-232 LEON3 processor core LEON3 DSU3 Serial JTAG Processor Debug Link Debug Link BUS controllers Memory controller AHB Memory mcomp idct Controller Controller Debug support unit 8-bit/32-bit memory bus Interrupt controller PROM SDRAM Timer Memory extension I/O controllers: UART, Ethernet, VGA, USB, JTAG PHY Ethernet MAC AHB/APB Bridge VGA Video DAC Spartan3-1500 FPGA AMBA AHB AMBA APB Timers To be added in future labs IRQ Controller 4/27

Debug Support Unit (DSU) GRLIB has a DSU IP, which is an AHB slave Accessible by any AHB master (e.g. debug interfaces) A debug Interface can generate read or write transfers to any address on the AHB bus through DSU DSU can also be used to access Processor registers Instruction trace buffer AHB trace buffer 5/27

System Development Flow Hardware Design Entry (Configure template design) Software Design Entry (Write C code & Makefile) Building ecos Library ecos is an open source library that provides basic OS kernel functions to your applications; Build FPGA bit file Load bit file to FPGA Build the application and link with ecos library Debug the software with gdb/grmon In our case, we need ecos to provide us tftp and timer support At this moment, the HW IDCT logic is substituted with a C model function Integrate with HW logic (HW/SW co-design flow) Perfect? No Modify the code Illustration by Michael Wu, 2008 6/27

Setup Cygwin Environment Installing the latest Cygwin under Win32 by running http://www.cygwin.com/setup.exe In addition to the default Cygwin packages, make sure the following packages are installed: automake, gcc, gdb, make, sharutils, tcltk, wget If you want to use GHDL under Cygwin, you may also need the mpfr package 7/27

BCC Cross-Compiler Installation Download the BCC package, sparc-elf-3.4.4-1.0.29d-cygwin.tar.bz2, from: ftp://graisler.com/bcc/bcc/bin/windows/old Under a Cygwin console, type $ mkdir /opt $ tar xjf sparc-elf-3.4.4-1.0.29d-cygwin.tar.bz2 C /opt Modify the PATH variable by adding the following line to.bashrc in your home directory: export PATH=/opt/sparc-elf-3.4.4/bin:$PATH 8/27

ecos Installation An ecos port (ecos-rep-1.0.8.tar.gz) to GRLIB can be downloaded from ftp://gaisler.com/ecos/ecos/src Under a Cygwin console, type $ mkdir /home/soc/std_id $ tar xzf ecos-rep-1.0.8.tar.gz C /home/soc/std_id Each student create your own subdirectory (use student ID) under the system account 9/27

Installing ecos Configuration Tool Download and install the ecos configuration tool, configtool-2.11-setup.exe, for Windows from: ftp://gaisler.com/ecos/ecos/bin/configtool/windows/ Run ecos configuration tool Setup paths Tools Paths Build Tools = c:\cygwin\opt\sparc-elf-3.4.4\bin Tools Paths User Tools = c:\cygwin\bin Build Repository = c:\cygwin\home\soc\std_id\ecos-rep-1.0.8 10/27

Building ecos Library We have to build an ecos with timer and tftp support Inside ecos configuration tool Select: Build Templates Hardware: LEON3 processor with GRETH ethernet Packages: net Search for CYGHWR_NET_DRIVER_ETH0_ADDRS Edit addresses according to environment setup Select: File Save Save the configuration as student_id/ecos_leon/leon.ecc Select: Build Library See soc10_leon_tutorial.doc for detail settings 11/27

Building the Application Unzip the lab package under your work directory +-- ecos_leon/ +-- lab2_pkg/ +-- bitstream/ +-- leon_bit_files/ +-- m4v_dec_ecos/ +-- src/ +-- Makefile +-- readme.txt +-- timer_example.c Simply type make in m4v_dec_ecos/, and you will have an m4v_dec.elf executable 12/27

System Setup To Internet Host PC 192.168.1.x USB JTAG cable 192.168.0.2 (board IP) 192.168.0.1 (tftp server IP) 13/27

GRMON Debug Monitor GRMON is a general debug monitor for the LEON processor, it supports Downloading and execution of LEON applications Breakpoint and watchpoint management USB, JTAG, RS232, PCI, and Ethernet debug links Remote connection to GNU debugger (gdb) Access to all system registers and memory Built-in disassembler and trace buffer management 14/27

Installation of GRMON Download the GRMON package, grmon-eval- 1.1.39.tar.gz, from: ftp://gaisler.com/grmon/grmon Under a Cygwin console, type $ tar xzf grmon-eval-1.1.39.tar.gz C /opt Modify the PATH variable by adding the following line to.bashrc in your home directory: export PATH=/opt/grmon-eval/cygein/bin:$PATH 15/27

Uploading FPGA Bit File (1/2) Although we don t have to synthesize HW logic in this lab, you still need the HW EDA tool in order to configure the FPGA Download and install ISE WebPACK 10.1 from http://www.xilinx.com/tools/webpack.htm The latest version is 11.1, but 10.1 is used in this course 16/27

Uploading FPGA Bit File (2/2) Run impact in the ISE suite Select: Create a new project (.ipf) OK Select: Configure devices using Boundary-Scan (JTAG) You will see a scan chain with 3 devices: xcf04s, xcf01s, and xc3s1500 Select Bypass for the first 2 devices For xc3s1500, open the bit file that comes with the lab package (under leon_bit_files/) Right click on xc3s1500 and select Program Select Verify and click OK Wait for a blue Program Succeeded message to appear 17/27

Control the Board from GRMON Under Cygwin prompt, type $ grmon-eval eth u This tells GRMON to use the Ethernet debug interface The u flag let application pipe the output to grmon console GRMON LEON debug monitor v1.1.39 evaluation version Copyright (C) 2004-2008 Aeroflex Gaisler - all rights reserved. For latest updates, go to http://www.gaisler.com/ Comments or bug-reports to support@gaisler.com This evaluation version will expire on 2/11/2010 ethernet startup. GRLIB build version: 4075 initialising... detected frequency: 9 MHz Component Vendor LEON3 SPARC V8 Processor Gaisler Research... Modular Timer Unit Gaisler Research General purpose I/O port Gaisler Research Use command 'info sys' to print a detailed report of attached cores grlib> 18/27

Run a tftp Server on Host PC We must run a tftp server on the host PC in order to send/receive data to/from the board It is recommended that you use the tftp32 server (http://tftpd32.jounin.net/): 19/27

Decoder Execution Under Cygwin console, type the commands: ~/$ cd <your working directory> ~/$ grmon-eval eth u. some GRMON startup messages.. grlib > load m4v_dec.elf grlib > run 20/27

Sample Execution Result Reading bitstream using tftp... bitstream_size = 83860, err = 0 Initializing decoder... Decoding frames: 0...16...32...48...64...80...96...112...128...144... IDCT Computation: 9118.00 ms (46.61% of total decoding time) Inverse Quantization: 1009.00 ms ( 5.16% of total decoding time) Motion Compensation: 4152.00 ms (21.22% of total decoding time) Boundary Extension: 559.00 ms ( 2.86% of total decoding time) Boundary Removal: 646.00 ms ( 3.30% of total decoding time) Block Data Transfer: 1724.00 ms ( 8.81% of total decoding time) DC/AC Prediction: 160.00 ms ( 0.82% of total decoding time) VLC Decoding: 349.00 ms ( 1.78% of total decoding time) Total decoding time: 19564.00 ms, we measured 17717.00 ms (90.56%) Writing decoded YCbCr frames using tftp... Finished decoding. 21/27

General Purpose Timer in GRLIB The Leon platform contains GPTIMER IP which you can use for profiling purposes By default, there will be two hardware timers The first one is used by ecos The second one is free for you to use To use the timer, you need its interrupt ID, grlib> info sys 03.01:011 Gaisler Research Modular Timer Unit (ver 0x0) irq 8 apb: 80000300-80000400 8-bit scaler, 2 * 32-bit timers, divisor 10 See GRLIB IP Core User s Manual, Chapter 35 GPTIMER - General Purpose Timer Unit 22/27

Installing of Timer ISR in ecos (1/3) Timer register definitions #include <cyg/kernel/kapi.h> #include <cyg/hal/hal_io.h> #define TICKS_PER_MS 40000 /* 40 MHz per second/1000 */ /* The IRQ value '8' displayed in 'info sys' is for */ /* the first timer, which is used by ecos. We will */ /* use the 2nd timer. Therefore, the IRQ value is 9. */ #define CYGNUM_HAL_INTERRUPT_TIMER2 (8+1) /* The base address of timer registers is again */ /* obtained by 'info sys'. */ #define TIMER_BASE 0x80000300 #define SCALER_RELOAD_VALUE TIMER_BASE + 0x04 #define TIMER2_RELOAD_VALUE TIMER_BASE + 0x24 #define TIMER2_CTRL_REGISTER TIMER_BASE + 0x28 23/27

Installing of Timer ISR in ecos (2/3) Timer ISR installation: cyg_interrupt_create(cygnum_hal_interrupt_timer2, 0, data, my_isr, NULL, &handle, &isr_struct); cyg_interrupt_attach(handle); cyg_interrupt_unmask(cygnum_hal_interrupt_timer2); /* initialize the timer to 1 ms per tick. For a 40MHz */ /* clock, 40000/prescaler_value equals 1 ms. */ HAL_READ_UINT32(SCALER_RELOAD_VALUE, prescaler_value); HAL_WRITE_UINT32(TIMER2_RELOAD_VALUE, TICKS_PER_MS/prescaler_value); /* set the control register */ HAL_READ_UINT32(TIMER2_CTRL_REGISTER, ctrl_reg); ctrl_reg = 0xA; /* enable the interrupt */ ctrl_reg = 0x1; /* start ticking */ HAL_WRITE_UINT32(TIMER2_CTRL_REGISTER, ctrl_reg); 24/27

Installing of Timer ISR in ecos (3/3) Timer ISR cyg_uint32 my_isr(cyg_vector_t vector, cyg_addrword_t data) { long *counter = (long *) data; (*counter)++; cyg_interrupt_acknowledge(vector); return CYG_ISR_HANDLED; } Uninstallation cyg_interrupt_mask(cygnum_hal_interrupt_timer2); cyg_interrupt_detach(handle); cyg_interrupt_delete(handle); 25/27

Using tftp Protocol in Your Code #include <network.h> #include <tftp_support.h> int main(int argc, char**argv) { struct sockaddr_in host; int err, size, max_size, yuv_size; char *ifname, *ofname, *bit_buf, yuv_buf;... /* initialize network interface */ init_all_network_interfaces(); memset((char *) &host, 0, sizeof(host)); host.sin_len = sizeof(host); host.sin_family = AF_INET; host.sin_addr = eth0_bootp_data.bp_siaddr; host.sin_port = 0; /* retrieve video bitstream */ size = tftp_get(ifname, &host, bit_buf, max_size, TFTP_OCTET, &err);... /* output decoded YCbCr frames */ tftp_put(ofname, &host, yuv_buf, yuv_size, TFTP_OCTET, &err); } 26/27

Final Remark When it comes to system design, the completeness of your solution depends on the effort you put into it. and it shows! 27/27