RHEALSTONE BENCHMARKING OF FREERTOS AND THE XILINX ZYNQ EXTENSIBLE PROCESSING PLATFORM

Size: px
Start display at page:

Download "RHEALSTONE BENCHMARKING OF FREERTOS AND THE XILINX ZYNQ EXTENSIBLE PROCESSING PLATFORM"

Transcription

1 RHEALSTONE BENCHMARKING OF FREERTOS AND THE XILINX ZYNQ EXTENSIBLE PROCESSING PLATFORM A Thesis Submitted to the Temple University Graduate Board In Partial Fulfillment of the Requirement for the Degree MASTER OF SCIENCE in ELECTRICAL ENGINEERING by Timothy J. Boger May, 2013 Thesis Approval(s): Dennis Silage, PhD, Thesis Adviser, Electrical and Computer Engineering John Helferty, PhD, Electrical and Computer Engineering Eugene Kwatny, PhD, Computer and Information Sciences

2 ABSTRACT Embedded system designers require deterministic, real-time operating system (RTOS) support for the commonly available processing hardware. The Xilinx Zynq Extensible Processing Platform (EPP) offers software, hardware, and input/output (I/O) programmability on a single chip. The Xilinx Zynq EPP features a Dual ARM Cortex-A9 MPCore, Advanced Microcontroller Bus Architecture (AMBA) Advanced extensible Interface 4 (AXI4) interconnect, and Xilinx Kintex-7 series Programmable Logic (PL) which provide the requisite capabilities for the increasing demands of embedded processing applications. The AMBA AXI4 interconnect provides high speed point to point interconnections between the ARM processor cores and the Field Programmable Gate Array (FPGA) structure allowing for rapid data transmission to optimize system performance. The incorporation of an RTOS ensures predictable execution times of applications. Benchmarks, such as the Rhealstone, were developed to provide designers with a method of evaluating and comparing these multitasking RTOSs running on various hardware platforms. This thesis research performs Rhealstone benchmarking and evaluates the AMBA AXI4 interconnect performance while executing FreeRTOS on the ARM core of the Zynq EPP device. i

3 TABLE OF CONTENTS ABSTRACT... i LIST OF FIGURES... v LIST OF TABLES... vi NOMENCLATURE... vii CHAPTER 1 INTRODUCTION Motivation Research Objectives Organization of the Thesis... 7 CHAPTER 2 BACKGROUND ARM Architecture Introduction ARM Cortex A AMBA Bus Zynq Extensible Processing Platform Introduction Xilinx Zynq-7000 Evaluation Kit ZedBoard Real-Time Operating Systems FreeRTOS Rhealstone: Real-Time Benchmark Task Switching Time Preemption Time Interrupt Latency Semaphore Shuffling Time Deadlock Breaking Time Intertask Messaging Latency Calculating the Rhealstone Performance Number ii

4 CHAPTER 3 DESIGN TOOLS Introduction Xilinx ISE PlanAhead Embedded Design Kit Xilinx Platform Studio Base System Builder Wizard AXI Interconnection Hardware Platform Configuration Software Development Kit Board Support Package Xilinx C Project First Stage Boot Loader Program FPGA XMD Console ChipScope Pro Tera Term CHAPTER 4 ZYNQ EPP OPERATING SYSTEMS Introduction Bare-Metal Linux FreeRTOS CHAPTER 5 UTILIZING THE ZYNQ EPP HARDWARE Introduction Booting Pmod Connection CHAPTER 6 RESULTS Introduction iii

5 6.2 Bare-Metal - Single Core FreeRTOS - Multitasking Single Core Benchmarking FreeRTOS Task-Switching Time FreeRTOS Preemption Time FreeRTOS Semaphore Shuffle Time FreeRTOS Deadlock Breaking Time FreeRTOS Intertask Messaging Latency FreeRTOS Rhealstone Benchmark CHAPTER 7 CONCLUSION CHAPTER 8 FUTURE WORK REFERENCES APPENDICES A: TASK-SWITCHING CODE B: PREEMPTION TIME CODE C: INTERTASK MESSAGE LATENCY CODE D: DEADLOCK-BREAK TIME CODE E: SEMAPHORE SHUFFLE TIME CODE iv

6 LIST OF FIGURES Figure 1: Kernel Layout: (a) Monolithic (b) Modular (c) Extensible (d) Layered... 2 Figure 2: Typical Symmetric Multiprocessing System Layout... 3 Figure 3: Typical Asymmetric Multiprocessing System Layout... 3 Figure 4: Message Passing using the MULTIBUS II... 6 Figure 5: Cortex-A9 Dual MP Core Architecture Figure 6: Xilinx Zynq-7000 Extensible Processing Platform Architecture Figure 7: Zynq 7000 Evaluation Kit Figure 8: ZedBoard Figure 9: Rhealstone Benchmarking: Task-Switching Time Figure 10: Rhealstone Benchmarking: Preemption Time Figure 11: Rhealstone Benchmarking: Interrupt Latency Figure 12: Rhealstone Benchmarking: Semaphore-Shuffle Time Figure 13: Rhealstone Benchmarking: Deadlock-Break Time Figure 14: Rhealstone Benchmarking: Intertask Message Latency Figure 15: Design Tools Block Diagram Xilinx ISE Figure 16: Design Tools Block Diagram PlanAhead Figure 17: Design Tools Block Diagram XPS Figure 18: Processing Platform Peripheral Configuration XPS Figure 19 Design Tools Block Diagram SDK v

7 LIST OF TABLES Table 1: FreeRTOS Rhealstone Benchmarks vi

8 NOMENCLATURE ACP Accelerator Coherence Port MMU Memory Management Unit AHB AMBA High-performance Bus MPC Message Passing Coprocessor AMBA Advanced Microcontroller Bus Architecture OCM On-Chip Memory AMP Asymmetric Multi-Processing OS Operating System APB Advanced Peripheral Bus PL Programmable Logic API Application Programming Interface PLB Processor Local Bus APSL Advanced Processor Systems Laboratory Pmod Peripheral Module ARM Advanced RISC Machine PPC Power PC ASB Advanced System Bus PS Programming System ATB Advanced Trace Bus QEMU Quick EMUlator AXI Advanced extensible Interface QSPI Queued Serial Peripheral Interface BIF Boot Image File RTL Register Transfer Level BSB Base System Builder RISC Reduce Instruction Set Computer BSP Board Support Package RTOS Real-Time Operating System CPU Central Processing Unit SCDL System Chip Design Laboratory DMAC Direct Memory Access Controller SDK Software Development Kit EPP Extensible Processing Platform SIMD Single Instruction-Multiple Data EDK Embedded Development Kit SMP Symmetric Multiprocessing ELF Executable and Linkable Format SCU Snoop Control Unit EMIO Extended Multiplexed I/O SoC System-On-a-Chip FPGA Field-Programmable Gate Array TCL Tool Command Language FSBL First Stage Boot Loader TDP Targeted Design Platforms GDB GNU Debugger UCF User Constraints File GPIO General Purpose I/O UART Universal Asynchronous Rx/Tx HPS Hard Processor System XMD Xilinx Microprocessor Debugger HDL Hardware Descriptive Language XML extensible Markup Language I/O Input/Output XMP Xilinx Microprocessor Project IP Intellectual Property XPS Xilinx Platform Studio ISA Instruction Set Architecture ZED Zynq Evaluation & Development ISS Instruction Set Simulator ISE Integrated Software Environment JTAG Joint Test Action Group MHS Microprocessor Hardware Specification MIO Multiplexed I/O vii

9 CHAPTER 1 INTRODUCTION 1.1 Motivation The goal of this thesis research is to provide performance benchmarks for the Xilinx Zynq-7000 Extensible Processing Platform (EPP) and to provide a premise for future embedded design. The Xilinx Zynq EPP is capable of running Asymmetric Multiprocessing (AMP) of a Real-Time Operating System (RTOS) called FreeRTOS. [1] The Dual ARM Cortex A-9 MPCore processor is provided with various features including a primary Advanced Microcontroller Bus Architecture (AMBA) Advanced extensible Interface 4 (AXI4) 64-bit interconnect that can be used with various soft-core and hard-core peripherals and the 28nm Programmable Logic (PL) of the Xilinx Kintex-7 series Field Programmable Gate Array (FPGA). Embedded system designers require these benchmarks in order to evaluate and design an efficient Processing System (PS). Multicore processor architectures have the potential to provide increased performance and power efficiency, but at the cost of programming complexity. [2] The complexity involved has been the hindrance in the widespread adoption of multicore architecture. Multicore systems can be implemented in Symmetric Multiprocessing (SMP) or AMP modes. These modes refer to how the Operating System (OS) kernel will run on a system that has more than one Central Processing Unit (CPU) or core. A kernel is the underlying main component of the majority of computer OS. It bridges the gap between the hardware and the application executing on the PS. The kernel's responsibilities include managing the communication between hardware and software components to allocate the system's resources accordingly through system calls and inter-process communication. The 1

10 hardware components it manages include the processor and I/O devices. [3] There are various types of kernel structures including monolithic, modular, extensible and layered. Figure 1 depicts these structures. Figure 1: Kernel Layout: (a) Monolithic (b) Modular (c) Extensible (d) Layered [4] Monolithic kernels are the most primitive with the OS code executing in the same address space. This direct intercommunication is highly efficient and increases performance, but makes it difficult to manage and maintain. Modular kernels allow for better overall functionality with ease of management due to its modular nature, but lacks performance. Layered kernels are used to divide components into manageable layers, but have degraded performance when communication between multiple levels is required. Extensible kernels, also known as microkernels, execute services in user space as servers to improve modularity and maintainability while also having a lower level skeletal nucleus that controls basic process synchronization. [4] With SMP, the kernel itself can run on any processor and can run simultaneously on multiple processors. SMP handles programs using multiple processors sharing a common OS. There is a single copy of the operating system that supervises all of the processors and shares everything symmetrically among them. The processors share memory and a bus as shown in Figure 2. [5] 2

11 Figure 2: Typical Symmetric Multiprocessing System Layout [6] AMP is the employ of more than one CPU with a specified role with each kernel running exclusively. This means each processor shares the same physical memory, but have independently running OS on each core. With this method, processes can run on either processor. Figure 3 shows a typical AMP system layout. [7] Figure 3: Typical Asymmetric Multiprocessing System Layout [8] There are heterogeneous and homogeneous processor systems. A heterogeneous system is the use of different processor cores with one for general-purpose work and the other for such things as DSP. The advantage of this approach is the ability to match processor cores with features that are appropriate for on-chip tasks applications. Tailoring a processor core to a specific task forces the processor to limit its number of abilities to no more than are required by removing unneeded features from each processor. The heterogeneous design needs a different software-development tool set, which would include a compiler, assembler, debugger, instruction-set simulator, and OS for each of the different processor cores used in the system design. [9] Heterogeneous processors 3

12 generally do not use SMP due to the processors not being capable of executing identical instructions from the same copy in memory. Homogeneous processors, such as Xilinx's Zynq EPP with two ARM cores, run the same code from a single copy in memory using SMP. These processors can also be used with AMP creating a more independent processor. In this case each processor can run different code from its personal local memory. AMP with the Xilinx Zynq EPP, for example, could utilize a RTOS on one processor while running Linux on the other or a RTOS could be resident on both cores. [1] Multicore processing can be somewhat complex and intimidating, so it is important to have an OS developed that offers ease of use independent of the system configuration. OSs can provide autonomy to process load balancing and handling which alleviates concern about how the processors are explicitly handling the workload. Some OS are designed to automatically run processes on any available processor to provide transparent mapping of multithreading on a multicore architecture. [2] Multithreading is the ability of efficiently executing multiple threads running on a single core by utilizing thread-level and instruction-level parallelism. Multiprocessing involve the use of multiple complete CPUs in a single system. These complimentary systems can sometimes be combined in systems with multiple multithreading cores. [10] FPGAs are used to develop soft-core processors that are used for various embedded applications. The use of FPGAs as soft-core processors such as the 32-bit Xilinx MicroBlaze have some utility, but have various complications including synthesizing the various interconnects on the programmable fabric. The challenge of using FPGAs with embedded CPUs lies in the communication between the processor and PL. Using a processing platform, a processor centric design, compared to an FPGA has various advantages. In the most recent version of the FPGA architecture by Xilinx, the Vertex FX FPGA series, Power PC (PPC) cores are used as hard 4

13 Intellectual Property (IP). [11] IP is an algorithm or function that is provided to designers through licensing from software developers. These predefined functions are intended to save time by prebuilt solution for such things as processors and bus interfaces. [15] A FPGA centric design means that the FPGA is the master and PPC is the slave. Development requires the configuration of the FPGA in order to use the CPU cores and cannot boot independently of the FPGA fabric. On the other hand, the Zynq EPP includes a Dual ARM Cortex-A9 as hard IP. This means the ARM PS is the master and the FPGA is the slave. Additionally, the CPU can boot without powering or configuring the FPGA. [11] The Zynq EPP utilizes the AMBA AXI4 interconnects in its System-on-a-Chip (SoC) design. An embedded designer needs to understand how to utilize the hardware they are given. The bus interconnect, in any system, is the communication link between hardware. To understand how to utilize the AMBA AXI4 interconnect, we can look at previous bus architectures and methods used to handle multiprocessing systems. The Multibus is an asynchronous bus standard developed by Intel in The Multibus was designed to be robust and became a widely used industry standard in the 1980s with systems still currently operational. [12] The Intel MULTIBUS II was designed to address the multiprocessing problem caused by the increased demands for processing power. The MULTIBUSS II was designed to improve system performance and reduce the complexity of multiprocessing systems. It introduced the mechanism of message passing to improve the performance of a system and in doing so simplified multiprocessing system implementations. The mechanism that supports this message passing is the Message Passing Coprocessor (MPC). There are various ways to implement a multiprocessing system. Traditionally, processors can share data using the bus and a common memory area. This memory is either available globally or dual-ported into the local memory of one processor. [12] Another method for data sharing is to 5

14 have a host CPU and a disk controlling with communication done through message passing. The use of the MPC in this manner is demonstrated in Figure 4. Figure 4: Message Passing using the MULTIBUS II [13] Depending on how this message passing is implemented, the bus can become a bottleneck. However, efficiently and effectively getting data quickly into the local memory of the second CPU can achieve performance improvements. The MULTIBUS II supported all of these methods for communication. The design objectives behind using the AMBA for SoC designs is to improve processor independence by encouraging modular system design, the development of reusable libraries for peripherals and system IP and on-chip communication that minimizes silicon infrastructure while maintaining low power and high performance. [14] These IP blocks address the various needs of embedded designers with pre-designed cores that can be implemented on Xilinx FPGA devices. [15] For example, there are IP blocks designed for Xilinx Targeted Design Platforms (TDP) provided by Xilinx and its Alliance Program Members. TDPs are development kits released with boards, Integrated Software Environments (ISE) Design Suite tools, IP cores, reference designs, and designer support for initial application development. [16] 6

15 The System Chip Design Laboratory (SCDL) is a research facility of the Department of Electrical and Computer Engineering at Temple University s College of Engineering. SCDL was started in 1999 and is a descendant of the Advanced Processor Systems Laboratory (APSL) established in The laboratory worked with the Multibus II multiprocessor computer system which utilized the Intel irmx III real-time multitasking operating system. The SCDL pursues innovative investigations in the SoC design methodology utilizing hard processor IP cores, configurable SoC and soft core architectures on FPGAs, on-chip busing arbitration architectures, and heterogeneous multiple processor RTOS. [38] 1.2 Research Objectives The objective of this thesis is to develop embedded operating system support for the Xilinx Zynq EPP with multitasking FreeRTOS. Doing so will develop an understanding of effectively implementing FreeRTOS on the platform and to produce benchmark results that can be used evaluate the AMBA AXI4 interconnect performance. The Rhealstone real-time benchmark will be used to perform this benchmarking. This will provide embedded designers with a platform for further implementation on the Zynq EPP. This work will be added to the Xilinx and Zynq Evaluation & Development (Zed) board websites as a resource to inform and strengthen the Zynq community. 1.3 Organization of the Thesis The thesis is organized as follows. A background is given in order to lay the foundation for this work. The ARM Architecture is discussed, followed by an outline of the ARM Cortex A9 architecture. AMBA is described to develop an understanding of its associations with the Dual ARM processor. The Zynq EPP architecture is discussed along with Zynq EPP platform which is utilized on both the Xilinx Zynq Evaluation Board and the Zed Board. A brief discussion about RTOSs followed by a discussion of implementing FreeRTOS is provided. The Rhealstone 7

16 Benchmark's use and implementation is reviewed and its application to the work in this thesis is discussed. The Design Tools used to develop on the Zynq EPP is review and key software and hardware design elements are discussed. The thesis results are discussed and concluded. Finally, the framework for future work with the Zynq EPP is discussed. 8

17 CHAPTER 2 BACKGROUND 2.1 ARM Architecture Introduction ARM is known for its high performance for low price and low power consumption. The reduced instruction set computer (RISC) instruction set architecture (ISA) of the ARM is not designed to produce the most powerful processor, but to create a processor capable of powering the latest technologies at a price that could be used in low-cost processing systems. The advantages of RISC stemmed from the concept that performance could be improved through smaller chip sizes with shorter signal paths implying shorter instruction cycles which results in a faster processor. A smaller die size is a result of the RISC chip being simpler and therefore requiring fewer transistors to implement the smaller instruction set. RISC was intended to shorten the design process through smaller chips with fewer instructions making the design less complicated and ultimately taking less time to complete and debug. [17] The history of the ARM resides in the United Kingdom with Acorn Computers Ltd. ARM was established in Cambridge, originally know as Acorn RISC Machine, and developed its first ARM chip between The company became popular when Acorn's British Broadcasting Corporation (BBC) Microcomputer which was widely used in UK classrooms during the 1980's. In 1985, the ARM1 was released and focused on improved instruction sets in order to improve and maximize performance of the systems using it. [17] The Archimedes home computer launched in 1987 was the first commercial product using the ARM. It utilized the ARM2 8 MHz processor and was the first RISC processor available in a low-cost PC. The intent of these first two processors was to offer quality performance in a low- 9

18 cost system. Since Intel and Motorola-based computers competed on the market with their highend personal and workstation computer systems the ARM based systems were overshadowed. [17] The release of the ARM3 in 1989 was designed to improve the performance of the ARM by including a 4 Kbyte on-chip data and instruction cache. This 25 MHz processor could run at a higher clock rate due to the denser fabrication of the chip compared to its predecessors and inherently improving the overall performance while using the same support chips and low cost memory as the ARM2. In 1990 the ARM2aS, a static version of the processor, added low power consumption to the list of ARM feature which opened ARM to the personal hand-held and communications devices market. Though this specific processor only reached prototyping stage of mobile devices, it sparked greater interested in RISC and the ARM family. [17] With financial growth of Acorn and the increasing demand for RISC processors, an agreement was made between Acorn, VLSI Technology and Apple. This resulted in the foundation of ARM Ltd and the name change to the Advanced RISC Machine (ARM). ARM Ltd licenses its designs to chip foundries for royalties rather than establishing its own fabrication facilities. VLSI Technology, who had built all previous ARM chips, was the first licensee. ARM Ltd's first development after the ARM3, was the ARM6 which included full 32-bit addressing. This was designed to meet the requests of its new partner, Apple. [17] ARM, since then, has continued its growth in various avenues including its ARM Cortex A9 being used on the Zynq EPP ARM Cortex A9 ARM Cortex -A9 processor is available as either a single core or configurable multicore processor with either synthesizable or hard-macro implementations. The ARM Cortex-A9 processor is available as a single core or MPCore model with up to four cores. MPCore is an integrated SMP or AMP with multiple processors in a single device. The Cortex-A9 processor is 10

19 a power efficient, high performance option for a cost-sensitive system with power or thermal constraints. Full virtual memory capabilities are provided by the L1 cache and implemented byt the ARMv7-A architecture. It can execute 32-bit ARM instructions as well as 16-bit and 32-bit Thumb instructions and 8-bit Java bytecodes. [18] Figure 5 shows the Cortex-A9 Dual MP Core Architecture. Figure 5: Cortex-A9 Dual MP Core Architecture [19] The processor was designed with high efficiency in mind with dual-issue superscalar, out-oforder, and a speculating dynamic length pipeline. The Cortex-A9 architecture supports 16, 32 or 64KB configurations of four way associative L1 caches and an optional L2 cache controller up to 8MB. The Cortex-A9 has physical IP available for designers. The processor comes with the ARM Development Suite 5 tools and CoreSight Debug & Trace IP. CoreSight is an on-chip debug and real-time trace kit for SoC designs utilizing ARM processors to optimize debugging the system. [20] The Cortex A-9 MPCore with 2 cores integrated as hard IP component on the Zynq EPP is a 800- MHz dual-core processor that supports both SMP and AMP. Each processor core has a dual-issue superscalar pipeline, the NEON processing engine, a single- and double-precision floating-point 11

20 unit (FPU), and 32-KB instruction and 32-KB data cache with cache coherence. The ARM Cortex-A series processors utilize NEON technology which is a 128-bit Single Instruction Multiple Data (SIMD) engine used to process multimedia formats. [21] SIMD is an extension to the architecture of the ARM providing operation extensions for registers and floating-point. [22] The Cortex-A Series also includes a Memory Management Unit (MMU), a Snoop Control Unit (SCU), shared 8-way 512-KB associative L2 cache, generic interrupt controller, Direct Memory Access Controller (DMAC), and a 32-bit general purpose timer on the chip. [19] The ARM Cortex-A9 processor, when combined with embedded peripherals, interfaces, and on-chip Memory (OCM), create a Hard Processor System (HPS). Connecting the HPS and FPGA of the SoC with a high-bandwidth on-chip backbone provides large bandwidth for sharing data between the ARM processor and hardware accelerators within the FPGA fabric. 2.2 AMBA Bus The ARM Cortex A9 AMBA 3 located on the chip is the backbone for communication within the SoC. The AMBA has been widely used as an on-chip bus architecture in many SoC designs. The AMBA has since exceeded its initial design potential and has gone beyond the use in microcontroller devices. The AMBA 1 consists of the Advanced Peripheral Bus (APB) and Advanced System Bus (ASB). In the second generation, AMBA 2, ARM added a single clockedge protocol called AMBA High-performance Bus (AHB). AMBA 3, the third generation AMBA, reached higher performance interconnects by adding the Advanced extensible Interface (AXI). It also included the Advanced Trace Bus (ATB) which was designed to work with the CoreSight on-chip trace and debug tools. [24] The AMBA 3 specifications replaced AMBA 2, but AMBA 2 peripherals can still be used on AMBA 3 based systems. The protocol specification of the AMBA family is an ARM open standard for on-chip buses and provides solutions to SoC interconnections and functional block 12

21 management for embedded design with multiple processors and multiple peripherals. [23] The AMBA 3's interface protocol specification encompasses all the required on-chip data traffic requirements. These requirements include high data throughput from data intensive processes, low bandwidth communication with low power and gate count, and on-chip testing and debugging. [24] AMBA 3's AXI is an interface and a protocol, but is not a bus. There is no bus arbitration because it is utilizes point to point connections. [53] AXI provides support for data traffic throughput with five unidirectional channels and out-of-order data transaction capabilities. This allows for high speed operations through the pipelined interconnections, simultaneous reading and writing transactions, and efficient high latency peripheral support and bridging between frequencies for power management. The AHB interface enables high efficiency interconnects between single frequency subsystems of simpler peripherals when the AXI is not need. The structure of the AHB is a fixed pipelined and an unidirectional channel allows for back compatibility with AMBA 2 peripherals. [24] APB provides low bandwidth transaction support to access necessary configuration registers in peripherals as well as data traffic in peripherals with low bandwidth. This interface is highly compact and low power isolates data traffic from the AHB and AXI high performance interconnects. ATB adds a data trace interface for data diagnostics in a trace system. This provides debugging capabilities due to the trace components and bus sitting in parallel with interconnects and peripherals. [24] 2.3 Zynq Extensible Processing Platform Introduction The Zynq EPP 7000 family of devices combine the hardware programmability of an FPGA and the software programmability of a processor. The overview of the hardware is depicted in Figure 13

22 6. The Zynq EPP platform's PS includes the Dual ARM Cortex-A9 MPCore that utilizes 32kB instruction and data L1 cache per core, shared 512kB L2 cache, FPU and NEON media engine. The memory interfaces include 256kB OCM in addition to NAND Flash and NOR Flash Memory Controller which includes DDR2, LPDDR2, and DDR3. Peripherals include Queued Serial Peripheral Interface (QSPI), USB2.0, GbE, CAN, SDIO, Universal Asynchronous Receiver and Transmitter (UART), SPI, I2C, General Purpose I/O (GPIO), 12bit 1 Mbps ADC, AES and SHA [25] There are four available models of the Zynq EPP designed for various applications. The available FPGA types for each of the model types include the Artix-7 for Z-7010 and Z-7020 and the Kintex-7 for Z-730 and Z The FPGA sizes vary and include logic cells that range from 30k-350k, block RAM ranging from 240kB-2,180kB, DSP Slices from , and user I/Os of The Kintex-7 devices also have eight PCI Express2 and 12.5 Gbps Transceivers. Quick EMUlator (QEMU), a virtual platform, is used for the model of the processing subsystem. [25] Figure 6: Xilinx Zynq-7000 Extensible Processing Platform Architecture [26] 14

23 Xilinx implemented AMBA on the Zynq as two switch matrices. The AMBA AXI interconnect exists in two areas on the Zynq EPP. One is grouped around the DRAM controller and the other is used for general peripherals. There are two switches on the peripheral side. The first has one connection for a hard static memory controller, eight hard I/O controller blocks, five connections for the CPU cluster, and four stubs that end at the programmable fabric. The second has five connections ending in the fabric, two connections for the hard DRAM controller, and two CPU ports. [27] One of these five ports supports the Accelerator Coherence Port (ACP). This port provides the ability for the accelerator to snoop the processor cluster s caches, but not the cluster s OCM so a CPU task could leave a control and data block in cache. From here, an accelerator in the programmable fabric can read the block directly from cache and therefore avoiding a write-back to DRAM. This protocol is not symmetric and therefore the accelerators are not fully coherent. This is because the CPU reads and writes do not snoop memory in the fabric. The AMBA I/O ports, the DRAM controller accessible AXI ports, and ACP provide the Zynq EPP with a range of programmable fabric structure design possibilities. The current available hardware platforms include the Xilinx Zynq-7000 ZC702 Evaluation Kit, the Xilinx Zyqn-7000 EPP Video Kit, and the Zynq-7000 EPP ZedBoard. [27] Xilinx Zynq-7000 Evaluation Kit The Xilinx Zynq-7000 ZC702 Evaluation Kit is a kit from Xilinx that includes a silicon board with the Zynq EPP, development tools, IP, and a variety of reference designs. An image of the board is shown in Figure 7. It provides abundant I/O expandability for embedded designers to develop upon. It is also backed with OS support and by the ARM community. The kit is provided with the XC7Z020-1CLG484CES device Zynq chip, design suites, various cables for scoping the 15

24 board, 8 GB SD card that contains the provided Linux startup kernel, and documentation with step-by-step guides. It also contains all schematics and PCB files and design examples. [28] Figure 7: Zynq 7000 Evaluation Kit [28] ZedBoard The ZedBoard is a community driven approach of the Zynq EPP by Silica and Digilent. An image of the board is shown in Figure 8. The concept behind the board is to be designed in an open source community manner. The board contains various peripherals with extension options that include a FPGA Mezzanine Card and peripheral modules using the Peripheral Module (Pmod) connector to connect components such as an ADC, DAC, Sensors, Switches, Displays, RF, WiFi, Bluetooth, or Storage. The ZebBoard website, ZedBoard.org, is where all the collaboration material is maintained. [29] 16

25 2.4 Real-Time Operating Systems Figure 8: ZedBoard [30] An OS is an abstraction of hardware in a system that provides an interface for servicing applications. The OS replaces the direct interface to hardware with program functionalities a user of the system wants or needs. It supports the basic functions of a computer system and makes the system easier to maintain, faster, and easier to write applications. When designing an OS various parameters are considered including performance, resources management, security, marketability, and failure tolerance. It is responsible for managing hardware and software resources. Hardware resources include processors, memory, and I/O devices. Software resources include programs and data files. An OS is comprised of layers that create an environment that hides and simplifies the underlying hardware by providing sets of commands to meet the user's needs. Though the structure of the OS kernel can vary, they all attempt to provide the user with a platform in which to utilize the system hardware. Many OSs make multiple programs and processes appear to run at the same time through multitasking. However, a processor can only handle one thread of execution at a time. A 17

26 scheduler is used to manage the processes executing on the processor and to create the illusion of simultaneous execution through a process call time slicing and rapidly switching between program threads. [31] The type of OS can be defined by its scheduler and how it decides which process threads to run and for how long. A multi user OS, like Unix, will ensure processing time is shared equally between users. A desktop type OS, like Windows, has a scheduler that ensures that the system remains responsive to users when needed. A RTOS's scheduler is designed to provide predictable execution patterns to systems that have real time requirements. Embedded systems often have these demands and means the system must respond to a given event within a strictly defined deadline. This means that the OS's scheduler must be deterministic in order to predict the real time requirements of the system. [32] FreeRTOS uses a traditional real time scheduler by allowing the user to assign a priority to each thread to determine execution. This scheduler, based on the priority, knows which thread of execution to run next. FreeRTOS is a versatile class of RTOS designed for many applications including being implemented on small microcontrollers. FreeRTOS is designed for systems that do not require a full RTOS implementation, many times in the design of embedded applications, or do not have the ability to run a full RTOS. FreeRTOS only provides the core real time scheduling functionality, inter-task communication, and timing and synchronization primitives and would more accurately be referred to as a real time kernel. If additional functionality is required, they can be included as add-on components. [33] 2.5 FreeRTOS FreeRTOS is a RTOS from Real Time Engineers Ltd written in C and, as of October 2011, supports 31 processor architectures. FreeRTOS is a lightweight real-time kernel designed for small embedded systems that require deterministic and real-time responsiveness to system events. 18

27 Lightweight means it is a less complex OS with a basic instruction set designed to be faster and not as heavily resource dependent. Key features of FreeRTOS include an Application Programming Interface (API), message passing, binary and counting semaphores, mutual exclusion with priority inheritance, pre-emptive scheduling, co-operative scheduling, and round robin with time slicing. Round robin is a simple scheduling algorithm for process time slicing in which each process is assigned equal portions of execution time and in circular order. [33] With the growing complexity in embedded design due to the availability of more memory and various communication peripherals, there is an inherent increase in software complexity. The inherent benefit of using an OS kernel is clear. FreeRTOS is free and is released to its users as open source. FreeRTOS implements its open source by releasing moderated versions instead of pure open source. This ensures that only software originated by FreeRTOS is used in the official release. There are, however, community contributed files that are separate and available as open source. FreeRTOS's license model is designed around the idea that code on the application side that uses FreeRTOS remains closed, while code that modifies or extends the kernel itself is open source. [34] FreeRTOS supports several Xilinx products including Microblaze, PowerPC, and the Zynq. Microblaze, which is a 32-bit soft processor core port, runs on various Xilinx FPGA's including the Spartan-6 and Virtex5. PowerPC 405/440 are configurable processor cores that run on Virtex4 and Virtex5 FPGA's repetitively. The initial release of the FreeRTOS is available for the Zynq in October The original port was for the Xilinx Zynq EPP and was developed to run on the Zynq 7000 EPP based ZC702 board and implemented on version 14.1 of the Xilinx ISE Design Suite. As Xilinx releases newer versions of their design suites, the ports are updated and released accordingly. [35] 19

28 2.6 Rhealstone: Real-Time Benchmark The Rhealstone Benchmark was proposed by Rabindra P. Kar and Kent Porter in 1989 and was designed to be a metric for comparing the performance of real-time multitasking systems independent of any features found in any CPU, bus architecture, or a specific OS or kernel. [36] At that time, there was the Whetstones and Dhrystones that benchmarked code generated by compilers and the throughput of hardware platforms, but no equivalent measurement for real-time systems. Rhealstone was a proposed standard for objectively measuring real-time performance and summarizing the components of performance. The Rhealstone metric mainly helps embedded developers select real-time systems appropriate for a specific application. It should be noted that an encompassing real-time solution would consist of the system, the application software, and external devices, so Rhealstones doesn't measure the quality of the complete solution, but instead a measurement targeted specifically toward a multitasking solution. The scope of Rhealstones is with complex systems running five to thirty concurrent processes. It will be adopted to be used with a multitasking AMP system. The Rhealstone takes into account that all real-time applications are unique. One system may be highly interrupt-driven while another relies heavily on message-passing among tasks or another that fights for resources. The Rhealstone figure is a sum obtained from six categories of activity most crucial to the performance of real-time systems. The categories include task switching, preemption, interrupt latency time, semaphore shuffling, deadlock breaking, and intertask message latency time. It uses coefficients that the system designer assigns weight to each Rhealstone component based on relative importance. 20

29 2.6.1 Task Switching Time Task switching time is the average time the system takes to switch between two independent and active, not suspended or sleeping, tasks of equal priority. Task switching is synchronous and nonpreemptive and is an important measure of any multitasking system. This metric is influenced by the host CPU's architecture, instruction set, and features and is designed to assess the compactness of task control data structures and the efficiency with which the executive manipulates the data structures in saving and restoring contexts. Task switching time, additionally, measures the executive's list management capabilities. [36] A demonstration of this performance parameter is shown in Figure 9. Figure 9: Rhealstone Benchmarking: Task-Switching Time [37] Preemption Time Preemption time is the average time it takes a higher-priority task to take control of the system from a running task of lower priority and usually occurs when the higher-priority task moves from an idle to a ready state in response to some external event. In other words, it is the average time the executive takes to recognize an external event and switch control of the system from a running task of lower priority to an idle task of higher priority. A demonstration of preemption time is shown in Figure 10. Preemption and interrupt latency, which is discussed next, can be 21

30 considered most significant real-time performance parameter since multitasking systems assign task priorities and even dynamically through applications. [36] Figure 10: Rhealstone Benchmarking: Preemption Time [37] Interrupt Latency Interrupt latency, shown in Figure 11, is the time between the CPU's receipt of an interrupt request and the execution of the first instruction in the interrupt service routine. Its reflected by the delay introduced by an executive and the processor and not delays occurring on the bus or interfaces to external devices. [36] Figure 11: Rhealstone Benchmarking: Interrupt Latency [37] 22

31 2.6.4 Semaphore Shuffling Time Semaphore shuffling time is the delay between a task's release of a semaphore and the activation of another task blocked on the "wait semaphore" primitive. When implementing this, at least three tasks with different priorities should be active and no other tasks should be scheduled in between. The semaphore shuffling time measures the overhead associated with mutual exclusion. This occurs when multiple tasks compete for the same resources. Semaphore based mutual exclusion provides a way of ensuring that a nonshareable resource only serves one master at a time. [36] Semaphore shuffling time is shown in Figure 12. Figure 12: Rhealstone Benchmarking: Semaphore-Shuffle Time [37] Deadlock Breaking Time Deadlock breaking occurs when a higher-priority task preempts a lower-priority task that holds a resource needed by the higher-priority task and the metric measures the average time it takes the executive to resolve this conflict. Deadlocks are a common multitasking problem and are sometime not handled effectively. This can be solved by temporarily raising the priority of the running task above that of the interrupting task until the needed resource is released by the lowerpriority task. The temporary priority is then lowered so the new task can run. Deadlock breaking, 23

32 shown in Figure 13, is the sum of times required to resolve an ownership dispute between a lowpriority task holding a resource and a higher-priority task that needs it. [36] Figure 13: Rhealstone Benchmarking: Deadlock-Break Time [37] Intertask Messaging Latency Intertask message latency, demonstrated in Figure 14, is the delay within the executive when a nonzero-length data message is sent from one task to another. In order to measure it properly, the sending task should stop executing immediately after sending the message and the receiving task should be suspended while waiting for it. Figure 14: Rhealstone Benchmarking: Intertask Message Latency [37] 24

33 The intertask message-passing link must be established at run time and if multiple messages are sent on the same link, the receiving task gets a chance to read an old message before the sending task can overwrite it with a new one. This can be handled with various mechanisms such as pipes, queues, and stream files which are usually provided by multitasking executives for intertask data communication. [36] Calculating the Rhealstone Performance Number The measurement of the six performance categories provide embedded designers with a well rounded analysis of the system performance. Rhealstone also makes it easy to compare systems by generating a single real-time value. All of the benchmarks must be first represented in seconds (t1-t6). Then, added together and the average of them found. The number is then inverted to get the Rhealstone performance number that is represented with the units Rhealstones/second as shown in Equation 2.1. [37] / (2.1) The above performance number is a method to compare systems on a general level by considering all the parameters to be equally occurring. If an embedded designer needs to evaluate a system based on a specific category, for example, an application that is heavily interrupt-driven a weight can be chosen before calculating the performance number. This method is referred to as application specific Rhealstone and is shown as Equation 2.2. [37] / (2.2) Nonnegative real coefficients (n1-n6) for each category are set based on occurrence within the application. If interrupts occur 5 times more than task switching, its coefficient should be 5 times larger. Similarly, if a category does not happen at all, the coefficient is set to zero. For example, if 25

34 there is no inter-task message passing performed by the application, its coefficient should be set to zero. The application specific Rhealstone Performance Number is then again calculated by inverting the average. [37] 26

35 CHAPTER 3 DESIGN TOOLS 3.1 Introduction Complex embedded systems require powerful and well developed design tools. With an embedded system such as the Zynq EPP, embedded engineers are faced with complex design projects that have both hardware and software design problems. Using an FPGA in the design makes the system even more complicated and combining each individually designed subsystem into one complete system is again a difficult task. With the Zynq EPP and the addition of the ARM dual core as Hard IP, Xilinx has developed a set of design tools that manage this complexity and help make the design process as simple as possible. The broad array of development system tools provided by Xilinx is collectively called the ISE Design Suite. The Xilinx ISE Design Suite 14 is the current version used for designing on the Zynq-7000 All Programmable SoC platform. [39] 3.2 Xilinx ISE 14 The Xilinx ISE Design Suite is the current development tool set used to design every aspect of the Zynq-7000 All Programmable SoC. There are currently three editions, the Logic, Embedded, and DSP, of the ISE Design Suite and are all included as part of the System edition. [40] The Xilinx ISE Design Suite 14.2 Embedded Edition was used for development on the Xilinx ZC702 Rev C Evaluation Board for this thesis. Xilinx Vivado is the next generation of this design suite and will be replacing the Xilinx ISE for future Xilinx products. The first generation of Vivado did not support the Zynq EPP, but will support it in the future. The Embedded Edition of the ISE Design Suite includes the PlanAhead design analysis tool, ChipScope Pro, and the Embedded Development Kit (EDK). The EDK consists of the Xilinx 27

36 Platform Studio (XPS) and the Software Development Kit (SDK). [39] Aside from the ISE, Tera Term was also used for the design process. Figure 15 depicts the block diagram for the software packages within the ISE Design Suite and how they interact with each other. PlanAhead is the initial development tool for starting an embedded design. Planahead works with the EDK to design the hardware and software system. 3.3 PlanAhead Figure 15: Design Tools Block Diagram Xilinx ISE 14 The PlanAhead design and analysis tool is used to add various hardware sources and manage the link between the hardware and software design aspects of the project. It helps with FPGA I/O assignments and advanced FPGA layout planning to optimize the connectivity between the PCB and FGPA. [40] PlanAhead allows the embedded designer to create a project with an embedded processor system as the top level and works with the EDK to design the embedded system. The hardware system is created using XPS and imported back into the PlanAhead project. The 28

37 PlanAhead project is then exported to the SDK to develop software for the hardware design that was just created. When PlanAhead executes, it allows the embedded designer to create a new project or open an existing one. A Register Transfer Level (RTL) project is created to begin the design in PlanAhead. The RTL Project allows the embedded designer to add sources, generate IP, and run an RTL analysis. The designer starts by setting up the type of hardware board the project is being design for. For this thesis, the Zynq ZC702 Evaluation Board was selected. PlanAhead is then used to import various sources into the project. It can add constraints such as a User Constraint Files (UCF) which specifies how the logical design constraints are implemented on the target device [54], add design sources such as the HDL Verilog, or an Embedded Source for setting up the PS peripherals and various other settings. PlanAhead also generates the bitstream s bit file for programming the PL in the SDK. [51] Figure 16 shows how the project files of PlanAhead interact with the rest of the ISE Design tools. Figure 16: Design Tools Block Diagram PlanAhead 29

38 When an embedded source is added to the project, it recognizes that an embedded processor system was created and starts XPS to setup the added source. When the designer is finished with XPS, the design is updated in the PlanAhead tool. From here the embedded processor system can be created as the top level of the system by creating a Top HDL with Verilog. The entire project is then exported to the SDK. [39] 3.4 Embedded Design Kit The EDK is used to design a complete embedded processor system for implementation on a Xilinx hardware device. It assists designers in hardware and software application design, debugging, and execution. The design can be run on the destination boards for verification of a working design. The EDK includes hardware IP, drivers and libraries, and GNU compiler and debugger for C/C++ software development for the ARM Cortex-A9MP processors in the Zynq PS. It also provides documentation and sample tutorial projects for understanding the basics. [39] The tool kits included in the EDK are the XPS and SDK. Within these two kits are various tools including the Base System Builder (BSB) Wizard, Xilinx Microprocessor Debugger (XMD) and GNU Software Debugging Tools, Simulation Model Generation Tool (SimGen), Create and Import Peripheral Wizard, GNU Software Development Tools, Library Generation Tool (LibGen), Bitstream Initializer (BitInit), and the Hardware Platform Generation Tool (PlatGen). [40] 3.5 Xilinx Platform Studio XPS provides a development environment for designing the embedded PS s hardware. XPS is primarily used for setting up the processor, peripheral, and interconnection configurations for the embedded processor hardware system. It s designed to make it easy to add desired IPs and create port connections for components like the clock and reset. The XPS project can be designed from the ground up using a blank project or the BSB wizard can be used to add default peripherals to 30

39 the fabric and to automatically select a default configuration for the PS I/O interface. After the BSB is used, the Zynq EPP PS block diagram is displayed in XPS. This allows the designer to click on any of the configurable green blocks and make configuration changes. The configuration process of XPS is shown in Figure 17. Figure 17: Design Tools Block Diagram XPS XPS creates hardware platform information in the Xilinx Microprocessor Project (XMP) file format. [51] This file includes information about the PS configurations including GPIO such as MIO and Extended MIO (EMIO), and adds IP and information about configuring the PL in PlanAhead. Closing XPS will update the currently open PlanAhead session Base System Builder Wizard The BSB wizard is part of the XPS and prompts the designer to choose whether they want assistants in setting up the basic configurations of PS. The BSB helps create a working embedded design for the evaluation board quickly by setting up basic features and common functionality 31

40 automatically. After setting up the basics of the system, XPS and other ISE software tools can be used to perform system customization. The first aspect of the design the BSB wizard sets up is what type of interface is going to be used whether it be AXI or Processor Local Bus (PLB) which is an old interface standard used by Xilinx. The BSB then needs to know what type of board the system is being design for. Fortunately, this information was imported from PlanAhead and if it wasn t the correct board setup can be selected. [39] The BSB closes and now allows the designer to customize the existing design AXI Interconnection The AXI bus interface IP cores started being used by Xilinx with their Spartan-6 and Virtex-6 hardware devices. An AXI system interface comes with standard Xilinx IP and tool flows and will be the standard interface used for all current and future versions of Xilinx products. The PLB system is a legacy bus standard used by Xilinx FPGA families up to the Spartan 6 and Virtex 6 and is not supporting newer FPGA families. This means it is not suggested to start new projects with PLB if they will be used on new Xilinx platforms. [43] The AXI specification is in charge of providing a framework for defining protocols for moving data between IP. It does this using a defined signaling standard. The AXI standard is responsible for making sure that IP can exchange data is moved across a system properly. [42] The AXI and other IP can be added to the PS design to create a custom embedded system Hardware Platform Configuration The Zynq s PS can be configured in various ways. When the BSB is finished setting up the basic system, the designer is provided with the Zynq EPP processing platform configuration tab shown in Figure 18. This tab allows the designer to configure I/O peripherals, clocks, memory, and other aspects of the PS. The green blocks are customizable portions of the PS. 32

41 Figure 18: Processing Platform Peripheral Configuration XPS Various IP can be added to the PS using the bus interface tab. Once the peripherals are added to the system, the ports tab is used to setup the I/O peripherals and clocks. There are 54 MIO that can be used by the PS. If more I/O is required or the designer wants to utilize the PL, the I/O can be setup as EMIO for use by FPGA fabric. [53] Once the PS is configured, XPS is exited and the design is updated in PlanAhead and ready to be exported to the SDK. 3.6 Software Development Kit The SDK is used for developing the software design for the embedded project. The SDK is used for C/C++ embedded software application creation and verification of software application projects and was built on the Eclipse open-source standard framework. [40] The SDK provides tools for software project management and gives access to the GNU toolchain for code compilation and debugging. It can be used to run applications on the target hardware board and 33

42 create bootable images. The FPGA fabric can also be programmed when needed. Figure 19 shows the relationship between the files within the SDK. Figure 19 Design Tools Block Diagram SDK The PlanAhead design tool exports the hardware platform specification files from XPS to the SDK. These files include the XML, Microprocessor Hardware Specification (MHS), and the ps7_init.c, ps7_init.h, ps7_init.tcl, and ps7_init.html files. The XML file is the main file used for setting up the First Stage Boot Loader (FSBL) and Board Support Package (BSP). The MHS file contains information about the interconnects between the PS and PL. Ps7_init.c, ps7_init.h, ps7_init.tcl, and ps7_init.html files are internal configuration files containing information on the Zynq EPP peripheral configurations. The ps7_init.c and ps7_init.h files contain settings for DDR, clocks, plls, and MIOs to initialize the Zynq EPP PS. The SDK uses these specified settings so that applications can be run on top of the PS. It should be noted that here are some settings of the PS that are fixed for the ZC702 evaluation board and cannot be changed. [44] 34

Copyright 2016 Xilinx

Copyright 2016 Xilinx Zynq Architecture Zynq Vivado 2015.4 Version This material exempt per Department of Commerce license exception TSU Objectives After completing this module, you will be able to: Identify the basic building

More information

1-1 SDK with Zynq EPP

1-1 SDK with Zynq EPP -1 1SDK with Zynq EPP -2 Objectives Generating the processing subsystem with EDK SDK Project Management and Software Flow SDK with Zynq EPP - 1-2 Copyright 2012 Xilinx 2 Generating the processing subsystem

More information

Copyright 2014 Xilinx

Copyright 2014 Xilinx IP Integrator and Embedded System Design Flow Zynq Vivado 2014.2 Version This material exempt per Department of Commerce license exception TSU Objectives After completing this module, you will be able

More information

Introduction to Embedded System Design using Zynq

Introduction to Embedded System Design using Zynq Introduction to Embedded System Design using Zynq Zynq Vivado 2015.2 Version This material exempt per Department of Commerce license exception TSU Objectives After completing this module, you will be able

More information

Zynq-7000 All Programmable SoC Product Overview

Zynq-7000 All Programmable SoC Product Overview Zynq-7000 All Programmable SoC Product Overview The SW, HW and IO Programmable Platform August 2012 Copyright 2012 2009 Xilinx Introducing the Zynq -7000 All Programmable SoC Breakthrough Processing Platform

More information

SoC Platforms and CPU Cores

SoC Platforms and CPU Cores SoC Platforms and CPU Cores COE838: Systems on Chip Design http://www.ee.ryerson.ca/~courses/coe838/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrical and Computer Engineering Ryerson University

More information

Zynq Architecture, PS (ARM) and PL

Zynq Architecture, PS (ARM) and PL , PS (ARM) and PL Joint ICTP-IAEA School on Hybrid Reconfigurable Devices for Scientific Instrumentation Trieste, 1-5 June 2015 Fernando Rincón Fernando.rincon@uclm.es 1 Contents Zynq All Programmable

More information

Hardware In The Loop (HIL) Simulation for the Zynq-7000 All Programmable SoC Author: Umang Parekh

Hardware In The Loop (HIL) Simulation for the Zynq-7000 All Programmable SoC Author: Umang Parekh Application Note: Zynq-7000 AP SoC XAPP744 (v1.0.2) November 2, 2012 Hardware In The Loop (HIL) Simulation for the Zynq-7000 All Programmable SoC Author: Umang Parekh Summary The Zynq -7000 All Programmable

More information

Zynq-7000 All Programmable SoC: Concepts, Tools, and Techniques (CTT)

Zynq-7000 All Programmable SoC: Concepts, Tools, and Techniques (CTT) Zynq-7000 All Programmable SoC: Concepts, Tools, and Techniques (CTT) A Hands-On Guide to Effective Embedded System Design Notice of Disclaimer The information disclosed to you hereunder (the Materials

More information

Designing a Multi-Processor based system with FPGAs

Designing a Multi-Processor based system with FPGAs Designing a Multi-Processor based system with FPGAs BRINGING BRINGING YOU YOU THE THE NEXT NEXT LEVEL LEVEL IN IN EMBEDDED EMBEDDED DEVELOPMENT DEVELOPMENT Frank de Bont Trainer / Consultant Cereslaan

More information

Design Choices for FPGA-based SoCs When Adding a SATA Storage }

Design Choices for FPGA-based SoCs When Adding a SATA Storage } U4 U7 U7 Q D U5 Q D Design Choices for FPGA-based SoCs When Adding a SATA Storage } Lorenz Kolb & Endric Schubert, Missing Link Electronics Rudolf Usselmann, ASICS World Services Motivation for SATA Storage

More information

SoC FPGAs. Your User-Customizable System on Chip Altera Corporation Public

SoC FPGAs. Your User-Customizable System on Chip Altera Corporation Public SoC FPGAs Your User-Customizable System on Chip Embedded Developers Needs Low High Increase system performance Reduce system power Reduce board size Reduce system cost 2 Providing the Best of Both Worlds

More information

Performance Optimization for an ARM Cortex-A53 System Using Software Workloads and Cycle Accurate Models. Jason Andrews

Performance Optimization for an ARM Cortex-A53 System Using Software Workloads and Cycle Accurate Models. Jason Andrews Performance Optimization for an ARM Cortex-A53 System Using Software Workloads and Cycle Accurate Models Jason Andrews Agenda System Performance Analysis IP Configuration System Creation Methodology: Create,

More information

MYC-C7Z010/20 CPU Module

MYC-C7Z010/20 CPU Module MYC-C7Z010/20 CPU Module - 667MHz Xilinx XC7Z010/20 Dual-core ARM Cortex-A9 Processor with Xilinx 7-series FPGA logic - 1GB DDR3 SDRAM (2 x 512MB, 32-bit), 4GB emmc, 32MB QSPI Flash - On-board Gigabit

More information

MYD-C7Z010/20 Development Board

MYD-C7Z010/20 Development Board MYD-C7Z010/20 Development Board MYC-C7Z010/20 CPU Module as Controller Board Two 0.8mm pitch 140-pin Connectors for Board-to-Board Connections 667MHz Xilinx XC7Z010/20 Dual-core ARM Cortex-A9 Processor

More information

ARM Processors for Embedded Applications

ARM Processors for Embedded Applications ARM Processors for Embedded Applications Roadmap for ARM Processors ARM Architecture Basics ARM Families AMBA Architecture 1 Current ARM Core Families ARM7: Hard cores and Soft cores Cache with MPU or

More information

SYSTEMS ON CHIP (SOC) FOR EMBEDDED APPLICATIONS

SYSTEMS ON CHIP (SOC) FOR EMBEDDED APPLICATIONS SYSTEMS ON CHIP (SOC) FOR EMBEDDED APPLICATIONS Embedded System System Set of components needed to perform a function Hardware + software +. Embedded Main function not computing Usually not autonomous

More information

Effective System Design with ARM System IP

Effective System Design with ARM System IP Effective System Design with ARM System IP Mentor Technical Forum 2009 Serge Poublan Product Marketing Manager ARM 1 Higher level of integration WiFi Platform OS Graphic 13 days standby Bluetooth MP3 Camera

More information

Multi-core microcontroller design with Cortex-M processors and CoreSight SoC

Multi-core microcontroller design with Cortex-M processors and CoreSight SoC Multi-core microcontroller design with Cortex-M processors and CoreSight SoC Joseph Yiu, ARM Ian Johnson, ARM January 2013 Abstract: While the majority of Cortex -M processor-based microcontrollers are

More information

Hardware Design. University of Pannonia Dept. Of Electrical Engineering and Information Systems. MicroBlaze v.8.10 / v.8.20

Hardware Design. University of Pannonia Dept. Of Electrical Engineering and Information Systems. MicroBlaze v.8.10 / v.8.20 University of Pannonia Dept. Of Electrical Engineering and Information Systems Hardware Design MicroBlaze v.8.10 / v.8.20 Instructor: Zsolt Vörösházi, PhD. This material exempt per Department of Commerce

More information

Hardware Design. MicroBlaze 7.1. This material exempt per Department of Commerce license exception TSU Xilinx, Inc. All Rights Reserved

Hardware Design. MicroBlaze 7.1. This material exempt per Department of Commerce license exception TSU Xilinx, Inc. All Rights Reserved Hardware Design MicroBlaze 7.1 This material exempt per Department of Commerce license exception TSU Objectives After completing this module, you will be able to: List the MicroBlaze 7.1 Features List

More information

Introduction to Zynq

Introduction to Zynq Introduction to Zynq Lab 2 PS Config Part 1 Hello World October 2012 Version 02 Copyright 2012 Avnet Inc. All rights reserved Table of Contents Table of Contents... 2 Lab 2 Objectives... 3 Experiment 1:

More information

Commercial Real-time Operating Systems An Introduction. Swaminathan Sivasubramanian Dependable Computing & Networking Laboratory

Commercial Real-time Operating Systems An Introduction. Swaminathan Sivasubramanian Dependable Computing & Networking Laboratory Commercial Real-time Operating Systems An Introduction Swaminathan Sivasubramanian Dependable Computing & Networking Laboratory swamis@iastate.edu Outline Introduction RTOS Issues and functionalities LynxOS

More information

Vivado Design Suite User Guide: Embedded Processor Hardware Design

Vivado Design Suite User Guide: Embedded Processor Hardware Design Vivado Design Suite User Guide: Embedded Processor Hardware Design Notice of Disclaimer The information disclosed to you hereunder (the "Materials") is provided solely for the selection and use of Xilinx

More information

Embedded Systems: Architecture

Embedded Systems: Architecture Embedded Systems: Architecture Jinkyu Jeong (Jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu ICE3028: Embedded Systems Design, Fall 2018, Jinkyu Jeong (jinkyu@skku.edu)

More information

Spartan-6 and Virtex-6 FPGA Embedded Kit FAQ

Spartan-6 and Virtex-6 FPGA Embedded Kit FAQ Spartan-6 and Virtex-6 FPGA FAQ February 5, 2009 Getting Started 1. Where can I purchase an Embedded kit? A: You can purchase your Spartan-6 and Virtex-6 FPGA Embedded kits online at: Spartan-6 FPGA :

More information

The CoreConnect Bus Architecture

The CoreConnect Bus Architecture The CoreConnect Bus Architecture Recent advances in silicon densities now allow for the integration of numerous functions onto a single silicon chip. With this increased density, peripherals formerly attached

More information

Embedded Operating Systems. Unit I and Unit II

Embedded Operating Systems. Unit I and Unit II Embedded Operating Systems Unit I and Unit II Syllabus Unit I Operating System Concepts Real-Time Tasks and Types Types of Real-Time Systems Real-Time Operating Systems UNIT I Operating System Manager:

More information

Building an Embedded Processor System on a Xilinx Zync FPGA (Profiling): A Tutorial

Building an Embedded Processor System on a Xilinx Zync FPGA (Profiling): A Tutorial Building an Embedded Processor System on a Xilinx Zync FPGA (Profiling): A Tutorial Embedded Processor Hardware Design October 6 t h 2017. VIVADO TUTORIAL 1 Table of Contents Requirements... 3 Part 1:

More information

S2C K7 Prodigy Logic Module Series

S2C K7 Prodigy Logic Module Series S2C K7 Prodigy Logic Module Series Low-Cost Fifth Generation Rapid FPGA-based Prototyping Hardware The S2C K7 Prodigy Logic Module is equipped with one Xilinx Kintex-7 XC7K410T or XC7K325T FPGA device

More information

Efficiency and memory footprint of Xilkernel for the Microblaze soft processor

Efficiency and memory footprint of Xilkernel for the Microblaze soft processor Efficiency and memory footprint of Xilkernel for the Microblaze soft processor Dariusz Caban, Institute of Informatics, Gliwice, Poland - June 18, 2014 The use of a real-time multitasking kernel simplifies

More information

«Real Time Embedded systems» Multi Masters Systems

«Real Time Embedded systems» Multi Masters Systems «Real Time Embedded systems» Multi Masters Systems rene.beuchat@epfl.ch LAP/ISIM/IC/EPFL Chargé de cours rene.beuchat@hesge.ch LSN/hepia Prof. HES 1 Multi Master on Chip On a System On Chip, Master can

More information

The ARM Cortex-A9 Processors

The ARM Cortex-A9 Processors The ARM Cortex-A9 Processors This whitepaper describes the details of the latest high performance processor design within the common ARM Cortex applications profile ARM Cortex-A9 MPCore processor: A multicore

More information

Introduction CHAPTER IN THIS CHAPTER

Introduction CHAPTER IN THIS CHAPTER CHAPTER Introduction 1 IN THIS CHAPTER What Is the ARM Cortex-M3 Processor?... 1 Background of ARM and ARM Architecture... 2 Instruction Set Development... 7 The Thumb-2 Technology and Instruction Set

More information

SECURE PARTIAL RECONFIGURATION OF FPGAs. Amir S. Zeineddini Kris Gaj

SECURE PARTIAL RECONFIGURATION OF FPGAs. Amir S. Zeineddini Kris Gaj SECURE PARTIAL RECONFIGURATION OF FPGAs Amir S. Zeineddini Kris Gaj Outline FPGAs Security Our scheme Implementation approach Experimental results Conclusions FPGAs SECURITY SRAM FPGA Security Designer/Vendor

More information

1. Microprocessor Architectures. 1.1 Intel 1.2 Motorola

1. Microprocessor Architectures. 1.1 Intel 1.2 Motorola 1. Microprocessor Architectures 1.1 Intel 1.2 Motorola 1.1 Intel The Early Intel Microprocessors The first microprocessor to appear in the market was the Intel 4004, a 4-bit data bus device. This device

More information

Designing, developing, debugging ARM Cortex-A and Cortex-M heterogeneous multi-processor systems

Designing, developing, debugging ARM Cortex-A and Cortex-M heterogeneous multi-processor systems Designing, developing, debugging ARM and heterogeneous multi-processor systems Kinjal Dave Senior Product Manager, ARM ARM Tech Symposia India December 7 th 2016 Topics Introduction System design Software

More information

A Closer Look at the Epiphany IV 28nm 64 core Coprocessor. Andreas Olofsson PEGPUM 2013

A Closer Look at the Epiphany IV 28nm 64 core Coprocessor. Andreas Olofsson PEGPUM 2013 A Closer Look at the Epiphany IV 28nm 64 core Coprocessor Andreas Olofsson PEGPUM 2013 1 Adapteva Achieves 3 World Firsts 1. First processor company to reach 50 GFLOPS/W 3. First semiconductor company

More information

VLSI Design of Multichannel AMBA AHB

VLSI Design of Multichannel AMBA AHB RESEARCH ARTICLE OPEN ACCESS VLSI Design of Multichannel AMBA AHB Shraddha Divekar,Archana Tiwari M-Tech, Department Of Electronics, Assistant professor, Department Of Electronics RKNEC Nagpur,RKNEC Nagpur

More information

Midterm Exam. Solutions

Midterm Exam. Solutions Midterm Exam Solutions Problem 1 List at least 3 advantages of implementing selected portions of a complex design in software Software vs. Hardware Trade-offs Improve Performance Improve Energy Efficiency

More information

Chapter 5. Introduction ARM Cortex series

Chapter 5. Introduction ARM Cortex series Chapter 5 Introduction ARM Cortex series 5.1 ARM Cortex series variants 5.2 ARM Cortex A series 5.3 ARM Cortex R series 5.4 ARM Cortex M series 5.5 Comparison of Cortex M series with 8/16 bit MCUs 51 5.1

More information

Bringing the benefits of Cortex-M processors to FPGA

Bringing the benefits of Cortex-M processors to FPGA Bringing the benefits of Cortex-M processors to FPGA Presented By Phillip Burr Senior Product Marketing Manager Simon George Director, Product & Technical Marketing System Software and SoC Solutions Agenda

More information

Chapter 6 Storage and Other I/O Topics

Chapter 6 Storage and Other I/O Topics Department of Electr rical Eng ineering, Chapter 6 Storage and Other I/O Topics 王振傑 (Chen-Chieh Wang) ccwang@mail.ee.ncku.edu.tw ncku edu Feng-Chia Unive ersity Outline 6.1 Introduction 6.2 Dependability,

More information

ECE 471 Embedded Systems Lecture 2

ECE 471 Embedded Systems Lecture 2 ECE 471 Embedded Systems Lecture 2 Vince Weaver http://www.eece.maine.edu/~vweaver vincent.weaver@maine.edu 3 September 2015 Announcements HW#1 will be posted today, due next Thursday. I will send out

More information

Parallella Linux - quickstart guide. Antmicro Ltd

Parallella Linux - quickstart guide. Antmicro Ltd Parallella Linux - quickstart guide Antmicro Ltd June 13, 2016 Contents 1 Introduction 1 1.1 Xilinx tools.......................................... 1 1.2 Version information.....................................

More information

PS2 VGA Peripheral Based Arithmetic Application Using Micro Blaze Processor

PS2 VGA Peripheral Based Arithmetic Application Using Micro Blaze Processor PS2 VGA Peripheral Based Arithmetic Application Using Micro Blaze Processor K.Rani Rudramma 1, B.Murali Krihna 2 1 Assosiate Professor,Dept of E.C.E, Lakireddy Bali Reddy Engineering College, Mylavaram

More information

ECE 471 Embedded Systems Lecture 3

ECE 471 Embedded Systems Lecture 3 ECE 471 Embedded Systems Lecture 3 Vince Weaver http://web.eece.maine.edu/~vweaver vincent.weaver@maine.edu 10 September 2018 Announcements New classroom: Stevens 365 HW#1 was posted, due Friday Reminder:

More information

Xilinx Vivado/SDK Tutorial

Xilinx Vivado/SDK Tutorial Xilinx Vivado/SDK Tutorial (Laboratory Session 1, EDAN15) Flavius.Gruian@cs.lth.se March 21, 2017 This tutorial shows you how to create and run a simple MicroBlaze-based system on a Digilent Nexys-4 prototyping

More information

Designing with ALTERA SoC Hardware

Designing with ALTERA SoC Hardware Designing with ALTERA SoC Hardware Course Description This course provides all theoretical and practical know-how to design ALTERA SoC devices under Quartus II software. The course combines 60% theory

More information

Building High Performance, Power Efficient Cortex and Mali systems with ARM CoreLink. Robert Kaye

Building High Performance, Power Efficient Cortex and Mali systems with ARM CoreLink. Robert Kaye Building High Performance, Power Efficient Cortex and Mali systems with ARM CoreLink Robert Kaye 1 Agenda Once upon a time ARM designed systems Compute trends Bringing it all together with CoreLink 400

More information

High Speed Data Transfer Using FPGA

High Speed Data Transfer Using FPGA High Speed Data Transfer Using FPGA Anjali S S, Rejani Krishna P, Aparna Devi P S M.Tech Student, VLSI & Embedded Systems, Department of Electronics, Govt. Model Engineering College, Thrikkakkara anjaliss.mec@gmail.com

More information

Signal Processing Algorithms into Fixed Point FPGA Hardware Dennis Silage ECE Temple University

Signal Processing Algorithms into Fixed Point FPGA Hardware Dennis Silage ECE Temple University Signal Processing Algorithms into Fixed Point FPGA Hardware Dennis Silage silage@temple.edu ECE Temple University www.temple.edu/scdl Signal Processing Algorithms into Fixed Point FPGA Hardware Motivation

More information

Contents of this presentation: Some words about the ARM company

Contents of this presentation: Some words about the ARM company The architecture of the ARM cores Contents of this presentation: Some words about the ARM company The ARM's Core Families and their benefits Explanation of the ARM architecture Architecture details, features

More information

SAMA5D2 Quad SPI (QSPI) Performance. Introduction. SMART ARM-based Microprocessor APPLICATION NOTE

SAMA5D2 Quad SPI (QSPI) Performance. Introduction. SMART ARM-based Microprocessor APPLICATION NOTE SMART ARM-based Microprocessor SAMA5D2 Quad SPI (QSPI) Performance APPLICATION NOTE Introduction The Atmel SMART SAMA5D2 Series is a high-performance, powerefficient embedded MPU based on the ARM Cortex

More information

Maximizing heterogeneous system performance with ARM interconnect and CCIX

Maximizing heterogeneous system performance with ARM interconnect and CCIX Maximizing heterogeneous system performance with ARM interconnect and CCIX Neil Parris, Director of product marketing Systems and software group, ARM Teratec June 2017 Intelligent flexible cloud to enable

More information

Shared Address Space I/O: A Novel I/O Approach for System-on-a-Chip Networking

Shared Address Space I/O: A Novel I/O Approach for System-on-a-Chip Networking Shared Address Space I/O: A Novel I/O Approach for System-on-a-Chip Networking Di-Shi Sun and Douglas M. Blough School of Electrical and Computer Engineering Georgia Institute of Technology Atlanta, GA

More information

Simplify System Complexity

Simplify System Complexity Simplify System Complexity With the new high-performance CompactRIO controller Fanie Coetzer Field Sales Engineer Northern South Africa 2 3 New control system CompactPCI MMI/Sequencing/Logging FieldPoint

More information

4. Hardware Platform: Real-Time Requirements

4. Hardware Platform: Real-Time Requirements 4. Hardware Platform: Real-Time Requirements Contents: 4.1 Evolution of Microprocessor Architecture 4.2 Performance-Increasing Concepts 4.3 Influences on System Architecture 4.4 A Real-Time Hardware Architecture

More information

FPGA Entering the Era of the All Programmable SoC

FPGA Entering the Era of the All Programmable SoC FPGA Entering the Era of the All Programmable SoC Ivo Bolsens, Senior Vice President & CTO Page 1 Moore s Law: The Technology Pipeline Page 2 Industry Debates on Cost Page 3 Design Cost Estimated Chip

More information

Use ZCU102 TRD to Accelerate Development of ZYNQ UltraScale+ MPSoC

Use ZCU102 TRD to Accelerate Development of ZYNQ UltraScale+ MPSoC Use ZCU102 TRD to Accelerate Development of ZYNQ UltraScale+ MPSoC Topics Hardware advantages of ZYNQ UltraScale+ MPSoC Software stacks of MPSoC Target reference design introduction Details about one Design

More information

Operating Systems: Internals and Design Principles. Chapter 2 Operating System Overview Seventh Edition By William Stallings

Operating Systems: Internals and Design Principles. Chapter 2 Operating System Overview Seventh Edition By William Stallings Operating Systems: Internals and Design Principles Chapter 2 Operating System Overview Seventh Edition By William Stallings Operating Systems: Internals and Design Principles Operating systems are those

More information

LEON4: Fourth Generation of the LEON Processor

LEON4: Fourth Generation of the LEON Processor LEON4: Fourth Generation of the LEON Processor Magnus Själander, Sandi Habinc, and Jiri Gaisler Aeroflex Gaisler, Kungsgatan 12, SE-411 19 Göteborg, Sweden Tel +46 31 775 8650, Email: {magnus, sandi, jiri}@gaisler.com

More information

Embedded HW/SW Co-Development

Embedded HW/SW Co-Development Embedded HW/SW Co-Development It May be Driven by the Hardware Stupid! Frank Schirrmeister EDPS 2013 Monterey April 18th SPMI USB 2.0 SLIMbus RFFE LPDDR 2 LPDDR 3 emmc 4.5 UFS SD 3.0 SD 4.0 UFS Bare Metal

More information

EDK Concepts, Tools, and Techniques

EDK Concepts, Tools, and Techniques EDK Concepts, Tools, and Techniques A Hands-On Guide to Effective Embedded System Design Notice of Disclaimer The information disclosed to you hereunder (the Materials ) is provided solely for the selection

More information

CISC RISC. Compiler. Compiler. Processor. Processor

CISC RISC. Compiler. Compiler. Processor. Processor Q1. Explain briefly the RISC design philosophy. Answer: RISC is a design philosophy aimed at delivering simple but powerful instructions that execute within a single cycle at a high clock speed. The RISC

More information

Simplify System Complexity

Simplify System Complexity 1 2 Simplify System Complexity With the new high-performance CompactRIO controller Arun Veeramani Senior Program Manager National Instruments NI CompactRIO The Worlds Only Software Designed Controller

More information

Introducing the Spartan-6 & Virtex-6 FPGA Embedded Kits

Introducing the Spartan-6 & Virtex-6 FPGA Embedded Kits Introducing the Spartan-6 & Virtex-6 FPGA Embedded Kits Overview ß Embedded Design Challenges ß Xilinx Embedded Platforms for Embedded Processing ß Introducing Spartan-6 and Virtex-6 FPGA Embedded Kits

More information

Santa Fe (MAXREFDES5#) MicroZed Quick Start Guide

Santa Fe (MAXREFDES5#) MicroZed Quick Start Guide Santa Fe (MAXREFDES5#) MicroZed Quick Start Guide Rev 0; 5/14 Maxim Integrated cannot assume responsibility for use of any circuitry other than circuitry entirely embodied in a Maxim Integrated product.

More information

«Real Time Embedded systems» Cyclone V SOC - FPGA

«Real Time Embedded systems» Cyclone V SOC - FPGA «Real Time Embedded systems» Cyclone V SOC - FPGA Ref: http://www.altera.com rene.beuchat@epfl.ch LAP/ISIM/IC/EPFL Chargé de cours rene.beuchat@hesge.ch LSN/hepia Prof. HES 1 SOC + FPGA (ex. Cyclone V,

More information

EEM870 Embedded System and Experiment Lecture 3: ARM Processor Architecture

EEM870 Embedded System and Experiment Lecture 3: ARM Processor Architecture EEM870 Embedded System and Experiment Lecture 3: ARM Processor Architecture Wen-Yen Lin, Ph.D. Department of Electrical Engineering Chang Gung University Email: wylin@mail.cgu.edu.tw March 2014 Agenda

More information

Cortex-A9 MPCore Software Development

Cortex-A9 MPCore Software Development Cortex-A9 MPCore Software Development Course Description Cortex-A9 MPCore software development is a 4 days ARM official course. The course goes into great depth and provides all necessary know-how to develop

More information

CMP Conference 20 th January Director of Business Development EMEA

CMP Conference 20 th January Director of Business Development EMEA CMP Conference 20 th January 2011 eric.lalardie@arm.com Director of Business Development EMEA +33 6 07 83 09 60 1 1 Unparalleled Applicability ARM Cortex Advanced Processors Architectural innovation, compatibility

More information

Building an Embedded Processor System on Xilinx NEXYS3 FPGA and Profiling an Application: A Tutorial

Building an Embedded Processor System on Xilinx NEXYS3 FPGA and Profiling an Application: A Tutorial Building an Embedded Processor System on Xilinx NEXYS3 FPGA and Profiling an Application: A Tutorial Introduction: Modern FPGA s are equipped with a lot of resources that allow them to hold large digital

More information

ARM ARCHITECTURE. Contents at a glance:

ARM ARCHITECTURE. Contents at a glance: UNIT-III ARM ARCHITECTURE Contents at a glance: RISC Design Philosophy ARM Design Philosophy Registers Current Program Status Register(CPSR) Instruction Pipeline Interrupts and Vector Table Architecture

More information

A unified multicore programming model

A unified multicore programming model A unified multicore programming model Simplifying multicore migration By Sven Brehmer Abstract There are a number of different multicore architectures and programming models available, making it challenging

More information

William Stallings Computer Organization and Architecture 8 th Edition. Chapter 18 Multicore Computers

William Stallings Computer Organization and Architecture 8 th Edition. Chapter 18 Multicore Computers William Stallings Computer Organization and Architecture 8 th Edition Chapter 18 Multicore Computers Hardware Performance Issues Microprocessors have seen an exponential increase in performance Improved

More information

REAL TIME OPERATING SYSTEM PROGRAMMING-I: VxWorks

REAL TIME OPERATING SYSTEM PROGRAMMING-I: VxWorks REAL TIME OPERATING SYSTEM PROGRAMMING-I: I: µc/os-ii and VxWorks Lesson-1: RTOSes 1 1. Kernel of an RTOS 2 Kernel of an RTOS Used for real-time programming features to meet hard and soft real time constraints,

More information

Designing Embedded Processors in FPGAs

Designing Embedded Processors in FPGAs Designing Embedded Processors in FPGAs 2002 Agenda Industrial Control Systems Concept Implementation Summary & Conclusions Industrial Control Systems Typically Low Volume Many Variations Required High

More information

Embedded Linux Architecture

Embedded Linux Architecture Embedded Linux Architecture Types of Operating Systems Real-Time Executive Monolithic Kernel Microkernel Real-Time Executive For MMU-less processors The entire address space is flat or linear with no memory

More information

DESIGN AND VERIFICATION ANALYSIS OF APB3 PROTOCOL WITH COVERAGE

DESIGN AND VERIFICATION ANALYSIS OF APB3 PROTOCOL WITH COVERAGE DESIGN AND VERIFICATION ANALYSIS OF APB3 PROTOCOL WITH COVERAGE Akhilesh Kumar and Richa Sinha Department of E&C Engineering, NIT Jamshedpur, Jharkhand, India ABSTRACT Today in the era of modern technology

More information

EDK Concepts, Tools, and Techniques

EDK Concepts, Tools, and Techniques EDK Concepts, Tools, and Techniques A Hands-On Guide to Effective Embedded System Design Notice of Disclaimer The information disclosed to you hereunder (the Materials ) is provided solely for the selection

More information

ELCT 912: Advanced Embedded Systems

ELCT 912: Advanced Embedded Systems ELCT 912: Advanced Embedded Systems Lecture 2-3: Embedded System Hardware Dr. Mohamed Abd El Ghany, Department of Electronics and Electrical Engineering Embedded System Hardware Used for processing of

More information

Analyzing and Debugging Performance Issues with Advanced ARM CoreLink System IP Components

Analyzing and Debugging Performance Issues with Advanced ARM CoreLink System IP Components Analyzing and Debugging Performance Issues with Advanced ARM CoreLink System IP Components By William Orme, Strategic Marketing Manager, ARM Ltd. and Nick Heaton, Senior Solutions Architect, Cadence Finding

More information

ECE 471 Embedded Systems Lecture 2

ECE 471 Embedded Systems Lecture 2 ECE 471 Embedded Systems Lecture 2 Vince Weaver http://www.eece.maine.edu/ vweaver vincent.weaver@maine.edu 4 September 2014 Announcements HW#1 will be posted tomorrow (Friday), due next Thursday Working

More information

Chapter 4. Enhancing ARM7 architecture by embedding RTOS

Chapter 4. Enhancing ARM7 architecture by embedding RTOS Chapter 4 Enhancing ARM7 architecture by embedding RTOS 4.1 ARM7 architecture 4.2 ARM7TDMI processor core 4.3 Embedding RTOS on ARM7TDMI architecture 4.4 Block diagram of the Design 4.5 Hardware Design

More information

PowerPC 740 and 750

PowerPC 740 and 750 368 floating-point registers. A reorder buffer with 16 elements is used as well to support speculative execution. The register file has 12 ports. Although instructions can be executed out-of-order, in-order

More information

Operating Systems. Computer Science & Information Technology (CS) Rank under AIR 100

Operating Systems. Computer Science & Information Technology (CS) Rank under AIR 100 GATE- 2016-17 Postal Correspondence 1 Operating Systems Computer Science & Information Technology (CS) 20 Rank under AIR 100 Postal Correspondence Examination Oriented Theory, Practice Set Key concepts,

More information

ELC4438: Embedded System Design ARM Embedded Processor

ELC4438: Embedded System Design ARM Embedded Processor ELC4438: Embedded System Design ARM Embedded Processor Liang Dong Electrical and Computer Engineering Baylor University Intro to ARM Embedded Processor (UK 1990) Advanced RISC Machines (ARM) Holding Produce

More information

EMBEDDED SYSTEM FOR VIDEO AND SIGNAL PROCESSING

EMBEDDED SYSTEM FOR VIDEO AND SIGNAL PROCESSING EMBEDDED SYSTEM FOR VIDEO AND SIGNAL PROCESSING Slavy Georgiev Mihov 1, Dimitar Stoykov Dimitrov 2, Krasimir Angelov Stoyanov 3, Doycho Dimitrov Doychev 4 1, 4 Faculty of Electronic Engineering and Technologies,

More information

Zynq-7000 All Programmable SoC: Embedded Design Tutorial. A Hands-On Guide to Effective Embedded System Design

Zynq-7000 All Programmable SoC: Embedded Design Tutorial. A Hands-On Guide to Effective Embedded System Design Zynq-7000 All Programmable SoC: Embedded Design Tutorial A Hands-On Guide to Effective Embedded System Design Revision History The following table shows the revision history for this document. Date Version

More information

Spartan-6 LX9 MicroBoard Embedded Tutorial. Tutorial 1 Creating an AXI-based Embedded System

Spartan-6 LX9 MicroBoard Embedded Tutorial. Tutorial 1 Creating an AXI-based Embedded System Spartan-6 LX9 MicroBoard Embedded Tutorial Tutorial 1 Creating an AXI-based Embedded System Version 13.1.01 Revision History Version Description Date 13.1.01 Initial release for EDK 13.1 5/15/2011 Table

More information

ESE Back End 2.0. D. Gajski, S. Abdi. (with contributions from H. Cho, D. Shin, A. Gerstlauer)

ESE Back End 2.0. D. Gajski, S. Abdi. (with contributions from H. Cho, D. Shin, A. Gerstlauer) ESE Back End 2.0 D. Gajski, S. Abdi (with contributions from H. Cho, D. Shin, A. Gerstlauer) Center for Embedded Computer Systems University of California, Irvine http://www.cecs.uci.edu 1 Technology advantages

More information

ISE Design Suite Software Manuals and Help

ISE Design Suite Software Manuals and Help ISE Design Suite Software Manuals and Help These documents support the Xilinx ISE Design Suite. Click a document title on the left to view a document, or click a design step in the following figure to

More information

Beyond Moore. Beyond Programmable Logic.

Beyond Moore. Beyond Programmable Logic. Beyond Moore Beyond Programmable Logic Steve Trimberger Xilinx Research FPL 30 August 2012 Beyond Moore Beyond Programmable Logic Agenda What is happening in semiconductor technology? Moore s Law More

More information

NanoMind Z7000. Datasheet On-board CPU and FPGA for space applications

NanoMind Z7000. Datasheet On-board CPU and FPGA for space applications NanoMind Z7000 Datasheet On-board CPU and FPGA for space applications 1 Table of Contents 1 TABLE OF CONTENTS... 2 2 OVERVIEW... 3 2.1 HIGHLIGHTED FEATURES... 3 2.2 BLOCK DIAGRAM... 4 2.3 FUNCTIONAL DESCRIPTION...

More information

ARM Cortex-A9 ARM v7-a. A programmer s perspective Part1

ARM Cortex-A9 ARM v7-a. A programmer s perspective Part1 ARM Cortex-A9 ARM v7-a A programmer s perspective Part1 ARM: Advanced RISC Machine First appeared in 1985 as Acorn RISC Machine from Acorn Computers in Manchester England Limited success outcompeted by

More information

Parallel Simulation Accelerates Embedded Software Development, Debug and Test

Parallel Simulation Accelerates Embedded Software Development, Debug and Test Parallel Simulation Accelerates Embedded Software Development, Debug and Test Larry Lapides Imperas Software Ltd. larryl@imperas.com Page 1 Modern SoCs Have Many Concurrent Processing Elements SMP cores

More information

Microblaze for Linux Howto

Microblaze for Linux Howto Microblaze for Linux Howto This tutorial shows how to create a Microblaze system for Linux using Xilinx XPS on Windows. The design is targeting the Spartan-6 Pipistello LX45 development board using ISE

More information

CSE 4/521 Introduction to Operating Systems. Lecture 29 Windows 7 (History, Design Principles, System Components, Programmer Interface) Summer 2018

CSE 4/521 Introduction to Operating Systems. Lecture 29 Windows 7 (History, Design Principles, System Components, Programmer Interface) Summer 2018 CSE 4/521 Introduction to Operating Systems Lecture 29 Windows 7 (History, Design Principles, System Components, Programmer Interface) Summer 2018 Overview Objective: To explore the principles upon which

More information

ESA Contract 18533/04/NL/JD

ESA Contract 18533/04/NL/JD Date: 2006-05-15 Page: 1 EUROPEAN SPACE AGENCY CONTRACT REPORT The work described in this report was done under ESA contract. Responsibility for the contents resides in the author or organisation that

More information