Contents Getting Started Pico Software and Firmware Architecture

Size: px
Start display at page:

Download "Contents Getting Started Pico Software and Firmware Architecture"

Transcription

1 Pico User's Guide

2 Contents 1 Getting Started Installing the Hardware Installing the Software Monitoring Modules with Purty Learning More About the Pico Framework Documentation Running a Sample Program Creating and Building Your Own Application Using Build Script to Create the Project Opening the Project using Xilinx Vivado Opening the Project using Intel Quartus Prime Creating Your Own UserModule Building the Project Troubleshooting General Suggestions Module not shown in purty Unable to compile programs Software-generated Errors Pico Software and Firmware Architecture Overview Architecture PicoBus Model Stream Model Software API Firmware Framework PicoBus PicoBus Firmware PicoBus Signals Reading From the PicoBus Writing To the PicoBus PicoBus Software Streams Stream Firmware Stream Signals Stream Numbering Stream Firmware Protocol Stream Software Avoiding Blocking With Streams DDR3 and HMC Memory Streams Stream Efficiency Multiple Threads

3 2.5 Building Software Building Firmware Configuring the Pico Framework Software API Reference PicoDrv Lifecycle FindSiblings Running.bit Files PicoBus Functions Stream Functions TieStreams Opening Reading and Writing Closing Error Codes Status Miscellaneous Functions GetSerial GetSlot GetPicoConfig Pico Memory System Overview Accessing Memory from Firmware Configuration AXI Writing Reading More Info Performance Considerations Accessing Memory from Software Pico Software Simulation Overview Pre-Requisites Simulator Waveform Viewer Preparing your Design Firmware Firmware Restrictions Software Restrictions Running the Simulation Micron Technology, Inc.

4 1 Getting Started System Requirements In order to operate a Pico FPGA product, your system must meet all following requirements: System must have at least one available x16 PCIe slot for each backplane. Power supply must have 1 available PCIe power cable for each backplane (GPU style connector). Power supply must be able to deliver at least 240 W per backplane. System must have cooling fans pointed at the FPGA(s) in order to keep them under 85 C. System must be running the 64-bit Ubuntu or RHEL/CentOS desktop OS. We recommend version Ubuntu LTS, Ubuntu LTS, RHEL/CentOS 6.8 or above. We recommend an Intel motherboard chipset, but it is not a requirement Micron Technology, Inc. 1

5 Overview This Guide will help you get started using your Pico module(s). When you complete the installation using the instructions, you will have: A fully assembled, working hardware system Device driver software for the FPGAs The Pico API, a high-level C++ library for communicating with the FPGAs The Pico Framework firmware, including DMA and multi-port memory components A set of sample programs for testing your system and creating new projects Full documentation about all of the above 1.1 Installing the Hardware Take the boards out of their ESD bags, being careful to avoid electrostatic discharge. The small AC-series boards, which hold the FPGAs, are called modules or sometimes cards. We call the large EX-series boards backplanes. Their purpose is to hold the modules and provide a physical medium for the modules to communicate with the rest of the system over PCIe. 1. Completely power down the host system. 2. If the modules are not shipped attached to the backplane(s), place each of the modules into one of the available slots on a backplane and push down gently. 3. Fasten the modules in place with the provided screws. 4. After the modules are installed on the backplane, insert the backplane into a x16 PCIe slot on the motherboard. 5. Make sure there is enough cooling to the module(s). Active fan blowing directly onto the heat sinks is recommanded. 6. Plug in the PCIe power cable. 7. Turn the host system back on. The FPGA modules should now have power. Please refer to the hardware manual of each FPGA module for the indications of the LEDs Micron Technology, Inc.

6 1.2 Installing the Software Note: It is recommanded to have network access during the installation, as the installation may install dependent packages. The installer is provided as a deb package for Ubuntu or rpm for RHEL/CentOS. It is recommanded to install the installer package using command line: To install the software from a terminal, navigate to the location of the installer file, and run dpkg on the resulting deb or rpm file. Ubuntu: sudo dpkg -i <pico installer file name>.deb RHEL/CentOS: sudo rpm -ivh <pico installer file name>.rpm After you have installed the installer package, power cycle the system by shutting it all the way down and starting it up again. The software should now be fully installed. Within one minute of powerup, you should see bright, flashing multicolored LEDs on the backplane. You should now verify that the firmware and software are correctly installed by running Pico s card monitoring program, called Purty as described in Section Monitoring Modules with Purty Pico provides a program for monitoring the state of Pico modules, called Purty (an acronym for Pico Utility for Reliability and Testing ). Purty reports the status and health of all the modules in the system, including each module s temperature, voltage, installation status, serial number, and physical location. To start purty, launch a terminal and type following command: $PICOBASE/bin/purty The Purty window is shown as Figure 1. On the left side, you can see a list of the modules in your system along with various information about their status (from the left, temperature, 2017 Micron Technology, Inc. 3

7 voltage, current, model, and physical location on the backplane). On the right, we have opened the temperature/power log from the selection menu in the upper right corner. By choosing appropriately from this menu, you can also access console output from each of the FPGAs. Figure 1: Purty Window The slot number shows the location of the FPGA modules on the backplane. Facing towards the board logo, from the left side to the right, the slots are numbered from 1 to Learning More About the Pico Framework Documentation After you ve finished going through this getting-started guide, you will probably want to learn more about the details of the Pico Framework. Full documentation about the software and firmware parts of the framework can be found in the Pico User Guide documentation file, which is located at $PICOBASE/doc/ along with all of the other distributed documentation for the Pico Framework. This document provides a full description of the Pico Application Programming Interface (API), Micron Technology, Inc.

8 and how you can use it to deploy the capabilities of the Pico Framework. The software side of the Framework is written in C++. Running a Sample Program For a more hands-on introduction to the Pico Framework, we suggest that you try running one of our sample programs, located in $PICOBASE/samples One of our most often-used sample programs is StreamLoopback128. In this program, the software on the host computer sends the FPGA an increasing sequence of numbers via an input stream. The firmware on the FPGA sums these data and returns a running total in an output stream. You can test it out by typing: cp -r $PICOBASE/samples/StreamLoopback128 ~/ cd ~/StreamLoopback128/software make./streamloopback128 $PICOBASE/samples/StreamLoopback128/firmware/<bitsteam file> where, in the last command, you should substitute the <bitstream file> appropriate to your FPGA. For example, on our AC-510 FPGA, we used the bit file M510_KU060_StreamLoopback128.bit. The program s output should look like what is shown in Figure Micron Technology, Inc. 5

9 Figure 2: Stream Loopback Example This sample is self-checking and reports a simple success message upon completion if everything went according to plan. More details about the StreamLoopback128 sample can be found in the StreamLoopback128 documentation. Likewise, several other fully documented samples are also available in individual subdirectories of $PICOBASE/samples. 1.5 Creating and Building Your Own Application You are now ready to begin creating your own projects. A complete project using the Pico Framework requires both firmware for the FPGAs and software for the host computer. This section describes how to create the firmware for a new project. Several of the sections below will be divided into two halves, one using Xilinx tools and the other using Intel tools. In most cases, however, the procedures described will be very similar Micron Technology, Inc.

10 Note: The sample designs in the installer are tested with with certain version of FPGA vendor tools, for example, Xilinx Vivado or Intel Quartus Prime Newer software versions should work, but older ones may not. In this section, we will show you how to open and build one of the more commonly used sample projects, the StreamLoopback128 example that we introduced above. We recommend that you familiarize yourself with that sample before continuing on to the next section. Start by copying the samples/streamloopback128 directory to your home directory, where we ll work on it. For example, run: cp -r $PICOBASE/samples/StreamLoopback128 ~/ Using Build Script to Create the Project We provide a build script to create the project from scratch. This script can be used for either Vivado or Quartus. To use the build script, user needs to use the fwproj file to setup correct parameters for the project. Take previous StreamLoopback128 as example: cd ~/StreamLoopback128/firmware $PICOBASE/bin/build-pico-fw.sh <fwproj file> [project] where, in the last command, you should substitute fwproj file with the corresponding fwproj file. The project parameter will let the tools to create the project only. By removing the project parameter, the tools will go through the build process and create bitstream if no error occurs. Users may edit the fwproj file to include their own user module or custom design source file. The source will be included in the final assembled project. Opening the Project using Xilinx Vivado To open a project using Xilinx Vivado, follow these steps: 1. Launch Vivado. 2. Click Open Project. The Vivado project file will be located in StreamLoopback128/firmware in a subdirectory appropriate to your FPGA. For example, for AC-510, open the file: 2017 Micron Technology, Inc. 7

11 ~/StreamLoopback128/firmware/M510_StreamLoopback128/M510_StreamLoopback128.xpr 3. In this subdirectory, double-click on the.xpr file. When the project is open, your screen should look like what is show in Figure 3. Figure 3: Vivado Open Project Screen In the small Sources pane at the top of the Vivado window, you can see the design sources, constraints, and simulation sources for the project. Under Design Sources, Verilog Header contains the header PicoDefines.v that names the project and defines stream widths, information about the FPGA, and other important details as needed for a given project. The top-level module of the Pico architecture is called PicoToplevel. This top-level module instantiates the modules required for every project, the most important of which are the PicoFramework and your HDL modules. You Micron Technology, Inc.

12 should only modify your own UserModule and files below it in this hierarchy. The only exception is the include file, PicoDefines.v, which you should change as mentioned below so that your project files can be included and built. The rest of the hierarchy is made up of a variety of modules and other files that generally operate behind the scenes. The core logic used for PCIe communication with the host CPU, including a DMA engine, is located in PicoFramework. The UserWrapper instantiates logic for connecting the DMA engine to the streaming and PicoBus interfaces in the UserModule. Other modules found under PicoModules handle streams, the PicoBus, and memory access. Constraint files located under Constraints convey pin placement and timing information to the Vivado synthesis tool, which is used during the mapping and the place and route phases of generating a bitfile. Opening the Project using Intel Quartus Prime To open a project using Quartus Prime, follow these steps: 1. Launch Quartus Prime. 2. The Quartus project file will be located in StreamLoopback128/firmware in a subdirectory appropriate to your FPGA (for example, M520_AX115_StreamLoopback128 for AC-520). Click File->Open Project to open the qpf file Micron Technology, Inc. 9

13 Figure 4: Quartus Open Project Screen Micron Technology, Inc.

14 In the small Project Navigator pane in the top left corner of the Quartus window, there should be a project hierarchy with one top-level module, PicoToplevel. This module instantiates the modules required for every project. Specifically, it creates the basic PicoFramework firmware module and your module, which in turn instantiates memory, stream, and user firmware, among other things. Creating Your Own UserModule For now, we will not edit anything in the sample. When you are ready to create a new design, you should create a new user module file (or copy one from a sample) and edit the USER_MODULE_NAME definition in PicoDefines.v (probably near the bottom of the list in the Files tab) to match the name of your new module. Do not modify any other files in the project. Building the Project After you are finished writing the initial version of your design, it s time to build it and test it on a real system. To build the project, click Generate Bitstream in the Flow Navigator pane on the left side of the Vivado window; or double-click Assembler (Generate Programming Files) in the Tasks pane on the left side of the Quartus window. The build process is computationally intensive and may take anywhere from 5 to 10 minutes for our simple samples, to several hours for a more complex project, depending on the speed of your computer. You should now run through the StreamLoopback128 software again, this time using the new bit file instead of the precompiled one provided in the distribution. This operation checks that your Vivado installation is operating correctly. In order to take advantage of other features of the Pico Framework, you may wish to start with another sample project rather than StreamLoopback128. Each sample has full documentation, explaining how it works and what it does, included in the Pico Framework distribution. StreamLoopback128 demonstrates several of the most useful features of the Framework, including streams and DMA transactions with the host. The PicoBus128_HelloWorld sample illustrates the PicoBus, the other possible communication scheme (other than streams) allowed by the Pico architecture. For more information on the PicoBus, see the Pico API documentation. The DDR3_MovingAverage sample shows how to interact with the DDR3 off-chip memory on an M-series module. More information on memory systems can be found in the PicoMemory documentation file. The MultiStream example shows how to do computation using multiple FPGAs by programming them to communicate with streams Micron Technology, Inc. 11

15 The GUPS example shows how to use the multiple user ports of HMC to achieve high randam access rate. The AXI_HMC_Test_512 shows how to use the single wide AXI interface of HMC to achieve high throughput. Two of the programs (HelloWorld and StreamLoopback) have both 32- and 128-bit versions in order to demonstrate both sizes of streams and the PicoBus. If you wish to use 32-bit streams in your design, it is easier to start with the 32-bit version of these projects. 1.6 Troubleshooting General Suggestions Be sure to verify that your system meets all system requirements. Is your system powered? Check to see that the orange LEDs on your modules are turned on. If not, your backplane(s) may not be plugged into a PCIe power cable or the connection of the FPGA might be loose. Is your FPGA programmed? Check to see that the green LEDs on your modules are turned on. If not, your FPGA is not programmed with the correct bit image, or the FPGA is overheated. Power cycling the system by shutting it down completely, waiting a few seconds, and then restarting may help. Module not shown in purty If you have just finished installing the Pico software, you may simply need to power cycle your system. Try shutting it down and restarting. If the Pico software is fully installed and your module still does not show up in the monitor program, open a command prompt and type (case sensitive): sudo lspci -nn grep Pico If you do not see anything, then the system does not see the module at all. There is nothing that can be done in software. If your board is a module on the EX-700, such as the AC-510, please check the LEDs to ensure the board is getting power. See the hardware manual for LED locations Micron Technology, Inc.

16 If, however, the lspci program does report a Pico module, the problem is likely with the driver. To see if the driver is loaded, run (case sensitive): lsmod grep pico If you see no output, the driver is not loaded. Try loading the driver by running the following as root: depmod; modprobe -v pico Now use dmesg to look at the system log to see if the driver has reported any errors: dmesg If the modprobe command did not fix the problem, please run these commands: dmesg > dmesg.log sudo lspci -nn -vvvvv > lspci.log and contact Pico with the resulting log files. Unable to compile programs Check to make sure the PICOBASE environment variable is pointing to the latest Pico installation. This variable is set when the Pico package is installed. You should see something like this: echo $PICOBASE [output] /usr/src/picocomputing If you do not see PICOBASE pointing to a valid directory, you can set it with: export PICOBASE=/usr/src/picocomputing or a similar command appropriate to the version of the Pico software you have installed. Now rebuild the program with the the PICOBASE variable set correctly Micron Technology, Inc. 13

17 Software-generated Errors For a detailed list of error codes produced by the Pico software, please consult the Pico API error code appendix Micron Technology, Inc.

18 2 Pico Software and Firmware Architecture This chapter describes the Pico Computing Application Programming Interfaces (APIs) for Linux platforms. 2.1 Overview Pico Computing APIs provide the link between your application software running on a host computer, and the hardware algorithm, or firmware, implemented in the FPGA. The hardware-software interfaces provided by Pico include a number of software library functions and corresponding firmware components enabling FPGA control and processing. The goal of these interfaces is to abstract away the details of communication and provide an easy to use, yet powerful method of programming Pico Computing products. Pico Computing FPGA platforms are most often attached to host computers using either PCI Express. Depending on the Pico platform you are using, the bandwidth and latency of data transferred between host computer and FPGA may vary substantially. These variations depend on the electrical characteristics of the interface (for example, the number of PCI Express lanes being used), and also on the nature of the application and the size of individual data elements being transferred. As an application programmer, you can have a significant impact on the final performance of these interfaces by learning and using appropriate APIs and methods for your target platform and application. 2.2 Architecture A host computer with Pico FPGAs attached to it is a diverse system of parts, all of which need to work in concert. For example, the host system may host dozens of FPGAs, each potentially running different code. Each FPGA will have its own connection to the PCIe network, through which it can communicate with the host CPU or directly with other FPGAs. The host CPUs must manage dataflow to and from each of the FPGAs and control their processing. Finally, the CPU must mediate the access to the FPGAs from the actual user-level application. This system is best understood by dividing it into two parts: software and firmware: The software side is composed of the host CPUs and the code they re running. This includes the application program, the Pico device driver, and any data stored in host memory. The software side is typically programmed in C/C++, and should be familiar to software programmers Micron Technology, Inc. 15

19 The firmware side is composed of the Pico FPGA boards, and all the code running on them. This code is typically written in Verilog or VHDL. This software/firmware distinction is useful, because it gives us two separate worlds that are cleanly separated by an API boundary. The Pico API defines two models for communicating between the software and firmware worlds. The first model is the traditional bus-based model, while the other is the data stream model. We discuss the concepts of both in this section about architecture, and then explain the details of the software and firmware sides in subsequent sections. PicoBus Model The PicoBus connects the software and firmware worlds by presenting the firmware as a slave on a memory mapped bus, which is driven by software running on the host. For example, if the job of the FPGA application is to iterate on a seed value, this configuration provides a straightforward way for the software to set the initial seed value and to retrieve the computed results when the FPGA is done. The PicoBus should be familiar to engineers from both software and firmware/hardware backgrounds. Firmware engineers will recognize the memory-mapped paradigm that is present in one form or another in almost every digital device. Software engineers will recognize it as a shared memory form of communication, or alternatively, as the de facto standard method for controlling hardware devices from software. The PicoBus can be used for both writing data to the firmware from the software, and reading data from the firmware into the software. Each transaction involves both data and an address. For example, the software may say write 0xdeadbeef to address 0x4200. The PicoBus is most often used to model a set of registers this way. Continuing our example, we might have two registers to specify parameters for the firmware, a register to monitor the status of the firmware, and finally a register to read back the firmware s computed result. The firmware implements the registers on the bus by checking the bus address during reads and writes, and the software uses the chosen address for accessing each register. The PicoBus is intended for cases where you do not need to move more than a few megabytes of data per second. For such cases, it provides a lightweight register-based interface. If you need more throughput, see the streaming model in Section 2.2. Stream Model The stream method of communicating between the software and firmware is based on the concept of Communicating Sequential Processes (CSP) Micron Technology, Inc.

20 Even if the CSP model sounds foreign to you, you have probably used streams before. Software uses pipes for interprocess communication, TCP streams for network communication, and hardware FIFO-based streams are everywhere. Just as in these common stream models, a stream is a sequence of data, with flow control. The Pico API lets you know when the stream is available for reading or writing, and you tell the API when you are ready to send or receive data. Streams are a simpler abstraction than the PicoBus. For example, there are no addresses in a stream. The data is simply sent or received in order. Addresses are not needed, because the data moves in order, rather than jumping around to different addresses. This simplicity allows the streams to be more efficient at moving large amounts of data. Streams are the fastest way to move data to and from Pico FPGAs. The simplicity of the stream model gives the Pico framework the freedom to transfer the data in the fastest and most efficient way. A stream is a uni-directional communication channel. As such, there are two varieties of stream: inputs to the FPGA, and outputs from the FPGA. A typical system will have at least one stream in each direction for bidirectional communication. So far we have been talking about communicating between the FPGAs and the host CPU, but streams can be used to communicate directly between the FPGAs with no host intervention. This approach provides a very low latency way to connect FPGAs to each other, and is described in more detail in the Section 2.4. Software API The Pico software API is distributed as a set of C++ source files that are included in the Pico installation package. The main point of entry is the PicoDrv class, which represents an FPGA. For example, if you have 24 FPGA modules in your system, you could create 24 PicoDrv objects, each corresponding to one of your FPGAs. Sections 2.3 (about the PicoBus) and 2.4 (about stream models) will introduce the relevant API functions for each type of communication, and there is a complete reference of the software API functions in Section 2.7. Firmware Framework Pico provides a Verilog framework that provides access to all the basic functionality of the FPGA board. The main components of this framework are a PCIe DMA engine and a multi-port memory interface. When you build a configuration bit file for the FPGA, the Pico framework will be the toplevel, and your module will be instantiated inside the framework. The communication between your module and the host system through PCIe is described in Sections 2.3 (PicoBus) and 2.4 (Streams). For information about how to connect to the memory on the FPGA module, please see Section 3 (Pico Memory System). Pico provides a number of Vivado/Quartus projects you can use as templates to get started Micron Technology, Inc. 17

21 See Section 2.6 (Building Firmware) for a complete explanation. 2.3 PicoBus This section describes the software and firmware sides of the PicoBus communication model. A high-level description of the PicoBus is in Section 2.2 (PicoBus Model). There are two widths of PicoBus available: 32 bits and 128 bits. We will use the 32-bit version for these descriptions, but everything also applies to the 128-bit version. The behavior of the bus is the same. The only difference is that the width is 128-bit rather than 32-bit. We recommend using the 32-bit version, because of its lower resource use in the FPGA. PicoBus Firmware The PicoBus is a fairly standard 32-bit wide synchronous bus with read and write strobes. It is driven by the host computer where the Pico card is located. In other words, the host computer is the master and your firmware is the slave on the bus. (Section 2.3 describes the software API for accessing the PicoBus.) PicoBus Signals The following signals constitute the PicoBus. All are inputs to your module except PicoDataOut. Table 1: PicoBus Signals PicoClk PicoRst PicoAddr[31:0] PicoDataOut[31:0] PicoRd PicoDataIn[31:0] PicoWr 250MHz free-running clock active high reset signal 32-bit byte address (the address the host is reading from or writing to) [output] 32-bit data output (data read from the FPGA) active high read strobe asserted when the host is doing a read 128-bit data input (data written to the FPGA) active high write strobe asserted when the host is doing a write Reading From the PicoBus When the host initiates a read, the PicoAddr signal is set to the address specified by the software and the PicoRd signal is toggled high for one clock cycle. Your FPGA logic, which owns the specified address, will be watching for these signals, and places the requested data on the PicoDataOut Micron Technology, Inc.

22 signal the next clock cycle after PicoRd is high. The data will then be returned to the software on the host that requested it. The waveform PicoBus Read shows a read on the PicoBus (see Figure 5). Figure 5: Read on PicoBUs Reads can be performed back-to-back on the bus with no gaps in between. For example, if the bus is reading two addresses back-to-back: On the first clock cycle: PicoRd asserted, and PicoAddr set to the first address On the second clock cycle: PicoRd still high, and PicoAddr set to the second address. Also on the second clock cycle, your logic will drive PicoDataOut to the first value read. On the third clock cycle: PicoRd will be inactive, PicoAddr will be undefined, and your logic will drive PicoDataOut with the second value. Notice how the one-cycle delay between the PicoRd strobe and PicoDataOut being valid causes the first piece of data returned to line up with the second read address and strobe during the second clock cycle. It s important that PicoDataOut be zero when the bus is not talking to your module, because it is a shared bus, and your output will get OR-ed with the outputs of other peripherals on the bus. So all peripherals must output only zeros when they re not being addressed. Here is some typical code for handling reads from several addresses. You can think of each address as being a register. We also connect a vector to the bus, which will be inferred as a block ram. // declare the registers we'll put on the bus. reg [31:0] status_reg, result_reg ; // this vector of 32-bit registers will be inferred to be a 2017 Micron Technology, Inc. 19

23 // hard block ram. reg [31:0] bram[511:0] ; PicoClk) begin // at the rising clock edge, we check the PicoRd and // PicoAddr signals to see which then we assign that // register's value to PicoDataOut if (PicoRd && PicoAddr == 32'h0) PicoDataOut <= status_reg ; else if (PicoRd && PicoAddr == 32'h4) PicoDataOut <= result_reg ; else if (PicoRd && PicoAddr[31:12] == 20'h10) // this is the block ram address decoding. we // accept any value in the low 12 bits of // PicoAddr and use them as the PicoDataOut <= bram[picoaddr[10:2]] ; else PicoDataOut <= 32'h0 ; end The PicoBus is word addressable, and the addresses are always specified in bytes. For the 32-bit PicoBus, this means you will address 32-bit words, and thus the low 2 address bits will always be zero. For the 128-bit PicoBus, the word size is 128 bits, and the low 4 bits will always be zero. Notice how we chose 0 and 4 for the addresses of our two registers in this sample code. The reason for this choice is PicoBus accesses must be aligned with the width of the PicoBus. Thus 0 and 4 are the first two addresses on the bus. Another way of looking at this situation is to notice that the register at address 0 is 32 bits, or four bytes. That means it occupies byte addresses 0-3, and the next available address is 4. The block ram is addressed similarly. We place it at addresses 0x1000-0x17fc (first element at 0x1000, second at 0x1004, etc.). Notice how we trim the low two address bits, since each element in our block ram is four bytes. Writing To the PicoBus When the host initiates a write: PicoWr is asserted PicoAddr is set to the requested address PicoDataIn is set to the data that s being written. All these signals are valid for one clock cycle. They are all valid on the same clock cycle as well there is no staggering. Unless multiple addresses are being written to, PicoWr will be deasserted on the next cycle. See Figure 6 for an example Micron Technology, Inc.

24 Figure 6: PicoBus Write Waveform As with reads, writes may also be performed back-to-back. Writes are simpler because everything for a given write happens on the same clock cycle, rather than having a delay like reads. Here is some Verilog to handle writes to the PicoBus. Building on the previous block of code for reads, we add the ability to write to the inferred block ram we created. // declare the register we'll put on the bus. reg [31:0] command_reg ; // this vector of 32-bit registers will be inferred to be a // hard block ram. reg [31:0] bram[511:0] ; PicoClk) begin // at the rising clock edge, we check the PicoWr and // PicoAddr signals to see which then we assign the // incoming data to that register. if (PicoWr && PicoAddr == 32'h8) command_reg <= PicoDataIn ; if (PicoWr && PicoAddr[31:12] == 20'h1) // this is the block ram address decoding. we // accept any value in the low 12 bits of // PicoAddr and use them as the bram[picoaddr[10:2]] <= PicoDataIn ; end PicoBus Software Now that we have seen how the PicoBus is implemented in firmware, let us look at how to drive it from software. We will assume we have already created a PicoDrv object that is connected to the 2017 Micron Technology, Inc. 21

25 FPGA we want to talk to. (See Section 2.7 for more information about creating the object.) You use your PicoDrv object to read and write on the PicoBus with these two functions: int PicoDrv::ReadDeviceAbsolute (pico_size_t addr, void* buf, int numbytes) ; int PicoDrv::WriteDeviceAbsolute (pico_size_t addr, const void* buf, int numbytes) ; Both functions share the addr and numbytes parameters, which are both in bytes. Both addr and numbytes must be multiples of the PicoBus size (32 bits or 128 bits). For example, trying to read 1 byte will result in an error. You must read at least 4 bytes. You can read or write multiple sequential addresses in a single call by specifying an appropriate numbytes. For example, if your firmware has registers at 0x10 and 0x14, you can read both in a single call by specifying an addr of 0x10 and a numbytes of 8. This capability is useful if you have something like a block ram on the PicoBus. ReadDeviceAbsolute takes a pointer to the destination memory buffer in buf. WriteDeviceAbsolute also takes a pointer to the data to write in buf. ReadDeviceAbsolute and WriteDeviceAbsolute both return the number of bytes read or written on success, and a negative value on failure. You should always check the return value. 2.4 Streams This section describes the software and firmware sides of the stream communication model. A high-level description of streams is in Section 2.2 (Stream Model). Stream Firmware The firmware realization of a stream is essentially a FIFO interface. There is data, and a flow control handshake. The naming of Pico s stream signals is based on AXI-Stream interface. Stream Signals There are two signals that are shared among all streams, both input and output: clk input free running clock for all the streams rst input active-high reset signals for the streams Micron Technology, Inc.

26 The following signals constitute stream #1 coming into the FPGA: s1i_valid input high when this stream has data to provide to your module s1i_rdy output drive this high when you latch the current s1i_data value s1i_data[127:0] input the current data value. valid when s1i_valid is high And the signals for stream #1 going out of the FPGA: s1o_valid output drive this high when you have valid data on s1o_data s1o_rdy input high when the stream is ready to accept s1o_data[127:0] output the value you re putting out on the stream Stream Numbering Unlike the PicoBus, which has the ability to differentiate between many addresses, each stream is essentially a point-to-point connection. Thus, the framework supports multiple streams that you can use to separate logically separate communication channels. You can use up to 32 streams in each direction, for a total of 64 streams. You may have noticed that the stream signal names above have a number in them, in this case 1. For each stream s signals, replace this number with the number of the stream, from 1 to 32. It is possible to create two streams with the same number if one is going in each direction. For example, you can have stream #1 from the host to the FPGA, and stream #1 from the FPGA to the host. Section 2.4 describes how to tell the Pico framework which streams you are using. Stream Firmware Protocol As you can see from the signal description, the input and output streams are very similar. Both have a data signal, and one handshaking wire going in each direction. The protocol is also similar for each direction. Data is latched by the recipient when both valid and rdy are high. For example, if an input stream has data ready for you, it will assert s1i_valid. When you are ready to accept the data, you latch s1i_data and assert s1i_rdy in the same clock cycle. As we mentioned earlier, the protocol for streams is very similar to a FIFO protocol. You can think of an input stream as being a FIFO in FWFT mode (first word fall through), and the s1i_valid signal being an inverted version of the FIFO s empty signal. In fact, this approach is how streams are implemented in the Pico framework for some of our boards. Reading From a Stream in Firmware 2017 Micron Technology, Inc. 23

27 Here is a simple Verilog module that takes one input stream and sums the low 32 bits of each element. It is not very useful as is, since it has no data outputs, but it demonstrates how to connect to an input stream. module StreamLoopback128 ( // signals shared by all streams input clk, input rst, // signals for input stream #1 (moves data from host to FPGA) input s1i_valid, output s1i_rdy, input [127:0] s1i_data ); // a register to sum the low 32 bits of all data we receive. reg [31:0] sum ; // we can always accept more data. we never need to // apply backpressure. assign s1i_rdy = 1 ; clk) begin if (rst) begin sum <= 32'h0 ; end else begin // when both rdy and valid are true, // grab the current value on the data if (s1i_rdy && s1i_valid) begin sum <= sum + s1i_data[31:0] ; end end end endmodule Notice that since we are always ready, we just tie s1i_rdy to 1. If we had a pipeline in our module that could stall, we would monitor its state and de-assert s1i_rdy when we needed to hold off the input. It is important to notice that when s1i_valid and s1i_rdy are both valid, the data latched is the current clock cycle s data, not the next. This fact is why we mention first-word-fall-through when comparing our interface to a FIFO. Writing to a Stream in Firmware Let us look at a simple counter to explain the firmware interface for an output stream. This module just emits a sequence of numbers starting at 0, incrementing by 0x42 each time: Micron Technology, Inc.

28 module StreamLoopback128 ( // signals shared by all streams input clk, input rst, // signals for output stream #1 (moves data from FPGA to host) output s1o_valid, input s1o_rdy, output [127:0] s1o_data ); // an incrementing counter reg [31:0] count ; // we can always provide more output data. assign s1o_valid = 1; // construct our output data: // 96 bits of constant "signature" data // 32 bits of our incrementing counter assign s1o_data = {32'h , 32'hdeadbeef, 32'h , count[31:0]}; clk) begin if (rst) begin count <= 32'h0 ; end else begin // every time we provide a new value on the output, // increment our count if (s1o_rdy && s1o_valid) begin count <= count + 32'h42 ; end end end endmodule We drive s1o_valid high all the time, because we can always generate a new value to send out on the stream. When s1o_rdy is low, we wait for the stream to be ready, and hold our current value. As soon as it is ready, we increment our counter to the next value. Stream Software To use a stream in software, you first create a handle for the stream using the CreateStream function: virtual int CreateStream (int streamnum); 2017 Micron Technology, Inc. 25

29 The handle that is returned can be used for both the input and the output stream with the specified number. You use the following two functions to read or write on it, depending on which direction it is going: virtual int ReadStream (int streamhandle, void* buf, int numbytes); virtual int WriteStream (int streamhandle, const void* buf, int numbytes); ReadStream and WriteStream take the stream handle returned from CreateStream as their first argument. Note that this is not the stream number, it is an opaque handle. Their other two arguments are the data buffer pointer, buf, and the size of the transfer in bytes, numbytes. They return the number of bytes transferred on success, and a negative error code on failure. The biggest difference between the stream functions and the PicoBus functions, other than the lack of an address, is that the stream functions block until the firmware is ready. Because the stream implementation in firmware has flow control in the form of *_rdy* and *_valid* signals, the Pico framework is able to take care of flow control all the way up through the software API. For example, if you have a firmware module on the FPGA that is generating data very slowly, you can issue a ReadStream call for the entire amount of data you expect, and the call will wait until the FPGA has generated that much data. Whether the FPGA generates the data immediately, or takes an hour, the ReadStream call will wait until the data is ready. All stream accesses, even accesses to 32-bit streams, must move a multiple of 128 bits at a time. Avoiding Blocking With Streams If you want to avoid the blocking behavior of the streams functions, you can use the GetBytesAvailable function to query the API to learn how much data you can move without blocking. For example, if GetBytesAvailable says there are 5000 bytes available on a stream to the FPGA, you are guaranteed that a call to WriteStream for that stream will return immediately if you call it with 5000 bytes or less. You can still call it with more than 5000 bytes, but it may block if the FPGA is not ready to accept the data right away. Here is the prototype for GetBytesAvailable. Note that streamhandle is the handle of the stream, not its number. The isread parameter specifies the direction of the stream you are inquiring about: true for FPGA to host streams and false for host to FPGA streams. virtual int GetBytesAvailable (int streamhandle, bool isread=true); Micron Technology, Inc.

30 DDR3 and HMC Memory Streams Two special streaming functions are included in the API to aid the user in writing to and reading from the DDR3 and HMC memory systems. WriteRam allows a user to stream data to the FPGA, while ReadRam allows the user to stream data from the FPGA. virtual int ReadRam (uint64_t addr, void* buf, int size, int memid = PICO_DDR3_0); virtual int WriteRam (uint64_t addr, const void* buf, int size, int memid = PICO_DDR3_0 ); More information about these two functions can be found in Chapter 3 (Pico Memory System). Stream Efficiency A common misconception is that streams are less efficient than the PicoBus. This idea arises because people assume that the flow control that the streams provide must slow them down. In fact, streams are much more efficient and provide higher throughput than the PicoBus. A large part of this efficiency is due to the fact that with the PicoBus, the CPU needs to set up the address for each transfer, while with streams there is only a continuous stream of data. This situation supports the pushing of the stream machinery down into the FPGA itself. All the CPU must do is tell the Pico framework in the FPGA where the data buffer is in host memory, and wait for an interrupt from the FPGA telling it the transfer is complete. All the flow control (the pausing and restarting) happens inside the FPGA. For the streams to be most efficient, calls to WriteStream and ReadStream need to move large blocks of data (on the order of 1 MB). Because GetBytesAvailable will never return more than several kilobytes, it is more efficient to use the blocking calls to WriteStream and ReadStream (without the preceding call to GetBytesAvailable). Multiple Threads In many applications, you may want to write input data to a stream, have the FPGA process the data, and read the results out on a stream. In such a system, you can avoid using GetBytesAvailable (thereby producing much higher streaming bandwidth) by using multiple threads: one for writing to the FPGA and one for reading from the FPGA. In the described system, the first thread will only call WriteStream, the second thread will only call ReadStream, and the Pico framework and driver will handle the flow control behind the scenes. To create these multiple threads, you can use any threading library, but Pico recommends using POSIX threads (pthreads). Calls to WriteStream or ReadStream on different streams are always thread-safe. In other words, the Pico driver will safely handle the accesses to the card from multiple threads on different streams without the need for a mutex. Please note that an input stream and 2017 Micron Technology, Inc. 27

31 an output stream that share the same stream number are independent (for example, you can call WriteStream and ReadStream on stream 1 from two different threads without a mutex. The WriteRam and ReadRam functions are very similar to writing and reading a stream. However, when calling ReadRam, the Pico software must first write the address and read size to the same stream used for calls to WriteRam. After the address and size are interpreted, the read is processed by the memory system, and the Pico software reads the data from the ReadRam stream. Due to this required write during the ReadRam call, you must include a mutex when calling WriteRam and ReadRam from different threads. Examples that do not require a mutex: Thread 1 WriteStream to stream 1, thread 2 WriteStream to stream 2. Explanation: Different streams are thread-safe. Thread 1 WriteStream to stream 1, thread 2 ReadStream from stream 1. Explanation: Input and output streams are separate. Thread 1 WriteRam, thread 2 WriteStream to stream 1. Explanation: Calls to WriteRam should be treated as a write to a stream. Thread 1 ReadRam, thread thread 2 reads from stream 1. Explanation: Calls to ReadRam should be treated as a write then a read from a stream. Examples that do require a mutex: Thread 1 WriteStream to stream 1, thread 2 WriteStream to stream 1. Explanation: 2 threads cannot write to the same stream without a mutex. Thread 1 ReadStream from stream 1, thread 2 ReadStream from stream 1. Explanation: 2 threads cannot read from the same stream without a mutex. Thread 1 calls WriteRam, thread 2 calls ReadRam. Explanation: Call to ReadRam must first write the address and read size on the same stream that services WriteRam. 2.5 Building Software The Pico software is distributed as a set of C++ source files and headers. These files reside in $PICOBASE/software/source and $PICOBASE/software/include. Pico provides a makefile you can include in your projects that will handle pulling in all the required source files, and adding the Micron Technology, Inc.

32 required flags for the compiler. To see how to make use of this makefile, let us look at an example makefile for the StreamLoopback128 sample project: TARGET = StreamLoopback128 USER_SOURCES = StreamLoopback128.cpp include $(PICOBASE)/software/Makefile.common This file sets the desired name of the executable to be built, and specifies the source files. Then the last thing it does is include the Pico makefile. At that point, the Pico makefile will add the required Pico API source files to the list, and set up the rules to compile the program. When we run make on this project, here is the command that the Pico makefile runs for us: pico@pico-desktop:~/pico/samples/streamloopback128/software$ make g++ -O3 -Wno-multichar -DLINUX -DPOSIX -I. -I/home/pico/pico/software/include/linux -I/home/pico/pico/software/include -fmessage-length=0 -fdiagnostics-show-option -fms-extensions -Wno-write-strings -DOSNAME="Linux" -o StreamLoopback StreamLoopback.cpp /home/pico/pico/software/source/gstring.cpp /home/pico/pico/software/source/pico_drv.cpp /home/pico/pico/software/source/pico_errors.cpp /home/pico/pico/software/source/linux/linux.cpp /home/pico/pico/software/source/pico_drv_linux.cpp If you prefer to use your own makefile rather than hooking into Pico s, you can copy these options into your own makefile. Make sure you get the Pico source files, the include paths, and the compiler options. 2.6 Building Firmware Pico firmware is designed to be built with Xilinx s Vivado or Intel s Quartus Prime environment. Because of the difficult nuances of configuring a Vivado/Quartus project, and the challenges associated with using FPGA vendor tools, Pico distributes a build script that can assemble the project for you. This approach is more reliable than just distributing source files and leaving you, the engineer, to construct a project from scratch. This script is described in 1.5. To get started on your own project, find the sample that best matches your communication model (PicoBus, streams, etc.), create the project using the build script and copy it to your work directory. Most samples have Vivado/Quartus projects for several different cards, such as the AC-510 and AC-520. Make sure you use the correct one for your hardware. When you copy the project directory, you will get all the required source files for the Pico framework. All that is left is to add your own code Micron Technology, Inc. 29

33 Configuring the Pico Framework When you have an project with your code in it, you need to tell the Pico framework about how your module communicates. It may use the PicoBus, streams, or both. The framework needs to know which interfaces your module uses, so it can create the necessary backing logic and connect it to your module. You tell the framework about your module by setting defines in PicoDefines.v. Here is the PicoDefines.v file for the StreamLoopback128 sample project for the AC-510: (if you use a different sample project as a starting point, its PicoDefines.v will be slightly different) // PicoDefines.v - here we configure the base firmware for our // project // This includes a placeholder "user" module that you will // replace with your code. To use your own module, just // change the name from PicoBus128_counter to your module's // name, and then add the file to your ISE project. `define USER_MODULE_NAME StreamLoopback128 `define XILINX_FPGA `define XILINX_ULTRASCALE `define PCIE_GEN3X8 `define LED `define STREAM1_IN_WIDTH 128 `define STREAM1_OUT_WIDTH 128 // We define the type of FPGA and card we're using. `define PICO_MODEL_M510 `define PICO_FPGA_KU060 `include "BasePicoDefines.v" The first thing this file sets is the name of your top module. Because it is the module that the Pico framework will instantiate, you need to specify its name. The next couple of lines state what kind of FPGA features and PCIE interfaces are. Next we configure that we are using two streams. Each line indicates the stream number (#1), direction (IN/OUT) and width (128). When the framework sees these definitions, it will connect the signals for these two streams to your module. Make sure to use matching streams in your user model or the streams will not be connected. The last two lines indicate that we re using an AC-510 board with the Ultrascale KU060 FPGA. The last line includes a Pico file - BasePicoDefines.v. This include must be the last line in PicoDefines.v in order for the framework to build correctly Micron Technology, Inc.

34 2.7 Software API Reference We have described some of the software API as it relates to the PicoBus and streams, but now we cover all of it in one place. As mentioned earlier, the main point of entry is the PicoDrv class, which represents an FPGA. To use the PicoDrv class, include picodrv.h in your code. The public methods in the PicoDrv class are: class PicoDrv {public: PicoDrv (); ~PicoDrv () =0; int LoadFPGA (const char *filenamep, const uint8_t *datap=null, int size=0)=0; int ReadDeviceAbsolute (pico_size_t addr, void* buf, int size); int WriteDeviceAbsolute (pico_size_t addr, const void* buf, int size); int CreateStream (int streamnum); void CloseStream (int streamhandle); int ReadStream (int streamhandle, void* buf, int size)=0; int WriteStream (int streamhandle, const void* buf, int size)=0; int GetBytesAvailable (int streamhandle, bool isread=true); int ReadRam (uint64_t addr, void* buf, int size, int memid = PICO_DDR3_0); int WriteRam (uint64_t addr, const void* buf, int size, int memid = PICO_DDR3_0); }; int GetPicoConfig (PICO_CONFIG *); int GetSerial (); int GetSlot (char *buf, int bufsize); int GetSysMon (float *temp, float *voltage, float *current); PicoDrv Lifecycle There are two ways to create a PicoDrv object. The first is to connect to an FPGA that has already been configured with a bitfile using the FindPico() function. To use the function, you must either set up a PICO_CONFIG struct with the desired configuration: int FindPico(const PICO_CONFIG *cfg, PicoDrv **drvpp) ; or you can simply specify the model of the target card: int FindPico(uint32_t model, PicoDrv **drvpp) ; 2017 Micron Technology, Inc. 31

35 This function does not load a new bit file onto the FPGA or modify it in any way. To check for failure after using this function, check the return value, which will be positive if there were no errors connecting to the FPGA. For example, assuming you wish to connect to an AC-510 that has already been configured with a bitfile, you can use: PicoDrv *pico ; PICO_CONFIG config ; config.model = 0x510 ; config.shareaccess = false ; int error = FindPico(&config, &pico) ; or PicoDrv *pico ; int error = FindPico(0x501, &pico) ; The other way to connect to an FPGA is with the RunBitFile function: int RunBitFile(const char *bitfilepath, PicoDrv **drvpp) ; This function determines the type of Pico board that the given bit file is meant for, finds an available board, and loads the bit file on it. It returns 0 on success or a negative error code on failure. (Note: RunBitFile is not a member of the PicoDrv class.) Both methods of opening an FPGA will lock the FPGA for exclusive access. If you try to open the same FPGA twice with PICODRV, the second attempt will fail. RunBitFile, on the other hand, will pass over any FPGAs that are already in use and connect to the first available one. Because of this difference, RunBitFile is more commonly used. It is important to delete a PicoDrv object when you are done using it, to free it for use by other processes. FindSiblings int FindSiblings(PicoDrv* pico, PicoDrv **drvpp, int max) ; FindSiblings is a version of FindPico that finds up to max FPGAs that share a backplane with pico and have the same model type Micron Technology, Inc.

36 This function does not load a new bit file onto any of the FPGAs. It returns the positive number of siblings found up to the requested max on success. It returns zero if no siblings were found, or a negative error code on failure. If fewer than max siblings are found, the number actually found is returned. Each sibling that is found adds a PicoDrv pointer into drvpp. For example, if there are 4 AC-510 s on an EX-700 backplane and FindSiblings is called with a pointer to one of them in pico and max is set to 5, the return value will be 3 and drvpp will be updated with pointers to the three found FPGAs. A good default value for max is 5, unless fewer modules are desired, because there at most 6 modules on a backplane minus the one that gets passed to FindSiblings as pico. Running.bit Files We have already seen one way to load bit files onto the FPGA: RunBitFile. That approach is useful if you are just connecting to an FPGA for the first time, but if you already have a PicoDrv object and you want to load a different bit file, you can use LoadFPGA: int PicoDrv::LoadFPGA(const char *filenamep, const uint8_t *datap=null, int size=0) ; LoadFPGA allows you to specify the bit file either as a filename in filenamep, or as a binary blob of size size bytes, located at datap. If datap is NULL, filenamep must be valid. LoadFPGA returns 0 on success, or a negative error code on failure. Possible error codes are: no compatible card in your system, all compatible cards are busy, etc. PicoBus Functions int PicoDrv::ReadDeviceAbsolute (pico_size_t addr, void* buf, int size) ; This function reads a block of size bytes from the PicoBus at address addr and places the data into buffer buf. The ReadDeviceAbsolute function returns the number of bytes read on success, or a negative error code on failure. The size must be a multiple of the PicoBus size. If your firmware has implemented a 32 bit PicoBus, you must read multiples of 4B (32 bits). If your firmware has implemented a 128 bit PicoBus, you must read multiples of 16B (128 bits). int PicoDrv::WriteDeviceAbsolute (pico_size_t addr, const void* buf, int size) ; This function writes a block of size bytes from the buffer at buf to the PicoBus at address addr. The WriteDeviceAbsolute function returns the number of bytes written on success, or a negative error code on failure Micron Technology, Inc. 33

37 The size must be a multiple of the PicoBus size. If your firmware has implemented a 32 bit PicoBus, you must write multiples of 4B (32 bits). If your firmware has implemented a 128 bit PicoBus, you must write multiples of 16B (128 bits). Stream Functions TieStreams The TieStreams API connects two user streams from two different FPGAs together. With TieStreams, data can be passed between FPGAs without involving the host machine, so it improves efficiency. The TieStream API connects one output stream on one FPGA to one input stream on the other FPGA. Here, we use the MultiStream sample to describe how it works. In the MultiStream sample, this line ties the stream together: err = last_pico->tiestreams(1,cur_pico,1) ; The MultiStream sample loops through several FPGAs, tying previous FPGA s output stream to current FPGA s input stream. So the API can be written like this: RC = (PicoDrv * picoa) ->TieStreams(in stream_numa, PicoDrv *picob, int stream_numb) ; RC: return code picoa: output FPGA stream_numa: stream number from output FPGA picob: input FPGA stream_numb: stream number from input FPGA Note: A. The TieStreams API cannot be used across different backplanes. B. The TieStreams API cannot be used to tie Streams on the same FPGA Micron Technology, Inc.

38 Opening As mentioned in the architectural overview of streams, the software API communicates over streams through the use of handles. To open a stream handle, use CreateStream: int PicoDrv::CreateStream(int streamid) ; CreateStream takes a stream ID, and returns a handle that can be used for both reading and writing on that stream number. For example: int mystream = pico->createstream(1) ; Reading and Writing Use ReadStream to read: int PicoDrv::ReadStream(int streamhandle, void* buf, size_t count) ; This function reads count bytes from the stream into the buffer at buf. count must be a multiple of the stream s data width (either four or 16 bytes). ReadStream returns the number of bytes read, or a negative error code on failure. For example: int err = pico->readstream(mystream, buf, size) ; ReadStream will block until all the requested data has been transferred. Use WriteStream to write: int PicoDrv::WriteStream(int streamhandle, const void* buf, size_t count) ; WriteStream writes count bytes from the the buffer at buf to the stream. count must be a multiple of the stream s data width (either four or 16 bytes). WriteStream returns the number of bytes read, or a negative error code on failure Micron Technology, Inc. 35

39 For example: int err = pico->writestream(mystream, buf, size) ; WriteStream will block until all the data has been written. Closing Streams are automatically closed when your program terminates, but you can also close them manually with the CloseStream function: void PicoDrv::CloseStream(int streamhandle) ; Error Codes All of the functions from the PicoDrv class described in this document return an int. If this return value is negative (<0), it should be interpreted as an error condition for that API call. For example, if a call to RunBitFile returns a negative value, then an error has occurred. These error codes can be found in the PicoAPI appendix. They can also be decoded using the PicoErrors_FullError function as shown in the following example. This function automatically converts the error code to a textual description of the error condition. PicoDrv* pico; const char* bitfilename; char ibuf[1024]; int err = RunBitFile(bitFileName, &pico); if (err < 0) { fprintf(stderr, "RunBitFile error: %s\n", PicoErrors_FullError(err, ibuf, sizeof(ibuf) )); exit(1); } Status You can use GetBytesAvailable to ask the API how much data can be moved immediately without blocking. ReadStream and WriteStream both block until the data can be transferred, which is not always what you want in your program. You can effectively poll the streams with Micron Technology, Inc.

40 GetBytesAvailable to determine how much data/space is available right away. Note that these provide only a lower bound. You may be able to transfer more data. Thus, if you have 1MB to transfer, you cannot just call GetBytesAvailable waiting for it to say 1MB. On the Virtex-6 implementation of streams, for example, GetBytesAvailable never returns more than about 8kB, so you would be waiting forever. You can either just call WriteStream with the full 1MB, or use GetBytesAvailable to transfer the data in smaller, non-blocking chunks. It is important to realize that this function is just a status you can use if you are curious, and is not required for normal operation. As mentioned in the last paragraph, you can simply call ReadStream or WriteStream and let the Pico framework sort out the flow control. In fact, calling GetBytesAvailable yourself will usually result in code that runs much more slowly. When in doubt, do not call GetBytesAvailable. int PicoDrv::GetBytesAvailable(int streamhandle, bool reading) ; streamnum is the number/id of the stream (not the handle!), and reading is true if the stream is directed from the FPGA to the host. The return code is the number of bytes available without blocking, or a negative error code on failure. Miscellaneous Functions GetSerial You can get the serial number of an open Pico card by calling GetSerial. This function returns the serial number of the card as an int. int PicoDrv::GetSerial() ; GetSlot GetSlot tells you which physical slot a Pico card is in. It returns this as a string containing two slot numbers separated by a colon. For example, if the card is in slot 5 of an EX-700, and the EX-700 is in slot 2 on the motherboard, GetSlot will return the string 2:5. If one of the slots cannot be determined, it will be blank. For example, when using a laptop adapter, you will get the string :. Please see the Pico hardware documentation for slot numbering conventions. int PicoDrv::GetSlot(char *buf, int bufsize) ; 2017 Micron Technology, Inc. 37

41 buf is a pointer to a buffer of size bufsize where GetSlot will write the string. GetSlot returns 0 on success, or a negative error code on failure. GetPicoConfig GetPicoConfig gives you a PICO_CONFIG struct with information about the card. It accepts a pointer to a PICO_CONFIG struct that it will fill in. It returns 0 on success or a negative error code on failure. Most of these fields are opaque and not intended for use by user programs, but the following fields are supported: pico_version_t firmwarever; This is a 32-bit representation of the version of firmware currently running on the FPGA. (pico_version_t is defined to be uint32_t). Each of the four numbers that make up the version are given 8 bits. For example, version would be represented as the 32-bit value 0x05_06_00_00 (0x ). uint32_t model ; This is a hex representation of the model of the card. Possible values are: 0x505 (AC-505) 0x506 (AC-506) 0x510 (AC-510) 0x520 (AC-520) Micron Technology, Inc.

42 3 Pico Memory System The Pico Memory System chapter serves as a user guide for those who want to access the memory on Pico FPGA modules 3.1 Overview Within this chapter, we list the memory available on each board, demonstrate how to read and write the memory from the firmware, and demonstrate how to read and write the memory from the software. 3.2 Accessing Memory from Firmware This section describes how to read and write the memory from the hardware description language (HDL) loaded into the FPGA. Section 3.2 describes how to configure your design to include a memory controller, and how to specify the number of user ports that need to talk to the memory. Section 3.2 describes how to write to/read from the memory using the supported memory interface protocols. Configuration To create a design that interfaces to a memory: 1. Locate one of the distributed samples that communicates with the memory (for example, the DDR3_MovingAverage sample). 2. Copy that sample into your own user directory. 3. Edit the PicoDefines.v file in that newly copied project. To include a DDR3 or DDR4 memory controller in your design, be sure to add the following line to your PicoDefines.v file. `define PICO_DDR3 Or `define PICO_DDR Micron Technology, Inc. 39

43 You can also specify the number user ports that must communicate with the memory using the PicoDefines.v file. For example, if you need 3 ports to communicate with the memory, add the following line to your PicoDefines.v file. `define PICO_3_AXI_MASTERS The memory interface width (interface from the user logic to the memory controller) is 32 Bytes (256 bits), and there can be up to seven user ports to the memory. AXI The Advanced extensible Interface (AXI) protocol is a standard protocol that uses five independent channels for communication: Write Address Write Data Write Response Read Address Read Data Write communication uses separate address and data channels, and a response for each write burst is returned on a write response channel. Read bursts are initiated on the read address channel, and data is returned on the read data channel. The AXI protocol defines interactions between a single master and a single slave on these five channels. Address and control signals flow from the master to the slave on the write address, write data, and read address channels. Conversely, data flows from the slave to the master on the write response and read data channels. All channels are completely independent, and data can be moving on both the write and read channels simultaneously. In our case, user HDL is acting as the master for this AXI protocol, and it should therefore generate address and control information on the write address, write data, and read address channels. The data (both write and read data) can be delivered in bursts of up to 256 transfer cycles. The AXI protocol requires at least 1 address per burst. To clarify, the user HDL can send 1 address (on the write address or read address channel) per 256 transfers of data (on the write data or read data channel). All addresses are specified in terms of bytes (in other words, AXI is byte-addressable) Micron Technology, Inc.

44 Each channel in the AXI protocol uses a simple valid-ready handshake, where data only moves from the master to the slave (write address, write data, read address channels) or the slave to the master (for write response, read data channels) when both valid and ready are asserted. All 5 channels have their own valid-ready handshake signals. Users can specify whichever clock that they want to use for the interface to the memory on a per-port basis. For example, users can drive a 100 MHz clock for writing and reading the memory on port 1. At the same time, they can drive a 250 MHz clock for writing and reading the memory on port 2. The clock for a port s interface should be driven on the c0_sx_axi_clk, where X is the port number, so as to avoid having to cross in and out of a memory clock domain. Both reading and writing for a given port (for example, port 1) is synchronous to the clock for that port. In other words, you cannot read at a different clock rate, because you are writing on the same port. You can, however, use different ports for reading and writing at different clock frequencies. This approach is currently supported up to a 250 MHz clock for each port s interface to the memory. Writing Before writing to the memory from firmware (HDL), make sure the memory controller has gone through calibration. The memory controller is forced to go through calibration automatically with the first call to either WriteRam() or ReadRam() from software (see Section 3.3, Accessing Memory From Software). Therefore, before writing to the memory from firmware, either write or read the memory from software (by calling WriteRam or ReadRam). Then notify your firmware that it is safe to write to the memory by having the software write some data to a stream or set a bit in a status register via the PicoBus. The write address channel specifies the address for the write (axi_awaddr), the size of each data transfer (axi_awsize), and the number of data transfers for the current write (axi_awlen). It also specifies some advanced control options, such as whether this burst should be exclusive (axi_awlock) and whether this data should be cached for future use (axi_awcache). Write address channel signals that can be static for simple user designs: assign c0_s1_axi_awlock assign c0_s1_axi_awcache assign c0_s1_axi_awprot assign c0_s1_axi_awburst assign c0_s1_axi_awqos assign c0_s1_axi_awsize = `NORMAL_ACCESS; = `NON_CACHE_NON_BUFFER; = `DATA_SECURE_NORMAL; = `INCREMENTING; = `NOT_QOS_PARTICIPANT; = `THIRTY_TWO_BYTES; Write address channel signals that will probably be dynamic (although not necessarily required to be) for user designs: 2017 Micron Technology, Inc. 41

45 c0_s1_axi_awid c0_s1_axi_awaddr c0_s1_axi_awlen c0_s1_axi_awvalid An example write address transfer can be seen in Figure 7. Note that the address (c0_s1_axi_awaddr) and length of the burst (c0_s1_axi_awlen) are loaded before asserting the valid signal (c0_s1_axi_awvalid). The transfer of the address and control info from the master to the slave occurs when both c0_s1_axi_awvalid and c0_s1_axi_awready are asserted. Figure 7: Write Address Transfer Example Note that the valid signal (c0_s1_axi_awvalid) must not depend upon the ready signal (c0_s1_axi_awready). Also, once the write address valid is asserted, the signal must remain asserted until the write address ready signal is asserted. The write data channel provides the data to be written (axi_wdata), along with a signal that gets asserted with the last data transfer for a burst (axi_wlast). A byte write enable signal (axi_wstrb) also specifies the bytes that should be written and the bytes that should be ignored for the transfer. The write data channel does not have an ID, so it must be sent to the memory in the same order as the write address channel. In other words, the first piece of data sent to the memory will correspond to the first write address sent to the memory on the write address channel. Write data channel signals that can be static for simple user designs: assign c0_s1_axi_wstrb = ~0; // byte write enable signals Micron Technology, Inc.

46 Write data channel signals that will probably be dynamic (although not necessarily required to be) for user designs: c0_s1_axi_wlast c0_s1_axi_wdata c0_s1_axi_wvalid The final few transfers of data for a write burst can be seen Figure 8. Note that the c0_s1_axi_wvalid signal does not have to be asserted every cycle (data does not move from master to slave if wvalid is not asserted) and that the c0_s1_axi_wlast signal is asserted with the last clock cycle of a write data burst. Figure 8: Write Data Example Note that the valid signal (c0_s1_axi_wvalid) must not depend upon the ready signal (c0_s1_axi_wready). Also, once the write data valid is asserted, the signal must remain asserted until the write data ready signal is asserted. The write response channel is quite simple. The memory system will return a write response for each write address that was received (in other words, 1 write response for each write burst). This response lets users know when the write has completed. The write response has the same ID as the write address that it corresponds to. Write responses are returned in the same order as the write addresses were sent to the memory. Note that writes to the memory system are expected to have very high latency. To get a good performance out of the memory system, in particular for random accesses, you should pipeline multiple writes to the memory without waiting for a write response. Do not wait for a write response before issuing your next write to the memory. Reading 2017 Micron Technology, Inc. 43

47 Before reading from the memory from firmware (HDL), make sure the memory controller has gone through calibration. The memory controller is forced to go through calibration automatically with the first call to either WriteRam() or ReadRam() from software (see Section 3.3, Accessing Memory From Software). Therefore, before we can read the memory from firmware, we must either write or read the memory from software (by calling WriteRam or ReadRam). Then notify your firmware that it is safe to read from the memory by having the software write some data to a stream or set a bit in a status register via the PicoBus. The read address channel provides the read address (axi_araddr) and size (axi_arsize and axi_arlen) to the slave device along with the same advanced control options from the write address channel. The read address is also specified in bytes. Read address channel signals that can be static for simple user designs: assign c0_s1_axi_arlock = `NORMAL_ACCESS ; assign c0_s1_axi_arcache = `NON_CACHE_NON_BUFFER ; assign c0_s1_axi_arprot = `DATA_SECURE_NORMAL ; assign c0_s1_axi_arburst = `INCREMENTING ; assign c0_s1_axi_arqos = `NOT_QOS_PARTICIPANT ; assign c0_s1_axi_arsize = `THIRTY_TWO_BYTES ; Read address channel signals that will probably be dynamic (although not necessarily required to be) for user designs: c0_s1_axi_arid c0_s1_axi_araddr c0_s1_axi_arlen c0_s1_axi_arvalid See Figure 9. for a read address example. Note that the address (c0_s1_axi_araddr) and length of the burst (c0_s1_axi_arlen) are loaded before asserting the valid signal (c0_s1_axi_arvalid). The transfer of address and control signals from the master to the slave occurs when both c0_s1_axi_arvalid and c0_s1_axi_arready are asserted Micron Technology, Inc.

48 Figure 9: Read Address Example Note that the valid signal (c0_s1_axi_arvalid) must not depend upon the ready signal (c0_s1_axi_arready). Also, once the read address valid is asserted, the signal must remain asserted until the read address ready signal is asserted. The read data channel returns the read data (axi_rdata) to the master device. Read response channel signals that will probably be dynamic (although not necessarily required to be) for user designs: c0_s1_axi_rready See Figure 10 for an example of the final few transactions of a read data burst. Note that the c0_s1_axi_rvalid signal does not have to be asserted every cycle (data does not move from the slave to the master if the valid signal is not asserted) and that the c0_s1_axi_rlast signal is asserted with the last clock cycle for the read data burst. Similar to the write response, c0_s1_axi_rresp is 0 for successful reads from the DDR system. Figure 10: Read Data Ready Burst Example 2017 Micron Technology, Inc. 45

Pico Computing. M 501 / M 503 Getting Started Guide. March 7, Overview 1. 2 System Requirements 1. 3 Ubuntu Linux Configuration 2

Pico Computing. M 501 / M 503 Getting Started Guide. March 7, Overview 1. 2 System Requirements 1. 3 Ubuntu Linux Configuration 2 Pico Computing M 501 / M 503 Getting Started Guide March 7, 2012 Contents 1 Overview 1 2 System Requirements 1 3 Ubuntu Linux Configuration 2 4 Installing the Pico Software 4 5 Monitoring Cards With purty

More information

Pico Computing M501 PSP User Guide Linux Version 1.0.1

Pico Computing M501 PSP User Guide Linux Version 1.0.1 CoDeveloper Platform Support Package Pico Computing M501 PSP User Guide Linux Version 1.0.1 Impulse Accelerated Technologies, Inc. www.impulseaccelerated.com 1 1.0 Table of Contents 1.0 TABLE OF CONTENTS...

More information

Pico ChipScope Documentation

Pico ChipScope Documentation Pico ChipScope Documentation Contents 1 Disclaimer 1 2 Overview 1 3 Firmware 2 4 Pico Module 6 4.1 M-503 Cables...................................................... 6 4.2 M-505 Cables......................................................

More information

SD Card Controller IP Specification

SD Card Controller IP Specification SD Card Controller IP Specification Marek Czerski Friday 30 th August, 2013 1 List of Figures 1 SoC with SD Card IP core................................ 4 2 Wishbone SD Card Controller IP Core interface....................

More information

ADQ14 Development Kit

ADQ14 Development Kit ADQ14 Development Kit Documentation : P Devices PD : ecurity Class: : Release : P Devices Page 2(of 21) ecurity class Table of Contents 1 Tools...3 2 Overview...4 2.1 High-level block overview...4 3 How

More information

Xilinx Vivado/SDK Tutorial

Xilinx Vivado/SDK Tutorial Xilinx Vivado/SDK Tutorial (Laboratory Session 1, EDAN15) Flavius.Gruian@cs.lth.se March 21, 2017 This tutorial shows you how to create and run a simple MicroBlaze-based system on a Digilent Nexys-4 prototyping

More information

Xilinx ChipScope ICON/VIO/ILA Tutorial

Xilinx ChipScope ICON/VIO/ILA Tutorial Xilinx ChipScope ICON/VIO/ILA Tutorial The Xilinx ChipScope tools package has several modules that you can add to your Verilog design to capture input and output directly from the FPGA hardware. These

More information

Chapter 3: Computer Assembly

Chapter 3: Computer Assembly Chapter 3: Computer Assembly IT Essentials v6.0 ITE v6.0 1 Chapter 3 - Sections & Objectives 3.1 Assemble the Computer Build a Computer. 3.2 Boot the Computer Explain how to verify BIOS and UEFI settings.

More information

E85: Digital Design and Computer Engineering Lab 2: FPGA Tools and Combinatorial Logic Design

E85: Digital Design and Computer Engineering Lab 2: FPGA Tools and Combinatorial Logic Design E85: Digital Design and Computer Engineering Lab 2: FPGA Tools and Combinatorial Logic Design Objective The purpose of this lab is to learn to use Field Programmable Gate Array (FPGA) tools to simulate

More information

Introduction Electrical Considerations Data Transfer Synchronization Bus Arbitration VME Bus Local Buses PCI Bus PCI Bus Variants Serial Buses

Introduction Electrical Considerations Data Transfer Synchronization Bus Arbitration VME Bus Local Buses PCI Bus PCI Bus Variants Serial Buses Introduction Electrical Considerations Data Transfer Synchronization Bus Arbitration VME Bus Local Buses PCI Bus PCI Bus Variants Serial Buses 1 Most of the integrated I/O subsystems are connected to the

More information

Chapter Operation Pinout Operation 35

Chapter Operation Pinout Operation 35 68000 Operation 35 Chapter 6 68000 Operation 6-1. 68000 Pinout We will do no construction in this chapter; instead, we will take a detailed look at the individual pins of the 68000 and what they do. Fig.

More information

System-On-Chip Design with the Leon CPU The SOCKS Hardware/Software Environment

System-On-Chip Design with the Leon CPU The SOCKS Hardware/Software Environment System-On-Chip Design with the Leon CPU The SOCKS Hardware/Software Environment Introduction Digital systems typically contain both, software programmable components, as well as application specific logic.

More information

ASSEMBLY LANGUAGE MACHINE ORGANIZATION

ASSEMBLY LANGUAGE MACHINE ORGANIZATION ASSEMBLY LANGUAGE MACHINE ORGANIZATION CHAPTER 3 1 Sub-topics The topic will cover: Microprocessor architecture CPU processing methods Pipelining Superscalar RISC Multiprocessing Instruction Cycle Instruction

More information

SigStream Quick Start Guide

SigStream Quick Start Guide SigStream Quick Start Guide 797 North Grove Rd, Suite 101 Richardson, TX 75081 Phone: 972-671-9570 www.redrapids.com Red Rapids Red Rapids reserves the right to alter product specifications or discontinue

More information

Design of Embedded Hardware and Firmware

Design of Embedded Hardware and Firmware Design of Embedded Hardware and Firmware Introduction on "System On Programmable Chip" NIOS II Avalon Bus - DMA Andres Upegui Laboratoire de Systèmes Numériques hepia/hes-so Geneva, Switzerland Embedded

More information

University of Massachusetts Amherst Computer Systems Lab 2 (ECE 354) Spring Lab 1: Using Nios 2 processor for code execution on FPGA

University of Massachusetts Amherst Computer Systems Lab 2 (ECE 354) Spring Lab 1: Using Nios 2 processor for code execution on FPGA University of Massachusetts Amherst Computer Systems Lab 2 (ECE 354) Spring 2007 Lab 1: Using Nios 2 processor for code execution on FPGA Objectives: After the completion of this lab: 1. You will understand

More information

08 - Address Generator Unit (AGU)

08 - Address Generator Unit (AGU) October 2, 2014 Todays lecture Memory subsystem Address Generator Unit (AGU) Schedule change A new lecture has been entered into the schedule (to compensate for the lost lecture last week) Memory subsystem

More information

Lecture 5: Computing Platforms. Asbjørn Djupdal ARM Norway, IDI NTNU 2013 TDT

Lecture 5: Computing Platforms. Asbjørn Djupdal ARM Norway, IDI NTNU 2013 TDT 1 Lecture 5: Computing Platforms Asbjørn Djupdal ARM Norway, IDI NTNU 2013 2 Lecture overview Bus based systems Timing diagrams Bus protocols Various busses Basic I/O devices RAM Custom logic FPGA Debug

More information

TR4. PCIe Qsys Example Designs. June 20, 2018

TR4. PCIe Qsys Example Designs.   June 20, 2018 Q 1 Contents 1. PCI Express System Infrastructure... 3 2. PC PCI Express Software SDK... 4 3. PCI Express Software Stack... 5 4. PCI Express Library API... 10 5. PCIe Reference Design - Fundamental...

More information

CARDBUS INTERFACE USER MANUAL

CARDBUS INTERFACE USER MANUAL CARDBUS INTERFACE USER MANUAL 1 Scope The COM-13xx ComBlock modules are PC cards which support communication with a host computer through a standard CardBus interface. These ComBlock modules can be used

More information

ROCCC 2.0 Pico Port Generation - Revision 0.7.4

ROCCC 2.0 Pico Port Generation - Revision 0.7.4 ROCCC 2.0 Pico Port Generation - Revision 0.7.4 June 4, 2012 1 Contents CONTENTS 1 Pico Interface Generation GUI Options 4 1.1 Hardware Configuration....................................... 4 1.2 Stream

More information

Applying the Benefits of Network on a Chip Architecture to FPGA System Design

Applying the Benefits of Network on a Chip Architecture to FPGA System Design white paper Intel FPGA Applying the Benefits of on a Chip Architecture to FPGA System Design Authors Kent Orthner Senior Manager, Software and IP Intel Corporation Table of Contents Abstract...1 Introduction...1

More information

Turbo Encoder Co-processor Reference Design

Turbo Encoder Co-processor Reference Design Turbo Encoder Co-processor Reference Design AN-317-1.2 Application Note Introduction The turbo encoder co-processor reference design is for implemention in an Stratix DSP development board that is connected

More information

and 32 bit for 32 bit. If you don t pay attention to this, there will be unexpected behavior in the ISE software and thing may not work properly!

and 32 bit for 32 bit. If you don t pay attention to this, there will be unexpected behavior in the ISE software and thing may not work properly! This tutorial will show you how to: Part I: Set up a new project in ISE 14.7 Part II: Implement a function using Schematics Part III: Simulate the schematic circuit using ISim Part IV: Constraint, Synthesize,

More information

Building an Embedded Processor System on Xilinx NEXYS3 FPGA and Profiling an Application: A Tutorial

Building an Embedded Processor System on Xilinx NEXYS3 FPGA and Profiling an Application: A Tutorial Building an Embedded Processor System on Xilinx NEXYS3 FPGA and Profiling an Application: A Tutorial Introduction: Modern FPGA s are equipped with a lot of resources that allow them to hold large digital

More information

Vivado Design Suite User Guide

Vivado Design Suite User Guide Vivado Design Suite User Guide Design Flows Overview Notice of Disclaimer The information disclosed to you hereunder (the Materials ) is provided solely for the selection and use of Xilinx products. To

More information

Laboratory Exercise 3 Comparative Analysis of Hardware and Emulation Forms of Signed 32-Bit Multiplication

Laboratory Exercise 3 Comparative Analysis of Hardware and Emulation Forms of Signed 32-Bit Multiplication Laboratory Exercise 3 Comparative Analysis of Hardware and Emulation Forms of Signed 32-Bit Multiplication Introduction All processors offer some form of instructions to add, subtract, and manipulate data.

More information

Microprocessors and Microcontrollers Prof. Santanu Chattopadhyay Department of E & EC Engineering Indian Institute of Technology, Kharagpur

Microprocessors and Microcontrollers Prof. Santanu Chattopadhyay Department of E & EC Engineering Indian Institute of Technology, Kharagpur Microprocessors and Microcontrollers Prof. Santanu Chattopadhyay Department of E & EC Engineering Indian Institute of Technology, Kharagpur Lecture - 09 8085 Microprocessors (Contd.) (Refer Slide Time:

More information

Release Notes CCURDSCC (WC-AD3224-DS)

Release Notes CCURDSCC (WC-AD3224-DS) Release Notes CCURDSCC (WC-AD3224-DS) Driver ccurdscc (WC-AD3224-DS) Rev 6.3 OS RedHawk Rev 6.3 Vendor Concurrent Computer Corporation Hardware PCIe 32-Channel Delta Sigma Converter Card (CP-AD3224-DS)

More information

NIOS CPU Based Embedded Computer System on Programmable Chip

NIOS CPU Based Embedded Computer System on Programmable Chip 1 Objectives NIOS CPU Based Embedded Computer System on Programmable Chip EE8205: Embedded Computer Systems This lab has been constructed to introduce the development of dedicated embedded system based

More information

ENGG3380: Computer Organization and Design Lab4: Buses and Peripheral Devices

ENGG3380: Computer Organization and Design Lab4: Buses and Peripheral Devices ENGG3380: Computer Organization and Design Lab4: Buses and Peripheral Devices School of Engineering, University of Guelph Winter 2017 1 Objectives: The purpose of this lab is : Learn basic bus design techniques.

More information

Introduction to the Personal Computer

Introduction to the Personal Computer Introduction to the Personal Computer 2.1 Describe a computer system A computer system consists of hardware and software components. Hardware is the physical equipment such as the case, storage drives,

More information

PCI GS or PCIe8 LX Time Distribution Board

PCI GS or PCIe8 LX Time Distribution Board PCI GS or PCIe8 LX Time Distribution Board for use with PCI GS or PCIe8 LX Main Board August 28, 2008 008-02783-01 The information in this document is subject to change without notice and does not represent

More information

University of Massachusetts Amherst Computer Systems Lab 1 (ECE 354) LAB 1 Reference Manual

University of Massachusetts Amherst Computer Systems Lab 1 (ECE 354) LAB 1 Reference Manual University of Massachusetts Amherst Computer Systems Lab 1 (ECE 354) LAB 1 Reference Manual Lab 1: Using NIOS II processor for code execution on FPGA Objectives: 1. Understand the typical design flow in

More information

Bus Master DMA Performance Demonstration Reference Design for the Xilinx Endpoint PCI Express Solutions Author: Jake Wiltgen and John Ayer

Bus Master DMA Performance Demonstration Reference Design for the Xilinx Endpoint PCI Express Solutions Author: Jake Wiltgen and John Ayer XAPP1052 (v2.5) December 3, 2009 Application Note: Virtex-6, Virtex-5, Spartan-6 and Spartan-3 FPGA Families Bus Master DMA Performance Demonstration Reference Design for the Xilinx Endpoint PCI Express

More information

Recommended Design Techniques for ECE241 Project Franjo Plavec Department of Electrical and Computer Engineering University of Toronto

Recommended Design Techniques for ECE241 Project Franjo Plavec Department of Electrical and Computer Engineering University of Toronto Recommed Design Techniques for ECE241 Project Franjo Plavec Department of Electrical and Computer Engineering University of Toronto DISCLAIMER: The information contained in this document does NOT contain

More information

FIFO Generator v13.0

FIFO Generator v13.0 FIFO Generator v13.0 LogiCORE IP Product Guide Vivado Design Suite Table of Contents IP Facts Chapter 1: Overview Native Interface FIFOs.............................................................. 5

More information

Verilog Design Entry, Synthesis, and Behavioral Simulation

Verilog Design Entry, Synthesis, and Behavioral Simulation ------------------------------------------------------------- PURPOSE - This lab will present a brief overview of a typical design flow and then will start to walk you through some typical tasks and familiarize

More information

Multimedia Decoder Using the Nios II Processor

Multimedia Decoder Using the Nios II Processor Multimedia Decoder Using the Nios II Processor Third Prize Multimedia Decoder Using the Nios II Processor Institution: Participants: Instructor: Indian Institute of Science Mythri Alle, Naresh K. V., Svatantra

More information

Intel Stratix 10 H-Tile PCIe Link Hardware Validation

Intel Stratix 10 H-Tile PCIe Link Hardware Validation Intel Stratix 10 H-Tile PCIe Link Hardware Validation Subscribe Send Feedback Latest document on the web: PDF HTML Contents Contents 1. Intel Stratix 10 H-Tile PCIe* Link Hardware Validation... 3 1.1.

More information

Using the FADC250 Module (V1C - 5/5/14)

Using the FADC250 Module (V1C - 5/5/14) Using the FADC250 Module (V1C - 5/5/14) 1.1 Controlling the Module Communication with the module is by standard VME bus protocols. All registers and memory locations are defined to be 4-byte entities.

More information

Getting Started Guide with AXM-A30

Getting Started Guide with AXM-A30 Series PMC-VFX70 Virtex-5 Based FPGA PMC Module Getting Started Guide with AXM-A30 ACROMAG INCORPORATED Tel: (248) 295-0310 30765 South Wixom Road Fax: (248) 624-9234 P.O. BOX 437 Wixom, MI 48393-7037

More information

ECEN 449: Microprocessor System Design Department of Electrical and Computer Engineering Texas A&M University

ECEN 449: Microprocessor System Design Department of Electrical and Computer Engineering Texas A&M University ECEN 449: Microprocessor System Design Department of Electrical and Computer Engineering Texas A&M University Prof. Sunil P Khatri (Lab exercise created and tested by Ramu Endluri, He Zhou, Andrew Douglass

More information

Using Synplify Pro, ISE and ModelSim

Using Synplify Pro, ISE and ModelSim Using Synplify Pro, ISE and ModelSim VLSI Systems on Chip ET4 351 Rene van Leuken Huib Lincklaen Arriëns Rev. 1.2 The EDA programs that will be used are: For RTL synthesis: Synplicity Synplify Pro For

More information

CS 101, Mock Computer Architecture

CS 101, Mock Computer Architecture CS 101, Mock Computer Architecture Computer organization and architecture refers to the actual hardware used to construct the computer, and the way that the hardware operates both physically and logically

More information

Introduction to Zynq

Introduction to Zynq Introduction to Zynq Lab 2 PS Config Part 1 Hello World October 2012 Version 02 Copyright 2012 Avnet Inc. All rights reserved Table of Contents Table of Contents... 2 Lab 2 Objectives... 3 Experiment 1:

More information

ModuleFabric Framework Documentation. ModuleFabric Framework

ModuleFabric Framework Documentation. ModuleFabric Framework ModuleFabric Framework Documentation ModuleFabric Framework ModuleFabric Framework Documentation 2012 Linera Ar-Ge Ltd All rights reserved. Unauthorized duplication, in whole or part is prohibited without

More information

Generic Serial Flash Interface Intel FPGA IP Core User Guide

Generic Serial Flash Interface Intel FPGA IP Core User Guide Generic Serial Flash Interface Intel FPGA IP Core User Guide Updated for Intel Quartus Prime Design Suite: 18.0 Subscribe Send Feedback Latest document on the web: PDF HTML Contents Contents 1. Generic

More information

Using XILINX WebPACK Software to Create CPLD Designs

Using XILINX WebPACK Software to Create CPLD Designs Introduction to WebPACK Using XILINX WebPACK Software to Create CPLD Designs RELEASE DATE: 10/24/1999 All XS-prefix product designations are trademarks of XESS Corp. All XC-prefix product designations

More information

BlazePPS (Blaze Packet Processing System) CSEE W4840 Project Design

BlazePPS (Blaze Packet Processing System) CSEE W4840 Project Design BlazePPS (Blaze Packet Processing System) CSEE W4840 Project Design Valeh Valiollahpour Amiri (vv2252) Christopher Campbell (cc3769) Yuanpei Zhang (yz2727) Sheng Qian ( sq2168) March 26, 2015 I) Hardware

More information

Lab 4: Interrupts and Realtime

Lab 4: Interrupts and Realtime Lab 4: Interrupts and Realtime Overview At this point, we have learned the basics of how to write kernel driver module, and we wrote a driver kernel module for the LCD+shift register. Writing kernel driver

More information

Simplify System Complexity

Simplify System Complexity 1 2 Simplify System Complexity With the new high-performance CompactRIO controller Arun Veeramani Senior Program Manager National Instruments NI CompactRIO The Worlds Only Software Designed Controller

More information

ECE1387 Exercise 3: Using the LegUp High-level Synthesis Framework

ECE1387 Exercise 3: Using the LegUp High-level Synthesis Framework ECE1387 Exercise 3: Using the LegUp High-level Synthesis Framework 1 Introduction and Motivation This lab will give you an overview of how to use the LegUp high-level synthesis framework. In LegUp, you

More information

Virtex-7 FPGA Gen3 Integrated Block for PCI Express

Virtex-7 FPGA Gen3 Integrated Block for PCI Express Virtex-7 FPGA Gen3 Integrated Block for PCI Express Product Guide Table of Contents Chapter 1: Overview Feature Summary.................................................................. 9 Applications......................................................................

More information

CS/EE 3710 Computer Architecture Lab Checkpoint #2 Datapath Infrastructure

CS/EE 3710 Computer Architecture Lab Checkpoint #2 Datapath Infrastructure CS/EE 3710 Computer Architecture Lab Checkpoint #2 Datapath Infrastructure Overview In order to complete the datapath for your insert-name-here machine, the register file and ALU that you designed in checkpoint

More information

MICROPROCESSOR AND MICROCONTROLLER BASED SYSTEMS

MICROPROCESSOR AND MICROCONTROLLER BASED SYSTEMS MICROPROCESSOR AND MICROCONTROLLER BASED SYSTEMS UNIT I INTRODUCTION TO 8085 8085 Microprocessor - Architecture and its operation, Concept of instruction execution and timing diagrams, fundamentals of

More information

ADM-PCIE-9H7 Support & Development Kit Release: 0.1.1

ADM-PCIE-9H7 Support & Development Kit Release: 0.1.1 ADM-PCIE-9H7 Support & Development Kit Release: 0.1.1 Document Revision: 1.1 3rd January 2019 2019 Copyright Alpha Data Parallel Systems Ltd. All rights reserved. This publication is protected by Copyright

More information

Block Diagram. mast_sel. mast_inst. mast_data. mast_val mast_rdy. clk. slv_sel. slv_inst. slv_data. slv_val slv_rdy. rfifo_depth_log2.

Block Diagram. mast_sel. mast_inst. mast_data. mast_val mast_rdy. clk. slv_sel. slv_inst. slv_data. slv_val slv_rdy. rfifo_depth_log2. Key Design Features Block Diagram Synthesizable, technology independent IP Core for FPGA, ASIC and SoC reset Supplied as human readable VHDL (or Verilog) source code mast_sel SPI serial-bus compliant Supports

More information

Quartus II Tutorial. September 10, 2014 Quartus II Version 14.0

Quartus II Tutorial. September 10, 2014 Quartus II Version 14.0 Quartus II Tutorial September 10, 2014 Quartus II Version 14.0 This tutorial will walk you through the process of developing circuit designs within Quartus II, simulating with Modelsim, and downloading

More information

Adding Custom IP to the System

Adding Custom IP to the System Lab Workbook Introduction This lab guides you through the process of creating and adding a custom peripheral to a processor system by using the Vivado IP Packager. You will create an AXI4Lite interface

More information

EECS150 - Digital Design Lecture 16 - Memory

EECS150 - Digital Design Lecture 16 - Memory EECS150 - Digital Design Lecture 16 - Memory October 17, 2002 John Wawrzynek Fall 2002 EECS150 - Lec16-mem1 Page 1 Memory Basics Uses: data & program storage general purpose registers buffering table lookups

More information

2. Why is it important to remove loose jewelry before working inside a computer case?

2. Why is it important to remove loose jewelry before working inside a computer case? Chapter 2 Solutions Reviewing the Basics 1. When taking a computer apart, why is it important to not stack boards on top of each other? Answer: You could accidentally dislodge a chip. 2. Why is it important

More information

LegUp HLS Tutorial for Microsemi PolarFire Sobel Filtering for Image Edge Detection

LegUp HLS Tutorial for Microsemi PolarFire Sobel Filtering for Image Edge Detection LegUp HLS Tutorial for Microsemi PolarFire Sobel Filtering for Image Edge Detection This tutorial will introduce you to high-level synthesis (HLS) concepts using LegUp. You will apply HLS to a real problem:

More information

Building an Embedded Processor System on a Xilinx Zync FPGA (Profiling): A Tutorial

Building an Embedded Processor System on a Xilinx Zync FPGA (Profiling): A Tutorial Building an Embedded Processor System on a Xilinx Zync FPGA (Profiling): A Tutorial Embedded Processor Hardware Design October 6 t h 2017. VIVADO TUTORIAL 1 Table of Contents Requirements... 3 Part 1:

More information

TP : System on Chip (SoC) 1

TP : System on Chip (SoC) 1 TP : System on Chip (SoC) 1 Goals : -Discover the VIVADO environment and SDK tool from Xilinx -Programming of the Software part of a SoC -Control of hardware peripheral using software running on the ARM

More information

Using ModelSim to Simulate Logic Circuits for Altera FPGA Devices

Using ModelSim to Simulate Logic Circuits for Altera FPGA Devices Using ModelSim to Simulate Logic Circuits for Altera FPGA Devices This tutorial is a basic introduction to ModelSim, a Mentor Graphics simulation tool for logic circuits. We show how to perform functional

More information

Release Notes CCURDSCC (WC-AD3224-DS)

Release Notes CCURDSCC (WC-AD3224-DS) Release Notes CCURDSCC (WC-AD3224-DS) Driver ccurdscc (WC-AD3224-DS) v 23.0.1 OS RedHawk Vendor Concurrent Real-Time, Inc. Hardware PCIe 32-Channel Delta Sigma Converter Card (CP- AD3224-DS) (CP-AD3224-DS-10)

More information

PMC-DA Channel 16 Bit D/A for PMC Systems REFERENCE MANUAL Version 1.0 June 2001

PMC-DA Channel 16 Bit D/A for PMC Systems REFERENCE MANUAL Version 1.0 June 2001 PMC-DA816 8 Channel 16 Bit D/A for PMC Systems REFERENCE MANUAL 796-10-000-4000 Version 1.0 June 2001 ALPHI TECHNOLOGY CORPORATION 6202 S. Maple Avenue #120 Tempe, AZ 85283 USA Tel: (480) 838-2428 Fax:

More information

6.9. Communicating to the Outside World: Cluster Networking

6.9. Communicating to the Outside World: Cluster Networking 6.9 Communicating to the Outside World: Cluster Networking This online section describes the networking hardware and software used to connect the nodes of cluster together. As there are whole books and

More information

Introduction to Configuration. Chapter 4

Introduction to Configuration. Chapter 4 Introduction to Configuration Chapter 4 This presentation covers: > Qualities of a Good Technician > Configuration Overview > Motherboard Battery > Hardware Configuration Overview > Troubleshooting Configurations

More information

Vivado Design Suite Tutorial. Designing IP Subsystems Using IP Integrator

Vivado Design Suite Tutorial. Designing IP Subsystems Using IP Integrator Vivado Design Suite Tutorial Designing IP Subsystems Using IP Integrator Notice of Disclaimer The information disclosed to you hereunder (the "Materials") is provided solely for the selection and use of

More information

Quartus II Version 14.0 Tutorial Created September 10, 2014; Last Updated January 9, 2017

Quartus II Version 14.0 Tutorial Created September 10, 2014; Last Updated January 9, 2017 Quartus II Version 14.0 Tutorial Created September 10, 2014; Last Updated January 9, 2017 This tutorial will walk you through the process of developing circuit designs within Quartus II, simulating with

More information

INT G bit TCP Offload Engine SOC

INT G bit TCP Offload Engine SOC INT 10011 10 G bit TCP Offload Engine SOC Product brief, features and benefits summary: Highly customizable hardware IP block. Easily portable to ASIC flow, Xilinx/Altera FPGAs or Structured ASIC flow.

More information

LogiCORE IP AXI Video Direct Memory Access (axi_vdma) (v3.01.a)

LogiCORE IP AXI Video Direct Memory Access (axi_vdma) (v3.01.a) DS799 June 22, 2011 LogiCORE IP AXI Video Direct Memory Access (axi_vdma) (v3.01.a) Introduction The AXI Video Direct Memory Access (AXI VDMA) core is a soft Xilinx IP core for use with the Xilinx Embedded

More information

Using the ChipScope Pro for Testing HDL Designs on FPGAs

Using the ChipScope Pro for Testing HDL Designs on FPGAs Using the ChipScope Pro for Testing HDL Designs on FPGAs Compiled by OmkarCK CAD Lab, Dept of E&ECE, IIT Kharagpur. Introduction: Simulation based method is widely used for debugging the FPGA design on

More information

DE4 NetFPGA Reference Router User Guide

DE4 NetFPGA Reference Router User Guide DE4 NetFPGA Reference Router User Guide Revision History Date Comment Author O8/11/2011 Initial draft Harikrishnan 08/15/2012 Revision 1 DMA APIs included Harikrishnan 08/23/2012 Revision 2 Directory Structure

More information

Vivado Design Suite User Guide

Vivado Design Suite User Guide Vivado Design Suite User Guide Design Flows Overview Notice of Disclaimer The information disclosed to you hereunder (the Materials ) is provided solely for the selection and use of Xilinx products. To

More information

Introduction to the Altera SOPC Builder Using Verilog Designs. 1 Introduction

Introduction to the Altera SOPC Builder Using Verilog Designs. 1 Introduction Introduction to the Altera SOPC Builder Using Verilog Designs 1 Introduction This tutorial presents an introduction to Altera s SOPC Builder software, which is used to implement a system that uses the

More information

Tutorial: Working with the Xilinx tools 14.4

Tutorial: Working with the Xilinx tools 14.4 Tutorial: Working with the Xilinx tools 14.4 This tutorial will show you how to: Part I: Set up a new project in ISE Part II: Implement a function using Schematics Part III: Implement a function using

More information

Making Qsys Components. 1 Introduction. For Quartus II 13.0

Making Qsys Components. 1 Introduction. For Quartus II 13.0 Making Qsys Components For Quartus II 13.0 1 Introduction The Altera Qsys tool allows a digital system to be designed by interconnecting selected Qsys components, such as processors, memory controllers,

More information

Laboratory Exercise 8

Laboratory Exercise 8 Laboratory Exercise 8 Memory Blocks In computer systems it is necessary to provide a substantial amount of memory. If a system is implemented using FPGA technology it is possible to provide some amount

More information

Tutorial on Software-Hardware Codesign with CORDIC

Tutorial on Software-Hardware Codesign with CORDIC ECE5775 High-Level Digital Design Automation, Fall 2017 School of Electrical Computer Engineering, Cornell University Tutorial on Software-Hardware Codesign with CORDIC 1 Introduction So far in ECE5775

More information

Verilog Design Principles

Verilog Design Principles 16 h7fex // 16-bit value, low order 4 bits unknown 8 bxx001100 // 8-bit value, most significant 2 bits unknown. 8 hzz // 8-bit value, all bits high impedance. Verilog Design Principles ECGR2181 Extra Notes

More information

Vivado Design Suite User Guide. Designing IP Subsystems Using IP Integrator

Vivado Design Suite User Guide. Designing IP Subsystems Using IP Integrator Vivado Design Suite User Guide Designing IP Subsystems Using IP Integrator Notice of Disclaimer The information disclosed to you hereunder (the "Materials") is provided solely for the selection and use

More information

FPGA design with National Instuments

FPGA design with National Instuments FPGA design with National Instuments Rémi DA SILVA Systems Engineer - Embedded and Data Acquisition Systems - MED Region ni.com The NI Approach to Flexible Hardware Processor Real-time OS Application software

More information

2 nd Year Laboratory. Experiment: FPGA Design with Verilog. Department of Electrical & Electronic Engineering. Imperial College London.

2 nd Year Laboratory. Experiment: FPGA Design with Verilog. Department of Electrical & Electronic Engineering. Imperial College London. Department of Electrical & Electronic Engineering 2 nd Year Laboratory Experiment: FPGA Design with Verilog Objectives By the end of this experiment, you should know: How to design digital circuits using

More information

V8-uRISC 8-bit RISC Microprocessor AllianceCORE Facts Core Specifics VAutomation, Inc. Supported Devices/Resources Remaining I/O CLBs

V8-uRISC 8-bit RISC Microprocessor AllianceCORE Facts Core Specifics VAutomation, Inc. Supported Devices/Resources Remaining I/O CLBs V8-uRISC 8-bit RISC Microprocessor February 8, 1998 Product Specification VAutomation, Inc. 20 Trafalgar Square Nashua, NH 03063 Phone: +1 603-882-2282 Fax: +1 603-882-1587 E-mail: sales@vautomation.com

More information

Saleae Device SDK Starting a Device SDK Project on Windows Starting a Device SDK Project on Linux... 7

Saleae Device SDK Starting a Device SDK Project on Windows Starting a Device SDK Project on Linux... 7 Contents Starting a Device SDK Project on Windows... 2 Starting a Device SDK Project on Linux... 7 Debugging your Project with GDB... 9 Starting a Device SDK Project on Mac... 11 Build Script / Command

More information

Introducing SPI Xpress SPI protocol Master / Analyser on USB

Introducing SPI Xpress SPI protocol Master / Analyser on USB Introducing SPI Xpress SPI protocol Master / Analyser on USB SPI Xpress is Byte Paradigm s SPI protocol exerciser and analyser. It is controlled from a PC through a USB 2.0 high speed interface. It allows

More information

MD4 esata. 4-Bay Rack Mount Chassis. User Manual February 6, v1.0

MD4 esata. 4-Bay Rack Mount Chassis. User Manual February 6, v1.0 4-Bay Rack Mount Chassis User Manual February 6, 2009 - v1.0 EN Introduction 1 Introduction 1.1 System Requirements 1.1.1 PC Requirements Minimum Intel Pentium III CPU 500MHz, 128MB RAM esata equipped

More information

CSE 591: Advanced Hardware Design and Verification (2012 Spring) LAB #0

CSE 591: Advanced Hardware Design and Verification (2012 Spring) LAB #0 Lab 0: Tutorial on Xilinx Project Navigator & ALDEC s Active-HDL Simulator CSE 591: Advanced Hardware Design and Verification Assigned: 01/05/2011 Due: 01/19/2011 Table of Contents 1 Overview... 2 1.1

More information

Lab 1: Using the LegUp High-level Synthesis Framework

Lab 1: Using the LegUp High-level Synthesis Framework Lab 1: Using the LegUp High-level Synthesis Framework 1 Introduction and Motivation This lab will give you an overview of how to use the LegUp high-level synthesis framework. In LegUp, you can compile

More information

Universal Serial Bus Host Interface on an FPGA

Universal Serial Bus Host Interface on an FPGA Universal Serial Bus Host Interface on an FPGA Application Note For many years, designers have yearned for a general-purpose, high-performance serial communication protocol. The RS-232 and its derivatives

More information

ARM 64-bit Register File

ARM 64-bit Register File ARM 64-bit Register File Introduction: In this class we will develop and simulate a simple, pipelined ARM microprocessor. Labs #1 & #2 build some basic components of the processor, then labs #3 and #4

More information

Intel Stratix 10 Low Latency 40G Ethernet Design Example User Guide

Intel Stratix 10 Low Latency 40G Ethernet Design Example User Guide Intel Stratix 10 Low Latency 40G Ethernet Design Example User Guide Updated for Intel Quartus Prime Design Suite: 18.1 Subscribe Latest document on the web: PDF HTML Contents Contents 1. Quick Start Guide...

More information

Simplify System Complexity

Simplify System Complexity Simplify System Complexity With the new high-performance CompactRIO controller Fanie Coetzer Field Sales Engineer Northern South Africa 2 3 New control system CompactPCI MMI/Sequencing/Logging FieldPoint

More information

PCIE PCIE GEN2 EXPANSION SYSTEM USER S MANUAL

PCIE PCIE GEN2 EXPANSION SYSTEM USER S MANUAL PCIE2-2711 PCIE GEN2 EXPANSION SYSTEM USER S MANUAL The information in this document has been carefully checked and is believed to be entirely reliable. However, no responsibility is assumed for inaccuracies.

More information

Vivado Design Suite Tutorial. Designing IP Subsystems Using IP Integrator

Vivado Design Suite Tutorial. Designing IP Subsystems Using IP Integrator Vivado Design Suite Tutorial Designing IP Subsystems Using IP Integrator Notice of Disclaimer The information disclosed to you hereunder (the "Materials") is provided solely for the selection and use of

More information

SPI 3-Wire Master (VHDL)

SPI 3-Wire Master (VHDL) SPI 3-Wire Master (VHDL) Code Download Features Introduction Background Port Descriptions Clocking Polarity and Phase Command and Data Widths Transactions Reset Conclusion Contact Code Download spi_3_wire_master.vhd

More information

With Fixed Point or Floating Point Processors!!

With Fixed Point or Floating Point Processors!! Product Information Sheet High Throughput Digital Signal Processor OVERVIEW With Fixed Point or Floating Point Processors!! Performance Up to 14.4 GIPS or 7.7 GFLOPS Peak Processing Power Continuous Input

More information