RISC-V Core IP Products

Similar documents
Custom Silicon for all

SiFive E31 Core Complex Manual v1p2. c SiFive, Inc.

Jack Kang ( 剛至堅 ) VP Product June 2018

Evaluating SiFive RISC- V Core IP

SiFive E31 Core Complex Manual v2p0. SiFive, Inc.

Introducing the Latest SiFive RISC-V Core IP Series

SiFive E31 Coreplex Manual v1p0. c SiFive, Inc.

SiFive Freedom SoCs: Industry s First Open-Source RISC-V Chips

SiFive U54 Core Complex Manual v1p1. SiFive, Inc.

SiFive E24 Core Complex Manual v1p0. SiFive, Inc.

ARM Processors for Embedded Applications

RISC-V based core as a soft processor in FPGAs Chowdhary Musunuri Sr. Director, Solutions & Applications Microsemi

Copyright 2016 Xilinx

SiFive FE310-G000 Manual c SiFive, Inc.

Agile Hardware Design: Building Chips with Small Teams

Accelerating the RISC-V Revolution: Unleashing Custom Silicon with Revolutionary Design Platforms and Custom Accelerators

CPU Project in Western Digital: From Embedded Cores for Flash Controllers to Vision of Datacenter Processors with Open Interfaces

NVIDIA'S DEEP LEARNING ACCELERATOR MEETS SIFIVE'S FREEDOM PLATFORM. Frans Sijstermans (NVIDIA) & Yunsup Lee (SiFive)

Domain-Specific Acceleration via AndeStar V5 Processors

The Nios II Family of Configurable Soft-core Processors

Designing, developing, debugging ARM Cortex-A and Cortex-M heterogeneous multi-processor systems

The Next Steps in the Evolution of Embedded Processors

Optimizing ARM SoC s with Carbon Performance Analysis Kits. ARM Technical Symposia, Fall 2014 Andy Ladd

Free Chips Project: a nonprofit for hosting opensource RISC-V implementations, tools, code. Yunsup Lee SiFive

SiFive FU540-C000 Manual v1p0. SiFive, Inc.

Linux-Ready RV-GC AndesCore with Architecture Extensions Charlie Su, Ph.D. CTO and SVP 2018/05/09

Fast Interrupts. Krste Asanovic, UC Berkeley / SiFive Inc. (Chair) Kevin Chen, Andes (Vice-Chair)

ARM ARCHITECTURE. Contents at a glance:

Chapter 5. Introduction ARM Cortex series

Contents of this presentation: Some words about the ARM company

SoC FPGAs. Your User-Customizable System on Chip Altera Corporation Public

SYSTEMS ON CHIP (SOC) FOR EMBEDDED APPLICATIONS

Growth outside Cell Phone Applications

HB0761 CoreRISCV_AXI4 v2.0 Handbook

ECE 353 Lab 4. General MIDI Explorer. Professor Daniel Holcomb Fall 2015

Zynq-7000 All Programmable SoC Product Overview

Department of Computer Science, Institute for System Architecture, Operating Systems Group. Real-Time Systems '08 / '09. Hardware.

Multi-core microcontroller design with Cortex-M processors and CoreSight SoC

SiFive FE310-G000 Manual v2p3. c SiFive, Inc.

REVOLUTIONIZING THE COMPUTING LANDSCAPE AND BEYOND.

Hercules ARM Cortex -R4 System Architecture. Processor Overview

The CoreConnect Bus Architecture

Hello, and welcome to this presentation of the STM32 Flash memory interface. It covers all the new features of the STM32F7 Flash memory.

The Next Steps in the Evolution of ARM Cortex-M

Course Introduction. Purpose: Objectives: Content: 27 pages 4 questions. Learning Time: 20 minutes

Design of an Efficient FSM for an Implementation of AMBA AHB in SD Host Controller

EECS 373 Design of Microprocessor-Based Systems

S2C K7 Prodigy Logic Module Series

THE NVIDIA DEEP LEARNING ACCELERATOR

A 1-GHz Configurable Processor Core MeP-h1

DQ8051. Revolutionary Quad-Pipelined Ultra High performance 8051 Microcontroller Core

HB0801 MiV_RV32IMAF_L1_AHB V2.0 Handbook

SEMICON Solutions. Bus Structure. Created by: Duong Dang Date: 20 th Oct,2010

LEON4: Fourth Generation of the LEON Processor

Buses. Maurizio Palesi. Maurizio Palesi 1

Universität Dortmund. ARM Architecture

Programmable Logic Design Grzegorz Budzyń Lecture. 15: Advanced hardware in FPGA structures

Each Milliwatt Matters

CISC RISC. Compiler. Compiler. Processor. Processor

IMPLEMENTATION OF SOC CORE FOR IOT ENGINE

CMP Conference 20 th January Director of Business Development EMEA

08 - Address Generator Unit (AGU)

ENHANCED TOOLS FOR RISC-V PROCESSOR DEVELOPMENT

15CS44: MICROPROCESSORS AND MICROCONTROLLERS. QUESTION BANK with SOLUTIONS MODULE-4

RM3 - Cortex-M4 / Cortex-M4F implementation

Section 6 Blackfin ADSP-BF533 Memory

Industrial-Strength High-Performance RISC-V Processors for Energy-Efficient Computing

C66x KeyStone Training HyperLink

STM32 MICROCONTROLLER

Introduction to Embedded Systems

Introduction to ASIC Design

The Challenges of System Design. Raising Performance and Reducing Power Consumption

Oberon M2M IoT Platform. JAN 2016

Kinetis Software Optimization

Diplomatic Design Patterns

The ARM Cortex-M0 Processor Architecture Part-1

Practical Hardware Debugging: Quick Notes On How to Simulate Altera s Nios II Multiprocessor Systems Using Mentor Graphics ModelSim

Software Defined Modem A commercial platform for wireless handsets

Revolutionary Quad-Pipelined Ultra High Performance 16/32-bit Microcontroller v. 6.05

Chapter 15 ARM Architecture, Programming and Development Tools

Keywords- AMBA, AHB, APB, AHB Master, SOC, Split transaction.

Copyright 2014 Xilinx

ARM Cortex -M7: Bringing High Performance to the Cortex-M Processor Series. Ian Johnson Senior Product Manager, ARM

Performance Optimization for an ARM Cortex-A53 System Using Software Workloads and Cycle Accurate Models. Jason Andrews

Test and Verification Solutions. ARM Based SOC Design and Verification

EEM870 Embedded System and Experiment Lecture 3: ARM Processor Architecture

The Embedded Computing Platform

Analyzing and Debugging Performance Issues with Advanced ARM CoreLink System IP Components

Hi Hsiao-Lung Chan, Ph.D. Dept Electrical Engineering Chang Gung University, Taiwan

Purpose This course provides an overview of the SH-2A 32-bit RISC CPU core built into newer microcontrollers in the popular SH-2 series

VLSI Design of Multichannel AMBA AHB

Design of an AMBA AHB Reconfigurable Arbiter for On-chip Bus Architecture

Welcome to this presentation of the STM32 direct memory access controller (DMA). It covers the main features of this module, which is widely used to

PowerPC 740 and 750

L2 - C language for Embedded MCUs

Tomasz Włostowski Beams Department Controls Group Hardware and Timing Section. Developing hard real-time systems using FPGAs and soft CPU cores

C66x KeyStone Training HyperLink

Designing with Nios II Processor for Hardware Engineers

RISC V - Architecture and Interfaces

AMBA AHB Bus Protocol Checker

Transcription:

RISC-V Core IP Products An Introduction to SiFive RISC-V Core IP Drew Barbier September 2017 drew@sifive.com

SiFive RISC-V Core IP Products This presentation is targeted at embedded designers who want to learn more about SiFive s RISC-V Core IP products. This presentation will introduce the SiFive RISC-V Core IP products, the E31 Core Complex and the E51 Core Complex 2

3 Part Webinar Series RISC-V 101 The Fundamentals of RISC-V architecture Recording: https://info.sifive.com/risc-v-webinar Introduction to SiFive RISC-V Core IP Introduction to the SiFive E31 and E51 Core Complexes Getting Started with SiFive RISC-V Core IP November 2017 3

How to ask questions 4

SiFive Introduction Jack Kang, VP of Product and Business Development

Introduction to SiFive Founded by the inventors of RISC-V Maintain the open-source Freedom Platform https://github.com/sifive/freedom Including popular Rocket RISC-V Core License RISC-V cores Build custom SoCs based on the Freedom Platform Bring the power of open-source to hardware 6

SiFive Products: RISC-V SoCs and RISC-V IPs Tailored RISC-V Solutions for both Chip and System Designers RISC-V Core IP SiFive Freedom SoCs Low-power, 32-bit and 64-bit Embedded CPU IP Standard RISC-V extensions and privileged modes Physical Memory Protection Microcontrollers, IOT, Housekeeping cores High-performance, Unixcapable, 32-bit and 64-bit CPU IP Standard RISC-V extensions and privileged modes Virtual Memory Support Application Processors, Datacenter Accelerators Low-power, 32-bit microcontrollers TSMC 180nm Digital and Analog peripherals Edge Computing (AI), Embedded, Smart IOT, Wearables High-performance, 64-bit multi-core SoCs TSMC 28nm Cache coherent accelerator support High speed peripherals: PCIe Gen3, GbE, DDR3/4 Datacenter Accelerators, Storage, SSD Controllers, Networking, Baseband 7

SiFive changing the IP Business Model www.sifive.com Detailed information Datasheet Power, Perf, Area Free No NDA No Registration FPGA Bitstream Evaluation RTL Freedom Studio IDE Free Evaluation License Configure Online Transparent Pricing 7-page license agreement 8

E-Series Products

E31 Core Complex High Performance RISC-V MCU 32-bit, RV32IMAC RISC-V CPU Core In-order, 5-6 stage variable pipeline Silicon proven, most widely deployed RISC-V core in the world Choice of buses: TileLink, APB, AHB, AXI4 Smaller area, lower power compared to competing ISAs Designed for high performance embedded systems: Smart IOT Wearables Embedded Microcontrollers 10

E31 Core Complex High Performance RISC-V MCU Higher Performance 320+MHz in TSMC 180G 515.2+ Total DMIPS, 873.6 Total CoreMarks at TSMC180G Flexible Memory Architecture I-Cache can be reconfigured into I-Cache + ITIM DTIM for fast on Core Complex Data Access Off Core Complex memory access through System and Peripheral Ports Supports Local and Global Interrupts 271 total interrupts 8 Region Physical Memory Protection Unit Specification 28nm HPC 55nm LP E31 Core Complex Area* (with 16KB I$/16KB DTIM) 0.187 mm 2 0.593 mm 2 E31 Core Only Area* (w/o SRAM) 0.0286 mm 2 0.0507 mm 2 Frequency Power (DMIPs/mW) *Area calculated with 85% utilization Typical: 1.45 GHz Worst: 900 MHz 59.4 DMIPs/mW at 1.4 GHz 2300 Total DMIPS Typical: 540 MHz Worst: 320 MHz 20.8 DMIPs/mW at 500 MHz 805 Total DMIPS 11

E31 Core Complex Feature Table Features Architecture Modes Supported Microarchitecture Hardware Multiply/Divide (M) Atomic Support (A) Compressed Support (C) E31 Details RV32IMAC Machine and User 5+ stages, in order execution Multiplier is pipelined, 8 bits per cycle Divide is 1 bit per cycle, with early termination PMP Regions 8 Branch Prediction Instruction Subsystem Data Subsystem Interrupts Platform Level Interrupt Controller Core Local Interruptor Debug Yes Yes 40-entry BTB, 128-entry BHT, 2-entry RAS 16KB, 2-way associative Instruction Cache Can re-configure up to 8KB as ITIM Up to 64KB Data Tightly Integrated Memory 16 Local Interrupts, with vectored addresses 255 inputs, 7 priority levels 1 timer interrupt,1 SW interrupt JTAG Support, 2 programmable breakpoints 12

E51 Core Complex 64-bit RISC-V MCU 64-bit, RV64IMAC RISC-V CPU Core In-order, 5-6 stage variable pipeline Smallest 64-bit core in the market today Choice of buses: TileLink, APB, AHB, AXI4 64-bit core makes it ideal for integration into larger 64-bit SoC Systems Example uses: Minion Core, Boot Processor, Security Processor Other embedded applications that benefit: SSD Controllers Networking Applications 13

E51 Core Complex 64-bit RISC-V MCU Better integer performance than 32- bit or smaller architectures Supports the RISC-V C Extension Smaller code size than competing 64bit architectures Extended memory map Support for up to 40 physical address bits Supports Local and Global Interrupts 527 total interrupts Specification 28nm HPC 55nm LP E51 Core Complex Area* (with 16KB I$/16KB DTIM) 0.236 mm 2 0.704 mm 2 E51 Core Only Area* (w/o SRAM) 0.0473 mm 2 0.0877 mm 2 Frequency Power (DMIPs/mW) *Area calculated with 85% utilization Typical: 1.45 GHz Worst: 900 MHz 58.0 DMIPs/mW at 1.4 GHz 2585 Total DMIPS Typical: 540 MHz Worst: 320 MHz 24.7 DMIPs/mW at 500 MHz 905 Total DMIPS 14

E51 Core Complex Feature Table Features Architecture Modes Supported Microarchitecture Hardware Multiply/Divide (M) Atomic Support (A) Compressed Support (C) E51 Details RV64IMAC Machine and User 5+ stages, in order execution Multiplier is pipelined, 8 bits per cycle Divide is 1 bit per cycle, with early termination PMP Regions 8 Branch Prediction Instruction Subsystem Data Subsystem Interrupts Platform Level Interrupt Controller Core Local Interruptor Debug Yes Yes 40-entry BTB, 128-entry BHT, 2-entry RAS 16KB, 2-way associative Instruction Cache Can re-configure up to 8KB as ITIM Up to 64KB Data Tightly Integrated Memory 16 Local Interrupts, with vectored addresses 511 inputs, 7 priority levels 1 timer interrupt,1 SW interrupt JTAG Support, 2 programmable breakpoints 15

E-Series Core Complex Features

E-Series Core Complex Clocking core_clock Main CPU and L1 memory clock clock Secondary peripheral clock Frequency must be an integer multiple of core_clock 1:1 ratio is acceptable rtc_toggle Real Time Clock input as defined by RISC-V Architecture Must run at strictly less than half the rate of clock core_clock clock > (2x rtc_toggle) 17

SiFive RISC-V Core IP Interrupts 2 Interrupt Types: Local and Global Local Interrupts Connected directly to the core = Lower Latency Can be vectored = Lower Latency Including additional Local Interrupts 0-15 Global Interrupts Interrupts routed through the Platform-Level Interrupt Controller (PLIC) Programmable priority levels and thresholds The PLIC is in clock domain Overall Core Complex Prioritization of Interrupts: Local Interrupts > Machine External > Machine Software > Machine Timer 18

SiFive Local Interrupts Can be Vectored by setting the LSB of mtvec Allows for jumping directly to an interrupt handler without first reading mcause Local Interrupts have a fixed priority scheme The higher the Interrupt ID, The higher the Priority Local Interrupt 15 is the highest priority Interrupt in the system and special due to its position in the vector table - no jump necessary 19

Core Local Interruptor (CLINT) Used to generate Software and Timer Interrupts Contains the RISC-V mtime and mtimecmp memory mapped CSRs The msip memory mapped CSR can be used to generate Machine Software Interrupts Multi-Core scalable and consistent register map across SiFive RISC-V Core IP 20

Global Interrupts and the PLIC The PLIC handles the majority of the Core Complex s Interrupts Priority Registers 4B registers containing 3-bit interrupt priority 1 is lowest priority, 7 is the highest, 0 disables Pending and Enable Registers Bit packed registers Threshold Register Only interrupts with Priority > Threshold will trigger an interrupt Claim/Complete Returns the ID of the highest pending interrupt Interrupt completion is signaled to the PLIC by software writing the ID back to this register 21

Max ITIM I-Cache Re-Configurability I$ W1 I$ W1 Remaining cache now acts as a direct mapped cache By default Core Complexes boot with all L1 Instruction Cache ways enabled The Instruction Cache allows mapping of all or part of a Cache Way into the memory space dynamically This memory space is called Instruction Tightly Integrated Memory (ITIM) Use ITIM for deterministic timing When re-configuring entire cache way The other way acts as a direct mapped cache When re-configuring part of a cache way Cache still behaves as a 2 way cache with the remaining matching sets in the cache The orphaned cache set then act as a direct mapped cache Normal Configuration ¼ ITIM I$ W0 I$ W1 I$ W0 Normal Configuration Reconfigured Reconfigured ITIM Entire Way Reconfigured I$ W1 I$ DM I$ W0 ITIM ½ Way Reconfigured ITIM is now mapped into the Address space at base address: 0x0800_0000 Remaining cache sets remain a 2-way cache Orphaned cache sets act as a direct mapped cache ITIM is now mapped into the Address space at base address: 0x0800_0000 22

Instruction Cache Re-Configurability How To Create an ITIM by storing to it ITIM size is determined by the address written on allocation, rounded up to the nearest cache line (64B) Software should not make assumptions about ITIM contents on creation Deallocate ITIM Store 0x00 to the first byte after the ITIM region returns the entire ITIM space back to the Instruction Cache Store to 0x0800_1000 23

Instruction Cache Re-Configurability How To Create an ITIM by storing to it ITIM size is determined by the address written on allocation, rounded up to the nearest cache line (64B) Software should not make assumptions about ITIM contents on creation Deallocate ITIM Store 0x00 to the first byte after the ITIM region returns the entire ITIM space back to the Instruction Cache Store to 0x0800_1000 24

Physical Memory Protection (PMP) 0xFFFF_FFFF 4 Byte Region Locked. Only accessible after a reset Locked Region Protects up to 8 regions of memory Enforces access permissions on U- Modes only Ability to Lock a region A locked region enforces permissions on all accesses, including M-Mode Only way to unlock a region is a Reset Minimum region size of 4 bytes Failed accesses generate a Load, Store, or Instruction access exception User Mode has full RWX Privileges User Mode has Read only Privileges User Mode has Execute only Privileges 0x0000_0000 User Mode Context User Machine Mode Mode Data Software Shared Library Code Example PMP Memory Map Entire address map is not accessible to U-Mode 25

PMP Register Descriptions Each region has pmpxcfg field in a pmpcfgx register Set R/W/X permissions Region lock bit Address matching scheme (next slide) pmpaddrx encodes: Most significant 32 bits of a 34 bit physical address for RV32 Most significant 54 bits of a 56 bit physical address for RV64 26

PMP Addressing Modes Top of Range (TOR) Uses pmpaddrx and pmpaddrx-1 to define a specific address range If region 0 is set to TOR, address 0x00 is the lower bound Naturally Aligned Power of 2 Region Uses first zero bit of pmpaddrx to encode region size (unary encoding) Region Size = 2^(zero bit position+3) Region Base = most significant bits above first 0 Enables fast context switching Single write to pmpaddrx is sufficient to map in a new Context //encodes base and address into RISC-V PMP NAPOT Address format get_pmp_napot_addr(base, size) { napot_size = ((size/2)-1) //napot_size = 0x7FFF } pmp_addr= (base + napot_size)>>2 //pmp_addr = 0x1000_01FF return pmp_addr; PMP NAPOT Pseudocode region_base=0x40000000 region_size=0x1000 write_csr(pmpaddr1, get_pmp_napot_addr(region_base, region_size)) 27

Core Complex Debug 2 Hardware Breakpoints Instruction and Data address matches are supported Hardware Performance Monitors 2x 40bit event counting CSRs: mhpmcounter3(h) and mhpmcounter4(h) 2x Event Selector CSRs: mhpevent3 and mhpevent4 Counters will increment each time a selected event occurs Can select multiple events per counter 28

Core Complex Memory Map and Interfaces Compatible memory maps across all SiFive RISC-V Core IP 3 Ports on each Core Complex Peripheral Port (TL-UL) Efficient low bandwidth bus, supports Atomics System Port (TL-UH) Higher bandwidth bus with burst support Front Port (TL-UH) Allows other masters to access Core Complex DTIM and ITIM address space AMBA bridge options Can bridge any port to APB, AHB, or AXI 29

Evaluate Today www.sifive.com

Evaluate FPGA Bitstreams Prebuilt FPGA Bitstreams Low cost $99 FPGA Dev Kit Fast! ~65MHz Evaluate code size, tools, CPU performance 31

Evaluate Verilog RTL Free, Instant Access Fully functional, synthesizable, Verilog RTL Synthesize into your SOC Simulate in your environment Determine your PPA for your process node Evaluation License automatically sent electronically Countersign electronically Instant access in your Developer Dashboard 32

Questions

3 Part Webinar Series RISC-V 101 The Fundamentals of RISC-V architecture Recording: https://info.sifive.com/risc-v-webinar Introduction to SiFive RISC-V Core IP Introduction to the SiFive E31 and E51 Core Complexes Getting Started with SiFive RISC-V Core IP November 2017 34

Resources https://riscv.org/ RISC-V Specifications Links to the RISC-V mailing lists Workshop proceedings GitHub https://github.com/sifive/ https://github.com/ https://www.sifive.com/ RISC-V Core IP and Development Boards RISC-V Tools Forums 35

36 36