PCIe driver development for Exynos SoC

PCIe driver development for Exynos SoC Korea Linux Forum 2013 Jingoo Han Samsung Electronics

Introduction S/W engineer at Samsung Electronics since 2005. Linux kernel development for Samsung Exynos ARM SoCs: developed Linux drivers including the PCIe driver, USB driver, framebuffer driver, and Display Port driver. Developed Samsung smartphones and tablet PCs.

Introduction Linux kernel maintainer since 2011: maintains 3 drivers and 1 subsystem, namely the Exynos PCIe driver, the Exynos Display Port driver, the Samsung framebuffer driver, and the Backlight subsystem.

Contents PCI Overview PCI Standard Exynos PCIe PCI in Device Tree PCI domain structure Mainline upstream Conclusion

PCI Overview

What is PCIe? PCI Express is a high-speed serial bus. PCI Express connects a computer to its attached peripheral devices. The PCI Express specification was developed by the PCI Special Interest Group (PCI-SIG). PCI Express is the most popular bus standard in computers these days.

History of PCI Express Conventional PCI (1992/1993) Conventional PCI appeared in 1992. It replaced the MCA, EISA, and VLB buses, provided better reliability than those legacy buses, and provided auto configuration via the PCI Configuration Space. PCI-X (1999) PCI-X enhanced conventional PCI for higher bandwidth. It provided compatibility with conventional PCI and was eventually replaced by PCI Express.

History of PCI Express PCI Express 1.0 (2003) In 2003, PCI-SIG introduced PCIe 1.0 with a transfer rate of 2.5 GT/s. PCI Express 2.0 (2007) PCIe 2.0 was announced in 2007. It doubled the transfer rate to 5 GT/s and improved the point-to-point data transfer protocol. PCI Express 3.0 (2010) PCIe 3.0 was released in 2010. The transfer rate is 8 GT/s. It upgraded the encoding scheme from 8b/10b to 128b/130b in order to enhance effective bandwidth.

Advantages of PCIe PCIe provides high bus bandwidth and high performance. The I/O pin count is lower thanks to serial signaling and an embedded clock. PCIe reduces latency and cost. Scalability is supported, according to the requirements of PCIe devices. Reliability is enhanced by detailed error detection and error reporting. PCIe supports hardware I/O virtualization.

PCI Standard

PCI Components Root complex connects the CPU and memory to the PCIe switches and endpoints. Switch is a logical assembly of multiple PCI-to-PCI bridge devices; it has one upstream port and one or more downstream ports. Endpoint can be a requester or a completer. [Diagram: PCI topology, with the CPU and memory behind the root complex, a switch below it, and endpoints at the leaves]

Address Spaces PCI has three physical address spaces for PCI bus transactions: the configuration, memory, and I/O address spaces. Configuration Address Space The Configuration Address Space has been defined to support PCI hardware configuration. It provides access to 256 bytes of special configuration registers per PCI device. PCI Express introduces an 'Extended Configuration Space' which extends the Configuration Space to 4096 bytes per function. The Configuration Read and Configuration Write commands are used to transfer data between the Configuration Address Spaces of PCI devices.
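In Linux, a driver never issues these configuration commands itself; it goes through the PCI core. A minimal sketch of reading two Configuration Space registers:

    #include <linux/pci.h>

    /* Offsets 0x00-0xFF are the conventional Configuration Space;
     * 0x100-0xFFF belong to the PCIe Extended Configuration Space. */
    static void demo_show_ids(struct pci_dev *pdev)
    {
            u16 vendor, device;

            pci_read_config_word(pdev, PCI_VENDOR_ID, &vendor);
            pci_read_config_word(pdev, PCI_DEVICE_ID, &device);
            dev_info(&pdev->dev, "vendor %04x device %04x\n", vendor, device);
    }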

Address Spaces Memory Address Space The memory address space is used for burst transactions. Two different address formats are used: the 32-bit address format and the 64-bit address format. The Memory Read and Memory Write commands are used to transfer data in the Memory Address Space. I/O Address Space The I/O address space exists for compatibility with the x86 architecture's I/O port address space. Future revisions of the PCI Express specification may deprecate the use of the I/O Address Space. The I/O Read and I/O Write commands are used to transfer data in the I/O Address Space.
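A Linux driver normally does not need to care which of the two spaces a BAR lives in; a minimal sketch using helpers that pick the matching Memory or I/O transactions:

    #include <linux/pci.h>

    static int demo_map_bar0(struct pci_dev *pdev)
    {
            void __iomem *regs;
            u32 val;

            /* Map BAR 0 whether it is a Memory BAR or an I/O BAR
             * (maxlen 0 maps the whole region). */
            regs = pci_iomap(pdev, 0, 0);
            if (!regs)
                    return -ENOMEM;

            val = ioread32(regs);   /* first 32-bit register in BAR 0 */
            dev_info(&pdev->dev, "reg0 = %08x\n", val);

            pci_iounmap(pdev, regs);
            return 0;
    }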

Configuration Space PCI devices have a set of registers referred to as 'Configuration Space'. This configuration space is used for auto configuration of inserted PCI cards. Configuration Space registers are mapped to memory locations, and the PCI framework software accesses the configuration space on behalf of drivers.

Configuration Space Device Identification The Vendor ID identifies the manufacturer of the device. The Device ID identifies the particular device. The Revision ID specifies a device-specific revision identifier. The Class Code identifies the generic function of the device. The Header Type defines the layout of the header. [Figure: standard registers of the Configuration Space header]
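These identification registers are what the Linux PCI core matches against a driver's ID table when binding drivers to devices; a minimal sketch (the 0xa800 device ID is hypothetical):

    #include <linux/module.h>
    #include <linux/pci.h>

    static int demo_probe(struct pci_dev *pdev, const struct pci_device_id *id)
    {
            return pci_enable_device(pdev);     /* bound: IDs matched */
    }

    static void demo_remove(struct pci_dev *pdev)
    {
            pci_disable_device(pdev);
    }

    static const struct pci_device_id demo_ids[] = {
            { PCI_DEVICE(0x144d, 0xa800) },     /* hypothetical Vendor/Device ID pair */
            { }                                 /* terminating entry */
    };
    MODULE_DEVICE_TABLE(pci, demo_ids);

    static struct pci_driver demo_driver = {
            .name     = "demo-pci",
            .id_table = demo_ids,
            .probe    = demo_probe,
            .remove   = demo_remove,
    };
    module_pci_driver(demo_driver);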

Configuration Space Device Control & Status The Command register provides control over a device's ability to generate and respond to PCI cycles. The Status register is used to record status information. Base Address Registers Base Address Registers are mapped into the Memory Address Space or the I/O Address Space. [Figure: standard registers of the Configuration Space header]

Configuration Space Capabilities List Certain capabilities are supported by adding a set of registers to a linked list called the Capabilities List. The Capabilities Pointer register points to the first item in the list. Each capability in the list consists of an 8-bit ID field assigned by the PCI-SIG and an 8-bit pointer in configuration space to the next capability. [Figure: example of a Capabilities List, alongside the standard registers of the Configuration Space header]
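The Linux PCI core walks this ID/next-pointer chain for the driver; a minimal sketch locating the PCI Express capability:

    #include <linux/pci.h>

    static void demo_show_link(struct pci_dev *pdev)
    {
            int pos;
            u16 lnksta;

            /* Follow the Capabilities List until the PCIe capability ID. */
            pos = pci_find_capability(pdev, PCI_CAP_ID_EXP);
            if (!pos)
                    return;         /* not a PCI Express device */

            pci_read_config_word(pdev, pos + PCI_EXP_LNKSTA, &lnksta);
            dev_info(&pdev->dev, "PCIe link status: %#x\n", lnksta);
    }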

Interrupts INTx Conventional PCI introduced INTx: there are 4 hard-wired interrupt pins. INTx is level-triggered for robustness. MSI (Message Signaled Interrupts) MSI is an in-band method of signaling an interrupt by writing data to a specific memory region. MSI provides up to 32 interrupts per device, and there is a slight performance enhancement. MSI-X MSI-X provides up to 2048 interrupts.
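A minimal sketch of switching a device from INTx to MSI with the kernel-3.x-era API, falling back to the shared INTx line when MSI is unavailable:

    #include <linux/interrupt.h>
    #include <linux/pci.h>

    static irqreturn_t demo_irq(int irq, void *data)
    {
            /* acknowledge the device here, then: */
            return IRQ_HANDLED;
    }

    static int demo_setup_irq(struct pci_dev *pdev)
    {
            unsigned long flags = 0;

            if (pci_enable_msi(pdev))
                    flags = IRQF_SHARED;    /* MSI unavailable: shared INTx */

            /* With MSI enabled, pdev->irq is the MSI vector. */
            return request_irq(pdev->irq, demo_irq, flags, "demo", pdev);
    }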

Exynos PCIe

Exynos PCIe H/W EXYNOS ARM-based SoC made by Samsung Electronics. The name originates from the Greek words smart (exypnos) and green (prasinos). It enhances performance and reduces power consumption.

Exynos PCIe H/W Features There are two root complex controllers. PCI Express 2.0 is supported. There are 4 lanes. MSI is available. [Diagram: H/W architecture of Exynos PCIe, the DesignWare PCIe Core IP plus an Exynos-specific PCIe application controller]

Exynos PCIe H/W DesignWare PCIe Core IP The DesignWare PCIe Core IP supports the PCI Express protocol layers, such as the Transaction Layer, Data Link Layer, and Physical Layer. However, this PCIe Core IP does not implement a full root complex. Exynos-specific PCIe application controller The application controller is necessary to support full root complex functionality. [Diagram: H/W architecture of Exynos PCIe]

Exynos PCIe S/W The common part controls the DesignWare PCIe Core and can be shared by various PCIe drivers. The Exynos-specific part controls the Exynos-specific application controller and the PHY controller. The PCI framework allocates resources. [Diagram: S/W architecture of Exynos PCIe, a common DesignWare core PCIe driver plus an Exynos-specific PCIe driver]
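In code, the split can be pictured roughly like this, a simplified sketch loosely modeled on drivers/pci/host/pci-exynos.c and pcie-designware.c (names and details are elided, not the exact mainline implementation):

    #include <linux/platform_device.h>
    #include "pcie-designware.h"    /* declarations of the common DesignWare part */

    static int exynos_pcie_probe(struct platform_device *pdev)
    {
            struct pcie_port *pp;

            pp = devm_kzalloc(&pdev->dev, sizeof(*pp), GFP_KERNEL);
            if (!pp)
                    return -ENOMEM;

            /* Exynos-specific part: ioremap the application/PHY registers,
             * enable the "pcie"/"pcie_bus" clocks, and bring up the PHY. */

            /* Common part: sets up resources and configuration access for
             * any driver built on the DesignWare PCIe core. */
            return dw_pcie_host_init(pp);
    }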

PCI in Device Tree

Device tree usage Device Tree Device tree usage is necessary because Exynos supports the Device Tree. The device tree is a data structure for describing hardware. Many aspects of the hardware can be described in a data structure that is passed to the operating system at boot time. The data structure itself is a simple tree of named nodes and properties. Nodes contain properties and child nodes. Properties are simple name-value pairs. The structure can hold any kind of data.

    node {
            property = <value>;
            ...
    };

Device tree usage SoC Specific Properties There are several SoC-specific properties that describe the relationship between the PCIe controller and other controllers. The reg property is necessary for accessing the PCIe registers. The interrupts property describes the interrupt numbers that can be used for the PCIe hardware. The clocks and clock-names properties describe the clocks used for the PCIe hardware.

    pcie@290000 {
            reg = <0x290000 0x1000 0x270000 0x1000 0x271000 0x40>;
            interrupts = <0 20 0>, <0 21 0>, <0 22 0>;
            clocks = <&clock 28>, <&clock 27>;
            clock-names = "pcie", "pcie_bus";
    };

Example of PCI in Device Tree [Diagram: CPU, interrupt controller, and clock controller connected to the Exynos PCIe block]
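As a rough illustration of how a driver consumes these properties, a minimal sketch using the standard Linux DT helpers (error handling trimmed; the "pcie" clock name matches the clock-names property above):

    #include <linux/platform_device.h>
    #include <linux/clk.h>
    #include <linux/io.h>

    static int pcie_demo_probe(struct platform_device *pdev)
    {
            /* reg property -> MMIO resource */
            struct resource *res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
            void __iomem *base = devm_ioremap_resource(&pdev->dev, res);
            /* interrupts property -> Linux IRQ number */
            int irq = platform_get_irq(pdev, 0);
            /* clocks/clock-names properties -> clock handle */
            struct clk *clk = devm_clk_get(&pdev->dev, "pcie");

            if (IS_ERR(base) || irq < 0 || IS_ERR(clk))
                    return -ENODEV;

            return clk_prepare_enable(clk);
    }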

Device tree usage PCI Address Translation The PCI address space is completely separate from the CPU address space, so address translation is necessary. For memory regions, the PCI addresses are usually identical to the CPU addresses; this is called identity mapping. The I/O regions, however, must be translated, since PCI I/O addresses start at zero. [Diagram: PCI and CPU address maps; the MEM region is identity-mapped around 0x4000_0000-0x4001_1000 while the I/O and CFG regions are translated]

Device tree usage PCI Address Translation The ranges property is used for PCI address translation. In each entry, the first cells give the PCI address, the following cells give the CPU address, and the last cells give the size of the region. The ranges property must be present in the Device Tree in order to merge the PCI driver into the mainline kernel.

    pcie@290000 {
            ...
            ranges = <0x00000800 0 0x40000000   0x40000000   0 0x00001000    /* configuration */
                      0x81000000 0 0x00000000   0x40001000   0 0x00010000    /* I/O */
                      0x82000000 0 0x40011000   0x40011000   0 0x1ffef000>;  /* memory */
    };

Example of the PCI ranges property in Device Tree

PCI domain structure

Multi-domain vs Single domain A PCI domain is defined to be a set of PCI buses. There are two types of PCI domain structure: the multi-domain structure and the single-domain structure. The PCI domain structure should be chosen according to the hardware architecture.

Multi-domain Multi-domain structure The multi-domain structure provides separate PCI domains. Each root complex has separate PCI memory address regions. The multi-domain structure is easy to implement. [Diagram: multi-domain structure, with domain 0 (bus 0, device 0 on PCIe 0) and domain 1 (bus 0, device 0 on PCIe 1)]

Multi-domain Multi-domain structure in Device Tree With multiple PCI domains, there is a separate node for each domain in the Device Tree. Separate PCI Memory/IO/Configuration address regions are defined by each node's ranges property.

    pcie@290000 {
            ranges = <...>;
    };

    pcie@2a0000 {
            ranges = <...>;
    };

Example of Device Tree for the multi-domain structure [Diagram: two disjoint MEM/IO/CFG windows in the CPU address space]

Single domain Single domain structure The single-domain structure provides only one PCI domain. One PCI memory address region can be shared, and all ports are available under the single domain. The single-domain structure can save memory regions; however, it is a little more difficult to implement. [Diagram: single-domain structure, with domain 0 containing bus 0 with device 0 (PCIe 0) and device 1 (PCIe 1)]

Single domain Single domain structure in Device Tree In a single PCI domain, there is only one controller node in the Device Tree. The PCI Memory and IO address regions are shared. However, the configuration regions are defined per port via assigned-addresses properties.

    pcie-controller@290000 {
            ranges = <...>;

            pci@1,0 {
                    assigned-addresses = <...>;
            };

            pci@2,0 {
                    assigned-addresses = <...>;
            };
    };

Example of Device Tree for the single-domain structure [Diagram: shared MEM and I/O windows, with separate CFG windows per port]

Exynos PCIe domain structure The Exynos PCIe driver uses the multi-domain structure. The Exynos PCIe hardware was designed for a multi-domain structure, not a single-domain structure: the PCIe root complex controllers cannot share PCIe memory regions. [Diagram: domain 0 (bus 0, device 0 on PCIe 0) with a 512MB MEM window at CPU address 0x4000_0000, domain 1 (bus 0, device 0 on PCIe 1) with a 512MB MEM window at 0x6000_0000]

Mainline upstream

Mainline upstream Benefits of mainline upstream Mainline upstream increases the quality and robustness of the driver. Mainline upstream reduces customer support in the long term. It is easy to back-port new patches from the latest mainline kernel. Requirements of mainline upstream It is not easy to upstream a driver to the mainline kernel. It is necessary to keep up with discussions on the mailing list and to discuss actively with maintainers and other contributors.

Mainline upstream for Exynos PCIe New subdirectory for PCI drivers Previously, each PCIe driver was added to its machine directory (arch/arm/mach-*/pci.c). A new subdirectory for PCIe host drivers was created in the 3.11 kernel, and the Exynos PCIe driver was added to the drivers/pci/host/ directory (drivers/pci/host/pci-*.c). Advantages of the new subdirectory It is possible to avoid duplication by sharing common functions, and it reduces machine dependency, so a PCIe driver can easily be reused on different architectures.

Mainline upstream for Exynos PCIe Exynos PCIe upstream The 1st patchset was submitted to the mailing list in March 2013. There were a lot of review comments from maintainers and developers on the mailing list. The patchset was revised and resubmitted 10 times over about 4 months. Finally, the 10th patchset was merged for the 3.11 kernel. [Timeline: v1 on 2013-03-04 and v9/v10 on 2013-06-21 from the internal development git (Jingoo Han), pulled into the ARM-SOC tree (Arnd Bergmann) on 2013-06-27, released in 3.11-rc1 (7/14) of the mainline kernel (Linus Torvalds)]

Exynos PCIe in 3.11 kernel v3.11 kernel The PCI maintainer and the Samsung SoC maintainer gave their ACKs. The Exynos PCIe driver was applied to the ARM-SOC tree and merged into the 3.11 kernel via the ARM-SOC tree. [Timeline: applied to the ARM-SOC tree (Arnd Bergmann) on 2013-06-21/27, between 3.10-rc6 (6/15) and 3.10 (6/30), released in 3.11-rc1 (7/14) of the mainline kernel (Linus Torvalds)]

Exynos PCIe in 3.12 kernel v3.12 kernel The Exynos PCIe driver was split into two parts: a common part and an Exynos-specific part. The common layer can be re-used by other vendors' SoCs using the DesignWare PCIe Core IP, improving re-usability, stability, and robustness. [Diagram: Exynos PCIe driver = common DesignWare core PCIe driver + Exynos-specific PCIe driver]

Exynos PCIe in 3.13 kernel v3.13 kernel MSI will be supported. A MAINTAINERS entry for the Exynos PCIe driver will be added. Some minor patches will be merged via the PCI tree.

Mainline upstream for PCI host PCI host merge strategy Three PCI drivers (Marvell Armada, Samsung Exynos, Nvidia Tegra) were merged into the mainline kernel via the ARM-SOC tree (Arnd Bergmann, Olof Johansson) in 3.11-rc1 (7/14) and 3.12-rc1 (9/16). New drivers (Freescale i.MX6, Renesas R-Car) will be merged into the 3.13 kernel (3.13-rc1, 11/17) via the PCI maintainer's git (Bjorn Helgaas). Depending on the situation, a PCI driver can be merged via either of these two trees.

Mainline upstream for PCI host PCI host maintainers The PCI host maintainership was announced by the PCI maintainer. Each PCI driver should be reviewed by its maintainer. Samsung Exynos: Jingoo Han; Marvell Armada: Jason Cooper, Andrew Lunn, Gregory Clement; Nvidia Tegra: Thierry Reding; Freescale i.mx6: Shawn Guo

Conclusion

How to validate Exynos PCIe The following features are tested: network support and storage support. Network support With a PCIe LAN card, networking works properly between the Exynos board and the network. [Diagram: EXYNOS - PCIe - PCIe LAN card - network]

How to validate Exynos PCIe Storage support A PCIe SATA card and a SATA disk are tested. File read/write works properly between the Exynos board and the SATA disk. [Diagram: EXYNOS - PCIe - PCIe SATA card - SATA disk]

Future work Power optimization Power optimization should be considered. I/O Virtualization I/O virtualization allows multiple operating systems within a single computer to natively share PCIe devices. The SR-IOV feature will be supported. PCIe Switch PCIe switch functionality will be tested with PCIe switch devices.

Conclusion Samsung Exynos ARM SoCs can support PCI Express. The mainline Linux kernel has supported the Exynos PCIe driver since the v3.11 kernel.

Reference PCI Express Base Specification Revision 3.0 (PCI-SIG, 2010) http://devicetree.org/Device_Tree_Usage http://en.wikipedia.org/wiki/PCI_Express

Q & A