Connecting the EDG front-end to LLVM. Renato Golin, Evzen Muller, Jim MacArthur, Al Grant ARM Ltd.

Similar documents
Fuzzing Clang to find ABI Bugs. David Majnemer

GCC and LLVM collaboration

LLVM, Clang and Embedded Linux Systems. Bruno Cardoso Lopes University of Campinas

KOTLIN/NATIVE + CLANG, TRAVEL NOTES NIKOLAY IGOTTI, JETBRAINS

Annotation Annotation or block comments Provide high-level description and documentation of section of code More detail than simple comments

15-411: LLVM. Jan Hoffmann. Substantial portions courtesy of Deby Katz

LDC: The LLVM-based D Compiler

Recitation #11 Malloc Lab. November 7th, 2017

TMS470 ARM ABI Migration

Skip the FFI! Embedding Clang for C Interoperability. Jordan Rose Compiler Engineer, Apple. John McCall Compiler Engineer, Apple

Modern Processor Architectures. L25: Modern Compiler Design

Tutorial: Building a backend in 24 hours. Anton Korobeynikov

Accelerating Stateflow With LLVM

Tokens, Expressions and Control Structures

Heap Arrays and Linked Lists. Steven R. Bagley

Hardening LLVM with Random Testing

McSema: Static Translation of X86 Instructions to LLVM

Heap Arrays. Steven R. Bagley

Lecture Topics. Administrivia

INTRODUCTION TO LLVM Bo Wang SA 2016 Fall

Fast, precise dynamic checking of types and bounds in C

Data Storage. August 9, Indiana University. Geoffrey Brown, Bryce Himebaugh 2015 August 9, / 19

CS504 4 th Quiz solved by MCS GROUP

Targeting LLVM IR. LLVM IR, code emission, assignment 4

Exception Namespaces C Interoperability Templates. More C++ David Chisnall. March 17, 2011

Keeping Up With The Linux Kernel. Marc Dionne AFS and Kerberos Workshop Pittsburgh

Midterm CSE 131B Spring 2005

C Programming. Course Outline. C Programming. Code: MBD101. Duration: 10 Hours. Prerequisites:

QUIZ. What is wrong with this code that uses default arguments?

Frama-Clang, a C++ front-end for Frama-C

Introduction to Modern Fortran

Tutorial: Building a backend in 24 hours. Anton Korobeynikov

Linux Kernel Evolution. OpenAFS. Marc Dionne Edinburgh

Supporting the new IBM z13 mainframe and its SIMD vector unit

Non 8-bit byte support in Clang and LLVM Ed Jones, Simon Cook

Getting started. Roel Jordans

Heap Management. Heap Allocation

TILE PROCESSOR APPLICATION BINARY INTERFACE

COMP-520 GoLite Tutorial

Modern Processor Architectures (A compiler writer s perspective) L25: Modern Compiler Design

BUD Navigating the ABI for the ARM Architecture. Peter Smith

Assignment 1c: Compiler organization and backend programming

Object Code Emission & llvm-mc

EL2310 Scientific Programming

FreeBSD on latest ARM Processors

Deallocation Mechanisms. User-controlled Deallocation. Automatic Garbage Collection

Machine Language Instructions Introduction. Instructions Words of a language understood by machine. Instruction set Vocabulary of the machine

Modular Codegen. Further Benefits of Explicit Modularization

LLVM and IR Construction

Inheritance, Polymorphism and the Object Memory Model

High Performance Computing in C and C++

Libabigail & ABI compatibility

Who are we? Andre Platzer Out of town the first week GHC TAs Alex Crichton, senior in CS and ECE Ian Gillis, senior in CS

A Fast Review of C Essentials Part I

Declaration Syntax. Declarations. Declarators. Declaration Specifiers. Declaration Examples. Declaration Examples. Declarators include:

LLVM Auto-Vectorization

But first, encode deck of cards. Integer Representation. Two possible representations. Two better representations WELLESLEY CS 240 9/8/15

CPU Toolchain Launch Postmortem. Greg Bedwell

DDMD AND AUTOMATED CONVERSION FROM C++ TO D

Assembly Language Programming Linkers

PIC 10A Objects/Classes

PIC 10A Pointers, Arrays, and Dynamic Memory Allocation. Ernest Ryu UCLA Mathematics

CCStudio 3.1 Migration and Key Improvements. Last Modified: 3 April 2008

Adventures with LLVM in a magical land where pointers are not integers

Implementing Interfaces. Marwan Burelle. July 20, 2012

G52CPP C++ Programming Lecture 20

ELEC 377 C Programming Tutorial. ELEC Operating Systems

Non-numeric types, boolean types, arithmetic. operators. Comp Sci 1570 Introduction to C++ Non-numeric types. const. Reserved words.

Contract Programming For C++0x

Francesco Nidito. Programmazione Avanzata AA 2007/08

ENCM 369 Winter 2017 Lab 3 for the Week of January 30

MODERN AND LUCID C++ ADVANCED

DSP Mapping, Coding, Optimization

Distributed Systems 8. Remote Procedure Calls

C++ Coding Standards. 101 Rules, Guidelines, and Best Practices. Herb Sutter Andrei Alexandrescu. Boston. 'Y.'YAddison-Wesley

The Foundation of C++: The C Subset An Overview of C p. 3 The Origins and History of C p. 4 C Is a Middle-Level Language p. 5 C Is a Structured

LLVM code generation and implementation of nested functions for the SimpliC language

CSE 374 Programming Concepts & Tools. Hal Perkins Spring 2010

Indexing Large, Mixed- Language Codebases. Luke Zarko

1. Introduction to the Common Language Infrastructure

RAD Studio XE3 The Developer Force Multiplier

Baggy bounds with LLVM

Lecture 3: C Programm

ECE 598 Advanced Operating Systems Lecture 10

EuroLLVM Adrien Guinet 2018/04/17

Data Types. Every program uses data, either explicitly or implicitly to arrive at a result.

Other array problems. Integer overflow. Outline. Integer overflow example. Signed and unsigned

Rethinking the core OS in 2015

How to cross compile with LLVM based tools. Peter Smith, Linaro

C6000 Compiler Roadmap

QUIZ. What are 3 differences between C and C++ const variables?

Proposal to Acknowledge that Garbage Collection for C++ is Possible X3J16/ WG21/N0932. Bjarne Stroustrup. AT&T Research.

Assignment 5: Priority Queue

An Introduction to Subtyping

DEVIRTUALIZATION IN LLVM

Deep C (and C++) by Olve Maudal

High Performance Computing and Programming, Lecture 3

Programming with MPI

ECE 598 Advanced Operating Systems Lecture 12

Hacking in C. Pointers. Radboud University, Nijmegen, The Netherlands. Spring 2019

Transcription:

Connecting the EDG front-end to LLVM Renato Golin, Evzen Muller, Jim MacArthur, Al Grant ARM Ltd. 1

Outline Why EDG Producing IR ARM support 2

EDG Front-End LLVM already has two good C++ front-ends, why add yet another one? We already use EDG in our compiler and would like to know how LLVM would support all EDG features and extensions (plus our modifications), at both IR and CodeGen levels EDG supports the complete set of C/C++ standards (all versions), plus many features of the new standard (C++0x) It's feature stable; newer versions of the compiler should still compile old code the same way Including extensions, platform choices Consistent error reports (especially with warnings as errors) Some ARM customers can t use GCC (GPLv3) and Clang is still immature 3

EDG Front-End The benefits for the LLVM community are: Cross-checking IR/code generation against a different front-end It s a fresh set of eyes on the same problems We found some CodeGen bugs (some patches sent) We found problems in structures, unions, bitfields As well as NEON, Exception Handling, Run-Time ABI We tested using vectors instead of structures for NEON So far, it s generating good code Extend IR/code generation support for the additional features, mainly EABI Some EDG users are asking about the bridge... 4

EDG + LLVM Why EDG Producing IR ARM support 5

General Impressions LLVM is a great platform to work on It s easy to generate IR / Metadata Good IR validation infrastructure Asserts get most of the consistency problems Coding standard is well thought of and modern Makes good use of C++ capabilities Consistent, frequently refactored Has good ways to debug (dump, viewgraph) Generated code quality is good Good support for ARM architectures 6

Type System Sign information EDG has it on type, easier for front-end development LLVM has it on instructions, easier for codegeneration We need to keep track to do casts, widening... Ex: sext, zext, uitofp, sitofp...comparison functions Ex: icmp uge vs. icmp sge 7

Type System Default alignment in EDG: Type and variable alignment Variable overrides type alignment Structure and Union alignment Alignment in LLVM on instructions, globals, allocas Default alignment comes from Data Layout Special cases (structures, unions, bitfields) have to be carefully hand-crafted Still, alignment in instructions are necessary to make bitfields work properly Throw away EDG info and partially reconstruct later 8

Type System Automatic C casting EDG sometimes relies on C s implicit conversions Mostly on type promotions LLVM doesn t accept any implicit cast Adding non trivial helpers to avoid bloating the IR Fiddling with types and sizes to determine the expected implicit cast Caused many LLVM failures by being too permissive or getting the cast wrong Took us a while to be auto-cast free, but it was worth it 9

C++ support C++ support was fairly simple EDG lowers C++ into C, so most support was already present Easier parts were Classes became structures, virtual functions became calls to pointers in an array Static construction/destruction identical to LLVM Bigger problems were Exception Handling (no EHABI) EDG storage class doesn t map directly to LLVM linkage type (some magic was required) Explicit allocation of parameters (thunks and other) 10

Structures Argument passing Code generated for Struct ByVal is not EABI compliant Missing feature in ARM s backend for structures Following Clang s example, converting structures to arrays, which gets correctly lowered as ByVal Return Value Indirect return has the same problem, as it becomes the first argument 11

Unions Unions are target-dependent from the front-end Lots of flames, but most agree it could be better The number of work-arounds to make it conform to the C++ standard is quite big Create N structure types (for N initialisers) Cast to i8 arrays, memcpy, and back Keep track of alignment when doing so Static initialization and alignment pose the worst problems Implementing it in the back-end might require several changes All targets must change simultaneously 12

Bitfields Zero-sized anonymous bitfields In the C standard, bitfields finish the packing of the previous field Armcc also uses it for general structure alignment But this cannot be easily represented in LLVM IR For example: struct A { int :0; char a; } S; compiler GCC x86 Clang ARM Armcc GCC ARM sizeof(s) 1 1 4 4 Maybe consider introducing i0 integer type? 13

Exception Handling ARM EHABI exception handling has not (yet) been implemented Plans to support it in MC in the near future DwarfException.cpp cannot be changed nicely to support EHABI; Any change will be considerably invasive We have no ELF writer for ARM Codesourcery s GAS doesn t understand the tables Need to export it to different ELF sections Need to support Generic and Compact EHABI There are solutions Implement EHABI in MC Write Dwarf exceptions directly to ELF 14

NEON NEON instructions are fully(?) represented in TableGen Some instructions can be represented via patternmatching, others via intrinsics: add(zext, zext) = VADDL.U* @llvm.arm.neon.vhadds.*() = VHADD.S* @...vabds.*() + add = VABA.S* arm_neon.h in Clang reflects those choices But neither ARM s nor GCC s arm_neon.h do Structures vs. Vectors in NEON type representation Source compatibility is important, but the front-end can recognize NEON structures and transform them into vectors in IR (we do that) 15

Suggestions Add target attributes in IR header Some generic enough to be used by any back-end Could the union type benefit from this? Some target specific Like build attributes for ARM Add validation to the debug data in IR MDNode* is too opaque for validation Helper classes are too simple, but could do basic validation Unions? Bitfields? EABI Struct ByVal? 16

Architecture Support Why EDG Producing IR ARM support 17

Architectures supported We tested Dhrystone on all supported architectures v4 to v7 (A, R, M) ARM, Thumb Soft and Hard FP, NEON Specific instructions (like SMUL) Our internal tests run mainly on 7TDMI and Cortex-A8 NEON tests in early stages Almost all intrinsics get compiled to NEON instructions correctly We reported some errors in bugzilla 18

Still missing EABI support We ve added some RT-ABI (FP, REM and Memset) But there are still things to do (Ex. EHABI, AAELF) MC support Refactor ARM MC to be less ASM-focused ARM ELF Exception Handling ARM ASM dialect? 19

20 Questions?