Finding the right set of traits for simd<T>. Document Number: P0964R1. Author: Matthias Kretz. Target: Parallelism TS 2


Document Number: P0964R1
Date: 2018-05-07
Reply-to: Matthias Kretz <m.kretz@gsi.de>
Audience: LEWG
Target: Parallelism TS 2

Finding the right set of traits for simd<T>

ABSTRACT

This paper makes the set of traits for simd<T> more complete.

CONTENTS

1 Introduction
2 Motivation
3 Proposed Wording
4 Changelog
5 Straw Polls
A Bibliography

1 INTRODUCTION

[N4744] defines the trait simd_abi::deduce<T, N>, allowing users to find an implementation-recommended ABI tag for a given value_type and number of elements. Shen [P0820R1] discusses a use for considering the involved ABI tags in the recommendation. SG1 polled on this question in Albuquerque:

Poll: abi_for_size_t (SF) vs. implementation-defined (SA)

    SF   F   N   A   SA
     1   7   7   0    0

The poll result implies that SG1 prefers users to be able to spell out the ABI tags that are determined as return types.

2 MOTIVATION

As Shen [P0820R1] shows, there is a use case for deducing an ABI tag type from a value_type, a width, and additionally zero or more input ABI tags. The latter tell the deduction logic which ABI tags are used in the input types to produce an object of the requested value_type and width. This enables an implementation design choice of staying within a certain SIMD register subset.

From the user's perspective, ABI tag deduction is most often necessary in the following two cases:

- Given a certain simd type, what is the best simd type for a different value_type (e.g. mixed precision calculations)?
- Given a certain simd type, what is the best simd type for a different width (e.g. split, concat, shuffle)?

Therefore, I propose to

1. extend simd_abi::deduce to consider input ABI tags in its decision,
2. introduce a new trait rebind_simd<U, V>, which deduces a simd<U, Abi> instantiation from a given simd type V and requested value_type U, and
3. introduce a new trait resize_simd<N, V>, which deduces a simd<T, Abi> instantiation from a given simd type V with value_type T and requested width N.
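
The two use cases and the three proposed traits can be illustrated with a toy model. The sketch below is not the Parallelism TS 2 implementation: the namespace toy, the sse and avx tags, and the selection policy inside deduce are assumptions made purely for illustration; only the trait names and their template parameters follow this paper. It assumes C++17, sizeof(float) == 4, and sizeof(double) == 8.

```cpp
#include <cstddef>
#include <type_traits>

namespace toy {  // hypothetical model, not std::experimental

namespace simd_abi {
struct scalar {};                       // exactly one element
template <int N> struct fixed_size {};  // N elements, register-agnostic
struct sse {};                          // assumed tag: 16-byte registers
struct avx {};                          // assumed tag: 32-byte registers

// Assumed policy: choose the tag whose register holds T x N exactly, but
// never widen into avx registers when an input tag is sse.
template <class T, std::size_t N, class... Abis>
struct deduce {
  static constexpr bool from_sse = (std::is_same_v<Abis, sse> || ...);
  using type = std::conditional_t<
      N == 1, scalar,
      std::conditional_t<
          sizeof(T) * N == 16, sse,
          std::conditional_t<sizeof(T) * N == 32 && !from_sse, avx,
                             fixed_size<static_cast<int>(N)>>>>;
};
template <class T, std::size_t N, class... Abis>
using deduce_t = typename deduce<T, N, Abis...>::type;
}  // namespace simd_abi

template <class T, class Abi> struct simd {};
template <class T, class Abi> struct simd_mask {};

// Number of elements for each toy ABI tag.
template <class T, class Abi> struct simd_size;
template <class T> struct simd_size<T, simd_abi::scalar>
    : std::integral_constant<std::size_t, 1> {};
template <class T, int N> struct simd_size<T, simd_abi::fixed_size<N>>
    : std::integral_constant<std::size_t, static_cast<std::size_t>(N)> {};
template <class T> struct simd_size<T, simd_abi::sse>
    : std::integral_constant<std::size_t, 16 / sizeof(T)> {};
template <class T> struct simd_size<T, simd_abi::avx>
    : std::integral_constant<std::size_t, 32 / sizeof(T)> {};
template <class T, class Abi>
inline constexpr std::size_t simd_size_v = simd_size<T, Abi>::value;

// rebind_simd: same width, different value_type; Abi0 is passed as a hint.
template <class T, class V> struct rebind_simd {};  // no member type otherwise
template <class T, class U, class Abi0> struct rebind_simd<T, simd<U, Abi0>> {
  using type = simd<T, simd_abi::deduce_t<T, simd_size_v<U, Abi0>, Abi0>>;
};
template <class T, class U, class Abi0>
struct rebind_simd<T, simd_mask<U, Abi0>> {
  using type = simd_mask<T, simd_abi::deduce_t<T, simd_size_v<U, Abi0>, Abi0>>;
};
template <class T, class V>
using rebind_simd_t = typename rebind_simd<T, V>::type;

// resize_simd: same value_type, different width; Abi0 is passed as a hint.
template <int N, class V> struct resize_simd {};  // no member type otherwise
template <int N, class T, class Abi0> struct resize_simd<N, simd<T, Abi0>> {
  using type = simd<T, simd_abi::deduce_t<T, N, Abi0>>;
};
template <int N, class T, class Abi0>
struct resize_simd<N, simd_mask<T, Abi0>> {
  using type = simd_mask<T, simd_abi::deduce_t<T, N, Abi0>>;
};
template <int N, class V>
using resize_simd_t = typename resize_simd<N, V>::type;

}  // namespace toy

// Mixed precision: four floats in an sse register become four doubles.
// Without a hint the toy policy picks avx; with the sse hint it stays out
// of avx registers and falls back to fixed_size<4>.
using float_v = toy::simd<float, toy::simd_abi::sse>;
static_assert(std::is_same_v<toy::simd_abi::deduce_t<double, 4>,
                             toy::simd_abi::avx>);
static_assert(std::is_same_v<toy::rebind_simd_t<double, float_v>,
                             toy::simd<double, toy::simd_abi::fixed_size<4>>>);
// Split: halving an avx vector of eight floats yields an sse vector.
static_assert(
    std::is_same_v<toy::resize_simd_t<4, toy::simd<float, toy::simd_abi::avx>>,
                   toy::simd<float, toy::simd_abi::sse>>);
```

In this model the input ABI tag is only a hint. A conforming implementation would be free to ignore it, in which case rebind_simd_t<double, float_v> in the toy would name the avx-based type instead.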

3 PROPOSED WORDING

Apply the following changes to the Parallelism TS 2 [N4744]:

modify [parallel.simd.synopsis]

Remove:

    template <class T, size_t N> struct deduce { using type = see below; };
    template <class T, size_t N> using deduce_t = typename deduce<T, N>::type;

Add:

    template <class T, size_t N, class... Abis> struct deduce { using type = see below; };
    template <class T, size_t N, class... Abis> using deduce_t = typename deduce<T, N, Abis...>::type;

add to [parallel.simd.synopsis]

    inline constexpr size_t memory_alignment_v = memory_alignment<T, U>::value;

    template <class T, class V> struct rebind_simd { using type = see below; };
    template <class T, class V> using rebind_simd_t = typename rebind_simd<T, V>::type;
    template <int N, class V> struct resize_simd { using type = see below; };
    template <int N, class V> using resize_simd_t = typename resize_simd<N, V>::type;

modify [parallel.simd.abi]

Remove:

    template <class T, size_t N> struct deduce { using type = see below; };

Add:

    template <class T, size_t N, class... Abis> struct deduce { using type = see below; };

12 The member type shall be present if and only if
- T is a vectorizable type, and
- simd_abi::fixed_size<N> is supported (see 9.2.1), and
- every type in the Abis pack is an ABI tag.

13 Where present, the member typedef type shall name an ABI tag type that satisfies
- simd_size_v<T, type> == N, and
- simd<T, type> is default constructible (see 9.3.1).
If N is 1, the member typedef type is simd_abi::scalar. Otherwise, if there are multiple ABI tag types that satisfy the constraints, the member typedef type is implementation-defined. [ Note: It is expected that extended ABI tags can produce better optimizations and thus are preferred over simd_abi::fixed_size<N>. Implementations can base the choice on Abis, but can also ignore the Abis arguments. end note ]

add at the end of [parallel.simd.traits]

    template <class T, class V> struct rebind_simd { using type = see below; };

15 The member type shall be present if and only if
- V is either simd<U, Abi0> or simd_mask<U, Abi0>, where U and Abi0 are deduced from V, and
- T is a vectorizable type, and
- simd_abi::deduce<T, simd_size_v<U, Abi0>, Abi0> has a member type type.

16 Let Abi1 identify the type deduce_t<T, simd_size_v<U, Abi0>, Abi0>. Where present, the member typedef type shall name simd<T, Abi1> if V is simd<U, Abi0> or simd_mask<T, Abi1> if V is simd_mask<U, Abi0>.

    template <int N, class V> struct resize_simd { using type = see below; };

17 The member type shall be present if and only if
- V is either simd<T, Abi0> or simd_mask<T, Abi0>, where T and Abi0 are deduced from V, and
- simd_abi::deduce<T, N, Abi0> has a member type type.

18 Let Abi1 identify the type deduce_t<T, N, Abi0>. Where present, the member typedef type shall name simd<T, Abi1> if V is simd<T, Abi0> or simd_mask<T, Abi1> if V is simd_mask<T, Abi0>.

4 CHANGELOG

4.1 Changes from R0

Previous revision: [P0964R0].

- Adjusted to changes between [P0214R8] and [N4744].
- Made resize_simd a non-optional part of the requested changes (after SG1 discussion).
- Updated the motivation after resolving different naming preferences with Tim.

5 STRAW POLLS

5.1 SG1 at Jacksonville 2018

Poll: Proceed to LEWG?
Result: unanimous consent

A BIBLIOGRAPHY

[N4744] Jared Hoberock, ed. Technical Specification for C++ Extensions for Parallelism Version 2. ISO/IEC JTC1/SC22/WG21, 2018. url: https://wg21.link/n4744.

[P0214R8] Matthias Kretz. P0214R8: Data-Parallel Vector Types & Operations. ISO/IEC C++ Standards Committee Paper. 2018. url: https://wg21.link/p0214r8.
[P0964R0] Matthias Kretz. P0964R0: Finding the right set of traits for simd<T>. ISO/IEC C++ Standards Committee Paper. 2018. url: https://wg21.link/p0964r0.
[P0820R1] Tim Shen. P0820R1: Feedback on P0214R5. ISO/IEC C++ Standards Committee Paper. 2017. url: https://wg21.link/p0820r1.