VEOS high level design. Revision 2.1 NEC

Similar documents
Linux Driver and Embedded Developer

Design Overview of the FreeBSD Kernel. Organization of the Kernel. What Code is Machine Independent?

LINUX INTERNALS & NETWORKING Weekend Workshop

Design Overview of the FreeBSD Kernel CIS 657

Operating Systems. II. Processes

SMD149 - Operating Systems

Process. Program Vs. process. During execution, the process may be in one of the following states

ECE 550D Fundamentals of Computer Systems and Engineering. Fall 2017

Chap 4, 5: Process. Dongkun Shin, SKKU

Processes. CS3026 Operating Systems Lecture 05

Processes and Non-Preemptive Scheduling. Otto J. Anshus

Processes. Dr. Yingwu Zhu

Software Development & Education Center

Process. Heechul Yun. Disclaimer: some slides are adopted from the book authors slides with permission

Operating Systems Design Fall 2010 Exam 1 Review. Paul Krzyzanowski

PROCESS CONCEPTS. Process Concept Relationship to a Program What is a Process? Process Lifecycle Process Management Inter-Process Communication 2.

Getting to know you. Anatomy of a Process. Processes. Of Programs and Processes

Mid Term from Feb-2005 to Nov 2012 CS604- Operating System

OS Structure. User mode/ kernel mode. System call. Other concepts to know. Memory protection, privileged instructions

Introduction to OS Processes in Unix, Linux, and Windows MOS 2.1 Mahmoud El-Gayyar

UNIT I Linux Utilities

Native POSIX Thread Library (NPTL) CSE 506 Don Porter

Process Scheduling Queues

Chapter 4: Threads. Operating System Concepts 9 th Edit9on

(MCQZ-CS604 Operating Systems)

Processes & Threads. Recap of the Last Class. Microkernel System Architecture. Layered Structure

Operating Systems Comprehensive Exam. Spring Student ID # 3/20/2013

CS 333 Introduction to Operating Systems. Class 3 Threads & Concurrency. Jonathan Walpole Computer Science Portland State University

OS 1 st Exam Name Solution St # (Q1) (19 points) True/False. Circle the appropriate choice (there are no trick questions).

OS Structure. User mode/ kernel mode (Dual-Mode) Memory protection, privileged instructions. Definition, examples, how it works?

Processes. Process Scheduling, Process Synchronization, and Deadlock will be discussed further in Chapters 5, 6, and 7, respectively.

CSC 4320 Test 1 Spring 2017

CS 333 Introduction to Operating Systems. Class 3 Threads & Concurrency. Jonathan Walpole Computer Science Portland State University

Processes Prof. James L. Frankel Harvard University. Version of 6:16 PM 10-Feb-2017 Copyright 2017, 2015 James L. Frankel. All rights reserved.

Operating System Support

Processes. Process Management Chapter 3. When does a process gets created? When does a process gets terminated?

Process and Its Image An operating system executes a variety of programs: A program that browses the Web A program that serves Web requests

Last class: Today: Thread Background. Thread Systems

* There are more than 100 hundred commercial RTOS with memory footprints from few hundred kilobytes to large multiprocessor systems

csci3411: Operating Systems

Process. Heechul Yun. Disclaimer: some slides are adopted from the book authors slides with permission 1

* What are the different states for a task in an OS?

Fall 2014:: CSE 506:: Section 2 (PhD) Threading. Nima Honarmand (Based on slides by Don Porter and Mike Ferdman)

Processes and Threads

2/14/2012. Using a layered approach, the operating system is divided into N levels or layers. Also view as a stack of services

Processes. CS 475, Spring 2018 Concurrent & Distributed Systems

Processes COMPSCI 386

CS2506 Quick Revision

Processes The Process Model. Chapter 2 Processes and Threads. Process Termination. Process States (1) Process Hierarchies

Processes The Process Model. Chapter 2. Processes and Threads. Process Termination. Process Creation

Operating Systems. Computer Science & Information Technology (CS) Rank under AIR 100

Concurrency: Deadlock and Starvation

PROCESS CONTROL BLOCK TWO-STATE MODEL (CONT D)

Linux Operating System

UNIT I Linux Utilities and Working with Bash

EECS 482 Introduction to Operating Systems

POSIX Threads: a first step toward parallel programming. George Bosilca

Processes. CS 416: Operating Systems Design, Spring 2011 Department of Computer Science Rutgers University

Chapter 12 IoT Projects Case Studies. Lesson-01: Introduction

Processes, Execution and State. Process Operations: fork. Variations on Process Creation. Windows Process Creation.

OVERVIEW. Last Week: But if frequency of high priority task increases temporarily, system may encounter overload: Today: Slide 1. Slide 3.

CSC Operating Systems Spring Lecture - XII Midterm Review. Tevfik Ko!ar. Louisiana State University. March 4 th, 2008.

Operating Systems Comprehensive Exam. Spring Student ID # 3/16/2006

CS 111. Operating Systems Peter Reiher

Sistemi in Tempo Reale

Lecture 18. Log into Linux. Copy two subdirectories in /home/hwang/cs375/lecture18/ $ cp r /home/hwang/cs375/lecture18/*.

CS153: Midterm (Fall 16)

Embedded System Curriculum

Processes. Operating System CS 217. Supports virtual machines. Provides services: User Process. User Process. OS Kernel. Hardware

PROCESSES. Jo, Heeseung

Processes. Jo, Heeseung

Exceptions and Processes

CS 326: Operating Systems. Process Execution. Lecture 5

Processes and Exceptions

CS 111. Operating Systems Peter Reiher

CSE 153 Design of Operating Systems

OS lpr. www. nfsd gcc emacs ls 1/27/09. Process Management. CS 537 Lecture 3: Processes. Example OS in operation. Why Processes? Simplicity + Speed

Dr. Rafiq Zakaria Campus. Maulana Azad College of Arts, Science & Commerce, Aurangabad. Department of Computer Science. Academic Year

PROCESSES & THREADS. Charles Abzug, Ph.D. Department of Computer Science James Madison University Harrisonburg, VA Charles Abzug

Processes. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

CSE506: Operating Systems CSE 506: Operating Systems

Exam Guide COMPSCI 386

Processes & Threads. Recap of the Last Class. Layered Structure. Virtual Machines. CS 256/456 Dept. of Computer Science, University of Rochester

Ricardo Rocha. Department of Computer Science Faculty of Sciences University of Porto

CS 5523: Operating Systems

CS Lecture 3! Threads! George Mason University! Spring 2010!

Slide 6-1. Processes. Operating Systems: A Modern Perspective, Chapter 6. Copyright 2004 Pearson Education, Inc.

CL020 - Advanced Linux and UNIX Programming

Chapter 4: Multi-Threaded Programming

PROCESS MANAGEMENT. Operating Systems 2015 Spring by Euiseong Seo

Questions from last time

Computer Systems A Programmer s Perspective 1 (Beta Draft)

Chapter 2 Processes and Threads

CSC 539: Operating Systems Structure and Design. Spring 2006

Operating System. Operating System Overview. Structure of a Computer System. Structure of a Computer System. Structure of a Computer System

Process Environment. Pradipta De

Killing Zombies, Working, Sleeping, and Spawning Children

COSC243 Part 2: Operating Systems

CPSC 341 OS & Networks. Processes. Dr. Yingwu Zhu

OPERATING SYSTEMS: Lesson 4: Process Scheduling

Transcription:

high level design Revision 2.1 NEC

Table of contents About this document What is Components Process management Memory management System call Signal User mode DMA and communication register Feature list (Fundamental features) Feature list (inter communication) Available resources Notice

About this document This document describes the high level design of This revision covers v2.0.1 or later. 3 NEC Corporation 2018

What is is the software running on Linux/Vector Host, providing OS functionality for programs running on Vector Engine provides Linux compatible functionality and HPC functionality to programs Linux VH program also provides ported commands and debugger VH: Vector Host Off-the-shelf server on which runs : Vector Engine Engine which includes core and memory and executes application program 4 NEC Corporation 2018

Components daemon Daemon which manages one It is responsible for management, memory management, etc. Pseudo Process which manages one It loads program into and handles system call request from ID Daemon for inter- resource management MM daemon Daemon which communicates with InfiniBand driver Commands Ported commands such as ps and free Debugger Ported debugger (gdb) SW HW daemon ID commands Linux VH pseudo MM daemon debugger driver User program specific libraries Standard C library OS-less User Area Kernel Area driver Driver which provides resource accessibility to daemon and pseudo Process which executes user program The following libraries may be linked Standard C library specific libraries 5 NEC Corporation 2018

Process management Multiing and multithreading are supported same pid Identification of A is identified by PID of the corresponding pseudo. daemon Pseudo Process Process state maintains the state of es independently from Linux running, wait, etc. Scheduling chooses a core to execute a thread of a, when the thread is started schedules threads of es using roundrobin scheduling on each core Migration of a thread between cores may occur when a core has no thread to execute Relationship of parent and child es maintains the relationship of es, in order to inherit resource limit, CPU affinity, etc. maintains the relationship within one daemon VH VH Pseudo Process schedule Tasks for core 0 Tasks for core 1 Process 1 Thread1 Process 2 Thread1 Process 3 Thread1 Core 0 Process 1 Thread2 Process 2 Thread2 Core 1 6 NEC Corporation 2018

Memory management Virtual address management Text and data of executable binary, heap and stack are located at the fixed address 0xffff ffff ffff ffff 0xffff 8000 0000 0000 kernel Other areas such as anonymous memory are located at address allocated dynamically memory virtual address space is subset of virtual address space of pseudo 0x7fff ffff ffff stack (VH) 0xffff ffff ffff Anonymous memory and file backed memory are supported Data of file backed memory is transferred between and VH by system calls such as mmap(), munmap(), msync() and exit() p4 p4 file backed memory Physical memory management allocates physical memory when executable binary or shared library are loaded, or anonymous or file backed memory are requested Demand paging and copy-on-write are not supported, because HW doesn't support precise exceptions Anonymous memory and file backed memory share physical memory between es if shared mapping is requested Anonymous memory and file backed memory may share physical memory between es if private mapping with read-only protection is requested Text of executable binary and shared library are private file backed memory with read only protection. So, they shares physical memory between es Growing heap and stack is supported Page size 64 MB / 2MB The page size of text and data of executable, heap, and stack are equal to the alignment of a executable binary The page size of text and data of shared library is equal to the alignment of shared library The default page size of anonymous memory or file backed memory is equal to the alignment of executable binary. program can specify page size when it requests them p3 p3 anoymous memory thread stack p2 p2 growing data p1 p1 text 0x6100 0000 0000 0x6100 0000 0000 stack growing growing heap data 0x6000 0000 0000 0x6000 0000 0000 text heap (VH) data (VH) text (VH) 0 0 VH virtual address space memory virtual address space shared library executable binary pseudo 7 NEC Corporation 2018

User mode DMA and communication register User mode DMA can use user mode DMA in order to transfer data between memory described below 1. memory assigned to a and memory assigned to another 2. VH memory and memory assigned to a A can use two DMA descriptor tables User mode DMA is performed while one or more threads of the are running on cores User mode DMA is stopped while all threads of the are not running on cores Target and source address of DMA is specified using special address named host virtual address sets the mapping from host virtual address to memory Communication register (CR) CRs are 64-bit registers used for control of exclusive execution or synchronous operation between threads of a or MPI es can access CR of local and remote Access to CR of local CR access instruction can be used CR is specified using special address named effective CR address Access to CR of remote Host memory access instructions can be used CR is specified using host virtual address which is also used for user mode DMA sets the mapping from effective CR address or host virtual address to actual CR 0xffff ffff ffff 0x0800 0000 0000 0x0700 0000 0000 0x0090 0000 0000 0x0088 0000 0000 0x0000 0400 0000 remote or local memory (64MB page) remote or local memory (2MB page), VH memory 0x7f 0 remote CR, etc 0 host virtual address space local CR effective CR address space Notice: User mode DMA and CR are used by the MPI library or other system libraries. Low level API for user mode DMA is provided as an experimental feature. CR is not intended to be used directly from user programs. 8 NEC Corporation 2018

System call offloads system call onto VH Pseudo handles request of system call Pseudo requests daemon or Linux if needed System call is handled with pseudo s privilege VH Pseudo daemon Unblock request with return value System call number and arguments Return value ret=read(fd,buf,size); Sequence of system call (simple case) 1. stores system call number and arguments to VH memory 2. stops the core and raise an interrupt 3. Pseudo handles system call 4. Pseudo sends unblock request with return value to daemon 5. daemon stores return value to the register of the core and starts the core 9 NEC Corporation 2018

Signal Signal sent by another Signal handler of pseudo sends a signal request to delivers signal to when executed on core next time Pseudo is killed by SIGKILL if is terminated by signal pseudo text segment Signal handler Signal Waiting for system call or HW exception signal handler is invoked deliver text segment signal handler signal handler is invoked Signal due to HW exception Pseudo detects HW exception and sends a signal request to delivers signal to when executed on core next time If registers a signal handler corresponding to nonrecoverable HW exception, is terminated after execution of the signal handler text segment pseudo Detect exception deliver text segment signal handler Exception signal handler is invoked 10 NEC Corporation 2018

Feature list (Fundamental features) Category Feature Design Description Program loader File format ELF, DWARF Program loader recognizes ELF file and loads text and data to. Dynamic linking Supported Dynamic linking of shared library is supported. Dynamic loading Supported Dynamic loading of shared library is supported. (dlopen(), dlsym(), etc) Alignment of segment 2MB / 64MB Alignment of segment is decided on linking. Process management Multiing Fork-exec model program can create a using fork() system call. program can execute a new program using execve() system call. Multithreading POSIX thread program can create threads using POSIX API. Scheduling Round robin on each core Threads are executed in circular order without priority. Threads in blocked state are skipped. Preemption Supported If the time slice is expired, a context switch occurs, and a next thread starts to run on core. Time slice 1 second The period of time for which a thread of a is allowed to run. Timer interval 100 milliseconds The interval between invocations of the timer handler of the scheduler at VH side. Memory management Virtual address space Supported Process has own virtual address space. Memory allocation Dynamic Memory is allocated when program is loaded. Growing heap and stack are supported. Allocating anonymous or file-backed memory is also supported. File backed memory Supported Data of the file is transferred between and VH by system calls. Demand paging Not supported Physical memory is allocated when virtual memory is allocated. Copy on write Not supported Private readable and writable area has own physical memory. Sharing physical memory Supported Anonymous memory and file backed memory share physical memory under certain conditions. Page size 2MB / 64MB Page size of segments of executable binary and page size of shared library is equal to the alignment of them. program can specify page size of anonymous or file backed memory. Input and output File system, network, etc Supported Offloaded to Linux. 11 NEC Corporation 2018

Feature list (inter communication) Feature (1) Inside System V shared memory (2) - (3) -VH (4) -IB - - - shmget Function name example POSIX shared memory - - - shm_open Spin lock - - - pthread_spin_init Mutex (*1) - - - pthread_mutex_init Read write lock - - - pthread_rwlock_init Condition variable - - - pthread_cond_init Barrier - - - pthread_barrier_init Unnamed semaphore - - - sem_init System V semaphore - semget Named semaphore - sem_open System V message queue - msgget POSIX message queue - mq_open UNIX socket - socket Pipe, Fifo - pipe, mkfifo Signal - kill VH- SHM *2 - - - vh_shmat SHM *3 *5 - - - CR *5 - - - MM *4 *5 - - - - Supported - Not supported VH (1) VH (2) (3) (4) IB HCA IB HCA: InfiniBand HCA *1 robust mutex and pi mutex are not supported *2 VH- SHM is a feature to allow es to transfer data to or from System V shared memory created at VH side *3 SHM is a feature to allow es to transfer data between es *4 MM is a feature to allow the IB driver to access memory and to allow programs to access IB HCA *5 These features are used by the MPI library or other system libraries. They are not intended to be used directly from user programs 12 NEC Corporation 2018

Available resources Item Condition Per Per Note Maximum number of - 256 es - - Maximum number of threads - 1,024 threads 64 threads Maximum number of connections daemon accepts Maximum size of memory assigned to Maximum size of accessible remote memory Descriptor table of user mode DMA Maximum number of CR page for threads *1 Maximum number of local CR pages for MPI *1 Maximum number of remote CR pages for MPI *1-1,056 - A connection between a daemon and a pseudo is established per a thread. A connection between a and a ported command / gdb and is established per a request. 48GB memory model 49,024MB *2 49,024MB *2 A 64MB-page is reserved for 24GB memory model 24,448MB *3 24,448MB *3 zeroed page. A 64MB page is reserved for synchronization. *4 8 cores model 894GB 894GB The value will be changed when the number of cores is changed. - - 2 descriptor tables - - - 1 page - 8 cores model 24 pages 3 pages The value will be changed when the number of cores is changed. - - 32 pages - *1 A CR page consists of 32 CRs *2 49,086MB when is connected to SX-Aurora TSUBASA A100-1 model *3 24,510MB when is connected to SX-Aurora TSUBASA A100-1 model *4 A 2MB-page is reserved for synchronization when is connected to SX-Aurora TSUBASA A100-1 model 13 NEC Corporation 2018

Notice 1. The number of threads of es which run on should be less than or equal to the number of available cores, in order to achieve best performance 2. can execute new program or VH program. When executes new program, needs to specify program with the first argument of execve() system call. When executes VH program, s information such as resource limit is discarded, even if VH program executes program, again 3. Almost all of system calls, standard C library functions, Linux s commands and debugger commands are supported. But, some of them are not supported 14 NEC Corporation 2018

Revision History Rev. Date Update 1.8 28 th Feb. 2018 The first version 1.9 14 th May 2018 Updated the introduction of to clarify that it is running on Linux/VH (Page 4) Added VH- SHM which is a new feature of 1.1 release (Page 12) 1.10 11st July 2018 Mentioned migration of a thread between cores occurs after 1.2.1 (Page 6) Updated the description of maximum size of memory (Page 13) Incorporated the change of the number of DMA descriptor table done by 1.2 (Page 13) 1.11 12nd Sep. 2018 Updated the description of user mode DMA because Low level API for user mode DMA is provided as an experimental feature (Page 8) Updated the feature list because programs can use VH- SHM (Page 12) Added fifo to the feature list (Page 12) Improved figures 2.0 5 th Dec. 2018 Removed notice relating to kernel headers because kernel headers are provided as a part of 2.0 or later (Page 14) 2.1 9 th Feb. 2019 This revision covers v2.0.1 or later. Changed the format of top page. Added information which shows target versions of this document. (Page 3) 15 NEC Corporation 2018