Engine Support System. asyrani.com


A game engine is a complex piece of software consisting of many interacting subsystems. When the engine first starts up, each subsystem must be configured and initialized in a specific order.

We could define our singleton instances as globals, without the need for dynamic memory allocation.

A static variable that is declared within a function is not constructed before main() is called, but rather on the first invocation of that function. So if our global singleton is function-static, we can control the order in which our singletons are constructed.

Suggested by textbooks

A Simple Approach That Works

The main() function

It's simple and easy to implement. It's explicit: you can see and understand the start-up order immediately just by looking at the code. It's easy to debug and maintain: if something isn't starting early enough, or is starting too early, you can just move one line of code.

How to illustrate it in a simple way

Ogre3D is controlled by the singleton object Ogre::Root, which contains pointers to every other subsystem in Ogre and manages their creation and destruction. Naughty Dog's Uncharted: Drake's Fortune uses a similar explicit technique for starting up its subsystems: a wide range of operating system services, third-party libraries, and so on must all be started up during engine initialization.

Memory Management

Dynamic memory allocation via malloc() or C++'s global operator new is a very slow operation. On modern CPUs, the performance of a piece of software is often dominated by its memory access patterns.

Dynamic memory allocation via malloc() and free() or C++'s global new and delete operators, also known as heap allocation, is typically very slow. The high cost can be attributed to two main factors.

First, a heap allocator is a general-purpose facility, so it must be written to handle any allocation size, from one byte to one gigabyte. This requires a lot of management overhead, making malloc() and free() inherently slow. Second, on most operating systems a call to malloc() or free() must first context-switch from user mode into kernel mode, process the request, and then context-switch back to the program.

NEW AND DELETE
- Memory allocated from the 'free store'
- Returns a fully typed pointer
- new (standard version) never returns NULL (it throws on failure)
- Called with a type-id (the compiler calculates the size)
- Has a version explicitly to handle arrays
- Reallocating (to get more space) is not handled intuitively (because of the copy constructor)
- Whether they call malloc/free is implementation-defined
- Can add a handler to deal with low memory (set_new_handler)
- operator new/delete can be overridden legally
- Constructor/destructor used to initialize/destroy the object

MALLOC AND FREE
- Memory allocated from the 'heap'
- Returns a void*
- Returns NULL on failure
- Must specify the size required in bytes
- Allocating an array requires manual calculation of the space
- Reallocating a larger chunk of memory is simple (no copy constructor to worry about)
- They will NOT call new/delete
- No way to splice user code into the allocation sequence to help with low memory
- malloc/free can NOT be overridden legally

Stack-Based Allocators

Many games allocate memory in a stack-like fashion. Whenever a new game level is loaded, memory is allocated for it. Once the level has been loaded, little or no dynamic memory allocation takes place. At the conclusion of the level, its data is unloaded and all of its memory can be freed. It makes a lot of sense to use a stack-like data structure for these kinds of memory allocations.

- Allocate a block of memory with malloc()
- A top pointer marks the top of the stack: all memory addresses below it are in use, all memory above it is free
- The top pointer is initialized to the lowest memory address
- Each allocation request moves the pointer up
- The most recently allocated block can be freed by moving the top pointer back down

Simplest way to understand it:
- Create a program class/system
- Reserve some memory
- Add a pointer to that memory
- Always update your pointer and empty your reserved memory

Double-Ended Stack Allocators

Pool Allocators

A pool allocator works by pre-allocating a large block of memory whose size is an exact multiple of the size of the elements that will be allocated.

A pool of 4x4 matrices would be an exact multiple of 64 bytes (16 elements per matrix times four bytes per element). Each element within the pool is added to a linked list of free elements; when the pool is first initialized, the free list contains all of the elements. Whenever an allocation request is made, we simply grab the next free element off the free list and return it. When an element is freed, we simply tack it back onto the free list.

Aligned Allocators

All memory allocators must be capable of returning aligned memory blocks. This is relatively straightforward to implement. We simply allocate a little bit more memory than was actually requested, adjust the address of the memory block upward slightly so that it is aligned properly, and then return the adjusted address. Because we allocated a bit more memory than was requested, the returned block will still be large enough, even with the slight upward adjustment.

Single-Frame Allocators

A single-frame allocator is implemented by reserving a block of memory and managing it with a simple stack allocator as described above. At the beginning of each frame, the stack's top pointer is reset to the bottom of the memory block. Allocations made during the frame grow toward the top of the block. Rinse and repeat.

Double-Buffered Allocators

A double-buffered allocator allows a block of memory allocated on frame i to be used on frame (i + 1). To accomplish this, we create two single-frame stack allocators of equal size and then ping-pong between them every frame.

Memory Fragmentation

Fragmentation is the inability to reuse memory that is free.
- External fragmentation occurs when enough free memory is available but it isn't contiguous (many small holes)
- Internal fragmentation arises when a large enough block is allocated but it is bigger than needed
- Blocks are usually split to prevent internal fragmentation

What causes fragmentation?
- Isolated deaths: adjacent objects do not die at the same time
- Time-varying program behavior: memory requests change unexpectedly

Why traditional approaches don't work:
- Program behavior is not predictable in general
- The ability to reuse memory depends on the future interaction between the program and the allocator
- 100 blocks of size 10 and 200 of size 20?

How do we avoid fragmentation? "A single death is a tragedy. A million deaths is a statistic." - Joseph Stalin

Understanding program behavior: common behavioral patterns
- Ramps: data structures that are accumulated over time
- Peaks: memory used in bursty patterns, usually while building up temporary data structures
- Plateaus: data structures built quickly and used for long periods of time

The most common mechanisms used:
- Sequential fits
- Segregated free lists
- Buddy systems
- Bitmap fits
- Index fits

Sequential fits
- Based on a single linear list that stores all free memory blocks
- Usually circularly or doubly linked
- Most use the boundary tag technique
- The most common mechanisms use this method

Sequential fits: best fit, first fit, worst fit
- Next fit: uses a roving pointer for allocation
- Optimal fit: samples the list first to find a good enough fit
- Half fit: splits blocks twice the requested size

Segregated free lists
- Use arrays of lists which hold free blocks of a particular size
- Use size classes for indexing purposes, usually sizes that are a power of two: 2, 4, 8, 16, 32, 64, 128, ...
- Requested sizes are rounded up to the nearest available size

Segregated free lists
- Simple segregated list: no splitting of free blocks; subject to severe external fragmentation
- Segregated fit: splits larger blocks if there is no free block in the appropriate free list; uses first fit or next fit to find a free block
- Three types: exact lists, strict size classes with rounding, or size classes with range lists

Buddy system
- A special case of segregated fit
- Supports limited splitting and coalescing
- Separate free list for each allowable size
- Simple block address computation
- A free block can only be merged with its unique buddy, and only whole, entirely free blocks can be merged

Buddy system example (diagram): satisfying a 3 MB request from a 16 MB heap. The free 16 MB block is split into two 8 MB buddies; one 8 MB buddy is split into two 4 MB buddies; one 4 MB block is then allocated for the request, leaving one 4 MB and one 8 MB block free.

Binary buddies
- Simplest implementation
- All buddy sizes are powers of two
- Each block is divided into two equal parts
- Internal fragmentation is very high: expected 28%, in practice usually higher

Fibonacci buddies
- Size classes based on the Fibonacci series
- A more closely spaced set of size classes reduces internal fragmentation
- Blocks can only be split into sizes that are also in the series
- Uneven block sizes can be a disadvantage when allocating many equal-sized blocks

Fibonacci buddies
Size series: 2, 3, 5, 8, 13, 21, 34, 55
Splitting blocks: 13 → 5 + 8, 21 → 8 + 13

Weighted buddies
- Size classes are powers of two, and between each pair is a size three times a power of two
- Two different splitting methods: 2^x sizes can be split in half; 2^x*3 sizes can be split in half or unevenly into two sizes

Weighted buddies
Size series: 2 (2^1), 3 (2^0*3), 4 (2^2), 6 (2^1*3), 8 (2^3), 12 (2^2*3), 16 (2^4), 24 (2^3*3)
Splitting of 2^x*3 sizes: 6 → 3 + 3 or 2 + 4

Double buddies
- Use two different binary buddy series: one list uses power-of-two sizes, the other uses power-of-two spacing starting from 3
- Splitting rules: blocks can only be split in half, and split blocks stay in the same series

Double buddies
Size series: 2 (2^1), 4 (2^2), 8 (2^3), 16 (2^4), 32 (2^5), 64 (2^6), 128 (2^7)
and: 3 (3*2^0), 6 (3*2^1), 12 (3*2^2), 24 (3*2^3), 48 (3*2^4), 96 (3*2^5), 192 (3*2^6)
Splitting of 3*2^x sizes: 6 → 3 + 3

Avoiding Fragmentation in Game Engine Development

Cache Coherency

A cache is a special type of memory that can be read from and written to by the CPU much more quickly than main RAM. The basic idea of memory caching is to load a small chunk of memory into the high-speed cache whenever a given region of main RAM is first read. Such a memory chunk is called a cache line and is usually between 8 and 512 bytes, depending on the microprocessor architecture.

On subsequent read operations, if the requested data already exists in the cache, it is loaded from the cache directly into the CPU's registers, a much faster operation than reading from main RAM. Only if the required data is not already in the cache does main RAM have to be accessed. This is called a cache miss. Whenever a cache miss occurs, the program is forced to wait for the cache line to be refreshed from main RAM.

The rules for moving data back and forth between main RAM and the caches are of course complicated by the presence of a level 2 cache. Now, instead of data hopping from RAM to cache to CPU and back again, it must make two hops: first from main RAM to the L2 cache, and then from the L2 cache to the L1 cache. We won't go into the specifics of these rules here (they differ slightly from CPU to CPU anyway), but suffice it to say that RAM is slower than L2 cache memory, and L2 cache is slower than L1 cache. Hence L2 cache misses are usually more expensive than L1 cache misses, all other things being equal.

Read further: Chapter 5, Engine Support System (5.1 Subsystem Start-Up and Shut-Down, 5.2 Memory Management) and Internet materials via Google, etc.