the gamedesigninitiative at cornell university Lecture 13 Memory Management

Similar documents
Memory Management: High-Level Overview

the gamedesigninitiative at cornell university Lecture 10 Memory Management

Memory Management: The Details

the gamedesigninitiative at cornell university Lecture 9 Memory Management

IOS PERFORMANCE. Getting the most out of your Games and Apps

Today: Segmentation. Last Class: Paging. Costs of Using The TLB. The Translation Look-aside Buffer (TLB)

Chapter 3.5 Memory and I/O Systems

The Processor Memory Hierarchy

Engine Support System. asyrani.com

Robust Memory Management Schemes

TSBK 07! Computer Graphics! Ingemar Ragnemalm, ISY

Lecture 16. Today: Start looking into memory hierarchy Cache$! Yay!

Garbage Collection (1)

Optimisation. CS7GV3 Real-time Rendering

Agenda. CSE P 501 Compilers. Java Implementation Overview. JVM Architecture. JVM Runtime Data Areas (1) JVM Data Types. CSE P 501 Su04 T-1

CMSC 330: Organization of Programming Languages

the gamedesigninitiative at cornell university Lecture 12 Architecture Design

Windows Java address space

Embedded Systems Dr. Santanu Chaudhury Department of Electrical Engineering Indian Institute of Technology, Delhi

Mali Developer Resources. Kevin Ho ARM Taiwan FAE

Optimizing Your Android Applications

CSE 410 Final Exam 6/09/09. Suppose we have a memory and a direct-mapped cache with the following characteristics.

Habanero Extreme Scale Software Research Project

Sustainable Memory Use Allocation & (Implicit) Deallocation (mostly in Java)

CS 345. Garbage Collection. Vitaly Shmatikov. slide 1

EEC 170 Computer Architecture Fall Cache Introduction Review. Review: The Memory Hierarchy. The Memory Hierarchy: Why Does it Work?

Managed runtimes & garbage collection. CSE 6341 Some slides by Kathryn McKinley

Field Analysis. Last time Exploit encapsulation to improve memory system performance

Managed runtimes & garbage collection

Memory Management. Kevin Webb Swarthmore College February 27, 2018

Advanced Programming & C++ Language

Optimizing and Profiling Unity Games for Mobile Platforms. Angelo Theodorou Senior Software Engineer, MPG Gamelab 2014, 25 th -27 th June

Motivation for Dynamic Memory. Dynamic Memory Allocation. Stack Organization. Stack Discussion. Questions answered in this lecture:

Optimizing DirectX Graphics. Richard Huddy European Developer Relations Manager

CS61C : Machine Structures

Garbage Collection. Akim D le, Etienne Renault, Roland Levillain. May 15, CCMP2 Garbage Collection May 15, / 35

PERFORMANCE. Rene Damm Kim Steen Riber COPYRIGHT UNITY TECHNOLOGIES

Plot SIZE. How will execution time grow with SIZE? Actual Data. int array[size]; int A = 0;

ECE 598 Advanced Operating Systems Lecture 14

Welcome to Part 3: Memory Systems and I/O

the gamedesigninitiative at cornell university Lecture 14 2D Sprite Graphics

Lecture Notes on Queues

ECE 571 Advanced Microprocessor-Based Design Lecture 12

CS161 Design and Architecture of Computer Systems. Cache $$$$$

Why memory hierarchy? Memory hierarchy. Memory hierarchy goals. CS2410: Computer Architecture. L1 cache design. Sangyeun Cho

Memory management. Last modified: Adaptation of Silberschatz, Galvin, Gagne slides for the textbook Applied Operating Systems Concepts

CMSC 330: Organization of Programming Languages

VMem. By Stewart Lynch.

CS 11 C track: lecture 5

Installing on WebLogic Server

Exploiting the Behavior of Generational Garbage Collector

CSE P 501 Compilers. Java Implementation JVMs, JITs &c Hal Perkins Winter /11/ Hal Perkins & UW CSE V-1

Optimizing for DirectX Graphics. Richard Huddy European Developer Relations Manager

CSE 333 Lecture 9 - storage

Operating Systems CMPSCI 377, Lec 2 Intro to C/C++ Prashant Shenoy University of Massachusetts Amherst

Lecture 10 Notes Linked Lists

PROCESS VIRTUAL MEMORY. CS124 Operating Systems Winter , Lecture 18

6.172 Performance Engineering of Software Systems Spring Lecture 9. P after. Figure 1: A diagram of the stack (Image by MIT OpenCourseWare.

Cache introduction. April 16, Howard Huang 1

MEMORY MANAGEMENT HEAP, STACK AND GARBAGE COLLECTION

Lecture Notes on Garbage Collection

(Refer Slide Time: 1:26)

Lecture 25: Board Notes: Threads and GPUs

Memory Allocation. Static Allocation. Dynamic Allocation. Dynamic Storage Allocation. CS 414: Operating Systems Spring 2008

CS3600 SYSTEMS AND NETWORKS

the gamedesigninitiative at cornell university Lecture 17 Color and Textures

CMSC 330: Organization of Programming Languages. Memory Management and Garbage Collection

Concurrent Programming using Threads

Computer Science 322 Operating Systems Mount Holyoke College Spring Topic Notes: Processes and Threads

Key Point. What are Cache lines

CSE351 Winter 2016, Final Examination March 16, 2016

CS 241 Honors Memory

CS 270 Algorithms. Oliver Kullmann. Binary search. Lists. Background: Pointers. Trees. Implementing rooted trees. Tutorial

Lecture 10 Notes Linked Lists

CUDA (Compute Unified Device Architecture)

Spring 2011 Prof. Hyesoon Kim

CPS221 Lecture: Threads

CS61C : Machine Structures

Lecture 16. Introduction to Game Development IAP 2007 MIT

JVM Memory Model and GC

Systems Programming and Computer Architecture ( ) Timothy Roscoe

Storage Management 1

CSE 373 OCTOBER 23 RD MEMORY AND HARDWARE

Abstract Interpretation Continued

CS 536 Introduction to Programming Languages and Compilers Charles N. Fischer Lecture 11

CS252 S05. Main memory management. Memory hardware. The scale of things. Memory hardware (cont.) Bottleneck

Page 1. Goals for Today" TLB organization" CS162 Operating Systems and Systems Programming Lecture 11. Page Allocation and Replacement"

LECTURE 4. Announcements

Dynamic Data Structures. John Ross Wallrabenstein

Operating Systems. 09. Memory Management Part 1. Paul Krzyzanowski. Rutgers University. Spring 2015

File Shredders. and, just what is a file?

Older geometric based addressing is called CHS for cylinder-head-sector. This triple value uniquely identifies every sector.

CS61C : Machine Structures

This Unit: Main Memory. Virtual Memory. Virtual Memory. Other Uses of Virtual Memory

Reducing Hit Times. Critical Influence on cycle-time or CPI. small is always faster and can be put on chip

Designing experiments Performing experiments in Java Intel s Manycore Testing Lab

Lecture 7 More Memory Management Slab Allocator. Slab Allocator

Garbage Collection. Steven R. Bagley

Computer Games 2012 Game Development

Computer Memory. Data Structures and Algorithms CSE 373 SP 18 - KASEY CHAMPION 1

Transcription:

Lecture 13

Take-Aways for Today Why does memory in games matter? Is re a difference between PC and mobile? Where do consoles fit in all this? Do we need to worry about it in Java? Java has garbage collection Handles difficult bits for us, right? What can we do in LibGDX? 2

Gaming Memory (Last Generation) Playstation 3 256 MB RAM for system 256 MB for graphics card X-Box 360 512 MB RAM (unified) Nintendo Wii 88 MB RAM (unified) 24 MB for graphics card iphone/ipad 1 GB RAM (unified) 3

Gaming Memory (Current Generation) Playstation 4 8 GB RAM (unified) X-Box One 8 GB RAM (unified) 5 GB for games Nintendo Wii-U 2 GB RAM (unified) Switch has 4 GB (rumored) iphone/ipad 2 GB RAM (unified) 4

Why Not Virtual Memory? Secondary storage exists Consoles have 500 GB HD idevices have 64 GB Flash But access time is slow HDs transfer at ~160 MB/s Best SSD is ~500 MB/s Recall 16 ms per frame At best, can access 8 MB Yields uneven performance 5

Aside: Java Memory Initial heap size Memory app starts with Can get more, but stalls app Set with Xms flag Maximum heap size OutOfMemory if exceed Set with Xmx flag Defaults by RAM installed Initial 25% RAM (<16 MB) Max is 75% RAM (<2 GB) Need more, n set it > java cp game.jar GameMain > java cp game.jar Xms:64m GameMain > java cp game.jar Xmx:4g GameMain > java cp game.jar Xms:64m Xms:64m GameMain 6

Memory Usage: Images Pixel color is 4 bytes 1 byte each for r, b, g, alpha More if using HDR color Image a 2D array of pixels 1280x1024 monitor size 5,242,880 bytes ~ 5 MB More if using mipmaps Graphic card texture feature Smaller versions of image Cached for performance But can double memory use 7

Memory Usage: Images Pixel color is 4 bytes 1 byte each for r, b, g, alpha More if using HDR color Image a 2D array of pixels 1280x1024 monitor size 5,242,880 bytes ~ 5 MB MipMaps Original Image More if using mipmaps Graphic card texture feature Smaller versions of image Cached for performance But can double memory use 8

But My JPEG is only 8 KB! Formats often compressed JPEG, PNG, GIF But not always TIFF Uncompress to display Need space to uncompress In RAM or graphics card Only load when needed Loading is primary I/O operation in AAA games Causes texture popping 9

But My JPEG is only 8 KB! Formats often compressed JPEG, PNG, GIF But not always TIFF Uncompress to display Need space to uncompress In RAM or graphics card Only load when needed Loading is primary I/O operation in AAA games Causes texture popping 10

Loading Screens 11

Problems with Asset Loading How to load assets? May have a lot of assets Init May have large assets Loading is blocking Game stops until done Update Cannot draw or animate May need to unload Running out of memory Draw Free something first 12

Problems with Asset Loading How to load assets? May have a lot of assets May have large assets Loading is blocking Game stops until done Cannot draw or animate May need to unload Running out of memory Free something first Blocks all drawing Blocks next frame Init Update Draw 13

Loading Screens 14

Solution: Asynchronous Loader Game Thread Second Thread Update Draw Specify Asset Notify done Update and draw simple animations until assets loaded Asset Loader 15 Game Architecture

Solution: Asynchronous Loader Game Thread Second Thread Update Specify Asset Notify done Asset Loader Draw Also an asset manager Each asset given a key Can access asset by key Works like Java Map 16 Game Architecture

Solution: Asynchronous Loader Game Thread Second Thread Update Specify Asset Notify done Asset Loader Draw Not always a good idea Only one thread can I/O May need OpenGL utils so will block drawing 17 Game Architecture

Alternative: Iterative Loader Game Thread Asset Manager Update Initialize Asset Update Loader Draw Access 18 Game Architecture

Alternative: Iterative Loader Uses a time budget Give set amount of time Do as much as possible Stop until next update Better for OpenGL Give time to manager Animate with remainder No resource contention LibGDX approach Re-examine game labs Asset Manager Initialize Asset Update Loader Access 19 Game Architecture

Alternative: Iterative Loader Uses a time budget Give set amount of time Do as much as possible Stop until next update Better for OpenGL Give time to manager Animate with remainder No resource contention LibGDX approach Re-examine game labs Load Assets Update Draw Budget b Remaining time t b 20 Game Architecture

Assets Beyond Images AAA games have a lot of 3D geometry Vertices for model polygons Physics bodies per polygon Scene graphs for organizing this data When are all se objects created? At load time (filling up memory)? Or only when y are needed? We need to understand memory better 21

Traditional Memory Organization High Address Stack Free Space Heap Program Data Program Code Static Variables 22 Low Address

Traditional Memory Organization High Address Stack Function parameters Local variables Return values Free Space Heap Low Address Program Data Program Code Static Variables 23

Traditional Memory Organization High Address Stack Function parameters Local variables Return values Free Space Heap Objects created via new (e.g. Every object in Java) Low Address Program Data Program Code Static Variables 24

Traditional Memory Organization High Address Stack Function parameters Local variables Return values Easy to Handle Free Space Heap Objects created via new (e.g. Every object in Java) 25 Low Address Program Data Program Code Static Variables Easy to Handle

Traditional Memory Organization High Address Stack Function parameters Local variables Return values Easy to Handle Free Space Heap Objects created via new (e.g. Every object in Java) Problems! 26 Low Address Program Data Program Code Static Variables Easy to Handle

Problem with Heap Allocation It can be slower to access Not always contiguous Stacks are nicer for caches private void handlecollision(shell s1, Shell s2) { // Find axis of "collision" Vector2 axis = new Vector2(s1.getPosition()); axis.sub(s2.getposition()); Garbage collection is brutal Old collectors would block New collectors are better but slower than manual Very bad if high churn Rapid creation/deletion Example: Particle systems } // Compute projections Vector2 temp1 = new Vector2(s2.getPosition()); temp1.sub(s1.getposition()).nor(); Vector2 temp2 = new Vector2(s1.getPosition()); temp2.sub(s2.getposition()).nor(); // Compute new velocities temp1.scl(temp1.dot(s1.getvelocity())); temp2.scl(temp2.dot(s2.getvelocity())); // Apply to objects s1.getvelocity().sub(temp1).add(temp2); s2.getvelocity().sub(temp2).add(temp1); 27

Problem with Heap Allocation It can be slower to access Not always contiguous Stacks are nicer for caches private void handlecollision(shell s1, Shell s2) { // Find axis of "collision" Vector2 axis = new Vector2(s1.getPosition()); axis.sub(s2.getposition()); Created/deleted every frame Garbage collection is brutal Old collectors would block New collectors are better but slower than manual Very bad if high churn Rapid creation/deletion Example: Particle systems } // Compute projections Vector2 temp1 = new Vector2(s2.getPosition()); temp1.sub(s1.getposition()).nor(); Vector2 temp2 = new Vector2(s1.getPosition()); temp2.sub(s2.getposition()).nor(); // Compute new velocities temp1.scl(temp1.dot(s1.getvelocity())); temp2.scl(temp2.dot(s2.getvelocity())); // Apply to objects s1.getvelocity().sub(temp1).add(temp2); s2.getvelocity().sub(temp2).add(temp1); 28

Memory Organization and Games Inter-Frame Memory Carries over across frame boundaries Update Draw Intra-Frame Memory Recovered each frame 29

Memory Organization and Games Inter-Frame Memory Carries over across frame boundaries Heap Update or Stack? Does it matter? Draw Intra-Frame Memory Recovered each frame 30

Distinguishing Data Types Intra-Frame Local computation Local variables (managed by compiler) Temporary objects (not necessarily managed) Transient data structures Built at start of update Used to process update Can be deleted at end Inter-Frame Game state Model instances Controller state View state and caches Long-term data structures Built at start/during frame Lasts for multiple frames May adjust to data changes 31

Distinguishing Data Types Intra-Frame Local computation Local variables (managed by compiler) Temporary objects (not necessarily managed) Transient data structures Built at start of update Used to process update Can be deleted at end Inter-Frame Game state Model instances Controller state View state and caches Long-term data structures Built at start/during frame Lasts for multiple frames May adjust to data changes 32

Distinguishing Data Types Intra-Frame Local computation Local variables (managed by compiler) Temporary objects (not necessarily managed) Transient data structures Built at start of update Used to process update Can be deleted at end Inter-Frame Game state Model instances Controller state View state and caches Long-term data structures Built at start/during frame Lasts for multiple frames May adjust to data changes 33

Handling Game Memory Intra-Frame Does not need to be paged Drop latest frame Restart on frame boundary Want size reasonably fixed Local variables always are Limited # of allocations Limit new inside loops Make use of cached objects Requires careful planning Inter-Frame Potential to be paged Defines current game state May just want level start Size is more flexible No. of objects is variable Subsystems may turn on/off User settings may affect Preallocate as possible Recycle with free lists 34

Rule of Thumb: Limiting new Limit new to constructors Identify object owner Allocate in owner constructor Root Controller Example: cached objects Look at what algorithm needs Allocate all necessary objects Algorithm just sets cache Subcontroller Subcontroller Problem: readability Naming is key to readability But new names = new objects Make good use of comments Model Model Model 35

Rule of Thumb: Limiting new Limit new to constructors Identify object owner Allocate in owner constructor Root Controller allocate Example: cached objects Look at what algorithm needs Allocate all necessary objects Algorithm just sets cache Subcontroller Subcontroller Problem: readability Naming is key to readability But new names = new objects Make good use of comments Model Model Model 36

Object Preallocation Idea: Allocate before need Compute maximum needed Create a list of objects Allocate contents at start Pull from list when neeeded Problem: Running out Eventually at end of list Want to reuse older objects Easy if deletion is FIFO But what if it isn t? Motivation for free list // Allocate all of particles Particle[] list = new Particle[CAP]; for(int ii = 0; ii < CAP; ii++) { list[ii] = new Particle(); } // Keep track of next particle int next = 0; // Need to allocate particle Particle p = list[next++]; p.set( ); 37

Free Lists Create an object queue Separate from preallocation Stores objects when freed To allocate an object Look at front of free list If object re take it Orwise make new object Preallocation unnecessary Queue wins in long term Main performance hit is garbage collector // Free new particle freelist.push(p); // Allocate a new particle Particle q; if (!freelist.isempty()) { q = freelist.pop(); } else { q = new Particle(); } q.set( ) 38

LibGDX Support: Pool Pool<T> public void free(t obj); Add an object to free list public T obtain(); Use this in place of new If object on free list, use it Orwise make new object public T newobject(); Rule to create a new object Could be preallocated Pool.Poolable public void reset(); Erases object contents Used when object freed Must be implemented by T Parameter free constructors Set contents with initializers See MemoryPool demo Also PooledList in Lab 4 39

Summary Memory usage is always an issue in games Uncompressed images are quite large Particularly a problem on mobile devices Asset loading must be balanced with animation LibGDX uses an incremental approach Limit calls to new in your animation frames Intra-frame objects: cached objects Inter-frame objects: free lists 40