General Transformations for GPU Execution of Tree Traversals
|
|
- Griselda May
- 6 years ago
- Views:
Transcription
1 eneral Transformations for PU Execution of Tree Traversals Michael oldfarb*, Youngjoon Jo**, Milind Kulkarni School of Electrical and Computer Engineering * Now at Qualcomm; ** Now at oogle
2 PU execution of irregular programs PUs offer promise of massive, energy-efficient parallelism Much success in mapping regular applications to PUs Regular memory accesses, predictable computation Much less success in mapping irregular applications Pointer-based data structures Unpredictable, input-dependent computation and memory accesses 2
3 Tree traversal algorithms Many irregular algorithms are built around tree-traversal arnes-hut Nearest-neighbor 2-point correlation Numerous papers describing how to map tree traversal algorithms to PUs 3
4 Point correlation Data mining algorithm oal: given a set of N points in k dimensions and a point p, find all points within a radius r of p Naïve approach: compare all N points with p etter approach: build kdtree over points, traverse tree for point p, prune subtrees that are far from p 4
5 Point correlation Data mining algorithm oal: given a set of N points in k dimensions and a point p, find all points within a radius r of p Naïve approach: compare all N points with p etter approach: build kdtree over points, traverse tree for point p, prune subtrees that are far from p 5
6 Point correlation Data mining algorithm oal: given a set of N points in k dimensions and a point p, find all points within a radius r of p Naïve approach: compare all N points with p etter approach: build kdtree over points, traverse tree for point p, prune subtrees that are far from p 6
7 Point correlation 7
8 Point correlation 7
9 Point correlation C F 7
10 Point correlation C F D E 7
11 Point correlation C F H K D E 7
12 Point correlation C F H K D E I J 7
13 Point correlation C F H K D E I J 8
14 Point correlation C F H K D E I J 8
15 Point correlation C F H K D E I J 9
16 Point correlation C F H K D E I J 10
17 Point correlation C F H K D E I J 11
18 Point correlation C F H K D E I J 12
19 Point correlation C F H K D E I J 13
20 Point correlation C F H K D E I J 14
21 Point correlation KDCell root = /* build kdtree */; Set<Point> ps; double radius; foreach Point p in ps { recurse(p, root, radius); }... void recurse(point p, KDCell node, double r) { if (toofar(p, node, r)) return; if (node.isleaf() && (dist(node.point, p) < r)) p.correlated++; else { recurse(p, node.left, r); recurse(p, node.right, r); } } 15
22 asic pattern TreeNode root; Set<Point> ps; foreach Point p in ps { recurse(p, root,...); }... recurse(point p, KDCell node,...) { if (truncate?(p, node,...)) {... } recurse(p, node.child1,...); recurse(p, node.child2,...);... } 16
23 asic pattern TreeNode root; Set<Point> ps; foreach Point p in ps { recurse(p, root,...); }... recurse(point p, KDCell node,...) { if (truncate?(p, node,...)) {... } recurse(p, node.child1,...); recurse(p, node.child2,...);... } recursive traversal 16
24 asic pattern TreeNode root; Set<Point> ps; foreach Point p in ps { recurse(p, root,...); }... recurse(point p, KDCell node,...) { if (truncate?(p, node,...)) {... } recurse(p, node.child1,...); recurse(p, node.child2,...);... } tree structure recursive traversal 16
25 asic pattern TreeNode root; Set<Point> ps; foreach Point p in ps { recurse(p, root,...); }... recurse(point p, KDCell node,...) { if (truncate?(p, node,...)) {... } recurse(p, node.child1,...); recurse(p, node.child2,...);... } tree structure repeated traversal recursive traversal 16
26 asic pattern TreeNode root; Set<Point> ps; foreach Point p in ps { recurse(p, root,...); }... recurse(point p, KDCell node,...) { if (truncate?(p, Lots of parallelism! node,...)) {... } recurse(p, node.child1,...); recurse(p, node.child2,...);... } tree structure repeated traversal recursive traversal 16
27 What s the problem? PUs add high overhead for recursion PUs work best when memory accesses are regular and strided, but irregular algorithms have unpredictable memory accesses Status quo: ad hoc solutions New algorithm? New PU techniques! 17
28 What s the problem? PUs add high overhead for recursion PUs work best when memory accesses are regular and strided, but irregular algorithms have unpredictable memory accesses Want generally applicable techniques for mapping irregular applications to PUs Status quo: ad hoc solutions New algorithm? New PU techniques! 17
29 Contributions Two general techniques for mapping treetraversals to PUs utoropes: eliminates recursion overhead Lockstepping: promotes memory coalescing Compiler pass to automatically apply techniques to recursive tree-traversal code Significant PU speedups on 5 tree-traversal algorithms 18
30 Naïve PU implementation Warp-based SIMT (single-instruction, multiplethread) execution 32 points put in a single warp Warp traverses tree ll points in warp must execute same instruction If points diverge, some points sit idle while other threads execute 19
31 Naïve PU implementation C F H K D E I J 20
32 Naïve PU implementation C F H K D E I J 20
33 Naïve PU implementation C F H K D E I J 20
34 Naïve PU implementation C F H K D E I J 20
35 Naïve PU implementation C F H K D E I J 21
36 Naïve PU implementation C F H K D E I J 22
37 Naïve PU implementation C F H K D E I J 23
38 Naïve PU implementation C F H K D E I J 24
39 Naïve PU implementation C F H K D E I J 25
40 Naïve PU implementation C F H K D E I J 26
41 Naïve PU implementation C F H K D E I J 27
42 Naïve PU implementation C F H K D E I J 28
43 Naïve PU implementation C F H K D E I J 29
44 Naïve PU implementation C F H K D E I J 30
45 Naïve PU implementation C F H K D E I J 31
46 Naïve PU implementation C F H K D E I J 32
47 Naïve PU implementation C F H K D E I J 33
48 Naïve PU implementation C F H K D E I J 34
49 Lots of accesses to tree Many accesses just moving up the tree in order to later move down again Lots of function stack manipulation Trees are very large, cannot be stored in PU s fast memory Want to minimize accesses to tree 35
50 How to avoid extra accesses to tree? Typical technique: ropes Pointers in each tree node that let a traversal jump to the next part of the tree Effectively linearizes traversal C F H K D E I J 36
51 How to avoid extra accesses to tree? Typical technique: ropes Pointers in each tree node that let a traversal jump to the next part of the tree Effectively linearizes traversal C F H K D E I J 36
52 How to avoid extra accesses to tree? Typical technique: ropes Pointers in each tree node that let a traversal jump to the next part of the tree Effectively linearizes traversal C F H K D E I J 36
53 How to avoid extra accesses to tree? Typical technique: ropes Pointers in each tree node that let a traversal jump to the next part of the tree Effectively linearizes traversal C F H K D E I J 36
54 How to avoid extra accesses to tree? Installing ropes into a tree requires complex, applicationspecific preprocessing Using ropes correctly during execution requires complex, application-specific logic C F H K D E I J 37
55 utoropes eneral technique for linearizing tree traversal for arbitrary traversal algorithms chieve generality, simplicity and spaceefficiency at the cost of overhead Key insight: recursive tree algorithms are just depth-first traversals of a tree; can transform into iterative algorithm 38
56 utoropes C F H K D E I J Rope stack 39
57 utoropes C F H K D E I J Rope stack 40
58 utoropes F C C F H K D E I J Rope stack 41
59 utoropes F E D C F H K D E I J Rope stack 42
60 utoropes F E C F H K D E I J Rope stack 43
61 utoropes F C F H K D E I J Rope stack 44
62 utoropes C F H K D E I J Rope stack 45
63 utoropes Ropes stored on rope stack instead of in tree No application-specific code to use ropes Ropes instantiated dynamically No preprocessing required Same access patterns as with manual ropes Extra pushes and pops on rope stack add some overhead 46
64 utoropes Simple compiler transformation converts recursive code into iterative autoropes code void recurse(point p, KDCell node, double r) { if (toofar(p, node, r)) return; if (node.isleaf() && (dist(node.point, p) < r)) p.correlated++; else { recurse(p, node.left, r); recurse(p, node.right, r); } } 47
65 utoropes ropestack.push(root); while (!ropestack.isempty()) { node = ropestack.pop(); if (toofar(p, node, r)) continue; if (node.isleaf() && (dist(node.point, p) < r)) p.correlated++; else { ropestack.push(node.right); ropestack.push(node.left); } } See paper for details of how to transform more complex code 48
66 Unintended consequence Recursive calls naturally lead to thread divergence If some threads make recursive calls, other threads wait until calls return Does not happen for iterative code ll threads reconverge at beginning of loop ropestack.push(root); while (!ropestack.isempty()) { node = ropestack.pop(); if (toofar(p, node, r)) continue; if (...) p.correlated++; else { ropestack.push(node.right); ropestack.push(node.left); } } 49
67 Unintended consequence Recursive calls naturally lead to thread divergence If some threads make recursive calls, other threads wait until calls return Does not happen for iterative code ll threads reconverge at beginning of loop ropestack.push(root); while (!ropestack.isempty()) { node = ropestack.pop(); if (toofar(p, node, r)) continue; if (...) formerly return p.correlated++; else { ropestack.push(node.right); ropestack.push(node.left); } } formerly recursion 49
68 utoropes on PU C F H K D E I J 50
69 utoropes on PU C F H K D E I J 51
70 utoropes on PU C F H K D E I J 52
71 utoropes on PU C F H K D E I J 53
72 utoropes on PU C F H K D E I J 54
73 utoropes on PU C F H K D E I J 55
74 utoropes on PU C F H K D E I J 56
75 utoropes on PU C F Threads no longer diverge in execution ut do diverge in tree! H K D E I J 56
76 Thread divergence vs. memory coalescing Memory accesses on PU only well behaved if accesses by all threads in warp can be coalesced Same memory or strided access ad memory behavior of autoropes outweighs lack of thread divergence oal: benefits of autoropes while maintaining memory coalescing 57
77 Lockstepping Essentially, force PU to let threads diverge If any thread in a warp wants to visit a node s children, all threads in a warp visit the child Threads that are dragged along are programmatically masked out Warp execution takes longer (proportional to union of threads traversals, rather than longest traversal), but improved memory performance makes up for it utomatically implemented during autoropes compiler pass 58
78 Dynamic lockstepping Some algorithms allow different traversal orders Some points visit left child before right, and others visit right before left Optimization reduces traversal size Inherently bad memory access patterns 59
79 Dynamic lockstepping C F H K D E I J 60
80 Dynamic lockstepping C F H K D E I J 61
81 Dynamic lockstepping C F H K D E I J 62
82 Dynamic lockstepping C F H K D E I J 63
83 Dynamic lockstepping C F H K D E I J 64
84 Dynamic lockstepping Dynamic lockstepping allows all points in a warp to vote on which traversal order to use Maintains memory coalescing Some points do more work than in original algorithm Tradeoff can still be worth it! 65
85 Engineering details Transform point data from array of structures format to structure of arrays Use analysis from [PCT 2013] to prove safety and transform automatically Copy tree data to PU in linearized fashion Lay out fields of tree and point according to use (more commonly-accessed fields placed in shared memory) Interleave rope stacks for points in warp to allow strided access 66
86 Results Two platforms PU platform: NVIDI Tesla C2070 (6 global memory, 14 SMs) CPU platform: 32-core, 2.3 Hz Opteron Five benchmarks arnes-hut, Point correlation, Nearest neighbor, k- Nearest neighbor, Vantage point trees Multiple inputs per benchmark Used sorted and unsorted points 67
87 High-level takeaways utoropes+lockstep always faster than simple recursive PU implementation (up to 14x faster) For most benchmarks/inputs, best PU implementation faster than CPU implementation up to 16 threads Speedups comparable to hand-written implementations 68
88 arnes-hut 69
89 Point correlation 70
90 Nearest-neighbor 71
91 Conclusions Mapping irregular applications to PUs is very difficult Developed two general techniques, autoropes and lockstepping, that can achieve significant speedup on PU vs. baseline PU code and CPU implementations utomatic approaches competitive with previous hand-written implementations 72
92 eneral Transformations for PU Execution of Tree Traversals Michael oldfarb*, Youngjoon Jo**, Milind Kulkarni School of Electrical and Computer Engineering * Now at Qualcomm; ** Now at oogle
General Transformations for GPU Execution of Tree Traversals
General Transformations for GPU Execution of Tree Traversals Michael Goldfarb, Youngjoon Jo and Milind Kulkarni School of Electrical and Computer Engineering Purdue University West Lafayette, IN {mgoldfar,
More informationAUTOMATIC VECTORIZATION OF TREE TRAVERSALS
AUTOMATIC VECTORIZATION OF TREE TRAVERSALS Youngjoon Jo, Michael Goldfarb and Milind Kulkarni PACT, Edinburgh, U.K. September 11 th, 2013 Youngjoon Jo 2 Commodity processors and SIMD Commodity processors
More informationWarped parallel nearest neighbor searches using kd-trees
Warped parallel nearest neighbor searches using kd-trees Roman Sokolov, Andrei Tchouprakov D4D Technologies Kd-trees Binary space partitioning tree Used for nearest-neighbor search, range search Application:
More informationAutomatically Enhancing Locality for Tree Traversals with Traversal Splicing
Purdue University Purdue e-pubs ECE Technical Reports Electrical and Computer Engineering 2-15-2012 Automatically Enhancing Locality for Tree Traversals with Traversal Splicing Youngjoon Jo Electrical
More informationPhoton Mapping. Kadi Bouatouch IRISA
Kadi Bouatouch IRISA Email: kadi@irisa.fr 1 Photon emission and transport 2 Photon caching 3 Spatial data structure for fast access 4 Radiance estimation 5 Kd-tree Balanced Binary Tree When a splitting
More informationAn Efficient CUDA Implementation of a Tree-Based N-Body Algorithm. Martin Burtscher Department of Computer Science Texas State University-San Marcos
An Efficient CUDA Implementation of a Tree-Based N-Body Algorithm Martin Burtscher Department of Computer Science Texas State University-San Marcos Mapping Regular Code to GPUs Regular codes Operate on
More informationCSC148 Week 7. Larry Zhang
CSC148 Week 7 Larry Zhang 1 Announcements Test 1 can be picked up in DH-3008 A1 due this Saturday Next week is reading week no lecture, no labs no office hours 2 Recap Last week, learned about binary trees
More informationChapter 20: Binary Trees
Chapter 20: Binary Trees 20.1 Definition and Application of Binary Trees Definition and Application of Binary Trees Binary tree: a nonlinear linked list in which each node may point to 0, 1, or two other
More informationEnhancing Locality for Recursive Traversals of Recursive Structures
Enhancing Locality for Recursive Traversals of Recursive Structures Youngjoon Jo and Milind Kulkarni School of Electrical and Computer Engineering Purdue University {yjo,milind}@purdue.edu Abstract While
More informationCMPSCI 187: Programming With Data Structures. Lecture #28: Binary Search Trees 21 November 2011
CMPSCI 187: Programming With Data Structures Lecture #28: Binary Search Trees 21 November 2011 Binary Search Trees The Idea of a Binary Search Tree The BinarySearchTreeADT Interface The LinkedBinarySearchTree
More informationCharacterization of CUDA and KD-Tree K-Query Point Nearest Neighbor for Static and Dynamic Data Sets
Characterization of CUDA and KD-Tree K-Query Point Nearest Neighbor for Static and Dynamic Data Sets Brian Bowden bbowden1@vt.edu Jon Hellman jonatho7@vt.edu 1. Introduction Nearest neighbor (NN) search
More informationTreelogy: A Benchmark Suite for Tree Traversals
Purdue University Programming Languages Group Treelogy: A Benchmark Suite for Tree Traversals Nikhil Hegde, Jianqiao Liu, Kirshanthan Sundararajah, and Milind Kulkarni School of Electrical and Computer
More informationCSE 332: Data Structures & Parallelism Lecture 7: Dictionaries; Binary Search Trees. Ruth Anderson Winter 2019
CSE 332: Data Structures & Parallelism Lecture 7: Dictionaries; Binary Search Trees Ruth Anderson Winter 2019 Today Dictionaries Trees 1/23/2019 2 Where we are Studying the absolutely essential ADTs of
More informationTrees. CSE 373 Data Structures
Trees CSE 373 Data Structures Readings Reading Chapter 7 Trees 2 Why Do We Need Trees? Lists, Stacks, and Queues are linear relationships Information often contains hierarchical relationships File directories
More informationHARNESSING IRREGULAR PARALLELISM: A CASE STUDY ON UNSTRUCTURED MESHES. Cliff Woolley, NVIDIA
HARNESSING IRREGULAR PARALLELISM: A CASE STUDY ON UNSTRUCTURED MESHES Cliff Woolley, NVIDIA PREFACE This talk presents a case study of extracting parallelism in the UMT2013 benchmark for 3D unstructured-mesh
More informationUses for Trees About Trees Binary Trees. Trees. Seth Long. January 31, 2010
Uses for About Binary January 31, 2010 Uses for About Binary Uses for Uses for About Basic Idea Implementing Binary Example: Expression Binary Search Uses for Uses for About Binary Uses for Storage Binary
More informationTrees : Part 1. Section 4.1. Theory and Terminology. A Tree? A Tree? Theory and Terminology. Theory and Terminology
Trees : Part Section. () (2) Preorder, Postorder and Levelorder Traversals Definition: A tree is a connected graph with no cycles Consequences: Between any two vertices, there is exactly one unique path
More informationTrees. Tree Structure Binary Tree Tree Traversals
Trees Tree Structure Binary Tree Tree Traversals The Tree Structure Consists of nodes and edges that organize data in a hierarchical fashion. nodes store the data elements. edges connect the nodes. The
More informationWhy Do We Need Trees?
CSE 373 Lecture 6: Trees Today s agenda: Trees: Definition and terminology Traversing trees Binary search trees Inserting into and deleting from trees Covered in Chapter 4 of the text Why Do We Need Trees?
More informationCS24 Week 8 Lecture 1
CS24 Week 8 Lecture 1 Kyle Dewey Overview Tree terminology Tree traversals Implementation (if time) Terminology Node The most basic component of a tree - the squares Edge The connections between nodes
More informationSorted Arrays. Operation Access Search Selection Predecessor Successor Output (print) Insert Delete Extract-Min
Binary Search Trees FRIDAY ALGORITHMS Sorted Arrays Operation Access Search Selection Predecessor Successor Output (print) Insert Delete Extract-Min 6 10 11 17 2 0 6 Running Time O(1) O(lg n) O(1) O(1)
More informationMIDTERM WEEK - 9. Question 1 : Implement a MyQueue class which implements a queue using two stacks.
Ashish Jamuda Week 9 CS 331-DATA STRUCTURES & ALGORITHMS MIDTERM WEEK - 9 Question 1 : Implement a MyQueue class which implements a queue using two stacks. Solution: Since the major difference between
More informationBinary Trees
Binary Trees 4-7-2005 Opening Discussion What did we talk about last class? Do you have any code to show? Do you have any questions about the assignment? What is a Tree? You are all familiar with what
More informationOverview of Presentation. Heapsort. Heap Properties. What is Heap? Building a Heap. Two Basic Procedure on Heap
Heapsort Submitted by : Hardik Parikh(hjp0608) Soujanya Soni (sxs3298) Overview of Presentation Heap Definition. Adding a Node. Removing a Node. Array Implementation. Analysis What is Heap? A Heap is a
More informationLECTURE 11 TREE TRAVERSALS
DATA STRUCTURES AND ALGORITHMS LECTURE 11 TREE TRAVERSALS IMRAN IHSAN ASSISTANT PROFESSOR AIR UNIVERSITY, ISLAMABAD BACKGROUND All the objects stored in an array or linked list can be accessed sequentially
More informationOutline. An Application: A Binary Search Tree. 1 Chapter 7: Trees. favicon. CSI33 Data Structures
Outline Chapter 7: Trees 1 Chapter 7: Trees Approaching BST Making a decision We discussed the trade-offs between linked and array-based implementations of sequences (back in Section 4.7). Linked lists
More information3137 Data Structures and Algorithms in C++
3137 Data Structures and Algorithms in C++ Lecture 4 July 17 2006 Shlomo Hershkop 1 Announcements please make sure to keep up with the course, it is sometimes fast paced for extra office hours, please
More informationThe Art of Recursion: Problem Set 10
The Art of Recursion: Problem Set Due Tuesday, November Tail recursion A recursive function is tail recursive if every recursive call is in tail position that is, the result of the recursive call is immediately
More informationSimultaneous Branch and Warp Interweaving for Sustained GPU Performance
Simultaneous Branch and Warp Interweaving for Sustained GPU Performance Nicolas Brunie Sylvain Collange Gregory Diamos by Ravi Godavarthi Outline Introduc)on ISCA'39 (Interna'onal Society for Computers
More informationAn Introduction to Trees
An Introduction to Trees Alice E. Fischer Spring 2017 Alice E. Fischer An Introduction to Trees... 1/34 Spring 2017 1 / 34 Outline 1 Trees the Abstraction Definitions 2 Expression Trees 3 Binary Search
More informationCSE 326: Data Structures Binary Search Trees
nnouncements (1/3/0) S 36: ata Structures inary Search Trees Steve Seitz Winter 0 HW # due now HW #3 out today, due at beginning of class next riday. Project due next Wed. night. Read hapter 4 1 Ts Seen
More informationScalable Algorithmic Techniques Decompositions & Mapping. Alexandre David
Scalable Algorithmic Techniques Decompositions & Mapping Alexandre David 1.2.05 adavid@cs.aau.dk Introduction Focus on data parallelism, scale with size. Task parallelism limited. Notion of scalability
More informationLecture 16: Binary Search Trees
Extended Introduction to Computer Science CS1001.py Lecture 16: Binary Search Trees Instructors: Daniel Deutch, Amir Rubinstein Teaching Assistants: Michal Kleinbort, Amir Gilad School of Computer Science
More information8. Write an example for expression tree. [A/M 10] (A+B)*((C-D)/(E^F))
DHANALAKSHMI COLLEGE OF ENGINEERING DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING EC6301 OBJECT ORIENTED PROGRAMMING AND DATA STRUCTURES UNIT IV NONLINEAR DATA STRUCTURES Part A 1. Define Tree [N/D 08]
More informationGlobal Memory Access Pattern and Control Flow
Optimization Strategies Global Memory Access Pattern and Control Flow Objectives Optimization Strategies Global Memory Access Pattern (Coalescing) Control Flow (Divergent branch) Global l Memory Access
More informationSeparate Compilation and Namespaces Week Fall. Computer Programming for Engineers
Separate Compilation and Namespaces Week 07 Fall Computer Programming for Engineers Problem Multiplied palindrome number Multiplied palindrome number A number is called as a palindrome when you can read
More informationWhere we are. CSE373: Data Structures & Algorithms Lecture 4: Dictionaries; Binary Search Trees. The Dictionary (a.k.a. Map) ADT
Where we are Studying the absolutely essential DTs of computer science and classic data structures for implementing them SE: Data Structures & lgorithms Lecture : Dictionaries; inary Search Trees Dan rossman
More informationBacktracking. Chapter 5
1 Backtracking Chapter 5 2 Objectives Describe the backtrack programming technique Determine when the backtracking technique is an appropriate approach to solving a problem Define a state space tree for
More informationStackless BVH Collision Detection for Physical Simulation
Stackless BVH Collision Detection for Physical Simulation Jesper Damkjær Department of Computer Science, University of Copenhagen Universitetsparken 1, DK-2100, Copenhagen Ø, Denmark damkjaer@diku.dk February
More informationTrees! Ellen Walker! CPSC 201 Data Structures! Hiram College!
Trees! Ellen Walker! CPSC 201 Data Structures! Hiram College! ADTʼs Weʼve Studied! Position-oriented ADT! List! Stack! Queue! Value-oriented ADT! Sorted list! All of these are linear! One previous item;
More informationCSI33 Data Structures
Outline Department of Mathematics and Computer Science Bronx Community College October 24, 2016 Outline Outline 1 Chapter 7: Trees Outline Chapter 7: Trees 1 Chapter 7: Trees The Binary Search Property
More informationDATA STRUCTURES AND ALGORITHMS
LECTURE 11 Babeş - Bolyai University Computer Science and Mathematics Faculty 2017-2018 In Lecture 10... Hash tables Separate chaining Coalesced chaining Open Addressing Today 1 Open addressing - review
More informationCS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c/su06 CS61C : Machine Structures Lecture #6: Memory Management CS 61C L06 Memory Management (1) 2006-07-05 Andy Carle Memory Management (1/2) Variable declaration allocates
More informationkd-trees Idea: Each level of the tree compares against 1 dimension. Let s us have only two children at each node (instead of 2 d )
kd-trees Invented in 1970s by Jon Bentley Name originally meant 3d-trees, 4d-trees, etc where k was the # of dimensions Now, people say kd-tree of dimension d Idea: Each level of the tree compares against
More informationTree traversals and binary trees
Tree traversals and binary trees Comp Sci 1575 Data Structures Valgrind Execute valgrind followed by any flags you might want, and then your typical way to launch at the command line in Linux. Assuming
More informationParallel Physically Based Path-tracing and Shading Part 3 of 2. CIS565 Fall 2012 University of Pennsylvania by Yining Karl Li
Parallel Physically Based Path-tracing and Shading Part 3 of 2 CIS565 Fall 202 University of Pennsylvania by Yining Karl Li Jim Scott 2009 Spatial cceleration Structures: KD-Trees *Some portions of these
More informationCSE373: Data Structures & Algorithms Lecture 5: Dictionaries; Binary Search Trees. Aaron Bauer Winter 2014
CSE373: Data Structures & lgorithms Lecture 5: Dictionaries; Binary Search Trees aron Bauer Winter 2014 Where we are Studying the absolutely essential DTs of computer science and classic data structures
More informationPostfix (and prefix) notation
Postfix (and prefix) notation Also called reverse Polish reversed form of notation devised by mathematician named Jan Łukasiewicz (so really lü-kä-sha-vech notation) Infix notation is: operand operator
More informationPriority Queues, Binary Heaps, and Heapsort
Priority Queues, Binary eaps, and eapsort Learning Goals: Provide examples of appropriate applications for priority queues and heaps. Implement and manipulate a heap using an array as the underlying data
More informationFast Bit Sort. A New In Place Sorting Technique. Nando Favaro February 2009
Fast Bit Sort A New In Place Sorting Technique Nando Favaro February 2009 1. INTRODUCTION 1.1. A New Sorting Algorithm In Computer Science, the role of sorting data into an order list is a fundamental
More informationCS 62 Practice Final SOLUTIONS
CS 62 Practice Final SOLUTIONS 2017-5-2 Please put your name on the back of the last page of the test. Note: This practice test may be a bit shorter than the actual exam. Part 1: Short Answer [32 points]
More informationMilind Kulkarni Research Statement
Milind Kulkarni Research Statement With the increasing ubiquity of multicore processors, interest in parallel programming is again on the upswing. Over the past three decades, languages and compilers researchers
More informationBalanced Search Trees. CS 3110 Fall 2010
Balanced Search Trees CS 3110 Fall 2010 Some Search Structures Sorted Arrays Advantages Search in O(log n) time (binary search) Disadvantages Need to know size in advance Insertion, deletion O(n) need
More informationData Structures and Algorithms
Data Structures and Algorithms CS245-2008S-19 B-Trees David Galles Department of Computer Science University of San Francisco 19-0: Indexing Operations: Add an element Remove an element Find an element,
More informationBinary Tree. Binary tree terminology. Binary tree terminology Definition and Applications of Binary Trees
Binary Tree (Chapter 0. Starting Out with C++: From Control structures through Objects, Tony Gaddis) Le Thanh Huong School of Information and Communication Technology Hanoi University of Technology 11.1
More informationProgrammer's View of Execution Teminology Summary
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 28: GP-GPU Programming GPUs Hardware specialized for graphics calculations Originally developed to facilitate the use of CAD programs
More informationOptimization Strategies Global Memory Access Pattern and Control Flow
Global Memory Access Pattern and Control Flow Optimization Strategies Global Memory Access Pattern (Coalescing) Control Flow (Divergent branch) Highest latency instructions: 400-600 clock cycles Likely
More informationCourse Review. Cpt S 223 Fall 2009
Course Review Cpt S 223 Fall 2009 1 Final Exam When: Tuesday (12/15) 8-10am Where: in class Closed book, closed notes Comprehensive Material for preparation: Lecture slides & class notes Homeworks & program
More informationa graph is a data structure made up of nodes in graph theory the links are normally called edges
1 Trees Graphs a graph is a data structure made up of nodes each node stores data each node has links to zero or more nodes in graph theory the links are normally called edges graphs occur frequently in
More information/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Priority Queues / Heaps Date: 9/27/17
01.433/33 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Priority Queues / Heaps Date: 9/2/1.1 Introduction In this lecture we ll talk about a useful abstraction, priority queues, which are
More informationBinomial Queue deletemin ADTs Seen So Far
Today s Outline Trees (inary Search Trees) hapter in Weiss S ata Structures Ruth nderson nnouncements Written HW # due next riday, 1/ Project due next Monday, /1 Today s Topics: Priority Queues inomial
More information9/29/2016. Chapter 4 Trees. Introduction. Terminology. Terminology. Terminology. Terminology
Introduction Chapter 4 Trees for large input, even linear access time may be prohibitive we need data structures that exhibit average running times closer to O(log N) binary search tree 2 Terminology recursive
More informationCS61B Lecture #20. Today: Trees. Readings for Today: Data Structures, Chapter 5. Readings for Next Topic: Data Structures, Chapter 6
CS61B Lecture #20 Today: Trees Readings for Today: Data Structures, Chapter 5 Readings for Next Topic: Data Structures, Chapter 6 Last modified: Wed Oct 14 03:20:09 2015 CS61B: Lecture #20 1 A Recursive
More informationLecture 15 Notes Binary Search Trees
Lecture 15 Notes Binary Search Trees 15-122: Principles of Imperative Computation (Spring 2016) Frank Pfenning, André Platzer, Rob Simmons 1 Introduction In this lecture, we will continue considering ways
More informationBalanced Search Trees
Balanced Search Trees Computer Science E-22 Harvard Extension School David G. Sullivan, Ph.D. Review: Balanced Trees A tree is balanced if, for each node, the node s subtrees have the same height or have
More informationAdvanced Algorithm Design and Analysis (Lecture 12) SW5 fall 2005 Simonas Šaltenis E1-215b
Advanced Algorithm Design and Analysis (Lecture 12) SW5 fall 2005 Simonas Šaltenis E1-215b simas@cs.aau.dk Range Searching in 2D Main goals of the lecture: to understand and to be able to analyze the kd-trees
More informationLPT Codes used with ROAM Splitting Algorithm CMSC451 Honors Project Joseph Brosnihan Mentor: Dave Mount Fall 2015
LPT Codes used with ROAM Splitting Algorithm CMSC451 Honors Project Joseph Brosnihan Mentor: Dave Mount Fall 2015 Abstract Simplicial meshes are commonly used to represent regular triangle grids. Of many
More informationOn the Correctness of the SIMT Execution Model of GPUs. Extended version of the author s ESOP 12 paper. Axel Habermaier and Alexander Knapp
UNIVERSITÄT AUGSBURG On the Correctness of the SIMT Execution Model of GPUs Extended version of the author s ESOP 12 paper Axel Habermaier and Alexander Knapp Report 2012-01 January 2012 INSTITUT FÜR INFORMATIK
More informationGPU ACCELERATED SELF-JOIN FOR THE DISTANCE SIMILARITY METRIC
GPU ACCELERATED SELF-JOIN FOR THE DISTANCE SIMILARITY METRIC MIKE GOWANLOCK NORTHERN ARIZONA UNIVERSITY SCHOOL OF INFORMATICS, COMPUTING & CYBER SYSTEMS BEN KARSIN UNIVERSITY OF HAWAII AT MANOA DEPARTMENT
More informationData structures and libraries
Data structures and libraries Bjarki Ágúst Guðmundsson Tómas Ken Magnússon Árangursrík forritun og lausn verkefna School of Computer Science Reykjavík University Today we re going to cover Basic data types
More informationToday's Topics. CISC 458 Winter J.R. Cordy
Today's Topics Last Time Semantics - the meaning of program structures Stack model of expression evaluation, the Expression Stack (ES) Stack model of automatic storage, the Run Stack (RS) Today Managing
More informationTrees. (Trees) Data Structures and Programming Spring / 28
Trees (Trees) Data Structures and Programming Spring 2018 1 / 28 Trees A tree is a collection of nodes, which can be empty (recursive definition) If not empty, a tree consists of a distinguished node r
More informationCSE 591: GPU Programming. Using CUDA in Practice. Klaus Mueller. Computer Science Department Stony Brook University
CSE 591: GPU Programming Using CUDA in Practice Klaus Mueller Computer Science Department Stony Brook University Code examples from Shane Cook CUDA Programming Related to: score boarding load and store
More informationLecture 15 Binary Search Trees
Lecture 15 Binary Search Trees 15-122: Principles of Imperative Computation (Fall 2017) Frank Pfenning, André Platzer, Rob Simmons, Iliano Cervesato In this lecture, we will continue considering ways to
More informationAutomatically Optimizing Tree Traversal Algorithms
Purdue University Purdue e-pubs Open Access Dissertations Theses and Dissertations Fall 2013 Automatically Optimizing Tree Traversal Algorithms Youngjoon Jo Purdue University Follow this and additional
More informationCSE373: Data Structures & Algorithms Lecture 4: Dictionaries; Binary Search Trees. Kevin Quinn Fall 2015
CSE373: Data Structures & Algorithms Lecture 4: Dictionaries; Binary Search Trees Kevin Quinn Fall 2015 Where we are Studying the absolutely essential ADTs of computer science and classic data structures
More informationCS61BL Summer 2013 Midterm 2
CS61BL Summer 2013 Midterm 2 Sample Solutions + Common Mistakes Question 0: Each of the following cost you.5 on this problem: you earned some credit on a problem and did not put your five digit on the
More informationCS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 30: GP-GPU Programming. Lecturer: Alan Christopher
CS 61C: Great Ideas in Computer Architecture (Machine Structures) Lecture 30: GP-GPU Programming Lecturer: Alan Christopher Overview GP-GPU: What and why OpenCL, CUDA, and programming GPUs GPU Performance
More informationBinary Trees, Binary Search Trees
Binary Trees, Binary Search Trees Trees Linear access time of linked lists is prohibitive Does there exist any simple data structure for which the running time of most operations (search, insert, delete)
More informationIntroduction to Trees. D. Thiebaut CSC212 Fall 2014
Introduction to Trees D. Thiebaut CSC212 Fall 2014 A bit of History & Data Visualization: The Book of Trees. (Link) We Concentrate on Binary-Trees, Specifically, Binary-Search Trees (BST) How Will Java
More informationPriority Queues Heaps Heapsort
Priority Queues Heaps Heapsort After this lesson, you should be able to apply the binary heap insertion and deletion algorithms by hand implement the binary heap insertion and deletion algorithms explain
More informationCS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 7 More Memory Management CS 61C L07 More Memory Management (1) 2004-09-15 Lecturer PSOE Dan Garcia www.cs.berkeley.edu/~ddgarcia Star Wars
More informationSimty: Generalized SIMT execution on RISC-V
Simty: Generalized SIMT execution on RISC-V CARRV 2017 Sylvain Collange INRIA Rennes / IRISA sylvain.collange@inria.fr From CPU-GPU to heterogeneous multi-core Yesterday (2000-2010) Homogeneous multi-core
More informationCSCI-401 Examlet #5. Name: Class: Date: True/False Indicate whether the sentence or statement is true or false.
Name: Class: Date: CSCI-401 Examlet #5 True/False Indicate whether the sentence or statement is true or false. 1. The root node of the standard binary tree can be drawn anywhere in the tree diagram. 2.
More informationParallelizing Adaptive Triangular Grids with Refinement Trees and Space Filling Curves
Parallelizing Adaptive Triangular Grids with Refinement Trees and Space Filling Curves Daniel Butnaru butnaru@in.tum.de Advisor: Michael Bader bader@in.tum.de JASS 08 Computational Science and Engineering
More informationMulti Agent Navigation on GPU. Avi Bleiweiss
Multi Agent Navigation on GPU Avi Bleiweiss Reasoning Explicit Implicit Script, storytelling State machine, serial Compute intensive Fits SIMT architecture well Navigation planning Collision avoidance
More informationSome Search Structures. Balanced Search Trees. Binary Search Trees. A Binary Search Tree. Review Binary Search Trees
Some Search Structures Balanced Search Trees Lecture 8 CS Fall Sorted Arrays Advantages Search in O(log n) time (binary search) Disadvantages Need to know size in advance Insertion, deletion O(n) need
More informationCUDA PROGRAMMING MODEL Chaithanya Gadiyam Swapnil S Jadhav
CUDA PROGRAMMING MODEL Chaithanya Gadiyam Swapnil S Jadhav CMPE655 - Multiple Processor Systems Fall 2015 Rochester Institute of Technology Contents What is GPGPU? What s the need? CUDA-Capable GPU Architecture
More informationFINALTERM EXAMINATION Fall 2009 CS301- Data Structures Question No: 1 ( Marks: 1 ) - Please choose one The data of the problem is of 2GB and the hard
FINALTERM EXAMINATION Fall 2009 CS301- Data Structures Question No: 1 The data of the problem is of 2GB and the hard disk is of 1GB capacity, to solve this problem we should Use better data structures
More informationCS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 7 C Memory Management 2007-02-06 Hello to Said S. from Columbus, OH CS61C L07 More Memory Management (1) Lecturer SOE Dan Garcia www.cs.berkeley.edu/~ddgarcia
More informationCS61B Lecture #21. A Recursive Structure. Fundamental Operation: Traversal. Preorder Traversal and Prefix Expressions. (- (- (* x (+ y 3))) z)
CSB Lecture # A Recursive Structure Today: Trees Readings for Today: Data Structures, Chapter Readings for Next Topic: Data Structures, Chapter Trees naturally represent recursively defined, hierarchical
More informationCS61C : Machine Structures
inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture 7 C Memory Management!!Lecturer SOE Dan Garcia!!!www.cs.berkeley.edu/~ddgarcia CS61C L07 More Memory Management (1)! 2010-02-03! Flexible
More informationTrees, Part 1: Unbalanced Trees
Trees, Part 1: Unbalanced Trees The first part of this chapter takes a look at trees in general and unbalanced binary trees. The second part looks at various schemes to balance trees and/or make them more
More informationAlgorithms in Systems Engineering ISE 172. Lecture 16. Dr. Ted Ralphs
Algorithms in Systems Engineering ISE 172 Lecture 16 Dr. Ted Ralphs ISE 172 Lecture 16 1 References for Today s Lecture Required reading Sections 6.5-6.7 References CLRS Chapter 22 R. Sedgewick, Algorithms
More informationCompiler construction in4303 lecture 8
Compiler construction in4303 lecture 8 Interpretation Simple code generation Chapter 4 4.2.4 Overview intermediate code program text front-end interpretation code generation code selection register allocation
More informationUnderstanding Task Scheduling Algorithms. Kenjiro Taura
Understanding Task Scheduling Algorithms Kenjiro Taura 1 / 48 Contents 1 Introduction 2 Work stealing scheduler 3 Analyzing execution time of work stealing 4 Analyzing cache misses of work stealing 5 Summary
More informationUniversity of Illinois at Urbana-Champaign Department of Computer Science. Second Examination
University of Illinois at Urbana-Champaign Department of Computer Science Second Examination CS 225 Data Structures and Software Principles Fall 2011 9a-11a, Wednesday, November 2 Name: NetID: Lab Section
More information» Access elements of a container sequentially without exposing the underlying representation
Iterator Pattern Behavioural Intent» Access elements of a container sequentially without exposing the underlying representation Iterator-1 Motivation Be able to process all the elements in a container
More informationDDS Dynamic Search Trees
DDS Dynamic Search Trees 1 Data structures l A data structure models some abstract object. It implements a number of operations on this object, which usually can be classified into l creation and deletion
More informationTopic 14. The BinaryTree ADT
Topic 14 The BinaryTree ADT Objectives Define trees as data structures Define the terms associated with trees Discuss tree traversal algorithms Discuss a binary tree implementation Examine a binary tree
More information