STREAMS VS PARALLELSTREAMS BY VAIBHAV CHOUDHARY JAVA PLATFORMS TEAM, ORACLE



2 IN SHORT, WE WILL DISCUSS
- Concurrency and parallelism
- Stream support in Java 8
- From streams to parallel streams
- Implementation of parallel streams
- Performance considerations

3 PERFORMANCE DEFINITION CHANGE
The definition of performance has changed over time:
- Single-core time frame: non-blocking, prioritisation
- Multi-core time frame: focused on better throughput
- Many-core time frame: focused on better latency
Software should adapt dynamically to these requirements.

4 TREND IN JDK RELEASES
- JDK 1: threads, locks, condition queues
- JDK 5: concurrency package, blocking queues, latches, thread pools
- JDK 7: Fork-Join framework
- JDK 8: parallel streams

5 CONCURRENCY VS PARALLELISM
Concurrency is tough.
- It's about using shared resources efficiently.
- It's about dealing with thread safety, locks, semaphores, race conditions. IT'S TOUGH.
Parallelism is relatively less complex.
- It's about breaking a task into sub-tasks to get results faster.
- Faster, and only faster (with more resources).

6 PARALLELISM
- It is all about optimisation. Faster results, that's it!
- Analyse -> Implement -> Test -> Repeat
- If the result is no better, move back to sequential work.
- Why might it take more time than the sequential implementation?

7 PARALLELISM OVERHEAD
A parallel computation always involves more work than the equivalent sequential computation, and it always has a slow start-up.
- Decompose the problem into tasks
- Launch tasks, manage tasks, wait for completion
- Solve the problem
- Combine results

8 PARALLELISM: WHEN
To succeed, you need ALL of these:
- A problem that can be broken into parallel parts.
- A niche implementation.
- Good runtime support.
- Enough data to benefit from it.

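9 UNDERSTANDING DATA FLOW DEPENDENCY
Consider a problem and look at its data dependencies.
    G(N) = F(G(N-1)), if N > 0
    G(0) = F(0)
- F(F(0)) needs F(0) first: each step depends on the result of the previous one.
- Dependency chain: G(3) -> G(2) -> G(1) -> G(0)
- This is an iterative problem, so it is a bad idea to think of parallelism here.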

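10 CONSIDER SIMILAR PROBLEMS
    G(N) = F(N) + G(N-1), if N > 0
    G(0) = F(0)
- Evaluated as written, it is still a chain: G(2) needs F(2) and G(1); G(1) needs F(1) and G(0); G(0) is F(0).
- Unrolled, G(2) needs only F(2), F(1) and F(0), which do not depend on each other.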
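
The second recurrence can be unrolled into a sum of independent terms, which is what makes it a candidate for parallel evaluation. A minimal sketch of that idea follows; the function F(i) = i * i and the class name are hypothetical, chosen only for illustration, and are not from the slides.

```java
import java.util.stream.LongStream;

class IndependentTerms {
    // Hypothetical F, chosen only for illustration; any pure function of i would do.
    static long f(long i) {
        return i * i;
    }

    // G(N) = F(N) + G(N-1), G(0) = F(0), evaluated as a plain loop.
    static long gSequential(int n) {
        long g = f(0);
        for (int i = 1; i <= n; i++) {
            g = f(i) + g;
        }
        return g;
    }

    // The same value as a parallel reduction: every F(i) is independent and
    // + is associative, so the terms can be summed in any grouping.
    static long gParallel(int n) {
        return LongStream.rangeClosed(0, n)
                .parallel()
                .map(IndependentTerms::f)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(gSequential(1000) + " " + gParallel(1000));
    }
}
```

The first recurrence, G(N) = F(G(N-1)), has no such rewrite: each call to F needs the previous result as its input.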

11 REACHING PARALLEL COMPUTATION
Example: summing numbers. Let's talk through the various possible approaches.
    int sum(int[] arr) {
        int sum = 0;
        for (int i : arr)
            sum = sum + i;
        return sum;
    }
Bad! Unlearn the accumulator pattern! Bad!

12 Let's try to solve this problem with concurrency.
Sequential version:
    int sum(int[] arr) {
        int sum = 0;
        for (int i : arr)
            sum = sum + i;
        return sum;
    }
Concurrent version (pseudocode), one block per half of the array:
    int sum(int[] arr) {
        int sum = 0;
        int mid = arr.length / 2;
        concurrent {
            { for (int i = 0; i < mid; i++) sum = sum + arr[i]; }
            { for (int i = mid; i < arr.length; i++) sum = sum + arr[i]; }
        }
        return sum;
    }
Both blocks update the shared variable sum, so each update has to be made atomic.

13
    int sum(int[] arr) {
        int sum = 0;
        int mid = arr.length / 2;
        concurrent {
            { for (int i = 0; i < mid; i++) atomic { sum = sum + arr[i]; } }
            { for (int i = mid; i < arr.length; i++) atomic { sum = sum + arr[i]; } }
        }
        return sum;
    }
-> Performance: it will just suck!
-> What's wrong: too much time is spent safeguarding the data.
-> Best practices: don't share, don't mutate, coordinate access.
-> Let's try "don't share".

14 DON'T SHARE
    int sum(int[] arr) {
        int left = 0, right = 0;
        int mid = arr.length / 2;
        concurrent {
            { for (int i = 0; i < mid; i++) atomic { left = left + arr[i]; } }
            { for (int i = mid; i < arr.length; i++) atomic { right = right + arr[i]; } }
        }
        return left + right;
    }
-> Looks better.
-> But it is not a generic solution.
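
The "don't share" idea can be written in plain Java with two threads. This is only a sketch under assumptions: the class name and the two-slot result array are illustrative choices, not part of the slides. Each thread accumulates into its own local variable and publishes it once, so nothing is contended inside the loops.

```java
class DontShareSum {
    static long sum(int[] arr) {
        int mid = arr.length / 2;
        long[] partial = new long[2]; // one slot per thread, written once at the end

        Thread left = new Thread(() -> {
            long local = 0; // thread-confined accumulator
            for (int i = 0; i < mid; i++) local += arr[i];
            partial[0] = local;
        });
        Thread right = new Thread(() -> {
            long local = 0;
            for (int i = mid; i < arr.length; i++) local += arr[i];
            partial[1] = local;
        });

        left.start();
        right.start();
        try {
            // join() guarantees the writes made by each thread are visible here.
            left.join();
            right.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while summing", e);
        }
        return partial[0] + partial[1];
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {1, 2, 3, 4, 5}));
    }
}
```

As the slide says, this is not generic: the split into exactly two halves is hard-coded, which is the gap the recursive decomposition on the next slide addresses.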

15 TOWARDS PARALLELISM
    // PSEUDOCODE
    Result solve(Problem problem) {
        if (problem.size < SEQUENTIAL_THRESHOLD)
            return solveSequentially(problem);
        else {
            Result left, right;
            INVOKE-IN-PARALLEL {
                left = solve(extractLeftHalf(problem));
                right = solve(extractRightHalf(problem));
            }
            return combine(left, right);
        }
    }
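
One way to turn this pseudocode into Java is a `RecursiveTask` from `java.util.concurrent`, applied here to the running array-sum example. This is a sketch under assumptions: the class name, the threshold value of 10,000 and the use of the common pool are illustrative choices, not taken from the slides.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class SumTask extends RecursiveTask<Long> {
    // Hypothetical cut-off; below it the range is summed sequentially.
    static final int SEQUENTIAL_THRESHOLD = 10_000;

    private final int[] arr;
    private final int lo; // inclusive
    private final int hi; // exclusive

    SumTask(int[] arr, int lo, int hi) {
        this.arr = arr;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo < SEQUENTIAL_THRESHOLD) {
            long sum = 0; // solveSequentially
            for (int i = lo; i < hi; i++) sum += arr[i];
            return sum;
        }
        int mid = lo + (hi - lo) / 2;
        SumTask left = new SumTask(arr, lo, mid);
        SumTask right = new SumTask(arr, mid, hi);
        left.fork();                     // run the left half asynchronously
        long rightSum = right.compute(); // solve the right half in this thread
        return left.join() + rightSum;   // combine
    }

    static long parallelSum(int[] arr) {
        return ForkJoinPool.commonPool().invoke(new SumTask(arr, 0, arr.length));
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        java.util.Arrays.fill(data, 1);
        System.out.println(parallelSum(data));
    }
}
```

Forking one half and computing the other in the current thread avoids parking a worker that would otherwise only wait for its two children.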

16 FORK-JOIN FRAMEWORK
[Figure: a task is forked recursively into sub-tasks, and their results are joined back up the tree.]

17
    public class MaxWithFJ extends RecursiveAction {
        private final int threshold;
        private final SelectMaxProblem problem;
        public int result;

        public MaxWithFJ(SelectMaxProblem problem, int threshold) {
            this.problem = problem;
            this.threshold = threshold;
        }

        protected void compute() {
            if (problem.size < threshold)
                result = problem.solveSequentially();
            else {
                int midpoint = problem.size / 2;
                MaxWithFJ left = new MaxWithFJ(problem.subproblem(0, mi…
                MaxWithFJ right = new MaxWithFJ(problem.subproblem(mid… 1, problem.size), threshold);
                invokeAll(left, right);   // fork both halves and wait for both to finish
                result = Math.max(left.result, right.result);
            }
        }

        public static void main(String[] args) {
            SelectMaxProblem problem = ...
            int threshold = ...
            int nThreads = ...
            MaxWithFJ mfj = new MaxWithFJ(problem, threshold);
            ForkJoinPool fjPool = new ForkJoinPool(nThreads);
            fjPool.invoke(mfj);
            int result = mfj.result;
        }
    }

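18 RESULTS AND SELECTION
[Table: results for each thread count (four rows) against threshold (columns 500K, 50K, 5K); the cell values and thread counts did not survive in the transcript.]
Run for a size of 500K.
Legend: better, but underperforming.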

19 PERFORMANCE CONSIDERATIONS
- Splitting / decomposing the problem: sometimes splitting is more expensive.
- Task management cost: needs a close look.
- Result combination cost.
- Locality issues: cache misses.
EACH OF THESE CAN BE A PERFORMANCE KILLER.

20 STREAMS IN JAVA 8
- The complexity of Fork-Join is simplified in streams.
- Simple pipelines of operations like reduce, sort, map, filter.
- Exploits lazy evaluation.
- No magical parallelism. We have to own it.
- We have to apply our knowledge to get the right result.
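
Lazy evaluation means that building a pipeline runs nothing; elements are pulled through only when a terminal operation is invoked. A small sketch, with a hypothetical class name and a log list used purely to make the order of events visible:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class LazyPipeline {
    // Records what happens, and when, for a small sequential pipeline.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Stream<Integer> pipeline = Stream.of(1, 2, 3, 4)
                .filter(n -> { log.add("filter " + n); return n % 2 == 0; })
                .map(n -> { log.add("map " + n); return n * 10; });
        log.add("built"); // no filter or map call has run at this point
        pipeline.forEach(n -> log.add("got " + n)); // the terminal op drives the pipeline
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

The log is an ordinary `ArrayList`, which is safe here only because the stream is sequential; the same side effects in a parallel stream would need a thread-safe collection.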

21 THE NQ MODEL
A simple model for parallel performance:
- N: number of elements
- Q: amount of work per element
- N x Q > 10,000 -> there is a chance parallelism will pay off.
- Important for judging the minimum amount of data.
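
The NQ rule of thumb can be written down as a one-line check. The sketch below is only an illustration: the slide does not define the unit of Q, so the parameter here is an assumed, caller-supplied estimate of work per element, and the class and method names are hypothetical.

```java
class NqModel {
    // Threshold from the slide: N x Q > 10,000.
    static final long NQ_THRESHOLD = 10_000;

    // n: number of elements.
    // q: estimated work per element (assumption: a rough, caller-supplied cost estimate).
    static boolean worthTryingParallel(long n, long q) {
        return n * q > NQ_THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(worthTryingParallel(1_000, 1));     // small N, cheap Q
        System.out.println(worthTryingParallel(1_000, 100));   // small N, expensive Q
        System.out.println(worthTryingParallel(1_000_000, 1)); // large N, cheap Q
    }
}
```

Passing the check only says that parallelism has a chance; whether it actually wins still has to be measured.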

22 SOURCE SPLITTING
Source splitting is very important. Some sources split awesomely, some suck.
- Cost of computing the split.
- Evenness of the split.
- Predictability of the split.
Arrays rock, LinkedList sucks, trees go OK-OK.
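
Splitting behaviour can be observed directly through `Spliterator.trySplit()`, which is what parallel streams use to divide a source. The sketch below (class and helper names are hypothetical) splits an `ArrayList` and a `LinkedList` once and reports the sizes of the two parts. The exact sizes are an implementation detail and may differ between JDK versions, so no output is shown.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedList;
import java.util.List;
import java.util.Spliterator;

class SplitDemo {
    // Returns {size of the split-off part, size left behind} after one trySplit().
    static long[] splitOnce(Collection<Integer> source) {
        Spliterator<Integer> rest = source.spliterator();
        Spliterator<Integer> prefix = rest.trySplit(); // may be null if the source cannot split
        long prefixSize = (prefix == null) ? 0 : prefix.estimateSize();
        return new long[] { prefixSize, rest.estimateSize() };
    }

    static <L extends List<Integer>> L fill(L list, int n) {
        for (int i = 0; i < n; i++) list.add(i);
        return list;
    }

    public static void main(String[] args) {
        long[] a = splitOnce(fill(new ArrayList<>(), 100_000));
        long[] l = splitOnce(fill(new LinkedList<>(), 100_000));
        System.out.println("ArrayList : " + a[0] + " / " + a[1]);
        System.out.println("LinkedList: " + l[0] + " / " + l[1]);
    }
}
```

An even split means both halves can be handed to different workers with similar amounts of work; a lopsided split leaves most of the data in one task.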

23 LOCALITY PLAYS A ROLE
- Poor locality is dangerous.
- Parallelism is good if the CPU is busy doing good work. A cache miss is not good work.
- Summing a stream over an int[] vs summing a stream over an Integer[]:

    Speed-up    N=1K     N=10K    N=1M
    int         1x       6.2x     7.9x
    Integer     -4.9x    1.5x     3.5x
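
The two variants being compared can be written as follows. This is a sketch of the shape of the code only, with hypothetical class and method names; it does not attempt to reproduce the speed-up figures, which depend on hardware and JVM. An `int[]` stores its values contiguously, while an `Integer[]` stores references to separately allocated objects, so the boxed version has to follow a pointer for every element.

```java
import java.util.Arrays;

class BoxedVsPrimitive {
    // Primitive array: values sit next to each other in memory.
    static long sumPrimitive(int[] data) {
        return Arrays.stream(data).parallel().asLongStream().sum();
    }

    // Boxed array: each element is a reference that must be dereferenced.
    static long sumBoxed(Integer[] data) {
        return Arrays.stream(data).parallel().mapToLong(Integer::longValue).sum();
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        int[] primitive = new int[n];
        Integer[] boxed = new Integer[n];
        for (int i = 0; i < n; i++) {
            primitive[i] = i;
            boxed[i] = i; // autoboxing allocates or reuses an Integer object
        }
        System.out.println(sumPrimitive(primitive) + " " + sumBoxed(boxed));
    }
}
```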

24 LET'S REPEAT
When to consider parallelism:
- NQ is high.
- Cache misses are low.
- The source is good for splitting.
- The combining cost is not high.
- The pipeline uses order-insensitive operations.

25 LET'S PRACTICE

26 JUST GET A BIT OF TECHNICALITY
    collections.stream()    // source
        .filter()           // intermediate op
        .map()              // intermediate op
        .collect()          // terminal op

    collections.stream()    // source
        .filter()           // intermediate op
        .parallel()         // parallel stream
        .map()              // intermediate op
        .sequential()       // sequential stream
        .collect()          // terminal op
(The last call wins, either parallel or sequential.)
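
The "last wins" rule can be checked with `isParallel()`, which reports the mode the whole pipeline will run in. A minimal sketch, with hypothetical class and method names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class LastWins {
    // parallel() followed later by sequential(): the pipeline ends up sequential.
    static boolean parallelThenSequential(List<Integer> source) {
        Stream<Integer> s = source.stream()
                .filter(n -> n > 0)
                .parallel()
                .map(n -> n * 2)
                .sequential();
        return s.isParallel();
    }

    // sequential() followed later by parallel(): the pipeline ends up parallel.
    static boolean sequentialThenParallel(List<Integer> source) {
        Stream<Integer> s = source.stream()
                .filter(n -> n > 0)
                .sequential()
                .map(n -> n * 2)
                .parallel();
        return s.isParallel();
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3);
        System.out.println(parallelThenSequential(data));
        System.out.println(sequentialThenParallel(data));
    }
}
```

The mode is a property of the whole pipeline, not of individual stages, so it is not possible to run `filter` in parallel and `map` sequentially within one stream.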

27 FIRST EXAMPLE
    import java.util.List;
    import java.util.stream.IntStream;
    import static java.util.stream.Collectors.toList;

    public class ParallelExample1 {
        public static void main(String[] args) {
            List<String> output = IntStream.range(0, 50)
                    .parallel()
                    .filter(i -> i % 5 == 0)
                    .mapToObj(i -> String.valueOf(i / 5))
                    .collect(toList());
            System.out.println(output);
        }
    }
WITHOUT PARALLEL: OUTPUT: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
WITH PARALLEL:    OUTPUT: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

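28 TIME ORDER VS SPATIAL ORDER
.parallel() changes the order in which elements are processed in time, not the spatial order of the result. So in example 1 the time order changes, not the spatial order.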
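
The difference between the two orders shows up when an order-preserving terminal operation is compared with `forEach`. In this sketch (the class name is hypothetical), `collect(toList())` and `forEachOrdered` respect the source order even in parallel, while `forEach` reports elements in whatever order the worker threads reach them, so its output can change from run to run.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class OrderDemo {
    // Processed in parallel, but collected in the order of the source.
    static List<Integer> collected() {
        return IntStream.range(0, 50)
                .parallel()
                .filter(i -> i % 5 == 0)
                .map(i -> i / 5)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(collected());

        // Time order: printed as each worker gets to its element.
        IntStream.range(0, 10).parallel().forEach(i -> System.out.print(i + " "));
        System.out.println();

        // Spatial order: still parallel, but results are emitted in source order.
        IntStream.range(0, 10).parallel().forEachOrdered(i -> System.out.print(i + " "));
        System.out.println();
    }
}
```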

29 EXAMPLE 2 - GOOD USAGES
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Random;
    import java.util.Set;
    import java.util.concurrent.ConcurrentSkipListSet;
    import java.util.stream.Collectors;

    public class SortingParallelExample {
        public static void main(String[] args) {
            List<Integer> list = new ArrayList<>();
            Set<String> nThread = new ConcurrentSkipListSet<>();
            Random rn = new Random();
            for (int i = 0; i < 10_00_00_00; i++) {
                list.add((int) rn.nextInt() * …);
            }
            for (int i = 0; i < 10; i++) {
                long start = System.currentTimeMillis();
                List<Integer> even = list.parallelStream()
                        .filter(n -> n % 2 == 0)
                        .peek(n -> nThread.add(Thread.currentThread().getName()))
                        .sorted()
                        .collect(Collectors.toList());
                long end = System.currentTimeMillis();
                System.out.println(end - start + " " + "Active threads " + Thread.activeCount());
            }
            System.out.println("Thread Names are " + nThread);
        }
    }


More information

OpenMP Introduction. CS 590: High Performance Computing. OpenMP. A standard for shared-memory parallel programming. MP = multiprocessing

OpenMP Introduction. CS 590: High Performance Computing. OpenMP. A standard for shared-memory parallel programming. MP = multiprocessing CS 590: High Performance Computing OpenMP Introduction Fengguang Song Department of Computer Science IUPUI OpenMP A standard for shared-memory parallel programming. MP = multiprocessing Designed for systems

More information

Introduction to Concurrency Principles of Concurrent System Design

Introduction to Concurrency Principles of Concurrent System Design Introduction to Concurrency 4010-441 Principles of Concurrent System Design Texts Logistics (On mycourses) Java Concurrency in Practice, Brian Goetz, et. al. Programming Concurrency on the JVM, Venkat

More information

INF 212 ANALYSIS OF PROG. LANGS CONCURRENCY. Instructors: Crista Lopes Copyright Instructors.

INF 212 ANALYSIS OF PROG. LANGS CONCURRENCY. Instructors: Crista Lopes Copyright Instructors. INF 212 ANALYSIS OF PROG. LANGS CONCURRENCY Instructors: Crista Lopes Copyright Instructors. Basics Concurrent Programming More than one thing at a time Examples: Network server handling hundreds of clients

More information

CS 241 Honors Concurrent Data Structures

CS 241 Honors Concurrent Data Structures CS 241 Honors Concurrent Data Structures Bhuvan Venkatesh University of Illinois Urbana Champaign March 27, 2018 CS 241 Course Staff (UIUC) Lock Free Data Structures March 27, 2018 1 / 43 What to go over

More information

Introducing Multi-core Computing / Hyperthreading

Introducing Multi-core Computing / Hyperthreading Introducing Multi-core Computing / Hyperthreading Clock Frequency with Time 3/9/2017 2 Why multi-core/hyperthreading? Difficult to make single-core clock frequencies even higher Deeply pipelined circuits:

More information

COMP 122/L Lecture 1. Kyle Dewey

COMP 122/L Lecture 1. Kyle Dewey COMP 122/L Lecture 1 Kyle Dewey About Me I research automated testing techniques and their intersection with CS education This is my first semester at CSUN Third time teaching this content About this Class

More information

Threads and Too Much Milk! CS439: Principles of Computer Systems February 6, 2019

Threads and Too Much Milk! CS439: Principles of Computer Systems February 6, 2019 Threads and Too Much Milk! CS439: Principles of Computer Systems February 6, 2019 Bringing It Together OS has three hats: What are they? Processes help with one? two? three? of those hats OS protects itself

More information

Application parallelization for multi-core Android devices

Application parallelization for multi-core Android devices SOFTWARE & SYSTEMS DESIGN Application parallelization for multi-core Android devices Jos van Eijndhoven Vector Fabrics BV The Netherlands http://www.vectorfabrics.com MULTI-CORE PROCESSORS: HERE TO STAY

More information

Last Class: Synchronization

Last Class: Synchronization Last Class: Synchronization Synchronization primitives are required to ensure that only one thread executes in a critical section at a time. Concurrent programs Low-level atomic operations (hardware) load/store

More information

Midterm 1 Review Document

Midterm 1 Review Document Midterm 1 Review Document CS61B Fall 2016 Antares Chen Introduction This document is meant to provide you supplementary practice questions for the upcoming midterm. It reflects all material that you will

More information

Multithreading and Interactive Programs

Multithreading and Interactive Programs Multithreading and Interactive Programs CS160: User Interfaces John Canny. Last time Model-View-Controller Break up a component into Model of the data supporting the App View determining the look of the

More information

NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 31 October 2012

NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY. Tim Harris, 31 October 2012 NON-BLOCKING DATA STRUCTURES AND TRANSACTIONAL MEMORY Tim Harris, 31 October 2012 Lecture 6 Linearizability Lock-free progress properties Queues Reducing contention Explicit memory management Linearizability

More information

Complexity, General. Standard approach: count the number of primitive operations executed.

Complexity, General. Standard approach: count the number of primitive operations executed. Complexity, General Allmänt Find a function T(n), which behaves as the time it takes to execute the program for input of size n. Standard approach: count the number of primitive operations executed. Standard

More information

This exam is open book. Each question is worth 3 points.

This exam is open book. Each question is worth 3 points. This exam is open book. Each question is worth 3 points. Page 1 / 15 Page 2 / 15 Page 3 / 12 Page 4 / 18 Page 5 / 15 Page 6 / 9 Page 7 / 12 Page 8 / 6 Total / 100 (maximum is 102) 1. Are you in CS101 or

More information

Arrays. https://docs.oracle.com/javase/tutorial/java/nutsandbolts/arrays.html

Arrays. https://docs.oracle.com/javase/tutorial/java/nutsandbolts/arrays.html 1 Arrays Arrays in Java an array is a container object that holds a fixed number of values of a single type the length of an array is established when the array is created 2 https://docs.oracle.com/javase/tutorial/java/nutsandbolts/arrays.html

More information

Consistency: Relaxed. SWE 622, Spring 2017 Distributed Software Engineering

Consistency: Relaxed. SWE 622, Spring 2017 Distributed Software Engineering Consistency: Relaxed SWE 622, Spring 2017 Distributed Software Engineering Review: HW2 What did we do? Cache->Redis Locks->Lock Server Post-mortem feedback: http://b.socrative.com/ click on student login,

More information

Control Hazards. Branch Prediction

Control Hazards. Branch Prediction Control Hazards The nub of the problem: In what pipeline stage does the processor fetch the next instruction? If that instruction is a conditional branch, when does the processor know whether the conditional

More information

Parallel Programming: Background Information

Parallel Programming: Background Information 1 Parallel Programming: Background Information Mike Bailey mjb@cs.oregonstate.edu parallel.background.pptx Three Reasons to Study Parallel Programming 2 1. Increase performance: do more work in the same

More information

Parallel Programming: Background Information

Parallel Programming: Background Information 1 Parallel Programming: Background Information Mike Bailey mjb@cs.oregonstate.edu parallel.background.pptx Three Reasons to Study Parallel Programming 2 1. Increase performance: do more work in the same

More information

Running Time. Analytic Engine. Charles Babbage (1864) how many times do you have to turn the crank?

Running Time. Analytic Engine. Charles Babbage (1864) how many times do you have to turn the crank? 4.1 Performance Introduction to Programming in Java: An Interdisciplinary Approach Robert Sedgewick and Kevin Wayne Copyright 2002 2010 3/30/11 8:32 PM Running Time As soon as an Analytic Engine exists,

More information

Notes - Recursion. A geeky definition of recursion is as follows: Recursion see Recursion.

Notes - Recursion. A geeky definition of recursion is as follows: Recursion see Recursion. Notes - Recursion So far we have only learned how to solve problems iteratively using loops. We will now learn how to solve problems recursively by having a method call itself. A geeky definition of recursion

More information

Understanding Hardware Transactional Memory

Understanding Hardware Transactional Memory Understanding Hardware Transactional Memory Gil Tene, CTO & co-founder, Azul Systems @giltene 2015 Azul Systems, Inc. Agenda Brief introduction What is Hardware Transactional Memory (HTM)? Cache coherence

More information

Problem 1. (15 points):

Problem 1. (15 points): CMU 15-418/618: Parallel Computer Architecture and Programming Practice Exercise 1 A Task Queue on a Multi-Core, Multi-Threaded CPU Problem 1. (15 points): The figure below shows a simple single-core CPU

More information

Functional Programming in Java. CSE 219 Department of Computer Science, Stony Brook University

Functional Programming in Java. CSE 219 Department of Computer Science, Stony Brook University Functional Programming in Java CSE 219, Stony Brook University What is functional programming? There is no single precise definition of functional programming (FP) We should think of it as a programming

More information