Chapter 24 File Output on a Cluster


Part I. Preliminaries
Part II. Tightly Coupled Multicore
Part III. Loosely Coupled Cluster
   Chapter 18. Massively Parallel
   Chapter 19. Hybrid Parallel
   Chapter 20. Tuple Space
   Chapter 21. Cluster Parallel Loops
   Chapter 22. Cluster Parallel Reduction
   Chapter 23. Cluster Load Balancing
   Chapter 24. File Output on a Cluster
   Chapter 25. Interacting Tasks
   Chapter 26. Cluster Heuristic Search
   Chapter 27. Cluster Work Queues
   Chapter 28. On-Demand Tasks
Part IV. GPU Acceleration
Part V. Big Data
Appendices

Recall the single-node multicore parallel program from Chapter 11 that computes an image of the Mandelbrot Set. The program partitioned the image rows among the threads of a parallel thread team. Each thread computed the color of each pixel in a certain row, put the row of pixels into a ColorImageQueue, and went on to the next available row. Simultaneously, another thread took each row of pixels out of the queue and wrote them to a PNG file.

Now let's make this a cluster parallel program. The program will illustrate several features of the Parallel Java 2 Library, namely customized tuple subclasses and file output in a job. Studying the program's strong scaling performance will reveal interesting behavior that we didn't see with single-node multicore parallel programs.

Figure 24.1 shows the cluster parallel Mandelbrot Set program's design. Like the previous multicore version, the program partitions the image rows among multiple parallel team threads. Unlike the previous version, the parallel team threads are located in multiple separate worker tasks. Each worker task runs in a separate backend process on one of the backend nodes in the cluster. I'll use a master-worker parallel for loop to partition the image rows among the tasks and threads.

Also like the previous version, the program has an I/O thread responsible for writing the output PNG file, as well as a ColorImageQueue from which the I/O thread obtains rows of pixels. The I/O thread and the queue reside in an output task, separate from the worker tasks and shared by all of them. I'll run the output task in the job's process on the frontend node, rather than in a backend process on a backend node. That way, the I/O thread runs in the user's account and is able to write the PNG file in the user's directory. (If the output task ran in a backend process, it would typically run in a special Parallel Java account rather than the user's account, and it would typically not be able to write files in the user's directory.)

This is a distributed memory program. The worker task team threads compute pixel rows, which are located in the backend processes' memories. The output task's image queue is located in the frontend process's memory.
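The heart of this design is just two rules in the job main program. Abridged from Listing 24.1, the first rule partitions the row loop among the worker tasks running in backend processes; the second rule declares the output task and, crucially, marks it with runInJobProcess() so that it runs on the frontend node in the user's account:

   // In the job main program (see Listing 24.1 for the full context).
   masterSchedule (proportional);
   masterChunk (10);
   masterFor (0, height - 1, WorkerTask.class) .args (args);

   // Output task runs in the job's own process on the frontend node.
   rule() .task (OutputTask.class) .args (args) .runInJobProcess();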

Figure 24.1. Cluster parallel Mandelbrot Set program

Thus, it's not possible for a team thread to put a pixel row directly into the image queue, as the multicore parallel program could. This is where tuple space comes to the rescue. When a team thread has computed a pixel row, the team thread packages the pixel row into an output tuple and puts the tuple into tuple space. The tuple also includes the row index. A second gather thread in the output task repeatedly takes an output tuple, extracts the row index and the pixel row, and puts the pixel row into the image queue at the proper index. The I/O thread then removes pixel rows from the image queue and writes them to the PNG file. In this way, the computation's results are communicated from the worker tasks through tuple space to the output task. However, going through tuple space imposes a communication overhead that the multicore parallel program did not have. This communication overhead affects the cluster parallel program's scalability.

Listing 24.1 gives the source code for class edu.rit.pj2.example.MandelbrotClu. Like all cluster parallel programs, it begins with the job main program that defines the tasks. The masterFor() method (line 40) sets up a task group with K worker tasks, where K is specified by the workers option on the pj2 command. The masterFor() method also sets up a master-worker parallel for loop that partitions the outer loop over the image rows, from 0 through height − 1, among the worker tasks. Because the running time is different in every loop iteration, the parallel for loop needs a load balancing schedule; I specified a proportional schedule with a chunk factor of 10 (lines 38–39). This partitions the outer loop iterations into 10 times as many chunks as there are worker tasks, and each task will repeatedly execute the next available chunk in a dynamic fashion. The job main program also sets up the output task that will write the PNG file (lines 43–44), with the output task running in the job's process. Both the worker tasks and the output task are specified by start rules and will commence execution at the start of the job.

Next comes the OutputTuple subclass (line 71). It conveys a row of pixel colors, a ColorArray (line 74), along with the row index (line 73), from a worker task to the output task. The tuple subclass also provides the obligatory no-argument constructor (lines 76–78), writeOut() method (lines 88–92), and readIn() method (lines 94–98).

The WorkerTask class (line 102) is virtually identical to the single-node multicore MandelbrotSmp class from Chapter 11. There are only two differences. First, the worker task provides the worker portion of the master-worker parallel for loop (line 151). When the worker task obtains a chunk of row indexes from the master, the indexes are partitioned among the parallel team threads using a dynamic schedule for load balancing. Each loop iteration (line 160) computes the pixel colors for one row of the image, storing the colors in a per-thread color array (line 153). The second difference is that

package edu.rit.pj2.example;
import edu.rit.image.Color;
import edu.rit.image.ColorArray;
import edu.rit.image.ColorImageQueue;
import edu.rit.image.ColorPngWriter;
import edu.rit.io.InStream;
import edu.rit.io.OutStream;
import edu.rit.pj2.Job;
import edu.rit.pj2.Loop;
import edu.rit.pj2.Schedule;
import edu.rit.pj2.Section;
import edu.rit.pj2.Task;
import edu.rit.pj2.Tuple;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class MandelbrotClu
   extends Job
   {
   // Job main program.
   public void main
      (String[] args)
      {
      // Parse command line arguments.
      if (args.length != 8) usage();
      int width = Integer.parseInt (args[0]);
      int height = Integer.parseInt (args[1]);
      double xcenter = Double.parseDouble (args[2]);
      double ycenter = Double.parseDouble (args[3]);
      double resolution = Double.parseDouble (args[4]);
      int maxiter = Integer.parseInt (args[5]);
      double gamma = Double.parseDouble (args[6]);
      File filename = new File (args[7]);

      // Set up task group with K worker tasks. Partition rows among
      // workers.
      masterSchedule (proportional);
      masterChunk (10);
      masterFor (0, height - 1, WorkerTask.class) .args (args);

      // Set up PNG file writing task.
      rule() .task (OutputTask.class) .args (args) .runInJobProcess();
      }

   // Print a usage message and exit.
   private static void usage()
      {
      System.err.println ("Usage: java pj2 [workers=<K>] " +
         "edu.rit.pj2.example.MandelbrotClu <width> <height> " +
         "<xcenter> <ycenter> <resolution> <maxiter> <gamma> " +
         "<filename>");
      System.err.println ("<K> = Number of worker tasks (default " +
         "1)");
      System.err.println ("<width> = Image width (pixels)");
      System.err.println ("<height> = Image height (pixels)");
      System.err.println ("<xcenter> = X coordinate of center " +

Listing 24.1. MandelbrotClu.java (part 1)

once all columns in the row have been computed, the worker task packages the row index and the color array into an output tuple and puts the tuple into tuple space (line 189), whence the output task can take the tuple.

Last comes the OutputTask class (line 196), which runs in the job's process. After setting up the PNG file writer and the color image queue, the task runs two parallel sections simultaneously in two threads (line 227). The first section repeatedly takes an output tuple out of tuple space and puts the tuple's pixel data into the image queue at the row index indicated in the tuple. The takeTuple() method is given a blank output tuple as the template; this matches any output tuple containing any pixel row, no matter which worker task put the tuple. The tuple's row index ensures that the pixel data goes into the proper image row, regardless of the order in which the tuples arrive. The first section takes exactly as many tuples as there are image rows (the height argument). The second section merely uses the PNG image writer to write the PNG file.

Each worker task terminates when there are no more chunks of pixel rows to calculate. The output task terminates when the first parallel section has taken and processed all the output tuples and the second parallel section has finished writing the PNG file. At that point the job itself terminates.

I ran the Mandelbrot Set program on the tardis cluster to study the program's strong scaling performance. I computed images of five different sizes. For partitioning at the master level, the program is hard-coded to use a proportional schedule with a chunk factor of 10. For partitioning at the worker level, the program is hard-coded to use a dynamic schedule. To measure the sequential version, I ran the MandelbrotSeq program from Chapter 11 on one node using commands like this:

$ java pj2 debug=makespan edu.rit.pj2.example.MandelbrotSeq \
    <width> <height> <xcenter> <ycenter> <resolution> <maxiter> <gamma> \
    ms3200.png

To measure the parallel version on one core, I ran the MandelbrotClu program with one worker task and one thread using commands like this:

$ java pj2 debug=makespan workers=1 cores=1 \
    edu.rit.pj2.example.MandelbrotClu \
    <width> <height> <xcenter> <ycenter> <resolution> <maxiter> <gamma> \
    ms3200.png

To measure the parallel version on multiple cores, I ran the MandelbrotClu program with one to ten worker tasks and with all cores on each node (12 to 120 cores) using commands like this:

$ java pj2 debug=makespan workers=2 \
    edu.rit.pj2.example.MandelbrotClu \
    <width> <height> <xcenter> <ycenter> <resolution> <maxiter> <gamma> \
    ms3200.png
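A note on the metrics plotted in Figure 24.2: they follow the usual strong scaling definitions, with speedup measured relative to the sequential MandelbrotSeq running time. The tiny helper class below is not part of the MandelbrotClu program; it is just a sketch of the arithmetic, assuming makespan times measured as above.

   // Strong scaling metrics computed from measured makespan times.
   // Not part of MandelbrotClu; tSeq and tParK are hypothetical inputs.
   public class ScalingMetrics
      {
      // Speedup on K cores = sequential running time / parallel running time.
      public static double speedup (double tSeq, double tParK)
         {
         return tSeq / tParK;
         }

      // Efficiency on K cores = speedup / K.
      public static double efficiency (double tSeq, double tParK, int K)
         {
         return speedup (tSeq, tParK) / K;
         }
      }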

         "point");
      System.err.println ("<ycenter> = Y coordinate of center " +
         "point");
      System.err.println ("<resolution> = Pixels per unit");
      System.err.println ("<maxiter> = Maximum number of " +
         "iterations");
      System.err.println ("<gamma> = Used to calculate pixel hues");
      System.err.println ("<filename> = PNG image file name");
      terminate (1);
      }

   // Tuple for sending results from worker tasks to output task.
   private static class OutputTuple
      extends Tuple
      {
      public int row;               // Row index
      public ColorArray pixelData;  // Row's pixel data

      public OutputTuple()
         {
         }

      public OutputTuple
         (int row,
          ColorArray pixelData)
         {
         this.row = row;
         this.pixelData = pixelData;
         }

      public void writeOut
         (OutStream out)
         throws IOException
         {
         out.writeUnsignedInt (row);
         out.writeObject (pixelData);
         }

      public void readIn
         (InStream in)
         throws IOException
         {
         row = in.readUnsignedInt();
         pixelData = (ColorArray) in.readObject();
         }
      }

   // Worker task class.
   private static class WorkerTask
      extends Task
      {
      // Command line arguments.
      int width;
      int height;
      double xcenter;
      double ycenter;
      double resolution;
      int maxiter;
      double gamma;

      // Initial pixel offsets from center.
      int xoffset;

Listing 24.1. MandelbrotClu.java (part 2)

Figure 24.2. MandelbrotClu strong scaling performance metrics

      int yoffset;

      // Table of hues.
      Color[] hueTable;

      // Worker task main program.
      public void main
         (String[] args)
         throws Exception
         {
         // Parse command line arguments.
         width = Integer.parseInt (args[0]);
         height = Integer.parseInt (args[1]);
         xcenter = Double.parseDouble (args[2]);
         ycenter = Double.parseDouble (args[3]);
         resolution = Double.parseDouble (args[4]);
         maxiter = Integer.parseInt (args[5]);
         gamma = Double.parseDouble (args[6]);

         // Initial pixel offsets from center.
         xoffset = -(width - 1) / 2;
         yoffset = (height - 1) / 2;

         // Create table of hues for different iteration counts.
         hueTable = new Color [maxiter + 2];
         for (int i = 1; i <= maxiter; ++ i)
            hueTable[i] = new Color().hsb
               (/*hue*/ (float) Math.pow ((double)(i - 1)/maxiter, gamma),
                /*sat*/ 1.0f,
                /*bri*/ 1.0f);
         hueTable[maxiter + 1] = new Color().hsb (1.0f, 1.0f, 0.0f);

         // Compute all rows and columns.
         workerFor() .schedule (dynamic) .exec (new Loop()
            {
            ColorArray pixelData;

            public void start()
               {
               pixelData = new ColorArray (width);
               }

            public void run (int r) throws Exception
               {
               double y = ycenter + (yoffset - r) / resolution;
               for (int c = 0; c < width; ++ c)
                  {
                  double x = xcenter + (xoffset + c) / resolution;

                  // Iterate until convergence.
                  int i = 0;
                  double aold = 0.0;
                  double bold = 0.0;
                  double a = 0.0;
                  double b = 0.0;
                  double zmagsqr = 0.0;

Listing 24.1. MandelbrotClu.java (part 3)

Figure 24.2 plots the running times, speedups, and efficiencies I observed. The running time plots' behavior is peculiar. The running times decrease as the number of cores increases, more or less as expected with strong scaling, but only up to a certain point. At around 36 or 48 cores, the running time plots flatten out, and there is no further reduction as more cores are added. Also, the efficiency plots show that there's a steady decrease in efficiency as more cores are added, much more of a drop than we've seen before. What's going on?

Fitting a running time model to the measured data yields a formula of the form

   T = (a + bN) + (c + dN)K + (e + fN)/K,    (24.1)

where N is the problem size, K is the number of cores, and a through f are the fitted coefficients. Plugging a certain problem size N into Equation 24.1 yields a running time formula as a function of just K. For example, plugging in the problem size (the number of inner loop iterations) of the image whose model is plotted in Figure 24.3, the formula takes the form

   T = A + BK + C/K,    (24.2)

with A, B, and C constants for that problem size. For the numbers of cores I used, the second term in Equation 24.2 is negligible compared to the other terms. Figure 24.3 plots the first and third terms separately in black, along with their sum T in red. Because the third term's coefficient is so much larger than the first term's coefficient, the third term

Figure 24.3. MandelbrotClu running time model

                  while (i <= maxiter && zmagsqr <= 4.0)
                     {
                     ++ i;
                     a = aold*aold - bold*bold + x;
                     b = 2.0*aold*bold + y;
                     zmagsqr = a*a + b*b;
                     aold = a;
                     bold = b;
                     }

                  // Record number of iterations for pixel.
                  pixelData.color (c, hueTable[i]);
                  }

               putTuple (new OutputTuple (r, pixelData));
               }
            });
         }
      }

   // Output PNG file writing task.
   private static class OutputTask
      extends Task
      {
      // Command line arguments.
      int width;
      int height;
      File filename;

      // For writing PNG image file.
      ColorPngWriter writer;
      ColorImageQueue imageQueue;

      // Task main program.
      public void main
         (String[] args)
         throws Exception
         {
         // Parse command line arguments.
         width = Integer.parseInt (args[0]);
         height = Integer.parseInt (args[1]);
         filename = new File (args[7]);

         // Set up for writing PNG image file.
         writer = new ColorPngWriter (height, width,
            new BufferedOutputStream
               (new FileOutputStream (filename)));
         filename.setReadable (true, false);
         filename.setWritable (true, false);
         imageQueue = writer.getImageQueue();

         // Overlapped pixel data gathering and file writing.
         parallelDo (new Section()
            {
            // Pixel data gathering section.
            public void run() throws Exception
               {
               OutputTuple template = new OutputTuple();

Listing 24.1. MandelbrotClu.java (part 4)

dominates for small K values, and T decreases as K increases. But as K gets larger, the third term gets smaller, while the first term stays the same. Eventually the third term becomes smaller than the first term. After that, the running time T flattens out and approaches the first term as K increases.

There's an important lesson here. When doing strong scaling on a cluster parallel computer, you don't necessarily want to run the program on all the cores in the cluster. Rather, you want to run the program on only as many cores as are needed to minimize the running time. This might be fewer than the total number of cores. Measuring the program's performance and deriving a running time model, as I did above, lets you determine the optimum number of cores to use; a short sketch of that calculation appears after the Points to Remember below. For the images I computed, the running times on 36 cores were very nearly the same as the running times on 120 cores. So on the tardis cluster I could compute three images on 36 cores each in about the same time as I could compute one image on 120 cores. Limiting the number of cores per job would improve utilization of the cluster, allowing more jobs to run in a given amount of time.

This scaling behavior is a consequence of Amdahl's Law. If you run a parallel program on too many cores, the sequential portion (for the Mandelbrot Set program, the portion that writes the output image file) is going to dominate the parallelizable portion, and you won't get any further decreases in the running time. We didn't see this happening with the multicore parallel program because we could scale up to only 12 cores on one tardis node. Now, with the cluster parallel program, we can scale up to 120 cores on the whole tardis cluster, and we can observe the diminishing returns.

Points to Remember

• In a cluster parallel program that must write (or read) a file, consider doing the file I/O in a task that runs in the job's process.
• Use tuple space to convey the worker tasks' results to the output task. Define a tuple subclass whose fields hold the output results.
• When doing strong scaling on a cluster parallel program, as the number of cores increases, the running time initially decreases, but eventually flattens out.
• Use the program's running time model, fitted to the program's measured running time data, to determine the optimum number of cores on which to run the program, that is, the smallest number of cores needed to minimize the running time.
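To make that last point concrete, the sketch below (not part of the MandelbrotClu program) evaluates a fitted model of the form of Equation 24.2, T = A + BK + C/K, over a range of core counts and reports the K that minimizes it. The coefficient values are hypothetical placeholders; in practice you would substitute the values obtained by fitting Equation 24.2 to your own measurements. Setting dT/dK = B - C/K^2 to zero shows the minimum lies near K = sqrt(C/B), and comparing the predicted times near that point with the time on all available cores tells you how many cores the job really needs.

   // Sketch: find the core count K that minimizes a fitted running time
   // model T(K) = A + B*K + C/K (Equation 24.2). The coefficients below
   // are hypothetical placeholders, not MandelbrotClu's actual fitted values.
   public class OptimumCores
      {
      public static void main (String[] args)
         {
         double A = 50.0;    // constant (sequential) term, seconds -- placeholder
         double B = 0.5;     // per-core overhead term -- placeholder
         double C = 2000.0;  // parallelizable term -- placeholder
         int bestK = 1;
         double bestT = Double.MAX_VALUE;
         for (int K = 1; K <= 120; ++ K)
            {
            double T = A + B*K + C/K;
            if (T < bestT)
               {
               bestT = T;
               bestK = K;
               }
            }
         System.out.printf ("Optimum K = %d, predicted T = %.1f sec%n",
            bestK, bestT);
         System.out.printf ("Analytic estimate sqrt(C/B) = %.1f%n",
            Math.sqrt (C/B));
         }
      }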

               OutputTuple tuple;
               for (int i = 0; i < height; ++ i)
                  {
                  tuple = (OutputTuple) takeTuple (template);
                  imageQueue.put (tuple.row, tuple.pixelData);
                  }
               }
            },
         new Section()
            {
            // File writing section.
            public void run() throws Exception
               {
               writer.write();
               }
            });
         }
      }
   }

Listing 24.1. MandelbrotClu.java (part 5)
