THE AUSTRALIAN NATIONAL UNIVERSITY First Semester Examination June 2011 COMP4300/6430 Parallel Systems
Study Period: 15 minutes
Time Allowed: 3 hours
Permitted Materials: Non-Programmable Calculator

This exam is worth 40% of your total course mark. Exam questions total 100 marks, with marks awarded according to the breakdown given. Answer ALL questions. Write your answers using a black or blue pen. Your answers should be clear and concise; marks may be lost for supplying irrelevant information.
Question 1 [9 marks]

(a) [1 mark] Explain the differences between blocking and non-blocking communications.
(b) [2 marks] In the context of parallel computing, what is a superlinear speedup? Explain why you might sometimes observe such a speedup.
(c) [3 marks] For a communication network represented as an undirected graph, what are (i) the diameter and (ii) the bisection bandwidth? Why are these concepts important in designing communication networks for parallel computers?
(d) [3 marks] For what class of computing system was the Hadoop file system designed? Briefly describe two of its main features.

Question 2 [25 marks]

Four programming models/languages/libraries that are applicable to distributed and/or shared-memory parallel computers are: (A) MPI; (B) Pthreads; (C) OpenMP; (D) Cilk.

(a) [20 marks] For each of (A)-(D): (i) give a brief description of what it is; (ii) mention the class of parallel computers on which it is applicable; (iii) comment on its advantages and disadvantages; (iv) give an example of an application for which it is well-suited.
(b) [3 marks] On what parallel computer architectures could MPI and OpenMP be combined? Give an example of an application where such a combination would be useful. Justify your answer.
(c) [2 marks] Which of (A)-(D) above would you recommend to a parallel programming novice? Explain your answer.
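As an illustration related to Question 2(b), here is a minimal hybrid sketch, not a model answer, assuming a cluster of shared-memory nodes: MPI distributes work across processes (typically one per node) while OpenMP threads share each process's portion. The problem (a dot product), the slice length NLOC and the array values are placeholders introduced purely for illustration.

    /* Hybrid MPI + OpenMP sketch: MPI across nodes, OpenMP within a node. */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* each MPI process owns a local slice of the vectors */
        enum { NLOC = 1000 };
        double x[NLOC], y[NLOC], local = 0.0, global = 0.0;
        for (int i = 0; i < NLOC; i++) { x[i] = 1.0; y[i] = 2.0; }

        /* OpenMP threads cooperate on the node-local partial sum */
        #pragma omp parallel for reduction(+:local)
        for (int i = 0; i < NLOC; i++)
            local += x[i] * y[i];

        /* MPI combines the per-process partial sums across distributed memory */
        MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0) printf("dot = %f (over %d processes)\n", global, size);

        MPI_Finalize();
        return 0;
    }

In this sketch all MPI calls happen outside the OpenMP parallel region, so plain MPI_Init suffices; code that calls MPI from inside threads would instead need MPI_Init_thread with an appropriate threading level.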
Question 3 [25 marks]

The following C code performs a binary radix sort of an array val of N non-negative integers whose maximum value is at most MxInt:

    #include <stdlib.h>   /* malloc, free */

    void radixsort (int *val, int N, int MxInt)
    {
        int i, j, low, high, level;
        int *tmp;

        tmp = (int*) malloc (N*sizeof(int));
        if (tmp == NULL) {
            /* Error-handling code omitted */
        }

        for (i = 1, level = 0; i <= MxInt; i *= 2, level++) {
            low = high = 0;
            /* partition on the current bit: 0-bits stay in val, 1-bits go to tmp */
            for (j = 0; j < N; j++) {
                if (((val[j] >> level) & 1) == 0)
                    val[low++] = val[j];
                else
                    tmp[high++] = val[j];
            }
            /* append the 1-bit elements after the 0-bit elements */
            for (j = 0; j < high; j++)
                val[low+j] = tmp[j];
        }
        free (tmp);
    }

You can assume that the code compiles and runs correctly on a single core.

(a) [15 marks] Explain how you would parallelise this code for a uniform memory access (UMA) shared-memory system using OpenMP. You are free to use additional storage if this is necessary for your solution. You should provide pseudo-code, i.e. you are not required to write syntactically correct C code or OpenMP pragmas, but you should make your intentions clear.
(b) [6 marks] (i) Discuss how you would expect your code to perform as a function of the parameters N, MxInt, and the number of threads used. (ii) How might the performance differ on a non-uniform memory access (NUMA) machine? To be specific, consider the case of up to eight threads on a four-processor machine where each processor has two cores.
(c) [4 marks] Outline how a solution using Cilk would differ from your OpenMP solution to part (a).
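One possible approach to part (a), not the official solution: each OpenMP thread counts the 0-bits and 1-bits in its own contiguous chunk, exclusive prefix sums over those counts give each thread its write offsets, and the threads then scatter their chunks stably into the temporary array. The names radixsort_omp, count0, count1, zero_off and one_off are introduced here purely for illustration.

    #include <stdlib.h>
    #include <omp.h>

    void radixsort_omp(int *val, int N, int MxInt)
    {
        int *tmp    = malloc((size_t)N * sizeof(int));
        int T       = omp_get_max_threads();
        int *count0 = malloc((size_t)T * sizeof(int));
        int *count1 = malloc((size_t)T * sizeof(int));
        if (tmp == NULL || count0 == NULL || count1 == NULL) return;

        for (int bit = 1, level = 0; bit <= MxInt; bit *= 2, level++) {
            #pragma omp parallel
            {
                int t  = omp_get_thread_num();
                int nt = omp_get_num_threads();
                int lo = (int)((long long)N * t / nt);
                int hi = (int)((long long)N * (t + 1) / nt);

                /* Phase 1: count this chunk's 0-bits and 1-bits at this level */
                int c0 = 0;
                for (int j = lo; j < hi; j++)
                    if (((val[j] >> level) & 1) == 0) c0++;
                count0[t] = c0;
                count1[t] = (hi - lo) - c0;
                #pragma omp barrier

                /* Phase 2: exclusive prefix sums give this thread's write offsets */
                int zeros = 0, zero_off = 0, one_off = 0;
                for (int k = 0; k < nt; k++) zeros += count0[k];
                for (int k = 0; k < t; k++) { zero_off += count0[k]; one_off += count1[k]; }
                one_off += zeros;

                /* Phase 3: stable scatter of this chunk into disjoint parts of tmp */
                for (int j = lo; j < hi; j++)
                    if (((val[j] >> level) & 1) == 0) tmp[zero_off++] = val[j];
                    else                              tmp[one_off++]  = val[j];
            }
            /* Phase 4: copy back so the next bit level again reads from val */
            #pragma omp parallel for
            for (int j = 0; j < N; j++)
                val[j] = tmp[j];
        }
        free(count0); free(count1); free(tmp);
    }

Because each thread writes to a disjoint range of tmp, no locking is needed; the cost per bit level is two sweeps over the data plus a barrier, which is why performance depends on both N and the number of bit levels (roughly log2 of MxInt) as well as on the thread count.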
Question 4 [25 marks]

This question assumes a CPU (host) with attached GPU (device), programmed using CUDA. You are not required to write syntactically correct CUDA code, but you should make your intentions clear.

(a) [6 marks] In the context of a GPU programmed using CUDA, what are (i) threads; (ii) blocks; and (iii) global memory?

(b) [10 marks] The following fragment of C code performs matrix multiplication of n × n matrices A and B, and stores the result in a matrix C. The matrices are assumed to be stored in one-dimensional arrays with the usual C convention (contiguous by rows), and C must not overlap A or B.

    void MatMulOnHost (float *A, float *B, float *C, int n)
    {
        int i, j, k;
        float x, y, sum;

        for (i = 0; i < n; i++)
            for (j = 0; j < n; j++) {
                sum = 0.0;
                for (k = 0; k < n; k++) {
                    x = A[i*n+k];   /* A[i][k] */
                    y = B[k*n+j];   /* B[k][j] */
                    sum += x*y;
                }
                C[i*n+j] = sum;     /* C[i][j] */
            }
    }

Describe how you would convert this to a routine MatMulKernel to run on a GPU, using CUDA. How would you invoke MatMulKernel from the host?

(c) [4 marks] Outline how you would allocate and free memory for the arrays A, B and C on the GPU, and how you would transfer data from and to the host CPU.

(d) [5 marks] Why is matrix multiplication in the class of problems that can be computed efficiently on a CUDA-enabled GPU? Would your routine MatMulKernel give good performance on the GPU? If not, suggest how it might be modified to give better performance.
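A possible sketch for parts (b) and (c), assuming CUDA C and not intended as a model answer: one thread computes one element of C, and a host routine MatMulOnDevice (a name introduced here for illustration, as is the block size TILE) allocates device memory, copies the inputs across, launches the kernel over a two-dimensional grid, and copies the result back.

    #include <cuda_runtime.h>

    #define TILE 16

    __global__ void MatMulKernel(const float *A, const float *B, float *C, int n)
    {
        /* one thread computes one element C[i][j] of the result */
        int i = blockIdx.y * blockDim.y + threadIdx.y;   /* row    */
        int j = blockIdx.x * blockDim.x + threadIdx.x;   /* column */
        if (i < n && j < n) {
            float sum = 0.0f;
            for (int k = 0; k < n; k++)
                sum += A[i*n + k] * B[k*n + j];
            C[i*n + j] = sum;
        }
    }

    void MatMulOnDevice(const float *A, const float *B, float *C, int n)
    {
        size_t bytes = (size_t)n * n * sizeof(float);
        float *dA, *dB, *dC;

        /* part (c): allocate device memory and copy the inputs host -> device */
        cudaMalloc((void **)&dA, bytes);
        cudaMalloc((void **)&dB, bytes);
        cudaMalloc((void **)&dC, bytes);
        cudaMemcpy(dA, A, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(dB, B, bytes, cudaMemcpyHostToDevice);

        /* part (b): launch a 2-D grid of TILE x TILE thread blocks covering C */
        dim3 threads(TILE, TILE);
        dim3 blocks((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
        MatMulKernel<<<blocks, threads>>>(dA, dB, dC, n);

        /* copy the result device -> host and release device memory */
        cudaMemcpy(C, dC, bytes, cudaMemcpyDeviceToHost);
        cudaFree(dA); cudaFree(dB); cudaFree(dC);
    }

Relevant to part (d): this naive kernel re-reads A and B from global memory n times per output element; a tiled version that stages TILE × TILE sub-blocks in shared memory reuses each loaded element TILE times and usually performs considerably better.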
Question 5 [16 marks]

MapReduce is a programming paradigm well-suited for embarrassingly parallel applications.

(a) [8 marks] Give an overview of the MapReduce programming model and how it implements parallelism. Comment on aspects such as task granularity, load balancing, fault tolerance, and mechanisms to achieve data locality.

(b) [2 marks] Give an example of a problem that is well-suited to be solved using MapReduce.

(c) [6 marks] Suppose that you have been given two documents with content such as the following:

    Document1: Test test Test test test
    Document2: This is a test file

Based on your experience in developing a MapReduce program for inverted index creation, give MapReduce program pseudo-code to generate a list of locations (word number in the document and identifier for the document) for each word occurrence. An identifier for each document is provided as the key to the map() function. The output generated by your program should look like:

    Test    Document1: 1, 3
    test    Document1: 2, 4, 5    Document2: 4
    This    Document2: 1
    is      Document2: 2
    a       Document2: 3
    file    Document2: 5
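For part (c), the following is a minimal sequential C simulation of the map and reduce logic, not a Hadoop program and not a model answer: map() emits one (word, "Document: position") pair per word occurrence, and reduce_all() stands in for the shuffle plus reduce step by grouping the emitted pairs on the word key. The helper emit(), the fixed-size pair buffer and the hard-coded document strings are stand-ins introduced purely for illustration, and the reduce step here does not merge same-document positions into the "Document1: 2, 4, 5" form shown in the question.

    #include <stdio.h>
    #include <string.h>

    #define MAX_PAIRS 64

    /* one intermediate (key, value) pair: word -> "DocumentN: position" */
    struct pair { char word[32]; char loc[32]; };
    static struct pair pairs[MAX_PAIRS];
    static int npairs = 0;

    /* emit() stands in for the framework's collection of intermediate pairs */
    static void emit(const char *word, const char *doc, int pos)
    {
        snprintf(pairs[npairs].word, sizeof pairs[npairs].word, "%s", word);
        snprintf(pairs[npairs].loc,  sizeof pairs[npairs].loc,  "%s: %d", doc, pos);
        npairs++;
    }

    /* map(): key = document identifier, value = document text;
       emits (word, location) for every word occurrence, counting from 1 */
    static void map(const char *doc_id, const char *text)
    {
        char buf[256];
        int pos = 1;
        snprintf(buf, sizeof buf, "%s", text);
        for (char *w = strtok(buf, " "); w != NULL; w = strtok(NULL, " "))
            emit(w, doc_id, pos++);
    }

    /* reduce step: for each distinct word (key), print every location emitted
       for it; a real framework would group the values during the shuffle */
    static void reduce_all(void)
    {
        int done[MAX_PAIRS] = {0};
        for (int i = 0; i < npairs; i++) {
            if (done[i]) continue;
            printf("%s", pairs[i].word);
            for (int j = i; j < npairs; j++)
                if (!done[j] && strcmp(pairs[j].word, pairs[i].word) == 0) {
                    printf("  %s", pairs[j].loc);
                    done[j] = 1;
                }
            printf("\n");
        }
    }

    int main(void)
    {
        map("Document1", "Test test Test test test");
        map("Document2", "This is a test file");
        reduce_all();            /* one output line per distinct word */
        return 0;
    }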