ENCM 501 Winter 2016 Assignment 1 for the Week of January 25


Steve Norman
Department of Electrical & Computer Engineering
University of Calgary
January 2016

Assignment instructions and other documents for ENCM 501 can be found at http://people.ucalgary.ca/~norman/encm501winter2016/

1 Administrative details

1.1 Each student must hand in his or her own assignment

Later in the course, you will be allowed to work in pairs on some assignments.

1.2 Due Dates

The Due Date for this assignment is 3:30pm, Thursday, Jan. 28. The Late Due Date is 3:30pm, Friday, Jan. 29.

The penalty for handing in an assignment after the Due Date but before the Late Due Date is 3 marks. In other words, X/Y becomes (X - 3)/Y if the assignment is late. There will be no credit for assignments turned in after the Late Due Date; they will be returned unmarked.

1.3 Marking scheme

A        4 marks
B        5 marks
C        6 marks
D        3 marks
E        4 marks
total   22 marks

1.4 How to package and hand in your assignments

Please see the instructions in Assignment 1.

2 Exercise A: More about MIPS64 instructions

2.1 Read This First

This exercise extends the loop programming example presented in the tutorial of Wednesday, January 20 and solved on slide 9 of the Thursday, January 21 lecture. We are going to look at some simple optimizations that a C compiler for MIPS64 would likely make.

Two instructions useful for an optimizing compiler are the conditional move instructions MOVN and MOVZ:

MOVN dest, src1, src2    if src2 ≠ 0, copy src1 to dest, otherwise do nothing.
MOVZ dest, src1, src2    if src2 = 0, copy src1 to dest, otherwise do nothing.

2.2 What to Do

Rewrite the assembly language loop from slide 9 of Slide Set 2A, so that it does the same job, but contains no jump instructions and only one branch instruction. If possible, eliminate all NOP instructions.

(Remark: Using conditional moves instead of branches to implement short if statements can really help performance, because branches can cause pipeline stalls.)

2.3 What to Hand In

Hand in typed or neatly hand-written assembly language.

3 Exercise B: Loop unrolling

3.1 Read This First

Loop unrolling is a relatively simple and sometimes effective compiler optimization.

3.2 What to Do, Part I

Do a Web search for "loop unrolling", then write a few short paragraphs to explain what loop unrolling is. Put it in your own words; do not simply copy-and-paste.

3.3 What to Do, Part II

Rewrite your assembly code from Exercise A, unrolling the loop by a factor of 4, so that there are a total of 25 passes through the loop and the loop body contains 4 LD instructions.

3.4 What to Do, Part III

Briefly describe changes you would need to make to unroll the loop by a factor of 8. Note that 12 × 8 is 96, and 13 × 8 is 104.

3.5 What to Do, Part IV

Suppose that you have a program with hundreds of C functions, and that most of those functions contain loops that are easy for a compiler to unroll. Why could it be a bad idea to ask the compiler to unroll all the loops? Specifically, what could go wrong if the program is for an embedded system with a very small memory? What could go wrong if the program is for a desktop computer with 3 levels of cache and a huge amount of DRAM?
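
To make the idea of loop unrolling from Exercise B concrete, here is a minimal sketch in C rather than MIPS64 assembly. It is not the loop from slide 9; the function names and the choice of summing an array of long values are assumptions made only for illustration. The unrolled version does four loads per pass and uses a short cleanup loop for any elements left over when the element count is not a multiple of 4.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward loop: one array element per pass. */
long sum_simple(const long *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Same job, unrolled by a factor of 4 (hypothetical example).
   Each pass of the main loop handles four elements, so the loop
   counter is updated and tested only once per four loads. */
long sum_unrolled4(const long *a, size_t n)
{
    assert(a != NULL || n == 0);

    long sum = 0;
    size_t i = 0;

    /* Main unrolled loop: runs while at least 4 elements remain. */
    for (; i + 4 <= n; i += 4) {
        sum += a[i];
        sum += a[i + 1];
        sum += a[i + 2];
        sum += a[i + 3];
    }

    /* Cleanup loop: handles the 0 to 3 elements left over. */
    for (; i < n; i++)
        sum += a[i];

    return sum;
}
```

With 100 elements the main loop of sum_unrolled4 makes 25 passes and the cleanup loop does nothing. The unrolled function is noticeably longer than the simple one, which is worth keeping in mind when thinking about Part IV.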

3.6 What to Hand In

Typed or neatly hand-written assembly language for Part II, well-explained answers for the other parts.

4 Exercise C: Comparing run times in a SPEC-like framework

4.1 Read This First

This exercise is designed to give some insight into the structure of the SPEC CPU benchmarks: how performance is reported for a suite of programs running on a number of different systems.

The tiny programs you will work with here are convenient, but contrary to the goals of SPEC in a couple of ways:

- The work they do (filling up data structures with integers, then traversing the data structures to add up those same integers) is not at all like the work done by the real applications in the SPEC suites.
- To avoid making students wait and wait, the programs in this exercise run for just a few seconds, not minutes or hours. Longer runs would result in less measurement error.

4.2 Attention

You must do this exercise on one of the machines labeled Optiplex 755 in ICT 320. (Most but not quite all of the boxes in that room have that label.) This is because (a) I want all students to do the work with identical hardware and software and (b) supporting this assignment for various compilers and libraries on various operating systems is too much work.

4.3 Cygwin64

The reference platform for programming exercises in this course is Cygwin64, which is installed on the machines in ICT 320. Cygwin64 brings two important capabilities to a Windows box:

- a command-line interface and set of utility programs that is very similar to what you find on a typical Linux or other Unix-like system;
- a compiler, linker and libraries that allow you to build C and C++ programs that rely on calls to functions typically found in Unix-like libraries.

If you've never worked with Cygwin before, you can learn the basics needed for ENCM 501 by trying some of the lab exercises from Lab 1 of the Fall 2015 version of ENCM 339. I've posted the relevant documents and C source files on the ENCM 501 Assignments page.

(Cygwin64 is fairly easy to install on 64-bit Windows 7 and Windows 8.1, if you would like to put it on your own machine. I imagine that it works well on Windows 10, but I haven't tried that. Go to https://www.cygwin.com/ for more about downloading and installing Cygwin64.)

4.4 What to Do, Part I

There are four source files you need to copy. They should be easy to find following links from the ENCM 501 home page. Read the files to get a rough idea of what the code does. Don't worry if some of the details are unfamiliar.

There are two programs in the suite, Array and Set. Instead of benchmarking different hardware, you're going to benchmark the same hardware several times with a variety of optimization options presented to the compiler. The reference machine data will come from running the programs compiled without optimization.

You will likely find that running times for a given executable, run many times, are all slightly different from each other. I recommend that you run each executable about 10 times, throw out the worst 5 run times, and take the average of the best 5. The rationale is that other programs running on the computer at the same time can sometimes interfere to make a run time unusually bad, but can't do anything to make a run time unusually good.

Here are the four systems to test, with each of the two programs:

- Reference: no compiler optimizations at all.
- O2: with -O2 optimization.
- O2-unroll: with -O2 optimization, plus loop unrolling.
- O3: with -O3 optimization.

The command to build Array for the Reference system is

    gcc Array.c ts_funcs.c -o Array -lrt

(On Cygwin, the executable will be called Array.exe; to run the executable the commands ./array and ./array.exe both work.)

To build Array for O2-unroll, it's

    gcc -O2 -funroll-loops Array.c ts_funcs.c -o Array -lrt

To build Set for O3, use

    g++ -O3 Set.cpp ts_funcs.c -o Set -lrt

I hope those three examples are enough to let you figure out the remaining cases. (The option -lrt is needed so that the linker can find the library with clock_gettime.)

Get run time data for both programs on all four systems, then determine SPEC-like scores for O2, O2-unroll, and O3, using Reference as a reference machine.

4.5 What to Hand In

Write a brief report explaining exactly how you collected all your data and how you computed your SPEC-like scores.

4.6 Optional extra part (no marks)

It's interesting to look at the instructions chosen by the compiler with different optimization settings. For example, here are a couple of commands to generate assembly language from Array.c:

    gcc -S -O2 Array.c -o ArrayO2.s
    gcc -S -O2 -funroll-loops Array.c -o ArrayO2unroll.s

If you do that, look at the two .s files to see how relatively large and messy the code for fill_array and sum is when you ask for loop unrolling.
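
The data reduction recommended in Part I of this exercise (keep the best 5 of about 10 run times, average them, then compare against Reference) can be sketched in C as below. This is only one possible way to organize the arithmetic, not a required method; the function names mean_of_best and spec_ratio are hypothetical. The ratio follows the usual SPEC convention of reference time divided by measured time, so a value above 1 means the tested system is faster than Reference.

```c
#include <assert.h>
#include <stdlib.h>

/* qsort comparison function: ascending order of doubles. */
static int cmp_double(const void *p, const void *q)
{
    double x = *(const double *)p;
    double y = *(const double *)q;
    return (x > y) - (x < y);
}

/* Sort the run times in place, then return the average of the
   `keep` smallest (best) values. For this exercise, n would be
   about 10 and keep would be 5. */
double mean_of_best(double *times, size_t n, size_t keep)
{
    assert(keep > 0 && keep <= n);

    qsort(times, n, sizeof times[0], cmp_double);

    double sum = 0.0;
    for (size_t i = 0; i < keep; i++)
        sum += times[i];
    return sum / (double)keep;
}

/* SPEC-style ratio for one program on one system:
   reference run time divided by measured run time. */
double spec_ratio(double ref_time, double test_time)
{
    assert(ref_time > 0.0 && test_time > 0.0);
    return ref_time / test_time;
}
```

You would call mean_of_best once per program per system, then spec_ratio once per program for each of O2, O2-unroll, and O3. The real SPEC suites summarize the per-program ratios for a system with a geometric mean; with only two programs, that is the square root of the product of the Array ratio and the Set ratio.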

5 Exercise D

5.1 What to Do

Exercise 1.15 on page 67 of the textbook.

5.2 What to Hand In

Solutions, showing clearly how you obtained your answers.

6 Exercise E

6.1 Read This First

One of the points of this exercise is that Amdahl's law can be applied at a fine-grained level: thinking about what happens if there are speedups for some kinds of instructions but not for others.

6.2 What to Do

Exercise 1.16 on pages 67-68 of the textbook. In part (b) assume that the 10% number refers to the processor without the floating-point enhancement.

6.3 What to Hand In

Solutions, showing clearly how you obtained your answers.
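
As a rough illustration of applying Amdahl's law at a fine-grained level, the sketch below computes overall speedup when each category of work has its own speedup factor. The function name amdahl_speedup and the idea of passing the categories as two parallel arrays are assumptions for illustration only; nothing here is taken from the textbook exercise. Each fraction is a share of the original execution time, and the fractions are expected to add up to 1.

```c
#include <assert.h>
#include <stddef.h>

/* Overall speedup by Amdahl's law with several categories of work.
   fraction[i] is the share of ORIGINAL execution time spent in
   category i (the shares should sum to 1), and speedup[i] is the
   speedup for that category (1.0 means no change).
   New time, relative to an old time of 1, is the sum of
   fraction[i] / speedup[i]; overall speedup is its reciprocal. */
double amdahl_speedup(const double *fraction, const double *speedup,
                      size_t n)
{
    assert(n > 0);

    double new_time = 0.0;
    for (size_t i = 0; i < n; i++) {
        assert(fraction[i] >= 0.0 && speedup[i] > 0.0);
        new_time += fraction[i] / speedup[i];
    }
    return 1.0 / new_time;
}
```

For example, with made-up numbers, if half of the original time is spent in one kind of instruction that becomes 4 times faster and the other half is unchanged, the new time is 0.5/4 + 0.5/1 = 0.625 of the old time, for an overall speedup of 1.6.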