Chapter 24a More Numerics and Parallelism
1 Chapter 24a More Numerics and Parallelism
Nick Maclaren
ix-courses/cplusplus
This was written by me, not Bjarne Stroustrup
2 Numeric Algorithms
These are only accumulate(), inner_product(), partial_sum() and adjacent_difference()
Not what numerical programmers call algorithms
I can't see any particular reason to use them
C++ developers rarely pay attention to numerical properties, or high performance, unlike Fortran ones
They are likely to be just the obvious code
The first three can be implemented much better
I recommend doing as I show in the exercises: BLAS, long double or compensated summation
3 Gaussian Elimination
The book teaches Gaussian elimination with pivoting as an example of a typical numeric algorithm
You may need to write such code, in other contexts
But DON'T just copy that code, for reasons I shall explain
I am NOT criticising the book or code, merely stressing the software reuse principle
The executive summary here is: use LAPACK
4 Using Libraries
The first approach is to call a (good!) library
These usually have a Fortran or C interface
There are some C++ libraries around, too
They are of VERY mixed quality
NAG, LAPACK, FFTW are reliable
Netlib is patchy, but some of it is good
Numerical Recipes is NOT reliable
5 How to Write Them
Choose a numerically competent algorithm!
This is the key to accuracy and performance
Do NOT use Numerical Recipes as a guide
The NAG documentation is much better
When coding them, watch out for numeric errors
Typically accumulation and cancellation errors
For these, there are some adequate solutions
Or subtracting/dividing two nearly-equal numbers
This one is harder to resolve, and I shall skip it
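To make the last kind of error concrete, here is a minimal sketch of what subtracting two nearly-equal numbers does. The function names are hypothetical and the example is a textbook one chosen for illustration, not taken from the course. For small x, cos(x) rounds to a value at or next to 1.0, so the naive form loses essentially all of its significant digits. In this particular case an algebraic identity happens to avoid the subtraction; that is an assumption specific to this example and not a general cure.

```cpp
#include <cassert>
#include <cmath>

// Naive form: subtracts two nearly-equal numbers when x is small.
double one_minus_cos_naive(double x)
{
    return 1.0 - std::cos(x);
}

// Same quantity via the identity 1 - cos(x) = 2 sin^2(x/2),
// which involves no subtraction of nearly-equal values.
double one_minus_cos_stable(double x)
{
    assert(std::isfinite(x));
    const double s = std::sin(0.5 * x);
    return 2.0 * s * s;
}
```

With x = 1.0e-9 the true value is close to 5.0e-19; the second function should reproduce that to near full precision, while the first typically returns zero.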
6 Improving Accuracy
Often arises when using accumulate() or inner_product()
Only simple solution is to use long double for the accumulation
It's useful for the multiplication in inner_product(), too, but is not essential
This is left as an exercise (see later)
It may or may not help, for very complicated reasons
You can actually do a lot better (in accuracy)
But it's NOT a task for the non-expert
Both numerically and in the C++ and C languages
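The long double approach can be sketched as follows. This is not the specimen answer; the function names sum_long and dot_long are hypothetical. It relies only on the standard rule that accumulate() and inner_product() accumulate in the type of the initial value. Whether it helps depends on the platform: on some compilers long double has the same precision as double, in which case nothing is gained.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Sum in long double: the accumulator takes the type of the initial
// value (0.0L), not the type of the elements.
long double sum_long(const std::vector<double>& v)
{
    return std::accumulate(v.begin(), v.end(), 0.0L);
}

// Inner product accumulated in long double. The second functor widens
// one operand so that the multiplication is also done in long double;
// with the default operators the product would be formed in double.
long double dot_long(const std::vector<double>& x,
                     const std::vector<double>& y)
{
    assert(x.size() == y.size());
    return std::inner_product(
        x.begin(), x.end(), y.begin(), 0.0L,
        [](long double a, long double b) { return a + b; },
        [](double a, double b) { return static_cast<long double>(a) * b; });
}
```

The result can be converted back to double once, at the end, so that only one final rounding is added.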
7 Improving Accuracy
Do not, repeat NOT, simply code Kahan summation
A nightmare in C++, even for the VERY few experts
The problem is primarily the C and C++ standards
They don't specify what most people think they do
All compilers, versions and options will vary
Look in the specimen answers for this chapter on the local course Web site
fancy_accumulate.cpp and fancy_inner.cpp
And read the comments: they are not exaggerated!
Those work as they stand under gcc but not Intel
I can get them working under Intel, painfully
8 Doing Better
Rule number one is to look for a better algorithm
And at the highest level possible, too
It is tricky, but the potential gains are huge
You can extend the arithmetic's precision
Do that only when a few 'operations' are the problem
Addition/subtraction is the only easy case
I can do, and have done, multiplication, math. functions etc.
It's anywhere from painful to fiendish or worse
The C/C++ standards are the real problem
It can often be easier in assembler :-(
9 BLAS and LAPACK
Always a good idea to use their interface
Have option of writing your own or calling them
Optimised libraries can be a LOT faster
Atlas, MKL, ACML etc., but not standard Linux ones
Mainly the level 3 BLAS, but can include level 1
E.g. xgemm matrix multiply
inner_product() is level 1 (DDOT, ZDOT, ZDOTC)
The BLAS can increase accuracy, but generally don't
LAPACK generally uses the level 3 BLAS
Optimised ones include NAG, MKL, ACML
They are also numerically robust algorithms
10 Calling Them
Calling the BLAS and LAPACK interface:
The interface is usually Fortran 77
A vendor may provide a C one, or even a C++ one
The code may be in anything: it's not your problem
This is not a big deal, but needs care
Fortran 77 is to modern Fortran as C is to C++
And you can usually get between Fortran 77 and C
BLAS/LAPACK are unmodified Fortran 77
This can't be called entirely portably
The next slide gives the USUAL rules
11 Calling Fortran 77
Call via extern "C"
BLAS name DDOT becomes ddot_
A Fortran SUBROUTINE is a C void function
ALL arguments are passed as pointers
double and int carry across, including function results
complex and C character arrays are OK, with care
Do NOT call functions returning either as the result
Write a small Fortran subroutine and return via arguments
LOGICAL and character lengths are a bit of a problem
In Fortran subroutine, translate LOGICAL to int
For character strings, pass the length separately
Fortran character strings are not null-terminated
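The rules above can be sketched for DDOT as follows. The declaration follows the usual convention only (lower-case name, trailing underscore, every argument a pointer); as the previous slide says, this is not entirely portable. The wrapper name blas_dot is hypothetical. So that the sketch links without a BLAS installed, it ends with a stand-in definition of ddot_ that handles positive increments only; that stand-in is purely an assumption for illustration and should be deleted when linking against a real library.

```cpp
#include <cassert>
#include <vector>

// Usual (not guaranteed) C view of the Fortran 77 function
//     DOUBLE PRECISION FUNCTION DDOT(N, DX, INCX, DY, INCY)
extern "C" double ddot_(const int* n, const double* dx, const int* incx,
                        const double* dy, const int* incy);

// Thin C++ wrapper that hides the pass-by-pointer convention.
double blas_dot(const std::vector<double>& x, const std::vector<double>& y)
{
    assert(x.size() == y.size());
    const int n = static_cast<int>(x.size());
    const int one = 1;
    return ddot_(&n, x.data(), &one, y.data(), &one);
}

// STAND-IN ONLY (an assumption for this sketch): same calling convention
// as the Fortran routine, positive increments only. Remove this
// definition and link a real BLAS instead.
extern "C" double ddot_(const int* n, const double* dx, const int* incx,
                        const double* dy, const int* incy)
{
    double s = 0.0;
    for (int i = 0; i < *n; ++i)
        s += dx[i * *incx] * dy[i * *incy];
    return s;
}
```

Keeping the Fortran-style declaration in one small wrapper means that a change of naming convention on another platform touches a single place.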
12 Performance
It is possible to get array-handling C++ code to run as fast as Fortran (my specimen answers do, for example)
But it is MUCH harder to achieve
Quite a lot of that has to do with the last dimension varying fastest (row-major order)
The problems are mainly that most good array libraries are Fortran-based
This includes the BLAS and LAPACK
But there do seem to be some fundamental ones as well
E.g. find x such that b=a.x is more natural for column-major
Left solution (i.e. to find x such that b=x.a) fits row-major better
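The two storage orders mentioned above differ only in the index arithmetic. The following minimal sketch (all names hypothetical) shows both mappings for a matrix held in a flat array, and a copy from one order to the other. It also implies the usual consequence: a row-major buffer handed unchanged to a column-major library is seen as the transpose.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of element (i, j) when the last index varies fastest (C, C++).
inline std::size_t row_major(std::size_t i, std::size_t j, std::size_t cols)
{
    return i * cols + j;
}

// Offset of element (i, j) when the first index varies fastest
// (Fortran, and hence the BLAS and LAPACK).
inline std::size_t col_major(std::size_t i, std::size_t j, std::size_t rows)
{
    return i + j * rows;
}

// Copy a rows x cols row-major matrix into column-major storage.
std::vector<double> to_col_major(const std::vector<double>& a,
                                 std::size_t rows, std::size_t cols)
{
    assert(a.size() == rows * cols);
    std::vector<double> b(a.size());
    for (std::size_t j = 0; j < cols; ++j)
        for (std::size_t i = 0; i < rows; ++i)
            b[col_major(i, j, rows)] = a[row_major(i, j, cols)];
    return b;
}
```

For a 2x3 matrix, element (1, 2) in row-major order sits at the same offset as element (2, 1) of a 3x2 column-major matrix, which is the transpose relationship in miniature.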
13 Parallelism
Using multiple processes is easy
Distributed memory and message passing
Use MPI via C: see my MPI course for more
You will need to pack and unpack C++ classes
CilkPlus looks interesting; currently Intel only
I can't remember exactly which product, so it may cost
Intel are funding gcc to include it
I hope to investigate it and maybe write a course
It's a shared-memory C++ language extension
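The packing and unpacking of C++ classes mentioned above might look like the following sketch. The Sample class and the buffer layout are hypothetical, invented for illustration. It flattens an object into a contiguous array of double, which is the kind of buffer that can then be passed to MPI_Send with the MPI_DOUBLE datatype; no MPI calls are made here, so the code runs without an MPI installation. Storing the integers as doubles is an assumption that is exact only for magnitudes up to 2^53.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class, used only for illustration.
struct Sample {
    int id;
    std::vector<double> values;
};

// Flatten into one contiguous buffer: [id, count, values...].
std::vector<double> pack(const Sample& s)
{
    std::vector<double> buf;
    buf.reserve(s.values.size() + 2);
    buf.push_back(static_cast<double>(s.id));
    buf.push_back(static_cast<double>(s.values.size()));
    buf.insert(buf.end(), s.values.begin(), s.values.end());
    return buf;
}

// Rebuild the object on the receiving side, checking the length header.
Sample unpack(const std::vector<double>& buf)
{
    assert(buf.size() >= 2);
    Sample s;
    s.id = static_cast<int>(buf[0]);
    const std::size_t n = static_cast<std::size_t>(buf[1]);
    assert(buf.size() == n + 2);
    s.values.assign(buf.begin() + 2, buf.end());
    return s;
}
```

The receiver needs to know the buffer length before it can receive, so in practice the count is often sent first or obtained by probing the message.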
14 Shared Memory
Aargh!
This area of POSIX is a nightmare area
Its specification often makes no sense
Its memory model isn't compatible with C99's
Its synchronisation doesn't cover program state
C threading isn't usable by mere mortals
Experts could use it to write higher-level primitives
But I have reason to believe it won't work reliably
I haven't had time to complete a test program
15 OpenMP
This is the leader for shared-memory parallelism
When the requirement is performance
My OpenMP course describes a defensive strategy
Its specification makes even POSIX's look good
And it doesn't fit well with C++
Realistically, you can parallelise only C-style code
That's a soluble problem, in most cases
You can use C++ in serial code, including <vector>
Theoretically, OpenMP supports a lot more of C++
In practice, I would expect truly foul problems
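A minimal sketch of that split, with hypothetical function names: the parallel kernel is written in C style (raw pointer, plain signed loop index, scalar reduction), and the serial C++ code around it uses <vector>. Without OpenMP enabled in the compiler (for example, gcc without -fopenmp) the pragma is ignored and the loop simply runs serially.

```cpp
#include <cassert>
#include <vector>

// C-style kernel: the only part OpenMP is asked to parallelise.
double sum_squares(const double* a, long n)
{
    assert(n >= 0);
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i)
        sum += a[i] * a[i];
    return sum;
}

// Serial C++ wrapper: containers stay outside the parallel region.
double sum_squares_vec(const std::vector<double>& v)
{
    return sum_squares(v.data(), static_cast<long>(v.size()));
}
```

Note that a parallel reduction adds the terms in an unspecified order, so the rounded result can differ slightly from the serial one and from run to run.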
16 Other Shared-Memory
There are Boost facilities, too
DON'T rely on them
The shared-memory problem is NOT about the calls
It's not even about synchronisation etc.
It's ALL about the memory consistency model
The question is whether the compiler agrees with Boost
And much the same applies to any other facilities
There are a zillion threading libraries, all dangerous
As all experts agree, this CAN'T be done by a library
Language and compiler support is CRITICAL
17 Exercises
Instead of exercise 10, look up Marsaglia's DIEHARD or Knuth TAOCP, vol. 2
Code one of the better tests, e.g. the runs test
Use realistic sample sizes: millions or more
Or use the spacings test, which I have done
Generate a U(0,1) sample of size N and sort into order
The spacings are negative exponential, mean 1/(N+1)
Test using Kolmogorov-Smirnov or otherwise
LOTS of simulations rely on adjacency properties
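The spacings test described above can be sketched as follows. This is not the author's code: the function name, the choice of std::mt19937_64 as the generator under test, and the decision to include the two end gaps (giving N+1 spacings) are all assumptions made for illustration. The function returns the Kolmogorov-Smirnov distance between the empirical distribution of the spacings and an exponential distribution with mean 1/(N+1). Because the spacings are not independent, standard Kolmogorov-Smirnov critical values are only approximate here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Kolmogorov-Smirnov distance between the spacings of a sorted U(0,1)
// sample of size n and the exponential distribution with mean 1/(n+1).
double spacings_ks(std::size_t n, unsigned long long seed)
{
    assert(n >= 2);
    std::mt19937_64 gen(seed);  // generator under test (an assumption)
    std::uniform_real_distribution<double> u01(0.0, 1.0);

    std::vector<double> x(n);
    for (double& xi : x) xi = u01(gen);
    std::sort(x.begin(), x.end());

    // n+1 spacings, including the gaps to 0 and to 1.
    std::vector<double> gap(n + 1);
    gap[0] = x[0];
    for (std::size_t i = 1; i < n; ++i) gap[i] = x[i] - x[i - 1];
    gap[n] = 1.0 - x[n - 1];
    std::sort(gap.begin(), gap.end());

    const double rate = static_cast<double>(n + 1);
    const double m = static_cast<double>(gap.size());
    double d = 0.0;
    for (std::size_t i = 0; i < gap.size(); ++i) {
        const double f = 1.0 - std::exp(-rate * gap[i]);  // model CDF
        // Compare with the empirical CDF just before and just after gap[i].
        d = std::max(d, std::max(f - i / m, (i + 1) / m - f));
    }
    return d;
}
```

A call such as spacings_ks(1000000, seed) for several seeds gives distances that should shrink roughly like one over the square root of the sample size for a good generator; a distance that stays large is a warning sign.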
18 Exercises
The first two extra ones are about basic algorithms and accuracy, to give you a feel for that
One uses the BLAS, but it probably won't do much
Look at my code to see why I say what I do
The others are about using matrices
I use Cholesky as a basis, because it is simpler than Gaussian elimination
It is for positive definite real matrices ONLY, and needs no pivoting
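To show why Cholesky is the simpler case, here is a minimal unblocked sketch for a symmetric positive definite matrix in row-major storage. It is not the specimen answer, and the function name and storage layout are assumptions. There is no pivoting step at all; the only check is that each diagonal pivot stays positive, which fails exactly when the matrix is not positive definite (to within rounding).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// In-place factorisation A = L * L^T of an n x n symmetric positive
// definite matrix stored row-major. On success the lower triangle of a
// holds L; the strict upper triangle is left untouched.
// Returns false if a non-positive pivot is met.
bool cholesky_sketch(std::vector<double>& a, std::size_t n)
{
    assert(a.size() == n * n);
    for (std::size_t j = 0; j < n; ++j) {
        // Diagonal entry: subtract the squares already in row j of L.
        double d = a[j * n + j];
        for (std::size_t k = 0; k < j; ++k)
            d -= a[j * n + k] * a[j * n + k];
        if (!(d > 0.0)) return false;  // also rejects NaN
        const double ljj = std::sqrt(d);
        a[j * n + j] = ljj;

        // Entries below the diagonal in column j.
        for (std::size_t i = j + 1; i < n; ++i) {
            double s = a[i * n + j];
            for (std::size_t k = 0; k < j; ++k)
                s -= a[i * n + k] * a[j * n + k];
            a[i * n + j] = s / ljj;
        }
    }
    return true;
}
```

For production use the LAPACK routine is the better choice; this sketch is only meant to make the structure of the algorithm visible.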
19 Exercises
Exercise 13. Take accumulate.cpp and complete it (see statements marked CHANGE)
It's completed (and more) in fancy_accumulate.cpp
Exercise 14. Do the same for inner.cpp
You will need -lblas to link it
It's completed (and more) in fancy_inner.cpp
These exercises are fairly easy
The point of my fancy coding is to show why I make the remarks I do
There be dragons!
20 Exercises
I recommend doing exercises if you are going to need to do any serious n-d array handling
They are about the simplest realistic problem possible
Tackling a 'real' problem as a first step is insane
FAR WORSE, you are likely to do things in bad ways
They teach how to call the BLAS/LAPACK
And provide a proper interface to them!
They will expose some of the gotchas
Never underestimate the problems these can cause
21 Exercises
Exercise 15. Take Book_matrix_zero.cpp and complete it according to the instructions.
This will use the book's Matrix.h class to solve Cholesky by calling the BLAS/LAPACK, and by hand
There is a specimen answer in Book_matrix_one.h
Do not worry if the matrix multiply is very slow
Exercise 16. Attempt to optimise matmul()
Aim for same time as cholesky() on 1000x1000
Clue: transpose matrices to do all inner loops along fastest varying dimension
Use slices when you can do that
There is a specimen answer in Book_matrix_two.cpp
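The clue in exercise 16 can be illustrated with the following sketch. It uses plain row-major vectors and a hypothetical function name, not the book's Matrix.h class and not the specimen answer. The second operand is transposed once, so that the innermost loop reads both operands along contiguous memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// C = A * B for n x n row-major matrices. B is transposed first so the
// inner loop walks a row of A and a row of B^T, both contiguous.
std::vector<double> matmul_transposed(const std::vector<double>& a,
                                      const std::vector<double>& b,
                                      std::size_t n)
{
    assert(a.size() == n * n && b.size() == n * n);

    std::vector<double> bt(n * n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            bt[j * n + i] = b[i * n + j];

    std::vector<double> c(n * n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double s = 0.0;
            for (std::size_t k = 0; k < n; ++k)
                s += a[i * n + k] * bt[j * n + k];
            c[i * n + j] = s;
        }
    }
    return c;
}
```

The transpose costs order n squared operations against order n cubed for the multiply, so it is cheap for anything but tiny matrices.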
22 Exercises
Exercise 17. Change matmul(), cholesky() and solver() to use the BLAS and LAPACK
This will be a LOT faster if you use MKL, ACML etc., and faster (especially the solver) even with GNU versions
Be warned: this needs a clear head
I did it by comparing intermediate results with a working version on 3x3 matrices
The problem is storage order incompatibility
23 Exercises
Exercise 18. Take My_matrix_zero.cpp, and complete the program
Write a very simple 2-D double matrix class
Implement only what you need
Use first dimension varying fastest (column-major)
Complete the calls to the BLAS and LAPACK
Write a matmul()
There is a specimen answer in My_matrix_one.cpp
Do not try to be clever at this stage
This is a lot easier than you might think
24 Exercises
Exercise 19. Take the program you wrote in exercise 18 and extend it to work better
Use the techniques in this chapter
The higher level code should use inner product calls and A += z*a, where A is a 1-D slice
Do NOT try to provide a proper interface for slices
Provide them solely for matmul(), cholesky() and solver()
Do support both row and column slices
Try to get matrix multiply to run faster
There is a specimen answer in My_matrix_two.h
25 Exercises
My_matrix_three.h uses a high-precision inner product (from my fancy answer to exercise 14)
It doesn't make very much difference, to time or accuracy
The solver is twice as slow and still much less than machine accuracy
Why? The time is in memory access, and the accuracy limit is in the mathematics
LAPACK is robust
But, occasionally, this technique can be necessary
Exercise 20. For extreme masochists only.
Try repeating these exercises with the <valarray> and <gslice> or Boost::multi_array
26 Next lecture
There is no next lecture!
We are at the end
Scientific Programming in C X. More features & Fortran interface Susi Lehtola 20 November 2012 typedef typedefs are a way to make shorthand for data types, and possibly also make the code more general
More informationCOMP 3430 Robert Guderian
Operating Systems COMP 3430 Robert Guderian file:///users/robg/dropbox/teaching/3430-2018/slides/04_threads/index.html?print-pdf#/ 1/58 1 Threads Last week: Processes This week: Lesser processes! file:///users/robg/dropbox/teaching/3430-2018/slides/04_threads/index.html?print-pdf#/
More informationInteger Multiplication and Division
Integer Multiplication and Division for ENCM 369: Computer Organization Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary Winter Term, 208 Integer
More informationWho am I? I m a python developer who has been working on OpenStack since I currently work for Aptira, who do OpenStack, SDN, and orchestration
Who am I? I m a python developer who has been working on OpenStack since 2011. I currently work for Aptira, who do OpenStack, SDN, and orchestration consulting. I m here today to help you learn from my
More informationParallel Programming (1)
Parallel Programming (1) p. 1/?? Parallel Programming (1) Introduction and Scripting Nick Maclaren nmm1@cam.ac.uk February 2014 Parallel Programming (1) p. 2/?? Introduction (1) This is a single three--session
More informationString Allocation in Icon
String Allocation in Icon Ralph E. Griswold Department of Computer Science The University of Arizona Tucson, Arizona IPD277 May 12, 1996 http://www.cs.arizona.edu/icon/docs/ipd275.html Note: This report
More informationCOSC 2P95. Introduction. Week 1. Brock University. Brock University (Week 1) Introduction 1 / 18
COSC 2P95 Introduction Week 1 Brock University Brock University (Week 1) Introduction 1 / 18 Lectures and Labs Lectures are Thursdays, from 3pm 5pm (AS/STH 217) There are two lab sections Lab 1 is Mondays,
More informationProgramming Languages and Compilers. Jeff Nucciarone AERSP 597B Sept. 20, 2004
Programming Languages and Compilers Jeff Nucciarone Sept. 20, 2004 Programming Languages Fortran C C++ Java many others Why use Standard Programming Languages? Programming tedious requiring detailed knowledge
More informationLecture 3: Intro to parallel machines and models
Lecture 3: Intro to parallel machines and models David Bindel 1 Sep 2011 Logistics Remember: http://www.cs.cornell.edu/~bindel/class/cs5220-f11/ http://www.piazza.com/cornell/cs5220 Note: the entire class
More informationATLAS (Automatically Tuned Linear Algebra Software),
LAPACK library I Scientists have developed a large library of numerical routines for linear algebra. These routines comprise the LAPACK package that can be obtained from http://www.netlib.org/lapack/.
More informationP1_L3 Operating Systems Security Page 1
P1_L3 Operating Systems Security Page 1 that is done by the operating system. systems. The operating system plays a really critical role in protecting resources in a computer system. Resources such as
More informationChapter 1 Introduction
Chapter 1 Introduction Why I Am Writing This: Why I am I writing a set of tutorials on compilers and how to build them? Well, the idea goes back several years ago when Rapid-Q, one of the best free BASIC
More informationComputer Science 322 Operating Systems Mount Holyoke College Spring Topic Notes: C and Unix Overview
Computer Science 322 Operating Systems Mount Holyoke College Spring 2010 Topic Notes: C and Unix Overview This course is about operating systems, but since most of our upcoming programming is in C on a
More information6.001 Notes: Section 15.1
6.001 Notes: Section 15.1 Slide 15.1.1 Our goal over the next few lectures is to build an interpreter, which in a very basic sense is the ultimate in programming, since doing so will allow us to define
More information