Designing for Performance. Martin Thompson

Size: px
Start display at page:

Download "Designing for Performance. Martin Thompson"

Transcription

1 Designing for Performance Martin Thompson

2 Is it difficult writing software that has good performance?

3

4 RDD (Resume Driven Development)

5

6 How do we Design for Performance?

7 1. What is Performance? 2. What is Clean & Representative? 3. Implementing efficient Models 4. Why Performance Test?

8 Performance

9 Throughput (aka Bandwidth)

10 Response Time (aka Latency)

11 Scalability

12

13 Response Time Queuing Theory Utilisation

14 Pro Tip: Ensure you have sufficient capacity

15 Can we go parallel to speedup?

16 Amdahl s Law time Sequential Process A B

17 Amdahl s Law time Sequential Process A B Parallel Process A A A A A B

18 Amdahl s Law time Sequential Process A B Parallel Process A A A A A B Parallel Process B A B B B B

19 Amdahl s Law

20 Universal Scalability Law C(N) = N / (1 + α(n 1) + ((β* N) * (N 1))) C = capacity or throughput N = number of processors α = contention penalty β = coherence penalty

21 Speedup Universal Scalability Law Processors Amdahl USL

22 Time (nanoseconds) 160, , , ,000 80,000 60,000 40,000 20,

23 Time (nanoseconds) Mean Logging Duration 160, , , ,000 80,000 60,000 40,000 20,

24 Clean & Representative

25 - Clean Morally uncontaminated; pure; innocent - Oxford English Dictionary

26 - Representative Serving as a portrayal or symbol of something - Oxford English Dictionary

27 - Representative Code is the best place to capture our current understanding of a model

28 Abstractions

29 Rules of Abstraction 1. Don t use abstraction

30 Rules of Abstraction 1. Don t use abstraction 2. Don t use abstraction

31 Rules of Abstraction 1. Don t use abstraction 2. Don t use abstraction 3. Only consider abstracting when you see at least 3 things that ARE the same

32 Rules of Abstraction 1. Don t use abstraction 2. Don t use abstraction 3. Only consider abstracting when you see at least 3 things that ARE the same 4. Abstractions must pay for themselves

33 Rules of Abstraction 1. Don t use abstraction 2. Don t use abstraction 3. Only consider abstracting when you see at least 3 things that ARE the same 4. Abstractions must pay for themselves 5. Beware DRY, the evil siren that tricks you into abstraction -> Coupling

34 Abstraction Megamorphism => Branch Hell

35 Abstraction Not Representative => Big Smell

36 Abstraction Say no to big frameworks!

37

38 Pro Tip: Abstract when you are sure of the benefits

39 Law of Leaky Abstractions All non-trivial abstractions, to some extent, are leaky. - Joel Spolsky

40 Law of Leaky Abstractions The detail of underlying complexity cannot be ignored.

41 the purpose of abstracting is not to be vague, but to create a new semantic level in which one can be absolutely precise - Dijkstra

42 Can we abstract memory systems?

43 - It s about 3 bets!

44 - It s about 3 bets! 1. The Temporal Bet

45 - It s about 3 bets! 1. The Temporal Bet 2. The Spatial Bet

46 - It s about 3 bets! 1. The Temporal Bet 2. The Spatial Bet 3. The Pattern Bet

47 Model Implementation

48 Coupling & Cohesion

49 Coupling & Cohesion Properties Bag

50 Pro Tip: Respect Locality of Reference

51 Can you implement a B+ Tree efficiently in a language?

52 Relationships

53 Relationships Order Book Order

54 Relationships Order Book 1 Bids * 1 Offers * Order

55 Relationships Order Book Price Price 1 Bids * 1 Offers * Order

56 Relationships Order Book Price Price 1 Bids * {ordered, FIFO} 1 Offers * {ordered, FIFO} Order

57 Pro Tip: Make friends with your Data Structures

58 Batching

59 Amortise the expensive costs

60 Natural Batching Producers

61 Natural Batching << Amortise Expensive Costs >> Batcher Producers

62 Response Time Natural Batching Typical Possible Load

63 Pro Tip: Batch processing is not just for offline

64 Branches, branches, branches...

65 Branches public void dostuff(list<string> things) { if (null == things things.isempty()) { return; } } for (String thing : things) { // Do useful work }

66 Branches public void dostuff(list<string> things) { if (null == things things.isempty()) { return; } } for (String thing : things) { // Do useful work }

67 Branches public void dostuff(list<string> things) { for (String thing : things) { // Do useful work } }

68 Pro Tip: Respect the Principle of least surprise

69 Loops

70 Loops If I had more time, I would have written a shorter letter. - Blaise Pascal

71 Loops L µops

72 Loops L µops IDQ 28/56 µops

73 Pro Tip: Craft major loops like good prose

74 Composition

75 Composition Size matters

76 Composition Inlining is THE optimisation. - Cliff Click

77 Composition Single Responsibility

78 Pro Tip: Small atoms can compose to build anything

79 APIs

80 selector.selectnow(); Set<SelectionKey> selectedkeys = selector.selectedkeys(); Iterator<SelectionKey> iter = selectedkeys.iterator(); while (iter.hasnext()) { SelectionKey key = iter.next(); if (key.isreadable()) { process(key.attachment()); // do work } } iter.remove();

81 selector.selectnow(); Set<SelectionKey> selectedkeys = selector.selectedkeys(); Iterator<SelectionKey> iter = selectedkeys.iterator(); while (iter.hasnext()) { SelectionKey key = iter.next(); if (key.isreadable()) { process(key.attachment()); // do work } } iter.remove();

82 // Keep and reuse List<SelectionKey> keys = new ArrayList<>(); selector.selectnow(keys, READABLE); keys.foreach(keyhandler);

83 // Keep and reuse List<SelectionKey> keys = new ArrayList<>(); selector.selectnow(keys, READABLE); keys.foreach(keyhandler);

84 Data

85 Data

86 Data

87 Data

88 Data

89 Data

90 Pro Tip: Embrace Set Theory and FP techniques

91 Performance Testing

92 Define Performance Goals

93 Measuring response time?

94 Response Time Histograms

95 Response Time Histograms Mode

96 Response Time Histograms Mode Median

97 Response Time Histograms Mode Median Mean

98 Response Time Histograms Mode Median Mean

99 Response Time Histograms Mode Median Mean

100 Coordinated Omission

101 HdrHistogram

102 JMH (Java Microbenchmark Harness)

103 CPU Performance Counters

104 Performance test as part of Continuous Integration

105 Build telemetry into production systems

106 AGAIN!!! Build telemetry into production systems

107

108 Counters of: Queue Lengths Concurrent Users Exceptions Transactions - orders, trades Etc. See: Aeron SystemCounters

109 Histograms of: Response Times Service Times Queue Lengths Concurrent Users Etc.

110 In closing

111 Clean => Uncontaminated

112 Good performance is built on a foundation of good design fundamentals

113

114 Questions? It does not matter how intelligent you are, if you guess and that guess cannot be backed up by experimental evidence then it is still a guess. - Richard Feynman

Designing for Performance. Martin Thompson

Designing for Performance. Martin Thompson Designing for Performance Martin Thompson - @mjpt777 Feynman is becoming a real pain. He has the greatest scientific honesty of anyone I ve ever meet - William P Rogers The impact of QED cannot be overestimated.

More information

Rectangles All The Way Down. Martin Thompson

Rectangles All The Way Down. Martin Thompson Rectangles All The Way Down Martin Thompson - @mjpt777 The most amazing achievement of the computer software industry is its continuing cancellation of the steady and staggering gains made by the computer

More information

A Quest for Predictable Latency Adventures in Java Concurrency. Martin Thompson

A Quest for Predictable Latency Adventures in Java Concurrency. Martin Thompson A Quest for Predictable Latency Adventures in Java Concurrency Martin Thompson - @mjpt777 If a system does not respond in a timely manner then it is effectively unavailable 1. It s all about the Blocking

More information

High Performance Managed Languages. Martin Thompson

High Performance Managed Languages. Martin Thompson High Performance Managed Languages Martin Thompson - @mjpt777 Really, what s your preferred platform for building HFT applications? Why would you build low-latency applications on a GC ed platform? Some

More information

High Performance Managed Languages. Martin Thompson

High Performance Managed Languages. Martin Thompson High Performance Managed Languages Martin Thompson - @mjpt777 Really, what is your preferred platform for building HFT applications? Why do you build low-latency applications on a GC ed platform? Agenda

More information

Practicing at the Cutting Edge Learning and Unlearning about Performance. Martin Thompson

Practicing at the Cutting Edge Learning and Unlearning about Performance. Martin Thompson Practicing at the Cutting Edge Learning and Unlearning about Performance Martin Thompson - @mjpt777 Learning and Unlearning 1. Brief History of Java 2. Evolving Design approach 3. An evolving Hardware

More information

Computer Architecture. R. Poss

Computer Architecture. R. Poss Computer Architecture R. Poss 1 ca01-10 september 2015 Course & organization 2 ca01-10 september 2015 Aims of this course The aims of this course are: to highlight current trends to introduce the notion

More information

CS 62 Practice Final SOLUTIONS

CS 62 Practice Final SOLUTIONS CS 62 Practice Final SOLUTIONS 2017-5-2 Please put your name on the back of the last page of the test. Note: This practice test may be a bit shorter than the actual exam. Part 1: Short Answer [32 points]

More information

NAMD Serial and Parallel Performance

NAMD Serial and Parallel Performance NAMD Serial and Parallel Performance Jim Phillips Theoretical Biophysics Group Serial performance basics Main factors affecting serial performance: Molecular system size and composition. Cutoff distance

More information

Event Loop. + Vert.x

Event Loop. + Vert.x Event Loop + Vert.x Jochen Mader Chief Developer @ Senacor Technologies AG! http://www.senacor.com! jochen.mader@senacor.com! Twitter: @codepitbull 42 1 2 3 4 5 6 7 8 9 0 ABC volatile Thread ReentrantLock

More information

Cluster Consensus When Aeron Met Raft. Martin Thompson

Cluster Consensus When Aeron Met Raft. Martin Thompson Cluster When Aeron Met Raft Martin Thompson - @mjpt777 What does mean? con sen sus noun \ kən-ˈsen(t)-səs \ : general agreement : unanimity Source: http://www.merriam-webster.com/ con sen sus noun \ kən-ˈsen(t)-səs

More information

Continuous Performance Testing. Mark Price Performance Engineer Improbable.io

Continuous Performance Testing. Mark Price Performance Engineer Improbable.io Continuous Performance Testing Mark Price / @epickrram Performance Engineer Improbable.io The ideal System performance testing as a first-class citizen of the continuous delivery pipeline Process Process

More information

High Performance Computing

High Performance Computing The Need for Parallelism High Performance Computing David McCaughan, HPC Analyst SHARCNET, University of Guelph dbm@sharcnet.ca Scientific investigation traditionally takes two forms theoretical empirical

More information

Lecture 12. Lecture 12: Access Methods

Lecture 12. Lecture 12: Access Methods Lecture 12 Lecture 12: Access Methods Lecture 12 If you don t find it in the index, look very carefully through the entire catalog - Sears, Roebuck and Co., Consumers Guide, 1897 2 Lecture 12 > Section

More information

TDT Coarse-Grained Multithreading. Review on ILP. Multi-threaded execution. Contents. Fine-Grained Multithreading

TDT Coarse-Grained Multithreading. Review on ILP. Multi-threaded execution. Contents. Fine-Grained Multithreading Review on ILP TDT 4260 Chap 5 TLP & Hierarchy What is ILP? Let the compiler find the ILP Advantages? Disadvantages? Let the HW find the ILP Advantages? Disadvantages? Contents Multi-threading Chap 3.5

More information

Algorithm Performance Factors. Memory Performance of Algorithms. Processor-Memory Performance Gap. Moore s Law. Program Model of Memory I

Algorithm Performance Factors. Memory Performance of Algorithms. Processor-Memory Performance Gap. Moore s Law. Program Model of Memory I Memory Performance of Algorithms CSE 32 Data Structures Lecture Algorithm Performance Factors Algorithm choices (asymptotic running time) O(n 2 ) or O(n log n) Data structure choices List or Arrays Language

More information

Software Analysis. Asymptotic Performance Analysis

Software Analysis. Asymptotic Performance Analysis Software Analysis Performance Analysis Presenter: Jonathan Aldrich Performance Analysis Software Analysis 1 Asymptotic Performance Analysis How do we compare algorithm performance? Abstract away low-level

More information

PROCESSES AND THREADS

PROCESSES AND THREADS PROCESSES AND THREADS A process is a heavyweight flow that can execute concurrently with other processes. A thread is a lightweight flow that can execute concurrently with other threads within the same

More information

Design of Concurrent and Distributed Data Structures

Design of Concurrent and Distributed Data Structures METIS Spring School, Agadir, Morocco, May 2015 Design of Concurrent and Distributed Data Structures Christoph Kirsch University of Salzburg Joint work with M. Dodds, A. Haas, T.A. Henzinger, A. Holzer,

More information

ECE 669 Parallel Computer Architecture

ECE 669 Parallel Computer Architecture ECE 669 Parallel Computer Architecture Lecture 9 Workload Evaluation Outline Evaluation of applications is important Simulation of sample data sets provides important information Working sets indicate

More information

Modern Processor Architectures. L25: Modern Compiler Design

Modern Processor Architectures. L25: Modern Compiler Design Modern Processor Architectures L25: Modern Compiler Design The 1960s - 1970s Instructions took multiple cycles Only one instruction in flight at once Optimisation meant minimising the number of instructions

More information

be used for more than one use case (for instance, for use cases Create User and Delete User, one can have one UserController, instead of two separate

be used for more than one use case (for instance, for use cases Create User and Delete User, one can have one UserController, instead of two separate UNIT 4 GRASP GRASP: Designing objects with responsibilities Creator Information expert Low Coupling Controller High Cohesion Designing for visibility - Applying GoF design patterns adapter, singleton,

More information

Martin Kruliš, v

Martin Kruliš, v Martin Kruliš 1 Optimizations in General Code And Compilation Memory Considerations Parallelism Profiling And Optimization Examples 2 Premature optimization is the root of all evil. -- D. Knuth Our goal

More information

CSCI-GA Multicore Processors: Architecture & Programming Lecture 3: The Memory System You Can t Ignore it!

CSCI-GA Multicore Processors: Architecture & Programming Lecture 3: The Memory System You Can t Ignore it! CSCI-GA.3033-012 Multicore Processors: Architecture & Programming Lecture 3: The Memory System You Can t Ignore it! Mohamed Zahran (aka Z) mzahran@cs.nyu.edu http://www.mzahran.com Memory Computer Technology

More information

Leslie Lamport: The Specification Language TLA +

Leslie Lamport: The Specification Language TLA + Leslie Lamport: The Specification Language TLA + This is an addendum to a chapter by Stephan Merz in the book Logics of Specification Languages by Dines Bjørner and Martin C. Henson (Springer, 2008). It

More information

CS 136: Advanced Architecture. Review of Caches

CS 136: Advanced Architecture. Review of Caches 1 / 30 CS 136: Advanced Architecture Review of Caches 2 / 30 Why Caches? Introduction Basic goal: Size of cheapest memory... At speed of most expensive Locality makes it work Temporal locality: If you

More information

Reactive design patterns for microservices on multicore

Reactive design patterns for microservices on multicore Reactive Software with elegance Reactive design patterns for microservices on multicore Reactive summit - 22/10/18 charly.bechara@tredzone.com Outline Microservices on multicore Reactive Multicore Patterns

More information

Course web site: teaching/courses/car. Piazza discussion forum:

Course web site:   teaching/courses/car. Piazza discussion forum: Announcements Course web site: http://www.inf.ed.ac.uk/ teaching/courses/car Lecture slides Tutorial problems Courseworks Piazza discussion forum: http://piazza.com/ed.ac.uk/spring2018/car Tutorials start

More information

A common scenario... Most of us have probably been here. Where did my performance go? It disappeared into overheads...

A common scenario... Most of us have probably been here. Where did my performance go? It disappeared into overheads... OPENMP PERFORMANCE 2 A common scenario... So I wrote my OpenMP program, and I checked it gave the right answers, so I ran some timing tests, and the speedup was, well, a bit disappointing really. Now what?.

More information

Performance analysis basics

Performance analysis basics Performance analysis basics Christian Iwainsky Iwainsky@rz.rwth-aachen.de 25.3.2010 1 Overview 1. Motivation 2. Performance analysis basics 3. Measurement Techniques 2 Why bother with performance analysis

More information

Principles of Software Construction: Objects, Design, and Concurrency. The Perils of Concurrency Can't live with it. Cant live without it.

Principles of Software Construction: Objects, Design, and Concurrency. The Perils of Concurrency Can't live with it. Cant live without it. Principles of Software Construction: Objects, Design, and Concurrency The Perils of Concurrency Can't live with it. Cant live without it. Spring 2014 Charlie Garrod Christian Kästner School of Computer

More information

A common scenario... Most of us have probably been here. Where did my performance go? It disappeared into overheads...

A common scenario... Most of us have probably been here. Where did my performance go? It disappeared into overheads... OPENMP PERFORMANCE 2 A common scenario... So I wrote my OpenMP program, and I checked it gave the right answers, so I ran some timing tests, and the speedup was, well, a bit disappointing really. Now what?.

More information

Caches Concepts Review

Caches Concepts Review Caches Concepts Review What is a block address? Why not bring just what is needed by the processor? What is a set associative cache? Write-through? Write-back? Then we ll see: Block allocation policy on

More information

Real Time: Understanding the Trade-offs Between Determinism and Throughput

Real Time: Understanding the Trade-offs Between Determinism and Throughput Real Time: Understanding the Trade-offs Between Determinism and Throughput Roland Westrelin, Java Real-Time Engineering, Brian Doherty, Java Performance Engineering, Sun Microsystems, Inc TS-5609 Learn

More information

Advanced d Instruction Level Parallelism. Computer Systems Laboratory Sungkyunkwan University

Advanced d Instruction Level Parallelism. Computer Systems Laboratory Sungkyunkwan University Advanced d Instruction ti Level Parallelism Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu ILP Instruction-Level Parallelism (ILP) Pipelining:

More information

Modern Processor Architectures (A compiler writer s perspective) L25: Modern Compiler Design

Modern Processor Architectures (A compiler writer s perspective) L25: Modern Compiler Design Modern Processor Architectures (A compiler writer s perspective) L25: Modern Compiler Design The 1960s - 1970s Instructions took multiple cycles Only one instruction in flight at once Optimisation meant

More information

CEC 450 Real-Time Systems

CEC 450 Real-Time Systems CEC 450 Real-Time Systems Lecture 6 Accounting for I/O Latency September 28, 2015 Sam Siewert A Service Release and Response C i WCET Input/Output Latency Interference Time Response Time = Time Actuation

More information

ni.com Decisions Behind the Design: LabVIEW for CompactRIO Sample Projects

ni.com Decisions Behind the Design: LabVIEW for CompactRIO Sample Projects Decisions Behind the Design: LabVIEW for CompactRIO Sample Projects Agenda Keys to quality in a software architecture Software architecture overview I/O safe states Watchdog timers Message communication

More information

ECE C61 Computer Architecture Lecture 2 performance. Prof. Alok N. Choudhary.

ECE C61 Computer Architecture Lecture 2 performance. Prof. Alok N. Choudhary. ECE C61 Computer Architecture Lecture 2 performance Prof Alok N Choudhary choudhar@ecenorthwesternedu 2-1 Today s s Lecture Performance Concepts Response Time Throughput Performance Evaluation Benchmarks

More information

Class 26: Linked Lists

Class 26: Linked Lists Introduction to Computation and Problem Solving Class 26: Linked Lists Prof. Steven R. Lerman and Dr. V. Judson Harward 2 The Java Collection Classes The java.util package contains implementations of many

More information

Lists and Iterators. CSSE 221 Fundamentals of Software Development Honors Rose-Hulman Institute of Technology

Lists and Iterators. CSSE 221 Fundamentals of Software Development Honors Rose-Hulman Institute of Technology Lists and Iterators CSSE 221 Fundamentals of Software Development Honors Rose-Hulman Institute of Technology Announcements Should be making progress on VectorGraphics. Should have turned in CRC cards,

More information

Modern C++ Old Dog, New Tricks Todd L.

Modern C++ Old Dog, New Tricks Todd L. StoneTor Modern C++ Old Dog, New Tricks Todd L. Montgomery @toddlmontgomery C++ is so old Languages are Tools Learning Tools is Good There are only two kinds of languages: the ones people complain about

More information

Operating System. Chapter 4. Threads. Lynn Choi School of Electrical Engineering

Operating System. Chapter 4. Threads. Lynn Choi School of Electrical Engineering Operating System Chapter 4. Threads Lynn Choi School of Electrical Engineering Process Characteristics Resource ownership Includes a virtual address space (process image) Ownership of resources including

More information

CS2900 Introductory Programming with Python and C++ Kevin Squire LtCol Joel Young Fall 2007

CS2900 Introductory Programming with Python and C++ Kevin Squire LtCol Joel Young Fall 2007 CS2900 Introductory Programming with Python and C++ Kevin Squire LtCol Joel Young Fall 2007 Course Web Site http://www.nps.navy.mil/cs/facultypages/squire/cs2900 All course related materials will be posted

More information

PARALLEL AND DISTRIBUTED COMPUTING

PARALLEL AND DISTRIBUTED COMPUTING PARALLEL AND DISTRIBUTED COMPUTING 2013/2014 1 st Semester 2 nd Exam January 29, 2014 Duration: 2h00 - No extra material allowed. This includes notes, scratch paper, calculator, etc. - Give your answers

More information

All Paging Schemes Depend on Locality. VM Page Replacement. Paging. Demand Paging

All Paging Schemes Depend on Locality. VM Page Replacement. Paging. Demand Paging 3/14/2001 1 All Paging Schemes Depend on Locality VM Page Replacement Emin Gun Sirer Processes tend to reference pages in localized patterns Temporal locality» locations referenced recently likely to be

More information

Cycle Time for Non-pipelined & Pipelined processors

Cycle Time for Non-pipelined & Pipelined processors Cycle Time for Non-pipelined & Pipelined processors Fetch Decode Execute Memory Writeback 250ps 350ps 150ps 300ps 200ps For a non-pipelined processor, the clock cycle is the sum of the latencies of all

More information

Using Scala for building DSL s

Using Scala for building DSL s Using Scala for building DSL s Abhijit Sharma Innovation Lab, BMC Software 1 What is a DSL? Domain Specific Language Appropriate abstraction level for domain - uses precise concepts and semantics of domain

More information

Dynamic Control Hazard Avoidance

Dynamic Control Hazard Avoidance Dynamic Control Hazard Avoidance Consider Effects of Increasing the ILP Control dependencies rapidly become the limiting factor they tend to not get optimized by the compiler more instructions/sec ==>

More information

Best Practices for Architecting Embedded Applications in LabVIEW Jacques Cilliers Applications Engineering

Best Practices for Architecting Embedded Applications in LabVIEW Jacques Cilliers Applications Engineering Best Practices for Architecting Embedded Applications in LabVIEW Jacques Cilliers Applications Engineering Overview of NI RIO Architecture PC Real Time Controller FPGA 4 Where to Start? 5 Requirements

More information

All you need is fun. Cons T Åhs Keeper of The Code

All you need is fun. Cons T Åhs Keeper of The Code All you need is fun Cons T Åhs Keeper of The Code cons@klarna.com Cons T Åhs Keeper of The Code at klarna Architecture - The Big Picture Development - getting ideas to work Code Quality - care about the

More information

Optimisation and Operations Research

Optimisation and Operations Research Optimisation and Operations Research Lecture 4: Algorithm Design in Matlab Matthew Roughan http://www.maths.adelaide.edu.au/matthew.roughan/ Lecture_notes/OORII/ School

More information

Ext4 Filesystem Scaling

Ext4 Filesystem Scaling Ext4 Filesystem Scaling Jan Kára SUSE Labs Overview Handling of orphan inodes in ext4 Shrinking cache of logical to physical block mappings Cleanup of transaction checkpoint lists 2 Orphan

More information

Static Analysis of Embedded C

Static Analysis of Embedded C Static Analysis of Embedded C John Regehr University of Utah Joint work with Nathan Cooprider Motivating Platform: TinyOS Embedded software for wireless sensor network nodes Has lots of SW components for

More information

Parallel and Distributed Systems. Programming Models. Why Parallel or Distributed Computing? What is a parallel computer?

Parallel and Distributed Systems. Programming Models. Why Parallel or Distributed Computing? What is a parallel computer? Parallel and Distributed Systems Instructor: Sandhya Dwarkadas Department of Computer Science University of Rochester What is a parallel computer? A collection of processing elements that communicate and

More information

Interactive Responsiveness and Concurrent Workflow

Interactive Responsiveness and Concurrent Workflow Middleware-Enhanced Concurrency of Transactions Interactive Responsiveness and Concurrent Workflow Transactional Cascade Technology Paper Ivan Klianev, Managing Director & CTO Published in November 2005

More information

Page 1. Program Performance Metrics. Program Performance Metrics. Amdahl s Law. 1 seq seq 1

Page 1. Program Performance Metrics. Program Performance Metrics. Amdahl s Law. 1 seq seq 1 Program Performance Metrics The parallel run time (Tpar) is the time from the moment when computation starts to the moment when the last processor finished his execution The speedup (S) is defined as the

More information

COURSE 11 DESIGN PATTERNS

COURSE 11 DESIGN PATTERNS COURSE 11 DESIGN PATTERNS PREVIOUS COURSE J2EE Design Patterns CURRENT COURSE Refactoring Way refactoring Some refactoring examples SOFTWARE EVOLUTION Problem: You need to modify existing code extend/adapt/correct/

More information

Curriculum 2013 Knowledge Units Pertaining to PDC

Curriculum 2013 Knowledge Units Pertaining to PDC Curriculum 2013 Knowledge Units Pertaining to C KA KU Tier Level NumC Learning Outcome Assembly level machine Describe how an instruction is executed in a classical von Neumann machine, with organization

More information

Concurrency and High Performance Reloaded

Concurrency and High Performance Reloaded Concurrency and High Performance Reloaded Disclaimer Any performance tuning advice provided in this presentation... will be wrong! 2 www.kodewerk.com Me Work as independent (a.k.a. freelancer) performance

More information

Hi. My name is Jasper. Together with Richard we thought of some ways that could make a parallel approach to sequential flowsheeting attractive.

Hi. My name is Jasper. Together with Richard we thought of some ways that could make a parallel approach to sequential flowsheeting attractive. Hi. My name is Jasper. Together with Richard we thought of some ways that could make a parallel approach to sequential flowsheeting attractive. Although it only partially related to CAPE-OPEN, this is

More information

Setting the stage... Key Design Issues. Main purpose - Manage software system complexity by improving software quality factors

Setting the stage... Key Design Issues. Main purpose - Manage software system complexity by improving software quality factors Setting the stage... Dr. Radu Marinescu 1 1946 Key Design Issues Main purpose - Manage software system complexity by...... improving software quality factors... facilitating systematic reuse Dr. Radu Marinescu

More information

EE282 Computer Architecture. Lecture 1: What is Computer Architecture?

EE282 Computer Architecture. Lecture 1: What is Computer Architecture? EE282 Computer Architecture Lecture : What is Computer Architecture? September 27, 200 Marc Tremblay Computer Systems Laboratory Stanford University marctrem@csl.stanford.edu Goals Understand how computer

More information

Software Architecture and Design I

Software Architecture and Design I Software Architecture and Design I Instructor: Yongjie Zheng February 23, 2017 CS 490MT/5555 Software Methods and Tools Outline What is software architecture? Why do we need software architecture? How

More information

Job Scheduling. CS170 Fall 2018

Job Scheduling. CS170 Fall 2018 Job Scheduling CS170 Fall 2018 What to Learn? Algorithms of job scheduling, which maximizes CPU utilization obtained with multiprogramming Select from ready processes and allocates the CPU to one of them

More information

CSC258: Computer Organization. Microarchitecture

CSC258: Computer Organization. Microarchitecture CSC258: Computer Organization Microarchitecture 1 Wrap-up: Function Conventions 2 Key Elements: Caller Ensure that critical registers like $ra have been saved. Save caller-save registers. Place arguments

More information

Parallel Algorithm Engineering

Parallel Algorithm Engineering Parallel Algorithm Engineering Kenneth S. Bøgh PhD Fellow Based on slides by Darius Sidlauskas Outline Background Current multicore architectures UMA vs NUMA The openmp framework and numa control Examples

More information

Abstraction. Abstraction

Abstraction. Abstraction 9/11/2003 1 Software today is more complex than it has ever been. Software controls Space shuttle systems during launch Elevators in 100-story buildings Cross-Atlantic transportation routes of freighters

More information

CITS1001 week 4 Grouping objects lecture 2

CITS1001 week 4 Grouping objects lecture 2 CITS1001 week 4 Grouping objects lecture 2 Arran Stewart March 29, 2018 1 / 37 Overview Last lecture, we looked at how we can group objects together into collections We looked at the ArrayList class. This

More information

Perchance to Stream with Java 8

Perchance to Stream with Java 8 Perchance to Stream with Java 8 Paul Sandoz Oracle The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated $$ into

More information

Parallel Processing with the SMP Framework in VTK. Berk Geveci Kitware, Inc.

Parallel Processing with the SMP Framework in VTK. Berk Geveci Kitware, Inc. Parallel Processing with the SMP Framework in VTK Berk Geveci Kitware, Inc. November 3, 2013 Introduction The main objective of the SMP (symmetric multiprocessing) framework is to provide an infrastructure

More information

Performance, Power, Die Yield. CS301 Prof Szajda

Performance, Power, Die Yield. CS301 Prof Szajda Performance, Power, Die Yield CS301 Prof Szajda Administrative HW #1 assigned w Due Wednesday, 9/3 at 5:00 pm Performance Metrics (How do we compare two machines?) What to Measure? Which airplane has the

More information

Case Study Don t Guess Where Your Performance Problem Is

Case Study Don t Guess Where Your Performance Problem Is Case Study Don t Guess Where Your Performance Problem Is Xense Limited The Problem This case study examines a performance problem that a client had with a custom Documentum job. The particular process

More information

TDT 4260 lecture 3 spring semester 2015

TDT 4260 lecture 3 spring semester 2015 1 TDT 4260 lecture 3 spring semester 2015 Lasse Natvig, The CARD group Dept. of computer & information science NTNU http://research.idi.ntnu.no/multicore 2 Lecture overview Repetition Chap.1: Performance,

More information

Principles of Software Construction: Objects, Design, and Concurrency. The Perils of Concurrency Can't live with it. Can't live without it.

Principles of Software Construction: Objects, Design, and Concurrency. The Perils of Concurrency Can't live with it. Can't live without it. Principles of Software Construction: Objects, Design, and Concurrency The Perils of Concurrency Can't live with it. Can't live without it. Fall 2014 Charlie Garrod Jonathan Aldrich School of Computer Science

More information

CS231 - Spring 2017 Linked Lists. ArrayList is an implementation of List based on arrays. LinkedList is an implementation of List based on nodes.

CS231 - Spring 2017 Linked Lists. ArrayList is an implementation of List based on arrays. LinkedList is an implementation of List based on nodes. CS231 - Spring 2017 Linked Lists List o Data structure which stores a fixed-size sequential collection of elements of the same type. o We've already seen two ways that you can store data in lists in Java.

More information

740: Computer Architecture Memory Consistency. Prof. Onur Mutlu Carnegie Mellon University

740: Computer Architecture Memory Consistency. Prof. Onur Mutlu Carnegie Mellon University 740: Computer Architecture Memory Consistency Prof. Onur Mutlu Carnegie Mellon University Readings: Memory Consistency Required Lamport, How to Make a Multiprocessor Computer That Correctly Executes Multiprocess

More information

Parallelism. CS6787 Lecture 8 Fall 2017

Parallelism. CS6787 Lecture 8 Fall 2017 Parallelism CS6787 Lecture 8 Fall 2017 So far We ve been talking about algorithms We ve been talking about ways to optimize their parameters But we haven t talked about the underlying hardware How does

More information

The Optimal CPU and Interconnect for an HPC Cluster

The Optimal CPU and Interconnect for an HPC Cluster 5. LS-DYNA Anwenderforum, Ulm 2006 Cluster / High Performance Computing I The Optimal CPU and Interconnect for an HPC Cluster Andreas Koch Transtec AG, Tübingen, Deutschland F - I - 15 Cluster / High Performance

More information

ni.com Best Practices for Architecting Embedded Applications in LabVIEW

ni.com Best Practices for Architecting Embedded Applications in LabVIEW Best Practices for Architecting Embedded Applications in LabVIEW Overview of NI RIO Architecture PC Real Time Controller FPGA 2 Where to Start? 3 Requirements Before you start to design your system, you

More information

Data Structures - CSCI 102. CS102 Hash Tables. Prof. Tejada. Copyright Sheila Tejada

Data Structures - CSCI 102. CS102 Hash Tables. Prof. Tejada. Copyright Sheila Tejada CS102 Hash Tables Prof. Tejada 1 Vectors, Linked Lists, Stack, Queues, Deques Can t provide fast insertion/removal and fast lookup at the same time The Limitations of Data Structure Binary Search Trees,

More information

Replication. Consistency models. Replica placement Distribution protocols

Replication. Consistency models. Replica placement Distribution protocols Replication Motivation Consistency models Data/Client-centric consistency models Replica placement Distribution protocols Invalidate versus updates Push versus Pull Cooperation between replicas Client-centric

More information

COMP3151/9151 Foundations of Concurrency Lecture 8

COMP3151/9151 Foundations of Concurrency Lecture 8 1 COMP3151/9151 Foundations of Concurrency Lecture 8 Liam O Connor CSE, UNSW (and data61) 8 Sept 2017 2 Shared Data Consider the Readers and Writers problem from Lecture 6: Problem We have a large data

More information

Algorithm Performance Factors. Memory Performance of Algorithms. Processor-Memory Performance Gap. Moore s Law. Program Model of Memory II

Algorithm Performance Factors. Memory Performance of Algorithms. Processor-Memory Performance Gap. Moore s Law. Program Model of Memory II Memory Performance of Algorithms CSE 32 Data Structures Lecture Algorithm Performance Factors Algorithm choices (asymptotic running time) O(n 2 ) or O(n log n) Data structure choices List or Arrays Language

More information

COMP6237 Data Mining Data Mining & Machine Learning with Big Data. Jonathon Hare

COMP6237 Data Mining Data Mining & Machine Learning with Big Data. Jonathon Hare COMP6237 Data Mining Data Mining & Machine Learning with Big Data Jonathon Hare jsh2@ecs.soton.ac.uk Contents Going to look at two case-studies looking at how we can make machine-learning algorithms work

More information

EECS 144/244: Fundamental Algorithms for System Modeling, Analysis, and Optimization

EECS 144/244: Fundamental Algorithms for System Modeling, Analysis, and Optimization EECS 144/244: Fundamental Algorithms for System Modeling, Analysis, and Optimization Dataflow Lecture: SDF, Kahn Process Networks Stavros Tripakis University of California, Berkeley Stavros Tripakis: EECS

More information

CS200 Spring 2004 Midterm Exam II April 14, Value Points a. 5 3b. 16 3c. 8 3d Total 100

CS200 Spring 2004 Midterm Exam II April 14, Value Points a. 5 3b. 16 3c. 8 3d Total 100 CS200 Spring 2004 Midterm Exam II April 14, 2004 Value Points 1. 12 2. 26 3a. 5 3b. 16 3c. 8 3d. 8 4. 25 Total 100 Name StudentID# 1. [12 pts] Short Answer from recitations: a) [3 pts] What is the first

More information

Memory Hierarchy. Maurizio Palesi. Maurizio Palesi 1

Memory Hierarchy. Maurizio Palesi. Maurizio Palesi 1 Memory Hierarchy Maurizio Palesi Maurizio Palesi 1 References John L. Hennessy and David A. Patterson, Computer Architecture a Quantitative Approach, second edition, Morgan Kaufmann Chapter 5 Maurizio

More information

EITF20: Computer Architecture Part4.1.1: Cache - 2

EITF20: Computer Architecture Part4.1.1: Cache - 2 EITF20: Computer Architecture Part4.1.1: Cache - 2 Liang Liu liang.liu@eit.lth.se 1 Outline Reiteration Cache performance optimization Bandwidth increase Reduce hit time Reduce miss penalty Reduce miss

More information

Beyond Latency and Throughput

Beyond Latency and Throughput Beyond Latency and Throughput Performance for Heterogeneous Multi-Core Architectures JoAnn M. Paul Virginia Tech, ECE National Capital Region Common basis for two themes Flynn s Taxonomy Computers viewed

More information

Coherence and Consistency

Coherence and Consistency Coherence and Consistency 30 The Meaning of Programs An ISA is a programming language To be useful, programs written in it must have meaning or semantics Any sequence of instructions must have a meaning.

More information

Extracting Performance and Scalability Metrics From TCP. Baron Schwartz Postgres Open September 16, 2011

Extracting Performance and Scalability Metrics From TCP. Baron Schwartz Postgres Open September 16, 2011 Extracting Performance and Scalability Metrics From TCP Baron Schwartz Postgres Open September 16, 2011 Consulting Support Training Development For MySQL October 24-25, London /live Agenda Capturing TCP

More information

CPU Architecture Overview. Varun Sampath CIS 565 Spring 2012

CPU Architecture Overview. Varun Sampath CIS 565 Spring 2012 CPU Architecture Overview Varun Sampath CIS 565 Spring 2012 Objectives Performance tricks of a modern CPU Pipelining Branch Prediction Superscalar Out-of-Order (OoO) Execution Memory Hierarchy Vector Operations

More information

Index. ADEPT (tool for modelling proposed systerns),

Index. ADEPT (tool for modelling proposed systerns), Index A, see Arrivals Abstraction in modelling, 20-22, 217 Accumulated time in system ( w), 42 Accuracy of models, 14, 16, see also Separable models, robustness Active customer (memory constrained system),

More information

Software Architecture

Software Architecture Software Systems Architecture, Models, Methodologies & Design - Introduction Based on slides and information from a variety of sources Including Booch Software Architecture High level design of large software

More information

Topic 10: The Java Collections Framework (and Iterators)

Topic 10: The Java Collections Framework (and Iterators) Topic 10: The Java Collections Framework (and Iterators) A set of interfaces and classes to help manage collections of data. Why study the Collections Framework? very useful in many different kinds of

More information

COMP 201: Principles of Programming

COMP 201: Principles of Programming COMP 201: Principles of Programming 1 Learning Outcomes To understand what computing entails and what the different branches of computing are. To understand the basic design of a computer and how it represents

More information

Introduction to Functional Programming in Haskell 1 / 56

Introduction to Functional Programming in Haskell 1 / 56 Introduction to Functional Programming in Haskell 1 / 56 Outline Why learn functional programming? The essence of functional programming What is a function? Equational reasoning First-order vs. higher-order

More information

Agile Architecture. The Why, the What and the How

Agile Architecture. The Why, the What and the How Agile Architecture The Why, the What and the How Copyright Net Objectives, Inc. All Rights Reserved 2 Product Portfolio Management Product Management Lean for Executives SAFe for Executives Scaled Agile

More information

Magento Technical Guidelines

Magento Technical Guidelines Magento Technical Guidelines Eugene Shakhsuvarov, Software Engineer @ Magento 2018 Magento, Inc. Page 1 Magento 2 Technical Guidelines Document which describes the desired technical state of Magento 2

More information