Projections A Performance Tool for Charm++ Applications

Size: px
Start display at page:

Download "Projections A Performance Tool for Charm++ Applications"

Transcription

1 Projections A Performance Tool for Charm++ Applications Chee Wai Lee cheelee@uiuc.edu Parallel Programming Laboratory Dept. of Computer Science University of Illinois at Urbana Champaign

2 Outline General Introduction to Projections Projections Basics Advanced Features Features to aid Effective Analysis Extremely Large Datasets Tips and Notes

3 Projections Projections is a performance tool designed for use with Charm++/AMPI. Trace-based, post-mortem analysis. Supports highly detailed traces, summary formats and a flexible user-level API. Java-based visualization tool for presenting performance information. 10/19/05 Projections Tutorial 3

4 Charm++ Model Object Oriented. Chares (objects) encapsulate data, standard C++ methods and entry methods. Message Driven. Entry methods represent work units activated by an incoming message. Only one entry method may execute at any time in a Chare. A runtime schedules incoming messages. 10/19/05 Projections Tutorial 4

5 What you will need A version of Charm++ built without the CMK_OPTIMIZE flag. User-built, download our release or check out a copy from anonymous CVS. Pre-built, check with your machine sysadmin. Java Runtime or higher. Projections Visualization binary (projections.jar) User-built Charm++, located in charm/tools/projections/bin Pre-built, check with sysadmin or acquire the binary separately.

6 Outline: Basics General Introduction to Projections Projections Basics Instrumentation Trace generation Visualization Advanced Features Features to aid Effective Analysis Extremely Large Datasets

7 Trace Generation: Basics Automatic trace instrumentation - No user codes required by default. Any Charm++ version built without the CMK_OPTIMIZE flag supports tracing. All Charm++ entry methods and messaging events are traced.

8 Trace Generation: Steps Link your application with link-time options -tracemode projections -tracemode summary Run your application normally. At the end of the run, you will see.log,.sum as well as.sts files generated on the same directory as your application binary.

9 Visualization: Basic Steps Run the script found at charm/tools/projections/bin as such: projections [<application>.sts] Or activate the Java binary projections.jar via the command-line passing, in the optional <application>.sts filename as argument.

10 Visualization: Main Window 10/19/05 Projections Tutorial 10

11 Visualization: Overview

12 Visualization: Usage Profile

13 Visualization: Time Profile

14 Visualization: Timeline

15 Visualization: Task Histogram

16 Visualization: Communication

17 Visualization: Tabulated Call Info

18 Projections Basic demo. 10/19/05 Projections Tutorial 18

19 Outline: Advanced Features General Introduction to Projections Projections Basics Advanced Features Partial Tracing Tracing User Events Tracing AMPI Functions Features to aid Effective Analysis Extremely Large Datasets Tips and Notes

20 Partial Trace Generation The following API calls are provided by the tracing framework: void tracebegin() void traceend() The above calls turns tracing on/off for the processor on which the call was made. int traceison() queries the tracing framework status. +traceoff runtime option Causes tracing (over all processors) to be turned off when the application is started. 10/19/05 Projections Tutorial 20

21 Partial Tracing watch-it s tracebegin() and traceend() calls apply only on the processor on which it invoked. Offers flexibility but vulnerable to programmer error. Typically used in a collective manner. (eg. NAMD does this at specific load balancing operations) Partial trace calls are invoked in the context of an entry method. One should be prepared to drop initial performance data just after turning tracing on and just before turning tracing off. Appropriate use of the +traceoff runtime option is essential.

22 Partial Tracing Example // in the case when trace is off at the beginning, // only turn trace of from after the first LB to the firstldbstep after // the second LB. // // off on Alg7 refine refine... on #if CHARM_VERSION >= if (traceavailable()) { static int specialtracing = 0; if (ldbcyclenum == 1 && traceison() == 0) specialtracing = 1; if (specialtracing) { if (ldbcyclenum == 4) tracebegin(); if (ldbcyclenum == 6) traceend(); } } #endif

23 Tracing User Events The following APIs are provided for user event registration and tracing: int traceregisteruserevent(char *eventdesc, int EventNum=-1) Acquire or specify an event ID to be associated with event name. void traceuserevent(int eventnum) Use a valid event ID to record an event. void traceuserbracketevent(int eventnum, double starttime, double endtime) Use a valid event ID to record an event interval. 10/19/05 Projections Tutorial 23

24 AMPI Function Tracing API Works like User Events in Charm++ Applications. Why? MPI Function abstraction is invisible to the Charm++ runtime. REGISTER_FUNCTION(<namestring>) to register <namestring> as a string-id to be traced. TRACEFUNC(<funcall>,<namestring>) to record a call to function <funcall> to be associated with the event registered as <namestring>.

25 Advanced Features Demo

26 Outline: Effective Analysis General Introduction to Projections Projections Basics Advanced Features Features to aid Effective Analysis How Tracing Works Memory Footprint Control Data Volume Control Visualization Controls Extremely Large Datasets Tips and Notes

27 Log Event Tracing Each Charm++ event and any registered user events are recorded into a pre-assigned memory buffer on each processor. Default Buffer size is 10,000 trace log entries. When a buffer is full, a special flush event is logged and buffer is flushed to disk. Flushing is done independent of other processors. There is no synchronization on a flush.

28 Summary Tracing Invoked by the same event set as in Log tracing. User events are, however, ignored. Memory buffer is organized into k bins of representing an initial time of 1ms. Each event contributes data into the appropriate time-bin. When a buffer is filled, bin-time representation is doubled and the data is packed into the first k/2 bins. Event contribution continues at the (k/2+1)th bin.

29 Memory Footprint Control Tracing Memory Footprint Event Log tracemode (-tracemode projections) Default 10,000 event entries. Controlled by runtime flag +logsize <size>. Summary tracemode (-tracemode summary) Default 10,000 time bins of 1 ms. Controlled by runtime flags +bincount <#bins> and +binsize <seconds>.

30 Memory Footprint Control (2) Trade-offs to consider Log Buffers Flush overhead vs Memory usage Frequency of flushes vs Size of flushes Summary Buffers Frequency of compaction + Data Granularity vs Memory usage

31 Controlling Data Volume [Code-time] User API for partial tracing. [Link-time] Generating only summary data. [Run-time] Writing compressed output (runtime flag): +gz-trace [Post-run] Deleting subset of generated logs. [Visualization] Parameter range control, analysis Memory.

32 Visualization Parameter Control

33 Specific Visualization Tool Issues Memory usage constraints Timeline - dependent on the event-density of the selected time range of the application. Typical workable range is 10ms to 10s for between processors. Processor based tools (Overview, Communications, Usage Profile) - limit to 2000 processors or less. Interval based tools (Graph, Time Profile, Communication vs Time, Animation) - limit to 1500 time intervals. Histogram tools limit to 1000 bins or less. 10/19/05 Projections Tutorial 33

34 Analysis Techniques Zoom in from wide-ranged low-detail views to detailed look at problem spots. Make use of range histories. Control data volume. Use effective colors. Make effective use of specific tool features (see manual).

35 Analysis Techniques (2) Load Imbalance: Overview, Usage Profile. Where's my work going?: Time Profile. How is my communication behaving?: Communication, Communication vs Time. Are there critical paths? I need details!: Timeline.

36 Putting it all together

37 Outline: Extremely Large Datasets General Introduction to Projections Projections Basics How Tracing Works Features to aid Effective Analysis Extremely Large Datasets Tips and Notes

38 Considerations for Extremely Large Datasets Large number of files ware thee the filesystem. Huge amounts of data control!

39 Demo NAMD on 8192 processors

40 Outline: Miscellany General Introduction to Projections Projections Basics How Tracing Works Features to aid Effective Analysis Extremely Large Datasets Tips and Notes

41 Conveniently Placing Logs Specifying a user-defined output location (runtime option): +traceroot <desired log directory> It is important to note that <desired log directory> must be available on a machine's compute nodes. 10/19/05 Projections Tutorial 41

42 Perturbation Issues Perturbation of application. Tracing overhead Timer overheads (rdtsc, machine wallclock). Acquisition of performance data and storage. Observed in the case of NAMD with timesteps below 10ms with many compute objects in the microsecond range. 10/19/05 Projections Tutorial 42

43 Look Closely! Do not be fooled! Projections does not handle some visualization artifacts well: Fine grain details can sometimes look like one big solid block on timeline. It is hard to mouse-over items that represent fine-grained events. Other times, tiny slivers of activity become too small to be drawn.

44 Answering my questions (again!) Load Imbalance: Overview, Usage Profile. Where's my work going?: Time Profile. How is my communication behaving?: Communication, Communication vs Time. Are there critical paths? I need details!: Timeline.

45 Frequently Asked Questions Q: I tried user events and projections visualization crashes! A: Did you register the events before using them? Q: There are giant stretched event(s) in my run! A: Did you set a large enough log buffer size? Q: Projections visualization crashes! A: Are your logs corrupted? These can happen on bad I/O (experienced on NFS).

Using BigSim to Estimate Application Performance

Using BigSim to Estimate Application Performance October 19, 2010 Using BigSim to Estimate Application Performance Ryan Mokos Parallel Programming Laboratory University of Illinois at Urbana-Champaign Outline Overview BigSim Emulator BigSim Simulator

More information

Scalable Fault Tolerance Schemes using Adaptive Runtime Support

Scalable Fault Tolerance Schemes using Adaptive Runtime Support Scalable Fault Tolerance Schemes using Adaptive Runtime Support Laxmikant (Sanjay) Kale http://charm.cs.uiuc.edu Parallel Programming Laboratory Department of Computer Science University of Illinois at

More information

Adaptive Runtime Support

Adaptive Runtime Support Scalable Fault Tolerance Schemes using Adaptive Runtime Support Laxmikant (Sanjay) Kale http://charm.cs.uiuc.edu Parallel Programming Laboratory Department of Computer Science University of Illinois at

More information

Memory Tagging in Charm++

Memory Tagging in Charm++ Memory Tagging in Charm++ Filippo Gioachin Laxmikant V. Kalé Department of Computer Science University of Illinois at Urbana Champaign Outline Overview Memory tagging Charm++ RTS CharmDebug Charm++ memory

More information

Recent Advances in Heterogeneous Computing using Charm++

Recent Advances in Heterogeneous Computing using Charm++ Recent Advances in Heterogeneous Computing using Charm++ Jaemin Choi, Michael Robson Parallel Programming Laboratory University of Illinois Urbana-Champaign April 12, 2018 1 / 24 Heterogeneous Computing

More information

Scalable Dynamic Adaptive Simulations with ParFUM

Scalable Dynamic Adaptive Simulations with ParFUM Scalable Dynamic Adaptive Simulations with ParFUM Terry L. Wilmarth Center for Simulation of Advanced Rockets and Parallel Programming Laboratory University of Illinois at Urbana-Champaign The Big Picture

More information

Visual Profiler. User Guide

Visual Profiler. User Guide Visual Profiler User Guide Version 3.0 Document No. 06-RM-1136 Revision: 4.B February 2008 Visual Profiler User Guide Table of contents Table of contents 1 Introduction................................................

More information

Introduction to Parallel Performance Engineering

Introduction to Parallel Performance Engineering Introduction to Parallel Performance Engineering Markus Geimer, Brian Wylie Jülich Supercomputing Centre (with content used with permission from tutorials by Bernd Mohr/JSC and Luiz DeRose/Cray) Performance:

More information

Load Balancing Techniques for Asynchronous Spacetime Discontinuous Galerkin Methods

Load Balancing Techniques for Asynchronous Spacetime Discontinuous Galerkin Methods Load Balancing Techniques for Asynchronous Spacetime Discontinuous Galerkin Methods Aaron K. Becker (abecker3@illinois.edu) Robert B. Haber Laxmikant V. Kalé University of Illinois, Urbana-Champaign Parallel

More information

Scalable Interaction with Parallel Applications

Scalable Interaction with Parallel Applications Scalable Interaction with Parallel Applications Filippo Gioachin Chee Wai Lee Laxmikant V. Kalé Department of Computer Science University of Illinois at Urbana-Champaign Outline Overview Case Studies Charm++

More information

Under The Hood: Performance Tuning With Tizen. Ravi Sankar Guntur

Under The Hood: Performance Tuning With Tizen. Ravi Sankar Guntur Under The Hood: Performance Tuning With Tizen Ravi Sankar Guntur How to write a Tizen App Tools already available in IDE v2.3 Dynamic Analyzer Valgrind 2 What s NEXT? Want to optimize my application App

More information

HANDLING LOAD IMBALANCE IN DISTRIBUTED & SHARED MEMORY

HANDLING LOAD IMBALANCE IN DISTRIBUTED & SHARED MEMORY HANDLING LOAD IMBALANCE IN DISTRIBUTED & SHARED MEMORY Presenters: Harshitha Menon, Seonmyeong Bak PPL Group Phil Miller, Sam White, Nitin Bhat, Tom Quinn, Jim Phillips, Laxmikant Kale MOTIVATION INTEGRATED

More information

Project #1 Exceptions and Simple System Calls

Project #1 Exceptions and Simple System Calls Project #1 Exceptions and Simple System Calls Introduction to Operating Systems Assigned: January 21, 2004 CSE421 Due: February 17, 2004 11:59:59 PM The first project is designed to further your understanding

More information

[CoolName++]: A Graph Processing Framework for Charm++

[CoolName++]: A Graph Processing Framework for Charm++ [CoolName++]: A Graph Processing Framework for Charm++ Hassan Eslami, Erin Molloy, August Shi, Prakalp Srivastava Laxmikant V. Kale Charm++ Workshop University of Illinois at Urbana-Champaign {eslami2,emolloy2,awshi2,psrivas2,kale}@illinois.edu

More information

BigNetSim Tutorial. Presented by Gengbin Zheng & Eric Bohm. Parallel Programming Laboratory University of Illinois at Urbana-Champaign

BigNetSim Tutorial. Presented by Gengbin Zheng & Eric Bohm. Parallel Programming Laboratory University of Illinois at Urbana-Champaign BigNetSim Tutorial Presented by Gengbin Zheng & Eric Bohm Parallel Programming Laboratory University of Illinois at Urbana-Champaign Outline Overview BigSim Emulator Charm++ on the Emulator Simulation

More information

Data storage on Triton: an introduction

Data storage on Triton: an introduction Motivation Data storage on Triton: an introduction How storage is organized in Triton How to optimize IO Do's and Don'ts Exercises slide 1 of 33 Data storage: Motivation Program speed isn t just about

More information

Designing Parallel Programs. This review was developed from Introduction to Parallel Computing

Designing Parallel Programs. This review was developed from Introduction to Parallel Computing Designing Parallel Programs This review was developed from Introduction to Parallel Computing Author: Blaise Barney, Lawrence Livermore National Laboratory references: https://computing.llnl.gov/tutorials/parallel_comp/#whatis

More information

Filesystem. Disclaimer: some slides are adopted from book authors slides with permission 1

Filesystem. Disclaimer: some slides are adopted from book authors slides with permission 1 Filesystem Disclaimer: some slides are adopted from book authors slides with permission 1 Storage Subsystem in Linux OS Inode cache User Applications System call Interface Virtual File System (VFS) Filesystem

More information

Tutorial: Analyzing MPI Applications. Intel Trace Analyzer and Collector Intel VTune Amplifier XE

Tutorial: Analyzing MPI Applications. Intel Trace Analyzer and Collector Intel VTune Amplifier XE Tutorial: Analyzing MPI Applications Intel Trace Analyzer and Collector Intel VTune Amplifier XE Contents Legal Information... 3 1. Overview... 4 1.1. Prerequisites... 5 1.1.1. Required Software... 5 1.1.2.

More information

Java TM SE 7 Release Notes Microsoft Windows Installation (32-bit)

Java TM SE 7 Release Notes Microsoft Windows Installation (32-bit) » search tips Search Products and Technologies Technical Topics Join Sun Developer Network Java TM SE 7 Release Notes Microsoft Windows Installation (32-bit) System Requirements JDK Documentation See supported

More information

Dynamic Load Balancing for Weather Models via AMPI

Dynamic Load Balancing for Weather Models via AMPI Dynamic Load Balancing for Eduardo R. Rodrigues IBM Research Brazil edrodri@br.ibm.com Celso L. Mendes University of Illinois USA cmendes@ncsa.illinois.edu Laxmikant Kale University of Illinois USA kale@cs.illinois.edu

More information

A Parallel-Object Programming Model for PetaFLOPS Machines and BlueGene/Cyclops Gengbin Zheng, Arun Singla, Joshua Unger, Laxmikant Kalé

A Parallel-Object Programming Model for PetaFLOPS Machines and BlueGene/Cyclops Gengbin Zheng, Arun Singla, Joshua Unger, Laxmikant Kalé A Parallel-Object Programming Model for PetaFLOPS Machines and BlueGene/Cyclops Gengbin Zheng, Arun Singla, Joshua Unger, Laxmikant Kalé Parallel Programming Laboratory Department of Computer Science University

More information

I/O: State of the art and Future developments

I/O: State of the art and Future developments I/O: State of the art and Future developments Giorgio Amati SCAI Dept. Rome, 18/19 May 2016 Some questions Just to know each other: Why are you here? Which is the typical I/O size you work with? GB? TB?

More information

Filesystem. Disclaimer: some slides are adopted from book authors slides with permission

Filesystem. Disclaimer: some slides are adopted from book authors slides with permission Filesystem Disclaimer: some slides are adopted from book authors slides with permission 1 Recap Directory A special file contains (inode, filename) mappings Caching Directory cache Accelerate to find inode

More information

CS533 Modeling and Performance Evaluation of Network and Computer Systems

CS533 Modeling and Performance Evaluation of Network and Computer Systems CS533 Modeling and Performance Evaluation of Network and Computer Systems Monitors (Chapter 7) 1 Monitors That which is monitored improves. Source unknown A monitor is a tool used to observe system Observe

More information

DAQFactory U3 Tutorial Getting Started with DAQFactory-Express and your LabJack U3 11/3/06

DAQFactory U3 Tutorial Getting Started with DAQFactory-Express and your LabJack U3 11/3/06 DAQFactory U3 Tutorial Getting Started with DAQFactory-Express and your LabJack U3 11/3/06 Congratulations on the purchase of your new LabJack U3. Included on your installation CD is a fully licensed copy

More information

Immersive Out-of-Core Visualization of Large-Size and Long-Timescale Molecular Dynamics Trajectories

Immersive Out-of-Core Visualization of Large-Size and Long-Timescale Molecular Dynamics Trajectories Immersive Out-of-Core Visualization of Large-Size and Long-Timescale Molecular Dynamics Trajectories J. Stone, K. Vandivort, K. Schulten Theoretical and Computational Biophysics Group Beckman Institute

More information

Performance Profiling

Performance Profiling Performance Profiling Minsoo Ryu Real-Time Computing and Communications Lab. Hanyang University msryu@hanyang.ac.kr Outline History Understanding Profiling Understanding Performance Understanding Performance

More information

Workload Characterization using the TAU Performance System

Workload Characterization using the TAU Performance System Workload Characterization using the TAU Performance System Sameer Shende, Allen D. Malony, and Alan Morris Performance Research Laboratory, Department of Computer and Information Science University of

More information

Java Programming Lecture 23

Java Programming Lecture 23 Java Programming Lecture 23 Alice E. Fischer April 19, 2012 Alice E. Fischer () Java Programming - L23... 1/20 April 19, 2012 1 / 20 Outline 1 Thread Concepts Definition and Purpose 2 Java Threads Creation

More information

The Cache-Coherence Problem

The Cache-Coherence Problem The -Coherence Problem Lecture 12 (Chapter 6) 1 Outline Bus-based multiprocessors The cache-coherence problem Peterson s algorithm Coherence vs. consistency Shared vs. Distributed Memory What is the difference

More information

Introducing Overdecomposition to Existing Applications: PlasComCM and AMPI

Introducing Overdecomposition to Existing Applications: PlasComCM and AMPI Introducing Overdecomposition to Existing Applications: PlasComCM and AMPI Sam White Parallel Programming Lab UIUC 1 Introduction How to enable Overdecomposition, Asynchrony, and Migratability in existing

More information

PICS - a Performance-analysis-based Introspective Control System to Steer Parallel Applications

PICS - a Performance-analysis-based Introspective Control System to Steer Parallel Applications PICS - a Performance-analysis-based Introspective Control System to Steer Parallel Applications Yanhua Sun, Laxmikant V. Kalé University of Illinois at Urbana-Champaign sun51@illinois.edu November 26,

More information

What is DARMA? DARMA is a C++ abstraction layer for asynchronous many-task (AMT) runtimes.

What is DARMA? DARMA is a C++ abstraction layer for asynchronous many-task (AMT) runtimes. DARMA Janine C. Bennett, Jonathan Lifflander, David S. Hollman, Jeremiah Wilke, Hemanth Kolla, Aram Markosyan, Nicole Slattengren, Robert L. Clay (PM) PSAAP-WEST February 22, 2017 Sandia National Laboratories

More information

SHARCNET Workshop on Parallel Computing. Hugh Merz Laurentian University May 2008

SHARCNET Workshop on Parallel Computing. Hugh Merz Laurentian University May 2008 SHARCNET Workshop on Parallel Computing Hugh Merz Laurentian University May 2008 What is Parallel Computing? A computational method that utilizes multiple processing elements to solve a problem in tandem

More information

Caching and Buffering in HDF5

Caching and Buffering in HDF5 Caching and Buffering in HDF5 September 9, 2008 SPEEDUP Workshop - HDF5 Tutorial 1 Software stack Life cycle: What happens to data when it is transferred from application buffer to HDF5 file and from HDF5

More information

<Insert Picture Here>

<Insert Picture Here> The Other HPC: Profiling Enterprise-scale Applications Marty Itzkowitz Senior Principal SW Engineer, Oracle marty.itzkowitz@oracle.com Agenda HPC Applications

More information

CS 241 Honors Memory

CS 241 Honors Memory CS 241 Honors Memory Ben Kurtovic Atul Sandur Bhuvan Venkatesh Brian Zhou Kevin Hong University of Illinois Urbana Champaign February 20, 2018 CS 241 Course Staff (UIUC) Memory February 20, 2018 1 / 35

More information

Intel VTune Amplifier XE

Intel VTune Amplifier XE Intel VTune Amplifier XE Vladimir Tsymbal Performance, Analysis and Threading Lab 1 Agenda Intel VTune Amplifier XE Overview Features Data collectors Analysis types Key Concepts Collecting performance

More information

CSE398: Network Systems Design

CSE398: Network Systems Design CSE398: Network Systems Design Instructor: Dr. Liang Cheng Department of Computer Science and Engineering P.C. Rossin College of Engineering & Applied Science Lehigh University February 23, 2005 Outline

More information

Enzo-P / Cello. Scalable Adaptive Mesh Refinement for Astrophysics and Cosmology. San Diego Supercomputer Center. Department of Physics and Astronomy

Enzo-P / Cello. Scalable Adaptive Mesh Refinement for Astrophysics and Cosmology. San Diego Supercomputer Center. Department of Physics and Astronomy Enzo-P / Cello Scalable Adaptive Mesh Refinement for Astrophysics and Cosmology James Bordner 1 Michael L. Norman 1 Brian O Shea 2 1 University of California, San Diego San Diego Supercomputer Center 2

More information

Programming by Delegation

Programming by Delegation Chapter 2 a Programming by Delegation I. Scott MacKenzie a These slides are mostly based on the course text: Java by abstraction: A client-view approach (4 th edition), H. Roumani (2015). 1 Topics What

More information

Pictometry for ArcGIS Desktop Local Release Notes

Pictometry for ArcGIS Desktop Local Release Notes Version 10.4 The Desktop - Local 10.4 extension is compatible with ArcGIS Desktop 10.4. Version 10.3.2 This extension includes a new installer, which allows you to select a location (other than Program

More information

CS 160: Interactive Programming

CS 160: Interactive Programming CS 160: Interactive Programming Professor John Canny 3/8/2006 1 Outline Callbacks and Delegates Multi-threaded programming Model-view controller 3/8/2006 2 Callbacks Your code Myclass data method1 method2

More information

libhio: Optimizing IO on Cray XC Systems With DataWarp

libhio: Optimizing IO on Cray XC Systems With DataWarp libhio: Optimizing IO on Cray XC Systems With DataWarp May 9, 2017 Nathan Hjelm Cray Users Group May 9, 2017 Los Alamos National Laboratory LA-UR-17-23841 5/8/2017 1 Outline Background HIO Design Functionality

More information

System Wide Tracing User Need

System Wide Tracing User Need System Wide Tracing User Need dominique toupin ericsson com April 2010 About me Developer Tool Manager at Ericsson, helping Ericsson sites to develop better software efficiently Background

More information

CSC207 Week 9. Larry Zhang

CSC207 Week 9. Larry Zhang CSC207 Week 9 Larry Zhang 1 Logistics A2 Part 2 is out. A1 Part 2 marks out on Peer Assessment, for remarking requests, contact Larry. 2 Today s outline File I/O Regular Expressions 3 File I/O: read and

More information

Outline. Operating Systems. File Systems. File System Concepts. Example: Unix open() Files: The User s Point of View

Outline. Operating Systems. File Systems. File System Concepts. Example: Unix open() Files: The User s Point of View Operating Systems s Systems (in a Day) Ch - Systems Abstraction to disk (convenience) The only thing friendly about a disk is that it has persistent storage. Devices may be different: tape, IDE/SCSI, NFS

More information

ReVive: Cost-Effective Architectural Support for Rollback Recovery in Shared-Memory Multiprocessors

ReVive: Cost-Effective Architectural Support for Rollback Recovery in Shared-Memory Multiprocessors ReVive: Cost-Effective Architectural Support for Rollback Recovery in Shared-Memory Multiprocessors Milos Prvulovic, Zheng Zhang*, Josep Torrellas University of Illinois at Urbana-Champaign *Hewlett-Packard

More information

Outline. Topic 9: Swing. GUIs Up to now: line-by-line programs: computer displays text user types text AWT. A. Basics

Outline. Topic 9: Swing. GUIs Up to now: line-by-line programs: computer displays text user types text AWT. A. Basics Topic 9: Swing Outline Swing = Java's GUI library Swing is a BIG library Goal: cover basics give you concepts & tools for learning more Assignment 7: Expand moving shapes from Assignment 4 into game. "Programming

More information

c Copyright by Gengbin Zheng, 2005

c Copyright by Gengbin Zheng, 2005 c Copyright by Gengbin Zheng, 2005 ACHIEVING HIGH PERFORMANCE ON EXTREMELY LARGE PARALLEL MACHINES: PERFORMANCE PREDICTION AND LOAD BALANCING BY GENGBIN ZHENG B.S., Beijing University, China, 1995 M.S.,

More information

Mizan: A System for Dynamic Load Balancing in Large-scale Graph Processing

Mizan: A System for Dynamic Load Balancing in Large-scale Graph Processing /34 Mizan: A System for Dynamic Load Balancing in Large-scale Graph Processing Zuhair Khayyat 1 Karim Awara 1 Amani Alonazi 1 Hani Jamjoom 2 Dan Williams 2 Panos Kalnis 1 1 King Abdullah University of

More information

ScalaIOTrace: Scalable I/O Tracing and Analysis

ScalaIOTrace: Scalable I/O Tracing and Analysis ScalaIOTrace: Scalable I/O Tracing and Analysis Karthik Vijayakumar 1, Frank Mueller 1, Xiaosong Ma 1,2, Philip C. Roth 2 1 Department of Computer Science, NCSU 2 Computer Science and Mathematics Division,

More information

Efficient Lists Intersection by CPU- GPU Cooperative Computing

Efficient Lists Intersection by CPU- GPU Cooperative Computing Efficient Lists Intersection by CPU- GPU Cooperative Computing Di Wu, Fan Zhang, Naiyong Ao, Gang Wang, Xiaoguang Liu, Jing Liu Nankai-Baidu Joint Lab, Nankai University Outline Introduction Cooperative

More information

Operating Systems. Lecture File system implementation. Master of Computer Science PUF - Hồ Chí Minh 2016/2017

Operating Systems. Lecture File system implementation. Master of Computer Science PUF - Hồ Chí Minh 2016/2017 Operating Systems Lecture 7.2 - File system implementation Adrien Krähenbühl Master of Computer Science PUF - Hồ Chí Minh 2016/2017 Design FAT or indexed allocation? UFS, FFS & Ext2 Journaling with Ext3

More information

Execution Architecture

Execution Architecture Execution Architecture Software Architecture VO (706.706) Roman Kern Institute for Interactive Systems and Data Science, TU Graz 2018-11-07 Roman Kern (ISDS, TU Graz) Execution Architecture 2018-11-07

More information

VAMPIR & VAMPIRTRACE INTRODUCTION AND OVERVIEW

VAMPIR & VAMPIRTRACE INTRODUCTION AND OVERVIEW VAMPIR & VAMPIRTRACE INTRODUCTION AND OVERVIEW 8th VI-HPS Tuning Workshop at RWTH Aachen September, 2011 Tobias Hilbrich and Joachim Protze Slides by: Andreas Knüpfer, Jens Doleschal, ZIH, Technische Universität

More information

Expressing Fault Tolerant Algorithms with MPI-2. William D. Gropp Ewing Lusk

Expressing Fault Tolerant Algorithms with MPI-2. William D. Gropp Ewing Lusk Expressing Fault Tolerant Algorithms with MPI-2 William D. Gropp Ewing Lusk www.mcs.anl.gov/~gropp Overview Myths about MPI and Fault Tolerance Error handling and reporting Goal of Fault Tolerance Run

More information

Welcome to the 2017 Charm++ Workshop!

Welcome to the 2017 Charm++ Workshop! Welcome to the 2017 Charm++ Workshop! Laxmikant (Sanjay) Kale http://charm.cs.illinois.edu Parallel Programming Laboratory Department of Computer Science University of Illinois at Urbana Champaign 2017

More information

Pose Reference Manual

Pose Reference Manual Parallel Programming Laboratory University of Illinois at Urbana-Champaign Pose Reference Manual Pose software and this manual are entirely the fault of Terry L. Wilmarth. Credit for Charm++ is due to

More information

Motivation. Operating Systems. File Systems. Outline. Files: The User s Point of View. File System Concepts. Solution? Files!

Motivation. Operating Systems. File Systems. Outline. Files: The User s Point of View. File System Concepts. Solution? Files! Motivation Operating Systems Process store, retrieve information Process capacity restricted to vmem size When process terminates, memory lost Multiple processes share information Systems (Ch 0.-0.4, Ch.-.5)

More information

Make sure you have the latest Hive trunk by running svn up in your Hive directory. More detailed instructions on downloading and setting up

Make sure you have the latest Hive trunk by running svn up in your Hive directory. More detailed instructions on downloading and setting up GenericUDAFCaseStudy Writing GenericUDAFs: A Tutorial User-Defined Aggregation Functions (UDAFs) are an excellent way to integrate advanced data-processing into Hive. Hive allows two varieties of UDAFs:

More information

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture

Last Class: OS and Computer Architecture. Last Class: OS and Computer Architecture Last Class: OS and Computer Architecture System bus Network card CPU, memory, I/O devices, network card, system bus Lecture 4, page 1 Last Class: OS and Computer Architecture OS Service Protection Interrupts

More information

Complete Tutorial (Includes Schematic & Layout)

Complete Tutorial (Includes Schematic & Layout) Complete Tutorial (Includes Schematic & Layout) Download 1. Go to the "Download Free PCB123 Software" button or click here. 2. Enter your e-mail address and for your primary interest in the product. (Your

More information

Debugging and profiling of MPI programs

Debugging and profiling of MPI programs Debugging and profiling of MPI programs The code examples: http://syam.sharcnet.ca/mpi_debugging.tgz Sergey Mashchenko (SHARCNET / Compute Ontario / Compute Canada) Outline Introduction MPI debugging MPI

More information

I/O Systems. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

I/O Systems. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University I/O Systems Jinkyu Jeong (jinkyu@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Today s Topics Device characteristics Block device vs. Character device Direct I/O vs.

More information

Caching and reliability

Caching and reliability Caching and reliability Block cache Vs. Latency ~10 ns 1~ ms Access unit Byte (word) Sector Capacity Gigabytes Terabytes Price Expensive Cheap Caching disk contents in RAM Hit ratio h : probability of

More information

Parallel I/O Libraries and Techniques

Parallel I/O Libraries and Techniques Parallel I/O Libraries and Techniques Mark Howison User Services & Support I/O for scientifc data I/O is commonly used by scientific applications to: Store numerical output from simulations Load initial

More information

RAID in Practice, Overview of Indexing

RAID in Practice, Overview of Indexing RAID in Practice, Overview of Indexing CS634 Lecture 4, Feb 04 2014 Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke 1 Disks and Files: RAID in practice For a big enterprise

More information

EW The Source Browser might fail to start data collection properly in large projects until the Source Browser window is opened manually.

EW The Source Browser might fail to start data collection properly in large projects until the Source Browser window is opened manually. EW 25462 The Source Browser might fail to start data collection properly in large projects until the Source Browser window is opened manually. EW 25460 Some objects of a struct/union type defined with

More information

INF5040 (Open Distributed Systems)

INF5040 (Open Distributed Systems) INF5040 (Open Distributed Systems) Lucas Provensi (provensi@ifi.uio.no) Department of Informatics University of Oslo November 12, 2012 An application-level network on top of the Internet (overlay network)

More information

Exercise 6a: Using free and/or open source tools to build workflows to manipulate. LAStools

Exercise 6a: Using free and/or open source tools to build workflows to manipulate. LAStools Exercise 6a: Using free and/or open source tools to build workflows to manipulate and process LiDAR data: LAStools Christopher Crosby Last Revised: December 1st, 2009 Exercises in this series: 1. LAStools

More information

ECE 669 Parallel Computer Architecture

ECE 669 Parallel Computer Architecture ECE 669 Parallel Computer Architecture Lecture 9 Workload Evaluation Outline Evaluation of applications is important Simulation of sample data sets provides important information Working sets indicate

More information

The HAMMER Filesystem DragonFlyBSD Project Matthew Dillon 11 October 2008

The HAMMER Filesystem DragonFlyBSD Project Matthew Dillon 11 October 2008 The HAMMER Filesystem DragonFlyBSD Project Matthew Dillon 11 October 2008 HAMMER Quick Feature List 1 Exabyte capacity (2^60 = 1 million terrabytes). Fine-grained, live-view history retention for snapshots

More information

Managing custom montage files Quick montages How custom montage files are applied Markers Adding markers...

Managing custom montage files Quick montages How custom montage files are applied Markers Adding markers... AnyWave Contents What is AnyWave?... 3 AnyWave home directories... 3 Opening a file in AnyWave... 4 Quick re-open a recent file... 4 Viewing the content of a file... 5 Choose what you want to view and

More information

Che-Wei Chang Department of Computer Science and Information Engineering, Chang Gung University

Che-Wei Chang Department of Computer Science and Information Engineering, Chang Gung University Che-Wei Chang chewei@mail.cgu.edu.tw Department of Computer Science and Information Engineering, Chang Gung University Chapter 10: File System Chapter 11: Implementing File-Systems Chapter 12: Mass-Storage

More information

Raima Database Manager Version 14.1 In-memory Database Engine

Raima Database Manager Version 14.1 In-memory Database Engine + Raima Database Manager Version 14.1 In-memory Database Engine By Jeffrey R. Parsons, Chief Engineer November 2017 Abstract Raima Database Manager (RDM) v14.1 contains an all new data storage engine optimized

More information

FairCom White Paper Caching and Data Integrity Recommendations

FairCom White Paper Caching and Data Integrity Recommendations FairCom White Paper Caching and Data Integrity Recommendations Contents 1. Best Practices - Caching vs. Data Integrity... 1 1.1 The effects of caching on data recovery... 1 2. Disk Caching... 2 2.1 Data

More information

LabWindows /CVI Release Notes Version 9.0

LabWindows /CVI Release Notes Version 9.0 LabWindows /CVI Release Notes Version 9.0 Contents These release notes introduce LabWindows /CVI 9.0. Refer to this document for system requirements, installation and activation instructions, and information

More information

7 th Annual Workshop on Charm++ and its Applications. Rodrigo Espinha 1 Waldemar Celes 1 Noemi Rodriguez 1 Glaucio H. Paulino 2

7 th Annual Workshop on Charm++ and its Applications. Rodrigo Espinha 1 Waldemar Celes 1 Noemi Rodriguez 1 Glaucio H. Paulino 2 7 th Annual Workshop on Charm++ and its Applications ParTopS: Compact Topological Framework for Parallel Fragmentation Simulations Rodrigo Espinha 1 Waldemar Celes 1 Noemi Rodriguez 1 Glaucio H. Paulino

More information

Sensitive Data and Database Inference

Sensitive Data and Database Inference Sensitive Data and Database Inference Tom Kelliher, CS 325 Nov. 8, 2006 1 Administrivia Announcements Collect assignment. Assignment Read 7.1. From Last Time Database security and reliability. Outline

More information

L41: Lab 1 - Getting Started with Kernel Tracing - I/O

L41: Lab 1 - Getting Started with Kernel Tracing - I/O L41: Lab 1 - Getting Started with Kernel Tracing - I/O Dr Robert N.M. Watson Michaelmas Term 2015 The goals of this lab are to: Introduce you to our experimental environment and DTrace Have you explore

More information

A CASE STUDY OF COMMUNICATION OPTIMIZATIONS ON 3D MESH INTERCONNECTS

A CASE STUDY OF COMMUNICATION OPTIMIZATIONS ON 3D MESH INTERCONNECTS A CASE STUDY OF COMMUNICATION OPTIMIZATIONS ON 3D MESH INTERCONNECTS Abhinav Bhatele, Eric Bohm, Laxmikant V. Kale Parallel Programming Laboratory Euro-Par 2009 University of Illinois at Urbana-Champaign

More information

CS3600 SYSTEMS AND NETWORKS

CS3600 SYSTEMS AND NETWORKS CS3600 SYSTEMS AND NETWORKS NORTHEASTERN UNIVERSITY Lecture 11: File System Implementation Prof. Alan Mislove (amislove@ccs.neu.edu) File-System Structure File structure Logical storage unit Collection

More information

Best Practices. Deploying Optim Performance Manager in large scale environments. IBM Optim Performance Manager Extended Edition V4.1.0.

Best Practices. Deploying Optim Performance Manager in large scale environments. IBM Optim Performance Manager Extended Edition V4.1.0. IBM Optim Performance Manager Extended Edition V4.1.0.1 Best Practices Deploying Optim Performance Manager in large scale environments Ute Baumbach (bmb@de.ibm.com) Optim Performance Manager Development

More information

Chapter 11: Implementing File

Chapter 11: Implementing File Chapter 11: Implementing File Systems Chapter 11: Implementing File Systems File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency

More information

CSE 333 Lecture 9 - storage

CSE 333 Lecture 9 - storage CSE 333 Lecture 9 - storage Steve Gribble Department of Computer Science & Engineering University of Washington Administrivia Colin s away this week - Aryan will be covering his office hours (check the

More information

Filesystem. Disclaimer: some slides are adopted from book authors slides with permission 1

Filesystem. Disclaimer: some slides are adopted from book authors slides with permission 1 Filesystem Disclaimer: some slides are adopted from book authors slides with permission 1 Recap Blocking, non-blocking, asynchronous I/O Data transfer methods Programmed I/O: CPU is doing the IO Pros Cons

More information

Chapter 11: Implementing File Systems. Operating System Concepts 9 9h Edition

Chapter 11: Implementing File Systems. Operating System Concepts 9 9h Edition Chapter 11: Implementing File Systems Operating System Concepts 9 9h Edition Silberschatz, Galvin and Gagne 2013 Chapter 11: Implementing File Systems File-System Structure File-System Implementation Directory

More information

File systems CS 241. May 2, University of Illinois

File systems CS 241. May 2, University of Illinois File systems CS 241 May 2, 2014 University of Illinois 1 Announcements Finals approaching, know your times and conflicts Ours: Friday May 16, 8-11 am Inform us by Wed May 7 if you have to take a conflict

More information

Charm++ on the Cell Processor

Charm++ on the Cell Processor Charm++ on the Cell Processor David Kunzman, Gengbin Zheng, Eric Bohm, Laxmikant V. Kale 1 Motivation Cell Processor (CBEA) is powerful (peak-flops) Allow Charm++ applications utilize the Cell processor

More information

LOADRUNNER INTERVIEW QUESTIONS

LOADRUNNER INTERVIEW QUESTIONS LOADRUNNER INTERVIEW QUESTIONS 1. Why should we automate the performance testing? It s a discipline that leverages products, people and processes to reduce the risk of application upgrade or patch deployment.

More information

API and Usage of libhio on XC-40 Systems

API and Usage of libhio on XC-40 Systems API and Usage of libhio on XC-40 Systems May 24, 2018 Nathan Hjelm Cray Users Group May 24, 2018 Los Alamos National Laboratory LA-UR-18-24513 5/24/2018 1 Outline Background HIO Design HIO API HIO Configuration

More information

Design of Parallel Programs Algoritmi e Calcolo Parallelo. Daniele Loiacono

Design of Parallel Programs Algoritmi e Calcolo Parallelo. Daniele Loiacono Design of Parallel Programs Algoritmi e Calcolo Parallelo Web: home.dei.polimi.it/loiacono Email: loiacono@elet.polimi.it References q The material in this set of slide is taken from two tutorials by Blaise

More information

User and Installation Guide

User and Installation Guide The Logic IO RTCU Gateway Professional Version 1.28 User and Installation Guide Table of Contents Table of Contents... 2 Introduction... 3 Contents of package... 4 System requirements... 4 Time Service...

More information

Advanced Parallel Programming. Is there life beyond MPI?

Advanced Parallel Programming. Is there life beyond MPI? Advanced Parallel Programming Is there life beyond MPI? Outline MPI vs. High Level Languages Declarative Languages Map Reduce and Hadoop Shared Global Address Space Languages Charm++ ChaNGa ChaNGa on GPUs

More information

Briefly Explain The Difference Between Manual File System

Briefly Explain The Difference Between Manual File System Briefly Explain The Difference Between Manual File System apropos, Search the manual pages for a keyword or regular expression. apt-cache, Search bc, A calculator. bdiff, Identify the differences between

More information

Method-Level Phase Behavior in Java Workloads

Method-Level Phase Behavior in Java Workloads Method-Level Phase Behavior in Java Workloads Andy Georges, Dries Buytaert, Lieven Eeckhout and Koen De Bosschere Ghent University Presented by Bruno Dufour dufour@cs.rutgers.edu Rutgers University DCS

More information

Scalable In-memory Checkpoint with Automatic Restart on Failures

Scalable In-memory Checkpoint with Automatic Restart on Failures Scalable In-memory Checkpoint with Automatic Restart on Failures Xiang Ni, Esteban Meneses, Laxmikant V. Kalé Parallel Programming Laboratory University of Illinois at Urbana-Champaign November, 2012 8th

More information

Munara Tolubaeva Technical Consulting Engineer. 3D XPoint is a trademark of Intel Corporation in the U.S. and/or other countries.

Munara Tolubaeva Technical Consulting Engineer. 3D XPoint is a trademark of Intel Corporation in the U.S. and/or other countries. Munara Tolubaeva Technical Consulting Engineer 3D XPoint is a trademark of Intel Corporation in the U.S. and/or other countries. notices and disclaimers Intel technologies features and benefits depend

More information