Performance Analysis for Parallel R Programs: Towards Efficient Resource Utilization

Technical Report

Helena Kotthaus, Ingo Korb, Peter Marwedel

01/2015

technische universität dortmund

Part of the work on this technical report has been supported by Deutsche Forschungsgemeinschaft (DFG) within the Collaborative Research Center SFB 876 "Providing Information by Resource-Constrained Analysis", project A3.

Speaker: Prof. Dr. Katharina Morik
Address: TU Dortmund University, Joseph-von-Fraunhofer-Str. 23, Dortmund, Germany

1 Introduction

The R programming language [1] is widely used in biostatistics, where high-dimensional data sets demand a vast amount of resources. Our tool TraceR [5] allows the user to profile the resource usage of an R application to locate bottlenecks and develop new optimizations [8, 7]. TraceR is a deterministic profiling tool. In contrast to existing R profiling tools, TraceR is directly integrated with the R interpreter, which enables the generation of more detailed and accurate data about the memory and runtime behavior of an R application. This profiling functionality was previously limited to sequential R applications. Parallel computing, however, is becoming more and more popular, since R is increasingly used to process large data sets. We have therefore extended TraceR to profile parallel applications as well. TraceR covers the common cases of parallelization on multiple cores and parallelization on multiple machines. For the parallel performance analysis we added measurements such as the CPU utilization of parallel tasks and measurements for analyzing the memory usage of parallel programs during execution [9, 10].

With our parallel performance analysis we concentrate on applications that are embarrassingly parallel, i.e., consisting of independent tasks. One example application that is embarrassingly parallel and also has a high resource utilization is model selection. Here the goal is to find the machine learning algorithm configuration that builds the best model for the given data, which requires searching through a huge model space. Since the gain from parallel execution can be negated if the memory requirements of all parallel processes exceed the capacity of the system, our profiling data can serve as a constraint to determine the degree of parallelism and also to guide the distribution of parallel R applications. Our goal is to provide a resource-aware parallelization strategy. To develop such a strategy we first need to analyze the performance of parallel applications. In the following we therefore describe different parallel example applications and show how TraceR is applied to analyze parallel R applications.

2 Parallel Performance Analysis

In the following we illustrate the gain from analyzing parallel R programs with TraceR by presenting three example cases. Subsection 2.1 explains the parallel profiles produced by TraceR and shows the outcome of using too many machines for parallel execution. Subsection 2.2 discusses the problem of high runtime variance of parallel processes and load balancing options. An example of inefficient resource utilization caused by using too many parallel processes is presented in Subsection 2.3.

2.1 Inefficient Resource Utilization

In this subsection we first use TraceR to analyze an R example program that runs on multiple cores. To show how TraceR can support the detection of inefficient resource utilization for parallel R programs, we then extend this example program for execution on multiple machines. Figure 1 shows the parallel parts of an R program that calculates a Mandelbrot fractal.

Figure 1: Parallel code parts of an R program for calculating a Mandelbrot fractal on 4 cores on a single machine.

The important lines are marked in green. Here the function calcblock (line 20) runs in parallel on 4 cores (line 12) on a single machine. Each job calculates a block of lines of the final Mandelbrot image. For parallelization the R function mclapply (line 29) of the parallel R package [3] is used. This function sets up a pool of 4 worker processes with the fork mechanism, one per core. The jobs are prescheduled by default: mclapply first splits the jobs into as many groups as there are cores and sends them to the 4 worker processes, each covering more than one job. A sketch of this pattern follows.
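Since the listing of Figure 1 is only available as an image, the following is a minimal runnable sketch of the pattern the text describes. The image size, block decomposition, and the escape-time body of calcblock are illustrative assumptions; only the mclapply call on 4 cores with default prescheduling is taken from the text. Because mclapply relies on fork, this runs on Unix-like systems.

```r
library(parallel)

# Illustrative stand-ins: image size, block count, and the calcblock body
# are reconstructed; they are not the report's exact code.
width  <- 960
height <- 960
blocks <- 16   # each job computes one block of lines of the final image

calcblock <- function(block) {
  rows_per_block <- height / blocks
  rows <- ((block - 1) * rows_per_block + 1):(block * rows_per_block)
  sapply(rows, function(y) {
    sapply(seq_len(width), function(x) {
      c0 <- complex(real      = -2.0 + 3 * x / width,
                    imaginary = -1.5 + 3 * y / height)
      z <- 0 + 0i
      iter <- 0L
      while (Mod(z) <= 2 && iter < 100L) {   # escape-time iteration
        z <- z * z + c0
        iter <- iter + 1L
      }
      iter
    })
  })
}

# mclapply() from the parallel package forks one worker process per core
# (mc.cores = 4); with the default mc.preschedule = TRUE it first splits
# the jobs into as many groups as there are cores.
result <- mclapply(seq_len(blocks), calcblock, mc.cores = 4)
```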

For analyzing the resource consumption of this parallel example code, our TraceR tool produces a profile that visualizes the relative CPU utilization and memory consumption, shown in Figure 2. Besides this profile, TraceR also generates more detailed runtime and memory profiles for each parallel process (for a detailed description see [6]). This data is especially useful to support the efficient distribution of parallel processes on heterogeneous machines or cores. A manual for installing and configuring TraceR can be found in [5].

The x-axis of Figure 2 represents the runtime in seconds of the R example program. The runtime and the CPU utilization of the parallel processes are illustrated by a horizontal line for each process. The length of the line represents the runtime. The color varies from blue via purple to orange and indicates the relative CPU utilization of a process, calculated over its entire runtime. The master process is shown at the top while the worker processes are shown below it, sorted by starting time.

Figure 2: Relative CPU utilization and memory utilization of a parallel R program for calculating a Mandelbrot fractal on 4 cores on a single machine. (Plot: one line per process, colored from low to high CPU utilization; curves for used and free memory in MB over time in seconds.)

Since mclapply uses one worker process per core, Figure 2 shows 4 orange lines for the worker processes and one blue line for the master process. The worker processes are orange since their CPU utilization is high; this indicates that they did not have to wait for CPU resources but received the maximum amount of CPU time. If their color shifted towards purple or blue, this would indicate that they consumed less CPU time than their wall-clock runtime. This could be caused by other concurrent processes running on the same machine, by running more processes than available CPU cores, or simply by processing delays caused, for example, by I/O operations.
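The text does not define how this relative utilization is computed; a plausible reading (our assumption, not stated in the report) is the ratio of CPU time a process consumed to its wall-clock runtime:

\[ U_{\mathrm{rel}} = \frac{t_{\mathrm{user}} + t_{\mathrm{system}}}{t_{\mathrm{wallclock}}} \]

so a fully busy worker approaches 1 (orange) and a mostly waiting process approaches 0 (blue).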

The measurements for the master process, shown in the first horizontal line of Figure 2, indicate a low CPU utilization (blue) since the master only waits for the workers to finish and gathers the results at the end without running any calculations.

The y-axis on the right side of Figure 2 shows the memory utilization of the Mandelbrot program. The memory behavior is indicated by two curves. The green curve shows the amount of free main memory during program execution, as reported by the Linux kernel. The black curve shows the amount of memory allocated by all parallel processes over time. The values for both curves are sampled once per second. For the Mandelbrot example this profile indicates that we have more memory resources than needed.

A common way to further reduce the effective runtime of a parallel program is to execute it on multiple machines. TraceR is also able to produce profiles for parallelization on multiple machines, and the R language supports this kind of parallelization via several R packages. We therefore modify our example program by using additional mechanisms of the parallel R package. This modification is shown in Figure 3; a sketch of the pattern follows below. Again the important parts of the code are marked in green.

Figure 3: Parallel code parts of an R program for calculating a Mandelbrot fractal using 8 cores on 2 machines.

To run this example program, the machines and cores to be used for execution first have to be specified. This is done by defining a list of machines (line 11). Here the number of cores was increased by a factor of two: we are using 4 cores on each machine (host1 and host2). The makePSOCKcluster function (line 29) and the parLapply function (line 31) launch further copies of the R interpreter via the Rscript front end, an alternative front end for using R in scripts. This mechanism creates a cluster of workers based on sockets. Here again the jobs are prescheduled and therefore evenly split into groups and sent to the worker processes.
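A minimal sketch of this multi-machine variant, reusing the calcblock stand-in from above; the host names host1 and host2 come from the report, everything else is an illustrative reconstruction:

```r
library(parallel)

# Four workers on each of the two machines from the report's example.
hosts <- c(rep("host1", 4), rep("host2", 4))

# makePSOCKcluster() launches further R interpreter copies via the
# Rscript front end and connects them over sockets.
cl <- makePSOCKcluster(hosts)
clusterExport(cl, c("calcblock", "width", "height", "blocks"))

# parLapply() splits the jobs evenly into prescheduled groups.
result <- parLapply(cl, seq_len(blocks), calcblock)
stopCluster(cl)
```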

If the calculation of a program was distributed over multiple hosts, TraceR generates one sub-profile for each machine. In this case the master process is only shown in the sub-profile of the machine it was running on, here host2. Compared to the profile in Figure 2, where the program was executed on only 4 cores, this profile shows the parallel execution on 8 cores on 2 machines. But doubling the number of cores realized only minor runtime savings, which indicates an inefficient resource utilization. One reason is the overhead of using Rscript to launch further copies of the master process. Another reason is indicated by the CPU utilization of the worker processes on host2, where the master process is executed. In the previous profile, where the program was executed on a single machine, all worker processes had a high CPU utilization and also a variance in runtimes. Now the individual runtimes of the worker processes are very similar to each other and their CPU utilization is lower (purple color). The reason is that the worker processes cannot stop execution when their own computation is finished: they have to wait for the master process, and the master first waits for all computations to finish and only then shuts down the worker processes.

Figure 4: Relative CPU utilization and memory utilization of an R program for calculating a Mandelbrot fractal using 8 cores on 2 machines. (Plot: one sub-profile per host, host1 and host2, each with per-process CPU utilization lines and used/free memory in MB over time in seconds.)

In summary, the parallel profile in Figure 4 shows that using two machines for the Mandelbrot example produces overhead and thus does not lead to an efficient runtime reduction. For a runtime reduction, execution on a single machine with more cores would be more efficient and would avoid inefficient resource utilization. In the next subsection we introduce a real-world code example running on a single machine, where the runtime variance between jobs is higher than in this Mandelbrot example.

2.2 Runtime Variance and Load Balancing

If jobs executed in parallel have a high variance in runtime, load balancing is recommended to reduce inefficient resource utilization. In this subsection we discuss the problem of high runtime variance of parallel processes by analyzing a real-world code example with TraceR. Figure 5 shows the parallel parts of an R program for evaluating different parameter configurations of an SVM classification task based on the mlr library [2]. Again the important parts of the program are marked in green. The performance of an SVM classifier highly depends on its parameter configuration, so it is important to tune those parameters by evaluating the performance of different configurations.

Figure 5: Parallel code parts of an R program for evaluating different parameter configurations of an SVM classification using 4 cores on a single machine.

For parallelization of the SVM program in Figure 5, the parallelStartMulticore function (line 7) from the parallelMap R library [4] is used; a sketch follows below. This library utilizes the mechanisms of the parallel package. Here the master process is copied via fork, and the mc.preschedule parameter is passed for splitting the jobs into groups and sending them to the worker processes. Again 4 cores on a single machine (line 7) are used. Each job evaluates the performance of the SVM with a different parameter configuration; this is triggered by the function tuneParams (line 16). Figure 6 shows the corresponding TraceR profile. As can be seen from the black curve that indicates the memory allocation, much more memory is needed for this computation than for our previous Mandelbrot program. All worker processes have a high CPU utilization (orange color), but the completion times of the worker processes have a high variance (length of the horizontal lines). This indicates that runtime, and thus resources, could be saved by distributing the jobs more evenly over the worker processes. The parallel package supports such mechanisms through its load balancing option. This option is recommended when the jobs of a parallel program have widely different computation times or when the machines are heterogeneous.
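The tuning code of Figure 5 is likewise only described in this extract; the following is a minimal sketch of the combination the text names, parallelStartMulticore from parallelMap and tuneParams from mlr. The learner, parameter ranges, task, and resampling scheme are illustrative assumptions, not the report's actual configuration:

```r
library(parallelMap)
library(mlr)

# Fork-based workers on 4 cores; jobs prescheduled by default.
parallelStartMulticore(4)

# Illustrative SVM tuning setup; the report's actual task and parameter
# ranges are not shown in this extract.
ps <- makeParamSet(
  makeNumericParam("C",     lower = -5, upper = 5, trafo = function(x) 2^x),
  makeNumericParam("sigma", lower = -5, upper = 5, trafo = function(x) 2^x)
)
res <- tuneParams("classif.ksvm", task = iris.task,
                  resampling = makeResampleDesc("CV", iters = 3L),
                  par.set = ps,
                  control = makeTuneControlRandom(maxit = 20L))

parallelStop()
```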

Figure 6: Relative CPU utilization and memory utilization of an R program evaluating different parameter configurations of an SVM classification using 4 cores on a single machine. (Plot: per-process CPU utilization lines and used/free memory in MB over time in seconds.)

Figure 7 shows the modification, marked in green (line 7), needed to enable the load balancing option of the parallel package: the mc.preschedule parameter is passed with the argument FALSE (see the sketch below). Figure 8 presents the corresponding profile for CPU utilization and memory utilization of the SVM program with the load balancing mechanism enabled.

Figure 7: Parallel code parts of an R program for evaluating different parameter configurations of an SVM classification using load balancing on 4 cores on a single machine.
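Per the text's description of Figure 7, the only change to the sketch above is the start call; that mc.preschedule is passed through parallelStartMulticore is taken from the report, the rest remains our reconstruction:

```r
# Disable prescheduling: jobs are now handed to workers one at a time,
# as cores become free (load balancing).
parallelStartMulticore(4, mc.preschedule = FALSE)
```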

Figure 8: Relative CPU utilization and memory utilization of an R program evaluating different parameter configurations of an SVM classification using load balancing on 4 cores on a single machine. (Plot: per-job CPU utilization lines and used/free memory in MB over time in seconds.)

To visualize the load balancing option, the TraceR profile shows one horizontal line for the CPU utilization of each job. All jobs are still processed on 4 cores. Load balancing dynamically allocates jobs to the worker processes: it sends jobs one at a time to a core (or machine). Compared to the profile from Figure 6, where load balancing was turned off, runtime and thus resources could be saved here. However, there is still a high variance of computation times that leads to an inefficient resource utilization. This indicates that the load balancing mechanism of the parallel package is not sufficient for the SVM program, where most of the jobs have a high variance in computation times. Such high variance, however, is a common case for parameter tuning of machine learning algorithms. Our future work will therefore concentrate on the development of an optimized scheduling strategy for such use cases.

Inefficient resource utilization cannot only be caused by insufficient load balancing. In the next subsection we discuss an example case where inefficient resource utilization is caused by creating too many parallel processes.

2.3 Number of Parallel Processes

The gain from parallel execution can be negated if the memory requirements of all processes exceed the capacity of the RAM. When too many parallel processes are triggered, the OS starts to swap out memory, which leads to inefficient resource utilization. An example profile illustrating this case is shown in Figure 9. Here the same SVM program code from Figure 7 was run, but this time on a system with lower memory resources (2 GB of RAM).

Figure 9: Relative CPU utilization and memory utilization of an R program evaluating different parameter configurations of an SVM classification using load balancing on 4 cores on a single machine with 2 GB of RAM. (Plot: per-job CPU utilization lines and used/free memory in MB over time in seconds.)

Compared to Figure 8, where all processes had a high CPU utilization, this profile includes more blue lines indicating a low CPU utilization. This low CPU utilization is caused by processes that are waiting to process their data because it was swapped out. This is also reflected in the green curve indicating the free memory, which is close to zero most of the time. If only 2 instead of 4 cores had been used, the CPU utilization would have been high for all jobs, since using fewer worker processes also reduces the amount of data that needs to be copied. The parallel processes are all independent: they are not aware of each other's memory usage, so one job cannot trigger garbage collection in another to free memory for it. This example shows that the development of an optimized scheduling strategy for parallel R applications also has to consider the memory requirements of applications; see the sketch below. Here TraceR supports the development by generating these parallel profiles.
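To make this argument concrete, a small sketch of how per-job memory measurements could cap the degree of parallelism. This is our extrapolation of the report's reasoning, not code from the report; the per-job memory figure is assumed, and the 2048 MB value mirrors the machine in Figure 9:

```r
library(parallel)

# Assumed measurements: peak memory per job taken from a profiling run
# (e.g. with TraceR), and the free RAM of the target machine.
peak_job_mem_mb <- 900
free_mem_mb     <- 2048

# Cap the worker count so that the summed peak demand of all workers
# stays below the available RAM.
max_workers <- max(1L, min(detectCores(),
                           floor(free_mem_mb / peak_job_mem_mb)))
# floor(2048 / 900) = 2, matching the report's observation that 2
# instead of 4 workers would have avoided swapping on this machine.
```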

3 Conclusion

We presented the gain from analyzing parallel R applications with TraceR by showing three example cases. We presented the outcome of using too many machines for parallel execution, we discussed the problem of high runtime variance of parallel processes including load balancing, and we showed how inefficient resource utilization can be caused by using too many parallel processes. Our parallel performance analysis with TraceR can steer the development of parallel R programs aimed at overcoming the inefficient resource utilization we are currently facing. Our long-term goal is to realize an optimized scheduling strategy for parallel R programs. The information gathered by TraceR could be used to guide the scheduling decisions to allow for efficient resource utilization. Such decisions are especially important if the hardware system is heterogeneous or if the jobs have varying resource requirements depending on the input data.

References

[1] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL R-project.org

[2] Bischl B., Lang M., Richter J., Bossek J., Judt L., Kuehn T., Studerus E., Kotthoff L., Jones Z. mlr: Machine Learning in R. R package. URL R-project.org/package=mlr

[3] Ripley B., Tierney L., Urbanek S. parallel: Support for Parallel Computation. R package included in R-core. URL R-devel/library/parallel/doc/parallel.pdf

[4] Bischl B., Lang M. parallelMap: Unified Interface to Parallelization Back-Ends. R package. URL parallelmap

[5] TraceR: A Profiling Tool for R. TU Dortmund University. URL https://github.com/allr/tracer-installer

[6] R-instrumented: Part of the TraceR Profiling Tool for R. TU Dortmund University.

[7] Kotthaus H., Korb I., Lang M., Bischl B., Rahnenführer J., Marwedel P. Runtime and Memory Consumption Analyses for Machine Learning R Programs. Journal of Statistical Computation and Simulation, Volume 85, Issue 1.

[8] Kotthaus H., Korb I., Engel M., Marwedel P. Dynamic Page Sharing Optimization for the R Language. Proceedings of the 10th Symposium on Dynamic Languages (DLS '14), Portland, USA.

[9] Kotthaus H., Korb I., Marwedel P. Performance Analysis for Parallel R Programs: Towards Efficient Resource Utilization. Abstract Booklet of the International R User Conference (UseR!), Aalborg, Denmark.

[10] Kotthaus H., Korb I., Marwedel P. Distributed Performance Analysis for R. R Implementation, Optimization and Tooling Workshop (RIOT), Prague, Czech Republic.
