IBM Platform LSF 9.1.3
1 IBM Platform LSF
Global Product Portfolio Manager, IBM Platform LSF Family
2 IBM Platform LSF Family: Key Drivers
Unceasing demand for compute scalability and throughput
- Node counts are steadily growing and nodes are becoming denser
- From a single core per socket to 18-20, with 72 on the horizon
- From a single job per node to multiple jobs per node
- Accelerators
- Need for isolation and containment
- Job volumes continue to increase
Manageability and usability
- Need to be able to operate efficiently at scale
- Need to provide useful feedback to the user on when and why
- Need to be more SLA focused
- Reduce the complexity
Power is an issue
- Cost and limits
Data awareness/affinity
Secure collaboration and remote visualization
Operational management
- HPC is business critical
3 IBM Platform LSF Family
[Diagram of the product family; components shown:]
- Platform LSF
- IBM Platform Analytics
- IBM Platform Application Center
- Platform MPI
- IBM Platform RTM
- IBM Platform Process Manager
- Hadoop Connector
- MapReduce Accelerator
- Data Manager
- IBM Platform Dynamic Cluster
- Scheduling Extensions
- IBM Platform Session Scheduler
- IBM Platform License Scheduler
- Docker Connector
- Elastic Storage (GPFS)
4 IBM Platform LSF Family Releases
[Timeline figure, 1H2013 to 2H2015; labels shown:]
- 9.1 major release
- Stable production branch, fixes only
- Every 4 months (Jan, May, Sept)
- Every 4 months (Nov, Mar, July)
- Next major release, codename: Nevis
Add-ons follow roughly the same timeline.
5 9.1 Highlights
LSF 9.1
Scalability & Performance:
- New threaded query subsystem significantly improving performance
- Scheduling cycle decreased with memory optimisations
Usability & Manageability:
- Clearer reporting of resource usage
- Fast detection of hung hosts/jobs
- Leverage Linux kernel cgroup for process tracking and accounting
- Query diagnostics
- Kerberos refresh
Scheduling:
- Alternative/time based resource requirements
- Enhanced SLA scheduling
- SLA scheduling unit extended to core+memory packages
- Numerous MultiCluster enhancements
LS 9.1
- New Project mode to supplement cluster mode
- Flexible hierarchical project definition
- Better handling of parallel jobs where each rank checks out a license
- Removed the need for LSF restarts when the LS configuration changes
PAC 9.1
- Increased scalability and performance
- Configurable information retention periods
- Support for additional LSF fields
- Extended Web Services interface
- Additional remote visualization support: DCV, EoD
- Enhanced integration with Platform Analytics
RTM 9.1
- Support for GPFS monitoring
- Refreshed Alarms module allowing new alarms to be quickly defined
- Enhanced filtering and drilldown capabilities
- Support for LSF GSLA and Dynamic Cluster
- New ELIM templates to allow automatic graphing of LSF ELIM information
PPM 9.1
- Extended support for user variables and flow controls
- Flow version control
- Support for non-LSF systems
[Timeline axis: 4Q2011 to 1Q2013]
6 9.1.1 Highlights
LSF
Usability:
- Ability to customize the default output of bjobs
- Ability to pass arguments to esubs
Manageability:
- Linux OOM Killer support
- Flexible clean period for DONE and EXITed jobs
- New Queue Host Limit
- Aggressive cleanup on host failure
Scheduling:
- Cgroup enforcement of CPU affinity and memory use
- Support preemption & backfill with affinity
- Enhancements to blaunch performance and reliability
Updated HW Support:
- ARMv7 & Cray XC-30
Energy Aware Scheduling (Phase 1) LA:
- Set core/node frequency, Suspend to S3, Energy Reporting
PAC
- Support for DCV session sharing using
- Support for X.509 authentication
- Updated browser support: IE, Firefox and Chrome
- Support for PAC on PowerLinux
- Various usability enhancements
- Product usage tracking for the LSF Family
Analytics
- This release adds formal support for IBM Platform Symphony
- It includes updates to the data loaders and new Symphony-specific workbooks
LSF
Manageability:
- Ability to set the default order
- Ability to set env vars in application profiles
Scheduling:
- New CPU Affinity
- New Memory Affinity
- New IBM PE integration on x86 (Power 3Q)
PMPI
- Various performance enhancements
- Shared memory
- DNS caching at startup
- GPUDirect
- Non-blocking collectives
- Free PMPI Community Edition
[Timeline axis: 1Q2013 to 4Q2013]
7 LSF Highlights
LSF
Usability & Manageability:
- Updates to DRMAA, Perl and Python APIs
- Support for host based/parallel pre/post exec for node health checks
- Enhanced Kerberos support for parallel jobs
- New ACL based security to control who is allowed to see which jobs
Scheduling:
- Compound resource scheduling extended to support a global same[] clause
- Updates to Nvidia GPGPU and Intel Phi support
- Block allocations
- Energy Aware Scheduling including frequency prediction
- Further blaunch improvements for parallel job launching
Updated HW Support:
- zLinux and ARM refresh
PMPI
- Community Edition
LS
- Combine related features into packages for ease of use
- Run lmstat through ssh for remote clusters
- General performance enhancements
- New LS Basic Edition
- Support for LSF Advanced Edition
PAC
- Upgrade to IBM Java 7
- Various usability and functional enhancements
- Ability to pass parameters from PAC to Platform Analytics for report generation
- Preview release of PAC on IBM WebSphere
- SiteMinder support
Analytics
- Support for Platform Symphony
- Updates to Tableau & Vertica
- Updated/refreshed workbooks
- FNM, updated loaders
RTM
- New Heuristic plugin providing a holistic view on cluster health, throughput and job failures
- New Grid Host History plugin
- Enhancements to the License plugin
- Support for Power Linux
PPM
- Support for SiteMinder
- Marshalling of flow level stdout/stderr
- Miscellaneous enhancements
- Support for Power Linux
[Timeline axis: 1Q2013 to 4Q2013]
8 Overview
The v9.1.3 release of the Platform LSF Family continues our commitment to delivering best-of-breed workload management solutions. There are over 75 customer-driven enhancements across the family, including:
- Enhanced scheduling and improved manageability capabilities in LSF.
- Support for the Reprise License Manager in LSF, License Scheduler and RTM.
- Application Center now leverages IBM WebSphere for higher performance, as well as scale out and high availability.
- Numerous enhancements in Process Manager to support our market focus in Life Sciences and Genomics.
- Processor support: IBM Power8 (Linux, AIX), ARM 64-bit (Linux).
- IO optimizations with GPFS/ROMIO and GPU Direct support in Platform MPI.
- New add-ons and extensions for Hadoop & Data Management.
9 LSF Highlights
LSF
- Change how exclusive jobs are scheduled and accounted for
- Discrete slot counts for parallel jobs
- Enhanced hostfile support
- Auto purge of orphaned dependent jobs
- Node Based Advanced Reservations
- Control of Env Var propagation
- Improved handling of job control failures
- Remote proclimit considered in MC forwarding
LS
- Support for non-FlexLM licenses (RLM)
PMPI
- Mellanox Connect IB/DCT support to increase scale
- GPU-Direct RDMA
- Improved Torus Scale Support (Mellanox ACM)
- Improved ROMIO/GPFS support (MPI IO optimization)
- Improved Shmem performance
PPM
- Enhanced flow completion criteria
- Execute multiple revisions of the same flow
- Group permissions & Execution Account
PAC
- Migrate to WebSphere: performance, scale out & HA
- Enhanced field dependencies & hide form sections
- Job data file actions
- Online editors
RTM
- Extended License Dashboard and non-FlexLM support
- perfmon support
- Additional alarming templates
[Timeline, 1Q2014 to 1Q2015; labels shown: Native Phi Support for LSF 9 and PMPI, ARMv8 Support, Connector for Hadoop, Power 8 BE, Docker Integration, MapReduce Accelerator, Data Manager, Power 8 LE, KNL Support]
10 Overview of IBM Platform LSF: Easier to Manage
- You can now account for the actual resources allocated (and blocked) by a job, rather than just the resources it requested, providing better visibility into utilization.
- You can restrict users to submitting jobs of specific sizes, improving overall utilization.
- Orphaned dependent jobs can be automatically cleaned up, maintaining performance.
- New options for controlling how memory limits are applied and enforced.
- Improved handling of common failures.
- Support for RLM in addition to FlexLM.
- Support for ARMv8 and Power8.
11 Requested and Allocated Resources
Many organizations base their fairshare allocations and accounting on the resources the user requested. In the slide's example, the red job spans 8 hosts and uses 4 cores per host. The accounting information will show 32 cores in use, and fairshare will be based on 32 cores. Now consider the same job, submitted for exclusive execution. What should the accounting show: 32 or 128?
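The accounting question above comes down to simple arithmetic. The following is an illustrative Python sketch, not LSF code: the function names are hypothetical, and the figure of 16 cores per host is an assumption inferred from the slide's 128 (128 cores / 8 hosts).

```python
def requested_cores(hosts: int, cores_requested_per_host: int) -> int:
    """Cores the job asked for, summed over all hosts."""
    return hosts * cores_requested_per_host


def allocated_cores(hosts: int, cores_requested_per_host: int,
                    cores_total_per_host: int, exclusive: bool) -> int:
    """Cores actually blocked by the job.

    An exclusive job blocks every core on each host it occupies,
    whether or not it asked for them.
    """
    per_host = cores_total_per_host if exclusive else cores_requested_per_host
    return hosts * per_host


# The slide's job: 8 hosts, 4 cores requested per host.
# ASSUMPTION: each host has 16 cores (inferred from 128 / 8).
print(requested_cores(8, 4))                       # requested view
print(allocated_cores(8, 4, 16, exclusive=True))   # allocated view
```

Accounting on the allocated figure charges the job for the cores it blocks, which is the visibility the slide argues for.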
12 Auto Clean up of Orphan Dependent Jobs
LSF job dependencies are a key part of many customers' workflows. Consider the simple example below:
[Figure: dependency graph of Jobs A to G, with edges labelled Done(A), Exit(A), Done(B), Exit(B), Done(E) and Exit(E)]
If A fails and E succeeds then F will run, but B, C, D and G will never run. They are orphans and will remain pending forever with the reason "Dependency condition invalid or never satisfied". The user could go back and fix whatever caused A to fail, rerun it, and then B and its dependents would execute.
This release introduces:
- An administrator option to purge orphan jobs after a grace period. This gives the user some time to fix failed jobs or bad dependencies.
- A user option to delete jobs as soon as they become orphans.
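The orphan condition can be illustrated with a small dependency walk. This is a minimal Python sketch, not how LSF implements it: the graph layout is an assumption reconstructed from the slide's text (B and E depend on A, C and D on B, F and G on E), and all names are hypothetical.

```python
from typing import Dict, Set, Tuple

# job -> (parent job, outcome of the parent that the job waits for).
# ASSUMPTION: layout inferred from the slide's description.
DEPENDENCIES: Dict[str, Tuple[str, str]] = {
    "B": ("A", "done"),
    "E": ("A", "exit"),
    "C": ("B", "done"),
    "D": ("B", "exit"),
    "F": ("E", "done"),
    "G": ("E", "exit"),
}


def find_orphans(deps: Dict[str, Tuple[str, str]],
                 outcomes: Dict[str, str]) -> Set[str]:
    """Return jobs whose dependency can never be satisfied.

    outcomes maps each finished job to "done" or "exit".
    A job is an orphan if its parent finished with the other outcome,
    or if its parent is itself an orphan and so will never run.
    """
    orphans: Set[str] = set()
    changed = True
    while changed:  # repeat until no new orphans are discovered
        changed = False
        for job, (parent, needed) in deps.items():
            if job in orphans:
                continue
            outcome = outcomes.get(parent)
            wrong_outcome = outcome is not None and outcome != needed
            if wrong_outcome or parent in orphans:
                orphans.add(job)
                changed = True
    return orphans


# A fails, E succeeds: F can run, the rest are orphans.
print(sorted(find_orphans(DEPENDENCIES, {"A": "exit", "E": "done"})))
```

A purge with a grace period would simply delay acting on this set, leaving the user time to rerun A.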
13 Smart Memory Limit Enforcement
The Smart Memory Limit allows a job to use more memory (or swap) than it asked for, as long as there is no contention.
[Figure: two 64GB memory bars with a contention threshold, showing the reserved memory and overuse of JobA, JobB and JobC]
- JobA reserved 32GB of memory but tries to use 48GB. With traditional LSF or cgroup memory enforcement, the job would be killed. Smart Memory Limit allows this because there is no contention.
- When there is contention (memory or swap threshold), the worst offender is terminated. JobB asked for 10GB but is using 16GB. JobC asked for 16GB but is using 40GB. Thus JobC will be killed.
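The selection rule described on this slide can be sketched as follows. This is an illustrative Python model only: the slide does not define "worst offender" precisely, so ranking by absolute overuse is an assumption, as are the 50GB threshold value and every name in the code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    name: str
    requested_gb: float
    used_gb: float

    @property
    def overuse_gb(self) -> float:
        """Memory used beyond the request (never negative)."""
        return max(0.0, self.used_gb - self.requested_gb)


def pick_victim(jobs: List[Job], threshold_gb: float) -> Optional[Job]:
    """Return the job to terminate, or None when there is no contention.

    ASSUMPTION: contention means total usage exceeds the threshold,
    and the worst offender is the job with the largest absolute overuse.
    """
    if sum(j.used_gb for j in jobs) <= threshold_gb:
        return None  # no contention: overuse is tolerated
    offenders = [j for j in jobs if j.overuse_gb > 0]
    if not offenders:
        return None
    return max(offenders, key=lambda j: j.overuse_gb)


# JobA alone overuses by 16GB but stays under the (hypothetical) threshold.
print(pick_victim([Job("A", 32, 48)], threshold_gb=50))

# JobB and JobC together cross the threshold; JobC overuses the most.
victim = pick_victim([Job("B", 10, 16), Job("C", 16, 40)], threshold_gb=50)
print(victim.name if victim else None)
```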
14 Handling Failures
MultiCluster PROCLIMIT/TASKLIMIT
MultiCluster makes numerous checks when deciding which cluster to forward to. However, one check that wasn't made was whether the remote PROCLIMIT for the queue or application profile would allow the job to run. For example, the local cluster may allow a 256-way Fluent job, but the remote cluster may only allow a 64-way Fluent job. Not checking this in advance resulted in the job being returned to the submission cluster. This is now checked in advance, resulting in more efficient scheduling.
Pre-Execution Failures
When a pre-exec repeatedly fails, the job is suspended and left for the user to fix (if they remember!). There is now a queue/application option to have the job exit instead.
Checking of Job Control Return Codes
Job controls are expected to put the job in the correct state before returning. In some cases this does not happen, with the result that the LSF state and the real job state are out of sync. To alleviate this, LSF now checks the return code of the job control: if it is zero, the action has succeeded; if it is non-zero, the action has failed and LSF will not change the job state.
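Two of the checks on this slide are easy to express as small functions. The following is a hedged Python sketch of the logic only, not LSF's implementation: the cluster names, limits and function names are hypothetical, and the state names RUN and USUSP are used purely as labels.

```python
from typing import Dict, List


def eligible_clusters(job_slots: int,
                      remote_proclimits: Dict[str, int]) -> List[str]:
    """Keep only remote clusters whose limit can accommodate the job.

    Checking before forwarding avoids sending the job to a cluster
    that would only return it to the submission cluster.
    """
    return [name for name, limit in remote_proclimits.items()
            if job_slots <= limit]


def apply_job_control(current_state: str, target_state: str,
                      return_code: int) -> str:
    """Change the recorded job state only if the job control succeeded."""
    return target_state if return_code == 0 else current_state


# A 256-way job: a cluster capped at 64 is skipped up front.
print(eligible_clusters(256, {"cluster_a": 64, "cluster_b": 512}))

# A failed suspend action (non-zero exit) leaves the job recorded as running.
print(apply_job_control("RUN", "USUSP", return_code=1))
```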
15 Overview of IBM Platform LSF: Easier to Use
- New online help for bsub and bjobs, helping users help themselves.
- Increased flexibility when creating advance reservations. You can now create them per node (rather than per slot), and give access to a group of users (rather than individual users).
- If you have to run the same job repeatedly on the exact same hosts and layout, you can now specify the exact job topology via a host file.
- Fine-grained control of how the job environment is propagated.
16 Platform LSF Family Add-ons
17 IBM Platform Application Center
Platform Application Center's principal focus is ease of use, and is driven by enterprise IM clients and by Life Sciences.
Easier to Manage
- Greater performance, scalability and high availability through replacing Apache Tomcat with IBM WebSphere.
- Enterprise identity management with support for CA SiteMinder.
- Support for RHEL7 (with MariaDB) on x86 and Power 8.
Easier to Use
- Support for online file editing of job submission scripts and data files removes the need to repeatedly download-edit-upload-test.
- Support for concurrently viewing multiple output files, avoiding the need to view them sequentially.
Easier to Customize
- Simplified submission templates through extended support for field dependencies, layout persistence, and custom file actions.
- Users can be restricted from creating copies of existing templates, enabling a more controlled and standardised environment.
18 Why move to IBM WebSphere?
Improved web portal performance and scalability
- Support more concurrent users
- Better performance (30-60% faster)
- More stable under heavy load than Tomcat
WebSphere Liberty Profile is embedded in Application Center
- No requirement to purchase WebSphere separately
Provides enterprise level HA
- Leverage WebSphere clustering for scale out and fail over
Leverage prebuilt components from WebSphere
- For example: SiteMinder support
Test environment: 100 concurrent users, 1.4 million active jobs, 4 million historical jobs, 5k hosts, MySQL 5.6. PAC servers: Red Hat 6.4, 2 CPUs, 32 cores, 128 GB memory.
Submit/Query test case: 100 clients, each client submitting 10 jobs continually, each job with 1 file attachment (size 1k).
19 Superior Repeatable Performance = Better Business Outcomes
Benchmarks designed to simulate workloads typically found in high throughput environments at small scale (128 cores).
[Chart labels: 12x less work, 150x less work, 6x less work]
20 Summary of IBM Platform Process Manager
Product direction is strongly influenced by the adoption of workflow in Industrial Manufacturing and in Life Sciences.
Easier to Use
- Faster time to results through array element to element dependencies.
- Reduce wasted computation through enhanced flow completion criteria.
- Reduce errors through the use of parameter files.
- Users now have control over which version of a particular flow is triggered, allowing them to compare different versions or verify old results.
Easier to Manage
- Ability to delegate flow submission and management.
- Support for automatic archiving of flow execution history, enabling traceability over much longer periods of time.
[Figure labels: S1, S2, S3]
21 Platform RTM
Platform RTM's direction is driven by a broad range of clients spanning many industries. The main foci for V9.1.3 were:
- Make it easier to get started with Platform RTM
- Platform LSF performance monitoring
- Collect additional Platform LSF data
- Support RLM
Platform RTM also adds support for POWER8 on RHEL6.5/SLES
22 RTM 9.1.3: Guided Help
- Page level help for each link in the Console tab
- Link to Getting Started
23 RTM Cluster Performance Tracking
To aid in the tracking of cluster performance, the output of badmin perfmon and badmin showstatus can be collected periodically. RTM can also periodically submit and track a test job for end-to-end performance monitoring.
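Periodic collection of this kind can be approximated outside RTM with a short script. The Python sketch below is not RTM's collector: the function names are hypothetical, the use of the `view` subcommand of `badmin perfmon` is an assumption (the slide names only `badmin perfmon`), and it presumes performance monitoring has already been started on the cluster.

```python
import subprocess
import time
from typing import Callable, Dict, List, Sequence

# Commands named on the slide. ASSUMPTION: "view" is the subcommand
# used to read the current perfmon sample.
COMMANDS: List[List[str]] = [
    ["badmin", "perfmon", "view"],
    ["badmin", "showstatus"],
]


def run_command(argv: Sequence[str]) -> str:
    """Run one command and return its standard output as text."""
    result = subprocess.run(list(argv), capture_output=True,
                            text=True, check=False)
    return result.stdout


def collect_once(runner: Callable[[Sequence[str]], str] = run_command
                 ) -> Dict[str, str]:
    """Collect the output of every command, keyed by its command line."""
    return {" ".join(argv): runner(argv) for argv in COMMANDS}


def collect_periodically(samples: int, interval_s: float,
                         runner: Callable[[Sequence[str]], str] = run_command,
                         sleep: Callable[[float], None] = time.sleep
                         ) -> List[Dict[str, str]]:
    """Take a fixed number of samples, pausing between them."""
    results = []
    for i in range(samples):
        results.append(collect_once(runner))
        if i < samples - 1:
            sleep(interval_s)
    return results
```

Passing the runner and sleep functions as parameters keeps the loop testable on a machine without LSF installed; on a cluster, `collect_periodically(samples=3, interval_s=60)` would use the real commands.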
24 Platform MPI
Two key performance enhancements:
ROMIO Support for Elastic Storage
- ROMIO is a parallel I/O library for MPI applications
- Now also available in IBM Parallel Environment (PE)
- Significantly enhanced performance for I/O intensive applications
GPU-Direct RDMA
- New NVIDIA functionality
- Delivers significantly enhanced performance for GPU enabled applications
[Chart: MBytes/sec versus message size]
25 New Add-ons and Extensions
- Data Manager for LSF
- Docker Integration
- Hadoop Connector
- MapReduce Accelerator for LSF
26 Integration with Docker
In short, Docker is a lightweight Linux container technology built on top of LXC and cgroup. Think of it as a virtual machine, without the overheads or size of a VM, that installs a complete application stack on demand. All environment, library and distro dependencies can be encapsulated within the container, so two applications that would have conflicting library requirements can now share the same host.
27 Integration with Docker
- Hot topic in big data for installing applications on demand in the cloud.
- Significant interest in Life Sciences, Software Development and other industries for encapsulating workflows and installing all components on demand.
Example session:
    [gecko1]$ cat /etc/redhat-release
    Red Hat Enterprise Linux Server release 6.2 (Santiago)
    [gecko1]$ bsub -m gecko1 -Is -a "docker(panxun/bwa)" /bin/sh
    Job <402> is submitted to default queue <interactive>.
    <<Waiting for dispatch...>>
    <<Starting on gecko1>>
    $ hostname
    Docker_SL-402
    $ cat /etc/lsb-release
    DISTRIB_ID=Ubuntu
    DISTRIB_RELEASE=14.04
    DISTRIB_CODENAME=trusty
    DISTRIB_DESCRIPTION="Ubuntu LTS"
LSF OpenBeta & Whitepaper
30 For further information, please contact: Bill McMillan Global Product Portfolio Manager IBM Platform Computing LSF Family of Products Tel:
More informationVIRTUAL GPU LICENSE SERVER VERSION AND 5.1.0
VIRTUAL GPU LICENSE SERVER VERSION 2018.06 AND 5.1.0 DU-07754-001 _v6.0 through 6.2 July 2018 User Guide TABLE OF CONTENTS Chapter 1. Introduction to the NVIDIA vgpu Software License Server... 1 1.1. Overview
More informationIntroduction. Key Features and Benefits
Introduction Stabilix Underwriting Framework is a highly adaptable XML based J2EE com-pliant software platform built on the Stabilix s business process automation (BPA) suite, code named CloudEx. CloudEx
More informationExtended Search Administration
IBM Lotus Extended Search Extended Search Administration Version 4 Release 0.1 SC27-1404-02 IBM Lotus Extended Search Extended Search Administration Version 4 Release 0.1 SC27-1404-02 Note! Before using
More informationFUJITSU Software ServerView Cloud Monitoring Manager V1.1. Release Notes
FUJITSU Software ServerView Cloud Monitoring Manager V1.1 Release Notes J2UL-2170-01ENZ0(00) July 2016 Contents Contents About this Manual... 4 1 What's New?...6 1.1 Performance Improvements... 6 1.2
More informationState of the Linux Kernel
State of the Linux Kernel Timothy D. Witham Chief Technology Officer Open Source Development Labs, Inc. 1 Agenda Process Performance/Scalability Responsiveness Usability Improvements Device support Multimedia
More informationIBM i 7.3 Features for SAP clients A sortiment of enhancements
IBM i 7.3 Features for SAP clients A sortiment of enhancements Scott Forstie DB2 for i Business Architect Eric Kass SAP on IBM i Database Driver and Kernel Engineer Agenda Independent ASP Vary on improvements
More informationIBM Endpoint Manager Version 9.0. Software Distribution User's Guide
IBM Endpoint Manager Version 9.0 Software Distribution User's Guide IBM Endpoint Manager Version 9.0 Software Distribution User's Guide Note Before using this information and the product it supports,
More informationGuide to Deploying VMware Workspace ONE with VMware Identity Manager. SEP 2018 VMware Workspace ONE
Guide to Deploying VMware Workspace ONE with VMware Identity Manager SEP 2018 VMware Workspace ONE You can find the most up-to-date technical documentation on the VMware website at: https://docs.vmware.com/
More informationAccelerate critical decisions and optimize network use with distributed computing
DATASHEET EDGE & FOG PROCESSING MODULE Accelerate critical decisions and optimize network use with distributed computing Add computing power anywhere in your distributed network with the Cisco Kinetic
More informationIRIX Resource Management Plans & Status
IRIX Resource Management Plans & Status Dan Higgins Engineering Manager, Resource Management Team, SGI E-mail: djh@sgi.com CUG Minneapolis, May 1999 Abstract This paper will detail what work has been done
More informationOracle Financial Consolidation and Close Cloud. October 2017 Update (17.10) What s New
Oracle Financial Consolidation and Close Cloud October 2017 Update (17.10) What s New TABLE OF CONTENTS REVISION HISTORY... 3 ORACLE FINANCIAL CONSOLIDATION AND CLOSE CLOUD, OCTOBER UPDATE... 4 ANNOUNCEMENTS
More informationNastel Technologies 48 South Service Road Melville, NY, USA Copyright 2015 Nastel Technologies, Inc.
Nastel Technologies 48 South Service Road Melville, NY, USA 11747 Copyright 2015 Nastel Technologies, Inc. 3 Reasons MQ isn t just about Messages MQ Messages not processed can cost you Millions $$$! Example:
More informationWhat's new in IBM Rational Build Forge Version 7.1
What's new in IBM Rational Build Forge Version 7.1 Features and support that help you automate or streamline software development tasks Skill Level: Intermediate Rational Staff, IBM Corporation 13 Jan
More informationTableau Server - 101
Tableau Server - 101 Prepared By: Ojoswi Basu Certified Tableau Consultant LinkedIn: https://ca.linkedin.com/in/ojoswibasu Introduction Tableau Software was founded on the idea that data analysis and subsequent
More informationServer Installation Guide
Server Installation Guide Server Installation Guide Legal notice Copyright 2018 LAVASTORM ANALYTICS, INC. ALL RIGHTS RESERVED. THIS DOCUMENT OR PARTS HEREOF MAY NOT BE REPRODUCED OR DISTRIBUTED IN ANY
More informationStorage for HPC, HPDA and Machine Learning (ML)
for HPC, HPDA and Machine Learning (ML) Frank Kraemer, IBM Systems Architect mailto:kraemerf@de.ibm.com IBM Data Management for Autonomous Driving (AD) significantly increase development efficiency by
More information12d Synergy V4 Release Notes. 12d Synergy V4 Release Notes. Prerequisites. Upgrade Path. Check Outs. Scripts. Workspaces
12d Synergy V4 Release Notes V4 contains a large number of features. Many of these features are listed in this document, but this list may not be exhaustive. This document also contains pre-requisites
More informationWebSphere Application Server, Version 5. What s New?
WebSphere Application Server, Version 5 What s New? 1 WebSphere Application Server, V5 represents a continuation of the evolution to a single, integrated, cost effective, Web services-enabled, J2EE server
More informationWebCenter Interaction 10gR3 Overview
WebCenter Interaction 10gR3 Overview Brian C. Harrison Product Management WebCenter Interaction and Related Products Summary of Key Points AquaLogic Interaction portal has been renamed
More informationTECHNICAL OVERVIEW OF NEW AND IMPROVED FEATURES OF EMC ISILON ONEFS 7.1.1
TECHNICAL OVERVIEW OF NEW AND IMPROVED FEATURES OF EMC ISILON ONEFS 7.1.1 ABSTRACT This introductory white paper provides a technical overview of the new and improved enterprise grade features introduced
More informationGuide to Deploying VMware Workspace ONE. DEC 2017 VMware AirWatch 9.2 VMware Identity Manager 3.1
Guide to Deploying VMware Workspace ONE DEC 2017 VMware AirWatch 9.2 VMware Identity Manager 3.1 You can find the most up-to-date technical documentation on the VMware website at: https://docs.vmware.com/
More informationUsing the SDACK Architecture to Build a Big Data Product. Yu-hsin Yeh (Evans Ye) Apache Big Data NA 2016 Vancouver
Using the SDACK Architecture to Build a Big Data Product Yu-hsin Yeh (Evans Ye) Apache Big Data NA 2016 Vancouver Outline A Threat Analytic Big Data product The SDACK Architecture Akka Streams and data
More informationOVERVIEW OF THE SAS GRID
OVERVIEW OF THE SAS GRID Host Caroline Scottow Presenter Peter Hobart MANAGING THE WEBINAR In Listen Mode Control bar opened with the white arrow in the orange box Copyr i g ht 2012, SAS Ins titut e Inc.
More informationLBRN - HPC systems : CCT, LSU
LBRN - HPC systems : CCT, LSU HPC systems @ CCT & LSU LSU HPC Philip SuperMike-II SuperMIC LONI HPC Eric Qeenbee2 CCT HPC Delta LSU HPC Philip 3 Compute 32 Compute Two 2.93 GHz Quad Core Nehalem Xeon 64-bit
More informationScheduling in SAS 9.4, Second Edition
Scheduling in SAS 9.4, Second Edition SAS Documentation September 5, 2017 The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2016. Scheduling in SAS 9.4, Second Edition.
More informationTechnical Computing Suite supporting the hybrid system
Technical Computing Suite supporting the hybrid system Supercomputer PRIMEHPC FX10 PRIMERGY x86 cluster Hybrid System Configuration Supercomputer PRIMEHPC FX10 PRIMERGY x86 cluster 6D mesh/torus Interconnect
More informationAchieving Horizontal Scalability. Alain Houf Sales Engineer
Achieving Horizontal Scalability Alain Houf Sales Engineer Scale Matters InterSystems IRIS Database Platform lets you: Scale up and scale out Scale users and scale data Mix and match a variety of approaches
More informationEsgynDB Enterprise 2.0 Platform Reference Architecture
EsgynDB Enterprise 2.0 Platform Reference Architecture This document outlines a Platform Reference Architecture for EsgynDB Enterprise, built on Apache Trafodion (Incubating) implementation with licensed
More informationCloud Computing & Visualization
Cloud Computing & Visualization Workflows Distributed Computation with Spark Data Warehousing with Redshift Visualization with Tableau #FIUSCIS School of Computing & Information Sciences, Florida International
More informationImage Management for View Desktops using Mirage
Image Management for View Desktops using Mirage Mirage 5.9.0 This document supports the version of each product listed and supports all subsequent versions until the document is replaced by a new edition.
More informationPerformance Benchmark and Capacity Planning. Version: 7.3
Performance Benchmark and Capacity Planning Version: 7.3 Copyright 215 Intellicus Technologies This document and its content is copyrighted material of Intellicus Technologies. The content may not be copied
More informationCIT 668: System Architecture. Amazon Web Services
CIT 668: System Architecture Amazon Web Services Topics 1. AWS Global Infrastructure 2. Foundation Services 1. Compute 2. Storage 3. Database 4. Network 3. AWS Economics Amazon Services Architecture Regions
More informationJitterbit is comprised of two components: Jitterbit Integration Environment
Technical Overview Integrating your data, applications, and other enterprise systems is critical to the success of your business but, until now, integration has been a complex and time-consuming process
More informationSilk Central Release Notes
Silk Central 16.5 Release Notes Borland Software Corporation 700 King Farm Blvd, Suite 400 Rockville, MD 20850 Copyright Micro Focus 2015. All rights reserved. Portions Copyright 2004-2009 Borland Software
More informationReal-Time Internet of Things
Real-Time Internet of Things Chenyang Lu Cyber-Physical Systems Laboratory h7p://www.cse.wustl.edu/~lu/ Internet of Things Ø Convergence of q Miniaturized devices: integrate processor, sensors and radios.
More informationBuilding A Better Test Platform:
Building A Better Test Platform: A Case Study of Improving Apache HBase Testing with Docker Aleks Shulman, Dima Spivak Outline About Cloudera Apache HBase Overview API compatibility API compatibility testing
More informationMELLANOX EDR UPDATE & GPUDIRECT MELLANOX SR. SE 정연구
MELLANOX EDR UPDATE & GPUDIRECT MELLANOX SR. SE 정연구 Leading Supplier of End-to-End Interconnect Solutions Analyze Enabling the Use of Data Store ICs Comprehensive End-to-End InfiniBand and Ethernet Portfolio
More informationInfoBrief. Platform ROCKS Enterprise Edition Dell Cluster Software Offering. Key Points
InfoBrief Platform ROCKS Enterprise Edition Dell Cluster Software Offering Key Points High Performance Computing Clusters (HPCC) offer a cost effective, scalable solution for demanding, compute intensive
More informationGoDocker. A batch scheduling system with Docker containers
GoDocker A batch scheduling system with Docker containers Web - http://www.genouest.org/godocker/ Code - https://bitbucket.org/osallou/go-docker Twitter - #godocker Olivier Sallou IRISA - 2016 CC-BY-SA
More informationWhat s new in HTCondor? What s coming? HTCondor Week 2018 Madison, WI -- May 22, 2018
What s new in HTCondor? What s coming? HTCondor Week 2018 Madison, WI -- May 22, 2018 Todd Tannenbaum Center for High Throughput Computing Department of Computer Sciences University of Wisconsin-Madison
More information