Enabling Scalable Data Analysis for Large Computational Structural Biology Datasets on Distributed Memory Systems
Transcription
1 Enabling Scalable Data Analysis for Large Computational Structural Biology Datasets on Distributed Memory Systems. Michela Taufer, Global Computing Laboratory, Computer and Information Sciences, University of Delaware
2 In-situ Analysis. The perfect in-situ analysis algorithm: avoids the need for moving data; uses a limited amount of memory; executes sufficiently fast. Dataset: NleNle (PDB ID 2F4K) from the Vijay Pande group. Can we perform in-situ analysis on trajectories generated in protein folding, prediction, or refinement simulations?
3 Start from unfolded conformations of a protein with correct chemical bonds but random torsion angles. Search for a conformation close to the observed native (folded) conformation. [Figure: Protein Folding Process. Reynaud, E. (2010) Protein Misfolding and Degenerative Diseases. Nature Education 3(9):28]
5 Scientific Problem. Cluster folding trajectories into recurrent patterns based on geometrical variations in time of the folding protein frames. [Figure: trajectory segments on Node 1 and Node 2; RMSD vs. frame (snapshot)]
6 Scientific Problem. Cluster folding trajectories into recurrent patterns based on geometrical variations in time of the folding protein conformations. Intra-trajectory analysis -> identify meta-stable and transition stages within a trajectory. [Figure: RMSD vs. frame (snapshot), with meta-stable and transition stages marked]
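The RMSD-versus-frame curves referenced on this slide can be illustrated with a minimal sketch. It assumes each frame is an (n_atoms, 3) coordinate array that is already superimposed on a reference conformation, so no optimal rotation is applied; the function names `rmsd` and `rmsd_profile` are hypothetical and not taken from the talk.

```python
import numpy as np

def rmsd(frame, reference):
    """Root-mean-square deviation between two (n_atoms, 3) coordinate arrays.

    Assumption: both conformations are already superimposed, so no
    rotation or translation fitting is performed here.
    """
    frame = np.asarray(frame, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if frame.shape != reference.shape:
        raise ValueError("frame and reference must have the same shape")
    per_atom_sq = np.sum((frame - reference) ** 2, axis=1)  # squared distance per atom
    return float(np.sqrt(per_atom_sq.mean()))

def rmsd_profile(trajectory, reference):
    """RMSD of every frame in a trajectory against one reference frame."""
    return [rmsd(frame, reference) for frame in trajectory]
```

Plotting `rmsd_profile(trajectory, reference)` against the frame index gives a curve of the kind shown in the figure: flat regions suggest meta-stable stages and steep regions suggest transitions.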
7 Limits of Current Practice. Centralized clustering analysis [Phillips et al. 2013]: waits for a job to end before moving its data (trajectory segment) to a centralized server; does not scale when the dataset is large; may end up wasting computing resources, e.g., by trying to further fold proteins that are already in a stable condition.
8 From Protein Conformations to Metadata. From a single protein conformation to a 3D point capturing the atom distances within the frame: conformation (data from F@H) -> atoms x atoms distance matrix (DM) -> multi-dimensional scaling -> three leading eigenvalues as (x,y,z) metadata. B. Zhang, T. Estrada, P. Cicotti, and M. Taufer. Enabling In-situ Data Analysis for Large Protein-folding Trajectory Datasets. In IPDPS, 2014.
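The conformation-to-point pipeline on this slide can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes classical (Torgerson) multi-dimensional scaling, that is, double-centering the squared distance matrix and taking the three largest eigenvalues of the result. The function name `conformation_to_point` is hypothetical.

```python
import numpy as np

def conformation_to_point(coords):
    """Map one conformation, an (n_atoms, 3) array, to a 3D metadata point.

    Assumption: classical MDS is used, so the point is formed by the three
    largest eigenvalues of the double-centered squared distance matrix.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between atoms: the atoms x atoms matrix.
    diff = coords[:, None, :] - coords[None, :, :]
    dm = np.sqrt((diff ** 2).sum(axis=-1))
    n = dm.shape[0]
    # Double centering: B = -1/2 * J * D^2 * J with J = I - (1/n) * ones.
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (dm ** 2) @ centering
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    eigenvalues = np.linalg.eigvalsh(b)
    return eigenvalues[::-1][:3]
```

Because the point depends only on inter-atom distances, it is unchanged by rotating or translating the whole conformation, and each frame can be reduced independently on the node that produced it.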
9 From Metadata to Scientific Knowledge. Given a set of conformations and their 3D points in a segment of the trajectory: [Figure: 3D points, 3 largest eigenvalues as (x,y,z)]
10 From Metadata to Scientific Knowledge. Given a set of conformations and their 3D points in a segment of the trajectory: partition the 3D points into 2 clusters using fuzzy c-means (with c = 2).
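A minimal sketch of the fuzzy c-means step named on this slide, written with NumPy. The fuzzifier `m = 2`, the random initialization, and the stopping tolerance are assumptions chosen for illustration; the slide specifies only c = 2.

```python
import numpy as np

def fuzzy_c_means(points, c=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, memberships).

    memberships[i, j] is the degree to which point i belongs to cluster j,
    and each row sums to 1. The values of m, n_iter, tol and seed are
    illustrative assumptions.
    """
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    u = rng.random((len(points), c))
    u /= u.sum(axis=1, keepdims=True)  # random initial memberships
    for _ in range(n_iter):
        w = u ** m
        # Cluster centers are membership-weighted means of the points.
        centers = (w.T @ points) / w.sum(axis=0)[:, None]
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        # u_ij = d_ij^(-2/(m-1)) / sum_k d_ik^(-2/(m-1))
        inv = dist ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return centers, u
```

Hard cluster labels, if needed for a later test, can be taken as `memberships.argmax(axis=1)`.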
11 From Metadata to Scientific Knowledge. Given a set of conformations and their 3D points in a segment of the trajectory: partition the 3D points into 2 clusters using fuzzy c-means (with c = 2); test the equality of the 2 clusters using Welch's t-test with p-value <
12 From Metadata to Scientific Knowledge. Given a set of conformations and their 3D points in a segment of the trajectory: partition the 3D points into 2 clusters using fuzzy c-means (with c = 2); test the equality of the 2 clusters using Welch's t-test with p-value <.1; iterative partition on the cluster with more points.
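The equality test can be sketched with SciPy's `ttest_ind`, which performs Welch's unequal-variance t-test when `equal_var=False`. The slide does not say which quantity is tested, so this sketch assumes the test is run on each of the three coordinates and that the clusters count as different if any coordinate rejects equality. The function name `clusters_differ` is hypothetical, and the significance level `alpha` is left as a required parameter rather than guessed.

```python
import numpy as np
from scipy import stats

def clusters_differ(cluster_a, cluster_b, alpha):
    """Return True if two clusters of 3D points differ significantly.

    Assumption: Welch's t-test is applied per coordinate, and a single
    rejection at level alpha is enough to call the clusters different.
    """
    a = np.asarray(cluster_a, dtype=float)
    b = np.asarray(cluster_b, dtype=float)
    # equal_var=False selects Welch's t-test instead of Student's t-test.
    _, p_values = stats.ttest_ind(a, b, axis=0, equal_var=False)
    return bool(np.any(p_values < alpha))
```

Welch's variant is a reasonable fit here because the two clusters generally have different sizes and spreads, which the pooled-variance test does not handle well.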
13 From Metadata to Scientific Knowledge. Given a set of conformations and their 3D points in a segment of the trajectory: partition the 3D points into 2 clusters using fuzzy c-means (with c = 2); test the equality of the 2 clusters using Welch's t-test with p-value <.1; iterative partition on the cluster with more points; finish when the 2 clusters are equal. [Figure: 3D points labeled Stage 1, Stage 2, Stage 3, Representative]
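The iterative loop listed on this slide can be sketched independently of the clustering and testing details by passing those two steps in as functions. Everything named here is hypothetical: `find_stages`, the `split` and `differ` callables, and the `min_size` guard are illustrative, and treating the smaller cluster as a finished stage is an assumption about how the stages are collected. The two 1-D helpers are toy stand-ins for fuzzy c-means and Welch's t-test.

```python
from statistics import mean

def find_stages(points, split, differ, min_size=2):
    """Repeatedly split the larger cluster until the two halves look equal.

    split(points) -> (cluster_a, cluster_b); differ(a, b) -> bool.
    Assumption: the smaller cluster of each accepted split is kept as a stage.
    """
    stages = []
    remaining = list(points)
    while len(remaining) >= 2 * min_size:
        a, b = split(remaining)
        if min(len(a), len(b)) < min_size or not differ(a, b):
            break  # the two clusters are considered equal: stop partitioning
        smaller, larger = sorted((a, b), key=len)
        stages.append(smaller)
        remaining = larger  # continue on the cluster with more points
    stages.append(remaining)
    return stages

# Toy 1-D stand-ins, used only to exercise the loop.
def split_at_largest_gap(values):
    ordered = sorted(values)
    gaps = [ordered[i + 1] - ordered[i] for i in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]

def means_far_apart(a, b, threshold=5.0):
    return abs(mean(a) - mean(b)) > threshold

toy = [0, 0.5, 1, 1.5, 2, 10, 10.5, 11, 20, 20.5]
stages = find_stages(toy, split_at_largest_gap, means_far_apart)
```

The stages come out in the order they are split off, not in time order, so a real analysis would still need to map each stage back to its frame indices.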
14 From Metadata to Scientific Knowledge. Intra-trajectory analysis of an ensemble of conformations containing one meta-stable stage followed by a transition stage and another meta-stable stage. Our clustering identifies two stable stages and one transition stage.
15 From Metadata to Scientific Knowledge. Intra-trajectory analysis of an ensemble of conformations containing one meta-stable stage followed by a transition stage and another meta-stable stage. [Figure: 3D points labeled Stage 1, Stage 2, Stage 3, Representative; RMSD vs. protein conformations] Our clustering identifies two stable stages and one transition stage. RMSDs of protein conformations identify the same three stages.
16 From Metadata to Scientific Knowledge. Intra-trajectory analysis of an ensemble of conformations containing one meta-stable stage only. Our clustering identifies one single stage.
17 From Metadata to Scientific Knowledge. Intra-trajectory analysis of an ensemble of conformations containing one meta-stable stage only. [Figure: 3D points labeled Stage 1, Representative; RMSD vs. protein conformations] Our clustering identifies one single stage. RMSDs of protein conformations identify one single stage.
18 Performance. Compare our approach with the traditional clustering method proposed by Phillips et al. Folding trajectories of villin headpiece subdomain (HP-35 NleNle). Parallel MATLAB on 256 Gordon compute cores. [Figure: bar charts of time (sec), memory per core (MB), and data moved per analysis (KB), our method vs. Phillips' method] Our method performs orders of magnitude better in terms of time, memory usage, and data movement.
19 Lessons Learned and Future Directions. The distributed analysis of structural biology data is feasible and scalable. Our approach transforms data properties into metadata concurrently and extracts scientific insights from metadata with small data movement. Our approach makes in-situ analysis feasible by avoiding the need for moving data, using a limited amount of memory, and executing sufficiently fast. Can we extend our approach to larger scales, i.e., larger and more diverse datasets?
20 Acknowledgments. Global Computing Lab: Boyu Zhang, Travis Johnston. Collaborators: Trilce Estrada (UNM), Pietro Cicotti (SDSC), Roger Armen (TJU). Special thanks to: D.E. Shaw group, Vijay Pande group (Stanford), volunteers. [Figure: sponsor logos]