RUNNING MOLECULAR DYNAMICS SIMULATIONS WITH CHARMM: A BRIEF TUTORIAL

While you can probably write a reasonable program that carries out molecular dynamics (MD) simulations, it's sometimes more efficient to use existing MD packages that are optimized to run on distributed supercomputing clusters. In this example we will use the CHARMM (Chemistry at HARvard Molecular Mechanics) integrator and force fields (www.charmm.org) to simulate a protein in both an implicit and an explicit solvent.

Before you get started, you will need to download and set up a few utilities:

(A) Access to a local terminal. If you're running Linux or OSX, a terminal utility is already built into your operating system. However, if you're running Windows, you will need to download and install PuTTY (http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html) and WinSCP (https://winscp.net/eng/download.php). PuTTY will be used to SSH and run commands in the terminal, while WinSCP will be used to transfer files to and from various machines.

(B) A user account on Knot. Knot is a supercomputing cluster located at the University of California, Santa Barbara. Your username on Knot will be tstc0#, where # represents your group number. For instance, if you were assigned to group 7, your Knot username would be tstc07. To check that your account works in OSX/Linux, type 'ssh yourusername@knot.cnsi.ucsb.edu -p 43210'. For Windows users, you'll need to set up a new PuTTY profile that specifies knot.cnsi.ucsb.edu as the server name, 43210 as the port number, and your unique username as the user. When prompted for a password, type in SummerCharm2015 (that's Charm with one m). If you're able to log in to Knot, go ahead and make a folder with your last name (e.g. 'mkdir yourlastname'). For any subsequent work you perform on Knot, make sure to log in with your unique username, and only write to your own yourlastname directory.
For a more complete list of terminal commands, check out http://www.dummies.com/how-to/content/how-to-use-basic-unix-commands-to-work-interminal.html. (Don't be fooled by the domain name!)

(C) VMD (Visual Molecular Dynamics). You can download VMD at http://www.ks.uiuc.edu/research/vmd/; however, you will need to register for a free account. Installation directions are available at http://www.ks.uiuc.edu/research/vmd/current/ig/ig.html. A detailed user guide for VMD can be found at http://www.ks.uiuc.edu/research/vmd/current/ug.pdf.

RUNNING MD WITH AN IMPLICIT SOLVENT

Here, we are going to simulate a 67-residue protein sequence that is found in the proto-oncogene tyrosine-protein kinase Fyn (PDB: 1NYG) in an implicit solvent. More information about this sequence can be found at http://www.rcsb.org/pdb/explore.do?structureid=1nyg. To help set this system up, we're going to enlist the help of CHARMM-GUI [1], which is a web server dedicated to setting up CHARMM simulations.

(1.) Go to http://www.charmm-gui.org and click on 'Input Generator' to the left. Then click on 'Implicit Solvent Modeller'. This should take you to http://www.charmm-gui.org/?doc=input/implicit

(2.) Choose the 'EEF1' implicit solvent model, and then enter '1NYG' under Download PDB file. Click next.

(3.) We won't bother modifying any segment IDs (SEGIDs), names, or residue numbers, so click next once again.

(4.) Keep the default options here, and click next. (However, if you want to play around with the structure after the end of these exercises, feel free to mutate the structure at this step and compare it to the un-mutated protein structure.)

(5.) CHARMM-GUI should have now produced the files you need. Download them (in *.tgz format) on the right-hand side.

(6.) Go ahead and unzip the *.tgz file, and look through each of the files. Specifically, try opening step2_implicit.pdb in VMD by typing 'vmd step2_implicit.pdb' (or in Windows, you'll need to open VMD separately, and manually load in the file). You should see a three-dimensional rendering of the protein, which you can move around and rotate. Also, try playing around with the molecular representations by clicking on Graphics > Representations from the top menu bar.

(7.) Upload your *.tgz file to Knot by typing 'scp -P 43210 -r charmm-gui.tgz yourusername@knot.cnsi.ucsb.edu:/home/yourusername/yourlastname'. Note that scp takes the port as an uppercase -P, whereas ssh uses a lowercase -p. Make sure that you've already created the directory yourlastname before transferring files to it.

(8.) Now, log into Knot by typing 'ssh yourusername@knot.cnsi.ucsb.edu -p 43210' if you're running OSX or Linux. If you're running on Windows, select the appropriate Knot profile in PuTTY and click connect.

(9.) You should initially be located in your home directory when you log into Knot, so navigate to your yourlastname directory once you log in, which should contain the tgz file you just uploaded. Unzip/untar the file by typing 'tar zxvf charmm-gui.tgz' (or whatever name you chose for your tgz file), and then navigate to the directory it just created (usually called 'charmm-gui').

(10.) The first thing we'll want to do is to edit the step2_implicit.inp CHARMM input file. Go ahead and change the value of 'nstep' from 100 to 5000. Time is counted in units of dt, which is 2 femtoseconds (0.002 ps) here. Therefore, 5000*dt = 5000*0.002 ps = 10 ps. Because the input file will tell CHARMM to write to a trajectory every 500*dt (1 ps), that should give us 10 frames to work with.

(11.) Now that we're ready to run CHARMM, we need to do so on compute nodes that are optimized for computation, rather than on the login node that we're typing commands into. To submit our job to the Knot scheduler, copy /home/zlevine/submit to your working directory. This file will be used to submit simulation jobs on Knot and to request distributed computing resources.

(12.) After you copy the submit file to your working directory, edit the file and change the email address towards the top to your own email address (you will be notified when your job has begun running, or has completed). Then, update the CHARMM_INPUT variable so that it has a value of 'step2_implicit.inp'. This will be the input file that CHARMM will run during submission.

(13.) Submit your job by typing 'qsub submit'. You can check the status of your job by typing 'qstat -u yourusername'. Because the job is queued, you can log out and log back in to check the status of your jobs at any time without losing your progress.

(14.) When your job is done (i.e. when 'qstat -u yourusername' returns no running jobs), you should see a new trajectory file called run.dcd. Since the trajectory is written in binary, you can only see what is in this file by downloading run.dcd and the initial pdb file (step1_pdbreader.pdb) to your computer, and loading them into VMD ('vmd step1_pdbreader.pdb -dcd run.dcd'). Using the slider in the main VMD menu, you can cycle back and forth between different trajectory frames, and see the protein wiggle in time.

(15.) Another option to extract data from the trajectory file is to output its contents into a text-readable format. To do this, we can use another input script to load the trajectory into CHARMM, and output its contents to individual pdb files. Copy /home/zlevine/write_pdb.inp to your working directory, and run it by typing 'charmm < write_pdb.inp'. This should create 10 consecutively numbered pdb files, one for every 1 ps of our 10 ps trajectory.
(Note: This is one of the only times that we should directly call CHARMM from the Knot login node, since the computational requirements are small. When users perform too much computation on the shared login node, the administrative gods usually punish them by temporarily disabling their accounts. So, try to keep computation on the login node to a minimum.)

(16.) Finally, in order to analyze these pdb files, we can utilize scripts (written in, e.g., Python or Perl) to quickly extract the data we want. Go ahead and copy the Perl file /home/zlevine/analysis.pl to your working directory (which contains your newly created pdb files). This script will parse your pdb files and calculate various quantities from them. Once the script is copied, you can run it by typing 'perl analysis.pl'. This will write the protein's end-to-end distance (Ree) and its radius of gyration (Rg) to the files Ree_log and Rg_log, respectively. We will want to compare these values to those derived from explicitly solvated proteins in the next section, and see how similar (or dissimilar) they are to one another.
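The contents of analysis.pl are not shown in this tutorial, so the following is only a minimal Python sketch of how Ree and Rg could be computed from a single pdb file. It assumes standard fixed-column PDB ATOM records, uses only the alpha-carbon (CA) atoms, weights every atom equally in Rg, and takes Ree as the distance between the first and last CA atoms. The function names are hypothetical, and the actual script may define these quantities differently (for example, with mass weighting or all atoms).

```python
import math


def read_ca_coordinates(pdb_lines):
    """Return (x, y, z) tuples for alpha-carbon atoms found in PDB ATOM records."""
    coords = []
    for line in pdb_lines:
        # PDB is a fixed-column format: the atom name sits in columns 13-16,
        # and x, y, z sit in columns 31-38, 39-46 and 47-54 (1-based).
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            coords.append((x, y, z))
    return coords


def radius_of_gyration(coords):
    """Unweighted Rg: root-mean-square distance of the atoms from their centroid."""
    n = len(coords)
    center = [sum(c[i] for c in coords) / n for i in range(3)]
    mean_sq = sum(
        sum((c[i] - center[i]) ** 2 for i in range(3)) for c in coords
    ) / n
    return math.sqrt(mean_sq)


def end_to_end_distance(coords):
    """Ree: distance between the first and last atoms in the list (assumed CA)."""
    return math.dist(coords[0], coords[-1])
```

To use it on one frame, read the file's lines (for instance with open(...).readlines()), pass them to read_ca_coordinates, and feed the result to the two functions; both return values in angstroms, the unit of PDB coordinates.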

Reference(s):

[1] Jo, S., Kim, T., Iyer, V. G. and Im, W. (2008), CHARMM-GUI: A web-based graphical user interface for CHARMM. J. Comput. Chem., 29: 1859-1865. doi: 10.1002/jcc.20945

RUNNING MD WITH AN EXPLICIT SOLVENT

This simulation will be similar to the implicit solvent section, but with the addition of discrete water molecules solvating the protein (versus a continuum of water). Make sure you run these new simulations in a separate folder from the previous section, since you don't want to accidentally overwrite files that you created earlier. For example, you might create an 'explicit' folder in your yourlastname directory.

(1.) As before, go to http://www.charmm-gui.org and click on 'Input Generator', but this time select 'Quick MD Simulator' on the left menu bar. This should take you to http://www.charmm-gui.org/?doc=input/mdsetup.

(2.) Enter '1NYG' for the PDB file, then click next on the lower right-hand side.

(3.) As before, we don't need to update the segment ID or residue numbers, so go ahead and click next again.

(4.) Moreover, we don't need to add anything exotic, so click next.

(5.) Here, CHARMM-GUI allows us to download our files so far (as was the case for implicit water), but instead let's continue utilizing the web server so that it generates more content for us. You can choose varying box sizes (in angstroms) and electrolyte concentrations here, but let's stick with the default values for now. Click next. (Note that because CHARMM-GUI does a lot of the initial heavy lifting server-side, it may take some time to progress to the next step. Tip: using CHARMM-GUI at 'unpopular' times can significantly speed up computation from one step to another.)

(6.) Eventually you will get to the next step, where you can (once again) use the default values for invoking periodic boundary conditions. Click next here. (You may start to see CHARMM-GUI slow down right about now. This is normal.)

(7.) Now, unselect all of the output formats, except for 'CHARMM/OpenMM'. Keep the equilibration ensemble set to canonical (NVT), and the dynamics ensemble set to NPT. This first equilibrates our water/protein system at constant volume. Afterwards, we will allow our box volume to relax, while a constant pressure (of 1 bar) is maintained in all directions. Click next.

(8.) Finally, when we arrive at the final step from CHARMM-GUI, we can download our zipped files in *.tgz format (on the right-hand side). This archive should contain all of the files you've generated so far.

(9.) Unzip the *.tgz file, and take a look at some of the files. In particular, load step3_pbcsetup.pdb into VMD by typing 'vmd step3_pbcsetup.pdb'. Notice the abundance of discrete water molecules, and the multiple ways you can interact with your pdb file (by changing, e.g., textures/color/viewpoints/spotlights/etc.).

(10.) Now we need to upload these files to Knot. As before, navigate to the directory where your tgz file is located, and type 'scp -P 43210 charmm-gui.tgz yourusername@knot.cnsi.ucsb.edu:/home/yourusername/yourlastname/explicit' if you're running OSX or Linux. Similarly, use WinSCP if you're running in Windows. (I'm assuming that you created a separate 'explicit' directory for this part of the tutorial.)

(11.) Log into Knot (in your terminal) by typing 'ssh yourusername@knot.cnsi.ucsb.edu -p 43210'. Or in PuTTY, select the Knot profile and connect. Navigate to the 'yourlastname/explicit' directory, where you just uploaded your files.

(12.) Unzip/untar your file by typing 'tar -zxvf charmm-gui.tgz' (or whatever name you chose for your tgz file). Then navigate to the directory that you just created (most likely also named 'charmm-gui', hence the importance of using a separate directory). CHARMM-GUI generated the appropriate run input (.inp) files for us, but we still need to run the actual simulations on a distributed supercomputing cluster (like Knot).

(13.) Edit step4_equilibration.inp (using 'nano' or 'vi'). Insert 'BOMLEV -6' after the initial header (marked by asterisks). This allows us to proceed even if a certain number of errors are encountered (which only come up here because we are using a development version of CHARMM). In general though, it is not advisable to artificially ignore errors!
Additionally, insert 'open write unit 13 file name trajectory.dcd' immediately after 'open write unit 12 card name step4_equilibration.rst'. This will open a trajectory file that we will subsequently write to. Then, substitute 'iuncrd 13' in place of 'iuncrd -1', and 'nsavc 1000' for 'nsavc 0'. This directs the coordinates to be written to the trajectory file (on unit 13) that we just declared, at a frequency of 1000*dt, or every 1 ps (in this exercise, dt = 1 femtosecond).

(14.) To run this simulation, copy /home/zlevine/submit to your CHARMM directory. Edit the file as you did with implicit water, and make sure that the CHARMM_INPUT variable is set to 'step4_equilibration.inp'.

(15.) Submit your job by typing 'qsub submit'. You can check the status of your job by typing 'qstat -u yourusername'. Jobs can sometimes take time to queue and subsequently run, so you may have to wait 30-60 minutes for this step. However, because the job is queued, you can log out and check the status of your jobs at any time without losing your progress.

(16.) When your job is finished, it should have produced a trajectory.dcd file. This file contains the time evolution of your simulation for 25 ps.

(17.) Try downloading your trajectory file (and initial pdb file, 'step3_pbcsetup.pdb') onto your personal computer. Then open the trajectory locally by typing 'vmd step3_pbcsetup.pdb -dcd trajectory.dcd'. You should be able to move the slider in the VMD control panel, and see the various water/protein structures move around. Notice the thermal fluctuations of individual water molecules.

(18.) Back on Knot, copy the 'write_pdb.inp' file from the implicit example, and add the 'BOMLEV -6' line after the initial header. Now, see if you can modify this script to extract 25 ps of information (i.e. 25 frames) from trajectory.dcd. Note that you will have to change some input file names that are hard-coded into the .inp file.

(19.) Run the input file by typing 'charmm < write_pdb.inp' to extract pdb files every 1 ps. You should have 25 files in total.

(20.) Copy /home/zlevine/analysis.pl and run it (as before) on the resulting pdb files. This will, once again, produce values for Ree and Rg.

(21.) How do the values for Ree and Rg in explicit water compare with those derived from implicit water models?
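One way to approach the comparison in step (21.) is to summarize each log by its mean and standard deviation. The sketch below is a minimal Python example and is not part of the original tutorial materials. It assumes that each log file holds one frame per line with the value of interest in the last whitespace-separated column; the actual format written by analysis.pl is not described here, so adjust the parsing if it differs. The function names are hypothetical.

```python
import statistics


def parse_log(lines):
    """Extract one float per data line, taken from the last column.

    Assumption: blank lines and lines starting with '#' carry no data.
    """
    values = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        values.append(float(fields[-1]))
    return values


def summarize(values):
    """Return (mean, sample standard deviation) for a list of values."""
    mean = statistics.mean(values)
    # A standard deviation needs at least two samples.
    spread = statistics.stdev(values) if len(values) > 1 else 0.0
    return mean, spread


def mean_difference(implicit_values, explicit_values):
    """Return the explicit-solvent mean minus the implicit-solvent mean."""
    return statistics.mean(explicit_values) - statistics.mean(implicit_values)
```

For example, the lines of the implicit-solvent Rg_log and the explicit-solvent Rg_log could each be passed through parse_log and summarize, and the two means compared with mean_difference. With only 10 and 25 frames from very short trajectories, a difference smaller than the standard deviations should not be over-interpreted.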