Introduction to Bioinformatics Software on Bio-Linux


The aim of this practical is to give you experience with a number of programs using different command line and graphical interfaces. We will begin by looking at the practicalities and advantages of running programs on the command line. By the end of this course, we hope that you will have some feel for the software pre-installed on Bio-Linux and the different ways to access it.

The main points we hope you take away with you are:

1.) If you have repetitive tasks to carry out, chances are there is a way of automating the job, at least to some extent.
2.) Web interfaces are easy, and have certain benefits, but there are other ways to access software, and sometimes they will suit your needs better.
4.) We can be contacted for help. Please mail us if you have questions or problems relating to your account or your analysis: helpdesk@envgen.nox.ac.uk

Please note: The point of this practical is not to give you a good knowledge of any program in particular, but rather to introduce you to some of the software, ways of using it, and where to find documentation and help. Before you use any of the programs to analyse your own data, we highly recommend that you read the documentation!

The programs we will be using in this part of the practical are:

    readseq      converting between different sequence formats
    remap        restriction mapping (an EMBOSS program)
    clustalw     multiple sequence alignment program (command line)
    clustalx     multiple sequence alignment program (Xwindows)
    blast        searching sequence databases with sequence data
    MSPcrunch    post blast processing (command line and web-based)
    blixem       Xwindows further post-blast processing
    jalview      Xwindows multiple sequence editor and more
    prettyplot   creating pretty versions of multiple sequence alignments (EMBOSS)

The sample sequences you need for this session are in a directory named intro_pract. Please move into this directory if you are not already there.

Interface choices - pros and cons of different interfaces

Command line

Pros
- Fast to run
- Very flexible: many options are available
- Repetitive tasks can be carried out easily and quickly

Cons
- Have to learn syntax (just need to read the documentation!)

Prompted command line

Pros
- Can get flexibility of the command line without having to type in everything

Cons
- Easy to forget the diversity of options that exist for many programs
- Slow to run compared with pure command line

Xwindows

Pros
- More intuitive than the command line (windows-like)
- Usually quite colourful
- Some programs can only be run through Xwindows (e.g. Staden and Blixem)
- Often, extensive help is readily available through the menu system

Cons
- Much slower to use than command line, especially for repetitive tasks

Web-based

Pros
- Usually very intuitive
- Some web-based programs are linked together so you can directly use the results of one program as input to the next

Cons
- Slow to use relative to the command line, especially with repetitive tasks
- Your data needs to be in an accessible location so you can either upload it, or copy/paste it into the web form
- You need to consider where and how to save the results
- Network speed affects how fast you get your results
- Security issues

The most effective way to analyse your data is often to use programs through a combination of the above interfaces, depending on your requirements. For repetitive tasks please consider using the command line, and learning to use scripting to automate what you need to do.

Some general points before you start

File naming conventions in bioinformatics

Some bioinformatics programs will present you with a default filename for the output. It is a good idea either to accept the default (depending on how sensible it is) or, if not, at least to take note of the default suffix (usually the part following a dot). E.g. the default output filename for a clustal format multiple sequence alignment might be:

    output.aln

It is common to call clustal format files something.aln. If you were outputting a multiple sequence alignment in msf format, it might be called output.msf.

You are not restricted to naming your files in any particular way, but we highly recommend that you follow the convention for the type of file you are generating or saving. Examples of naming conventions will be pointed out throughout the practical. Benefits of following the general trends include:

- you will have at least some notion of what kind of data are in a file just by looking at the name of the file
- by following the standard conventions (rather than making up your own), you will make it easier for other people looking at your files (e.g. collaborators, or people helping you) to know what is in your files just by looking at the name.

Following file naming conventions will save you a lot of time!

Naming files and the danger of over-writing previous results

Many programs will suggest a name for your results file. Sometimes this name is generated by taking the beginning of the name of your input file and adding a new suffix. However, sometimes it is just a generic name like prettyplot.ps or clustalw.aln. We encourage you to change generic names as soon as you can. Apart from the fact that filenames like prettyplot.ps give you little idea what data you actually analysed, if you do not change the name, the next time a file of the same name is generated, you will overwrite your previous results.
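As a minimal sketch of the renaming advice above, the following shell lines give a generically named output file a descriptive name, but only if that name is not already taken. The names clustalw.aln and cd4_family.aln are made up for illustration, and the printf line merely stands in for a program that has written a generic output file.

```sh
# Work in a scratch directory so that no real data are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for a program that writes a generically named results file.
printf 'CLUSTAL W multiple sequence alignment\n' > clustalw.aln

# Hypothetical descriptive name: what was analysed, plus the conventional suffix.
newname=cd4_family.aln

if [ -e "$newname" ]; then
    # Never silently replace earlier results.
    echo "Refusing to overwrite existing file: $newname"
else
    mv clustalw.aln "$newname"
    echo "Saved results as $newname"
fi
```

Renaming straight after each run means the next run of the same program cannot silently replace the earlier results.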

Sequence formats

A simple thing that often trips people up is sequence formats. You can think of a sequence format as how the sequence looks on the page, or on the screen, as well as how it is stored. Sequences are stored in text or binary formats. Text formats are human readable; binary formats are not human readable (for most humans).

Examples of text formats: embl, plain/staden, msf, genbank, clustal, gcg (seq format), fasta, phylip

The reasons there are different sequence formats are both historical and functional. When people first started writing biological analysis programs, they would design a format that their program would understand. As time went on, numerous formats came into existence. We live with the legacy of this: we must be aware of what format a sequence is in, and whether the program we want to run understands it.

Functionally, some programs require information that can be handled by some formats, but not others. For example, embl format files can contain lots of descriptive information about a sequence, whereas plain format contains none, and fasta format can contain only a small amount. Clustal and msf formats can handle multiple sequences that are aligned, and phylip format files can contain information relevant to phylogenetic analysis programs.

To be able to analyse data, it must be presented to the analysis program in a format it can understand, and must be appropriate for the analysis you are performing. This seems obvious, but frequent errors (or worse, meaningless results) occur when the data entered into a program is not appropriate.

Note: EMBOSS is a large, and recommended, package of programs for sequence analysis. EMBOSS programs accept data in many different formats, and sequence format conversion is rarely required when using EMBOSS programs.

A common problem: what is a text file and what is not

Word documents may look like text, but they aren't! The letters you see on the page of a Word document (or Word Perfect, or most word processing programs) are actually stored in what is known as binary format. Most sequence analysis programs expect text. Plain old, nothing fancy, text. It is an unusual situation to need to use sequence data that has been stored as a Word document (if it is not unusual to you, please ask a demonstrator as you may be doing things the hard way!). To get a text document when using Word, save it as text only.

Please note: If you are using Word at all in any part of the process of your bioinformatics analysis, you are probably doing things the hard way! Please contact us for alternatives!

Converting between different sequence formats

There are a number of programs available to convert one sequence format to another. One of the most versatile is readseq. Readseq allows conversion between many different formats, both for files containing single sequences, and those containing more than one sequence.

Exercise: Converting sequences from embl to fasta format

First look at testseq1.embl. Notice the type of information held in this file.

    less testseq1.embl

Quit the command less by typing q.

Now, type readseq on the command line. You will be prompted for all additional required information.

Read the prompts carefully! Readseq asks for the name of an output file before the input file! You don't want to end up overwriting your original data!

Give the output filename as:

    testseq1.tfa

As long as your sequence is in a recognised format, readseq will understand it. You will need to specify what output format you want when you are prompted. In this case, it will be number 8, Fasta.

When prompted, give the input filename as testseq1.embl

You will now again be presented with the prompt

    Name an input sequence or option:

This gives you the opportunity to specify other filenames (if you were creating a multiple sequence file) or other formatting options. We don't need to do this, so just press the return key.

Now list your files by typing ls. You should see one called testseq1.tfa. Look at it using the command less:

    less testseq1.tfa

Compare this fasta-formatted version to the embl-formatted version, testseq1.embl.

Now do the same thing for testseq2.embl and testseq3.embl.

Bring up the help for readseq by typing

    readseq -h

Can you see how you might be able to do this type of conversion in a single command given on the command line? Try using the full command line to convert testseq4.embl to fasta format. Do it again for testseq5.embl.

Exercise: Sequence format conversion and multiple sequence files

Multiple sequence files, that is, files containing more than one sequence, are often used for input in multiple sequence alignment programs, and for carrying out repetitive analysis.

Look again at some of the options available in readseq:

    readseq -h

Notice the option -all. If we include this on the command line, it indicates to readseq that we want it to take all the sequences we name, reformat them, and place the output of all of them into a single file.

Try creating a multiple fasta sequence file called testseqs_all.tfa that contains sequences testseq1.embl, testseq2.embl, testseq3.embl and testseq4.embl.

If you succeeded, all the sequences in the output file, testseqs_all.tfa, will be in fasta format. Note that the input sequences do not all have to be in the same format as each other; they need only be in a format recognised by readseq.

There are many ways of doing the above, and many are faster than the way described here. We cannot describe all the possible methods here, but if you already had all your files in the appropriate format (here we are using fasta format) and you wanted to create a file containing all your sequences, you could use the command:

    cat testseq[1-4].tfa > testseq_all2.tfa

Or, if you wanted to add extra information to a file that already exists, you could use the command:

    cat testseq5.tfa >> testseq_all2.tfa

The message: If you suspect there may be a more efficient way to do what you are doing, there probably is! Please email helpdesk@envgen.nox.ac.uk and ask us if we know of programs or options that might help you.

Running programs via the command line

Most programs can be run by typing everything the program needs to know on the command line and pressing the return key. Usually this resembles the way you give Linux/Unix commands: you enter the command followed by flags or arguments specifying how you want the program to run. Some programs will prompt you for the information they need if you have not entered everything required on the command line.

Even in the case of programs that will prompt you for information, there are good reasons to provide all of the information directly on the command line, rather than in response to the program prompts:

- Many programs can be tailored to run according to your needs, but when a program prompts you for input, it usually only prompts you for information that is absolutely necessary in order to run. There may be many useful options that you miss out on if you only answer the prompts. That is, if you leave out any information that is not vital, but that affects the way you want the program to act, it will run fine, but not as you wanted.
- If you have repetitive tasks to carry out (e.g. say you have 100 sequences to analyse in a particular way), the easiest (i.e. fastest) method to set this up involves using the full command line and automating the task.

To illustrate these points, we will be using an EMBOSS program called remap. This program looks for restriction sites in nucleotide sequences and is also useful for looking for open reading frames and translations in any or all 6 frames.
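Returning to the cat commands shown earlier in this section, a quick count of fasta header lines is one way to check that a combined file holds what you expect. The sketch below is illustrative only: it builds three tiny stand-in fasta files (demo1.tfa to demo3.tfa, with names and sequences invented for this example), concatenates two of them with >, appends the third with >>, and counts the sequences with grep.

```sh
# Work in a scratch directory so that no real data are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Three tiny stand-in fasta files (invented data).
printf '>demo1\nACGTACGT\n' > demo1.tfa
printf '>demo2\nTTGACCAA\n' > demo2.tfa
printf '>demo3\nGGCCTTAA\n' > demo3.tfa

# '>' creates the output file, or overwrites it if it already exists.
cat demo[1-2].tfa > demo_all.tfa

# '>>' appends to the end of an existing file instead of overwriting it.
cat demo3.tfa >> demo_all.tfa

# Every fasta entry starts with '>', so counting those lines counts sequences.
nseq=$(grep -c '^>' demo_all.tfa)
echo "demo_all.tfa contains $nseq sequences"
```

The difference between > and >> matters here: using > for the second cat command would have replaced the first two sequences rather than adding to them.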

Note: There are necessary setup steps required to get certain EMBOSS applications to work. Remap is one of them. Remap relies on information from a restriction enzyme database called Rebase. Information is available on our website explaining how to get Rebase and get EMBOSS applications to interact with it.

Exercise: Running remap with the prompted and the complete command line

Using the prompted command line

We will start by running remap as simply as possible. Just type:

    remap

The sequence you want to run the program on is testseq1.tfa. Choose the default answers to all questions.

When your analysis has run you will find the results in a file called testseq2.remap. Use the command less to look at this file:

    less testseq2.remap

You should see your sequence, with restriction sites marked out along it, and a six frame translation shown below. Keep pressing the space bar until you near the bottom of the file. Here you should find three lists: one of enzymes that did cut your sequence, one of enzymes that did not cut, and the number of enzymes that did not match your criteria.

Specifying everything on the command line

There are lots of useful remap options available. These can be accessed by using the full command line. To find out what they are, you can go to the web page documentation for remap, or you can type:

    remap -help

This time, let's try running remap and specify that we do not wish to see any ORFs that are less than 20 amino acids long, and that we wish to list only 6-cutter enzymes. Try looking at the documentation and see if you can figure out what the command line would be yourself. In case you had problems, the following should work:

    remap -orfminsize=20 -sitelen=6

Notice that you are still prompted for the necessary information you didn't provide on the command line. Here this includes what enzyme set to choose from, the name of the input file and the name of the output file.

By providing all the necessary information on the command line, you can speed up your analysis, and take the first step towards automating your task should you have to run this mapping many times.

Try this command. Decide what you think it should be doing before looking at the results file.

    remap -orfminsize=20 -outfile=testseq3remap.html -highlight= blue -sitelen=6 -html -enzymes=all testseq3.tfa

Notes:

- Type the above command all on one line.
- EMBOSS programs allow you to use the above syntax, with an = sign between parameters and values, or spaces. For example, -orfminsize 20 is acceptable. What syntax is accepted by a program depends on how it has been written. For example, readseq is not so flexible in what it accepts.
- Notice that we gave the output file the suffix .html. This is because we asked for an html output file from remap, and most browsers require the .html suffix to recognize that the page you are trying to view is an html document.

Take a look at the results of the above command by opening a web browser and loading the file testseq3remap.html.

The command line truly comes into its own when you need to run an analysis over and over again. For example, what if I had 100 sequences I wanted to map in exactly the same way? The prompted command line would become tedious very quickly. The full command line would too, although it would be faster.

A first foray into automation: the foreach loop

Foreach loops allow you to say to the computer: "Foreach thing in this list, do the following:"

So, when running a restriction mapping analysis, you might want to do something like: "Foreach sequence in my list, run the program remap to look for the enzymes that have recognition sites of at least six bases and that cut a minimum of twice."

Mapping 100 sequences with a foreach loop would take only fractionally more time than mapping a single sequence, and requires practically no extra effort on your part.
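foreach is a feature of csh-style shells such as tcsh. For readers whose shell is bash or another sh-style shell, the same idea is written as a for loop. The sketch below is an illustration under stated assumptions, not part of the practical: it creates empty stand-in files, and it only prints the remap command it would run for each one (a dry run) instead of running remap itself.

```sh
# Work in a scratch directory so that no real data are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Empty stand-in files so the loop has something to iterate over.
touch testseq1.tfa testseq2.tfa testseq3.tfa

count=0
# "For each file matching the pattern, do the following:"
for i in testseq[1-3].tfa; do
    # Dry run: print the command instead of executing it.
    echo "remap -outfile=$i.remap -enzymes=all -sitelen=6 -mincuts=2 $i"
    count=$((count + 1))
done
echo "Prepared $count remap commands"
```

Printing the commands first is a cheap way to check a loop before letting it loose on 100 sequences; removing the echo and the surrounding quotes would run the commands for real.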

Since the general idea is to get the computer to read a list, and run an analysis on each item in that list, you need to generate a list of the sequences you want analysed. We will go through one example here.

Please note: If you have no prior experience with Linux or Unix, you may find it a challenge to set up your own foreach loops the first time. Please contact us at helpdesk@envgen.nox.ac.uk if you have problems setting up your own foreach loops.

Exercise: Looping through multiple restriction mapping analyses

Type the following on the command line:

    ls testseq[1-5].tfa

You should see 5 sequences listed:

    testseq1.tfa testseq2.tfa testseq3.tfa testseq4.tfa testseq5.tfa

If we wish to map these five sequences, we know that the command ls testseq[1-5].tfa will list our sequences of interest. We can put this list into a foreach loop by stating

    foreach i ( `ls testseq[1-5].tfa` )

Here there are several things to note:

- we have used the command foreach
- the i means "each thing": for each thing in the list, i takes the value of that thing (in this case, a sequence name)
- the information in the brackets is the list of sequences you want to work on
- the quotation marks around the ls testseq[1-5].tfa command are backquotes, and the computer understands this to mean "take the results of the command inside these backquotes"
- the brackets are important: the computer needs them to understand what you want

So the overall effect of that one line is: foreach thing in the list that can be generated using the command ls testseq[1-5].tfa, do the following:

If you have typed the foreach line in, you will now be seeing something like:

    jbloggs@machine [demo] foreach i ( `ls testseq[1-5].tfa` )
    foreach>

The foreach> is a prompt: we need to tell the computer what we want it to do with each item in the list. To do this, type:

    remap -outfile=$i.remap -enzymes=all -sitelen=6 -mincuts=2 $i

and press the return key. Each $i in that command will be replaced by the name of a sequence file from the list, and the remap command is executed for each one.

You will now see another foreach> prompt. Type

    end

to let the computer know that you have told it all it needs to do for each item in the list.

Now type:

    ls -l *.remap

You should see that you have run the mapping analysis on all 5 sequences (the results files will be called testseq#.tfa.remap). As you can see, running this analysis on 100, or 1000, sequences would be relatively painless if you did it using a foreach loop.

Note: Sometimes programs (like remap) send information to the screen that you don't want. One way to get rid of it while carrying out a foreach loop as above is to send this screen output to a garbage area (aka a bit-bucket). The information you see in the case of remap is being sent from the program to STDERR (which, in this case, is your screen). By writing 2>/dev/null at the end of the remap command, you are saying to send anything sent to STDERR (2) to a garbage area (/dev/null).

Try replacing the line:

    remap -outfile=$i.remap -enzymes=all -sitelen=6 -mincuts=2 $i

in your foreach loop with:

    remap -outfile=$i.remap -enzymes=all -sitelen=6 -mincuts=2 $i 2>/dev/null

This may seem trivial at this point, but could be important if you start writing shell scripts or cron jobs where information is being sent to STDERR.

Nicing, aka being a considerate user!

If you are running a computationally intensive job (e.g. when you search databases, or run large alignments), you should consider being polite to other users of your system by setting your jobs to work at a low priority. The priority given to your jobs is referred to as the nice level.

We won't be nicing any jobs today, but for the sake of all the other users of your Bio-Linux machine, please read the documentation on nice:

    man nice

To nice a job you are about to run, use:

    nice -n level command

On Linux, levels range from -20 (highest priority) to 19 (lowest priority). For example, to nice a program called someprog.pl to level 15 (a low-ish priority), you could type:

    nice -n 15 someprog.pl

You can also move a running program to a lower priority using the command renice.

Note: You may have to give the full path of the command you wish to run when using nice, rather than just the short name.

There are other facilities, such as queuing and load balancing systems, which are more sophisticated than just nicing a job, but nice is simple, built-in, and effective for machines with a very small number of users.

Running programs with graphical interfaces

Your Bio-Linux machine runs X windows, which means that you can run programs with graphical interfaces either by working on the console, or by working remotely. Your system administrator should be able to help you if you do not know how to run graphical programs on Bio-Linux remotely.

Programs you can run using graphical interfaces include clustalx, blixem, jalview, and Staden, among others.

Exercise: Running clustalx

Clustalx is a multiple alignment program. (A command line version called clustalw is also available.) To start up clustalx, just type the name of the program on the command line:

    clustalx &

The & allows us to work in the clustalx window and also to continue working in our original terminal window. (Nothing bad happens if you forget the &.) Alternatively, you can start up clustalx by choosing the icon under the Bioinformatics drop-down menu.

You should now see a new window appear with the title ClustalX. There are a number of drop-down menus available within the clustalx program: File, Edit, Alignment, Trees, Colors, Quality, and Help. Click on each of them and see what choices you are presented with. Any choice in grey text is not available to you at this moment; any choice in black text is.

Please note: we have just discovered that the help menus for clustalx in Bio-Linux 2.0 are not working. There is an easy (though not completely elegant) fix for this. Please see our bioinformatics software FAQ.

We have several options for loading sequences into clustalx. Some of these are:

- we can load all the sequences at once using a multiple sequence fasta format file (e.g. testseqs_all.tfa)
- we can add the sequences one at a time by loading the first sequence using the menu option Load Sequences, and adding all subsequent sequences using the Append Sequences option
- we can do a mixture of both the above

Try loading testseqs_all.tfa into clustalx: choose the Load Sequences option, and choose testseqs_all.tfa from the menu. Sequences should appear in the ClustalX window. Please ask a demonstrator if they do not.

To add a single sequence to those already loaded: choose the Append Sequences option and choose testseq5.tfa. This sequence should now be visible in your ClustalX window.

In order to carry out an alignment, you need to highlight the names of the sequences you want aligned, and click on the Alignment menu. It is not part of the course today, but we highly advise you to click on the Alignment Parameters and the Output Format Options choices and see what is available! For now, just click on Do Complete Alignment under the Alignment menu. You will be presented with output file names that you can change if you like. Click on the button marked Align. At the bottom of the clustalx window, you should see text describing the progression of the alignment.

To keep this alignment, you need to save it. It can be saved in a text format so that it can be used in other programs, or you can save the output much as it looks in the ClustalX window: a coloured alignment. The latter is like a picture and cannot be used for further analysis.

To save the alignment in text form, click on the Save Sequences As choice under the File menu. You will be presented with choices as to what format to save your alignment in, which portion of the alignment to save, and what to call the output file. Take note of what the output file will be called, and then click on OK.

Go back to your main window and type ls at the prompt. You should now see the file you just created.

Try to create a colour version of the alignment and name it testseqs_all.ps. To view this postscript format file, you can use a command called ghostview:

    ghostview testseqs_all.ps

Quit clustalx by choosing Quit from under the File menu.

Exercise: Fetching sequences using SRS at the EBI

The majority of this exercise will be carried out via concurrent demonstration.

Exercise: Running artemis

Artemis is a program for viewing and annotating DNA sequences. It is an Xwindows program. You should have a file in your account called hsy14768.embl. Start artemis by typing

    artemis

Now choose the option Open from under the File menu, and select the file you just saved: hsy14768.embl. You may get an error; just hit the OK button. This should open up a large window where this sequence will be displayed graphically.

In another xterm window, you may like to view the actual text of the entry using the command less:

    less hsy14768.embl

Notice how Artemis has essentially transformed this text information into a picture. For more information about how to work with Artemis, please refer to the Artemis web page.

Explore the options available to you. Not all options will be functional. If you need them, you will need to ask your system administrator to set up some of them on your local Bio-Linux machine.

Blast from step one onwards

Running blast searches is one of the most frequent tasks in bioinformatics. Running blast searches locally, and learning how to automate this task, will greatly add to your efficiency.

To search a database with blast locally, you need to get that database and then make a copy in a format that blast can read. That is, you must index the database for blast.

There are two main types of blast: NCBI's blast (aka blastall) and Washington University blast (aka wu-blast). Both do a good job, but they work slightly differently (under the hood), and can produce different results in some cases. In addition, wu-blast offers some features NCBI blast does not. Academic licenses for wu-blast are free and can be obtained by emailing licensing@blast.wustl.edu. NCBI blast (blastall) comes pre-installed on your Bio-Linux system.

We are going to download the peptide database swissprot, format it for searches with blast, and then carry out some blast searches against this formatted database.

Please note that where we put the blast database during this course is not the recommended location! Please ask your system administrator to put blast databases in the location /home/db/blastdb OR to change the environmental variable BLASTDB, set in the file /usr/software/bioenvrc, to the appropriate location. We recommend that you store all databases somewhere under /home, and preferably somewhere under /home/db.

Exercise: Running blast, from database formatting to database searching

Step 1: get the database onto the machine

The majority of sequence and sequence-related databases are disseminated as flatfiles, and there are a number of ways you can get hold of such databases. The most common is to ftp them from a central repository like the EBI, Expasy or the NCBI.

We will download the entire swissprot database from the EBI using ftp and the command wget.

Create a directory to store your database in and move into that directory:

    mkdir blastdb
    cd blastdb

Download the appropriate file (the fasta file!) from the ftp site:

    wget ftp://ftp.ebi.ac.uk/pub/databases/sp_tr_nrdb/fasta/sprot.fas.gz

Uncompress the database:

    gunzip sprot.fas.gz

To format the database for use with blastall, you need to use a program called formatdb. Documentation for this program can be found by clicking on the desktop folder called Bioinformatics Software Manuals, then the sub folder db_search_docs, then the folder BLAST_docs.

A reasonable command to run to format the above database, creating a blast version with the name swissprot, would be:

    formatdb -i sprot.fas -p T -o T -n swissprot

Run the above command and look at the list of files created:

    ls -l

It is worth looking at the file formatdb.log after you create a blast database.

Now move back to your original directory:

    cd ..

Blastall needs to know where to find the database you want to search. You can tell it by giving the full path to the database, or by defining an environmental variable $BLASTDB as the directory where your blast database is. We will give the full path during this practical.

There are many command line options available for blastall, and we HIGHLY recommend you read the documentation for this program! Understanding how this program works and how it can be used will aid you greatly in searching databases effectively and understanding what your blast results really mean.

Blast comes in a number of different flavours, which carry out different types of searches.

    FLAVOUR   SEARCH SEQUENCE TYPE                                DATABASE SEQUENCE TYPE
    blastn    nucleotide                                          nucleotide
    blastp    peptide                                             peptide
    blastx    nucleotide (6 frame conceptual translation of)      peptide
    tblastn   peptide                                             nucleotide (6 frame conceptual translation of)
    tblastx   nucleotide (6 frame conceptual translation of)      nucleotide (6 frame conceptual translation of)

It is beyond the scope of this course to cover the details of blast searching. We will just run a basic blastp search, and then we'll use a foreach loop to run 5 blastx searches.
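The $BLASTDB alternative mentioned above can be sketched as follows for an sh-style shell such as bash. This is a minimal illustration, not a recommended setup: the directory used here is a temporary stand-in, and on a real system it would be wherever the formatted database lives. In csh or tcsh, the equivalent of the two assignment lines is setenv BLASTDB followed by the directory.

```sh
# Stand-in for the directory that holds the formatted blast database.
dbdir=$(mktemp -d)

# Set the variable and export it, so that programs started from this
# shell can see it.
BLASTDB=$dbdir
export BLASTDB

# Check that the variable is set and points at a real directory.
if [ -d "$BLASTDB" ]; then
    echo "BLASTDB is set to $BLASTDB"
else
    echo "BLASTDB does not point to a directory" >&2
fi
```

A variable set like this lasts only for the current shell session, which is why the practical suggests asking your system administrator to set it centrally.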

A simple blastp search

    blastall -p blastp -d blastdb/swissprot -i cd4_cerae.tfa -e 0.01 -o cd4_cerae.blastp

This means: run blastall, using the flavour (-p) blastp. The database (-d) to be searched is called swissprot and can be found in the blastdb directory. The input sequence (-i) is cd4_cerae.tfa. I only want to see results for sequences with e-values (-e) better than (i.e. lower than) 0.01, and I want the results of this search (-o) to be sent to the file cd4_cerae.blastp. Please look at the results file.

Because you used the -o option when you created your blast database, you can use the fastacmd program to retrieve any sequences you are interested in, using the sequence id you find in the blast report. E.g. you could retrieve the sequence with id COAD_BPFD by typing:

    fastacmd -d blastdb/swissprot -s COAD_BPFD

Notice that this finds the sequence and returns the results to the screen. If you wish to keep this sequence, you need to capture the output. You can do this using the > redirect symbol, which sends the output to a file. For example:

    fastacmd -d blastdb/swissprot -s COAD_BPFD > coad_bpfd.tfa

The output is now stored in a file called coad_bpfd.tfa. You can look at this file using cat, more, or less.

Try a blastx search:

    blastall -p blastx -d blastdb/swissprot -i unknown.tfa -e 1 -o unknown.blastx

This may have seemed a lot of work when you could have just gone to a web site to do it! There are many reasons to choose to blast locally, including configurability, speed, security, and being able to automate your jobs.

Five blastx searches, one command

Remember the foreach loop? Try to set up a foreach loop to run blastx searches of testseq1.tfa through to testseq5.tfa against the swissprot database. The answer is given below, but try it yourself first and see how it goes.

    foreach i (`ls testseq[1-5].tfa`)
        blastall -p blastx -d blastdb/swissprot -i $i -e <e-value> -o $i.blast
    end

Replace <e-value> with the cutoff you want; the single blastx search above used 1.

If you type

    ls *blast

you should see the blast reports you have just generated listed. You could read through the testseq*.blast files by using the command less:

    less testseq*.blast

When you get to the end of one document (or just want to go to the next document), type

    :n

If you want to quit, type

    q
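For readers working in sh or bash rather than csh/tcsh, the foreach loop shown earlier might be written as the for loop sketched below. This is not the handout's own answer, only an equivalent under stated assumptions: the e-value cutoff EVALUE=1 is borrowed from the single blastx search, and run_blastx and DRY_RUN are hypothetical names introduced here. With DRY_RUN=1 the commands are only printed, which is a cheap way to check the loop before launching real searches.

```shell
#!/bin/sh
# Assumed e-value cutoff, borrowed from the single blastx search.
EVALUE=1
# 1 = print each command without running it; set to 0 to run the searches.
DRY_RUN=1

# run_blastx QUERY: search QUERY against blastdb/swissprot with blastx and
# write the report to QUERY.blast (the same naming as the foreach loop).
run_blastx() {
    query=$1
    if [ "$DRY_RUN" = 1 ]; then
        echo "blastall -p blastx -d blastdb/swissprot -i $query -e $EVALUE -o $query.blast"
    else
        blastall -p blastx -d blastdb/swissprot -i "$query" -e "$EVALUE" -o "$query.blast"
    fi
}

for f in testseq[1-5].tfa; do
    [ -e "$f" ] || continue   # no matching files: the pattern is left unexpanded
    run_blastx "$f"
done
```

Keeping the output name as QUERY.blast means the ls *blast and less testseq*.blast commands work unchanged on the reports.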

MSPcrunch and Blixem

MSPcrunch is a program that processes the output of blast searches. It filters that output with the aim of improving your chances of finding new, biologically significant matches: it reduces the display of redundant matches while retaining the best of the remaining ones. There are many command line options for this command. You can list these by typing:

    mspcrunch -h

Exercise

In our case, we will be using MSPcrunch specifically to convert our blast output into a format readable by the program Blixem. This means we need to use the -q option. Try the following:

    mspcrunch -q unknown.blastx > mspcrunch.out

This makes a file called mspcrunch.out, which we will feed into Blixem.

Blixem stands for BLast matches In an X-windows Embedded Multiple alignment and is an interactive browser of pairwise Blast matches. The alignment that is produced is thus not a true multiple alignment, such as that produced by e.g. Clustalw, but a one-to-many alignment (all sequences are aligned to your original search sequence). There are many viewing options in Blixem and it is worth getting to know this program.

Running Blixem:

Make sure you have saved your MSPcrunch results to the same directory as your sequence files. To run Blixem, you need to give the program your MSPcrunch results AND the file containing the sequence you did the blast search with. Blixem accepts only fasta formatted sequence. On the command line, type:

    blixem unknown.tfa mspcrunch.out &

You may have to wait a few seconds for Blixem to start up.

Place your cursor on the blue box in the top section, click on the middle mouse button and drag the box to the left or right. What happens to the sequences in the bottom section?

There are many menus available if you click with your right mouse button. Place the cursor over different areas of the screen and click with the right mouse button. Choose some of the menu options and observe their effects.

A particularly good program available through Blixem (and also directly on the command line) is Dotter. Try opening the option Dotter query vs. itself when you see it in a sub-menu.
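The MSPcrunch and Blixem steps above can be chained in a small sh script, sketched here under the assumption that unknown.tfa and unknown.blastx are in the current directory. The helper missing_inputs is a hypothetical name introduced for illustration; it simply lists any input file that cannot be read, so that Blixem is not started with incomplete data.

```shell
#!/bin/sh
# missing_inputs FILE...: print the name of every FILE that is not readable.
missing_inputs() {
    for f in "$@"; do
        [ -r "$f" ] || echo "$f"
    done
}

# Blixem needs both the query sequence and the MSPcrunch-converted blast report.
missing=$(missing_inputs unknown.tfa unknown.blastx)
if [ -n "$missing" ]; then
    echo "cannot start Blixem, missing input: $missing"
else
    # Convert the blast report (-q, as in the exercise), then open the viewer.
    mspcrunch -q unknown.blastx > mspcrunch.out
    blixem unknown.tfa mspcrunch.out &
fi
```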

Warning

Many of the best features of Blixem rely on being able to fetch sequences from a database. For instance, when you double click on a sequence name, the full sequence should pop up, and when you right click and choose the program Dotter, it should fetch the sequence of interest and show you a graphical pairwise alignment of your query sequence against it. Taking advantage of these features requires two steps:

1.) Change the default search program used by Blixem (WWW-efetch) to efetch (simple).
2.) Write a script called efetch to get sequences from databases you hold on your machine (often not as simple). The script efetch must be on your PATH.

A simple example is given on our Bioinformatics FAQ page. If you want to try this out, feel free, and ask a demonstrator if you have problems.

Quit Blixem by right clicking over the window and choosing the option Quit.

Further help and explanation about Blixem is available at:

Running Jalview

Jalview is a versatile program that allows you to do multiple sequence alignments, view and edit alignments, carry out a restricted amount of phylogenetic analysis, view trees, etc.

The only bothersome thing about running jalview from the command line (as opposed to when it is offered as a web-based application) is that the command line is ugly. You have to give both the name of the sequence file you want to view and the format it is in. The formats allowed are MSF, CLUSTAL, FASTA, BLC, MSP or PIR.

Exercise

We will load the multiple sequence fasta file capsall.tfa into jalview. Type the following:

    jalview capsall.tfa FASTA

You can run a clustal alignment within Jalview by going to the Align menu and choosing Local Alignment. Try this now. This is a big file, so it may take a little time.

A new window with the aligned sequences should appear eventually. If it does not have the same colouring scheme as before, go to the menu Colour and choose Clustal colours.

Please note: Clustalw run in this way runs with default options. It is a better idea to run clustalw or clustalx and then load the aligned file into Jalview, so that you have full control over the parameters used when creating your alignment.

Try out some of the other options available to you under the menus.

Running Prettyplot

Prettyplot is a program to generate pretty versions of multiple sequence alignments. By now you have generated a number of alignment files and are hopefully fairly comfortable with finding information about programs and trying out the options available to you. Try looking at the prettyplot documentation by typing

    prettyplot -h

or referring to the web documentation at

Try running this program on some of your alignments, choosing various display options.


More information

ENCM 339 Fall 2017: Editing and Running Programs in the Lab

ENCM 339 Fall 2017: Editing and Running Programs in the Lab page 1 of 8 ENCM 339 Fall 2017: Editing and Running Programs in the Lab Steve Norman Department of Electrical & Computer Engineering University of Calgary September 2017 Introduction This document is a

More information

Imagery International website manual

Imagery International website manual Imagery International website manual Prepared for: Imagery International Prepared by: Jenn de la Fuente Rosebud Designs http://www.jrosebud.com/designs designs@jrosebud.com 916.538.2133 A brief introduction

More information

Using Dreamweaver CS6

Using Dreamweaver CS6 3 Now that you should know some basic HTML, it s time to get in to using the general editing features of Dreamweaver. In this section we ll create a basic website for a small business. We ll start by looking

More information

Lab #2 Physics 91SI Spring 2013

Lab #2 Physics 91SI Spring 2013 Lab #2 Physics 91SI Spring 2013 Objective: Some more experience with advanced UNIX concepts, such as redirecting and piping. You will also explore the usefulness of Mercurial version control and how to

More information
