Very Short Introduction Thomas Girke October, Slide / Slide / Outline Why? Complete statistical package and programming language Efficient data structures make programming very easy Ease of implementing custom functions Powerful graphics Access to fast growing number of analysis packages Most widely used language in bioinformatics Is standard for data mining and biostatistical analysis Technical advantages: free, open-source, available for all OSs Slide 3/ Slide 4/
Outline How to Get Started? Control Structures Conditional Executions Loops Basic Matrix Calculations 3 Functions 4 Useful Utilities 5 System Commands 6 Writing and Executing R Scripts Choose your favorite text editor with syntax highlighting. Stick to it for the rest of your life! 3 Suggestions: Vim Emacs Many others 4 Find a problem and solve it by writing a script! 5 Implement an algorithm on your own. 6 Never perform any repetitive computer tasks manually again. Slide 5/ Slide 6/ Outline What Is L A TEX? Originally developed in the early 98s by Leslie Lamport. L A TEX is a document markup language and document preparation system for the TeX typesetting program. Developed for mathematicians, statisticians, engineers and computer scientists. High quality of typesetting for scientific articles. Programmable environment. Many efficient cross-referencing facilities for equations, tables, figures, bibliographies, etc. Slide 7/ Slide 8/
How Does It Work? Bibtex: The Ultimate Reference Management Tool Write in your favourite text editor, e.g.: Vim or Emacs. Install L A TEX distribution for your OS Windows: MiKTeX Linux: Latex Project Site Mac OS X: TexShop L A TEX manuals (very incomplete selection) List of Manuals on Latex Project Site The Not So Short Introduction to LaTeX Getting to grips with LaTeX Packages An almost complete list: Online TeX Catalogue Sample Latex file: example.tex Convert to PDF with command: pdflatex example.tex To include references from MyBibTex.bib database, the following command sequence needs to be executed: pdflatex example.tex; bibtex example; pdflatex example.tex Slide 9/ Slide / of BibTex Citations Outline Citation in parentheses (Grant et al., 6; Schwacke et al., 3; Miteva et al., 6) Citation of Grant et al. (6); Schwacke et al. (3); Miteva et al. (6) Extended citation (Grant et al., 6, J Chem Inf Model, 46, 9-98) Footnote citation with more detail (Grant et al., 6)... The reference list for these citations appears automatically at a defined position, here the end of the document. (Grant et al., 6, J Chem Inf Model) Slide / Slide /
R s Sweave Function Integrates R with Latex Structure of *.Rnw Hybrid File Sweave provides a flexible framework for mixing Latex and R code for automatic generation of scientific documents. It does this by identifying R code chunks - starting with <<>>= and ending with @ - and replaces them with the corresponding R output in L A TEX format, e.g. commands, data objects, plots. The user organizes the hybrid code in a *.Rnw file, while the Sweave() function converts this file into a typical Latex *.tex file. A quick learning exercise: Download sample hybrid file Sweave-test-.Rnw Run in R command Sweave("Sweave-test-.Rnw") Convert generated Sweave-test-.tex to PDF with pdflatex Sweave-test-.tex Sweave User Manual Latex \Sexpr{pi} Latex Latex Latex Latex Latex Latex <<>>= @ <<>>= @ Slide 3/ Slide 4/ Convert *.Rnw to *.tex to *.pdf Code Chunk Options Create *.tex file > Sweave("mydoc.Rnw") Create R source file with code chunks (optional) > Stangle("mydoc.Rnw") 3 Gernerate PDF with bibliography > system("pdflatex mydoc.tex; bibtex mydoc; pdflatex mydoc.tex") Important options that can be included in code chunk start tag: <<>>= label: optional name for code chunk. echo: shows command if set to TRUE. Default is TRUE. fig: shows plots automatically if set to TRUE. Alternatively, one can use standard R code to save graphics to files and point to them in Latex source code. eval: if FALSE, the variables and functions in code chunk are not evaluated. prefix: if TRUE, generated file names have a common prefix. Slide 5/ Slide 6/
Printing R Commands and Output Including Tables with xtable() Beamer Slide Setting R Code Chunks Result in PDF > : \ begin {frame }[ containsverbatim ] > <<echo=true>>= > : > @ [] 3 4 5 6 7 8 9 Code Chunk Result in PDF > <<echo=false>>= > l i b r a r y ( xtable ) > xtable ( i r i s [ : 4, ] ) > @ Sepal.Length Sepal.Width Petal.Length Petal.Width Species 5. 3.5.4. setosa 4.9 3..4. setosa 3 4.7 3..3. setosa 4 4.6 3..5. setosa Slide 7/ Slide 8/ Including Graphics: Venn Diagram Code Including Graphics: Plot Venn Diagram Venn Diagram > source("http://faculty.ucr.edu/~tgirke/documents/r_biocond/my_r_scripts/overlapper.r") A > setlist5 <- list(a = sample(letters, 8), B = sample(letters, + 6), C = sample(letters, ), D = sample(letters, ), + E = sample(letters, 8)) > OLlist5 <- overlapper(setlist = setlist5, sep = "_", type = "vennsets") > counts <- sapply(ollist5$venn_list, length) > pdf("venn.pdf") > vennplot(counts = counts, ccol = c(rep(, 3), ), lcex =.5, + ccex = c(rep(.5, 5), rep(.6, 5),.5)) > dev.off() E 3 4 3 3 B C null device D Unique objects: All = 6; S = 8; S = 6; S3 = ; S4 = ; S5 = 8 Figure: Venn Diagram Slide 9/ Slide /
Bibliography I Grant, J. A., Haigh, J. A., Pickup, B. T., Nicholls, A., Sayle, R. A., Sep-Oct 6. Lingos, finite state machines, and fast similarity searching. J Chem Inf Model 46 (5), 9 98. URL http://www.hubmed.org/display.cgi?uids=69957 Miteva, M. A., Violas, S., Montes, M., Gomez, D., Tuffery, P., Villoutreix, B. O., Jul 6. FAF-Drugs: free ADME/tox filtering of compound collections. Nucleic Acids Res 34 (Web Server issue), 738 744. URL http://www.hubmed.org/display.cgi?uids=6845 Schwacke, R., Schneider, A., van der Graaff, E., Fischer, K., Catoni, E., Desimone, M., Frommer, W. B., Flügge, U. I., Kunze, R., Jan 3. ARAMEMNON, a novel database for Arabidopsis integral membrane proteins. Plant Physiol 3 (), 6 6. URL http://www.hubmed.org/display.cgi?uids=595 Slide /