MultiMap. User's Guide. A Program for Automated Genetic Linkage Mapping. version 1.1. Written by. Tara Matise. Mark Perlin Aravinda Chakravarti


MultiMap
User's Guide

A Program for Automated Genetic Linkage Mapping
version 1.1

Written by
Tara Cox Matise
Mark Perlin
Aravinda Chakravarti

Language: Common Lisp
Computer: Sun/Dec/HP...
Program for Linkage Analysis: CRI-MAP

Copyright 1993 Tara Matise and Aravinda Chakravarti

To whom correspondence should be addressed:

Tara Matise
Department of Human Genetics
A310 Crabtree Hall
University of Pittsburgh
Pittsburgh, Pennsylvania
FAX:
multimap@genome1.hgen.pitt.edu

Aravinda Chakravarti
Department of Genetics, BRB 721
Case Western Reserve University
Euclid Avenue
Cleveland, OH
FAX:
axc39@po.cwru.edu

This documentation and the MultiMap program are copyrighted. The program may be copied freely for nonprofit research uses only and may not be used for commercial purposes without the specific permission of the authors. The code may not be modified without the specific permission of the authors.

Permission was granted by Dr. Phil Green (Washington University, St. Louis) to distribute a modified version of the CRI-MAP program (version 2.41m) along with MultiMap. The changes made to CRI-MAP for compatibility with MultiMap relate only to the format of CRI-MAP output. No changes were made to the manner in which CRI-MAP performs linkage analysis. Users who wish to use CRI-MAP outside of MultiMap must obtain an original copy of CRI-MAP from Phil Green. Any questions relating to MultiMap or to MultiMap's use of CRI-MAP must be directed solely to Tara Matise or Aravinda Chakravarti (see Final Notes).

Unless you are familiar with the Lisp programming language, we strongly recommend that you read this entire documentation before using MultiMap, as the Lisp language is quite different from many others, such as Pascal, Fortran and C. In addition, if you are unfamiliar with the CRI-MAP program, we recommend that you read the CRI-MAP documentation before using MultiMap.

Contents

I.    Introduction
II.   How to Use MultiMap
      A. Format of Input Files
         1, 2. chr50.gen and chr50.dat
         3. chr50.names
         4. chr50.input
            a. An example file
            b. Creating an input file
            c. Description of input parameters
      B. Constructing Maps
         1. What MultiMap does
         2. How to run MultiMap
         3. MultiMap keywords
         4. What to do in case of problems
      C. Output Files
IV.   Additional Functions
      A. Introduction to Functions
      B. Description of Functions
         setf
         get-marker-names, get-marker-name
         get-marker-numbers, get-marker-number
         find-zero-recombs
         flips-pretty
         write-map-to-file
         get-map-from-user
         print-map
         drop-each-locus
         save-results
         reverse
         find-all-linkage-groups
         get-map-from-file
         check-all-haps
V.    Support software
         lnktocri.p
         makenames
VI.   Final Notes
VII.  Getting Ready to Use MultiMap
      A. Obtaining a Lisp interpreter
      B. Obtaining MultiMap and CRI-MAP
      C. A Brief introduction to Lisp
VIII. Obtaining a Common Lisp interpreter
IX.   Acknowledgments and References
X.    The Mapping Algorithm

I Introduction

MultiMap is designed to automate the process of genetic linkage mapping by using heuristics for map construction. The order in which markers are added to the map depends not only on statistical support for order but also on locus content. We define the content of a marker locus as a function of:

a) informativeness, as measured by heterozygosity or pairwise joint-PIC values (Chakravarti 1991) with other linked markers;
b) relative ease of genotyping (i.e., Southern blotting versus PCR-based markers);
c) marker quality score (based on background, ease of allele identification, etc.);
d) ability to be multiplexed with other markers; and
e) genetic distance between it and other closely linked markers.

Based on these criteria, markers can enter the map building procedure in a non-random manner, with the markers that have the most desirable content characteristics added to the map before others. Currently, MultiMap measures locus content only through informativeness and genetic distance (a and e above). It uses the mapping algorithms we have developed at the University of Pittsburgh (Cox 1992; Cox et al. 1992) to construct both framework and comprehensive linkage maps. We are currently preparing a manuscript that will document these algorithms as well as the MultiMap program; it will be made available to interested investigators.

MultiMap uses a modified version (ver. 2.41m) of the program CRI-MAP (Green unpublished; Lander and Green 1987) for linkage analysis. Both CRI-MAP 2.41m and MultiMap are easily distributed via FTP or e-mail. Because MultiMap is written in standard Common Lisp, it should run under any Lisp compiler, but it has only been tested with CMU Common Lisp (CMU CL; Carnegie Mellon University 1992) and LUCID Common Lisp (LUCID Inc. 1990).

MultiMap is designed to be run on CEPH-type marker data. Currently, the only reason it is restricted to CEPH-type data is that the estimation of allele frequencies, heterozygosities, PIC values and joint-PIC values depends on this type of data structure. It can be used to analyze both autosomal and X-linked data. It can be used to construct radiation hybrid maps assuming a constant retention frequency of 50% (Green 1992). We are currently completing the version of MultiMap that will also construct radiation hybrid maps at a varying retention frequency.

MultiMap can be used to construct a framework map, a comprehensive map, or both. It can first construct a framework map and then expand that map into a comprehensive map, or you can provide MultiMap with a framework map (whether it fits the true criteria of a framework map or not is irrelevant) which MultiMap will expand into a comprehensive map.

Because MultiMap is written in Lisp, it is controlled through a series of functions. There is one main function for constructing maps, and others for executing MultiMap's additional features. There are many mapping parameters over which the user has control (see section II.A.4.c). Most of these control the manner in which the maps will be constructed, while others determine which analyses will be performed (e.g., construction of a framework vs. a comprehensive map).

Two analyses that are controlled in this manner are estimation of genotype error rates and analysis of sex-specific differences in recombination. Descriptions of these methods are given in section II.A.4.c. For the estimates of genotype error, an output file is produced, in tabular format, which can be used by your favorite program to produce graphs. For analysis of sex differences in recombination, an input file is written for the program SEXDIF (Blaschak and Chakravarti, unpublished). The SEXDIF program is run separately, also produces tabular output for graphing, and is available from the same FTP site as MultiMap.

MultiMap can be run either automatically or interactively. In the automatic mode, after the initial mapping parameters have been set, MultiMap will run from start to finish with no intervention. In the interactive mode, you are consulted for your input at many stages of the map construction.

Throughout this documentation, commands which you would type are listed in bold type. These commands follow a "%" Unix prompt, or a "*" prompt if they are executed from within a Lisp interpreter (i.e., while running MultiMap). We assume you are using CMU Common Lisp.

II How to Use MultiMap

A. Format of Input and Output Files

MultiMap and CRI-MAP require that the input file names be of the form "chrW.ext", where W is an arbitrary number designating a chromosome and .ext represents various file extensions. We use the chromosome number 50 as an example. A total of 4 files are required as input to MultiMap. They are:

    1  chr50.gen    (supplied by user)
    2  chr50.dat    (created by CRI-MAP)
    3  chr50.names  (supplied by user)
    4  chr50.input  (supplied by user or created by MultiMap)

MultiMap will not run unless all 4 files are present with exactly the filename extensions specified above.

1, 2. Format of linkage data files (chr50.gen and chr50.dat)

You must supply the file "chr50.gen". This file contains the pedigree and genotype data and must be in CRI-MAP format (see the CRI-MAP documentation). As specified in the CRI-MAP documentation, you first create a ".gen" file, and then invoke the CRI-MAP "PREPARE" option, which creates the ".dat" file from the ".gen" file. To run PREPARE, execute the following command:

    % lispcri 50 prepare

where lispcri is the name of the executable file for running CRI-MAP. (You can use either ver 2.4 or 2.41m for PREPARE.)

The PREPARE option will check for instances of non-inheritance in the data file. These are printed to the screen, and you are responsible for correcting any erroneous genotypes, as these could have a negative effect on the resulting map if left unchecked (Buetow 1991; Lasher et al. 1991).

Note: The modified version of PREPARE does not create a ".par" file as the original CRI-MAP PREPARE does. MultiMap writes its own ".par" file.

Note: If you are analyzing X-linked data, you must use -1 as the dummy allele assigned as the second allele for all males.

Note: Each time you change anything in the .gen file, YOU MUST RUN PREPARE AGAIN! PREPARE will create a new .dat file corresponding to the updated .gen file. This part of mapping has not been automated because the PREPARE function alerts you to the presence of any data inconsistencies, which you may want to check before continuing with map construction.
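As a convenience outside of MultiMap itself, the four-file requirement described above can be checked before starting Lisp. The following is only a minimal sketch in Python; the function names are hypothetical and are not part of MultiMap or CRI-MAP.

```python
import os

# The four extensions named in this section of the guide.
REQUIRED_EXTENSIONS = ("gen", "dat", "names", "input")


def required_files(chromosome):
    """Return the four input file names expected for a chromosome number."""
    return [f"chr{chromosome}.{ext}" for ext in REQUIRED_EXTENSIONS]


def missing_files(chromosome, present):
    """Return the required file names that do not appear in `present`."""
    present = set(present)
    return [name for name in required_files(chromosome) if name not in present]


if __name__ == "__main__":
    # Check the current directory, since MultiMap looks for data files there.
    gaps = missing_files(50, os.listdir("."))
    if gaps:
        print("Missing input files:", ", ".join(gaps))
    else:
        print("All four input files are present.")
```

A missing chr50.dat usually means that PREPARE has not yet been run on the current chr50.gen.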

3. Format of ".names" data file (chr50.names)

The file chr50.names simply contains the number of markers and two sets of marker names. For improved appearance of MultiMap outputs, marker names should be no longer than 8 characters, although the program will run if the names are longer. In Lisp, parentheses, apostrophes and vertical bars are reserved symbols. For this reason, these symbols may not be included as part of a marker name. In addition to letters and numbers, the following characters are normally considered to be alphabetic and can be used as part of marker names (Steele 1990): + - * $ % ^ & _ = < > ~.

The first line of the ".names" file contains the number of markers (N). For the next N lines of this file, one marker name is specified per line. The order of these marker names must correspond to their order in the data files. It is imperative that each marker name be unique, i.e., no marker name may be used to identify more than one marker. Each marker name in the first set is read in as a string. Therefore, any leading or trailing characters and blanks on each line are included as part of the marker name.

Following the N marker names is a list of the markers which are to be mapped in this particular analysis. The marker names in this second set are enclosed by parentheses and may be all on one line or each on a separate line. For example, if you wish to map all the markers, the second list contains the same marker names as the first. Otherwise, the second list is a subset of the first list. The marker names in the second list can be in any order. Anything else in this file after the two lists of marker names is not read by MultiMap.

For example, the following ".names" file indicates that there are 7 markers in the data file, named A - G, and that you wish to construct a map of all these markers except C and F:

    7
    A
    B
    C
    D
    E
    F
    G
    (A B D E G)

Note: Each marker must have a unique marker name, i.e., no marker name may be present more than once in either list of marker names. We suggest that if two markers share the same name, a letter be added to the end of the name, for example D50S15A, D50S15B, etc. Also, while MultiMap preserves the case of the characters in each name, it does not distinguish between upper and lower case. Therefore, two marker names that differ only by case (e.g., Ash and ASH) would be considered identical by MultiMap. Finally, the marker names in the second list must be spelled exactly as they are in the first list.
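The rules above (marker count, reserved characters, case-insensitive uniqueness, and the second list being a subset of the first) can be checked mechanically before running MultiMap. The sketch below is a hypothetical Python helper, not part of MultiMap; it assumes the second list is the first parenthesized group after the N names and that names in it are separated by whitespace.

```python
import re

# Characters this guide says are reserved in Lisp and not allowed in names.
RESERVED = set("()'|")


def parse_names(text):
    """Split a .names file into (declared count, first list, second list)."""
    lines = text.splitlines()
    count = int(lines[0].strip())
    all_names = lines[1:1 + count]            # read as-is, blanks included
    rest = "\n".join(lines[1 + count:])
    match = re.search(r"\(([^()]*)\)", rest)  # first parenthesized group
    to_map = match.group(1).split() if match else []
    return count, all_names, to_map


def check_names(text):
    """Return a list of human-readable problems found in a .names file."""
    count, all_names, to_map = parse_names(text)
    problems = []
    if len(all_names) != count:
        problems.append(f"expected {count} names, found {len(all_names)}")
    seen = {}
    for name in all_names:
        if name != name.strip():
            problems.append(f"leading or trailing blanks in {name!r}")
        if RESERVED & set(name):
            problems.append(f"reserved character in {name!r}")
        key = name.strip().upper()            # case is not distinguished
        if key in seen:
            problems.append(f"{name!r} duplicates {seen[key]!r}")
        else:
            seen[key] = name
    mapped = set()
    for name in to_map:
        if name not in all_names:             # must be spelled exactly
            problems.append(f"{name!r} is not in the first list")
        if name.upper() in mapped:
            problems.append(f"{name!r} appears twice in the second list")
        mapped.add(name.upper())
    return problems
```

For the 7-marker example file shown above, `check_names` returns an empty list.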

4. Format of ".input" data file (chr50.input)

The file chr50.input contains the values of all 27 of the user-controlled MultiMap parameters. You can create this file in a text editor (you may wish to edit the sample "input" file provided), or you can have MultiMap create this file for you using the function (create-input-file) (see section II.B.2, Constructing Maps).

a. An example input file

The value following each variable (variable names are enclosed by two *) in the input file is the value that will be assigned to that variable during a MultiMap run. The values are either T (true), NIL (false), numeric, or symbolic (directly preceded by a single quote (')). (The line numbers at the end of each line below are given for documentation purposes only; they would not be included in an actual "input" file.)

    (*make-framework* T)                          1
    (*flips-every-new-marker* NIL)                2
    (*max-interval-theta* 0.20)                   3
    (*min-interval-theta* 0.10)                   4
    (*max-start-theta* .20)                       5
    (*min-start-theta* .10)                       6
    (*min-start-lod* 3.0)                         7
    (*est-genome-length* 1.0)                     8
    (*percent-desired-chromosome-coverage* .66)   9
    (*min-num-of-framework-markers* 4)            10
    (*choose-start-markers* NIL)                  11
    (*use-joint-pic* T)                           12
    (*extend-map* nil)                            13
    (*num-to-order-per-interval* 3)               14
    (*num-to-flip-during-extension* 3)            15
    (*num-to-flip-at-end* 3)                      16
    (*data-type* 'linkage)                        17
    (*odds-threshold* 3.0)                        18
    (*flips-odds-threshold* 3.0)                  19
    (*usehaps* NIL)                               20
    (*haplist* '())                               21
    (*do-flips* T)                                22
    (*sex-eq* 1)                                  23
    (*compute-error* T)                           24
    (*compute-sex-diff* T)                        25
    (*flips-pretty* T)                            26
    (*save-results-often* NIL)                    27

b. Creating an input file

You do not need to read this section if you plan to let MultiMap create your input file for you. This section is only necessary if you plan to create your own "input" file using a text editor. An explanation of each MultiMap parameter follows, and each is referred to by its line number as listed above.

These parameters are represented in the "input" file as a list, where each item in the list is itself a list. (A list is simply a set of items which are enclosed in parentheses.) This type of "list of lists" is given by the following example:

    ((A a) (B b) (C c))

Each list is enclosed in parentheses, as is the entire list (note the double parentheses). Within the main list there is a list for each parameter, where the first item in this list (A, B, C in the above example) is the name of the parameter and the second item (a, b, c in the above example) is its value for a particular MultiMap analysis. The parameters may appear in any order in this list, as long as they are all included and each has an associated value. In the following description, the parameters are organized according to their function within MultiMap.

Note: If you use the (create-input-file) function to create the input file (see section II.B.2), the order of the parameters in the corresponding "input" file may not correspond to the order given in this documentation. This will not present a problem.

c. Description of input parameters (see section X, The Mapping Algorithm, for more detail)

Parameters associated only with construction of a framework map

1. ((*make-framework* T)

The parameter *make-framework* is set to T (true) if a framework map will be constructed, and is set to NIL (false) if a framework map is not desired. In this example, you wish to construct a framework map. Note the extra parenthesis at the front of this list. This parenthesis must be present and is used to enclose all of the parameter lists into one "main" list. For comprehensive map construction see parameter #13 (*extend-map*).

2. (*flips-every-new-marker* NIL)

If the parameter *flips-every-new-marker* is T, a CRI-MAP FLIPS analysis (see Mapping Algorithm) is performed each time a new marker is added to the framework map. While this option adds a degree of certainty to the final map, it can greatly increase the total time for map completion. When *flips-every-new-marker* is NIL, a CRI-MAP FLIPS analysis is not performed each time a new marker is added to the framework map. In either case, a FLIPS analysis is performed upon completion of the framework map.

3. (*max-interval-theta* 0.20)

The recombination value assigned to the parameter *max-interval-theta* represents the maximum recombination distance desired between markers in the framework map. It is used as part of the determination of whether a certain map qualifies as a complete framework map (see Mapping Algorithm).

4. (*min-interval-theta* 0.10)

The recombination value assigned to the parameter *min-interval-theta* represents the minimum recombination distance desired between markers in the framework map. It is used when determining whether or not to add a particular marker to the framework map (see Mapping Algorithm).

The example values assigned to the parameters *max-interval-theta* and *min-interval-theta* indicate that this user wanted a recombination distance of 10%-20% between the markers in the framework map.

5. (*max-start-theta* .20)

The recombination value assigned to the parameter *max-start-theta* represents the maximum recombination distance desired between the two initial markers from which framework map construction will begin.

6. (*min-start-theta* .10)

The recombination value assigned to the parameter *min-start-theta* represents the minimum recombination distance desired between the two initial markers from which the framework map will be constructed.

7. (*min-start-lod* 3.0)

The value assigned to *min-start-lod* is the minimum lod score you wish to use for accepting linkage between the two starting markers.

The example values assigned to the parameters *max-start-theta*, *min-start-theta* and *min-start-lod* indicate that this user wanted the two initial markers from which the framework map would be built to be linked to each other with a recombination fraction of 10%-20% and a lod score of at least 3.0.

8. (*est-genome-length* 1.0)

The value assigned to the parameter *est-genome-length* represents the estimated length, in Morgans, of the chromosome segment covered by the markers included in this particular MultiMap analysis.

9. (*percent-desired-chromosome-coverage* .66)

The framework map is not considered complete until its total length is at least *percent-desired-chromosome-coverage* multiplied by *est-genome-length*.
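The length criterion in parameter 9 is simple arithmetic, and it can be useful to work out the required length before choosing values. The sketch below is a hypothetical Python helper, not MultiMap code; it models only the stated length criterion, not the other completeness checks described in the Mapping Algorithm section.

```python
def required_framework_length(est_genome_length, coverage):
    """Minimum total framework length, in Morgans.

    est_genome_length corresponds to *est-genome-length* (Morgans) and
    coverage to *percent-desired-chromosome-coverage* (a fraction).
    """
    return coverage * est_genome_length


def length_is_sufficient(map_length, est_genome_length, coverage):
    """True if a map of map_length Morgans meets the length criterion."""
    return map_length >= required_framework_length(est_genome_length, coverage)


if __name__ == "__main__":
    # With the example values (1.0 Morgan, .66), the map must span
    # at least 0.66 Morgans before it can be considered complete.
    print(required_framework_length(1.0, 0.66))
```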

10. (*min-num-of-framework-markers* 4)

The parameter *min-num-of-framework-markers* indicates the minimum number of markers which must be placed on a framework map before it can be considered complete.

The example values assigned to the parameters *est-genome-length*, *percent-desired-chromosome-coverage* and *min-num-of-framework-markers* indicate that this user estimated that the markers would cover a 1.0 Morgan chromosome segment, that the framework map should cover at least 66% of this segment, and that there should be at least 4 markers on the framework map. (See Mapping Algorithm for more details.)

11. (*choose-start-markers* NIL)

When *choose-start-markers* is T, you will be asked to provide the names of the two initial markers from which the framework map will be built. When NIL, MultiMap will determine these two "start" markers (see Mapping Algorithm).

12. (*use-joint-pic* T)

When *use-joint-pic* is T, pairs of markers are considered for mapping in order of decreasing joint-PIC value. When NIL, the markers are sorted in order of decreasing heterozygosity.

Parameters associated only with construction of a comprehensive map

13. (*extend-map* T)

When *extend-map* is T, a comprehensive map will be constructed from a framework map. (This framework map may be either one which was just constructed by MultiMap, or any map specified by you.) When *extend-map* is NIL, a comprehensive map will not be constructed.

14. (*num-to-order-per-interval* 3)

During any given "round" of comprehensive map construction, several different markers may map to the same unique interval and need to be mutually ordered (see Mapping Algorithm). The parameter *num-to-order-per-interval* determines how many of these markers MultiMap will attempt to order within each interval. In other words, if 5 markers localize only to the interval B-C of the map A-B-C-D-E, and *num-to-order-per-interval* is set to 3, MultiMap will try to order the 3 most informative of those 5 markers. The remaining 2 markers will not be added to the map at this point but will be considered as unmapped loci in subsequent rounds of mapping.

15. (*num-to-flip-during-extension* 3)

At the end of each "round" of comprehensive map construction, a CRI-MAP FLIPS analysis is performed to check the validity of the current map. The parameter *num-to-flip-during-extension* determines the number of markers that will be "flipped" in each contiguous block of markers to assess local support.

16. (*num-to-flip-at-end* 3)

At the end of construction of a comprehensive map, a CRI-MAP FLIPS analysis is performed to check the local support of the current map. The parameter *num-to-flip-at-end* determines the number of markers that will be "flipped" in each contiguous block of markers. Some mappers prefer to perform a more rigorous FLIPS analysis on the final map (e.g., 5 markers per contiguous block instead of 3).

Parameters associated with all aspects of map construction

17. (*data-type* 'linkage)

Currently, the value of this parameter must be 'linkage. Radiation hybrid map construction is only possible through assumption of a constant retention frequency (Green 1992). Note that the single quote (') before the word "linkage" is very important!

18. (*odds-threshold* 3.0)

The value of *odds-threshold* represents the minimum lod score required for acceptance of linkage between two markers or between a marker and a map.

19. (*flips-odds-threshold* 3.0)

During CRI-MAP FLIPS analyses, marker orders whose log10-likelihood is at least *flips-odds-threshold* orders of magnitude less than that of the "best" order are considered to be significantly less likely.

20. (*usehaps* T)

If sets of markers are completely linked and are to be considered as a haplotype by CRI-MAP, the parameter *usehaps* should be set to T. Otherwise it should be NIL. See the CRI-MAP documentation for more details.

21. (*haplist* '((D50S210 D50S19) (D50S166 D50S94)))

If *usehaps* is T, then the parameter *haplist* is a list which contains lists of names of markers to be haplotyped. If *usehaps* is NIL, then *haplist* is set to an empty list as follows: (*haplist* '()). In this example, you want the recombination distance to be set to zero between the markers named D50S210 and D50S19, and between the markers named D50S166 and D50S94. MultiMap may not run properly if there are recombinants between markers in the haplotype list. Note that the single quote (') before the list of haplotype markers is very important!
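Returning to parameter 19, the *flips-odds-threshold* rule amounts to comparing each alternative order's log10-likelihood with that of the best order. The following Python sketch illustrates that comparison under stated assumptions: the function name, the dictionary input and the order labels are hypothetical, and the code does not read actual CRI-MAP FLIPS output.

```python
def split_orders(log10_likelihoods, flips_odds_threshold=3.0):
    """Separate alternative orders into competing and rejected ones.

    log10_likelihoods maps an order label to its log10-likelihood.
    An order is rejected when it is at least flips_odds_threshold
    orders of magnitude less likely than the best order.
    """
    best_order = max(log10_likelihoods, key=log10_likelihoods.get)
    best = log10_likelihoods[best_order]
    competing, rejected = [], []
    for order, value in log10_likelihoods.items():
        if order == best_order:
            continue
        if best - value >= flips_odds_threshold:
            rejected.append(order)    # significantly less likely
        else:
            competing.append(order)   # cannot be excluded at this threshold
    return best_order, competing, rejected


if __name__ == "__main__":
    example = {"A-B-C": -50.0, "A-C-B": -51.2, "B-A-C": -54.0}
    print(split_orders(example))
```

An order that remains in the competing group indicates that local support for the best order is below the chosen threshold.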

22. (*do-flips* T)

If you do not want any FLIPS analyses to be performed, the parameter *do-flips* should be set to NIL; otherwise it should be set to T. Normally, for increased map accuracy, FLIPS analyses should be performed. However, in some cases you may opt not to perform FLIPS analyses in order to save time.

23. (*sex-eq* 1)

If the data is autosomal, *sex-eq* should be 1. If the data is X-linked, *sex-eq* should be 0. (This follows CRI-MAP's convention.)

24. (*compute-error* T)

When *compute-error* is T, the percent rate of genotype error in males and females is estimated by the drop-one-locus method described in Lasher et al. (1991). The percent error rates are written to the screen and to the file "chr50.out", and a tabular output for graph construction is written to the file "chr50.drop".

Summary of method: For a map interval of length ω cM, the estimated map length is ω+2p if p is the percent undetected error rate for a marker locus lying in that interval. Thus, dropping this marker locus should result in a map distance of ω cM, with one-half the difference in map lengths being equal to p. MultiMap drops each internal (non-terminal) locus in a map in turn and averages the total map length; p is estimated as one-half the difference between the total length of the original map and the average length of the "dropped" maps. A plot of the map lengths obtained by dropping each locus from a map against the map of marker locations identifies those markers that contribute most to this difference.

25. (*compute-sex-diff* T)

When *compute-sex-diff* is T, the input for the program SEXDIF is written to the file "chr50.sexin". SEXDIF is not run within MultiMap. For information on how to run SEXDIF, type

    % sexdif -h

(in UNIX, not in MultiMap!).

Summary of method: Sex differences in individual map intervals and in the total map can be evaluated from the likelihoods of the sex-specific maps and of the map obtained by constraining male and female recombination rates to be equal. The magnitude of variation in sex difference along the chromosome cannot be effectively studied by comparing each interval, since the ratio of female (θf) and male (θm) recombination values is unstable for small θ values. MultiMap plots the quantity (θf - θm)/(θf + θm), which always lies between -1 (male excess) and +1 (female excess), against the sex-equal map locations. The program allows control of a "window size" of k markers (k > 2) over which the sex difference is evaluated. This is because when θ is small, the statistical power for detecting this difference is low.
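The two summaries above reduce to short formulas. The Python sketch below restates them for illustration only; the function names are hypothetical, the inputs are assumed to be map lengths in cM and recombination fractions already obtained elsewhere, and the windowing over k markers is not modeled.

```python
def estimate_error_rate(original_length, dropped_lengths):
    """Drop-one-locus estimate of the undetected error rate p.

    original_length is the total length of the full map (cM);
    dropped_lengths holds the total length of each map obtained by
    dropping one internal locus. p is half the difference between the
    original length and the average dropped-map length.
    """
    if not dropped_lengths:
        raise ValueError("at least one dropped map is required")
    average = sum(dropped_lengths) / len(dropped_lengths)
    return 0.5 * (original_length - average)


def sex_difference_index(theta_f, theta_m):
    """Return (theta_f - theta_m) / (theta_f + theta_m).

    The value lies between -1 (male excess) and +1 (female excess).
    It is undefined when both recombination values are zero.
    """
    total = theta_f + theta_m
    if total == 0:
        raise ValueError("index is undefined when both thetas are zero")
    return (theta_f - theta_m) / total


if __name__ == "__main__":
    print(estimate_error_rate(110.0, [104.0, 106.0, 108.0]))
    print(sex_difference_index(0.375, 0.125))
```

In the first call, the dropped maps average 106 cM against an original length of 110 cM, so the estimate is half of the 4 cM difference.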

26. (*flips-pretty* T)

When *flips-pretty* is T, a CRI-MAP FLIPS (n=2) analysis is run in order to determine the likelihood support for pairwise inversions of loci in the final comprehensive map. Output from this analysis is written to the file "chr50.flips".

27. (*save-results-often* NIL)

When *save-results-often* is NIL, the CRI-MAP results are written to file upon completion of construction of the map (framework, comprehensive or both). When *save-results-often* is T, the CRI-MAP results are written to file after each calculation. This will cause MultiMap to run more slowly, but it is useful when running a lengthy job: should an error, computer problem or loss of power occur, MultiMap does not have to repeat its earlier analyses.

28. (*error-threshold* 0.10)

The default value of *error-threshold* is NIL. When it is not NIL, it should be set to a decimal value less than 1.0. In that case, any marker that can be added to the map in a unique map position with odds above *odds-threshold* is added ONLY IF its addition increases the size of its unique map interval by less than *error-threshold* x 100% of that interval's size without the marker. We normally use a value of 0.10 (i.e., 10%).
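The *error-threshold* rule in parameter 28 can be illustrated with a short calculation. This is a minimal Python sketch with hypothetical names; it assumes the interval sizes with and without the candidate marker are already known, and it represents a NIL threshold as None.

```python
def passes_error_threshold(size_without, size_with, error_threshold=None):
    """Apply the interval-inflation rule of *error-threshold*.

    size_without is the size of the unique map interval before the
    marker is added and size_with is its size afterwards (same units).
    With no threshold (None, i.e. NIL), every marker passes.
    """
    if error_threshold is None:
        return True
    increase = size_with - size_without
    return increase < error_threshold * size_without


if __name__ == "__main__":
    # A 10.0 cM interval may grow by less than 1.0 cM at a threshold of 0.10.
    print(passes_error_threshold(10.0, 10.5, 0.10))  # growth of 0.5 cM
    print(passes_error_threshold(10.0, 11.5, 0.10))  # growth of 1.5 cM
```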

B. Constructing Maps

1. What MultiMap does

The first time MultiMap is run on a set of data, the allele frequencies are estimated, and heterozygosities and PIC values are computed for each marker. The joint-PIC values are also calculated for each pair of markers. The allele frequencies, heterozygosities and PIC values are written to the files chr50.freq1 and chr50.freq2. The file chr50.freq1 is useful for you, and chr50.freq2 is used by MultiMap. The list of marker numbers in decreasing order of heterozygosity is written to the file chr50.ordh for use by MultiMap. The joint-PIC values are written to the file chr50.pic1, and the sorted joint-PIC values are written to the file chr50.pic2. The list of marker pairs in decreasing order of joint-PIC value is written to the file chr50.ordj for use by MultiMap.

During the course of a MultiMap run, CRI-MAP will be run many times. These results, along with the CRI-MAP specific parameter values, are saved in a hash table. At the end of construction of a framework or comprehensive map, these results are written to the file chr50.hash. Each data file has a unique checksum value (computed with the UNIX program sum). The checksum value of the file chr50.dat is also written to the chr50.hash file. Therefore, if you want to re-analyze the same data, perhaps with a different set of input parameters, the previously computed heterozygosities, joint-PIC values and CRI-MAP results can be used.

When the file chr50.hash exists, the checksum of the current chr50.dat file is compared to the checksum value stored in chr50.hash. If these values are the same (i.e., the data has not changed since the previous run), then values from the files chr50.ordh, chr50.ordj and chr50.hash will be used. If the checksum values do not match (i.e., the data has changed) or the file chr50.hash does not exist, all values are (re-)computed. If at any time you wish to force MultiMap to compute new values, use the :clear keyword (see II.B.3, keywords, below) when running MultiMap.

Each time MultiMap is run on a set of data, the program parameters must be initialized (i.e., set to those specified by you in the .input file). Once these parameters have been initialized, they do not need to be reset unless you wish to analyze a different set of data. If a map is constructed (framework, comprehensive, or both), the parameters will automatically be initialized. However, if you wish to perform functions other than constructing maps, the :run keyword must be used (see keywords below). This keyword explicitly tells MultiMap to initialize the parameters.

2. How to run MultiMap

To use MultiMap, the Lisp interpreter must be running and the MultiMap program must be loaded. MultiMap will look for data files only in the directory in which the Lisp interpreter is running (i.e., run MultiMap from the directory where your data files are located). The MultiMap executable file is multimap.sparcf for CMU Common Lisp. The directions for use with CLISP are different; please see the README.clisp file at the FTP site. To start Lisp and load MultiMap, type:

    % lisp
    CMU Common Lisp 16e, running on watson
    Send bug reports and questions to your local CMU CL maintainer, or to cmucl-bugs@cs.cmu.edu.
    Loaded subsystems:
        Python 1.0, target SPARCstation/Sun 4
        CLOS based on PCL version: March 92 PCL (2a)
    * (load "/usr/local/multimap")      (specify the path to your MultiMap executable file)
    ; Loading "/usr/local/multimap.sparcf".
    T
    *

Once MultiMap has been loaded into the Lisp interpreter, you may construct as many different maps as you like. MultiMap needs to be loaded only once each time the Lisp interpreter is started up.

The most basic usage of MultiMap is for automatic construction of a framework map and then a comprehensive map. We suggest you first try running MultiMap on the sample chromosome 50 files we provide. Since we have provided all the necessary input files, simply type

    * (multimap 50)

This will cause MultiMap to perform all analyses necessary to complete a framework and comprehensive map. The main output file created by MultiMap is called chr50.out. It should match the output file we provided with the test files, chr50.out.keep, with no significant differences. If there are differences and you do not understand why, please contact us by e-mail.

After testing MultiMap on our sample data set, you are ready to analyze your own data. Create the input files described above. To have MultiMap create the ".input" file for you, type

    * (create-input-file)

and answer the questions asked by MultiMap. The values shown in the example file above are the default values used by MultiMap.

Note: If you use the (create-input-file) function to create the input file, the order of the parameters in the corresponding "input" file may not correspond to the order given in the example file above. This will not present a problem.
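To compare the chr50.out produced by the test run with the supplied chr50.out.keep, any text-comparison tool will do. As one possibility, the Python sketch below uses the standard difflib module; the function name is hypothetical, and which differences count as significant remains a matter of judgment.

```python
import difflib


def output_differences(produced, reference):
    """Return unified-diff lines between two output texts.

    The result is an empty list when the two texts are identical.
    """
    return list(difflib.unified_diff(
        reference.splitlines(),
        produced.splitlines(),
        fromfile="chr50.out.keep",
        tofile="chr50.out",
        lineterm="",
    ))


if __name__ == "__main__":
    # Read both files from the directory in which MultiMap was run.
    with open("chr50.out") as new, open("chr50.out.keep") as kept:
        for line in output_differences(new.read(), kept.read()):
            print(line)
```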

To run MultiMap in the interactive mode, type

    * (multimap 50 :interactive t)

Note: Upon completion of a map, the variable *mapped* holds the marker numbers of those markers in the map. This variable may then be used in additional analyses.

3. MultiMap keywords

Keywords are special arguments that may be added to the end of the command to run MultiMap. If a keyword is not specified, it is set to its default value.

Keyword :clear

    * (multimap 50 :clear T)

The default value of :clear is NIL. When the :clear keyword is set to T, MultiMap ignores any heterozygosities, joint-PIC values or CRI-MAP results that may have been previously computed.

Keyword :run

    * (multimap 50 :run NIL)

The default value of :run is T. When the :run keyword is set to NIL, MultiMap will stop after initializing its parameters to the values found in the file chr50.input and after computing heterozygosities for each marker. This option must be used before performing any functions other than constructing maps.

For example, suppose you want to construct a map of chromosome 2, then run the function flips-pretty on the chromosome 2 map, and then run the function find-zero-recombs on data from chromosome 11. You would use the following commands:

    % lisp
    * (load "/usr/local/multimap")
    * (multimap 2)
    * (flips-pretty *mapped*)          (see section IV, Additional Functions)
    * (multimap 11 :run nil)
    * (find-zero-recombs *mapped* *mapped* 3.0)
    * (quit)

Keyword :reverse

  * (multimap 50 :reverse T)

The default value of :reverse is NIL. When the :reverse keyword is set to T, MultiMap will reverse the order of the starting two markers (when constructing a framework map) or will reverse the order of the starting map (when constructing a comprehensive map). Most previously computed likelihoods can be used even when the orientation is switched.

Multiple keywords can be used, and they may be specified in any order. However, they must be added to the command after all other parameters required by the function. For example:

  * (multimap 50 :run nil :clear t)

4. What to do in case of problems

There are many potential sources of errors that could cause MultiMap to stop running. With a little luck the error messages will make sense. If not, the first places to check for errors are the ".input" and ".names" files. In each file it is important that every pair of parentheses is correctly matched. Also, in the ".input" file each parameter that is included must have a value associated with it.

When MultiMap stops running due to an error, you will find yourself in the Lisp debugger. If you are using CMU Common Lisp, the command to exit the debugger is "a" (for ABORT); in LUCID Lisp the exit command is ":a". If MultiMap stops running after some analyses have already been performed, exit the debugger and then execute the command

  * (save-results)

to write the results computed thus far to the file chr50.hash.

If the error message isn't clear, and checking the obvious input files does not help, send e-mail with a detailed description of the error to multimap@genome1.hgen.pitt.edu. We will respond as soon as possible.
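The parenthesis check recommended above for the ".input" and ".names" files can be roughly automated from the Lisp prompt. The following is a minimal sketch in plain Common Lisp, not a MultiMap function; paren-balance and file-paren-balance are hypothetical names, and the check simply counts characters, so parentheses inside strings or comments are counted as well.

```lisp
;; Hypothetical helpers, not part of MultiMap.
;; PAREN-BALANCE returns 0 when every "(" has a matching ")", a positive
;; number of unclosed "(" otherwise, or -1 as soon as a ")" appears that
;; has no matching "(".
(defun paren-balance (text)
  (let ((depth 0))
    (loop for ch across text
          do (case ch
               (#\( (incf depth))
               (#\) (decf depth)))
          when (minusp depth)
            do (return-from paren-balance -1))
    depth))

;; FILE-PAREN-BALANCE reads a whole text file and checks it.
(defun file-paren-balance (path)
  (with-open-file (in path)
    (let* ((buffer (make-string (file-length in)))
           (end (read-sequence buffer in)))   ; number of characters read
      (paren-balance (subseq buffer 0 end)))))
```

For example, (file-paren-balance "chr50.input") returning anything other than 0 suggests a missing or extra parenthesis in that file.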

C. Output Files

Several output files are produced during a run of MultiMap. Two files, cri.out and sum.out, are created during the MultiMap run but are deleted at the end of the run. The remaining output files, with a brief explanation of the contents of each, are:

  chr50.drop     Tabular output from the error analysis, performed only when *compute-error* is T.
  chr50.flips    Output from the flips run, performed only when *flips-pretty* is T.
  chr50.frame    Marker numbers for the markers in the framework map.
  chr50.freq1    Allele frequencies, heterozygosities and PIC values.
  chr50.freq2    Used by MultiMap.
  chr50.hash     Used by MultiMap.
  chr50.loc      Created by CRI-MAP PREPARE - see CRI-MAP documentation.
  chr50.mapped   Marker numbers for the markers in the final comprehensive map.
  chr50.ordh     Marker numbers for given run in order of decreasing heterozygosity.
  chr50.ordj     Pairs of marker numbers in order of decreasing joint-pic value.
  chr50.out      MAIN OUTPUT FILE: Output from most recent MultiMap run.
  chr50.par      CRI-MAP parameter file from most recent call to CRI-MAP.
  chr50.pic1     Marker pairs and their joint-pic values.
  chr50.pic2     chr50.pic1 ordered by decreasing joint-pic value.
  chr50.sexin    Input for the SEXDIF program, produced only when *compute-sex-diff* is T.
  chr50.two      Created when all pairwise twopoint lod scores are computed; used by MultiMap.

The main output file, chr50.out, contains the complete details of the MultiMap run, including all intermediate steps and the final map. Most aspects are self-explanatory, but a few sections require explanation.

1. Explanation of main output file chr50.out

It may be helpful to have a print-out of chr50.out available as you read this section.

For construction of the framework map, the two initial markers are given, followed by the steps taken to sequentially add markers to the map. Each time a marker is added to the framework map, the current map is shown. The marker that was most recently added to the map is indicated by an asterisk (*) after the marker name. Upon completion of a framework map, a CRI-MAP FLIPS run is performed and the support for order is given. The number of markers in each contiguous block in this analysis is n-1, where n is the number of markers in the map, up to a maximum of 5.

During construction of a comprehensive map, the map to be extended (either the previous framework map or a user-specified map) is printed, followed by "rounds" of marker addition. During each round, the possible location(s) of each unmapped marker on the current map (with odds as specified by you) are determined. These locations are printed after the heading "The current map state is." The markers that have been previously mapped are printed in the left-most column. In each map interval the markers (if any) whose map locations include that interval are printed horizontally. Markers that map to only one map interval are denoted by an asterisk (*) following their marker name. The marker names of all markers that map to multiple intervals are followed by a number indicating the relative likelihood rank of each interval. An example follows:

  --> The current map state is :
          E-1 F-1 G-2
  A
          E-2 F-2 G-1
  B
          I * J * K * L * M-3 N-1
  C
          M-2 O * N-2
  D
          M-1
  O places uniquely between C and D

In this case, markers E,F,G,H,I,J,K,L,O,M,N were placed on the map A-B-C-D. Markers E, F and G could each be placed in two intervals, either proximal to A or between A and B, given user-specified odds (e.g., 1000:1). However, of these two intervals, E and F are more likely to map proximal to A (rank=1) than between A and B (rank=2), while G is more likely to lie between A and B (rank=1) than proximal to A (rank=2). Similarly, marker M can lie in one of three intervals given user-specified odds: most likely distal to D (rank=1), but also between C and D (rank=2) or between B and C (rank=3). The markers I, J, K and L all map only to the interval B-C with user-specified odds, and marker O maps only to interval C-D. Since O maps to only one interval, and is the only marker that maps uniquely to this interval, it will be added to the map at this point.

After the next map state is determined, a FLIPS analysis is run to validate the order and the new map is printed, with an asterisk (*) following the marker name(s) of all markers added in that "round." This continues until no more markers can be added to unique map locations with user-specified odds. Upon completion of the comprehensive map, sex-averaged and sex-specific maps are printed and the run is complete.

2. Explanation of output file chr50.flips

When the parameter *flips-pretty* is T, a FLIPS of sequential blocks of 2 markers is computed and the results are printed to this file. The markers are given in map order, and the value printed in each interval is the log-likelihood difference of the true order versus the order with that pair of markers inverted. Inversions with a value less than the specified odds are followed by an asterisk (*) for easy identification.
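The selection rule illustrated by the map-state example in the explanation of chr50.out (a marker is added when it maps to a single interval and is the only marker mapping uniquely to that interval) can be sketched in a few lines of Common Lisp. This is an illustration of the rule as described in that example, not MultiMap's actual code; the names *example-state*, uniquely-placed and addable-markers and the interval numbering are assumptions made for this sketch, and marker H, which does not appear in the printed map state, is left out.

```lisp
;; Illustrative data: each entry is (marker interval ...), with the candidate
;; intervals listed from best rank to worst.  Interval numbers are made up
;; for this sketch: 0 = proximal to A, 1 = A-B, 2 = B-C, 3 = C-D,
;; 4 = distal to D.
(defparameter *example-state*
  '((E 0 1) (F 0 1) (G 1 0)
    (I 2) (J 2) (K 2) (L 2)
    (M 4 3 2) (N 2 3) (O 3)))

(defun uniquely-placed (state)
  "Entries for markers that have exactly one candidate interval."
  (remove-if-not (lambda (entry) (= (length (rest entry)) 1)) state))

(defun addable-markers (state)
  "Markers that are the only uniquely placed marker in their interval."
  (let ((unique (uniquely-placed state)))
    (loop for (marker interval) in unique
          when (= 1 (count interval unique :key #'second))
            collect marker)))
```

Evaluating (addable-markers *example-state*) selects only O: markers I, J, K and L also map to a single interval, but they share it, so none of them is chosen.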

IV Additional Functions

In addition to constructing maps, MultiMap has several functions that you may wish to use. Some of them require you to specify a map or set of markers on which the function will be run. Most of them require the MultiMap parameters to be initialized to the values in the ".input" file. This is done by first running MultiMap with the keyword :run set to NIL - see explanation above.

A. Useful Variables

There are 5 variables that may be useful for running these functions. These variables and the values assigned to them are:

  a. *master-list*            list of all marker names
  b. *marker-names*           list of subset of marker names
  c. *master-list-in-order*   list of all marker numbers in decreasing order of heterozygosity
  d. *markers-in-order*       list of subset of marker numbers, minus secondary markers in haplotypes, in decreasing order of heterozygosity
  e. *mapped*                 marker numbers of the current or final map

The * is part of the variable name (it indicates a global variable).

B. Description of Functions

1. setf

The function setf (a Common Lisp function) takes two values. The first is the name of a variable and the second is a value or list to which the variable will be set. For example,

  * (setf X '( ))

or

  * (setf X *marker-names*)

sets the variable X to hold the list '( ) (i.e., a list of marker numbers), or the list of marker names. Once setf has been used to assign a value or list to a variable, that variable will keep that same value until it is assigned a new value. To see the value of a variable, simply type the variable name at the Lisp prompt (no parentheses).

Note  Remember that a list must be enclosed by parentheses and preceded by an apostrophe, and that function calls must also be enclosed by parentheses.

Note  As in CRI-MAP, markers in MultiMap are numbered beginning with 0.
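The points made in the notes above (quoted lists, assignment with setf, numbering from 0) can be tried in any Common Lisp without MultiMap loaded. The sketch below uses hypothetical names throughout: *example-names*, *selection*, example-name and example-number are not MultiMap variables or functions, and it assumes, purely for illustration, that a marker's number is its position in a list of names.

```lisp
;; Hypothetical data and helpers for illustration only.
(defparameter *example-names* '(D50S10 D50S15 D50S11))

(defun example-name (number)
  "Name stored at position NUMBER; NTH counts from 0."
  (nth number *example-names*))

(defun example-number (name)
  "Position of NAME in the list; POSITION also counts from 0."
  (position name *example-names*))

;; A quoted list of marker numbers, assigned to a variable ...
(defparameter *selection* '(0 2))

;; ... which keeps its value until SETF assigns a new one.
(setf *selection* '(1 2))
```

After these forms, (mapcar #'example-name *selection*) gives the names at positions 1 and 2, and (example-number 'D50S10) gives 0 because numbering starts at 0.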

2. get-marker-names, get-marker-name

The functions get-marker-names and get-marker-name return the marker name(s) of the marker number(s) supplied to the function. The parameter for get-marker-names is a list of marker numbers, while the parameter for get-marker-name is a single marker number. The first command below returns the marker names of the marker numbers in the variable *mapped*.

  * (get-marker-names *mapped*)
  * (get-marker-name 15)

Suppose you wanted the marker names of markers 1, 2 and 3. You could type either

  * (setf X '(1 2 3))
  * (get-marker-names X)

or

  * (get-marker-names '(1 2 3))

3. get-marker-numbers, get-marker-number

The functions get-marker-numbers and get-marker-number return the marker number(s) of the marker name(s) supplied to the function. The parameter for get-marker-numbers is a list of marker names, while the parameter for get-marker-number is a single marker name.

  * (get-marker-numbers *marker-names*)
  * (get-marker-number 'D50SC)

Note  A single marker name is actually a Lisp symbol. Symbols must be preceded by a single quote.

For example, the following function call would return the marker numbers of the markers D50S10, D50S15 and D50S11.

  * (get-marker-numbers '(D50S10 D50S15 D50S11))

4. find-zero-recombs

This function finds pairs of markers for which theta = 0.0 and the lod-score is above a specified limit (e.g., 3.0). It takes as input two lists of markers (either marker names or marker numbers) and the lod-score limit. It computes twopoint lod-scores of all markers in the first list paired with all markers in the second list. It then prints those marker pairs whose maximum lod score is found at theta = 0.0 and whose lod-score is above the specified limit. Setting the lod-score limit to 0.0 will cause the function to print all pairs of markers. The results are printed to the screen and to the file chr50.zero.

To have MultiMap print a list of those markers whose maximum lod score is above a specified limit and occurs at theta = 0.0, type

  * (find-zero-recombs '(0 1 2 3) '(0 1 2 3) 3.0)

The example above will compute the twopoint lod-score for the following pairs: 0-1, 0-2, 0-3, 1-2, 1-3, 2-3, and will output those pairs with a theta of 0.0 and a lod-score above 3.0. This could also be accomplished by executing the following commands:

  * (setf A '(0 1 2 3))
  * (find-zero-recombs A A 3.0)

Keywords

:names (default NIL)

If the lists of markers are marker names instead of marker numbers, set the keyword :names to T. For example, type

  * (find-zero-recombs '(D50S10 D50S15) '(D50S20) 3.0 :names t)

to compare D50S10-D50S20 and D50S15-D50S20.

Note  As in the above example, the markers in the two lists need not be the same.

:get-twopoints (default NIL)

When :get-twopoints is T, the pairwise maximum likelihoods and recombination values of all markers will be computed by CRI-MAP. For a large number of twopoint computations, setting :get-twopoints to T will improve efficiency (less I/O overhead). Otherwise, when :get-twopoints is NIL, each pairwise computation is performed separately. To run find-zero-recombs on all markers you are currently mapping, type

  * (setf X (get-marker-numbers *marker-names*))
  * (find-zero-recombs X X 3.0 :get-twopoints t)
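The pairing and filtering described for find-zero-recombs can be mimicked in plain Common Lisp to see which comparisons a given pair of lists implies. This is a sketch under stated assumptions, not the MultiMap implementation: marker-pairs and zero-recomb-pairs are hypothetical names, the results are assumed to be available as (marker-a marker-b theta lod) entries, and no CRI-MAP computation is performed.

```lisp
;; Hypothetical helpers, not part of MultiMap.
(defun marker-pairs (list-a list-b)
  "Pairs (a b) with a from LIST-A and b from LIST-B, skipping a marker
paired with itself and any pair already listed in either order."
  (let ((pairs '()))
    (dolist (a list-a (nreverse pairs))
      (dolist (b list-b)
        (unless (or (eql a b)
                    (member (list a b) pairs :test #'equal)
                    (member (list b a) pairs :test #'equal))
          (push (list a b) pairs))))))

(defun zero-recomb-pairs (results lod-limit)
  "Keep the (marker-a marker-b theta lod) entries whose theta is 0 and
whose lod score is above LOD-LIMIT."
  (remove-if-not (lambda (entry)
                   (and (zerop (third entry))
                        (> (fourth entry) lod-limit)))
                 results))
```

Here (marker-pairs '(0 1 2 3) '(0 1 2 3)) yields the six pairs 0-1, 0-2, 0-3, 1-2, 1-3 and 2-3, and (marker-pairs '(D50S10 D50S15) '(D50S20)) yields the two pairs compared in the :names example.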

5. flips-pretty

To have MultiMap run FLIPS (flips2) and produce output that shows the log-likelihood support for each interval, type

  * (flips-pretty *mapped*)

or

  * (flips-pretty '( ))      (use marker numbers)

or

  * (flips-pretty '(A7 A4 A9 A13) :names t)      (use marker names)

If you want to run flips-pretty on a map other than the one stored in the variable *mapped*, use the second or third command above and substitute your marker numbers for '( ) or your marker names for '(A7 A4 A9 A13). The flips-pretty output will be written to the file chr50.flips. An example of flips-pretty output is given below:

  * (flips-pretty '( ))
  Log-likelihood support
  A7  A4  A9  A13

6. write-map-to-file

This function takes as its argument a list of marker names or marker numbers representing a map. It runs CRI-MAP's FIXED function to estimate recombination fractions between each of the markers and writes the resulting maps to the output file chr50.map. As previously explained, the :names keyword can be set to T if the markers are represented by names instead of numbers. For example:

  * (write-map-to-file *mapped*)

7. get-map-from-user

This is an interactive function that asks you to specify a map for the purpose of mapping additional markers to construct a comprehensive map. This map is written to the file chr50.frame and is used as an initial map by MultiMap. (It may be just as easy for you to simply put the list of marker numbers of the map directly in the file chr50.frame.) MultiMap will follow the mapping algorithm used for comprehensive maps to try to place all markers in the chr50.names file that are not part of the initial map specified by you.

This is the most convenient way to add a set of markers to an existing map. The marker names of both the markers in the map and any markers to be added to the map must be specified in the second list of markers in the chr50.names file (see description above). The parameter *make-framework* must be set to NIL in the chr50.input file, and the parameter *extend-map* must be set to T. As described above, marker names may be used instead of marker numbers by setting the keyword :names to T. For example,

  * (get-map-from-user)

or

  * (get-map-from-user :names T)

8. print-map

This function prints to the screen the inter-marker recombination distances between markers in the specified map (using CRI-MAP's FIXED option). The output is similar in appearance to that from a CRI-MAP FIXED run. The parameter for this function is a list of marker numbers.

  * (print-map *mapped*)

or

  * (print-map '( ))

Keywords

:both-maps (default NIL)

The value of the :both-maps keyword can be either NIL (default) or 3. When NIL, only a sex-averaged map is printed. When 3, both sex-averaged and sex-specific (female and male) maps are printed.

  * (print-map *mapped* :both-maps 3)

Note  The :names keyword is not yet incorporated into print-map.

9. drop-each-locus

This function performs a drop-each-locus analysis (see II.A.4.c.24 above) to estimate the rate of genotypic error. It is run automatically when the variable *compute-error* is set to T in the ".input" file. It may also be run separately from a MultiMap run, on any set of markers you choose. The output is written to the ".drop" file. The parameter for this function is a list of marker numbers.

  * (drop-each-locus *mapped*)

or

  * (drop-each-locus '( ))

Note  The :names keyword is not yet incorporated into drop-each-locus.

Note  Each time drop-each-locus is run, the output is written to the ".drop" file. Any previous ".drop" file is overwritten.

10. save-results

This function writes the results of all likelihood computations to the ".hash" file. These results are normally saved at the end of a MultiMap run. However, there may be other times when you want to save results:

  - if the program has crashed before completion (exit the debugger first!);
  - if you want to quit and continue later;
  - if you have run one of the above additional functions (results are not automatically saved after running any of these functions).

You can save results as often as you like. Doing so may become time consuming, but no harm will be done.

11. reverse

Reverse is a Lisp function used to reverse the order of items in a list. This may be useful if MultiMap has constructed a map in an upside-down orientation. For example, after completion of map construction, execute the following Lisp command to reverse the order of markers in your map:

  * (setf *mapped* (reverse *mapped*))

You may then wish to run the print-map or write-map-to-file functions to see the final map in the proper orientation. If you wish to repeat the entire run (most likelihoods will not be recomputed) with the map in its correct orientation, use the MultiMap keyword :reverse (see Section II.B.3).

12. find-all-linkage-groups

The purpose of this function is to divide a list of markers into groups in which each marker is linked to at least one other marker in the same group at the specified theta and lod score values. The first and second parameters are the recombination fraction and lod score used to separate groups of markers. The third parameter determines whether the twopoint recombination values and lod scores have already been computed or need to be computed. If they have previously been computed (probably not), the file chrx.two should be in your working directory. If so, set the third parameter to NIL.

  * (find-all-linkage-groups 0.2 3.0 T)

This run will separate all the markers in the .gen file into groups where each marker in each group is linked to at least one other marker in the same group with theta <= 0.2 and lod >= 3.0.
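The grouping criterion just described (every marker in a group is linked to at least one other marker in that group) amounts to single-linkage grouping on the twopoint results. The following Common Lisp sketch illustrates the idea only; it is not the MultiMap implementation. The names linkedp and linkage-groups are hypothetical, the twopoint results are assumed to be available as (marker-a marker-b theta lod) entries, and a marker linked to nothing is returned as a group of one.

```lisp
;; Hypothetical helpers, not part of MultiMap.
(defun linkedp (a b pairs max-theta min-lod)
  "True when PAIRS holds an entry for markers A and B (in either order)
with theta <= MAX-THETA and lod >= MIN-LOD."
  (some (lambda (entry)
          (destructuring-bind (m1 m2 theta lod) entry
            (and (or (and (eql m1 a) (eql m2 b))
                     (and (eql m1 b) (eql m2 a)))
                 (<= theta max-theta)
                 (>= lod min-lod))))
        pairs))

(defun linkage-groups (markers pairs max-theta min-lod)
  "Add markers one at a time, merging every existing group that contains
a marker linked to the new one."
  (let ((groups '()))
    (dolist (m markers groups)
      (flet ((touches-m (group)
               (some (lambda (x) (linkedp m x pairs max-theta min-lod))
                     group)))
        (let ((linked (remove-if-not #'touches-m groups))
              (others (remove-if #'touches-m groups)))
          (setf groups
                (append others
                        (list (append (apply #'append linked)
                                      (list m))))))))))
```

With made-up results such as ((0 1 0.05 6.0) (1 2 0.30 4.0) (2 3 0.10 3.5) (0 4 0.15 2.0)) and limits of 0.2 and 3.0, markers 0 and 1 form one group and 2 and 3 another, while marker 4 stays alone because its only lod score is below 3.0.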


More information

round decimals to the nearest decimal place and order negative numbers in context

round decimals to the nearest decimal place and order negative numbers in context 6 Numbers and the number system understand and use proportionality use the equivalence of fractions, decimals and percentages to compare proportions use understanding of place value to multiply and divide

More information

Intro to Programming. Unit 7. What is Programming? What is Programming? Intro to Programming

Intro to Programming. Unit 7. What is Programming? What is Programming? Intro to Programming Intro to Programming Unit 7 Intro to Programming 1 What is Programming? 1. Programming Languages 2. Markup vs. Programming 1. Introduction 2. Print Statement 3. Strings 4. Types and Values 5. Math Externals

More information

AutoPagex Plug-in User s Manual

AutoPagex Plug-in User s Manual Page 1 of 32 AutoPagex Plug-in User s Manual Version 1.1 Page 2 of 32 What is AutoPagex plug-in? AutoPagex is an advanced plug-in for Adobe Acrobat and Adobe Acrobat Professional software. It is designed

More information

(Refer Slide Time: 01:12)

(Refer Slide Time: 01:12) Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #22 PERL Part II We continue with our discussion on the Perl

More information

BEAGLECALL 1.0. Brian L. Browning Department of Medicine Division of Medical Genetics University of Washington. 15 November 2010

BEAGLECALL 1.0. Brian L. Browning Department of Medicine Division of Medical Genetics University of Washington. 15 November 2010 BEAGLECALL 1.0 Brian L. Browning Department of Medicine Division of Medical Genetics University of Washington 15 November 2010 BEAGLECALL 1.0 P a g e i Contents 1 Introduction... 1 1.1 Citing BEAGLECALL...

More information

6. Relational Algebra (Part II)

6. Relational Algebra (Part II) 6. Relational Algebra (Part II) 6.1. Introduction In the previous chapter, we introduced relational algebra as a fundamental model of relational database manipulation. In particular, we defined and discussed

More information

Lecture Notes on Contracts

Lecture Notes on Contracts Lecture Notes on Contracts 15-122: Principles of Imperative Computation Frank Pfenning Lecture 2 August 30, 2012 1 Introduction For an overview the course goals and the mechanics and schedule of the course,

More information

Math 25 and Maple 3 + 4;

Math 25 and Maple 3 + 4; Math 25 and Maple This is a brief document describing how Maple can help you avoid some of the more tedious tasks involved in your Math 25 homework. It is by no means a comprehensive introduction to using

More information

Mastering Linux by Paul S. Wang Appendix: The emacs Editor

Mastering Linux by Paul S. Wang Appendix: The emacs Editor Mastering Linux by Paul S. Wang Appendix: The emacs Editor The emacs editor originally was developed at the MIT Laboratory for Computer Science. As emacs gained popularity, it was ported to UNIX and Linux

More information

Java How to Program, 10/e. Copyright by Pearson Education, Inc. All Rights Reserved.

Java How to Program, 10/e. Copyright by Pearson Education, Inc. All Rights Reserved. Java How to Program, 10/e Education, Inc. All Rights Reserved. Each class you create becomes a new type that can be used to declare variables and create objects. You can declare new classes as needed;

More information

COMPUTER SCIENCE LARGE PRACTICAL.

COMPUTER SCIENCE LARGE PRACTICAL. COMPUTER SCIENCE LARGE PRACTICAL Page 45 of 100 SURVEY RESULTS Approx. 1/5 of class responded; statistically significant? The majority of you have substantial experience in Java, and all have at least

More information

MICROSOFT EXCEL 2003 LEVEL 3

MICROSOFT EXCEL 2003 LEVEL 3 MICROSOFT EXCEL 2003 LEVEL 3 WWP Training Limited Page 1 STUDENT EDITION LESSON 1 - USING LOGICAL, LOOKUP AND ROUND FUNCTIONS... 7 Using Lookup Functions... 8 Using the VLOOKUP Function... 8 Using the

More information

Introduction to MATLAB

Introduction to MATLAB Introduction to MATLAB Introduction MATLAB is an interactive package for numerical analysis, matrix computation, control system design, and linear system analysis and design available on most CAEN platforms

More information

Problem 1: Hello World!

Problem 1: Hello World! Problem 1: Hello World! Instructions This is the classic first program made famous in the early 70s. Write the body of the program called Problem1 that prints out The text must be terminated by a new-line

More information

Chapter 10: File Input / Output

Chapter 10: File Input / Output C: Chapter10 Page 1 of 6 C Tutorial.......... File input/output Chapter 10: File Input / Output OUTPUT TO A FILE Load and display the file named formout.c for your first example of writing data to a file.

More information

Essential Skills for Bioinformatics: Unix/Linux

Essential Skills for Bioinformatics: Unix/Linux Essential Skills for Bioinformatics: Unix/Linux SHELL SCRIPTING Overview Bash, the shell we have used interactively in this course, is a full-fledged scripting language. Unlike Python, Bash is not a general-purpose

More information

Documentation for BayesAss 1.3

Documentation for BayesAss 1.3 Documentation for BayesAss 1.3 Program Description BayesAss is a program that estimates recent migration rates between populations using MCMC. It also estimates each individual s immigrant ancestry, the

More information

Hashing. Hashing Procedures

Hashing. Hashing Procedures Hashing Hashing Procedures Let us denote the set of all possible key values (i.e., the universe of keys) used in a dictionary application by U. Suppose an application requires a dictionary in which elements

More information

Breeding Guide. Customer Services PHENOME-NETWORKS 4Ben Gurion Street, 74032, Nes-Ziona, Israel

Breeding Guide. Customer Services PHENOME-NETWORKS 4Ben Gurion Street, 74032, Nes-Ziona, Israel Breeding Guide Customer Services PHENOME-NETWORKS 4Ben Gurion Street, 74032, Nes-Ziona, Israel www.phenome-netwoks.com Contents PHENOME ONE - INTRODUCTION... 3 THE PHENOME ONE LAYOUT... 4 THE JOBS ICON...

More information

Modern Programming Languages. Lecture LISP Programming Language An Introduction

Modern Programming Languages. Lecture LISP Programming Language An Introduction Modern Programming Languages Lecture 18-21 LISP Programming Language An Introduction 72 Functional Programming Paradigm and LISP Functional programming is a style of programming that emphasizes the evaluation

More information

Section Graphs and Lines

Section Graphs and Lines Section 1.1 - Graphs and Lines The first chapter of this text is a review of College Algebra skills that you will need as you move through the course. This is a review, so you should have some familiarity

More information

Lecture 5. Essential skills for bioinformatics: Unix/Linux

Lecture 5. Essential skills for bioinformatics: Unix/Linux Lecture 5 Essential skills for bioinformatics: Unix/Linux UNIX DATA TOOLS Text processing with awk We have illustrated two ways awk can come in handy: Filtering data using rules that can combine regular

More information

Linkage analysis with paramlink Appendix: Running MERLIN from paramlink

Linkage analysis with paramlink Appendix: Running MERLIN from paramlink Linkage analysis with paramlink Appendix: Running MERLIN from paramlink Magnus Dehli Vigeland 1 Introduction While multipoint analysis is not implemented in paramlink, a convenient wrapper for MERLIN (arguably

More information

Chapter 1 & 2 Introduction to C Language

Chapter 1 & 2 Introduction to C Language 1 Chapter 1 & 2 Introduction to C Language Copyright 2007 by Deitel & Associates, Inc. and Pearson Education Inc. All Rights Reserved. Chapter 1 & 2 - Introduction to C Language 2 Outline 1.1 The History

More information

EC121 Mathematical Techniques A Revision Notes

EC121 Mathematical Techniques A Revision Notes EC Mathematical Techniques A Revision Notes EC Mathematical Techniques A Revision Notes Mathematical Techniques A begins with two weeks of intensive revision of basic arithmetic and algebra, to the level

More information

M(ARK)S(IM) Dec. 1, 2009 Payseur Lab University of Wisconsin

M(ARK)S(IM) Dec. 1, 2009 Payseur Lab University of Wisconsin M(ARK)S(IM) Dec. 1, 2009 Payseur Lab University of Wisconsin M(ARK)S(IM) extends MS by enabling the user to simulate microsatellite data sets under a variety of mutational models. Simulated data sets are

More information

CHAPTER 1: INTRODUCTION...

CHAPTER 1: INTRODUCTION... Linkage Analysis Package User s Guide to Analysis Programs Version 5.10 for IBM PC/compatibles 10 Oct 1996, updated 2 November 2013 Table of Contents CHAPTER 1: INTRODUCTION... 1 1.0 OVERVIEW... 1 1.1

More information

Our Strategy for Learning Fortran 90

Our Strategy for Learning Fortran 90 Our Strategy for Learning Fortran 90 We want to consider some computational problems which build in complexity. evaluating an integral solving nonlinear equations vector/matrix operations fitting data

More information

A Tutorial for Excel 2002 for Windows

A Tutorial for Excel 2002 for Windows INFORMATION SYSTEMS SERVICES Writing Formulae with Microsoft Excel 2002 A Tutorial for Excel 2002 for Windows AUTHOR: Information Systems Services DATE: August 2004 EDITION: 2.0 TUT 47 UNIVERSITY OF LEEDS

More information

Virtual CD TS 1 Introduction... 3

Virtual CD TS 1 Introduction... 3 Table of Contents Table of Contents Virtual CD TS 1 Introduction... 3 Document Conventions...... 4 What Virtual CD TS Can Do for You...... 5 New Features in Version 10...... 6 Virtual CD TS Licensing......

More information

Rational Numbers CHAPTER Introduction

Rational Numbers CHAPTER Introduction RATIONAL NUMBERS Rational Numbers CHAPTER. Introduction In Mathematics, we frequently come across simple equations to be solved. For example, the equation x + () is solved when x, because this value of

More information

Table of Contents EVALUATION COPY

Table of Contents EVALUATION COPY Table of Contents Introduction... 1-2 A Brief History of Python... 1-3 Python Versions... 1-4 Installing Python... 1-5 Environment Variables... 1-6 Executing Python from the Command Line... 1-7 IDLE...

More information

Box-Cox Transformation for Simple Linear Regression

Box-Cox Transformation for Simple Linear Regression Chapter 192 Box-Cox Transformation for Simple Linear Regression Introduction This procedure finds the appropriate Box-Cox power transformation (1964) for a dataset containing a pair of variables that are

More information

1 Lexical Considerations

1 Lexical Considerations Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.035, Spring 2013 Handout Decaf Language Thursday, Feb 7 The project for the course is to write a compiler

More information

DCN Delegate Database. Software User Manual LBB3580

DCN Delegate Database. Software User Manual LBB3580 DCN en LBB580 GENERAL CONTENTS Chapter 1-1.1 About Chapter 2 - Getting Started 2.1 Starting 2.2 Using Help Chapter - Preparing for a Conference.1 The main window.2 Working with names files. Entering delegate

More information

NOTE: Answer ANY FOUR of the following 6 sections:

NOTE: Answer ANY FOUR of the following 6 sections: A-PDF MERGER DEMO Philadelphia University Lecturer: Dr. Nadia Y. Yousif Coordinator: Dr. Nadia Y. Yousif Internal Examiner: Dr. Raad Fadhel Examination Paper... Programming Languages Paradigms (750321)

More information

Using LookoutDirect. Overview of the Process Development Cycle

Using LookoutDirect. Overview of the Process Development Cycle 5 Overview of the Process Development Cycle The first step in developing a process file is creating a process file. After the file is created, control panels are added. Control panels are windows you use

More information

Contents. Jairo Pava COMS W4115 June 28, 2013 LEARN: Language Reference Manual

Contents. Jairo Pava COMS W4115 June 28, 2013 LEARN: Language Reference Manual Jairo Pava COMS W4115 June 28, 2013 LEARN: Language Reference Manual Contents 1 Introduction...2 2 Lexical Conventions...2 3 Types...3 4 Syntax...3 5 Expressions...4 6 Declarations...8 7 Statements...9

More information

!"!!!"!!"!! = 10!!!!!(!!) = 10! = 1,000,000

!!!!!!!! = 10!!!!!(!!) = 10! = 1,000,000 Math Review for AP Chemistry The following is a brief review of some of the math you should remember from your past. This is meant to jog your memory and not to teach you something new. If you find you

More information

number Understand the equivalence between recurring decimals and fractions

number Understand the equivalence between recurring decimals and fractions number Understand the equivalence between recurring decimals and fractions Using and Applying Algebra Calculating Shape, Space and Measure Handling Data Use fractions or percentages to solve problems involving

More information

Intermediate Algebra. Gregg Waterman Oregon Institute of Technology

Intermediate Algebra. Gregg Waterman Oregon Institute of Technology Intermediate Algebra Gregg Waterman Oregon Institute of Technology c 2017 Gregg Waterman This work is licensed under the Creative Commons Attribution 4.0 International license. The essence of the license

More information

\n is used in a string to indicate the newline character. An expression produces data. The simplest expression

\n is used in a string to indicate the newline character. An expression produces data. The simplest expression Chapter 1 Summary Comments are indicated by a hash sign # (also known as the pound or number sign). Text to the right of the hash sign is ignored. (But, hash loses its special meaning if it is part of

More information

A Big Step. Shell Scripts, I/O Redirection, Ownership and Permission Concepts, and Binary Numbers

A Big Step. Shell Scripts, I/O Redirection, Ownership and Permission Concepts, and Binary Numbers A Big Step Shell Scripts, I/O Redirection, Ownership and Permission Concepts, and Binary Numbers Copyright 2006 2009 Stewart Weiss What a shell really does Here is the scoop on shells. A shell is a program

More information

Introduction to Bioinformatics Problem Set 3: Genome Sequencing

Introduction to Bioinformatics Problem Set 3: Genome Sequencing Introduction to Bioinformatics Problem Set 3: Genome Sequencing 1. Assemble a sequence with your bare hands! You are trying to determine the DNA sequence of a very (very) small plasmids, which you estimate

More information

There are two ways to use the python interpreter: interactive mode and script mode. (a) open a terminal shell (terminal emulator in Applications Menu)

There are two ways to use the python interpreter: interactive mode and script mode. (a) open a terminal shell (terminal emulator in Applications Menu) I. INTERACTIVE MODE VERSUS SCRIPT MODE There are two ways to use the python interpreter: interactive mode and script mode. 1. Interactive Mode (a) open a terminal shell (terminal emulator in Applications

More information

Simulator. Chapter 4 Tutorial: The SDL

Simulator. Chapter 4 Tutorial: The SDL 4 Tutorial: The SDL Simulator The SDL Simulator is the tool that you use for testing the behavior of your SDL systems. In this tutorial, you will practice hands-on on the DemonGame system. To be properly

More information

Package inversion. R topics documented: July 18, Type Package. Title Inversions in genotype data. Version

Package inversion. R topics documented: July 18, Type Package. Title Inversions in genotype data. Version Package inversion July 18, 2013 Type Package Title Inversions in genotype data Version 1.8.0 Date 2011-05-12 Author Alejandro Caceres Maintainer Package to find genetic inversions in genotype (SNP array)

More information

Chapter 2 THE STRUCTURE OF C LANGUAGE

Chapter 2 THE STRUCTURE OF C LANGUAGE Lecture # 5 Chapter 2 THE STRUCTURE OF C LANGUAGE 1 Compiled by SIA CHEE KIONG DEPARTMENT OF MATERIAL AND DESIGN ENGINEERING FACULTY OF MECHANICAL AND MANUFACTURING ENGINEERING Contents Introduction to

More information

Introduction to MATLAB

Introduction to MATLAB Chapter 1 Introduction to MATLAB 1.1 Software Philosophy Matrix-based numeric computation MATrix LABoratory built-in support for standard matrix and vector operations High-level programming language Programming

More information

LISP. Everything in a computer is a string of binary digits, ones and zeros, which everyone calls bits.

LISP. Everything in a computer is a string of binary digits, ones and zeros, which everyone calls bits. LISP Everything in a computer is a string of binary digits, ones and zeros, which everyone calls bits. From one perspective, sequences of bits can be interpreted as a code for ordinary decimal digits,

More information

Excel Introduction to Excel Databases & Data Tables

Excel Introduction to Excel Databases & Data Tables Creating an Excel Database Key field: Each record should have some field(s) that helps to uniquely identify them, put these fields at the start of your database. In an Excel database each column is a field

More information