Structural analysis and haplotype diversity in swine LEP and MC4R genes


J. Anim. Breed. Genet. ISSN

ORIGINAL ARTICLE

Structural analysis and haplotype diversity in swine LEP and MC4R genes

M. D'Andrea, F. Pilla, E. Giuffra, D. Waddington & A.L. Archibald

University of Molise, Dip. S.A.V.A., Campobasso, Italy
Parco Tecnologico Padano CERSA, Via A. Einstein, Lodi, Italy
Division of Genetics and Genomics, Roslin Institute (Edinburgh), Roslin, Midlothian, Scotland, UK

Keywords
Casertana pig breed; haplotypes; leptin; melanocortin-4 receptor.

Correspondence
Mariasilvia D'Andrea, University of Molise, Dip. S.A.V.A., Via De Sanctis snc, Campobasso, Italy. E-mail: dandrea@unimol.it

Received: February; accepted: September

Summary

Knowledge of structural variation in candidate genes could help to improve breeding selection schemes and to preserve genetic variability in livestock species. The leptin (LEP) and melanocortin-4 receptor (MC4R) genes are involved in the regulation of energy balance and are obvious candidate genes for fatness. Sequencing LEP and MC4R in pigs from lean breeds (Large White and Duroc), fat breeds (Meishan and Casertana) and Wild Boar revealed multiple polymorphic sites in the leptin sequence, some of them novel, whereas only the previously described mutation was found in MC4R. Several LEP haplotypes were observed, and their distribution was unequal among the breeds. The phylogenetic analysis showed two haplotype branches that distinguish lean from fat breeds.

Introduction

Analysis of the genetic variability of livestock sheds light on their evolution and provides valuable information for breed conservation and management (Bruford et al.). Moreover, knowledge of structural genetic variability in candidate genes for economically important traits can contribute to understanding the genetic control of such complex traits. The leptin gene (LEP) is the homologue of the murine Lep gene, also known as Obese or Ob. LEP encodes a protein expressed in adipose tissue that plays a key role in food intake.
Genetic defects in the murine and human leptin genes are associated with obesity traits (Zhang et al.). The function of leptin has been studied in other species, where a relationship between a positive energy balance and circulating levels of leptin has been observed (Cusin et al.; Kolaczynski et al.; Barb et al. a,b). In swine, seven polymorphisms were found in the LEP gene, and association analyses with production traits have produced differing results (Jiang & Gibson; Kennes et al.; Chen et al.; Szydlowski et al.; de Oliveira Peixoto et al.).

In concert with leptin, the melanocortin-4 receptor (MC4R) has a key role in regulating feed intake and energy balance (Kim et al.; Huston et al.). The missense mutation AspAsn originally described by Kim et al. has been tested for associations with performance in different pig populations (Kim et al.; Park et al.; Stachowiak et al.; Bruun et al.; Jokubka et al.; Kim et al.; Meidtner et al.), but with conflicting results. These led some authors to conclude either that the MC4R variant is not the causative mutation but is closely linked to the true quantitative trait nucleotide (QTN), or that there may be an epistatic interaction between quantitative trait loci (QTL) (Bruun et al.). Nevertheless, MC4R is associated with growth and leanness in most populations.

Journal compilation © Blackwell Verlag, Berlin. No claim to original US government works. J. Anim. Breed. Genet.

The Casertana (CT) pig is a local breed from Southern Italy characterized by slow growth and massive accumulation of backfat (Pietrolà et al.). It has been largely abandoned, mainly for economic reasons,

i.e. because of the long time it takes to reach mature body weight and the preference of consumers for lean meat. As an autochthonous breed it could be linked to specialized local products as a result of tradition and practice; moreover, it represents useful genotypes and may be a reservoir of alleles that could be important for future breeding objectives and research. In this paper, we explored the genetic variation of LEP and MC4R segregating in CT versus Large White (LW), a major commercial breed. Additionally, two reference sequences each from Duroc (DU), Meishan (Me) and Wild Boar (WB) were included in the data set.

Material and methods

LEP and MC4R sequences were successfully obtained from LW and CT animals and from two WB, two DU and two Me pigs. DNA was isolated from blood using a phenol-chloroform method. To analyse the seven LEP and one MC4R previously identified polymorphic positions, the primers and amplification conditions were as described previously (Jiang & Gibson; Kim et al.; Kennes et al.). Moreover, to obtain the complete sequence of the two genes, overlapping PCR primers for LEP and MC4R were designed using the primer software on the EMBL-Bank (European Molecular Biology Laboratory Nucleotide Sequence Database) sequences U and AF for LEP and AB for MC4R. The amplifications were conducted in µl volumes using ng of genomic DNA, × buffer, mM MgCl2, mM dNTP, µM of each primer and U Taq polymerase (Roche Taq DNA Polymerase, Roche Diagnostics, Pleasanton, CA, USA). Thermal cycling conditions differed according to the primer pair. The primer Table reports the primers used to amplify LEP (PLF to PbisL) and MC4R (PMF to PM), together with LEP-specific sequencing primers (PF to P) that were needed to overcome sequencing problems in samples from animals heterozygous for ins/del. The amplified fragment sizes and their localization within the genes are also reported (Table).
The amplicons were purified using the QIAquick PCR purification kit (Qiagen, Inc., Valencia, CA, USA) and sequenced by standard protocols on an ABI PRISM instrument (Applied Biosystems, Foster City, CA, USA). The sequences were aligned to the reference swine sequences in the NCBI database (LEP: U and AF; MC4R: AB). The NCBI Map Viewer web resource was used to scan for mutations in functional sites. Haplotype analysis was performed using the PHASE program (Stephens et al.; Stephens & Scheet). Phylogenetic and molecular evolutionary analyses were conducted using the MEGA software with the Maximum Parsimony and bootstrap options (Kumar et al.).

Results and discussion

The sequenced LEP region included intron, exon, intron, exon and the UTR sequence known in swine (Soares & Guimarães). The polymorphisms detected comprised single nucleotide polymorphisms (SNPs), of which only seven had been previously reported in pigs (Jiang & Gibson; Kennes et al.), five deletions (C, T, AAG, AG and C relative to the reference sequence U) and four insertions, one of which was also associated with an inversion (insCCCinvTGC, AA, GGGTGGACGTGG and AA relative to the reference sequence U), none of which had been reported before. All the new polymorphisms were found in either introns or the UTR. The CT LEP gene sequence containing all the polymorphisms has been deposited in the EMBL database (accession number: AJ). None of the polymorphic LEP sites was identified as potentially functional by the NCBI Map Viewer analysis.

Haplotypes were inferred with PHASE (see the two haplotype Tables). Despite the high number of mutations, only five haplotypes (A, B, C, D, E) were represented more than once in the CT and LW animals (Table). The B haplotype was found only in CT. The unrooted tree showed two major branches (L is the third), in one of which haplotype B and closely related variants, including three Me variants, cluster together (Figure).
The B and E haplotypes could be considered the characteristic CT haplotypes; the presence of the A, C and N haplotypes in CT animals most likely arose from crossbreeding with other breeds to improve performance. Several of the nodes in the tree are not well supported by the bootstrap testing, but the tree formed from the five multiply represented haplotypes alone has two tight clusters, (B, E) and (A, C, D), supported by the bootstrap samples (not presented). Casertana and LW are mostly represented by the A, B, C and D haplotypes, but with very different frequencies, and the B haplotype was found only in the CT breed. This finding reflects population history and bottleneck effects typical of small traditional populations (S. Palermo, E. Capra, M. Torremorrell, M. Dolzan, R. Davoli, C.S. Haley & E. Giuffra, in

Table. LEP and MC4R amplification and sequencing primers, fragment sizes and localization

Amplification primers
Primer | Sequence | Length in bp and gene position of the amplicon | Reference sequence | Annealing condition
PLF | ATA CCC AGC CCA GGG GAC | int | U | Touch down
PL | TCT CCA GGC TTT TAT GAG GA | | U | °C for s
PLF | TGA TCC TCA TAA AAG CCT GGA | int & ex | U |
PL | TCC TGG TGA CAA TCG TCT TG | | U |
PLF | GTT TCC AGG CCC CAG AAG | ex | U |
PL | GAA ATG TCA CTG ATC CTG GTG A | | U |
PLF | AGG GTC ACC GGT TTG GAC T | ex part. | U |
PL | ACC ACC TCC GTG GAG TAG AG | | U |
PLF | AAG CCT CCC TCT ACT CCA CG | part. ex & UTR | U |
PL | AAA GGC TGG TGT TTT GCT TC | | U |
PLF | AAG CAA AAC ACC AGC CTT TC | UTR | U |
PL | GGG GCT GAG CAC AAT AGA TG | | U |
PbisLF | TGA CAC CAA AAC CCT CAT CA | part. ex & int | U | Touch down
PbisL | TCA GCT GTC ACC AGG AAG AA | | U | °C for s
PbisLF | TCT TCC TGG TGA CAG CTG AA | int | U |
PbisL | GTC TGT GCT GGG AGC TGT CT | | U |
PbisLF | GCC CAT GTT CCC ACA CTA AC | int | U |
PbisL | CAA AGC CAC AAC CGA AAA CT | | U |
PbisLF | GTG GCT TTG ATA GCA CCC AG | int | U |
PbisL | CAG CCA CGA CTG TCT GTT TC | | U |
PbisLF | ACA GAC AGT CGT GGC TGG TT | int & part. ex | U |
PbisL | TTT CTG GAA GGC AGA CTG GT | | U |
PbisLF | AAA CAA GGA GGC ATG GGT TT | UTR | U |
PbisL | GCT ATC CTG CTT CAA AGG GA | | U |
PbisLF | TCC CTT TGA AGC AGG ATA GC | UTR | U |
PbisL | CAA ACA AAA CAG CCT CCT CC | | U |
PLF | AAG TGT TTG CTG GAA GAG CG | UTR | AF |
PL | TTT CAG GGG GCA AAG GTA AT | | AF |
PLF | TGC AGA CAG CTC CGA TTA GA | UTR | U |
PL | GGA CAC GCT GGA TCT GTC AT | | U |
PbisLF | TTA CAG GAA GGC AGA CAG CTC | int | U | Touch down
PbisL | GTT AGT GTG GGA ACA TGG GC | | U | °C for s
PbisLF | ATC AAG CAG GGT TCC ATC TG | UTR | U |
PbisL | CCC ATG GAT GGT ACT GGA AA | | U |
PMF | AAA GAA GCA GAG GAG GAG CC | UTR | AB | Touch down
PM | CCT CAG CGA TTT TCT CCA AG | | AB | °C for s
PMF | CTT GGA GAA AAT CGC TGA GG | part. UTR - ex | AB |
PM | GTG CAG ACT GCC CAG ATA CA | | AB |
PMF | CCC CTT GGA AAA GGC TAC TC | part. ex | AB |
PM | GAC AAA TCA CAG AGG CCA CC | | AB |
PMF | ATG TTC CTC ATG GCC AGA CT | part. ex UTR | AB |
PM | CAG GGA GAA TGA GCA TGGT TTT | | AB |

Further LEP primers specific for sequencing
Primer | Sequence | Reference sequence
PF | TTA TCC TCC TTC TTC CC | U
 | CTA CAG ATT AGA ACA TTC C | U
P | GTA GAG GGT TGT ATA GG | U
PF | AGA AGA GGC ATC TGG AG | U
P | GTT TGG AGG AGA CAG A | U
PF | AGG AAGTGT GTT GGT GG | U
P | GAC CAT CTG CTA AAG CC | U
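As a quick sanity check on primers such as those listed in the Table, the GC content and an approximate melting temperature can be computed from the sequence alone. The sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C) °C, a common rule of thumb for short oligonucleotides; it is not the method the authors report (they used primer-design software), and the function names are illustrative only. The two sequences are the first forward and reverse LEP primers as transcribed above.

```python
def clean(primer: str) -> str:
    """Remove the triplet spacing used in the table and upper-case the sequence."""
    return "".join(primer.split()).upper()


def gc_content(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    seq = clean(primer)
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Approximate melting temperature in degrees C by the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C); a rough estimate for short oligos only.
    """
    seq = clean(primer)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# First LEP primer pair as transcribed in the primer table.
forward = "ATA CCC AGC CCA GGG GAC"
reverse = "TCT CCA GGC TTT TAT GAG GA"

for name, primer in (("forward", forward), ("reverse", reverse)):
    print(name, len(clean(primer)), round(gc_content(primer), 2), wallace_tm(primer))
```

A pair whose two estimates lie close together, as here, is generally easier to amplify under a single annealing regime; touchdown cycling, as listed in the Table, is one way to tolerate residual differences.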

Table. LEP haplotypes in the population analysed: CT, LW, Me, D, WB

Regions covered, left to right: intron, intron, ex, UTR
Column markers: HAP, INS, INV, ins, ins

A  T C C A A T T C G G n n G T G C C T G G G G C G A G A A A T C C A G A G n A C C C G T C G C C T n G G n C T T T C G C G G
B  G T * G G C C T * * y y A C A T T C A * * * * * * * * * * C * * G A G A y T * * T * G * * * T G y A * y A C * * T * T * *
C  T C * A A T T C * * n n G T G C C T G * * * * * * * * * * T * * A G A G n A * A C * T * * * C T n G * n C T * * C * C * *
D  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * C * * * * * * * * * * * * * * C * * * * * *
E  G * * G G C C T * * y y A C A T T C A * * * * * * * * * * C * * G A G A y T * * T * G * * * T G y A * y A C T * T * T * *
F  * * * A A T T C T * n n G T G C C T G T A * G A G * G C G * * A A * * * * * T * * * * T T T * * n G A n C T C * C A * * C
G  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
H  * * * G G C C T G * y y A C A T T C A G G * C G A * A A A T * C * G A G n A C * C * T C G C C T * * G * * * T * * G C * G
I  * * * * * * * * * * * * * * * * * * * * * * * A G * G C G C * A * * * * * T * * * A G * * * T G * * A * * * C * * A T * C
J  * * * * * * * * * * n n G T G * * * * * * A * G A * A A A * * C G * * * * A * A * G T * * * C T * * G * * * T * * G C * G
K  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * y * * C * * * * * * * * y * * * * * * * T * T * *
L  * * T A A T T C * * * * * * * C C T G * * G * A G A G * * * T * A * * * n T * * * A G * T T T G n * A * * * C * C * C A *
M  T * C * * * * * * * y y A C A T T C A * * * * G A G A * * T C * * * * * * A * * * G T * G C C T * * G * * * T * * * * G *
N  * * * * * * * * * * n n G T G C C T G * * * * * * * * * * C * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
O  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * T * * * A G * T T T G * * * * * * * * * * * * *
P  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * T * * * * * * * A * * * G T * G C C T * * * * * * * * T * T * *
Q  * * * * * * * * * * * * * * * * * * * * * * * * * * * C G * * * * * * * * * * A * * * * * * * * * * * * * * * * C * C * *
R  * * * * * * * * * * * * * * * * * * * * * * G * * * * A A * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

LEP, leptin gene; CT, Casertana; LW, Large White; Me, Meishan; WB, Wild Boar. Positions refer to the sequence U. * = matches the previous nucleotide, deletion or insertion at the same position; y = presence of mutation; n = absence of mutation.

Table. LEP haplotypes in the population analysed: CT, LW, Me, D, WB (continued: UTR)

Columns: HAP, UTR positions (including ins), followed by the number of observations in CT, LW, Me, D, WB and Total

A  G T C T T T C G G G G n C A T n T T C C G n T C G n A G C G C G C G A C C
B  * C G C * * T A A A A y T G G y C C * * A y C T A y * * * A T A A A G T T
C  * T C T * * C G G G G n C A T n T T * * G n T C G n * * * G C G C G A C C
D  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
E  * C G C * * T * A A A y T G G y C C * * A y C T A y * * * A T A A A G T T
F  * * * * * * * A * * * * * * * * * * * * G * * C G n G A T G * * * * * * *
G  * * * * * * * * * * * * * * * * * * T * * * * * * * * * * * * * * * * * *
H  * T C T * * C G G G G n C A T n T T C * * n T * * * A G C * C G C G A C C
I  * C G C * * T A A A A y T G G y C C * * * y C * * * G A T * T A A A G T T
J  * T C T * * C G G G G n C A T n T T * * * n T * * * A G C * C G C G A C C
K  * C G C * * T A A A A y T G G y C C * * A y C T A y * * * A T A A A G T T
L  A * C * C C C G G G G n * * * * * * * T G * * C G n G A T G * * * * * * *
M  G T * T T T * * * * * * C A T n T T * C * n T * * * A G C * C G C G A C C
N  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
O  * * * * * * * * * * * * * * * * * * * * * y C * * * G A T * T A A A G T T
P  * C G C * * T A A A A y T G G y C C * * * n T * * * A G C * C G C G A C C
Q  * T C T * * C G G G G n C A T n T T * * * * * * * * * * * * * * * * * * *
R  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

LEP, leptin gene; CT, Casertana; LW, Large White; Me, Meishan; WB, Wild Boar. Positions refer to the sequence U; marked positions refer to the sequence AF. * = matches the previous nucleotide, deletion or insertion at the same position; y = presence of mutation; n = absence of mutation.
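The compact notation of the haplotype tables can be expanded programmatically before haplotypes are compared. The sketch below is illustrative only and rests on one assumption: following the table footnote, "*" is read as "same state as the row above at this position". The function names are hypothetical, and the toy input is the first eight columns of rows A to E of the table above as transcribed; the original column positions are not available here.

```python
def expand_rows(rows):
    """Resolve '*' placeholders row by row.

    rows: list of (label, tokens) in table order. The first row must be
    fully specified; in later rows '*' inherits the resolved state of the
    row directly above (an assumption based on the table footnote).
    """
    resolved = {}
    previous = None
    for label, tokens in rows:
        if previous is None:
            if "*" in tokens:
                raise ValueError("first row must not contain '*'")
            current = list(tokens)
        else:
            if len(tokens) != len(previous):
                raise ValueError(f"row {label} has a different number of columns")
            current = [p if t == "*" else t for t, p in zip(tokens, previous)]
        resolved[label] = current
        previous = current
    return resolved


def n_differences(hap_a, hap_b):
    """Number of positions at which two resolved haplotypes differ."""
    return sum(x != y for x, y in zip(hap_a, hap_b))


# First eight columns of rows A to E, copied from the table above.
rows = [
    ("A", "G T C T T T C G".split()),
    ("B", "* C G C * * T A".split()),
    ("C", "* T C T * * C G".split()),
    ("D", "* * * * * * * *".split()),
    ("E", "* C G C * * T *".split()),
]

resolved = expand_rows(rows)
for label, states in resolved.items():
    print(label, "".join(states), n_differences(resolved["A"], states))
```

On this small excerpt the expansion groups A with C and D, and B with E, which is the same grouping the text reports for the five multiply represented haplotypes; a full comparison would need every column of both tables.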

preparation). However, it is interesting that the B haplotype also identifies a divergent branch of the tree which is closely related to the three Asian variants carried by the Me and WB animals analysed (Figure), and that both CT and Me are fat breeds, distinct from LW and other European breeds highly selected for leanness. This suggests that artificial selection could account for the observed distribution of haplotypes. Like many traditional breeds, the CT is considered to be threatened by extinction; moreover, the breed has a niche in ham production in Southern Italy. Markers for the SNPs that identify the B haplotype of the leptin gene should thus be included in marker panels for the genetic conservation of the existing populations.

Figure. Phylogenetic tree of haplotypes in the leptin (LEP) gene. Numbers are the number of bootstrap samples supporting the corresponding node. Breed key: Casertana (CT), Large White (LW), Wild Boar (WB), Duroc (DU), Meishan (Me). Haplotype labels on the tree: C, Q, D, A, P, M, H, J, N, O, L, F, G, I and K, B, E.

The sequenced MC4R region included the single exon together with flanking sequence upstream and downstream of it. Only one mutation, previously described by Kim et al., was found in MC4R, with frequencies of the G allele of and % in CT and LW, respectively. All the Me, DU and WB pigs studied were homozygous for the G allele. The low level of polymorphism in the MC4R gene accords with the literature. MC4R belongs to the seven-transmembrane G protein-coupled receptors, which have conserved amino acid sequences, so the lack of polymorphisms could be explained by functional constraints and selection pressure (Kim et al.; Huston et al.; Kim et al.). The MC4R data, with their lack of polymorphism, provide a valuable contrast to the LEP gene, in which we have revealed many novel polymorphisms.
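Allele frequencies such as those reported for the G allele of the MC4R mutation follow directly from genotype counts: each homozygote carries two copies of the allele and each heterozygote one. The sketch below shows the calculation; the function name and the genotype counts are hypothetical and are not taken from the study.

```python
def allele_frequency(n_hom: int, n_het: int, n_other_hom: int) -> float:
    """Frequency of an allele from genotype counts at a biallelic locus.

    n_hom: animals homozygous for the allele of interest (two copies each)
    n_het: heterozygous animals (one copy each)
    n_other_hom: animals homozygous for the alternative allele (no copies)
    """
    total_alleles = 2 * (n_hom + n_het + n_other_hom)
    if total_alleles == 0:
        raise ValueError("no genotyped animals")
    return (2 * n_hom + n_het) / total_alleles


# Hypothetical counts for one breed: 12 GG, 6 heterozygous, 2 without G.
example_freq = allele_frequency(12, 6, 2)
print(f"G allele frequency: {example_freq:.2f}")
```

A breed in which every animal is homozygous for G, as reported here for the Me, DU and WB pigs, gives a frequency of 1.0 by the same formula.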
In part, the MC4R data also act as a control for the rigour of our experimental work and confirm that the polymorphisms found in the CT samples are not artefactual. The LEP gene could be considered a hot-spot gene. In the recent work of Stachowiak et al., a further four novel polymorphisms were found in a fragment of the promoter region. The extensive LEP polymorphism not previously described may account for the lack of consistency in the results of association analyses between LEP polymorphisms and traits. The association analyses conducted to date considered single SNPs rather than haplotypes.

In conclusion, whilst the results of association studies in swine between polymorphisms at the LEP locus and performance are inconsistent, the markedly greater level of polymorphism observed in the fat CT breed, which shares this variation with other fat pigs (WB, Me), in contrast to the lower levels of variation found in lean breeds (LW and DU), would be consistent with unwitting selection on the LEP locus or linked loci in the pursuit of lean genotypes. Indeed, these data have some of the features of a signature of selection. The new LEP polymorphisms described here provide the tools for future, more rigorous studies of associations between LEP genotypes and performance.

References

Barb C.R., Hausman J.H., Houseknecht K.L. (a) Biology of leptin in the pig. Domest. Anim. Endocrinol.
Barb C.R., Barrett J.B., Kraeling R.R., Rampacek G.B. (b) Serum leptin concentrations, luteinizing hormone and growth hormone secretion during feed and metabolic fuel restriction in the prepuberal gilt. Domest. Anim. Endocrinol.
Bruford M.W., Bradley D.G., Luikart G. DNA markers reveal the complexity of livestock domestication. Nature Rev. Genet.
Bruun C.S., Jørgensen C.B., Nielsen V.H., Andersson L., Fredholm M. Evaluation of the porcine melanocortin 4 receptor (MC4R) gene as a positional candidate for fatness QTL in a cross between Landrace and Hampshire. Anim. Genet.
Chen C.C., Chang T., Su H.Y. Genetic polymorphisms in porcine leptin gene and their association with reproduction and production traits. Aust. J. Agric. Res.
Cusin I., Sainsbury A., Rohner-Jeanrenaud F. The ob gene and insulin: a relationship leading to clues to the understanding of obesity. Diabetes.
Huston R.D., Cameron N.D., Rance K.A. A melanocortin-4 receptor (MC4R) polymorphism is associated with performance traits in divergently selected Large White pig populations. Anim. Genet.
Jiang Z.-H., Gibson J.P. Genetic polymorphisms in the leptin gene and their association with fatness in four pig breeds. Mamm. Genome.
Jokubka R., Maak S., Kersiene S., Swalve H.H. Association of a melanocortin 4 receptor (MC4R) polymorphism with performance traits in Lithuanian White pigs. J. Anim. Breed. Genet.
Kennes Y.M., Murphy B.D., Pothier F., Palin M.-F. Characterization of swine leptin (LEP) polymorphisms and their association with production traits. Anim. Genet.
Kim K.S., Larsen N., Short T., Plastow G., Rothschild M.F. A missense variant of porcine melanocortin-4 receptor (MC4R) gene is associated with fatness, growth, and feed intake traits. Mamm. Genome.
Kim K.-S., Reecy J.M., Hsu W.H., Anderson L.L., Rothschild M.F. Functional and phylogenetic analyses of a melanocortin-4 receptor mutation in domestic pigs. Domest. Anim. Endocrinol.
Kim K.S., Lee J.J., Shin H.Y., Choi B.H., Lee C.K., Kim J.J., Cho B.W., Kim T.-H. Association of melanocortin 4 receptor (MC4R) and high mobility group AT-hook (HMGA) polymorphisms with pig growth and fat deposition traits. Anim. Genet.
Kolaczynski J.W., Considine R.V., Ohannesian J., Marco C., Opentanova I., Nyce M.R., Myint M., Caro J.F. Responses to leptin in short-term fasting and refeeding in humans: a link with ketogenesis but not ketones themselves. Diabetes.
Kumar S., Tamura K., Nei M. MEGA: Integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief. Bioinform.
Meidtner K., Wermter A.-K., Hinney A., Remschmidt H., Hebebrand J., Fries R. Association of the melanocortin 4 receptor with feed intake and daily gain in F Mangalitsa × Pietrain pigs. Anim. Genet.
de Oliveira Peixoto J., Facioni Guimaraes S.E., Savio Lopes P., Menck Soares M.A., Vieira Pires A., Gualberto Barbosa M.V., de Almeida Torres R., de Almeida e Silva M. Associations of leptin gene polymorphisms with production traits in pigs. J. Anim. Breed. Genet.
Park H.B., Carlborg Ö., Marklund S., Andersson L. Melanocortin-4 receptor (MC4R) genotypes have no major effect on fatness in a Large White × Wild Boar intercross. Anim. Genet.
Pietrolà E., Pilla F., Maiorano G., Matassino D. Morphological traits, reproductive and productive performances of Casertana pigs reared outdoors. Ital. J. Anim. Sci.
Soares M.A.M., Guimarães S.E.F. The role of leptin and its receptors in fat metabolism. Second International Virtual Conference on Pork Quality, November to December.
Stachowiak M., Szydlowski M., Obarzanek-Fojt M., Switonski M. An effect of missense mutation in the porcine melanocortin-4 receptor (MC4R) gene on production traits in Polish pig breeds is doubtful. Anim. Genet.
Stachowiak M., Mackowski M., Madeja Z., Szydlowski M., Buszka A., Kaczmarek P., Rubis B., Mackowiak P., Nowak K.W., Switonski M. Polymorphism of the porcine leptin gene promoter and analysis of its association with gene expression and fatness traits. Biochem. Genet.
Stephens M., Scheet P. Accounting for decay of linkage disequilibrium in haplotype inference and missing data imputation. Am. J. Hum. Genet.
Stephens M., Smith N.J., Donnelly P. A new statistical method for haplotype reconstruction from population data. Am. J. Hum. Genet.
Szydlowski M., Stachowiak M., Mackowski M., Kamyczek M., Eckert R., Rozycki M., Switonski M. No major effect of the leptin gene polymorphism on porcine production traits. J. Anim. Breed. Genet.
Zhang Y., Proenca R., Maffei M., Barone M., Leopold L., Friedman J.M. Positional cloning of the mouse obese gene and its human homologue. Nature.


More information

Sequencing. Computational Biology IST Ana Teresa Freitas 2011/2012. (BACs) Whole-genome shotgun sequencing Celera Genomics

Sequencing. Computational Biology IST Ana Teresa Freitas 2011/2012. (BACs) Whole-genome shotgun sequencing Celera Genomics Computational Biology IST Ana Teresa Freitas 2011/2012 Sequencing Clone-by-clone shotgun sequencing Human Genome Project Whole-genome shotgun sequencing Celera Genomics (BACs) 1 Must take the fragments

More information

Supplementary Materials:

Supplementary Materials: Supplementary Materials: Amino acid codo n Numb er Table S1. Codon usage in all the protein coding genes. RSC U Proportion (%) Amino acid codo n Numb er RSC U Proportion (%) Phe UUU 861 1.31 5.71 Ser UCU

More information

Degenerate Coding and Sequence Compacting

Degenerate Coding and Sequence Compacting ESI The Erwin Schrödinger International Boltzmanngasse 9 Institute for Mathematical Physics A-1090 Wien, Austria Degenerate Coding and Sequence Compacting Maya Gorel Kirzhner V.M. Vienna, Preprint ESI

More information

DNA Sequencing The Shortest Superstring & Traveling Salesman Problems Sequencing by Hybridization

DNA Sequencing The Shortest Superstring & Traveling Salesman Problems Sequencing by Hybridization Eulerian & Hamiltonian Cycle Problems DNA Sequencing The Shortest Superstring & Traveling Salesman Problems Sequencing by Hybridization The Bridge Obsession Problem Find a tour crossing every bridge just

More information

Genome 373: Genome Assembly. Doug Fowler

Genome 373: Genome Assembly. Doug Fowler Genome 373: Genome Assembly Doug Fowler What are some of the things we ve seen we can do with HTS data? We ve seen that HTS can enable a wide variety of analyses ranging from ID ing variants to genome-

More information

Tutorial 1: Exploring the UCSC Genome Browser

Tutorial 1: Exploring the UCSC Genome Browser Last updated: May 12, 2011 Tutorial 1: Exploring the UCSC Genome Browser Open the homepage of the UCSC Genome Browser at: http://genome.ucsc.edu/ In the blue bar at the top, click on the Genomes link.

More information

Scalable Solutions for DNA Sequence Analysis

Scalable Solutions for DNA Sequence Analysis Scalable Solutions for DNA Sequence Analysis Michael Schatz Dec 4, 2009 JHU/UMD Joint Sequencing Meeting The Evolution of DNA Sequencing Year Genome Technology Cost 2001 Venter et al. Sanger (ABI) $300,000,000

More information

Phylogenetics on CUDA (Parallel) Architectures Bradly Alicea

Phylogenetics on CUDA (Parallel) Architectures Bradly Alicea Descent w/modification Descent w/modification Descent w/modification Descent w/modification CPU Descent w/modification Descent w/modification Phylogenetics on CUDA (Parallel) Architectures Bradly Alicea

More information

Supporting Information

Supporting Information Copyright WILEY VCH Verlag GmbH & Co. KGaA, 69469 Weinheim, Germany, 2015. Supporting Information for Small, DOI: 10.1002/smll.201501370 A Compact DNA Cube with Side Length 10 nm Max B. Scheible, Luvena

More information

Purpose of sequence assembly

Purpose of sequence assembly Sequence Assembly Purpose of sequence assembly Reconstruct long DNA/RNA sequences from short sequence reads Genome sequencing RNA sequencing for gene discovery Amplicon sequencing But not for transcript

More information

DNA Fragment Assembly

DNA Fragment Assembly Algorithms in Bioinformatics Sami Khuri Department of Computer Science San José State University San José, California, USA khuri@cs.sjsu.edu www.cs.sjsu.edu/faculty/khuri DNA Fragment Assembly Overlap

More information

Algorithms for Bioinformatics

Algorithms for Bioinformatics Adapted from slides by Alexandru Tomescu, Leena Salmela and Veli Mäkinen, which are partly from http://bix.ucsd.edu/bioalgorithms/slides.php 58670 Algorithms for Bioinformatics Lecture 5: Graph Algorithms

More information

Eulerian Tours and Fleury s Algorithm

Eulerian Tours and Fleury s Algorithm Eulerian Tours and Fleury s Algorithm CSE21 Winter 2017, Day 12 (B00), Day 8 (A00) February 8, 2017 http://vlsicad.ucsd.edu/courses/cse21-w17 Vocabulary Path (or walk): describes a route from one vertex

More information

Multiple Sequence Alignment Gene Finding, Conserved Elements

Multiple Sequence Alignment Gene Finding, Conserved Elements Multiple Sequence Alignment Gene Finding, Conserved Elements Definition Given N sequences x 1, x 2,, x N : Insert gaps (-) in each sequence x i, such that All sequences have the same length L Score of

More information

Assembly in the Clouds

Assembly in the Clouds Assembly in the Clouds Michael Schatz October 13, 2010 Beyond the Genome Shredded Book Reconstruction Dickens accidentally shreds the first printing of A Tale of Two Cities Text printed on 5 long spools

More information

Sequence Assembly Required!

Sequence Assembly Required! Sequence Assembly Required! 1 October 3, ISMB 20172007 1 Sequence Assembly Genome Sequenced Fragments (reads) Assembled Contigs Finished Genome 2 Greedy solution is bounded 3 Typical assembly strategy

More information

de Bruijn graphs for sequencing data

de Bruijn graphs for sequencing data de Bruijn graphs for sequencing data Rayan Chikhi CNRS Bonsai team, CRIStAL/INRIA, Univ. Lille 1 SMPGD 2016 1 MOTIVATION - de Bruijn graphs are instrumental for reference-free sequencing data analysis:

More information

Graphs and Puzzles. Eulerian and Hamiltonian Tours.

Graphs and Puzzles. Eulerian and Hamiltonian Tours. Graphs and Puzzles. Eulerian and Hamiltonian Tours. CSE21 Winter 2017, Day 11 (B00), Day 7 (A00) February 3, 2017 http://vlsicad.ucsd.edu/courses/cse21-w17 Exam Announcements Seating Chart on Website Good

More information

Read Mapping. de Novo Assembly. Genomics: Lecture #2 WS 2014/2015

Read Mapping. de Novo Assembly. Genomics: Lecture #2 WS 2014/2015 Mapping de Novo Assembly Institut für Medizinische Genetik und Humangenetik Charité Universitätsmedizin Berlin Genomics: Lecture #2 WS 2014/2015 Today Genome assembly: the basics Hamiltonian and Eulerian

More information

1. PURPOSE: to describe the standardized laboratory protocol for molecular subtyping of Salmonella enterica serotype Enteritidis.

1. PURPOSE: to describe the standardized laboratory protocol for molecular subtyping of Salmonella enterica serotype Enteritidis. 1. PURPOSE: to describe the standardized laboratory protocol for molecular subtyping of Salmonella enterica serotype Enteritidis. 2. SCOPE: to provide the PulseNet participants with a single protocol for

More information

Graph Algorithms in Bioinformatics

Graph Algorithms in Bioinformatics Graph Algorithms in Bioinformatics Bioinformatics: Issues and Algorithms CSE 308-408 Fall 2007 Lecture 13 Lopresti Fall 2007 Lecture 13-1 - Outline Introduction to graph theory Eulerian & Hamiltonian Cycle

More information

Lecture 5: Markov models

Lecture 5: Markov models Master s course Bioinformatics Data Analysis and Tools Lecture 5: Markov models Centre for Integrative Bioinformatics Problem in biology Data and patterns are often not clear cut When we want to make a

More information

Algorithms for Bioinformatics

Algorithms for Bioinformatics Adapted from slides by Alexandru Tomescu, Leena Salmela and Veli Mäkinen, which are partly from http://bix.ucsd.edu/bioalgorithms/slides.php 582670 Algorithms for Bioinformatics Lecture 3: Graph Algorithms

More information

de novo assembly Rayan Chikhi Pennsylvania State University Workshop On Genomics - Cesky Krumlov - January /73

de novo assembly Rayan Chikhi Pennsylvania State University Workshop On Genomics - Cesky Krumlov - January /73 1/73 de novo assembly Rayan Chikhi Pennsylvania State University Workshop On Genomics - Cesky Krumlov - January 2014 2/73 YOUR INSTRUCTOR IS.. - Postdoc at Penn State, USA - PhD at INRIA / ENS Cachan,

More information

PERFORMANCE ANALYSIS OF DATAMINIG TECHNIQUE IN RBC, WBC and PLATELET CANCER DATASETS

PERFORMANCE ANALYSIS OF DATAMINIG TECHNIQUE IN RBC, WBC and PLATELET CANCER DATASETS PERFORMANCE ANALYSIS OF DATAMINIG TECHNIQUE IN RBC, WBC and PLATELET CANCER DATASETS Mayilvaganan M 1 and Hemalatha 2 1 Associate Professor, Department of Computer Science, PSG College of arts and science,

More information

Finding homologous sequences in databases

Finding homologous sequences in databases Finding homologous sequences in databases There are multiple algorithms to search sequences databases BLAST (EMBL, NCBI, DDBJ, local) FASTA (EMBL, local) For protein only databases scan via Smith-Waterman

More information

Parsimony-Based Approaches to Inferring Phylogenetic Trees

Parsimony-Based Approaches to Inferring Phylogenetic Trees Parsimony-Based Approaches to Inferring Phylogenetic Trees BMI/CS 576 www.biostat.wisc.edu/bmi576.html Mark Craven craven@biostat.wisc.edu Fall 0 Phylogenetic tree approaches! three general types! distance:

More information

A Novel Implementation of an Extended 8x8 Playfair Cipher Using Interweaving on DNA-encoded Data

A Novel Implementation of an Extended 8x8 Playfair Cipher Using Interweaving on DNA-encoded Data International Journal of Electrical and Computer Engineering (IJECE) Vol. 4, No. 1, Feburary 2014, pp. 93~100 ISSN: 2088-8708 93 A Novel Implementation of an Extended 8x8 Playfair Cipher Using Interweaving

More information

WSSP-10 Chapter 7 BLASTN: DNA vs DNA searches

WSSP-10 Chapter 7 BLASTN: DNA vs DNA searches WSSP-10 Chapter 7 BLASTN: DNA vs DNA searches 4-3 DSAP: BLASTn Page p. 7-1 NCBI BLAST Home Page p. 7-1 NCBI BLASTN search page p. 7-2 Copy sequence from DSAP or wave form program p. 7-2 Choose a database

More information

Genome Assembly Using de Bruijn Graphs. Biostatistics 666

Genome Assembly Using de Bruijn Graphs. Biostatistics 666 Genome Assembly Using de Bruijn Graphs Biostatistics 666 Previously: Reference Based Analyses Individual short reads are aligned to reference Genotypes generated by examining reads overlapping each position

More information

Eulerian tours. Russell Impagliazzo and Miles Jones Thanks to Janine Tiefenbruck. April 20, 2016

Eulerian tours. Russell Impagliazzo and Miles Jones Thanks to Janine Tiefenbruck.  April 20, 2016 Eulerian tours Russell Impagliazzo and Miles Jones Thanks to Janine Tiefenbruck http://cseweb.ucsd.edu/classes/sp16/cse21-bd/ April 20, 2016 Seven Bridges of Konigsberg Is there a path that crosses each

More information

de novo assembly Simon Rasmussen 36626: Next Generation Sequencing analysis DTU Bioinformatics Next Generation Sequencing Analysis

de novo assembly Simon Rasmussen 36626: Next Generation Sequencing analysis DTU Bioinformatics Next Generation Sequencing Analysis de novo assembly Simon Rasmussen 36626: Next Generation Sequencing analysis DTU Bioinformatics 27626 - Next Generation Sequencing Analysis Generalized NGS analysis Data size Application Assembly: Compare

More information

Computational Methods for de novo Assembly of Next-Generation Genome Sequencing Data

Computational Methods for de novo Assembly of Next-Generation Genome Sequencing Data 1/39 Computational Methods for de novo Assembly of Next-Generation Genome Sequencing Data Rayan Chikhi ENS Cachan Brittany / IRISA (Genscale team) Advisor : Dominique Lavenier 2/39 INTRODUCTION, YEAR 2000

More information

NGS NEXT GENERATION SEQUENCING

NGS NEXT GENERATION SEQUENCING NGS NEXT GENERATION SEQUENCING Paestum (Sa) 15-16 -17 maggio 2014 Relatore Dr Cataldo Senatore Dr.ssa Emilia Vaccaro Sanger Sequencing Reactions For given template DNA, it s like PCR except: Uses only

More information

Solutions Exercise Set 3 Author: Charmi Panchal

Solutions Exercise Set 3 Author: Charmi Panchal Solutions Exercise Set 3 Author: Charmi Panchal Problem 1: Suppose we have following fragments: f1 = ATCCTTAACCCC f2 = TTAACTCA f3 = TTAATACTCCC f4 = ATCTTTC f5 = CACTCCCACACA f6 = CACAATCCTTAACCC f7 =

More information

BMI/CS 576 Fall 2015 Midterm Exam

BMI/CS 576 Fall 2015 Midterm Exam BMI/CS 576 Fall 2015 Midterm Exam Prof. Colin Dewey Tuesday, October 27th, 2015 11:00am-12:15pm Name: KEY Write your answers on these pages and show your work. You may use the back sides of pages as necessary.

More information

PLNT4610 BIOINFORMATICS FINAL EXAMINATION

PLNT4610 BIOINFORMATICS FINAL EXAMINATION PLNT4610 BIOINFORMATICS FINAL EXAMINATION 18:00 to 20:00 Thursday December 13, 2012 Answer any combination of questions totalling to exactly 100 points. The questions on the exam sheet total to 120 points.

More information

Lecture Overview. Sequence search & alignment. Searching sequence databases. Sequence Alignment & Search. Goals: Motivations:

Lecture Overview. Sequence search & alignment. Searching sequence databases. Sequence Alignment & Search. Goals: Motivations: Lecture Overview Sequence Alignment & Search Karin Verspoor, Ph.D. Faculty, Computational Bioscience Program University of Colorado School of Medicine With credit and thanks to Larry Hunter for creating

More information

How to Run NCBI BLAST on zcluster at GACRC

How to Run NCBI BLAST on zcluster at GACRC How to Run NCBI BLAST on zcluster at GACRC BLAST: Basic Local Alignment Search Tool Georgia Advanced Computing Resource Center University of Georgia Suchitra Pakala pakala@uga.edu 1 OVERVIEW What is BLAST?

More information

Sequence Alignment & Search

Sequence Alignment & Search Sequence Alignment & Search Karin Verspoor, Ph.D. Faculty, Computational Bioscience Program University of Colorado School of Medicine With credit and thanks to Larry Hunter for creating the first version

More information

Computational Genomics and Molecular Biology, Fall

Computational Genomics and Molecular Biology, Fall Computational Genomics and Molecular Biology, Fall 2015 1 Sequence Alignment Dannie Durand Pairwise Sequence Alignment The goal of pairwise sequence alignment is to establish a correspondence between the

More information

MIRING: Minimum Information for Reporting Immunogenomic NGS Genotyping. Data Standards Hackathon for NGS HACKATHON 1.0 Bethesda, MD September

MIRING: Minimum Information for Reporting Immunogenomic NGS Genotyping. Data Standards Hackathon for NGS HACKATHON 1.0 Bethesda, MD September MIRING: Minimum Information for Reporting Immunogenomic NGS Genotyping Data Standards Hackathon for NGS HACKATHON 1.0 Bethesda, MD September 27 2014 Static Dynamic Static Minimum Information for Reporting

More information

DNA Fragment Assembly

DNA Fragment Assembly SIGCSE 009 Algorithms in Bioinformatics Sami Khuri Department of Computer Science San José State University San José, California, USA khuri@cs.sjsu.edu www.cs.sjsu.edu/faculty/khuri DNA Fragment Assembly

More information

Tutorial. Comparative Analysis of Three Bovine Genomes. Sample to Insight. November 21, 2017

Tutorial. Comparative Analysis of Three Bovine Genomes. Sample to Insight. November 21, 2017 Comparative Analysis of Three Bovine Genomes November 21, 2017 Sample to Insight QIAGEN Aarhus Silkeborgvej 2 Prismet 8000 Aarhus C Denmark Telephone: +45 70 22 32 44 www.qiagenbioinformatics.com AdvancedGenomicsSupport@qiagen.com

More information

Detecting Superbubbles in Assembly Graphs. Taku Onodera (U. Tokyo)! Kunihiko Sadakane (NII)! Tetsuo Shibuya (U. Tokyo)!

Detecting Superbubbles in Assembly Graphs. Taku Onodera (U. Tokyo)! Kunihiko Sadakane (NII)! Tetsuo Shibuya (U. Tokyo)! Detecting Superbubbles in Assembly Graphs Taku Onodera (U. Tokyo)! Kunihiko Sadakane (NII)! Tetsuo Shibuya (U. Tokyo)! de Bruijn Graph-based Assembly Reads (substrings of original DNA sequence) de Bruijn

More information

Topics of the talk. Biodatabases. Data types. Some sequence terminology...

Topics of the talk. Biodatabases. Data types. Some sequence terminology... Topics of the talk Biodatabases Jarno Tuimala / Eija Korpelainen CSC What data are stored in biological databases? What constitutes a good database? Nucleic acid sequence databases Amino acid sequence

More information

Biostatistics and Bioinformatics Molecular Sequence Databases

Biostatistics and Bioinformatics Molecular Sequence Databases . 1 Description of Module Subject Name Paper Name Module Name/Title 13 03 Dr. Vijaya Khader Dr. MC Varadaraj 2 1. Objectives: In the present module, the students will learn about 1. Encoding linear sequences

More information

DELAMANID SUSCEPTIBILITY TESTING IN AN AUTOMATED LIQUID CULTURE SYSTEM

DELAMANID SUSCEPTIBILITY TESTING IN AN AUTOMATED LIQUID CULTURE SYSTEM DELAMANID SUSCEPTIBILITY TESTING IN AN AUTOMATED LIQUID CULTURE SYSTEM Daniela Maria Cirillo San Raffaele Scientific Institute, Milan COI/CA OSR as signed MTA with Janssen and Otzuka as SRL and is involved

More information

Network Based Models For Analysis of SNPs Yalta Opt

Network Based Models For Analysis of SNPs Yalta Opt Outline Network Based Models For Analysis of Yalta Optimization Conference 2010 Network Science Zeynep Ertem*, Sergiy Butenko*, Clare Gill** *Department of Industrial and Systems Engineering, **Department

More information

Multiple Sequence Alignment. With thanks to Eric Stone and Steffen Heber, North Carolina State University

Multiple Sequence Alignment. With thanks to Eric Stone and Steffen Heber, North Carolina State University Multiple Sequence Alignment With thanks to Eric Stone and Steffen Heber, North Carolina State University Definition: Multiple sequence alignment Given a set of sequences, a multiple sequence alignment

More information

OFFICE OF RESEARCH AND SPONSORED PROGRAMS

OFFICE OF RESEARCH AND SPONSORED PROGRAMS OFFICE OF RESEARCH AND SPONSORED PROGRAMS June 9, 2016 Mr. Satoshi Harada Department of Innovation Research Japan Science and Technology Agency (JST) K s Gobancho, 7, Gobancho, Chiyoda-ku, Tokyo, 102-0076

More information

The software comes with 2 installers: (1) SureCall installer (2) GenAligners (contains BWA, BWA- MEM).

The software comes with 2 installers: (1) SureCall installer (2) GenAligners (contains BWA, BWA- MEM). Release Notes Agilent SureCall 4.0 Product Number G4980AA SureCall Client 6-month named license supports installation of one client and server (to host the SureCall database) on one machine. For additional

More information

Step-by-Step Guide to Basic Genetic Analysis

Step-by-Step Guide to Basic Genetic Analysis Step-by-Step Guide to Basic Genetic Analysis Page 1 Introduction This document shows you how to clean up your genetic data, assess its statistical properties and perform simple analyses such as case-control

More information

Global Alignment. Algorithms in BioInformatics Mandatory Project 1 Magnus Erik Hvass Pedersen (971055) Daimi, University of Aarhus September 2004

Global Alignment. Algorithms in BioInformatics Mandatory Project 1 Magnus Erik Hvass Pedersen (971055) Daimi, University of Aarhus September 2004 1 Introduction Global Alignment Algorithms in BioInformatics Mandatory Project 1 Magnus Erik Hvass Pedersen (971055) Daimi, University of Aarhus September 2004 The purpose of this report is to verify attendance

More information

Minimum Information for Reporting Immunogenomic NGS Genotyping (MIRING)

Minimum Information for Reporting Immunogenomic NGS Genotyping (MIRING) Minimum Information for Reporting Immunogenomic NGS Genotyping (MIRING) Reporting guideline statement for HLA and KIR genotyping data generated via Next Generation Sequencing (NGS) technologies and analysis

More information

USER S MANUAL FOR THE AMaCAID PROGRAM

USER S MANUAL FOR THE AMaCAID PROGRAM USER S MANUAL FOR THE AMaCAID PROGRAM TABLE OF CONTENTS Introduction How to download and install R Folder Data The three AMaCAID models - Model 1 - Model 2 - Model 3 - Processing times Changing directory

More information

CAP BIOINFORMATICS Su-Shing Chen CISE. 8/19/2005 Su-Shing Chen, CISE 1

CAP BIOINFORMATICS Su-Shing Chen CISE. 8/19/2005 Su-Shing Chen, CISE 1 CAP 5510-2 BIOINFORMATICS Su-Shing Chen CISE 8/19/2005 Su-Shing Chen, CISE 1 Building Local Genomic Databases Genomic research integrates sequence data with gene function knowledge. Gene ontology to represent

More information

Computational Molecular Biology

Computational Molecular Biology Computational Molecular Biology Erwin M. Bakker Lecture 3, mainly from material by R. Shamir [2] and H.J. Hoogeboom [4]. 1 Pairwise Sequence Alignment Biological Motivation Algorithmic Aspect Recursive

More information

Sequence Alignment 1

Sequence Alignment 1 Sequence Alignment 1 Nucleotide and Base Pairs Purine: A and G Pyrimidine: T and C 2 DNA 3 For this course DNA is double-helical, with two complementary strands. Complementary bases: Adenine (A) - Thymine

More information

Genetic Algorithms. Kang Zheng Karl Schober

Genetic Algorithms. Kang Zheng Karl Schober Genetic Algorithms Kang Zheng Karl Schober Genetic algorithm What is Genetic algorithm? A genetic algorithm (or GA) is a search technique used in computing to find true or approximate solutions to optimization

More information

Supplemental Information

Supplemental Information Supplemental Information Title: Generation of clonal zebrafish line by androgenesis without egg irradiation Jilun Hou a,, Takafumi Fujimoto b, *, Taiju Saito c, Etsuro Yamaha d, Katsutoshi Arai b 1 Supplemental

More information

Algorithms and Data Structures

Algorithms and Data Structures Algorithms and Data Structures Sorting beyond Value Comparisons Marius Kloft Content of this Lecture Radix Exchange Sort Sorting bitstrings in linear time (almost) Bucket Sort Marius Kloft: Alg&DS, Summer

More information

BGGN 213 Foundations of Bioinformatics Barry Grant

BGGN 213 Foundations of Bioinformatics Barry Grant BGGN 213 Foundations of Bioinformatics Barry Grant http://thegrantlab.org/bggn213 Recap From Last Time: 25 Responses: https://tinyurl.com/bggn213-02-f17 Why ALIGNMENT FOUNDATIONS Why compare biological

More information

PLNT4610 BIOINFORMATICS FINAL EXAMINATION

PLNT4610 BIOINFORMATICS FINAL EXAMINATION 9:00 to 11:00 Friday December 6, 2013 PLNT4610 BIOINFORMATICS FINAL EXAMINATION Answer any combination of questions totalling to exactly 100 points. The questions on the exam sheet total to 120 points.

More information

Alignment of Pairs of Sequences

Alignment of Pairs of Sequences Bi03a_1 Unit 03a: Alignment of Pairs of Sequences Partners for alignment Bi03a_2 Protein 1 Protein 2 =amino-acid sequences (20 letter alphabeth + gap) LGPSSKQTGKGS-SRIWDN LN-ITKSAGKGAIMRLGDA -------TGKG--------

More information

CLC Server. End User USER MANUAL

CLC Server. End User USER MANUAL CLC Server End User USER MANUAL Manual for CLC Server 10.0.1 Windows, macos and Linux March 8, 2018 This software is for research purposes only. QIAGEN Aarhus Silkeborgvej 2 Prismet DK-8000 Aarhus C Denmark

More information

Created by Damian Goodridge Page 1 of 38 Created on 12/10/2004 2:08 PM. User Guide. Assign-SBT TM 3.2.7

Created by Damian Goodridge Page 1 of 38 Created on 12/10/2004 2:08 PM. User Guide. Assign-SBT TM 3.2.7 Created by Damian Goodridge Page 1 of 38 User Guide Assign-SBT TM 3.2.7 Created by Damian Goodridge Page 2 of 38 1 Introduction... 5 1.1 Overview... 5 1.2 Unique Features... 5 1.3 Summary of Functions...

More information

(DNA#): Molecular Biology Computation Language Proposal

(DNA#): Molecular Biology Computation Language Proposal (DNA#): Molecular Biology Computation Language Proposal Aalhad Patankar, Min Fan, Nan Yu, Oriana Fuentes, Stan Peceny {ap3536, mf3084, ny2263, oif2102, skp2140} @columbia.edu Motivation Inspired by the

More information

Sequencing. Short Read Alignment. Sequencing. Paired-End Sequencing 6/10/2010. Tobias Rausch 7 th June 2010 WGS. ChIP-Seq. Applied Biosystems.

Sequencing. Short Read Alignment. Sequencing. Paired-End Sequencing 6/10/2010. Tobias Rausch 7 th June 2010 WGS. ChIP-Seq. Applied Biosystems. Sequencing Short Alignment Tobias Rausch 7 th June 2010 WGS RNA-Seq Exon Capture ChIP-Seq Sequencing Paired-End Sequencing Target genome Fragments Roche GS FLX Titanium Illumina Applied Biosystems SOLiD

More information

A Grid Portal Implementation for Genetic Mapping of Multiple QTL

A Grid Portal Implementation for Genetic Mapping of Multiple QTL A Grid Portal Implementation for Genetic Mapping of Multiple QTL Salman Toor 1, Mahen Jayawardena 1,2, Jonas Lindemann 3 and Sverker Holmgren 1 1 Division of Scientific Computing, Department of Information

More information

Creating a custom mappings similarity matrix

Creating a custom mappings similarity matrix BioNumerics Tutorial: Creating a custom mappings similarity matrix 1 Aim In BioNumerics, character values can be mapped to categorical names according to predefined criteria (see tutorial Importing non-numerical

More information

The Human PAX6 Mutation Database

The Human PAX6 Mutation Database 1998 Oxford University Press Nucleic Acids Research, 1998, Vol. 26, No. 1 259 264 The Human PAX6 Mutation Database Alastair Brown*, Mark McKie, Veronica van Heyningen and Jane Prosser Medical Research

More information

Problem statement. CS267 Assignment 3: Parallelize Graph Algorithms for de Novo Genome Assembly. Spring Example.

Problem statement. CS267 Assignment 3: Parallelize Graph Algorithms for de Novo Genome Assembly. Spring Example. CS267 Assignment 3: Problem statement 2 Parallelize Graph Algorithms for de Novo Genome Assembly k-mers are sequences of length k (alphabet is A/C/G/T). An extension is a simple symbol (A/C/G/T/F). The

More information

Finding Selection in All the Right Places TA Notes and Key Lab 9

Finding Selection in All the Right Places TA Notes and Key Lab 9 Objectives: Finding Selection in All the Right Places TA Notes and Key Lab 9 1. Use published genome data to look for evidence of selection in individual genes. 2. Understand the need for DNA sequence

More information