Package GEOmetadb. October 4, 2013

Similar documents
Package SRAdb. April 5, 2018

Package SRAdb. October 12, 2016

Package GEOquery. R topics documented: April 11, Type Package

Package DupChecker. April 11, 2018

Using the SRAdb Package to Query the Sequence Read Archive

Bayesian Pathway Analysis (BPA) Tutorial

Package Risa. November 28, 2017

Drug versus Disease (DrugVsDisease) package

Package Onassis. March 21, 2019

Department of Computer Science, UTSA Technical Report: CS TR

Useful software utilities for computational genomics. Shamith Samarajiwa CRUK Autumn School in Bioinformatics September 2017

Import GEO Experiment into Partek Genomics Suite

Package kirby21.base

Package pdinfobuilder

Package KEGGprofile. February 10, 2018

Package sqliter. August 29, 2016

Package demi. February 19, 2015

Package PGSEA. R topics documented: May 4, Type Package Title Parametric Gene Set Enrichment Analysis Version 1.54.

Package rdryad. June 18, 2018

Package TilePlot. April 8, 2011

Package Onassis. April 16, 2019

Package GSCA. October 12, 2016

Package GSCA. April 12, 2018

Package ChIPXpress. October 4, 2013

Package ArrayExpress

Package HomoVert. November 10, 2010

Package MotifDb. R topics documented: October 4, Type Package. Title An Annotated Collection of Protein-DNA Binding Sequence Motifs

Package poplite. R topics documented: October 12, Version Date

Package AnnotationForge

Package TilePlot. February 15, 2013

Package condusco. November 8, 2017

Textual Description of webbioc

HOWTO generate biocviews HTML

Package AnnotationHub

Package affyio. January 18, Index 6. CDF file format function

Package cregulome. September 13, 2018

7. Working with Big Data

Bioconductor: Annotation Package Overview

7. Working with Big Data

Min Wang. April, 2003

Package fastqcr. April 11, 2017

Package AffyExpress. October 3, 2013

Package GREP2. July 25, 2018

Package scraep. July 3, Index 6

Package BgeeDB. January 5, 2019

How to store and visualize RNA-seq data

Package STROMA4. October 22, 2018

Package RefNet. R topics documented: April 16, Type Package

org.hs.ipi.db November 7, 2017 annotation data package

reactome.db September 23, 2018 Bioconductor annotation data package

Package mgsa. January 13, 2019

Towards an Optimized Illumina Microarray Data Analysis Pipeline

User guide for GEM-TREND

ChIPXpress: enhanced ChIP-seq and ChIP-chip target gene identification using publicly available gene expression data

Package OmaDB. R topics documented: December 19, Title R wrapper for the OMA REST API Version Author Klara Kaleb

Package DataClean. August 29, 2016

R version has been released on (Linux source code versions)

Package RLMM. March 7, 2019

Package corenlp. June 3, 2015

Package DAVIDQuery. R topics documented: October 4, Type Package. Title Retrieval from the DAVID bioinformatics data resource into R

Computer Exercise - Microarray Analysis using Bioconductor

Package MotifDb. R topics documented: February 8, Type Package

Using metama for differential gene expression analysis from multiple studies

Package clusterrepro

Package mygene. January 30, 2018

Core Technology Development Team Meeting

Package ChIPseeker. July 12, 2018

hom.dm.inp.db July 21, 2010 Bioconductor annotation data package

Package imgur. R topics documented: December 20, Type Package. Title Share plots using the imgur.com image hosting service. Version 0.1.

Package scraep. November 15, Index 6

Package FunciSNP. November 16, 2018

SAS 9.2 Foundation Services. Administrator s Guide

Package MiRaGE. October 16, 2018

Package httpcache. October 17, 2017

Package phantasus. March 8, 2019

BRB-ArrayTools Data Archive for Human Cancer Gene Expression: A Unique and Efficient Data Sharing Resource

Package RTCGAToolbox

Package PSICQUIC. January 18, 2018

Package SCAN.UPC. July 17, 2018

Lecture 8. Sequence alignments

Package censusapi. August 19, 2018

Package omu. August 2, 2018

pyensembl Documentation

Package GenomicDataCommons

Robert Gentleman! Copyright 2011, all rights reserved!

Package SCAN.UPC. October 9, Type Package. Title Single-channel array normalization (SCAN) and University Probability of expression Codes (UPC)

Package annotationtools

Package snplist. December 11, 2017

Creating Connection With Hive. Version: 16.0

Package mirnatap. February 1, 2018

Tutorial:OverRepresentation - OpenTutorials

SAS 9.4 Foundation Services: Administrator s Guide

An introduction to Genomic Data Structures

MiChip. Jonathon Blake. October 30, Introduction 1. 5 Plotting Functions 3. 6 Normalization 3. 7 Writing Output Files 3

Package gsalib. R topics documented: February 20, Type Package. Title Utility Functions For GATK. Version 2.1.

Package DFP. February 2, 2018

This assignment requires that you complete the following tasks (in no particular order).

Package HPO.db. R topics documented: February 15, Type Package

Package LOLA. December 24, 2018

Package ctv. October 8, 2017

Transcription:

Package GEOmetadb October 4, 2013 Type Package Title A compilation of metadata from NCBI GEO Version 1.20.0 Date 2011-11-28 Depends GEOquery,RSQLite Author Jack Zhu and Sean Davis Maintainer Jack Zhu <zhujack@mail.nih.gov> biocviews Infrastructure The NCBI Gene Expression Omnibus (GEO) represents the largest public repository of microarray data. However, finding data of interest can be challenging using current tools. GEOmetadb is an attempt to make access to the metadata associated with samples, platforms, and datasets much more feasible. This is accomplished by parsing all the NCBI GEO metadata into a SQLite database that can be stored and queried locally. GEOmetadb is simply a thin wrapper around the SQLite database along with associated documentation. Finally, the SQLite database is updated regularly as new data is added to GEO and can be downloaded at will for the most up-to-date metadata. GEOmetadb paper: http://bioinformatics.oxfordjournals.org/cgi/content/short/24/23/2798. URL http://gbnci.abcc.ncifcrf.gov/geo/ License Artistic-2.0 R topics documented: GEOmetadb-package.................................... 2 columns..................................... 3 geoconvert......................................... 4 getbiocplatformmap.................................... 5 getsqlitefile........................................ 6 Index 7 1

2 GEOmetadb-package GEOmetadb-package Query NCBI GEO metadata from a local SQLite database Details The NCBI Gene Expression Omnibus (GEO) represents the largest public repository of microarray data. However, finding data of interest can be challenging using current tools. GEOmetadb is an attempt to make access to the metadata associated with samples, platforms, and datasets much more feasible. This is accomplished by parsing all the NCBI GEO metadata into a SQLite database that can be stored and queried locally. GEOmetadb is simply a thin wrapper around the SQLite database along with associated documentation. Finally, the SQLite database is updated regularly as new data is added to GEO and can be downloaded at will for the most up-to-date metadata. Package: GEOmetadb Type: Package Version: 1.1.5 Date: 2008-09-09 License: Artistic-2.0 Jack Zhu and Sean Davis Maintainer: Jack Zhu <zhujack@mail.nih.gov> http://meltzerlab.nci.nih.gov/apps/geo, http://gbnci.abcc.ncifcrf.gov/geo/ if(file.exists( GEOmetadb.sqlite )) { a <- columns()[1:5,] b <- geoconvert( GPL97, GSM ) else { print("use getsqlitefile() to get a copy of the GEOmetadb SQLite file and then rerun the example")

columns 3 columns Get column descriptions for the GEOmetadb database Searching the GEOmetadb database requires a bit of knowledge about the structure of the database and column descriptions. This function returns those column descriptions for all columns in all tables in the database. columns(sqlite_db_name= GEOmetadb.sqlite ) sqlite_db_name The filename of the GEOmetadb sqlite database file A three-column data.frame including TableName, FieldName, and. Sean Davis <sdavis2@mail.nih.gov> http://meltzerlab.nci.nih.gov/apps/geo if(file.exists( GEOmetadb.sqlite )) { columns()[1:5,] else { print("you will need to usethe getsqlitefile() function to get a copy of the SQLite database file before this example will work")

4 geoconvert geoconvert Cross-reference between GEO data types A common task is to find all the GEO entities of one type associated with another GEO entity (eg., find all GEO samples associated with GEO platform GPL96 ). This function provides a very fast mapping between entity types to facilitate queries of this type. geoconvert(in_list, out_type = c("gse", "gpl", "gsm", "gds", "smatrix"), sqlite_db_name = "GEOmetadb.sq in_list Character vector of GEO entities to convert from. out_type Character vector of GEO entity types to which to convert. sqlite_db_name The filename of the GEOmetadb sqlite database file A list of data.frames. Jack Zhu <zhujack@mail.nih.gov> http://meltzerlab.nci.nih.gov/apps/geo, http://gbnci.abcc.ncifcrf.gov/geo/ if(file.exists("geometadb.sqlite")) { geoconvert( GPL96,out_type= GSM ) else { print("run getsqlitefile() to get a copy of the GEOmetadb SQLite file and then rerun the example")

getbiocplatformmap 5 getbiocplatformmap Get mappings between GPL and Bioconductor microarry annotation packages Query the gpl table and get GPL information of a given list of Bioconductor microarry annotation packages. Note currently the GEOmetadb does not contains all the mappings, but we are trying to construct a relative complete list. getbiocplatformmap(con, bioc= all ) con bioc Connection to the GEOmetadb.sqlite database Character vector of Biocondoctor microarry annotation packages, e.g. c( hgu133plus2, hgu95av2 ). all returns all mappings. A six-column data.frame including GPL title, GPL accession, bioc_package, manufacturer, organism, data_row_count. Jack Zhu <zhujack@mail.nih.gov>, Sean Davis <sdavis2@mail.nih.gov> http://meltzerlab.nci.nih.gov/apps/geo if(file.exists( GEOmetadb.sqlite )) { con <- dbconnect(sqlite(), "GEOmetadb.sqlite") getbiocplatformmap(con)[1:5,] getbiocplatformmap(con, bioc=c( hgu133a, hgu95av2 )) dbdisconnect(con) else { print("you will need to usethe getsqlitefile() function to get a copy of the SQLite database file before this example will work")

6 getsqlitefile getsqlitefile Download and unzip the most recent GEOmetadb SQLite file This function is the standard method for downloading and unzipping the most recent GEOmetadb SQLite file from the server. getsqlitefile(destdir = getwd(), destfile = "GEOmetadb.sqlite.gz") destdir destfile The destination directory of the downloaded file The filename of the downloaded file. This filename should end in ".gz" as the unzipping assumes that is the case Prints some diagnostic information to the screen. Returns the local filename for use later. Sean Davis <sdavis2@mail.nih.gov> http://meltzerlab.nci.nih.gov/apps/geo, http://gbnci.abcc.ncifcrf.gov/geo/ ## Not run: geometadbfile <- getsqlitefile()

Index Topic IO geoconvert, 4 getsqlitefile, 6 Topic database columns, 3 geoconvert, 4 getbiocplatformmap, 5 getsqlitefile, 6 Topic package GEOmetadb-package, 2 columns, 3 geoconvert, 4 GEOmetadb (GEOmetadb-package), 2 GEOmetadb-package, 2 getbiocplatformmap, 5 getsqlitefile, 6 7