Package rdryad. June 18, 2018

Type Package
Title Access for Dryad Web Services
Description Interface to the Dryad "Solr" API, their "OAI-PMH" service, and fetch datasets. Dryad (<http://datadryad.org/>) is a curated host of data underlying scientific publications.
Version 0.4.0
License MIT + file LICENSE
URL https://github.com/ropensci/rdryad
BugReports https://github.com/ropensci/rdryad/issues
Imports crul (>= 0.4.0), curl (>= 3.0), xml2 (>= 1.0.0), oai (>= 0.2.2), solrium (>= 1.0.0), data.table, tibble
Suggests testthat
RoxygenNote 6.0.1
NeedsCompilation no
Author Scott Chamberlain [aut, cre] (<https://orcid.org/0000-0003-1444-9135>), Carl Boettiger [aut] (<https://orcid.org/0000-0002-1642-628X>), Karthik Ram [ctb]
Maintainer Scott Chamberlain <myrmecocystus@gmail.com>
Repository CRAN
Date/Publication 2018-06-18 16:44:52 UTC

R topics documented:

    rdryad-package
    doi2handle
    dryad_fetch
    dryad_files
    dryad_metadata
    dryad_package_dois
    dr_get_records
    dr_identify
    dr_list_identifiers
    dr_list_metadata_formats
    dr_list_records
    dr_list_sets
    d_solr_search


rdryad-package    Interface to the Dryad Web services

Description

    Includes access to Dryad's Solr API, OAI-PMH service, and part of their REST API.

Package API

    The following functions work with the Dryad Solr service

        d_solr_facet()
        d_solr_group()
        d_solr_highlight()
        d_solr_mlt()
        d_solr_search()
        d_solr_stats()

    The following functions work with the Dryad OAI-PMH service

        dr_get_records()
        dr_identify()
        dr_list_identifiers()
        dr_list_metadata_formats()
        dr_list_records()
        dr_list_sets()

    The following functions sort out file URLs and help you download those files

        dryad_fetch()
        dryad_files()
        dryad_metadata()
        dryad_package_dois()

    These functions convert between Dryad handles and DOIs

        handle2doi()
        doi2handle()

Author(s)

    Scott Chamberlain <myrmecocystus@gmail.com>


doi2handle    Get a Dryad DOI from a handle, and vice versa

Description

    Get a Dryad DOI from a handle, and vice versa

Usage

    doi2handle(x, ...)
    handle2doi(x, ...)

Arguments

    x      (character) A Dryad dataset DOI or handle. required
    ...    Curl options, passed on to crul::HttpClient

Value

    (character) a DOI or handle

Examples

    doi2handle('10.5061/dryad.c0765')
    handle2doi('10255/dryad.153920')


dryad_fetch    Download Dryad files

Description

    Download Dryad files

Usage

    dryad_fetch(url, destfile = NULL, try_file_names = FALSE, ...)

Arguments

    url               (character) One or more Dryad URLs for a dataset
    destfile          (character) Destination file. If not given, we assign a file name based on the URL provided.
    try_file_names    (logical) Try to parse file names out of the URLs. Default: FALSE
    ...               Further args passed on to curl::curl_download()

Details

    This function is a thin wrapper around curl::curl_download() that only gets files onto your machine. We don't attempt to read or parse them.

Value

    named (list) with path(s) to the file(s) - list names are the URLs passed into the url parameter

Examples

    # Single file
    x <- dryad_files('10.5061/dryad.1758')
    ## without specifying a destination file
    dryad_fetch(url = x)
    ## specify a destination file
    dryad_fetch(url = x[1], (f <- tempfile(fileext = ".csv")))
    ## use try_file_names - we try to extract file names from URLs
    dryad_fetch(url = x, try_file_names = TRUE)

    # Many files
    x <- dryad_files(doi = '10.5061/dryad.60699')
    res <- dryad_fetch(x)
    head(read.delim(res[[1]], sep = ";"))


dryad_files    Get a URL given a Dryad DOI

Description

    To get a DOI from a Dryad Handle, use handle2doi()

Usage

    dryad_files(doi, ...)

Arguments

    doi    (character) A Dryad dataset DOI, of the form 10.5061/dryad.xxx. required
    ...    Curl options, passed on to crul::HttpClient

Value

    (character) One or more URLs for direct download of datasets for the given Dryad DOI

Examples

    dryad_files(doi = '10.5061/dryad.1758')
    dryad_files(doi = '10.5061/dryad.60699')

    # if you have a handle, use handle2doi() to convert to a DOI
    (doi <- handle2doi('10255/dryad.153920'))
    (files <- dryad_files(doi))
    (out <- dryad_fetch(files))
    # file sizes in MB
    vapply(out, function(x) file.info(x)[["size"]], 1) / 10^6


dryad_metadata    Download Dryad file metadata

Description

    Download Dryad file metadata

Usage

    dryad_metadata(doi, ...)

Arguments

    doi    (character) A Dryad DOI for a dataset or for files within a dataset
    ...    Further args passed on to crul::HttpClient

Value

    named (list) with slots for:

        desc: object metadata
        files: file information
        attributes: metadata about the metadata file
        structmap: not sure what this is

Examples

    dryad_metadata('10.5061/dryad.1758')
    dryad_metadata('10.5061/dryad.9t0n8/1')
    dryad_metadata('10.5061/dryad.60699/3')
    out <- dryad_metadata('10.5061/dryad.60699/5')
    out$desc$text[out$desc$qualifier %in% c("pageviews", "downloads")]


dryad_package_dois    Get file DOIs for a Dryad package DOI

Description

    Get file DOIs for a Dryad package DOI

Usage

    dryad_package_dois(doi, ...)

Arguments

    doi    (character) A Dryad package DOI. required
    ...    Further args passed on to crul::HttpClient

Value

    (character) zero or more DOIs for the files; if no results a zero length character vector

Examples

    dryad_package_dois('10.5061/dryad.1758')
    dryad_package_dois('10.5061/dryad.9t0n8')
    dryad_package_dois('10.5061/dryad.60699')
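The examples above use both package DOIs (10.5061/dryad.60699) and file DOIs (10.5061/dryad.60699/3). The following is a minimal offline sketch for telling the two apart. It assumes, based only on those examples, that a file DOI is a package DOI followed by a slash and an integer. The helper name split_file_doi is hypothetical and is not part of rdryad.

```r
# Hypothetical helper (not part of rdryad): split Dryad DOIs into the package
# DOI and, where present, the trailing file number.
# Assumption: a file DOI has the form "<package DOI>/<integer>".
split_file_doi <- function(doi) {
  is_file <- grepl("^10\\.5061/dryad\\.[^/]+/[0-9]+$", doi)

  # Package DOI: strip the "/<integer>" suffix from file DOIs only.
  package <- doi
  package[is_file] <- sub("/[0-9]+$", "", doi[is_file])

  # File number: NA for DOIs that are not file DOIs.
  file_no <- rep(NA_integer_, length(doi))
  file_no[is_file] <- as.integer(sub("^.*/", "", doi[is_file]))

  data.frame(doi = doi, package = package, file_no = file_no,
             stringsAsFactors = FALSE)
}

res <- split_file_doi(c("10.5061/dryad.60699/3", "10.5061/dryad.1758"))
res
```

The package column could then be passed to dryad_package_dois(), and the original doi column to dryad_metadata(), although both of those calls need network access.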

dr_get_records    Download metadata for individual Dryad id's

Description

    Download metadata for individual Dryad id's

Usage

    dr_get_records(ids, prefix = "oai_dc", as = "df", ...)

Arguments

    ids       Dryad identifier, e.g. oai:datadryad.org:10255/dryad.8820
    prefix    A character string to specify the metadata format in OAI-PMH requests issued to the repository. The default ("oai_dc") corresponds to the mandatory OAI unqualified Dublin Core metadata schema.
    as        (character) What to return. One of "df" (for data.frame; default), "list", or "raw" (raw text)
    ...       Curl debugging options passed on to httr::GET

Value

    XML character string, data.frame, or list, depending on what was requested with the as parameter

Examples

    dr_get_records(ids = 'oai:datadryad.org:10255/dryad.8820')
    handles <- c('10255/dryad.36217', '10255/dryad.86943',
      '10255/dryad.84720', '10255/dryad.34100')
    ids <- paste0('oai:datadryad.org:', handles)
    dr_get_records(ids)


dr_identify    Learn about the Dryad OAI-PMH service.

Description

    Learn about the Dryad OAI-PMH service.

Usage

    dr_identify(...)

Arguments

    ...    Curl debugging options passed on to httr::GET

Value

    List of information describing Dryad.

Examples

    dr_identify()


dr_list_identifiers    Gets OAI Dryad identifiers

Description

    Gets OAI Dryad identifiers

Usage

    dr_list_identifiers(prefix = "oai_dc", from = NULL, until = NULL,
      set = "hdl_10255_3", token = NULL, as = "df", ...)

Arguments

    prefix    A character string to specify the metadata format in OAI-PMH requests issued to the repository. The default ("oai_dc") corresponds to the mandatory OAI unqualified Dublin Core metadata schema.
    from      Character string giving a datestamp to be used as a lower bound for datestamp-based selective harvesting (i.e., only harvest records with datestamps in the given range). Dates and times must be encoded using ISO 8601. The trailing Z must be used when including time. OAI-PMH implies UTC for date/time specifications.
    until     Character string giving a datestamp to be used as an upper bound for datestamp-based selective harvesting (i.e., only harvest records with datestamps in the given range).
    set       A character string giving a set to be used for selective harvesting (i.e., only harvest records in the given set).
    token     (character) A token previously provided by the server to resume a request where it last left off. At most 50 records are returned per request. We will loop for you internally to get all the records you asked for.
    as        (character) What to return. One of "df" (for data.frame; default), "list", or "raw" (raw text)
    ...       Curl debugging options passed on to httr::GET

Value

    XML character string, data.frame, or list, depending on what was requested with the as parameter

    List of OAI identifiers for each dataset.

Examples

    dr_list_identifiers(from = '2010-01-01', until = "2010-06-30")
    dr_list_identifiers(prefix = "mets", from = '2015-09-01', until = '2015-09-20')
    identifiers <- dr_list_identifiers('rdf')
    # Data packages
    identifiers[[1]]
    # Data files
    identifiers[[2]]


dr_list_metadata_formats    Get available Dryad metadata formats

Description

    Get available Dryad metadata formats

Usage

    dr_list_metadata_formats(...)

Arguments

    ...    Curl debugging options passed on to httr::GET

Value

    List of information on metadata formats.

Examples

    dr_list_metadata_formats()
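The from and until arguments of dr_list_identifiers() expect ISO 8601 datestamps in UTC, with a trailing Z whenever a time is included. The following is a minimal base R sketch of producing such strings from Date and POSIXct values. The helper name oai_datestamp is hypothetical and not part of rdryad or oai, and whether the Dryad server accepts second-level granularity is not stated in this manual.

```r
# Hypothetical helper (not part of rdryad): format a date or date-time as an
# ISO 8601 datestamp in UTC, as described for the from/until arguments.
oai_datestamp <- function(x) {
  if (inherits(x, "Date")) {
    # Day granularity: plain YYYY-MM-DD, no time part and no trailing "Z".
    return(format(x, "%Y-%m-%d"))
  }
  # Second granularity: express the instant in UTC and append the "Z".
  format(as.POSIXct(x), "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
}

oai_datestamp(as.Date("2010-01-01"))
# "2010-01-01"
oai_datestamp(as.POSIXct("2015-09-01 12:30:00", tz = "UTC"))
# "2015-09-01T12:30:00Z"
```

The resulting strings could be supplied as from = oai_datestamp(...) and until = oai_datestamp(...). A POSIXct value in another time zone is converted to UTC before formatting.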

dr_list_records    List Dryad records

Description

    List Dryad records

Usage

    dr_list_records(prefix = "oai_dc", from = NULL, until = NULL,
      set = "hdl_10255_3", token = NULL, as = "df", ...)

Arguments

    prefix    A character string to specify the metadata format in OAI-PMH requests issued to the repository. The default ("oai_dc") corresponds to the mandatory OAI unqualified Dublin Core metadata schema.
    from      Character string giving a datestamp to be used as a lower bound for datestamp-based selective harvesting (i.e., only harvest records with datestamps in the given range). Dates and times must be encoded using ISO 8601. The trailing Z must be used when including time. OAI-PMH implies UTC for date/time specifications.
    until     Character string giving a datestamp to be used as an upper bound for datestamp-based selective harvesting (i.e., only harvest records with datestamps in the given range).
    set       A character string giving a set to be used for selective harvesting (i.e., only harvest records in the given set).
    token     (character) A token previously provided by the server to resume a request where it last left off. At most 50 records are returned per request. We will loop for you internally to get all the records you asked for.
    as        (character) What to return. One of "df" (for data.frame; default), "list", or "raw" (raw text)
    ...       Curl debugging options passed on to httr::GET

Value

    XML character string, data.frame, or list, depending on what was requested with the as parameter

Examples

    dr_list_records(from = '2016-01-01', until = '2016-09-10')
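The token argument above describes a server that hands back a limited number of records per request plus a token for resuming, with the package looping internally. The following offline sketch shows what such a loop might look like in general. It is not rdryad's actual implementation: mock_list_records and harvest_all are hypothetical names, the mock stands in for the server, and the integer-valued token is an invention for illustration.

```r
# Mock of a paged endpoint (hypothetical, runs offline): returns up to
# `page_size` ids plus a resumption token, or a NULL token when done.
mock_list_records <- function(token = NULL,
                              all_ids = sprintf("id-%03d", 1:120),
                              page_size = 50) {
  start <- if (is.null(token)) 1L else as.integer(token)
  end <- min(start + page_size - 1L, length(all_ids))
  next_token <- if (end < length(all_ids)) as.character(end + 1L) else NULL
  list(ids = all_ids[start:end], token = next_token)
}

# Keep requesting pages until the server stops handing back a token.
harvest_all <- function(fetch_page) {
  out <- character(0)
  token <- NULL
  repeat {
    page <- fetch_page(token)
    out <- c(out, page$ids)
    token <- page$token
    if (is.null(token)) break
  }
  out
}

ids <- harvest_all(mock_list_records)
length(ids)
# 120
```

With 120 mock records and a page size of 50, the loop makes three requests (50, 50, and 20 records) and stops when the third response carries no token.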

dr_list_sets    List the sets in the Dryad metadata repository.

Description

    Retrieve the set structure of Dryad, useful for selective harvesting

Usage

    dr_list_sets(token = NULL, as = "df", ...)

Arguments

    token    (character) A token previously provided by the server to resume a request where it last left off. At most 50 records are returned per request. We will loop for you internally to get all the records you asked for.
    as       (character) What to return. One of "df" (for data.frame; default), "list", or "raw" (raw text)
    ...      Curl debugging options passed on to httr::GET

Examples

    dr_list_sets()
    dr_list_sets(as = "list")
    dr_list_sets(as = "raw")


d_solr_search    Search the Dryad Solr endpoint.

Description

    Search the Dryad Solr endpoint.

Usage

    d_solr_search(..., proxy = NULL, callopts = list())
    d_solr_facet(..., proxy = NULL, callopts = list())
    d_solr_group(..., proxy = NULL, callopts = list())
    d_solr_highlight(..., proxy = NULL, callopts = list())

    d_solr_mlt(..., proxy = NULL, callopts = list())
    d_solr_stats(..., proxy = NULL, callopts = list())

Arguments

    ...         Solr parameters passed on to the respective solrium package function.
    proxy       List of arguments for a proxy connection, including one or more of: url, user, pwd, and auth. See crul::proxy for help, which is used to construct the proxy connection.
    callopts    Further args passed on to crul::HttpClient

Details

    See the solrium package documentation for available parameters. For each of d_solr_search, d_solr_facet, d_solr_stats, d_solr_mlt, d_solr_group, and d_solr_highlight, see the equivalently named function in solrium.

    The wt parameter is now hard-coded to xml because a recent change in the Dryad Solr infrastructure makes it impossible to get JSON output - this shouldn't affect most users. In addition, we hard-code a curl option to follow redirects, just so you're aware.

Examples

    # Basic search
    d_solr_search(q="galliard")

    # Basic search, restricting to certain fields
    d_solr_search(q="galliard", fl=c('handle', 'dc.title_sort'))

    # Search all text for a string, but limit results to two specified fields:
    d_solr_search(q="dwc.ScientificName:drosophila", fl='handle,dc.title_sort')

    # Dryad data based on an article DOI:
    d_solr_search(q="dc.relation.isreferencedby:10.1038/nature04863",
      fl="dc.identifier,dc.title_ac")

    # All terms in the dc.subject facet, along with their frequencies:
    d_solr_facet(q="location:l2", facet.field="dc.subject_filter",
      facet.mincount=1, facet.limit=10)

    # Article DOIs associated with all data published in Dryad over the past 90 days:
    d_solr_search(q="dc.date.available_dt:[NOW-90DAY/DAY TO NOW]",
      fl="dc.relation.isreferencedby", rows=10)

    # Data DOIs published in Dryad during January 2011
    query <- "location:l2 dc.date.available_dt:[2011-01-01T00:00:00Z TO 2011-01-31T23:59:59Z]"
    d_solr_search(q=query, fl="dc.identifier", rows=10)

    # Highlight
    d_solr_highlight(q="bird", hl.fl="dc.description")

    # More like this
    d_solr_mlt(q="bird", mlt.count=10, mlt.fl='dc.title_sort', fl='handle,dc.title_sort')

    # Stats
    d_solr_stats(q="*:*", stats.field="dc.date.accessioned.year")
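The January 2011 example above writes its Solr date range by hand. As a rough sketch, such a clause could also be built from dates in base R. The helper name solr_date_range is hypothetical and is not part of rdryad or solrium. It only assembles the query string and makes no network request.

```r
# Hypothetical helper (not part of rdryad or solrium): build a Solr range
# clause covering whole days, from the start of `from` to the end of `to`.
solr_date_range <- function(field, from, to) {
  from <- as.Date(from)
  to <- as.Date(to)
  stopifnot(from <= to)
  sprintf("%s:[%sT00:00:00Z TO %sT23:59:59Z]",
          field, format(from, "%Y-%m-%d"), format(to, "%Y-%m-%d"))
}

query <- paste("location:l2",
               solr_date_range("dc.date.available_dt", "2011-01-01", "2011-01-31"))
query
# "location:l2 dc.date.available_dt:[2011-01-01T00:00:00Z TO 2011-01-31T23:59:59Z]"
```

The resulting string matches the hand-written query in the example and could be passed in the same way, as d_solr_search(q = query, fl = "dc.identifier", rows = 10).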
