Package rzeit2. January 7, 2019

Type Package
Title Client for the ZEIT ONLINE Content API
Version 0.2.3
Description Interface to gather newspaper articles from 'DIE ZEIT' and 'ZEIT ONLINE', based on a multilevel query <http://developer.zeit.de/>. A personal API key is required for usage.
License MIT + file LICENSE
Encoding UTF-8
Depends R (>= 3.2.0)
BugReports https://github.com/jandix/rzeit2/issues
LazyData true
RoxygenNote 6.1.1
Imports anytime, httr, jsonlite, openssl, rvest, stringr, xml2
Suggests dplyr, ggplot2, ggthemes, knitr, rmarkdown, robotstxt, tidytext
VignetteBuilder knitr
NeedsCompilation no
Author Jan Dix [aut, cre]
Maintainer Jan Dix <jan.dix@uni-konstanz.de>
Repository CRAN
Date/Publication 2019-01-07 18:00:03 UTC

R topics documented:

get_article_comments
get_article_images
get_article_text
get_client
get_content
get_content_all

rzeit2
sentiment_example
senti_ws
set_api_key
Index


get_article_comments    Get article comments

Description

Get the article comments for a single url.

Usage

    get_article_comments(url, id = NULL, simplify = FALSE, timeout = NULL)

Arguments

url         character. A single character string or character vector.
id          character. You can provide your own id for each article. If it is NULL, the function uses the md5 hash of the url to create one.
simplify    logical. If TRUE, the function returns a data frame; otherwise it returns a nested list.
timeout     integer. Seconds to wait between queries.

Details

get_article_comments fetches and parses article comments. This function may break in the future due to layout changes on the ZEIT ONLINE website.

Value

A list with comments and their respective replies. If the content lies behind the paywall, the function returns "[ZEIT PLUS CONTENT] You need a ZEIT PLUS account to access this content.".

Warning

Use this function carefully because it issues a lot of HTTP requests. Extensive use of this function may result in your IP address being blocked.

Examples

    url <- paste0("https://www.zeit.de/kultur/film/2018-04/",
                  "tatort-frankfurt-unter-kriegern-obduktionsbericht")
    get_article_comments(url = url)


get_article_images    Get article images

Description

Get the article images for a single url or a vector of urls.

Usage

    get_article_images(url, timeout = 0, download = NULL)

Arguments

url         character. A single character string or character vector.
timeout     integer. Seconds to wait between queries.
download    character. Path to download folder. If path is set to NULL, images are not downloaded.

Details

get_article_images fetches and parses meta information for each image of an article and downloads the images. This function may break in the future due to layout changes on the ZEIT ONLINE website.

Value

A data frame including meta information for each image.

Examples

    url <- paste0("https://www.zeit.de/kultur/film/2018-04/",
                  "tatort-frankfurt-unter-kriegern-obduktionsbericht")
    get_article_images(url = url, timeout = 0)
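The timeout argument and the warning about heavy HTTP usage both point to pausing between requests when working through several URLs. The following is a minimal base R sketch of that pattern. The helper fetch_politely and the stand-in fake_fetch are hypothetical names introduced here for illustration; they are not part of rzeit2, and the example makes no network requests.

```r
# Hypothetical helper, not part of rzeit2: apply `fetch` to each URL and
# pause `timeout` seconds between consecutive calls.
fetch_politely <- function(urls, fetch, timeout = 1) {
  results <- vector("list", length(urls))
  names(results) <- urls
  for (i in seq_along(urls)) {
    if (i > 1) Sys.sleep(timeout)  # wait before every request except the first
    # Keep going if one URL fails; store the error message instead.
    results[i] <- list(
      tryCatch(fetch(urls[i]), error = function(e) conditionMessage(e))
    )
  }
  results
}

# Stand-in for a real fetcher such as get_article_images(); it only counts
# characters and never contacts the website.
fake_fetch <- function(url) nchar(url)

urls <- c("https://www.zeit.de/a", "https://www.zeit.de/b-long")
res <- fetch_politely(urls, fake_fetch, timeout = 0)
```

In real use, fake_fetch would be replaced by a function that calls one of the article functions for a single URL, and timeout would be set to a value greater than zero.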

get_article_text    Get article text

Description

Get the article text for a single url or a vector of urls.

Usage

    get_article_text(url, timeout = NULL)

Arguments

url         character. A single character string or character vector.
timeout     integer. Seconds to wait between queries.

Details

get_article_text fetches and parses an article. This function may break in the future due to layout changes on the ZEIT ONLINE website.

Value

A named character vector with the respective text. If the content lies behind the paywall, the function returns "[ZEIT PLUS CONTENT] You need a ZEIT PLUS account to access this content.".

Examples

    url <- paste0("https://www.zeit.de/kultur/film/2018-04/",
                  "tatort-frankfurt-unter-kriegern-obduktionsbericht")
    get_article_text(url = url)
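Because paywalled articles come back as a fixed placeholder string rather than as an error, a caller may want to separate them from real article text. The sketch below shows one way to do that on a mock result. The helper drop_paywalled and the example URLs are hypothetical, and the exact-match comparison assumes the function returns the placeholder exactly as quoted in this manual.

```r
# Placeholder text as quoted in the documentation above.
paywall_msg <- paste("[ZEIT PLUS CONTENT] You need a ZEIT PLUS account",
                     "to access this content.")

# Hypothetical helper: keep only entries that are not the paywall placeholder.
drop_paywalled <- function(texts) {
  texts[!(texts %in% paywall_msg)]
}

# Mock result shaped like the documented return value
# (a named character vector); no request is made.
texts <- c(
  "https://www.zeit.de/example-1" = "Some article text.",
  "https://www.zeit.de/example-2" = paywall_msg
)
free_texts <- drop_paywalled(texts)
```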

get_client    Client status and API usage

Description

get_client returns API access status and usage.

Usage

    get_client(api_key = Sys.getenv("ZEIT_ONLINE_KEY"))

Arguments

api_key     character. The personal api code. To request an API key see: http://developer.zeit.de/quickstart/ This parameter is by default set to the R Environment.

Value

A list of information about the client and API usage.

Author(s)

Jan Dix (<jan.dix@uni-konstanz.de>)

Examples

    get_client()


get_content    Content endpoint

Description

Exposes a search in the ZEIT online archive on the content endpoint and returns results for the given query.

Usage

    get_content(query, limit = 10, offset = 0, sort = "release_date asc",
                begin_date = NULL, end_date = NULL,
                api_key = Sys.getenv("ZEIT_ONLINE_KEY"))

Arguments

query       character. Search query term.
limit       integer. The number of results given back. Please use get_content_all if the limit exceeds 1000 rows.
offset      integer. Offset for the list of matches.
sort        character. Sort search result by various fields. For example: sort=release_date asc, uuid desc.
begin_date  character. Begin date - Restricts responses to results with publication dates of the date specified or later. In the form YYYYMMDD.
end_date    character. End date - Restricts responses to results with publication dates of the date specified or earlier. In the form YYYYMMDD.
api_key     character. The personal api code. To request an API key see: http://developer.zeit.de/quickstart/ This parameter is by default set to the R Environment.

Details

get_content interacts directly with the ZEIT Online API. I only used the content endpoint for this package. There are further endpoints (e.g. /author, /product), not included in this package, that can further specify the search if needed. The whole list of possible endpoints can be accessed here: http://developer.zeit.de/docs/.

Value

A list including articles and meta information about the query.

References

http://developer.zeit.de

See Also

get_content_all

Examples

    get_content(query = "Merkel")
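The begin_date and end_date arguments expect character strings in the form YYYYMMDD. As a small illustration, base R's format() can produce that form from a Date. The helper name to_zeit_date is hypothetical and not part of rzeit2.

```r
# Hypothetical helper: convert a Date (or an ISO "YYYY-MM-DD" string)
# to the "YYYYMMDD" form described for begin_date and end_date.
to_zeit_date <- function(x) {
  format(as.Date(x), "%Y%m%d")
}

begin <- to_zeit_date("2018-05-01")
end   <- to_zeit_date(as.Date("2018-05-31"))

# These strings could then be passed on, for example:
# get_content(query = "Merkel", begin_date = begin, end_date = end)
```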

get_content_all    Content endpoint (all)

Description

Exposes a search in the ZEIT online archive on the content endpoint and returns results for the given query. Performs multiple queries if limit exceeds 1000 rows.

Usage

    get_content_all(query, timeout = 2, begin_date = NULL, end_date = NULL,
                    api_key = Sys.getenv("ZEIT_ONLINE_KEY"))

Arguments

query       character. Search query term.
timeout     integer. Seconds to wait between queries.
begin_date  character. Begin date - Restricts responses to results with publication dates of the date specified or later. In the form YYYYMMDD.
end_date    character. End date - Restricts responses to results with publication dates of the date specified or earlier. In the form YYYYMMDD.
api_key     character. The personal api code. To request an API key see: http://developer.zeit.de/quickstart/ This parameter is by default set to the R Environment.

Details

get_content is the function which interacts directly with the ZEIT Online API. I only used the content endpoint for this package. There are further endpoints (e.g. /author, /product), not included in this package, that can further specify the search if needed. The whole list of possible endpoints can be accessed here: http://developer.zeit.de/docs/.

Value

A list including articles and meta information about the query.

References

http://developer.zeit.de

See Also

get_content
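The description says that multiple queries are performed when a result set exceeds 1000 rows. How get_content_all does this internally is not stated in this manual. The sketch below only illustrates the general idea of paging with the limit and offset arguments of get_content; the helper page_offsets is hypothetical, and the page size of 1000 is taken from the limit mentioned above.

```r
# Hypothetical helper: offsets needed to page through `total` matches
# in chunks of at most `page_size` rows.
page_offsets <- function(total, page_size = 1000) {
  if (total <= 0) return(numeric(0))
  seq(0, total - 1, by = page_size)
}

# 2350 matches would need three requests, starting at rows 0, 1000 and 2000.
offsets <- page_offsets(2350)
```

Each offset could then be passed to get_content(query, limit = 1000, offset = ...) with a pause between calls, which is presumably the role of the timeout argument.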

Examples

    get_content(query = "Merkel")


rzeit2    Client for the ZEIT ONLINE Content API

Description

Interface to gather newspaper articles from DIE ZEIT and ZEIT ONLINE, based on a multilevel query. A personal API key is required for usage.

References

http://developer.zeit.de

See Also

get_content to expose a search in the ZEIT online archive, get_content_all to get all results using get_content, get_client to get client information


sentiment_example    Sentiment scores for 103 ZEIT ONLINE articles

Description

The dataset contains 103 articles returned by a query using the keyword "Merkel" between 1 May and 31 May 2018. The sentiment scores are calculated using the SentimentWortschatz dictionary.

Usage

    sentiment_example

Format

A data frame with 103 rows and 3 variables:

url         the url of the article
date        the release date of the article
score       the calculated sentiment score

See Also

senti_ws
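The manual does not say how the scores in sentiment_example were aggregated from the dictionary weights. As one plausible illustration only, the sketch below sums the weights of all words of a text that appear in a dictionary with the documented word and score columns. The toy dictionary entries, their weights, and the helper score_text are invented for this example, and no stemming or handling of inflected forms is attempted.

```r
# Toy dictionary in the documented senti_ws layout (columns word and score).
# The words and weights here are invented for illustration.
toy_dict <- data.frame(
  word  = c("gut", "schlecht", "erfolg"),
  score = c(0.4, -0.8, 0.6),
  stringsAsFactors = FALSE
)

# Hypothetical scorer: lower-case the text, split on non-letters,
# and sum the weights of all tokens found in the dictionary.
score_text <- function(text, dict) {
  tokens <- strsplit(tolower(text), "[^[:alpha:]]+")[[1]]
  sum(dict$score[match(tokens, dict$word)], na.rm = TRUE)
}

# "schlecht" and "erfolg" match; "guter" does not match "gut" exactly.
score <- score_text("Ein guter Plan, aber schlecht umgesetzt. Erfolg?", toy_dict)
```

With the real senti_ws data frame in place of toy_dict, the same function would apply unchanged, although a mean or a length-normalised sum may be closer to what was actually used.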

senti_ws    SentimentWortschatz (SentiWS)

Description

SentiWS is a publicly available German-language resource for sentiment analysis, opinion mining etc. It lists positive and negative polarity bearing words weighted within the interval of [-1; 1].

Usage

    senti_ws

Format

A data frame with 3468 rows and 2 variables:

word        word
score       score of the respective word

Source

R. Remus, U. Quasthoff & G. Heyer: SentiWS - a Publicly Available German-language Resource for Sentiment Analysis. In: Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC'10), pp. 1168-1171, 2010

http://wortschatz.uni-leipzig.de/en/download/


set_api_key    Set api key to the .Renviron

Description

Function to set your API key in the R environment when you start using the rzeit2 package. Attention: You should only execute this function once.

Usage

    set_api_key(api_key, path = stop("please specify a path."))

Arguments

api_key     character. The personal api code. To request an API key see: http://developer.zeit.de/quickstart/
path        character. Path where the environment is stored. Default is the normalized path.

Value

None.

Examples

    # this is not an actual api key
    api_key <- "5t5yno5qqkufxis5q2vzx26vxq2hqej9"
    set_api_key(api_key, tempdir())
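The query functions read the key from the ZEIT_ONLINE_KEY environment variable by default, so it can help to check that the variable is set before making a request. The sketch below uses only base R. The helper require_zeit_key is hypothetical and not part of rzeit2, and the key value is a placeholder.

```r
# Hypothetical helper: return the key stored in ZEIT_ONLINE_KEY, or stop
# with a readable message if the variable is unset or empty.
require_zeit_key <- function() {
  key <- Sys.getenv("ZEIT_ONLINE_KEY")
  if (!nzchar(key)) {
    stop("ZEIT_ONLINE_KEY is not set; see set_api_key().", call. = FALSE)
  }
  key
}

# Session-only alternative to editing .Renviron
# (placeholder value, not a real key).
Sys.setenv(ZEIT_ONLINE_KEY = "not-a-real-key")
key <- require_zeit_key()
```

Sys.setenv() affects the current R session only, whereas an entry written to .Renviron is read again each time R starts.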

Index

Topic datasets
    senti_ws
    sentiment_example

get_article_comments
get_article_images
get_article_text
get_client
get_content
get_content_all
rzeit2
rzeit2-package (rzeit2)
senti_ws
sentiment_example
set_api_key