Package bigreadr. R topics documented: August 13, Version Date Title Read Large Text Files

Similar documents
Package validara. October 19, 2017

Package fastdummies. January 8, 2018

Package messaging. May 27, 2018

Package datasets.load

Package ECctmc. May 1, 2018

Package cattonum. R topics documented: May 2, Type Package Version Title Encode Categorical Features

Package geojsonsf. R topics documented: January 11, Type Package Title GeoJSON to Simple Feature Converter Version 1.3.

Package repec. August 31, 2018

Package spark. July 21, 2017

Package fasttextr. May 12, 2017

Package tidyselect. October 11, 2018

Package pwrab. R topics documented: June 6, Type Package Title Power Analysis for AB Testing Version 0.1.0

Package humanize. R topics documented: April 4, Version Title Create Values for Human Consumption

Package fst. December 18, 2017

Package canvasxpress

Package semver. January 6, 2017

Package clipr. June 23, 2018

Package strat. November 23, 2016

Package farver. November 20, 2018

Package hyphenatr. August 29, 2016

Package geogrid. August 19, 2018

Package kdtools. April 26, 2018

Package wrswor. R topics documented: February 2, Type Package

Package ompr. November 18, 2017

Package tidyimpute. March 5, 2018

Package sessioninfo. June 21, 2017

Package triebeard. August 29, 2016

Package readxl. April 18, 2017

Package sankey. R topics documented: October 22, 2017

Package mdftracks. February 6, 2017

Package raker. October 10, 2017

Package ggimage. R topics documented: November 1, Title Use Image in 'ggplot2' Version 0.0.7

Package fst. June 7, 2018

Package projector. February 27, 2018

Package barcoder. October 26, 2018

Package rbraries. April 18, 2018

Package SimilaR. June 21, 2018

Package deductive. June 2, 2017

Package jstree. October 24, 2017

Package kirby21.base

Package lumberjack. R topics documented: July 20, 2018

Package zip. R topics documented: March 11, Title Cross-Platform 'zip' Compression Version 2.0.1

Package ggimage. R topics documented: December 5, Title Use Image in 'ggplot2' Version 0.1.0

Package sfc. August 29, 2016

Package comparedf. February 11, 2019

Package spelling. December 18, 2017

Package PCADSC. April 19, 2017

Package ecoseries. R topics documented: September 27, 2017

Package SEMrushR. November 3, 2018

Package pkgbuild. October 16, 2018

Package coga. May 8, 2018

Package crossword.r. January 19, 2018

Package readobj. November 2, 2017

Package elmnnrcpp. July 21, 2018

Package embed. November 19, 2018

Package robotstxt. November 12, 2017

Package githubinstall

Package gtrendsr. August 4, 2018

Package calpassapi. August 25, 2018

Package available. November 17, 2017

Package CompGLM. April 29, 2018

Package rlas. June 11, 2018

Package dbx. July 5, 2018

Package rtext. January 23, 2019

Package nmslibr. April 14, 2018

Package gtrendsr. October 19, 2017

Package AhoCorasickTrie

Package stapler. November 27, 2017

Package jdx. R topics documented: January 9, Type Package Title 'Java' Data Exchange for 'R' and 'rjava'

Package datapasta. January 24, 2018

Package utilsipea. November 13, 2017

Package iccbeta. January 28, 2019

Package dkanr. July 12, 2018

Package pmatch. October 19, 2018

Package rsppfp. November 20, 2018

Package balance. October 12, 2018

Package regexselect. R topics documented: September 22, Version Date Title Regular Expressions in 'shiny' Select Lists

Package heims. January 25, 2018

Package docxtools. July 6, 2018

Package vinereg. August 10, 2018

Package staplr. May 9, 2018

Package librarian. R topics documented:

Package reconstructr

Package opencage. January 16, 2018

Package wikitaxa. December 21, 2017

Package snakecase. R topics documented: March 25, Version Date Title Convert Strings into any Case

Package combiter. December 4, 2017

Package gggenes. R topics documented: November 7, Title Draw Gene Arrow Maps in 'ggplot2' Version 0.3.2

Package CatEncoders. March 8, 2017

Package knitrprogressbar

Package IATScore. January 10, 2018

R topics documented: 2 clone

Package nngeo. September 29, 2018

Package slickr. March 6, 2018

Package qualmap. R topics documented: September 12, Type Package

Package ether. September 22, 2018

Package pinyin. October 17, 2018

Package TVsMiss. April 5, 2018

Package filelock. R topics documented: February 7, 2018

Package splithalf. March 17, 2018

Transcription:

Version 0.1.3 Date 2018-08-12 Title Read Large Text Files Package bigreadr August 13, 2018 Read large text s by splitting them in smaller s. License GPL-3 Encoding UTF-8 LazyData true ByteCompile true RoxygenNote 6.1.0 Imports data.table, Rcpp, parallel, fpeek, utils Suggests spelling, testthat, covr LinkingTo Rcpp Language en-us URL https://github.com/privefl/bigreadr BugReports https://github.com/privefl/bigreadr/issues NeedsCompilation yes Author Florian Privé [aut, cre] Maintainer Florian Privé <florian.prive.21@gmail.com> Repository CRAN Date/Publication 2018-08-13 16:30:09 UTC R topics documented: big_fread1.......................................... 2 big_fread2.......................................... 2 cbind_df........................................... 3 fread2............................................ 4 fwrite2............................................ 4 nlines............................................ 5 rbind_df........................................... 6 split_........................................... 6 1

2 big_fread2 Index 8 big_fread1 Read large text Read large text by splitting lines. big_fread1(, every_nlines,.transform = identity,.combine = rbind_df, skip = 0,..., print_timings = TRUE) every_nlines.transform.combine skip Path to that you want to read. Maximum number of lines in new parts. Function to transform each data frame corresponding to each part of the. Default doesn t change anything. Function to combine results (list of data frames). Number of lines to skip at the beginning of.... Other arguments to be passed to data.table::fread, excepted input,, skip and col.names. print_timings Whether to print timings? Default is TRUE. A data.frame by default; a data.table when data.table = TRUE. big_fread2 Read large text Read large text by splitting columns. big_fread2(, nb_parts = NULL,.transform = identity,.combine = cbind_df, skip = 0, select = NULL, progress = FALSE, part_size = 500 * 1024^2,...)

cbind_df 3 nb_parts.transform.combine skip select progress Path to that you want to read. Number of parts in which to split reading (and transforming). Parts are referring to blocks of selected columns. Default uses part_size to set a good value. Function to transform each data frame corresponding to each block of selected columns. Default doesn t change anything. Function to combine results (list of data frames). Number of lines to skip at the beginning of. Indices of columns to keep (sorted). Default keeps them all. Show progress? Default is FALSE. part_size Size of the parts if nb_parts is not supplied. Default is 500 * 1024^2 (500 MB).... Other arguments to be passed to data.table::fread, excepted input,, skip, select and showprogress. The outputs of fread2 +.transform, combined with.combine. cbind_df Merge data frames Merge data frames cbind_df(list_df) list_df A list of multiple data frames with the same observations in the same order. One merged data frame. cbind_df(list(iris, iris))

4 fwrite2 fread2 Read a text Read a text fread2(,..., data.table = FALSE, nthread = getoption("bigreadr.nthread")) Path to the that you want to read from.... Other arguments to be passed to data.table::fread. data.table nthread Whether to return a data.table or just a data.frame? Default is FALSE (and is the opposite of data.table::fread). Number of threads to use. Default uses all threads minus one. A data.frame by default; a data.table when data.table = TRUE. tmp <- fwrite2(iris) iris2 <- fread2(tmp) all.equal(iris2, iris) ## fread doesn't use factors fwrite2 Write a data frame to a text Write a data frame to a text fwrite2(x, = temp(),..., quote = FALSE, nthread = getoption("bigreadr.nthread"))

nlines 5 x Data frame to write. Path to the that you want to write to. Defaults uses temp().... Other arguments to be passed to data.table::fwrite. quote Whether to quote strings (default is FALSE). nthread Number of threads to use. Default uses all threads minus one. Input parameter, invisibly. tmp <- fwrite2(iris) iris2 <- fread2(tmp) all.equal(iris2, iris) ## fread doesn't use factors nlines Number of lines Get the number of lines of a using R package fpeek. nlines() Path of the. The number of lines as one integer. tmp <- fwrite2(iris) nlines(tmp)

6 split_ rbind_df Merge data frames Merge data frames rbind_df(list_df) list_df A list of multiple data frames with the same variables in the same order. One merged data frame with the names of the first input data frame. rbind_df(list(iris, iris)) split_ Split every nlines Split every nlines Get s from splitting. split_(, every_nlines, prefix_out = temp()) get_split_s(split out) Path to that you want to split. every_nlines Maximum number of lines in new parts. prefix_out Prefix for created s. Default uses temp(). split out Output of split_.

split_ 7 A list with name_in: input parameter, prefix_out: input parameter prefix_out, ns: Number of s (parts) created, nlines_part: input parameter every_nlines, nlines_all: total number of lines of. Vector of paths created by split_. tmp <- fwrite2(iris) infos <- split_(tmp, 100) str(infos) get_split_s(infos)

Index big_fread1, 2 big_fread2, 2 cbind_df, 3 data.table::fread, 2 4 data.table::fwrite, 5 fread2, 4 fwrite2, 4 get_split_s (split_), 6 nlines, 5 rbind_df, 6 split_, 6, 6, 7 8