Package lvec. May 24, 2018

Similar documents
Package LaF. November 20, 2017

R topics documented: 2 clone

Package geojsonsf. R topics documented: January 11, Type Package Title GeoJSON to Simple Feature Converter Version 1.3.

Package bigreadr. R topics documented: August 13, Version Date Title Read Large Text Files

Package lumberjack. R topics documented: July 20, 2018

Package kdtools. April 26, 2018

Package validara. October 19, 2017

Package kirby21.base

Package diagis. January 25, 2018

Package gsalib. R topics documented: February 20, Type Package. Title Utility Functions For GATK. Version 2.1.

Package labelvector. July 28, 2018

Package io. January 15, 2018

Package inca. February 13, 2018

Package rgho. R topics documented: January 18, 2017

Package ECctmc. May 1, 2018

Package errorlocate. R topics documented: March 30, Type Package Title Locate Errors with Validation Rules Version 0.1.3

Package fastdummies. January 8, 2018

Package narray. January 28, 2018

Package SimilaR. June 21, 2018

Package zip. R topics documented: March 11, Title Cross-Platform 'zip' Compression Version 2.0.1

Package reconstructr

Package fst. December 18, 2017

Package spark. July 21, 2017

Package datasets.load

Package graphframes. June 21, 2018

Package deductive. June 2, 2017

Package getopt. February 16, 2018

Package RODBCext. July 31, 2017

Package assertive.code

Package synchronicity

Package ChoR. May 27, 2018

Package stapler. November 27, 2017

Package RPostgres. April 6, 2018

Package bannercommenter

Package strat. November 23, 2016

Package fst. June 7, 2018

Package catenary. May 4, 2018

Package cattonum. R topics documented: May 2, Type Package Version Title Encode Categorical Features

Package clipr. June 23, 2018

Package rlas. June 11, 2018

Package CorporaCoCo. R topics documented: November 23, 2017

Package Rglpk. May 18, 2017

Package CompGLM. April 29, 2018

Package zebu. R topics documented: October 24, 2017

Package pwrab. R topics documented: June 6, Type Package Title Power Analysis for AB Testing Version 0.1.0

Package keep. R topics documented: December 16, 2015

Package smoothr. April 4, 2018

Package WriteXLS. March 14, 2013

Package ETLUtils. January 25, 2018

Package intcensroc. May 2, 2018

Package Combine. R topics documented: September 4, Type Package Title Game-Theoretic Probability Combination Version 1.

Package rmatio. July 28, 2017

Package githubinstall

Package rvinecopulib

Package RcppParallel

Package kdetrees. February 20, 2015

Package Numero. November 24, 2018

Package assertive.data

Package rmumps. September 19, 2017

Package RPostgres. December 6, 2017

Package labelled. December 19, 2017

Package nmslibr. April 14, 2018

Package slam. December 1, 2016

Package iterators. December 12, 2017

Package Rtsne. April 14, 2017

Package confidence. July 2, 2017

Package slam. February 15, 2013

Package DSL. July 3, 2015

Package haven. April 9, 2015

Package mfbvar. December 28, Type Package Title Mixed-Frequency Bayesian VAR Models Version Date

Package readxl. April 18, 2017

Package farver. November 20, 2018

Package dostats. R topics documented: August 29, Version Date Title Compute Statistics Helper Functions

Package coga. May 8, 2018

Package assertive.properties

Package timechange. April 26, 2018

Package RcppProgress

Package dat. January 20, 2018

Package statip. July 31, 2018

Package semver. January 6, 2017

Package imgur. R topics documented: December 20, Type Package. Title Share plots using the imgur.com image hosting service. Version 0.1.

Package condformat. October 19, 2017

Package raker. October 10, 2017

Package robustreg. R topics documented: April 27, Version Date Title Robust Regression Functions

Package fasttextr. May 12, 2017

Package fail. R topics documented: October 1, Type Package. Title File Abstraction Interface Layer (FAIL)

Package biganalytics

Package nngeo. September 29, 2018

Package dint. January 3, 2019

Package velox. R topics documented: December 1, 2017

Package gggenes. R topics documented: November 7, Title Draw Gene Arrow Maps in 'ggplot2' Version 0.3.2

Package sankey. R topics documented: October 22, 2017

Package phylogram. June 15, 2017

Package sessioninfo. June 21, 2017

Package chunked. July 2, 2017

Package queuecomputer

Package tidyselect. October 11, 2018

Package MicroStrategyR

Package purrrlyr. R topics documented: May 13, Title Tools at the Intersection of 'purrr' and 'dplyr' Version 0.0.2

Package table1. July 19, 2018

Transcription:

Package lvec May 24, 2018 Type Package Title Out of Memory Vectors Version 0.2.2 Date 2018-05-23 Author Jan van der Laan Maintainer Jan van der Laan <djvanderlaan@unrealizedtime.nl> Core functionality for working with vectors (numeric, integer, logical and character) that are too large to keep in memory. The vectors are kept (partially) on disk using memory mapping. This package contains the basic functionality for working with these memory mapped vectors (e.g. creating, indeing, ordering and sorting) and provides C++ headers which can be used by other packages to etend the functionality provided in this package. URL https://github.com/djvanderlaan/lvec Depends stats, R (>= 3.0.0) LinkingTo BH, Rcpp Imports Rcpp Suggests testthat SystemRequirements C++11 License GPL-3 LazyLoad yes RoygenNote 6.0.1 NeedsCompilation yes Repository CRAN Date/Publication 2018-05-24 07:27:58 UTC 1

2 as_lvec R topics documented: as_lvec........................................... 2 as_rvec........................................... 3 chunk............................................ 3 clone............................................. 4 is_lvec............................................ 5 length.lvec.......................................... 5 lget............................................. 6 lsave............................................. 7 lset.............................................. 8 lvec............................................. 9 lvec_type.......................................... 10 order............................................. 10 print.lvec.......................................... 11 rattr............................................. 12 sort.lvec........................................... 13 strlen............................................ 13 Inde 15 as_lvec Converts a primitive R-vector to lvec Converts a primitive R-vector to lvec as_lvec() the object to convert. This can be a vector of type character, integer, numeric or logical. Returns an lvec of the same type as. When is already and lvec, is returned. For character vectors the maimum length of the lvec is set to the maimum length found in. Eamples # convert a character vector to lvec <- as_lvec(letters) lget(, 1:26)

as_rvec 3 as_rvec Convert complete lvec to R vector Convert complete lvec to R vector as_rvec() lvec to convert. Returns an R vector of type integer, numeric, character or logical. chunk Generate a number of inde ranges from a vector The ranges have a maimum length. chunk(,...) chunk(, chunk_size = 1e+06,...) ## Default S3 method: chunk(, chunk_size = NULL,...) ## S3 method for class 'data.frame' chunk(, chunk_size = NULL,...) an object for which the inde ranges should be calculated. Should support the length method. For eample, an lvec or a regular R vector.... ignored; used to pass additional arguments to other methods. chunk_size a numeric vector of length 1 giving the maimum length of the chunks.

4 clone Details The default chunk size can be changes by setting the option chunk_size, ( options(chunk_size = <new default chunk size>) ). Implementations of chunk for data frames and regular vectors are provided to make it easier to write code that works on both lvec objects and regular R objects. clone Clone an lvec object Clone an lvec object clone(,...) clone(,...) Details lvec object to clone... ignored; used to pass additional arguments to other methods lvec objects are basically pointers to pieces of memory. When copying an object only the pointer is copied and when modifying the copied object also the original object is modified. The advantage of this is speed: these is less copying of the complete vector. In order to obtain a true copy of an lvec code can be used. Eamples a <- as_lvec(1:3) # Copy b <- a # When modifying the copy also the original is modified lset(b, 1, 10) print(b) # Use clone to make a true copy b <- clone(a) lset(b, 1, 100) print(b)

is_lvec 5 is_lvec Check if an object is of type lvec Check if an object is of type lvec is_lvec() the object to check Returns TRUE is the object is of type lvec and FALSE otherwise. length.lvec Get and set the length of an lvec Get and set the length of an lvec length() ## S3 replacement method for class 'lvec' length() <- value value the lvec the new length of the link{lvec} The length of the lvec.

6 lget lget Read elements from an lvec Read elements from an lvec lget(,...) lget(, inde = NULL, range = NULL,...) ## Default S3 method: lget(, inde = NULL, range = NULL,...) ## S3 method for class 'data.frame' lget(, inde = NULL, range = NULL,...) Details the lvec to read from... used to pass on additional arguments to other methods. inde range a logical or numeric vector to inde with a numeric vector of length 2 specifying a range of elements to select. Specify either inde or range. Indeing using inde should follow the same rules as indeing a regular R-vector using a logical or numeric inde. The range given by range includes both end elements. So, a range of c(1,3) selects the first three elements. Returns an lvec with the selected elements. In order to convert the selection to an R-vector as_rvec can e used. Eamples a <- as_lvec(letters[1:4]) # Select first two elements lget(a, 1:2) lget(a, c(true, TRUE, FALSE, FALSE)) lget(a, range = c(1,2))

lsave 7 # Logical indices are recycled: select odd elements lget(a, c(true, FALSE)) lsave Read and write lvec object to file Read and write lvec object to file lsave(, filename, overwrite = TRUE, compress = FALSE) lload(filename) filename overwrite compress lvec object to save name of the file(s) to save the lvec to. See details. overwrite eisting files or abort when files would be overwritten. a logical specifying if the data should be compressed. (see saverds). Details The lvec is written in chunks to a number of RDS files using saverds. When filename contains the etension RDS (capitalisation may differ), this etension is stripped from the filename. After that the lvec is written in blocks (or chunks) to files having names <filename>.00001.rds, <filename>.00002.rds etc. Some additional data (data type, the number of blocks, the size of the lvec, etc) is written to the file <filename>.rds. The size of the chunks can be controlled by the option chunk_size (see chunk). lsave does not return anything. lload returns an lvec.

8 lset lset Set values in an lvec Set values in an lvec lset(,...) lset(, inde = NULL, values, range = NULL,...) ## Default S3 method: lset(, inde = NULL, values, range = NULL,...) ## S3 method for class 'data.frame' lset(, inde = NULL, values, range = NULL,...) Details lvec to set values in... used to pass additional arguments to other methods inde values range a numeric or logical vector with indices at which the values should be set. a vector with the new values. When shorter than the length of the indices the values are recycled. a numeric vector of length 2 specifying a range of elements to select. Specify either inde or range. Should behave in the same way as assigning and indeing to a regular R-vector. The range given by range includes both end elements. So, a range of c(1,3) selects the first three elements. When range is given, and values is not given it is assumed inde contains the values. Therefore, one can do lset(, range = c(1,4), NA), to set the first four elements of to missing. Eamples a <- as_lvec(1:10) # set second element to 20 lset(a, 2, 20) # set odd elements to 20 lset(a, c(true, FALSE), 20) # values are recycled

lvec 9 lset(a, 1:4, 100:101) # range inde; set first 3 elements to NA lset(a, range = c(1,3), NA) lvec Create memory mapped vector The data in these vectors are stored on disk (partially buffered for speed) allowing one to work with more data than fits into available memory. lvec(size, type = c("numeric", "integer", "logical", "character"), strlen = NULL) size Details the size of the vector type the type of the vector. Should be one of the following value: "numeric", "integer", "logical" or "character". The types will create vectors corresponding to the corresponding R types. strlen in case of a vector of type "character" the maimum length of the strings should also be specified using strlen. The minimum value of strlen is two. When a value smaller than that is given it is automatically set to two. This is because a minimum of two bytes is necessary to also store missing values correctly. Returns an object of type lvec. Elements of this vector are stored on file (partially buffered in memory for speed) allowing one to work with more data than fits into memory. Eamples # create an integer vector of length 100 <- lvec(100, type = "integer") # Get the first 10 values; values are initialised to 0 by default lget(, 1:10) # Set the first 10 values to 11:20 lset(, 1:10, 11:20)

10 order # set maimum length of the string to 1, strings longer than that get # truncated. However, minimum value of strlen is 2. <- lvec(10, type = "character", strlen = 1) lset(, 1:3, c("a", "foo", NA)) lget(, 1:3) lvec_type Get the type of the lvec Get the type of the lvec lvec_type() the lvec to get the type from. Returns an character vector of length 1 with one of the following values: "integer", "numeric", "logical" or "character". order Order a lvec Order a lvec order(,...) ## Default S3 method: order(,...) order(,...)

print.lvec 11 lvec to sort... unused. Returns the order of. Unlike the default order function in R, the sort used is not stable (e.g. in case there are multiple records with the same value in, there relative order after sorting is not defined). Eamples <- as_lvec(rnorm(10)) order() print.lvec Print an lvec Print an lvec print(,...) lvec to print.... unused Returns invisibly.

12 rattr rattr Set and get attributes of the original R-vector stored in an lvec Set and get attributes of the original R-vector stored in an lvec rattr(, which) rattr(, which) <- value which value and object of type lvec a character vector of length one giving the name of the attribute. the new value of the attribute Details The attributes of the lvec can be set and obtained using the standard functions attr and attributes. However when an lvec is converted to an R-vector using, for eample, as_rvec, the attributes of the resulting R-vector are set using the result of rattr. This can be used to store vectors such as factors and dates (POSIXct) in lvec objects, as these are basically integer and numeric vectors with a number of additional attributes. Eamples dates <- as_lvec(as.date("2016-12-05", "2016-12-24")) # When printing and reading the result is converted back to a date object print(dates) as_rvec(dates) # make a factor of an integer lvec a <- as_lvec(1:3) rattr(a, "class") <- "factor" rattr(a, "levels") <- c("a", "b", "c")

sort.lvec 13 sort.lvec Sort a lvec Sort a lvec sort(, decreasing = FALSE, clone = TRUE,...) decreasing clone lvec to sort... unused. unused (a value unequal to FALSE will generate an error). clone before sorting Sorts and returns a sorted copy of. When clone is FALSE the input vector is modified. Eamples <- as_lvec(rnorm(10)) sort() # Effect of clone a <- as_lvec(rnorm(10)) b <- sort(a, clone = FALSE) strlen Get and set the maimum string length of a character lvec Get and set the maimum string length of a character lvec strlen() strlen() <- value

14 strlen value a lvec of type character. the new value of the maimum string length. Eamples a <- as_lvec('123') strlen(a) # = 3 # Strings are truncated to strlen lset(a, 1, '123456') # '123' strlen(a) <- 5 lset(a, 1, '123456') # '12345'

Inde as_lvec, 2 as_rvec, 3, 6, 12 attr, 12 attributes, 12 chunk, 3, 7 clone, 4 is_lvec, 5 length, 3 length.lvec, 5 length<-.lvec (length.lvec), 5 lget, 6 lload (lsave), 7 lsave, 7 lset, 8 lvec, 2 8, 9, 10 12 lvec_type, 10 order, 10, 11 print.lvec, 11 rattr, 12 rattr<- (rattr), 12 saverds, 7 sort.lvec, 13 strlen, 13 strlen<- (strlen), 13 15