Package kdevine. May 19, 2017


Type Package
Title Multivariate Kernel Density Estimation with Vine Copulas
Version 0.4.1
Date 2017-05-20
URL https://github.com/tnagler/kdevine
BugReports https://github.com/tnagler/kdevine/issues
Description Implements the vine copula based kernel density estimator of Nagler and Czado (2016) <doi:10.1016/j.jmva.2016.07.003>. The estimator does not suffer from the curse of dimensionality and is therefore well suited for high-dimensional applications.
License GPL-3
Imports graphics, stats, utils, MASS, Rcpp, qrng, KernSmooth, cctools, kdecopula (>= 0.8.1), VineCopula, doParallel, parallel, foreach
LazyData yes
LinkingTo Rcpp
RoxygenNote 6.0.1
Suggests testthat
NeedsCompilation yes
Author Thomas Nagler [aut, cre]
Maintainer Thomas Nagler <thomas.nagler@tum.de>
Repository CRAN
Date/Publication 2017-05-19 21:26:22 UTC

R topics documented:

kdevine-package
contour.kdevinecop
dkde1d
dkdevine

dkdevinecop
kde1d
kdevine
kdevinecop
plot.kde1d
rkdevine
wdbc


kdevine-package    Kernel Smoothing for Bivariate Copula Densities

Description

This package implements a vine copula based kernel density estimator. The estimator does not suffer from the curse of dimensionality and is therefore well suited for high-dimensional applications (see Nagler and Czado, 2016).

Details

The multivariate kernel density estimator is implemented by the kdevine function. It combines a kernel density estimator for the margins (kde1d) and a kernel estimator of the vine copula density (kdevinecop). The package is built on top of the copula density estimators in the kdecopula package (see kdecopula::kdecopula-package) and lets you choose from all its implemented methods. Optionally, the vine copula can be estimated parametrically (only the margins are nonparametric).

Author(s)

Thomas Nagler

References

Nagler, T., Czado, C. (2016) Evading the curse of dimensionality in nonparametric density estimation with simplified vine copulas. Journal of Multivariate Analysis 151, 69-89 (doi:10.1016/j.jmva.2016.07.003)

Nagler, T., Schellhase, C. and Czado, C. (2017) Nonparametric estimation of simplified vine copula models: comparison of methods. arXiv:1701.00845

Nagler, T. (2017) A generic approach to nonparametric function estimation with mixed data. arXiv:1704.07457
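The construction described under Details rests on the fact that a joint density can be written as the product of its marginal densities and a copula density evaluated at the marginal cdfs. The following base R sketch checks the two-dimensional case for a bivariate Gaussian with known margins, so no kernel estimation is involved. The helper names gauss_copula_density, joint_density and bvn_density are hypothetical and not part of kdevine.

```r
# Gaussian copula density with correlation rho
# (hypothetical helper, not part of kdevine)
gauss_copula_density <- function(u, v, rho) {
  a <- qnorm(u)
  b <- qnorm(v)
  exp(-(rho^2 * (a^2 + b^2) - 2 * rho * a * b) / (2 * (1 - rho^2))) /
    sqrt(1 - rho^2)
}

# joint density assembled from margins and copula:
# f(x1, x2) = f1(x1) * f2(x2) * c(F1(x1), F2(x2))
joint_density <- function(x1, x2, rho) {
  dnorm(x1) * dnorm(x2) * gauss_copula_density(pnorm(x1), pnorm(x2), rho)
}

# closed-form bivariate standard normal density for comparison
bvn_density <- function(x1, x2, rho) {
  exp(-(x1^2 - 2 * rho * x1 * x2 + x2^2) / (2 * (1 - rho^2))) /
    (2 * pi * sqrt(1 - rho^2))
}

x1 <- c(-1, 0, 0.5, 2)
x2 <- c(0.3, -0.7, 1.2, 1.5)

# both routes should agree up to numerical precision
all.equal(joint_density(x1, x2, rho = 0.6), bvn_density(x1, x2, rho = 0.6))
```

In kdevine the known margins are replaced by kde1d estimates and the copula density by a kdevinecop estimate, which builds a d-dimensional copula from pair copulas.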

contour.kdevinecop    Contour plots of pair copula kernel estimates

Description

Contour plots of pair copula kernel estimates

Usage

    ## S3 method for class 'kdevinecop'
    contour(x, tree = "ALL", xylim = NULL, cex.nums = 1, ...)

Arguments

    x           a kdevinecop object.
    tree        "ALL" or integer vector; specifies which trees are plotted.
    xylim       numeric vector of length 2; sets xlim and ylim for the contours.
    cex.nums    numeric; expansion factor for font of the numbers.
    ...         arguments passed to contour.kdecopula.

Examples

    data(wdbc, package = "kdecopula")                      # load data
    u <- VineCopula::pobs(wdbc[, 5:7], ties = "average")   # rank-transform

    # estimate density
    fit <- kdevinecop(u)

    # contour matrix
    contour(fit)


dkde1d    Working with a kde1d object

Description

The density, cdf, or quantile function of a kernel density estimate can be evaluated at arbitrary points with dkde1d, pkde1d, and qkde1d, respectively.

Usage

    dkde1d(x, obj)
    pkde1d(x, obj)
    qkde1d(x, obj)
    rkde1d(n, obj, quasi = FALSE)

Arguments

    x        vector of evaluation points.
    obj      a kde1d object.
    n        integer; number of observations.
    quasi    logical; the default (FALSE) returns pseudo-random numbers, use TRUE for quasi-random numbers (generalized Halton, see ghalton).

Value

The density or cdf estimate evaluated at x.

See Also

kde1d

Examples

    data(wdbc)                 # load data
    fit <- kde1d(wdbc[, 5])    # estimate density
    dkde1d(1000, fit)          # evaluate density estimate
    pkde1d(1000, fit)          # evaluate corresponding cdf
    qkde1d(0.5, fit)           # quantile function
    hist(rkde1d(100, fit))     # simulate


dkdevine    Evaluate the density of a kdevine object

Description

Evaluate the density of a kdevine object

Usage

    dkdevine(x, obj)

Arguments

    x      (m x d) matrix of evaluation points (or vector of length d).
    obj    a kdevine object.

Value

The density estimate evaluated at x.

See Also

kdevine

Examples

    # load data
    data(wdbc)

    # estimate density (use xmin to indicate positive support)
    fit <- kdevine(wdbc[, 5:7], xmin = rep(0, 3))

    # evaluate density estimate
    dkdevine(c(1000, 0.1, 0.1), fit)


dkdevinecop    Working with a kdevinecop object

Description

A vine copula density estimate (stored in a kdevinecop object) can be evaluated at arbitrary points with dkdevinecop. Furthermore, you can simulate from the estimated density with rkdevinecop.

Usage

    dkdevinecop(u, obj, stable = FALSE)
    rkdevinecop(n, obj, U = NULL, quasi = FALSE)

Arguments

    u         (m x d) matrix of evaluation points.
    obj       kdevinecop object.
    stable    logical; option for stabilizing the estimator: the estimated pair copula density is cut off at 50.
    n         integer; number of observations.
    U         (optional) (n x d) matrix of independent uniform random variables.
    quasi     logical; the default (FALSE) returns pseudo-random numbers, use TRUE for quasi-random numbers (generalized Halton, see ghalton).

Value

A numeric vector of the density/cdf or an (n x d) matrix of simulated data.

Author(s)

Thomas Nagler

References

Nagler, T., Czado, C. (2016) Evading the curse of dimensionality in nonparametric density estimation with simplified vine copulas. Journal of Multivariate Analysis 151, 69-89 (doi:10.1016/j.jmva.2016.07.003)

Dissmann, J., Brechmann, E. C., Czado, C., and Kurowicka, D. (2013). Selecting and estimating regular vine copulae and application to financial returns. Computational Statistics & Data Analysis, 59(0):52-69.

See Also

kdevinecop, dkdecop, rkdecop, ghalton

Examples

    data(wdbc, package = "kdecopula")                      # load data
    u <- VineCopula::pobs(wdbc[, 5:7], ties = "average")   # rank-transform

    fit <- kdevinecop(u)                  # estimate density
    dkdevinecop(c(0.1, 0.1, 0.1), fit)    # evaluate density estimate


kde1d    Univariate kernel density estimation for bounded and unbounded support

Description

Discrete variables are convolved with the uniform distribution (see Nagler, 2017). If a variable should be treated as discrete, declare it as ordered().

Usage

    kde1d(x, mult = 1, xmin = -Inf, xmax = Inf, bw = NULL, bw_min = 0, ...)

Arguments

    x         vector of length n.
    mult      numeric; the actual bandwidth used is bw * mult.
    xmin      lower bound for the support of the density.
    xmax      upper bound for the support of the density.
    bw        bandwidth parameter; has to be a positive number or NULL; the latter calls KernSmooth::dpik().
    bw_min    minimum value for the bandwidth.
    ...       unused.

Details

If xmin or xmax are finite, the density estimate will be 0 outside of [xmin, xmax]. Mirror-reflection is used to correct for boundary bias. Discrete variables are convolved with the uniform distribution (see Nagler, 2017).

Value

An object of class kde1d.

References

Nagler, T. (2017). A generic approach to nonparametric function estimation with mixed data. arXiv:1704.07457

See Also

dkde1d, pkde1d, qkde1d, rkde1d, plot.kde1d, lines.kde1d

Examples

    data(wdbc, package = "kdecopula")   # load data
    fit <- kde1d(wdbc[, 5])             # estimate density
    dkde1d(1000, fit)                   # evaluate density estimate


kdevine    Kernel density estimator based on simplified vine copulas

Description

Implements the vine-copula based estimator of Nagler and Czado (2016). The marginal densities are estimated by kde1d, the vine copula density by kdevinecop. Discrete variables are convolved with the uniform distribution (see Nagler, 2017). If a variable should be treated as discrete, declare it as ordered(). Factors are expanded into binary dummy codes.
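To make the treatment of discrete variables more concrete, here is a minimal base R sketch of what convolving a discrete variable with a uniform distribution can look like: independent uniform noise is added to integer data, and a continuous density estimator is applied to the result. The noise interval (-0.5, 0.5) and the use of stats::density are assumptions made for illustration; the package may use a different noise density and estimator.

```r
set.seed(1)

# integer-valued sample (e.g. counts)
x <- rpois(500, lambda = 3)

# add independent U(-0.5, 0.5) noise so that the data become continuous
# (assumed noise interval, chosen for illustration only)
x_jit <- x + runif(length(x), min = -0.5, max = 0.5)

# any continuous density estimator can now be applied to the jittered sample
fit <- density(x_jit)

# the density estimate at an integer k roughly approximates P(X = k)
k <- 0:6
f_hat <- approx(fit$x, fit$y, xout = k)$y
round(cbind(k = k, estimate = f_hat, truth = dpois(k, lambda = 3)), 3)
```

With kdevine or kde1d itself, this step is triggered by passing the variable as ordered(), as stated in the description above.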

Usage

    kdevine(x, mult_1d = NULL, xmin = NULL, xmax = NULL, copula.type = "kde", ...)

Arguments

    x              (n x d) data matrix.
    mult_1d        numeric; all bandwidths for marginal kernel density estimation are multiplied by mult_1d. Defaults to log(1 + d), where d is the number of variables after applying cctools::expand_as_numeric().
    xmin           numeric vector of length d; see kde1d.
    xmax           numeric vector of length d; see kde1d.
    copula.type    either "kde" (default) or "parametric" for kernel or parametric estimation of the vine copula.
    ...            further arguments passed to kde1d or kdevinecop.

Value

An object of class kdevine.

References

Nagler, T., Czado, C. (2016) Evading the curse of dimensionality in nonparametric density estimation with simplified vine copulas. Journal of Multivariate Analysis 151, 69-89 (doi:10.1016/j.jmva.2016.07.003)

Nagler, T. (2017). A generic approach to nonparametric function estimation with mixed data. arXiv:1704.07457

See Also

dkdevine, kde1d, kdevinecop

Examples

    # load data
    data(wdbc, package = "kdecopula")

    # estimate density (use xmin to indicate positive support)
    fit <- kdevine(wdbc[, 5:7], xmin = rep(0, 3))

    # evaluate density estimate
    dkdevine(c(1000, 0.1, 0.1), fit)

    # plot simulated data
    pairs(rkdevine(nrow(wdbc), fit))
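The xmin argument used in the example above is handed to kde1d, which corrects boundary bias by mirror reflection. The base R sketch below illustrates that idea for a lower bound only. The helper kde_reflect is hypothetical, and the Gaussian kernel and rule-of-thumb bandwidth are assumptions made for this sketch, not the package's implementation.

```r
# kernel density estimate on [xmin, Inf) with mirror reflection at xmin
# (hypothetical helper; Gaussian kernel is an assumption)
kde_reflect <- function(t, x, bw, xmin = 0) {
  vapply(t, function(t0) {
    if (t0 < xmin) return(0)
    # each observation contributes its own kernel plus the kernel of its
    # mirror image 2 * xmin - x_i, which returns the mass lost below xmin
    mean(dnorm(t0, mean = x, sd = bw) +
           dnorm(t0, mean = 2 * xmin - x, sd = bw))
  }, numeric(1))
}

set.seed(1)
x <- rexp(200)      # positive data with a mode at the boundary
bw <- bw.nrd0(x)    # rule-of-thumb bandwidth, used here for simplicity

# the estimate is zero to the left of the boundary
kde_reflect(c(-0.1, 0, 0.5, 1), x, bw)

# the reflected estimate still integrates to (approximately) one
total <- integrate(function(t) kde_reflect(t, x, bw),
                   lower = 0, upper = max(x) + 10 * bw,
                   subdivisions = 500)$value
total
```

Without the reflected term, part of each kernel near zero would fall outside the support and the estimate would be biased downwards at the boundary.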

kdevinecop    Kernel estimation of vine copula densities

Description

The function estimates a vine copula density using kernel estimators for the pair copulas (based on the kdecopula package).

Usage

    kdevinecop(data, matrix = NA, method = "TLL2", renorm.iter = 3L, mult = 1,
      test.level = NA, trunc.level = NA, treecrit = "tau", cores = 1, info = FALSE)

Arguments

    data           (n x d) matrix of copula data (have to lie in [0, 1]^d).
    matrix         R-Vine matrix (d x d) specifying the structure of the vine; if NA (default) the structure selection heuristic of Dissmann et al. (2013) is applied.
    method         see kdecop.
    renorm.iter    see kdecop.
    mult           see kdecop.
    test.level     significance level for independence test. If you provide a number in [0, 1], an independence test (BiCopIndTest) will be performed for each pair; if the null hypothesis of independence cannot be rejected, the independence copula will be set for this pair. If test.level = NA (default), no independence test will be performed.
    trunc.level    integer; the truncation level. All pair copulas in trees above the truncation level will be set to independence.
    treecrit       criterion for structure selection; defaults to "tau".
    cores          integer; if cores > 1, estimation will be parallelized within each tree (using foreach).
    info           logical; if TRUE, additional information about the estimate will be gathered (see kdecop).

Value

An object of class kdevinecop. That is, a list containing

    T1, T2, ...    lists of the estimated pair copulas in each tree,
    matrix         the structure matrix of the vine,
    info           additional information about the fit (if info = TRUE).
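The data argument expects copula data, that is, observations whose margins lie in [0, 1]. The examples in this manual obtain them with VineCopula::pobs. As a rough base R sketch of that rank transform, assuming the usual definition rank / (n + 1) for pseudo-observations, one could write the following; the helper to_copula_data is hypothetical.

```r
# rank-transform each column to pseudo-observations in (0, 1)
# (hypothetical helper; assumes the usual definition rank / (n + 1))
to_copula_data <- function(x) {
  x <- as.matrix(x)
  apply(x, 2, rank, ties.method = "average") / (nrow(x) + 1)
}

set.seed(1)
x <- cbind(rexp(100), rnorm(100, mean = 50, sd = 10))
u <- to_copula_data(x)

# all values lie strictly inside the unit interval,
# regardless of the original scale of each column
range(u)
```

Dividing by n + 1 rather than n keeps every value strictly below 1, which avoids evaluating a copula density on the boundary of the unit cube.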

References

Nagler, T., Czado, C. (2016) Evading the curse of dimensionality in nonparametric density estimation with simplified vine copulas. Journal of Multivariate Analysis 151, 69-89 (doi:10.1016/j.jmva.2016.07.003)

Nagler, T., Schellhase, C. and Czado, C. (2017) Nonparametric estimation of simplified vine copula models: comparison of methods. arXiv:1701.00845

Dissmann, J., Brechmann, E. C., Czado, C., and Kurowicka, D. (2013). Selecting and estimating regular vine copulae and application to financial returns. Computational Statistics & Data Analysis, 59(0):52-69.

See Also

dkdevinecop, kdecop, BiCopIndTest, foreach

Examples

    data(wdbc, package = "kdecopula")
    # rank-transform to copula data (margins are uniform)
    u <- VineCopula::pobs(wdbc[, 5:7], ties = "average")

    fit <- kdevinecop(u)                  # estimate density
    dkdevinecop(c(0.1, 0.1, 0.1), fit)    # evaluate density estimate
    contour(fit)                          # contour matrix (Gaussian scale)
    pairs(rkdevinecop(500, fit))          # plot simulated data


plot.kde1d    Plotting kde1d objects

Description

Plotting kde1d objects

Usage

    ## S3 method for class 'kde1d'
    plot(x, ...)

    ## S3 method for class 'kde1d'
    lines(x, ...)

Arguments

    x      kde1d object.
    ...    further arguments passed to plot.default.

See Also

kde1d, lines.kde1d

Examples

    data(wdbc)                              # load data
    fit <- kde1d(wdbc[, 7])                 # estimate density
    plot(fit)                               # plot density estimate

    fit2 <- kde1d(as.ordered(wdbc[, 1]))    # discrete variable
    plot(fit2, col = 2)


rkdevine    Simulate from a kdevine object

Description

Simulate from a kdevine object

Usage

    rkdevine(n, obj)

Arguments

    n      number of observations.
    obj    a kdevine object.

Value

An (n x d) matrix of simulated data from the kdevine object.

See Also

kdevine, rkdevinecop, rkde1d

Examples

    # load data
    data(wdbc)

    # estimate density
    fit <- kdevine(wdbc[, 5:7], xmin = rep(0, 3))

    # plot simulated data
    pairs(rkdevine(nrow(wdbc), fit))
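The manual does not describe how rkdevine works internally. As a hedged illustration of how simulation from a model built from margins and a copula is commonly done, the base R sketch below first draws dependent uniforms from a copula and then maps each column through a marginal quantile function. A Gaussian copula and empirical quantiles are stand-ins chosen for this sketch; they are not the kernel vine copula or the kde1d margins of the package.

```r
set.seed(42)

# toy data: two positive, dependent variables
n <- 300
corr <- matrix(c(1, 0.7, 0.7, 1), nrow = 2)
x <- exp(matrix(rnorm(2 * n), ncol = 2) %*% chol(corr))

# Step 1: draw dependent uniforms from a copula. A Gaussian copula, fitted via
# the correlation of normal scores, stands in for the kernel vine copula.
scores <- qnorm(apply(x, 2, rank) / (n + 1))
rho <- cor(scores)[1, 2]
n_sim <- 500
z_sim <- matrix(rnorm(2 * n_sim), ncol = 2) %*%
  chol(matrix(c(1, rho, rho, 1), nrow = 2))
u_sim <- pnorm(z_sim)

# Step 2: map each uniform column through a marginal quantile function.
# Empirical quantiles stand in for the quantile function of a kde1d fit.
x_sim <- sapply(1:2, function(j) {
  quantile(x[, j], probs = u_sim[, j], names = FALSE)
})

head(x_sim)
```

In terms of the functions documented here, step 1 corresponds to rkdevinecop and step 2 to qkde1d applied to each margin.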

wdbc    Wisconsin Diagnostic Breast Cancer (WDBC)

Description

The data contain measurements on cells in suspicious lumps in a woman's breast. Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. All samples are classified as either benign or malignant.

Usage

    data(wdbc)

Format

wdbc is a data.frame with 31 columns. The first column indicates whether the sample is classified as benign (B) or malignant (M). The remaining columns contain measurements for 30 features.

Details

Ten real-valued features are computed for each cell nucleus:

a) radius (mean of distances from center to points on the perimeter)
b) texture (standard deviation of gray-scale values)
c) perimeter
d) area
e) smoothness (local variation in radius lengths)
f) compactness (perimeter^2 / area - 1.0)
g) concavity (severity of concave portions of the contour)
h) concave points (number of concave portions of the contour)
i) symmetry
j) fractal dimension ("coastline approximation" - 1)

The references listed below contain detailed descriptions of how these features are computed.

The mean, standard error, and "worst" or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features.

Note

This breast cancer database was obtained from the University of Wisconsin Hospitals, Madison from Dr. William H. Wolberg.

Source

https://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(diagnostic)

Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science.

References

O. L. Mangasarian and W. H. Wolberg: "Cancer diagnosis via linear programming", SIAM News, Volume 23, Number 5, September 1990, pp 1 & 18.

William H. Wolberg and O. L. Mangasarian: "Multisurface method of pattern separation for medical diagnosis applied to breast cytology", Proceedings of the National Academy of Sciences, U.S.A., Volume 87, December 1990, pp 9193-9196.

K. P. Bennett & O. L. Mangasarian: "Robust linear programming discrimination of two linearly inseparable sets", Optimization Methods and Software 1, 1992, 23-34 (Gordon & Breach Science Publishers).

Examples

    data(wdbc)
    str(wdbc)
