Oracle R Technologies Overview
1 Oracle R Technologies Overview. Mark Hornick, Director, Oracle Database Advanced Analytics
2 The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remain at the sole discretion of Oracle.
3 Mark Hornick, Director, Oracle Advanced Analytics. Joined Oracle in 1999 through acquisition of Thinking Machines Corp. Recent publications: Using R to Unlock the Value of Big Data (Oracle Press); Oracle Big Data Handbook (Oracle Press). Founding member / Oracle Advisor: Business Intelligence, Warehousing and Analytics SIG of IOUG. BIWASummit.org at Oracle HQ, Redwood Shores, CA, Jan 14-16, 2014. Blogger: blogs.oracle.com/r. Connect on LinkedIn: Mark Hornick.
5 Motivation: R is increasingly recognized as the language of choice. Oracle's R technologies scale R to the enterprise. Oracle gives back to the R community. Share with the Boston Big Data community.
6 R's Popularity: "We can see that discussion of R has grown the most rapidly and, for the past few years, R is the most discussed software by an almost two-to-one margin." (Robert A. Muenchen)
7 2013 Data Mining Survey. Vendors (18%) were excluded from these analyses.
8 2013 Data Mining Survey
9 Oracle's R Technologies: Oracle R Distribution and ROracle (software available to the R community for free); Oracle R Enterprise; Oracle R Advanced Analytics for Hadoop.
10 Agenda: What is R? Three concerns: scalability, performance, deployment. Oracle's R technologies. How to get started.
11 What is R?
12 What is R? R is an open source language and environment for statistical computing and graphics. Started in 1994 as an alternative to SAS, SPSS, and other proprietary statistical environments. An integrated suite of software facilities for data manipulation, analytical calculations, and graphics. Over 2 million R users worldwide. Widely taught in universities. Many corporate analysts know and use R. A thriving ecosystem with thousands of open source packages.
13 Why statisticians/data analysts use R. The R environment is powerful, extensible, graphical, and free, with extensive statistics, out-of-the-box functionality with many knobs but smart defaults, and ease of installation and use. R is a statistics language similar to Base SAS or SPSS Statistics.
14 Three Concerns for Enterprise Data
15 Three concerns for enterprise data analytics: scalability; performance; production deployment.
16 A fourth concern: remain in the R language and environment. Same paradigm; SQL not required. Design, code, test, and deploy from R.
17 Oracle's R Technologies
18 Oracle R Distribution
19 Oracle R Distribution: Oracle's redistribution of open source R. Ability to dynamically load the Intel Math Kernel Library (MKL), AMD Core Math Library (ACML), and Solaris Sun Performance Library for enhanced linear algebra performance. Improves R scalability at the client and at the database server for embedded R execution. Enterprise support for customers of the Oracle Advanced Analytics option, Big Data Appliance, and Oracle Linux. Free download. Oracle makes bug fixes and enhancements available for open source R.
20 Oracle R Distribution (ORD) performance with MKL. Test system: 3-node cluster, 24 cores, 3.07 GHz per CPU, 47 GB RAM, Linux.
21 ROracle
22 ROracle: R package enabling connectivity to Oracle Database. Open source, publicly available on CRAN, free to the R community. Execute SQL statements from the R interface. Oracle Database Interface (DBI) for R, based on OCI for high performance. Supports Oracle R Enterprise database connectivity.
23 Comparison: loading a database table into an R data.frame. ROracle is up to 79X faster than RJDBC and up to 2.5X faster than RODBC. Scales across NUMBER, VARCHAR2, and TIMESTAMP data types.
24 Comparison: writing a database table from an R data.frame. For 10 columns x 10K rows, ROracle is 61X faster than RODBC and 630X faster than RJDBC. Scales across the remaining data types.
25 Example: connecting to Oracle Database
    library(ROracle)
    drv <- dbDriver("Oracle")
    # Create the connection string
    host <- "myhost"
    # port <- ...   (port number lost in the transcript; set your listener port here)
    sid <- "mysid"
    connect.string <- paste(
      "(DESCRIPTION=",
      "(ADDRESS=(PROTOCOL=tcp)(HOST=", host, ")(PORT=", port, "))",
      "(CONNECT_DATA=(SID=", sid, ")))", sep = "")
    con <- dbConnect(drv, username = "scott", password = "tiger",
                     dbname = connect.string)
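The connect descriptor on this slide is plain string assembly, so it can be checked without a database. The sketch below wraps it in a helper; `make_connect_string` is a hypothetical name, and the port value 1521 is an assumption used only for illustration (it is the conventional Oracle listener default, not a value taken from the slide).

```r
# Build an Oracle Net connect descriptor from its parts.
# make_connect_string is a hypothetical helper; host, port, and sid
# below are placeholder values, not values recovered from the slide.
make_connect_string <- function(host, port, sid) {
  paste0(
    "(DESCRIPTION=",
    "(ADDRESS=(PROTOCOL=tcp)(HOST=", host, ")(PORT=", port, "))",
    "(CONNECT_DATA=(SID=", sid, ")))"
  )
}

connect.string <- make_connect_string("myhost", 1521, "mysid")
print(connect.string)
```

The resulting string is what would be passed as `dbname` to `dbConnect`.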
26 Example: exerting transactional control
    dbReadTable(con, "EMP")
    rs <- dbSendQuery(con, "delete from emp where deptno = 10")
    dbReadTable(con, "EMP")
    if (dbGetInfo(rs, what = "rowsAffected") > 1) {
      warning("dubious deletion -- rolling back transaction")
      dbRollback(con)
    }
    dbReadTable(con, "EMP")
Example from the ROracle package documentation.
27 Oracle R Enterprise
28 Traditional R and database interaction. Data reaches R either by reading flat files (extract / export from the database, then load) or through SQL via RODBC / RJDBC / ROracle; R scripts run as cron jobs. Issues: R memory limitation (data size, call-by-value); R is single threaded; paradigm shift R -> SQL -> R; access latency, backup, recovery, security; ad hoc script execution or porting code to the target environment.
29 Oracle R Enterprise: a comprehensive, database-centric environment for end-to-end analytical processes in R, with immediate deployment to production environments. Operationalize entire R scripts in production applications and eliminate porting R code. Seamlessly leverage Oracle Database as an HPC environment for R scripts, providing data parallelism and resource management. Execute R scripts through the Oracle Database server machine for scalability and performance. Enable integration and management through SQL. Avoid reinventing code to integrate R results into existing applications. Score R models in Oracle Database. Transparently analyze and manipulate data in Oracle Database through R using versatile and customizable R functions. Eliminate the memory constraint of the client R engine. Get maximum value from your Oracle Database and Exadata. Integrate R into the IT software stack, e.g. OBIEE. [Diagram: client R engine with Transparency Layer and ORE packages; database server machine running Oracle Database with user tables, in-db stats, and SQL interfaces such as SQL*Plus and SQL Developer.]
30 OBIEE Dashboard Integration: Parameterized analytics and graph customization. Improve time to insight. Accommodate diverse consumption paths. Deliver analytics that scale with data volumes, variables, and techniques. Integrate readily with IT infrastructure and software stack. Leverage CRAN packages at the database server.
31 Performance: In-Database Advantage
32 Performance: In-Database Advantage
33 Scalability: In-Database Advantage. [Charts: time (seconds) versus number of rows for GLM model build (DOP=64), decision tree model build (DOP=64), and SVM model build (DOP=64), on an Exadata X2-2 half-rack.] Scoring callouts: 11 seconds to score 100M records (DOP=64); 92 seconds to score 1B records (DOP=128); 13 seconds to score 1B records (DOP=128); 2466 seconds to score 1B records (DOP=64).
34 Oracle Database 12.1 parallel distributed advanced analytics: real-world proof points. Linear regression (ore.lm) on Exadata X3-2 half-rack: 2.9 billion rows spanning 12 months of data with over 350 predictors; elapsed time ~5 minutes. Logistic regression (ore.glm) on Exadata X3-2 half-rack: 2.9 billion rows spanning 12 months of data with over 350 predictors; elapsed time ~30 minutes. Neural networks (ore.neural) on T5-4 Solaris: 1 billion rows with 40 columns; elapsed time ~6 minutes with 10 hidden neurons and 421 weights.
35 Components of the Oracle Advanced Analytics option (Oracle R Enterprise + Oracle Data Mining): the fastest way to deliver scalable enterprise-wide predictive analytics. Powerful: accelerate the rate at which business problems are tackled; improve time to insight; combination of in-database predictive algorithms and open source R algorithms; accessible via SQL, PL/SQL, R, and database APIs; scalable, parallel in-database execution of the R language. Easy to use: expand the user population that can build models; range of GUI and IDE options for business users to data scientists. Enterprise-wide: integrated feature of Oracle Database via SQL (R is integrated into SQL); seamless support for enterprise analytical applications and BI environments.
36 What might you want to do on a large data set? Here "large" means it does not fit in memory: a data set of 100M rows and 500 columns of numeric data, ~400 GB in the database, ~600 GB as a CSV file. Tasks: select data; summarize; aggregate; visualize data (boxplot); sample data; build models, persist models, score data; deploy R scripts in production, executed from SQL.
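A back-of-the-envelope check of the ~400 GB figure, assuming each numeric value takes 8 bytes (the size of an R double) and decimal gigabytes. The slide does not say how its number was derived, so this is only a plausibility sketch.

```r
rows <- 100e6            # 100M rows, from the slide
cols <- 500              # 500 numeric columns, from the slide
bytes_per_value <- 8     # assumption: one 8-byte double per cell

total_gb <- rows * cols * bytes_per_value / 1e9
print(total_gb)          # 400
```

Under the same assumption, the full data set would need about 400 GB of RAM as a single in-memory R object, which is why the later slides keep the data in the database.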
37 ORE Transparency Layer
38 Transparency: No need to learn a different programming paradigm or environment. Operate on database data as though they were R objects, using R syntax. Require minimal change to base R scripts for database data. Implicitly translates R to SQL for in-database execution, performance, and scalability. The Transparency Layer supports in-database data exploration, data preparation, and data analysis en route to performing predictive analytics with a mix of in-database and CRAN techniques.
39 Establish a connection to Oracle Database
    library(ORE)
    if (!ore.is.connected())
      ore.connect(user = "rquser", sid = "orcl", host = "localhost",
                  password = "rquser", all = TRUE)
    ore.ls()
40 ore.ls() returns the names of ore.frame objects. ore.frame objects are proxies for database tables. Invoke R functions on ore.frame objects as if they were data.frames. Data and results stay in Oracle Database until needed. Example: the ONTIME_S table as an ore.frame, holding domestic airline flight data over 22 years (a sample of ~220K rows x 26 columns).
41 Data Selection
Column selection:
    df <- ONTIME_S[, c("YEAR", "DEST", "ARRDELAY")]
    class(df)
    head(df)
    head(ONTIME_S[, c(1, 4, 23)])
    head(ONTIME_S[, -(1:22)])
Row selection:
    df1 <- df[df$DEST == "SFO", ]
    class(df1)
    df2 <- df[df$DEST == "SFO", c(1, 3)]
    df3 <- df[df$DEST == "SFO" | df$DEST == "BOS", 1:3]
    head(df1)
    head(df2)
    head(df3)
42 Summarize Data
    res <- summary(ONTIME_S[, 1:13])
    class(res)  # table
    res
43 Aggregate Data
R (client R engine):
    aggdata <- aggregate(ONTIME_S$DEST, by = list(ONTIME_S$DEST), FUN = length)
    class(aggdata)
    head(aggdata)
SQL generated by the Oracle R Enterprise Transparency Layer and run in Oracle Database against ONTIME_S (in-db stats):
    select DEST, count(*) from ONTIME_S group by DEST
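To see why `aggregate(..., FUN = length)` corresponds to `count(*) ... group by`, the same call can be run on an ordinary data.frame. This is a local sketch in base R only; the `flights` data below is made up and merely stands in for ONTIME_S.

```r
# Toy stand-in for ONTIME_S (hypothetical values, not from the slide).
flights <- data.frame(
  DEST = c("SFO", "BOS", "SFO", "JFK", "BOS", "SFO"),
  ARRDELAY = c(5, -3, 12, 0, 7, 20)
)

# One row per destination with the number of flights, like
# "select DEST, count(*) from ... group by DEST".
aggdata <- aggregate(flights$DEST, by = list(DEST = flights$DEST), FUN = length)
names(aggdata)[2] <- "COUNT"
print(aggdata)
```

With an ore.frame in place of `flights`, the slide indicates the counting happens in the database and only the small result is returned.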
44 Visualize Data: overloaded graphics functions for in-database statistics
    dat <- LTV
    value <- CUST_LIFETIME_VALUE$LTV
    part <- dat$region
    bd <- split(value, part)
    boxplot(bd, notch = TRUE, col = "red", cex = 0.5,
            outline = FALSE, axes = FALSE,
            main = "Customer LTV by Region Distribution",
            ylab = "Lifetime Value ($)", xlab = "Region")
    axis(1, at = 1:length(levels(part)), labels = levels(part))
    axis(2)
45 High performance in-database sampling techniques
Sampling at the client (pull the data from Oracle Database, then sample):
    dat <- ore.pull( )
    samp <- dat[sample(nrow(x), size), ]
Sampling in the database (sample, then pull only the sample):
    samp <- x[sample(nrow(x), size), , ]
    samp <- ore.pull( )
Techniques: simple random sampling; split data sampling; systematic sampling; stratified sampling; cluster sampling; quota sampling; accidental / convenience sampling (via row order access, via hashing).
46 Simple random sampling: select rows at random
    set.seed(1)
    # samplesize <- ...   (value lost in the transcript)
    row.names(ONTIME_S) <- ONTIME_S$YEAR
    simplerandomsample <- ONTIME_S[sample(nrow(ONTIME_S), samplesize), , drop = FALSE]
    class(simplerandomsample)
    simplerandomsample[1:5, c(1, 19:21)]
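Three of the sampling techniques named above can be sketched on a local data.frame with base R indexing. The data set `dat`, the sample size, and the 10% stratum fraction are illustrative assumptions; the slides suggest the same row-indexing idiom applies to an ore.frame, which this sketch does not exercise.

```r
set.seed(1)
# Hypothetical data: 100 rows in two unequal groups.
dat <- data.frame(id = 1:100, grp = rep(c("a", "b"), times = c(80, 20)))
size <- 10

# Simple random sampling: every row equally likely, without replacement.
srs <- dat[sample(nrow(dat), size), , drop = FALSE]

# Systematic sampling: every k-th row from a random starting row.
k <- nrow(dat) / size
start <- sample(k, 1)
sys_samp <- dat[seq(start, nrow(dat), by = k), , drop = FALSE]

# Stratified sampling: the same fraction (one tenth) from each group.
strat <- do.call(rbind, lapply(split(dat, dat$grp), function(g) {
  g[sample(nrow(g), nrow(g) %/% 10), , drop = FALSE]
}))

table(strat$grp)
```

Stratifying keeps the small group `b` represented in proportion, which a simple random sample of the same size does not guarantee.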
47 R Object Persistence: Provide database storage to save/restore R and ORE objects across R sessions. Facilitate production deployment. Simplify application architecture. Use cases: enable access to predictive models for scoring; pass complex arguments to R functions; preserve ORE objects across R sessions. Functions: ore.save(), ore.load().
Save objects x1 and x2 to the R datastore ds1:
    x1 <- ore.lm(...)
    x2 <- ore.frame(...)
    ore.save(x1, x2, name = "ds1")
Restore them in a later session:
    ore.load(name = "ds1")
    ls()   # x1 x2
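For readers without a database at hand, the save/restore round trip can be illustrated with base R's `save()` and `load()`, which write to a file rather than to a database datastore. This is only an analogy to `ore.save()` / `ore.load()`, not their implementation; the temporary file stands in for the datastore "ds1", and the `cars` model is an arbitrary example.

```r
x1 <- lm(dist ~ speed, data = cars)    # a fitted model (built-in cars data)
x2 <- head(cars)                       # a small data frame

store <- tempfile(fileext = ".RData")  # local stand-in for datastore "ds1"
save(x1, x2, file = store)

rm(x1, x2)                             # simulate starting a new R session
restored <- load(store)                # load() returns the restored names
print(restored)
```

After `load()`, `x1` is again a usable model object, which is the property that makes stored models available for later scoring.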
48 ORE Analytics Packages and Functions
49 Philosophy for in-database analytics: whenever possible, leave the data in the database; leverage in-database statistics and algorithms; do computations where the data reside.
50 R Interface to In-Database Statistical Functions
Special functions: Gamma function; natural logarithm of the Gamma function; digamma function; trigamma function; error function; complementary error function.
Tests: Chi-square, McNemar, Bowker; simple and weighted kappas; Cochran-Mantel-Haenszel correlation; Cramer's V; binomial, KS, t, F, Wilcox.
Base SAS equivalents: Freq, Summary, Sort, Rank, Corr, Univariate.
Density, probability, and quantile functions for these distributions: Beta, Binomial, Cauchy, Chi-square, Exponential, F, Gamma, Geometric, Log Normal, Logistic, Negative Binomial, Normal, Poisson, Sign Rank, Student's t, Uniform, Weibull.
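The density / probability / quantile triple and the special functions listed here have familiar base R counterparts. The sketch below runs them locally on plain numbers; whether and how ORE exposes the same names for in-database columns is not shown on the slide and is not assumed here.

```r
d  <- dnorm(0)         # density of the standard normal at 0
p  <- pnorm(1.96)      # probability P(X <= 1.96)
q  <- qnorm(p)         # quantile function inverts the probability
g  <- gamma(5)         # Gamma function: (5 - 1)! = 24
lg <- lgamma(5)        # natural logarithm of the Gamma function
dg <- digamma(1)       # digamma at 1 equals minus the Euler-Mascheroni constant

print(c(density = d, probability = p, quantile = q))
```

The quantile and probability functions are inverses of each other, which is a quick sanity check for any implementation of these families.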
51 R interface to in-database predictive algorithms: SVM; GLM; k-means clustering; Naïve Bayes; decision trees; attribute importance; neural networks; linear regression; stepwise regression. R interface to Oracle Data Mining algorithms.
52 Build models in R, predict in-database. The OREpredict package provides a commercial-grade scoring engine: high performance, scalability, simplified application workflow. Use R-generated models to score in-database on an ore.frame. Maximizes use of Oracle Database as a compute engine. Function ore.predict: an S4 generic function with a specific method for each model class supported by ORE.
53 ore.predict supported algorithms
    Class     Package  Description
    glm       stats    Generalized Linear Model
    negbin    MASS     Negative binomial Generalized Linear Model
    hclust    stats    Hierarchical Clustering
    kmeans    stats    K-Means Clustering
    lm        stats    Linear Model
    multinom  nnet     Multinomial Log-Linear Model
    nnet      nnet     Neural Network
    rpart     rpart    Recursive Partitioning and Regression Tree
54 Example using lm
    # Build a typical R lm model
    irismodel <- lm(Sepal.Length ~ ., data = iris)
    # Push the data to Oracle Database as an ore.frame, IRIS
    IRIS <- ore.push(iris)
    # Score the data in Oracle Database using the ore.frame
    IRISpred <- ore.predict(irismodel, IRIS, se.fit = TRUE,
                            interval = "prediction")
    IRIS <- cbind(IRIS, IRISpred)
    head(IRIS)
The model is implicitly translated to SQL for in-database scoring.
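The same model can be scored locally with base R's `predict()`, which shows what the `se.fit` and `interval = "prediction"` arguments produce. This sketch runs entirely in the client R session and does not use ORE; it is meant only as a reference point for the in-database call.

```r
irismodel <- lm(Sepal.Length ~ ., data = iris)

# With se.fit = TRUE, predict() returns a list; $fit is a matrix with
# the point prediction and the lower/upper prediction bounds.
pred <- predict(irismodel, newdata = iris,
                se.fit = TRUE, interval = "prediction")

scored <- cbind(iris, pred$fit)
head(scored)
```

The `fit`, `lwr`, and `upr` columns appended to the data play the role of the prediction columns bound onto IRIS in the slide.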
55 ORE Embedded R Script Execution
56 Execute R script at database server machine
    res <- ore.doEval(
      function(num = 10, scale = 100) {
        ID <- seq(num)
        data.frame(ID = ID, RES = ID / scale)
      })
    class(res)        # ore.object
    res
    local_res <- ore.pull(res)
    class(local_res)  # data.frame
    local_res
[Diagram: R user on desktop with client R engine and ORE; Oracle Database with user tables and the rq*Apply interface, which starts a DB R engine with ORE via extproc.]
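The function handed to `ore.doEval` is ordinary R, so it can be developed and tested locally before it is sent to the database server. A minimal local run, assuming nothing beyond base R (`f` is a hypothetical name for the slide's anonymous function):

```r
# Same function body as on the slide, bound to a local name.
f <- function(num = 10, scale = 100) {
  ID <- seq(num)
  data.frame(ID = ID, RES = ID / scale)
}

local_res <- f()                                   # defaults: 10 rows
custom_res <- do.call(f, list(num = 20, scale = 1000))  # arguments by name

print(head(local_res, 3))
```

Passing arguments by name, as in the `do.call` line, mirrors how `num` and `scale` are supplied to the stored script later in the deck.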
57 Results
58 Embedded R Execution. Production deployment of R scripts: store and manage R scripts in Oracle Database; schedule R scripts for automatic execution; execute R scripts via SQL, returning structured data, XML, or images. Scalability: data parallelism; task parallelism. Performance: high-performance read/write between the server R engine and Oracle Database; leverage the powerful database server machine (memory, CPU).
59 Execute R script data parallel at database server machine
    modlist <- ore.groupApply(
      X = LTV,
      INDEX = LTV$REGION,
      function(dat) {
        lm(LTV ~ SALARY + GENDER + AGE + HAS_CHILDREN +
             MARITAL_STATUS + HOUSE_OWNERSHIP, dat)
      })
    mod.ne <- ore.pull(modlist$northeast)
    summary(mod.ne)
Family of embedded R functions: ore.doEval, ore.tableApply, ore.groupApply, ore.rowApply, ore.indexApply.
[Diagram: client R engine with Transparency Layer and ORE; Oracle Database with user tables and the rq*Apply interface, which starts multiple DB R engines with ORE via extproc.]
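The group-apply pattern (partition the rows by a key, fit one model per partition) can be sketched locally with `split()` and `lapply()`. The data below is synthetic and the model uses only two of the slide's predictors; unlike `ore.groupApply`, this runs sequentially in one R session rather than in parallel database-side R engines.

```r
set.seed(42)
# Synthetic stand-in for the LTV table (hypothetical values).
ltv <- data.frame(
  REGION = rep(c("NorthEast", "West"), each = 50),
  AGE    = sample(20:70, 100, replace = TRUE),
  SALARY = round(runif(100, 30000, 120000))
)
ltv$LTV <- 0.2 * ltv$SALARY + 100 * ltv$AGE + rnorm(100, sd = 500)

# One linear model per region: split() partitions, lapply() applies.
modlist <- lapply(split(ltv, ltv$REGION), function(dat) {
  lm(LTV ~ SALARY + AGE, data = dat)
})

mod.ne <- modlist$NorthEast
summary(mod.ne)
```

Because each partition is processed independently, the per-group function is what makes the data-parallel execution described on the slide possible.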
60 Results
61 Production deployment: store R script in the database repository
    ore.scriptDrop("simplescript1")
    ore.scriptCreate("simplescript1",
      function(num = 10, scale = 100) {
        ID <- seq(num)
        data.frame(ID = ID, RES = ID / scale)
      })
    res <- ore.doEval(FUN.NAME = "simplescript1", num = 20, scale = 1000)
62 Production deployment using the SQL API
    begin
      sys.rqScriptCreate('Example1',
        'function() {
           ID <- 1:10
           data.frame(ID = ID, RES = ID / 100)
         }');
    end;
    /
    select *
      from table(rqEval(NULL,
                        'select 1 id, 1 res from dual',
                        'Example1'));
63 "Hello World!" XML example
    -- set long ...   (value lost in the transcript)
    set pages 1000
    begin
      sys.rqScriptCreate('Example5',
        'function() {
           res <- "Hello World!"
           res
         }');
    end;
    /
    select name, value
      from table(rqEval(NULL, 'XML', 'Example5'));
64 Production deployment: same R function, multiple uses
    begin
      sys.rqScriptDrop('RandomRedDots');
      sys.rqScriptCreate('RandomRedDots',
        'function() {
           id <- 1:10
           plot(1:100, rnorm(100), pch = 21, bg = "red", cex = 2)
           data.frame(id = id, val = id / 100)
         }');
    end;
    /
    select value from table(rqEval(NULL, 'XML', 'RandomRedDots'));
    select id, image from table(rqEval(NULL, 'PNG', 'RandomRedDots'));
    select * from table(rqEval(NULL, 'select 1 id, 1 val from dual', 'RandomRedDots'));
65 Results: XML result; PNG result; result for 'select 1 id, 1 val from dual'.
66 Oracle R Advanced Analytics for Hadoop
67 Goals: Expand the user population that can build models on Hadoop. Accelerate the rate at which business problems are tackled. Deliver analytics that scale with data volumes, variables, and techniques.
68 Oracle R Advanced Analytics for Hadoop: Provide transparent access to the Hadoop cluster. Manipulate data in HDFS, the database, and the file system, all from R. Write and execute MapReduce jobs with R. Leverage CRAN R packages to work on HDFS-resident data. Move from lab to production without requiring knowledge of Hadoop internals, the Hadoop CLI, or IT infrastructure. [Diagram: R client running an R script with ORD and CRAN packages; Big Data Appliance Hadoop cluster with MapReduce nodes (ORD, CRAN packages) running mappers and reducers as a Hadoop job, and HDFS nodes; interfaces labeled R HDFS, R MapReduce, and R sqoop; Oracle Database.]
69 Text analysis example: count the number of times each word occurs in a corpus of documents. Documents are divided into blocks in HDFS. Map: one mapper per block of data; it outputs each word and its count, 1 each time a word is encountered, e.g. (The, 1), (Big, 1), (Data, 1), (word, 1), (count, 1), (example, 1). Shuffle and sort: one reducer receives only the key-value pairs for the word "Big", i.e. (Big, 1) ... (Big, 1). Reduce: one or more reducers combine results; the reducer sums up the counts and then outputs the final key-value result for "Big".
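The map, shuffle-and-sort, and reduce phases described above can be simulated in a single R session with base functions. This is a conceptual sketch only: there is no Hadoop or HDFS involved, and the two "document blocks" are invented for illustration.

```r
# Two hypothetical document blocks.
docs <- c("The Big Data word count example", "Big Data Big ideas")

# Map: emit a (word, 1) pair for every word in every block.
map_out <- do.call(rbind, lapply(docs, function(block) {
  words <- strsplit(block, " ", fixed = TRUE)[[1]]
  words <- words[words != ""]
  data.frame(key = words, value = 1, stringsAsFactors = FALSE)
}))

# Shuffle and sort: bring all values for the same key together.
shuffled <- split(map_out$value, map_out$key)

# Reduce: sum the counts for each key.
counts <- vapply(shuffled, sum, numeric(1))
print(counts[["Big"]])  # 3
```

Each list element of `shuffled` corresponds to what a single reducer would receive for one key.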
70 Mapper and reducer code in ORCH for word count
    corpus <- scan("corpus.dat", what = " ", quiet = TRUE, sep = "\n")
    # corpus <- ...(" ", corpus)   (start of this call is lost in the transcript)
    # Load the R data into HDFS
    input <- hdfs.put(corpus)
    # Specify and invoke the map-reduce job
    res <- hadoop.exec(dfs.id = input,
      mapper = function(k, v) {
        # Split words and output each word
        x <- strsplit(v[[1]], " ")[[1]]
        x <- x[x != ""]
        out <- NULL
        for (i in seq_along(x))
          out <- c(out, orch.keyval(x[i], 1))
        out
      },
      reducer = function(k, vv) {
        # Sum the count of each word
        orch.keyval(k, sum(unlist(vv)))
      },
      config = new("mapred.config",
        job.name = "wordcount",
        map.output = data.frame(key = "", val = 0),
        reduce.output = data.frame(key = "", val = 0))
    )
    res
    hdfs.get(res)
71 Prepackaged Analytics Functions
orch.lm: Fits a linear model using tall-and-skinny QR (TSQR) factorization and parallel distribution. The function computes the same statistical parameters as the Oracle R Enterprise ore.lm function. Application: regression analysis, predicting numeric values, e.g., customer lifetime value.
orch.lmf: Fits a low rank matrix factorization model using either the jellyfish algorithm or the Mahout alternating least squares with weighted regularization (ALS-WR) algorithm. Application: feature extraction, e.g., recommender systems.
orch.neural: Provides a neural network to model complex, nonlinear relationships between inputs and outputs, or to find patterns in the data. Application: regression, e.g., predicting inventory demand.
orch.nmf: Provides the main entry point to create a nonnegative matrix factorization model using the jellyfish algorithm. This function can work on much larger data sets than the R NMF package, because the input does not need to fit into memory. Application: feature extraction, e.g., recommender systems, dimension reduction.
72 ORCHhive
73 What is Hive? SQL-like abstraction on Hadoop. Becoming the de facto standard for SQL-based apps on Hadoop. Converts SQL queries to map-reduce jobs to be run on Hadoop. Provides a simple query language (HQL) based on SQL. Enables non-Java users to leverage Hadoop via SQL-like interfaces.
74 Motivation for ORCHhive: Big data scalability and performance for R users on Hadoop. Enable R users to clean, explore, and prepare Hive data transparently. Ready data for analytic techniques using the ORCH map-reduce framework. ORE provides transparent access to database tables and views from R based on SQL mapping; since Hive is SQL-based, it is natural to provide ORE-type transparency.
75 Example using ORCHhive
    # Connect to Hive
    ore.connect(type = "HIVE")
    # Attach the current environment into the search path of R
    ore.attach()
    # Create a Hive table by pushing the numeric columns of the campaign data set
    CMPGN_TABLE <- ore.push(campaigns[1:4])
    # Create bins based on Years as Customer
    CMPGN_TABLE$YRS_AS_CUST = ifelse(CMPGN_TABLE$YRS_AS_CUST < 2.0, "SHORT",
                              ifelse(CMPGN_TABLE$YRS_AS_CUST < 4.0, "MEDIUM",
                              ifelse(CMPGN_TABLE$YRS_AS_CUST < 6.0, "MEDIUM LONG", "LONG")))
    # YRS_AS_CUST is now a derived column of the HIVE object
    names(CMPGN_TABLE)
    # Based on the bins, generate summary statistics for each group
    aggregate(CMPGN_TABLE$YRS_AS_CUST,
              by = list(yrs_bins = CMPGN_TABLE$YRS_AS_CUST),
              FUN = summary)
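The binning and per-bin aggregation can be tried on a local data.frame before pointing the same expressions at a Hive-backed object. The `campaigns` data and the `spend` column below are invented for illustration, and the bins are written to a separate `yrs_bin` column (an assumption of this sketch) so the numeric source column stays available for summarizing.

```r
# Hypothetical campaign data: years as customer and a numeric measure.
campaigns <- data.frame(
  yrs_as_cust = c(0.5, 1.5, 2.5, 3.9, 4.0, 5.5, 6.0, 9.0),
  spend       = c(10, 20, 30, 40, 50, 60, 70, 80)
)

# Nested ifelse() assigns each row to the first bin whose bound it is under.
campaigns$yrs_bin <- ifelse(campaigns$yrs_as_cust < 2.0, "SHORT",
                     ifelse(campaigns$yrs_as_cust < 4.0, "MEDIUM",
                     ifelse(campaigns$yrs_as_cust < 6.0, "MEDIUM LONG", "LONG")))

# Summary statistic per bin.
agg <- aggregate(campaigns$spend,
                 by = list(yrs_bins = campaigns$yrs_bin),
                 FUN = mean)
print(agg)
```

Note that the bounds are exclusive: a customer at exactly 4.0 years falls into "MEDIUM LONG", not "MEDIUM".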
76 Getting Started
77 Getting Started: Tutorial resources are available on the Oracle R Enterprise home page. Oracle software can be freely downloaded and installed for evaluation. See our blog post at blogs.oracle.com/r for an ORE Test Drive. Let us know about your experience!
78 Resources. Book: Using R to Unlock the Value of Big Data, by Mark Hornick and Tom Plunkett. Blog, forum, and product pages: Oracle R Distribution, ROracle, Oracle R Enterprise, Oracle R Advanced Analytics for Hadoop.
More informationApache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context
1 Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context Generality: diverse workloads, operators, job sizes
More informationStages of Data Processing
Data processing can be understood as the conversion of raw data into a meaningful and desired form. Basically, producing information that can be understood by the end user. So then, the question arises,
More informationWorking with Data. L5-1 R and Databases
Working with Data L5-1 R and Databases R R Open source statistical computing and graphics language Started in 1993 as an alternative to SAS, SPSS and other proprietary statistical packages Originally called
More informationCERTIFICATE IN SOFTWARE DEVELOPMENT LIFE CYCLE IN BIG DATA AND BUSINESS INTELLIGENCE (SDLC-BD & BI)
CERTIFICATE IN SOFTWARE DEVELOPMENT LIFE CYCLE IN BIG DATA AND BUSINESS INTELLIGENCE (SDLC-BD & BI) The Certificate in Software Development Life Cycle in BIGDATA, Business Intelligence and Tableau program
More informationBoost your Analytics with ML for SQL Nerds
Boost your Analytics with ML for SQL Nerds SQL Saturday Spokane Mar 10, 2018 Julie Koesmarno @MsSQLGirl mssqlgirl.com jukoesma@microsoft.com Principal Program Manager in Business Analytics for SQL Products
More informationNetezza The Analytics Appliance
Software 2011 Netezza The Analytics Appliance Michael Eden Information Management Brand Executive Central & Eastern Europe Vilnius 18 October 2011 Information Management 2011IBM Corporation Thought for
More informationScalable Machine Learning in R. with H2O
Scalable Machine Learning in R with H2O Erin LeDell @ledell DSC July 2016 Introduction Statistician & Machine Learning Scientist at H2O.ai in Mountain View, California, USA Ph.D. in Biostatistics with
More informationIntroducing Oracle Machine Learning
Introducing Oracle Machine Learning A Collaborative Zeppelin notebook for Oracle s machine learning capabilities Charlie Berger Marcos Arancibia Mark Hornick Advanced Analytics and Machine Learning Copyright
More informationSpotfire: Brisbane Breakfast & Learn. Thursday, 9 November 2017
Spotfire: Brisbane Breakfast & Learn Thursday, 9 November 2017 CONFIDENTIALITY The following information is confidential information of TIBCO Software Inc. Use, duplication, transmission, or republication
More informationC5##54&6*"6*1%2345*D&'*E2)2*F"4G)&"69
?23(&65*@52%6&6'*A&)(*B*267* C5##54&6*"6*1%2345*D&'*E2)2*F"4G)&"69!"#$%&'%(?2%3"9*