High-Performance Statistical Modeling
1 High-Performance Statistical Modeling. Koen Knapen, Academic Day, March 27th, 2014, SAS Tervuren
2 The Routes (Roots) of Confusion
- How do I get HP procedures? Just add HP??
- Single-machine mode
- Distributed mode
- Distributed-Alongside
- Scalability
- REG vs. HPREG
- GENMOD vs. HPGENSELECT
- Symmetric vs. Asymmetric Mode
support.sas.com/statistics/papers
3 Part 1: General Considerations
4 GENERAL CONSIDERATIONS: Execution Modes
- Single-Machine Mode
  - Executes entirely on the server where SAS is installed
  - Also called client mode or SMP (symmetric multiprocessing) mode
- Distributed Mode
  - Major computations are done on an appliance (blade server)
  - Also called MPP (massively parallel processing) mode
5 Single-Machine Mode (SAS Server)
    proc hpgenselect data=a2013;
       class c:;
       model ypoisson = x: c:;
       selection method=stepwise;
    run;
The HPA procedure determines the number of concurrent threads based on the number of CPUs (cores) on the server.
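The default thread count described on this slide can be overridden with the PERFORMANCE statement. The following is a minimal sketch, assuming the a2013 data set from the slide exists; the value 4 is an arbitrary illustration, not a recommendation from the talk.

```sas
/* Sketch: same model as the slide, with an explicit thread count.      */
/* Assumes data set a2013 and its c:/x: variables exist as on the slide. */
proc hpgenselect data=a2013;
   class c:;
   model ypoisson = x: c:;
   selection method=stepwise;
   /* NTHREADS= caps the number of concurrent threads;                  */
   /* DETAILS requests a timing breakdown in the output.                */
   performance nthreads=4 details;
run;
```

Omitting the PERFORMANCE statement returns to the behavior the slide describes, where the procedure picks the thread count from the number of cores.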
6 Appliance: Racks of Blades and Software
- Multi-socket, multi-core platform
- Commodity blade
- Chassis of blades
An appliance (blade server) is a tightly integrated, homogeneous cluster of computers arranged in racks. The individual computers in each rack are called nodes or blades. Database appliances include database software.
7 Database Appliance
- Controller and worker nodes
- A table is stored in parts across multiple worker nodes
- SQL queries operate in parallel on the different parts of the table
8 GENERAL CONSIDERATIONS: Data Access Features
- Client-data (or local-data) method: the data are moved from the SAS server to the distributed computing environment.
- Alongside-the-database method: the data are stored in a distributed DBMS and are read in parallel from the DBMS into a SAS analytic process that runs on the database appliance.
- Alongside-HDFS method: the data are read from HDFS (Hadoop Distributed File System).
- Alongside-LASR method: the data are loaded from a SAS LASR Analytic Server that runs on the appliance.
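As a rough illustration of the client-data method, the sketch below requests distributed execution through the PERFORMANCE statement while the input remains an ordinary SAS data set on the SAS server. The host name, install path, and node count are hypothetical placeholders; real values are site-specific, and the a2013 data set is assumed to exist as on slide 5.

```sas
/* Sketch only: host name, install path, and node count are hypothetical. */
proc hpgenselect data=a2013;           /* local SAS data set (client data)  */
   class c:;
   model ypoisson = x: c:;
   selection method=stepwise;
   /* NODES= requests distributed mode; the data are then moved from    */
   /* the SAS server to the appliance for the computation.              */
   performance nodes=8
               host="hpa-appliance.example.com"
               install="/opt/TKGrid";
run;
```

The procedure statements are the same as in single-machine mode; only the PERFORMANCE statement changes, which is the syntactical consistency across modes that the design principles call for.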
9 Availability
10 AVAILABILITY: High-Performance Analytical Products
High-Performance Analytics Product       Associated MVA Product
SAS High-Performance Statistics          SAS/STAT
SAS High-Performance Econometrics        SAS/ETS
SAS High-Performance Optimization        SAS/OR
SAS High-Performance Data Mining         SAS Enterprise Miner
SAS High-Performance Text Mining         SAS Text Miner
SAS High-Performance Forecasting         SAS High-Performance Forecasting
MVA products include single-machine mode operation of HP procedures as part of the MVA product license.
11 AVAILABILITY: SAS High-Performance Product Offerings, Release 13.1
Available in December with SAS 9.4M
Products: High-Performance Statistics, High-Performance Data Mining, High-Performance Text Mining, High-Performance Optimization, High-Performance Econometrics, High-Performance Forecasting
Procedures: HPLOGISTIC, HPREDUCE, HPTMINE, OPTLSO, HPCOUNTREG, HPFORECAST, HPREG, HPLMIXED, HPNLMOD, HPSPLIT, HPGENSELECT, HPQUANTSELECT, HPFMM, HPNEURAL, HPFOREST, HP4SCORE, HPDECIDE, HPCLUS, HPSVM, HPBNET, HPTMSCORE, select features in OPTMILP, OPTLP, and OPTMODEL, HPSEVERITY, HPQLIM, HPPANEL, HPCOPULA, HPCDM, HPTIMEDATA, HPCANDISC, HPPRINCOMP
Common Set: HPDS2, HPDMDB, HPSAMPLE, HPSUMMARY, HPIMPUTE, HPBIN, HPCORR
12 Part 2: High-Performance Statistical Modeling
13 HIGH-PERFORMANCE STATISTICAL MODELING: General Design Principles for HPA Procedures
1. Support single-machine and distributed modes
2. Use multithreading to exploit all CPUs
3. Support a variety of data sources
4. Require syntactical consistency across modes
5. Require syntactical consistency across HPA procedures
14 HIGH-PERFORMANCE STATISTICAL MODELING: Design Principles for High-Performance Statistical Procedures
1. Focus on prediction and not post-fit inference
2. Standardize and improve syntax where needed
3. Support model selection where appropriate
4. Combine functionality from SAS/STAT procedures when appropriate
5. Provide new functionality within HPA framework when viable
15 HIGH-PERFORMANCE STATISTICAL MODELING: Functionality of HPGENSELECT Procedure
- Fits generalized linear models
  - Distributions: Normal, Poisson, Tweedie, ...
  - Link functions: log, logit, ...
  - Linear predictors: effects involving continuous and classification variables
- Provides model building
  - Forward, backward, stepwise methods
  - Multiple criteria for choosing model: AIC, AICC, SBC
  - Splitting of classification effects
- Writes DATA step code for computing predicted values
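The features listed on this slide can be combined in one call. The sketch below is illustrative only: the simulated data set, its variable names, and the output file name are assumptions made for the example, not part of the talk.

```sas
/* Hypothetical data: simulated Poisson counts, names are illustrative. */
data work.demo;
   call streaminit(2014);
   do i = 1 to 10000;
      x1 = rand('normal');
      x2 = rand('normal');                  /* pure noise regressor     */
      c1 = ceil(3 * rand('uniform'));       /* class levels 1, 2, 3     */
      y  = rand('poisson', exp(0.5 + 0.3*x1 + 0.2*(c1 = 2)));
      output;
   end;
   drop i;
run;

proc hpgenselect data=work.demo;
   class c1;                                  /* classification effect  */
   model y = x1 x2 c1 / dist=poisson link=log; /* distribution and link */
   selection method=stepwise(choose=sbc);     /* choose the model by SBC */
   code file='hpgenselect_score.sas';         /* hypothetical file name */
run;
```

The CODE statement writes DATA step scoring code to the named file, which can then be included in a DATA step to compute predicted values for new observations.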
16 HIGH-PERFORMANCE STATISTICAL MODELING: GENMOD or HPGENSELECT?
- GENMOD
  - Fits models with moderate-to-large data
  - Offers a rich set of methods for statistical inference
    - GEE methods for correlated responses
    - Bayesian inference
    - Exact conditional regression
  - Wide array of postfitting analysis: contrasts, estimates, tests, ...
- HPGENSELECT
  - Fits and builds models with large-to-massive data
  - Designed for large-data tasks such as predictive model building
17 Performance Comparisons
18 Scalable Percentage
[Figure: the single-CPU run time t_1 of a job, divided into a not-scalable part and a scalable part t_s]
Scalable percentage = 100 × t_s / t_1 = 60%
19 Amdahl's Law
- 1 CPU: not scalable 40%, scalable 60% (scalable time t_s, total time t_1)
- 2 CPUs: not scalable 57%, scalable 43% (scalable time ½ t_s, total time t_2)
Speedup = t_1 / t_2
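The two-CPU figures on this slide can be checked with a short calculation. This is a worked sketch using only the slide's 40%/60% split; the symbol p for the scalable fraction is introduced here for convenience and does not appear on the slide.

```latex
% p = scalable fraction of the single-CPU run time (notation assumed here)
\begin{align*}
p   &= \frac{t_s}{t_1} = 0.6 \\
t_2 &= (1 - p)\,t_1 + \tfrac{1}{2}\,p\,t_1 = 0.4\,t_1 + 0.3\,t_1 = 0.7\,t_1 \\
\text{not-scalable share on 2 CPUs} &= \frac{0.4}{0.7} \approx 57\% \\
\text{scalable share on 2 CPUs}     &= \frac{0.3}{0.7} \approx 43\% \\
\text{Speedup} &= \frac{t_1}{t_2} = \frac{1}{0.7} \approx 1.43
\end{align*}
```

Doubling the CPUs therefore gives a speedup of about 1.43 rather than 2, because only the scalable 60% of the job is halved.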
20 HIGH-PERFORMANCE STATISTICAL MODELING: Scalability and Big Data
- Amdahl's law implies a limit to scalability.
- Yet every job has some unavoidable serial component:
  - Reading data with a single I/O controller in single-machine mode
  - Establishing connections to an appliance and database in distributed mode
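The limit mentioned here follows from the standard form of Amdahl's law. In the sketch below, p (the scalable fraction of the single-CPU run time) and n (the number of CPUs) are notation assumed for the illustration, and p = 0.6 is taken from the 60% scalable percentage used in the earlier slides.

```latex
% S(n) = speedup on n CPUs when a fraction p of the work scales perfectly
\begin{align*}
S(n) &= \frac{1}{(1 - p) + \dfrac{p}{n}} \\
\lim_{n \to \infty} S(n) &= \frac{1}{1 - p} \\
p = 0.6 \;\Rightarrow\; \lim_{n \to \infty} S(n) &= \frac{1}{0.4} = 2.5
\end{align*}
```

With 40% of the job serial, no number of CPUs can deliver more than a 2.5-fold speedup, which is why shrinking the serial component matters as much as adding cores.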
21 HIGH-PERFORMANCE STATISTICAL MODELING: Benefits
1. High-performance procedures in SAS/STAT deliver modeling methods and scalability for a wide range of problem sizes.
2. If you have SAS/STAT, you can run these procedures in single-machine mode and exploit all the cores.
3. As your problem size grows, you can take full advantage of all the cores and huge amounts of memory available in distributed computing environments.
22 High-Performance Statistical Modeling. Koen Knapen, Academic Day, March 27th, 2014, SAS Tervuren