Big Data: It's All Around You
- Rachel Gordon
- 5 years ago
Transcription
1
2 Big Data: It's All Around You. Brian Macdonald, Oracle Enterprise Architect
3 Big Data: It's All Around You. Introduction; What is Big Data; What is Data Science; Big Data Technologies; Q&A
4 My Road to Big Data
5 My Road to Big Data: Math and Computer Science; Information Systems; Analyze Data; Implementation; Sales
6 Who are Sales Engineers? Engineers, Mathematicians, Programmers, Business People, Philosophers, Biologists: anyone who can solve problems and explain solutions! Copyright 2012, Oracle and/or its affiliates. All rights reserved.
7 What is Big Data?
8
9 Analyze Social Media Data: 383+ million Twitter accounts (255m+ tweeting); 1,280+ million Facebook subscribers; 200+ million Instagram users; 1 billion YouTube users, 4 billion views/day; Billion Mobile Web users; over 6 million OnStar subscribers
10 How Much Information was Produced in 2011? What do you call 1.8 trillion gigabytes? 1.8 zettabytes* = 1,800 exabytes = 1.8 million petabytes = 1.8 billion terabytes = 1.8 trillion gigabytes. Billion iPhones (32GB model); Trillion songs (6000/iPhone); 1,123,715,750 years (3.5 minutes/song). *1.8 zettabytes of information will be created and replicated in 2011 (IDC)
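The unit ladder on this slide can be checked with a few lines of arithmetic. This is a minimal sketch that assumes decimal (SI) units, where each step is a factor of 1,000; the `UNITS` table and the `convert` helper are illustrative names, not part of the talk.

```python
# Decimal (SI) storage units, expressed in bytes.
UNITS = {
    "gigabytes": 10**9,
    "terabytes": 10**12,
    "petabytes": 10**15,
    "exabytes": 10**18,
    "zettabytes": 10**21,
}


def convert(amount, from_unit, to_unit):
    """Convert an amount between two decimal storage units."""
    return amount * UNITS[from_unit] / UNITS[to_unit]


total_zb = 1.8  # figure quoted on the slide (IDC, 2011)
for unit in ("exabytes", "petabytes", "terabytes", "gigabytes"):
    value = convert(total_zb, "zettabytes", unit)
    print(f"{total_zb} zettabytes = {value:,.0f} {unit}")
```

Each line printed corresponds to one of the equivalent figures listed on the slide.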
11 Twitter Demo
12 Twitter Exercise: Search Twitter and analyze sentiment. Enter search terms and see if you get the results you would expect. ID: bmacdona Pwd: bmacdona
13 How is Big Data Used? Siri; Google Maps; Medicine; Personalized Medicine; Genomics; Law; Business; Retail; Finance; Telecom; Governments; The Common Good; Social Graphs; Atrocity Watch; Environmental
14 Public Data: NYC; Chicago - Maps - Crime; Pittsburgh - Data Sets - Interactive Maps
15 "Some people are more certain of everything than I am of anything." Robert Rubin, In an Uncertain World
16 What Makes Big Data BIG DATA? Volume, Velocity, Variety (figure labels: Social, Blog, Smart Meter)
17 The Internet of Things or the Sensor Revolution. Source: The Economist
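The demo tool used in the exercise is not described in the transcript, so the following is only a rough sketch of one simple way sentiment can be scored: counting hits against positive and negative word lists. The word lists, the `sentiment` function, and the sample tweets are all made up for illustration.

```python
import re

# Tiny illustrative word lists (assumptions, not a real sentiment lexicon).
POSITIVE = {"love", "great", "good", "awesome", "happy"}
NEGATIVE = {"hate", "bad", "terrible", "awful", "sad"}


def sentiment(text):
    """Label text 'positive', 'negative', or 'neutral' by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up sample tweets standing in for search results.
tweets = [
    "I love the new phone, the camera is great",
    "Terrible battery life, I hate it",
    "Picked up my order today",
]
for tweet in tweets:
    print(sentiment(tweet), "|", tweet)
```

A word-counting approach like this misreads sarcasm and negation ("not good"), which is one reason search results may not match what you would expect.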
18 Big Data is Complex: Structured Data, Semi-Structured Data, Unstructured Data
<bib>
  <book year="1995">
    <title> Database Systems </title>
    <author> <lastname> Date </lastname> </author>
    <publisher> Addison-Wesley </publisher>
  </book>
  <book year="1998">
    <title> Foundation for Object/Relational Databases </title>
    <author> <lastname> Date </lastname> </author>
    <author> <lastname> Darwen </lastname> </author>
    <ISBN> <number> </number> </ISBN>
  </book>
</bib>
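To illustrate why semi-structured data takes extra work, here is a small sketch that flattens the slide's XML into table-like rows using Python's standard `xml.etree.ElementTree` module. The `books_to_rows` helper is a hypothetical name, and the empty ISBN element from the slide is left out of the sample for brevity.

```python
import xml.etree.ElementTree as ET

BIB_XML = """
<bib>
  <book year="1995">
    <title>Database Systems</title>
    <author><lastname>Date</lastname></author>
    <publisher>Addison-Wesley</publisher>
  </book>
  <book year="1998">
    <title>Foundation for Object/Relational Databases</title>
    <author><lastname>Date</lastname></author>
    <author><lastname>Darwen</lastname></author>
  </book>
</bib>
"""


def books_to_rows(xml_text):
    """Flatten each <book> element into a dict, tolerating missing fields."""
    root = ET.fromstring(xml_text)
    rows = []
    for book in root.findall("book"):
        publisher = (book.findtext("publisher") or "").strip()
        rows.append({
            "year": int(book.get("year")),
            "title": book.findtext("title", default="").strip(),
            # A book may have one author or several.
            "authors": [a.findtext("lastname", default="").strip()
                        for a in book.findall("author")],
            # Not every book carries a publisher element.
            "publisher": publisher or None,
        })
    return rows


for row in books_to_rows(BIB_XML):
    print(row)
```

The two books do not share the same fields: one has a publisher and a single author, the other has two authors and no publisher. A fixed-column table has to decide how to handle both cases.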
19 Visualization: Look through examples. Important way to communicate information. How to represent millions of data points is an art. Many tools exist to help generate visualizations.
20 What is Data Science?
21 "When experts express uncertainty about their opinions, people find them more compelling." Harvard Business Review
22 Data Scientists: The Most Awesome Job of the Future. Create questions (hypothesis); prepare data; analyze large and small volumes of data; understand what the data means; visualize data; tell stories; create innovative ways to use it.
23 What do you need to know? How to think. These are the things you're taught that you think you will never need. As much as possible: Math, Science, History, Languages, Art. Be curious.
24 You must be able to think and solve problems! And observe. Count the number of F's in the following sentence: FINISHED FILES ARE THE RESULT OF YEARS OF SCIENTIFIC STUDY COMBINED WITH THE EXPERIENCE OF YEARS.
25 Leave your assumptions at the door! FINISHED FILES ARE THE RESULT OF YEARS OF SCIENTIFIC STUDY COMBINED WITH THE EXPERIENCE OF YEARS.
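A program makes no assumptions about which words matter, so it is an easy way to check your own count. This short sketch counts every F in the sentence and then shows which words contain them; the variable names are illustrative.

```python
SENTENCE = ("FINISHED FILES ARE THE RESULT OF YEARS OF SCIENTIFIC STUDY "
            "COMBINED WITH THE EXPERIENCE OF YEARS.")

# Count every F, including the ones in short words.
f_total = SENTENCE.count("F")

# List the words that contain an F, in order of appearance.
words_with_f = [w.strip(".") for w in SENTENCE.split() if "F" in w]

# How many of the F's sit in the word "OF"?
f_in_of = sum(1 for w in SENTENCE.split() if w.strip(".") == "OF")

print("Total F's:", f_total)
print("Words containing F:", words_with_f)
print("F's in the word OF:", f_in_of)
```

Half of the F's are in the word "OF", which readers tend to skip over.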
26 Let's Analyze Some Data. Correlation: the degree to which two or more attributes or measurements on the same group of elements show a tendency to vary together. Positive when values increase together; negative when one value decreases as the other increases.
27 Does Ice Cream Consumption Cause Drowning? Obviously not. Correlation does not imply causation. One may cause the other, but correlation only describes how they vary together. There may be other reasons, e.g. hot temperatures. Be very cautious with causation. There are tests to help assess causation.
28 How do I know if variables are correlated? Understanding R and R^2. R = correlation coefficient. Values between -1 and 1. Positive correlation > 0: as one variable increases, the other increases. Perfect positive correlation = 1. Negative correlation < 0: as one variable increases, the other decreases. Perfect negative correlation = -1. 0 = no linear correlation. Can be shown with a trend line.
29 How do I know if variables are correlated? Understanding R and R^2. R^2 = coefficient of determination. Tells how well one variable predicts the other variable. Values between 0 and 1. If R^2 = 0.850, 85% of the total variation in y can be explained by the linear relationship between x and y. R^2 is more commonly used.
30 Your Turn to Be a Data Scientist. What does data about Pittsburgh tell you? Find variables that have a high coefficient of determination (R^2). Explore data. Create a hypothesis. Who can find the largest value of R^2? Experiment with multiple columns. Create new variables. Answer the following questions: What variables are correlated? Are they positively or negatively correlated? What is R^2? Do you think they are causal? What does the graph tell you? What additional data would you like to have? What would you do based on this information?
31 Data Science can be used just for fun
32 Predictive Analytics: where things get really interesting. We looked at describing data. What if you can predict what will happen? Lots of algorithms exist to do this. You just need the data and an interesting question.
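For readers who want to see R and R^2 computed rather than read off a chart, here is a minimal sketch of the Pearson correlation coefficient using only the standard library. The `pearson_r` function name and the hours/score numbers are made up for illustration; they are not the Pittsburgh data used in the talk.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient R between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up example data: hours studied vs. test score.
hours = [1, 2, 3, 4, 5]
score = [52, 60, 61, 70, 77]

r = pearson_r(hours, score)
r_squared = r ** 2  # for a simple linear fit, R^2 is the square of R

print(f"R   = {r:.3f}")
print(f"R^2 = {r_squared:.3f}")
```

With these numbers R is about 0.98 and R^2 about 0.96, so roughly 96% of the variation in score is explained by the linear relationship with hours. Reversing one of the lists flips the sign of R but leaves R^2 unchanged.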
33 Predictive Analytics: Data Mining Algorithms
Function | Algorithms | Applicability
Classification | Logistic Regression (GLM); Decision Trees; Naïve Bayes; Support Vector Machines (SVM) | Classical statistical technique; Popular / Rules / transparency; Embedded app; Wide / narrow data / text
Regression | Linear Regression (GLM); Support Vector Machine (SVM) | Classical statistical technique; Wide / narrow data / text
Anomaly Detection | One Class SVM | Unknown fraud cases or anomalies
Attribute Importance | Minimum Description Length (MDL); Principal Components Analysis (PCA) | Attribute reduction, Reduce data noise
Association Rules | Apriori | Market basket analysis / Next Best Offer
Clustering | Hierarchical k-means; Hierarchical O-Cluster; Expectation-Maximization Clustering (EM) | Product grouping / Text mining; Gene and protein analysis
Feature Extraction | Nonnegative Matrix Factorization (NMF); Singular Value Decomposition (SVD) | Text analysis / Feature reduction
34 Data Mining Provides Better Information, Valuable Insights and Predictions. Cell Phone Churners vs. Loyal Customers (chart axis: Customer Months). Segment #1: IF CUST_MO > 14 AND INCOME < $90K, THEN Prediction = Cell Phone Churner, Confidence = 100%, Support = 8/39. Segment #3: IF CUST_MO > 7 AND INCOME < $175K, THEN Prediction = Cell Phone Churner, Confidence = 83%, Support = 6/39. Source: Inspired from Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management by Michael J. A. Berry, Gordon S. Linoff
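The confidence and support figures attached to each segment can be computed directly once a rule is written as a condition. The sketch below assumes that support means "customers matching the rule out of all customers" and confidence means "share of matching customers who actually churned", which is consistent with the slide's numbers but is not stated in the talk. The customer records and the `evaluate_rule` helper are made up for illustration.

```python
# Made-up customer records (not the 39 customers behind the slide).
customers = [
    {"cust_mo": 20, "income": 60_000, "churned": True},
    {"cust_mo": 16, "income": 85_000, "churned": True},
    {"cust_mo": 15, "income": 40_000, "churned": True},
    {"cust_mo": 18, "income": 120_000, "churned": False},
    {"cust_mo": 5, "income": 50_000, "churned": False},
    {"cust_mo": 9, "income": 70_000, "churned": False},
    {"cust_mo": 30, "income": 200_000, "churned": False},
    {"cust_mo": 3, "income": 95_000, "churned": True},
]


def evaluate_rule(rows, condition):
    """Return (confidence, matched, total) for 'IF condition THEN churner'."""
    matched = [r for r in rows if condition(r)]
    if not matched:
        return None, 0, len(rows)
    churners = sum(1 for r in matched if r["churned"])
    return churners / len(matched), len(matched), len(rows)


# The two rule conditions from the slide, applied to the toy data.
rules = {
    "CUST_MO > 14 AND INCOME < $90K":
        lambda r: r["cust_mo"] > 14 and r["income"] < 90_000,
    "CUST_MO > 7 AND INCOME < $175K":
        lambda r: r["cust_mo"] > 7 and r["income"] < 175_000,
}

for name, condition in rules.items():
    confidence, matched, total = evaluate_rule(customers, condition)
    print(f"IF {name}: confidence = {confidence:.0%}, support = {matched}/{total}")
```

A narrow rule can reach high confidence while covering few customers; a broader rule covers more customers but is right less often.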
35 Decision Trees: Classification, Prediction, Customer profiling. Problem: find customers likely to buy a new car and their profiles. Tree diagram (node labels only): Status; Age (>55 / <55); Owns foreign car (yes / no); Age (>45 / <=45); Income (<100K / >100K); Gender (F / M); Num children (<=4 / >4); leaves Car = 0 or Car = 1. Example rule: IF (Age > 55) AND Owns foreign car = no AND (Income > 100K) THEN P(Buy Car = 1) = .77, Support = 250
36 Big Data Technologies
37 Many Technologies. Relational databases: Oracle, Microsoft SQL Server, MySQL, DB2, Teradata, Netezza, Postgres. NoSQL databases: Oracle NoSQL Database, Cassandra, MongoDB, Riak. Hadoop. R. Visualization tools: Tableau, Spotfire, Excel. Open source is becoming more prominent.
38 R: Statistical programming language. Open source language and environment. Used for statistical computing and graphics. Strength in easily producing publication-quality graphs. Highly extensible.
39 What to do with Big Data? Like DNA: data won't fit in a spreadsheet; would take a long time to do the math. Human genome: 3.2 billion base pairs (ATCG). Split the problem into smaller pieces.
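The idea of splitting a problem into smaller pieces can be sketched on a toy DNA string: count the bases in each chunk separately, then add the partial counts together. The `chunks` and `count_bases` helpers and the short sequence are illustrative assumptions; a real genome would be read from files and the chunks handed to separate machines.

```python
from collections import Counter


def chunks(sequence, size):
    """Yield consecutive pieces of the sequence, each at most `size` long."""
    for start in range(0, len(sequence), size):
        yield sequence[start:start + size]


def count_bases(sequence, chunk_size=4):
    """Count bases chunk by chunk, then merge the partial counts."""
    total = Counter()
    for piece in chunks(sequence, chunk_size):
        # Each piece could be counted independently, on a different machine.
        total += Counter(piece)
    return total


dna = "ATCGGCTAATCGTTAGC"  # made-up 17-base sequence
counts = count_bases(dna)
print(dict(counts))
```

Because counting is additive, the merged result is the same as counting the whole string at once, no matter how the chunks are cut.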
40 Hadoop: The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. Hadoop is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures. Callouts: Distributed Processing; Large Data Sets; Clusters of Computers; Simple Computing Models; Highly Available Service
41 MapReduce
42 A MapReduce Analogy
43 A MapReduce Analogy: Going From Estimates To Actuals
44 A MapReduce Analogy: Sharing The Load Of Mapping Out The Raw Numbers. Photo Credit: Lindsey Bauman/Hutchinson News
45 A MapReduce Analogy: Reporting Back The Results. Photo Credit: Lindsey Bauman/Hutchinson News. Photo Credit: Renee Saff
46 A MapReduce Example: Putting The Analogy Into Practice
47 A MapReduce Example: The Input. The data arrives into the system.
48 A MapReduce Example: Splitting The Input Into Chunks. The data is moved into the HDFS system and divided into blocks, each of which is copied multiple times for redundancy.
49 A MapReduce Example: Mapping The Chunks. The Mapper picks up a chunk for processing. The MR Framework ensures only one mapper will be assigned to a given chunk.
50 A MapReduce Example: Mapping The Chunks. In this case, the Mapper emits a color with the number of times it was found.
51 A MapReduce Example: A Shuffle Sort. The Shuffler can do a rough sort of like items (optional).
52 A MapReduce Example: Reducing The Emissions. The Reducer combines the Mappers' output.
53 A MapReduce Example: The Output. The job returns a count of colors found in the input.
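The color-counting walkthrough can be imitated in a single Python process. This is a sketch of the MapReduce pattern only, not the Hadoop API: there is no HDFS, replication, or cluster here, and the function names (`split_input`, `mapper`, `shuffle`, `reducer`, `run_job`) and the input list are assumptions chosen to mirror the slide titles.

```python
from collections import Counter, defaultdict


def split_input(records, chunk_size):
    """Split the input into chunks (stands in for HDFS blocks)."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def mapper(chunk):
    """Emit (color, count) pairs for one chunk."""
    return list(Counter(chunk).items())


def shuffle(mapped):
    """Group the emitted counts by color, sorted by color name."""
    grouped = defaultdict(list)
    for pairs in mapped:
        for color, count in pairs:
            grouped[color].append(count)
    return dict(sorted(grouped.items()))


def reducer(color, counts):
    """Combine all the counts emitted for one color."""
    return color, sum(counts)


def run_job(records, chunk_size=3):
    chunks = split_input(records, chunk_size)
    mapped = [mapper(chunk) for chunk in chunks]  # one mapper per chunk
    grouped = shuffle(mapped)
    return dict(reducer(color, counts) for color, counts in grouped.items())


# Made-up input data.
colors = ["red", "blue", "green", "red", "red", "blue", "green", "green", "red"]
print(run_job(colors))
```

Each mapper sees only its own chunk, so the map step could run on many machines at once; only the small (color, count) pairs need to travel to the reducers.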
54 What would you like to predict? Let me know and I can sponsor a project.
55 Questions?
More informationBIG DATA SCIENTIST Certification. Big Data Scientist
BIG DATA SCIENTIST Certification Big Data Scientist Big Data Science Professional (BDSCP) certifications are formal accreditations that prove proficiency in specific areas of Big Data. To obtain a certification,
More informationBlazing BI with Oracle DB Analytical Options: Oracle OLAP, Oracle Data Mining, and Oracle Spatial
Blazing BI with Oracle DB Analytical Options: Oracle OLAP, Oracle Data Mining, and Oracle Spatial Heartland OUG October 20, 2011 Dan Vlamis and Tim Vlamis Vlamis Software Solutions 816-781-2880 http://www.vlamis.com
More informationOracle Big Data Connectors
Oracle Big Data Connectors Oracle Big Data Connectors is a software suite that integrates processing in Apache Hadoop distributions with operations in Oracle Database. It enables the use of Hadoop to process
More informationData Science Course Content
CHAPTER 1: INTRODUCTION TO DATA SCIENCE Data Science Course Content What is the need for Data Scientists Data Science Foundation Business Intelligence Data Analysis Data Mining Machine Learning Difference
More information10/14/2017. Dejan Sarka. Anomaly Detection. Sponsors
Dejan Sarka Anomaly Detection Sponsors About me SQL Server MVP (17 years) and MCT (20 years) 25 years working with SQL Server Authoring 16 th book Authoring many courses, articles Agenda Introduction Simple
More informationHierarchy of knowledge BIG DATA 9/7/2017. Architecture
BIG DATA Architecture Hierarchy of knowledge Data: Element (fact, figure, etc.) which is basic information that can be to be based on decisions, reasoning, research and which is treated by the human or
More informationComparative analysis of data mining methods for predicting credit default probabilities in a retail bank portfolio
Comparative analysis of data mining methods for predicting credit default probabilities in a retail bank portfolio Adela Ioana Tudor, Adela Bâra, Simona Vasilica Oprea Department of Economic Informatics
More informationOracle Big Data. A NA LYT ICS A ND MA NAG E MENT.
Oracle Big Data. A NALYTICS A ND MANAG E MENT. Oracle Big Data: Redundância. Compatível com ecossistema Hadoop, HIVE, HBASE, SPARK. Integração com Cloudera Manager. Possibilidade de Utilização da Linguagem
More informationIntroduction to Data Science
Introduction to Data Science CS 491, DES 430, IE 444, ME 444, MKTG 477 UIC Innovation Center Fall 2017 and Spring 2018 Instructors: Charles Frisbie, Marco Susani, Michael Scott and Ugo Buy Author: Ugo
More informationNew Challenges in Big Data: Technical Perspectives. Hwanjo Yu POSTECH
New Challenges in Big Data: Technical Perspectives Hwanjo Yu POSTECH http:/hwanjoyu.org Over 1 Billion SNS users!! Viral Marketing Word-of-Mouth Effect > TV advertising......... Influence Maximization
More informationData Mining. Jeff M. Phillips. January 7, 2019 CS 5140 / CS 6140
Data Mining CS 5140 / CS 6140 Jeff M. Phillips January 7, 2019 What is Data Mining? What is Data Mining? Finding structure in data? Machine learning on large data? Unsupervised learning? Large scale computational
More informationADVANCED ANALYTICS USING SAS ENTERPRISE MINER RENS FEENSTRA
INSIGHTS@SAS: ADVANCED ANALYTICS USING SAS ENTERPRISE MINER RENS FEENSTRA AGENDA 09.00 09.15 Intro 09.15 10.30 Analytics using SAS Enterprise Guide Ellen Lokollo 10.45 12.00 Advanced Analytics using SAS
More informationDatabases 2 (VU) ( / )
Databases 2 (VU) (706.711 / 707.030) MapReduce (Part 3) Mark Kröll ISDS, TU Graz Nov. 27, 2017 Mark Kröll (ISDS, TU Graz) MapReduce Nov. 27, 2017 1 / 42 Outline 1 Problems Suited for Map-Reduce 2 MapReduce:
More informationModern Database Concepts
Modern Database Concepts Introduction to the world of Big Data Doc. RNDr. Irena Holubova, Ph.D. holubova@ksi.mff.cuni.cz What is Big Data? buzzword? bubble? gold rush? revolution? Big data is like teenage
More informationBig Data and FrameWorks; Perspectives to Applied Machine Learning
Big Data and FrameWorks; Perspectives to Applied Machine Learning Mehdi Habibzadeh PhD in Computer Science Outlines (Oct 2016) : Big Data and Challenges Review and Trends Math and Probability Concepts
More informationMachine Learning with Python
DEVNET-2163 Machine Learning with Python Dmitry Figol, SE WW Enterprise Sales @dmfigol Cisco Spark How Questions? Use Cisco Spark to communicate with the speaker after the session 1. Find this session
More informationKNIME for the life sciences Cambridge Meetup
KNIME for the life sciences Cambridge Meetup Greg Landrum, Ph.D. KNIME.com AG 12 July 2016 What is KNIME? A bit of motivation: tool blending, data blending, documentation, automation, reproducibility More
More informationData Analytics at Logitech Snowflake + Tableau = #Winning
Welcome # T C 1 8 Data Analytics at Logitech Snowflake + Tableau = #Winning Avinash Deshpande I am a futurist, scientist, engineer, designer, data evangelist at heart Find me at Avinash Deshpande Chief
More informationdata-based banking customer analytics
icare: A framework for big data-based banking customer analytics Authors: N.Sun, J.G. Morris, J. Xu, X.Zhu, M. Xie Presented By: Hardik Sahi Overview 1. 2. 3. 4. 5. 6. Why Big Data? Traditional versus
More information