MATH36032 Problem Solving by Computer. Data Science

1 MATH36032 Problem Solving by Computer Data Science

2 [Figure: number of jobs on a job site over time, from Jan 2016 onwards; legend: MATLAB, Data, Data Science]

3 What is Data Science? (from Wiki) an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured, which is a continuation of some of the data analysis fields such as statistics, data mining, and predictive analytics,...

4 What is Data Science? Math & Stat knowledge: calculus, statistics/probability, linear algebra. Hacking skills: number/text manipulation and (vectorized) manipulation, algorithmic thinking, ... Substantive expertise/domain knowledge: knowledge related to specific facts.

7 Data Science/Big Data: why now? We are generating more data than ever before. Technology for data collection and storage is improving.

8 Applications: Social media and search engines. Personalised webpages: how does Amazon know which items to recommend?

9 Applications: Retailers. Personalised promotional offers: who should get what kind of offer?

11 Applications: Credit Card Fraud Detection. Do these transactions look normal? This is my credit card statement. The transactions were made within two hours of my losing the card in Montreal in the summer of 2015.

12 More applications. Insurance/Actuarial Science: how much do you charge your customers? Weather/climate forecasting: long-term prediction. Finance: better prediction of stock prices.

14 Big data leaked and generated. Is the 2.6-terabyte Panama Papers leak big data? How about the 1 billion accounts leaked from Yahoo's database? Data generated daily: more than 10 terabytes for most national meteorological centers; more than 500 terabytes of data processed by Facebook; more than 20 petabytes (1 petabyte = 10^15 bytes) handled by Google.

16 Three V's of Big Data. Volume: large quantity of data, big size of datasets. Variety: many different types and forms of data, e.g. transactional data from ATMs, social media sites, emails, demographic data, tracking data from cell phones, etc. Velocity: data that is coming in at a very fast pace. These require many new software tools and technologies: support for big data sets now extends to all major technologies in MATLAB (mapreduce, datastore and other toolboxes).
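To give a feel for the datastore idea mentioned on this slide, here is a minimal sketch that reads a file in small chunks rather than all at once. It is an illustration under assumptions, not part of the lecture: the file name, the column names n and n2, and the chunk size are made up, and it assumes a MATLAB release that provides table, writetable and datastore.

```matlab
% Hypothetical example: write a small CSV file, then read it back in chunks.
fname = [tempname '.csv'];                 % temporary file name (illustrative)
T = table((1:6)', ((1:6)').^2, 'VariableNames', {'n', 'n2'});
writetable(T, fname);                      % save the table as comma-separated text

ds = datastore(fname);                     % datastore over the text file
ds.ReadSize = 2;                           % read at most 2 rows per call
total = 0;
while hasdata(ds)
    chunk = read(ds);                      % next block of rows, returned as a table
    total = total + sum(chunk.n2);         % accumulate a running sum chunk by chunk
end
disp(total)                                % 1 + 4 + 9 + 16 + 25 + 36 = 91
delete(fname);                             % remove the temporary file
```

The same loop works unchanged when the file is too large to fit in memory, which is the point of processing data one chunk at a time.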

17 Big Data Landscape

18 The first data science application? Kepler's three laws of planetary motion

19 Make a TV show using data?

21 TV made by Amazon and Netflix using big data. Alpha House was not as successful as expected. Netflix also held open competitions for the best algorithms to predict user ratings for films (now discontinued for privacy and other reasons).

23 Google Flu Trends. Google made a big splash in the news in 2008, and five years later

28 The dark side of data science. Data, if used in the right way, can greatly facilitate our lives (like the concept of smart cities), but... Privacy: how are the data collected from you being used? Biased data: polls before Brexit or the US presidential election in 2016. Biased interpretation: data show that people who shop at Waitrose have a longer life span than those who shop at Aldi and Asda.

29 Data Science Workflow (in business)

30 Data Science using MATLAB. MATLAB is not the best tool for data science. Certain tasks, like text processing, are better done with Python or R. More tools and data types have been introduced in MATLAB over the past few years, mainly to cope with the increased demand from data science.

31 MATLAB Data Types. We have already seen these (type whos in the command window): Numerical types: double (double-precision floating-point number), uint8 (images), int32, int64, ... Symbolic: defined by syms. Logical: true, false. Characters and strings: A = 'string'. Function handles, as in integral(@(x) sin(x),...). New data types introduced recently: structures (struct), cell arrays (cell), time series (timeseries, from 2006b), tables (table, from 2013b), categorical arrays (categorical, from 2013b), date and time (datetime, from 2014b). data-types.html
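As a rough illustration of the types listed on this slide, the sketch below creates one small value of each kind and then calls whos to show their classes. The variable names and sample values are made up for illustration and do not come from the lecture; the symbolic type is left out because syms needs the Symbolic Math Toolbox, and the last few lines assume a release that provides table, categorical and datetime.

```matlab
% Illustrative values only; all names and data here are hypothetical.
x    = 3.14;                        % double (the default numeric type)
img  = uint8([0 128 255]);          % uint8, as used for image pixels
flag = true;                        % logical
name = 'string';                    % character array
f    = @(x) sin(x);                 % function handle
area = integral(f, 0, pi);          % integrates sin over [0, pi], approximately 2

s = struct('id', 1, 'label', 'a');  % structure with two fields
c = {1, 'text', [1 2 3]};           % cell array holding values of mixed types
g = categorical({'low', 'high', 'low'});   % categorical array with two categories
d = datetime(2016, 1, 15);          % date and time value
t = table([1; 2; 3], {'a'; 'b'; 'c'}, ...
    'VariableNames', {'id', 'label'});     % table with two named columns

whos                                % list the variables with their sizes and classes
```

Fields and columns are accessed with dot notation (s.label, t.id) and cells with curly braces (c{2}).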

32 The plan for the rest of the semester. In Week 8 (Friday): review (and introduce) a few new data structures (mainly character strings); read (and write) different data formats (csv, excel, image, ...). Specific topics: random simulation (week 9); regression and classification (week 10); dimension reduction/low rank approximation (week 10); Google PageRank (week 11); other related topics (if time permits, week 11).


More information

2011 TMT Predictions.

2011 TMT Predictions. 2011 TMT Predictions. October 2011 Deloitte s TMT predictions was released in 60 countries and launched in 60+ cities in 2011. Predictions was first published in 2001. 3 Predictions Methodology 7,000 specialists

More information

Question Bank. 4) It is the source of information later delivered to data marts.

Question Bank. 4) It is the source of information later delivered to data marts. Question Bank Year: 2016-2017 Subject Dept: CS Semester: First Subject Name: Data Mining. Q1) What is data warehouse? ANS. A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile

More information

McAfee Total Protection for Data Loss Prevention

McAfee Total Protection for Data Loss Prevention McAfee Total Protection for Data Loss Prevention Protect data leaks. Stay ahead of threats. Manage with ease. Key Advantages As regulations and corporate standards place increasing demands on IT to ensure

More information

Global Standalone VPA (Virtual Personal Assistant) Device Market: Size, Trends & Forecasts ( ) May 2018

Global Standalone VPA (Virtual Personal Assistant) Device Market: Size, Trends & Forecasts ( ) May 2018 Global Standalone VPA (Virtual Personal Assistant) Device Market: Size, Trends & Forecasts (2018-2022) May 2018 Global Standalone VPA Device Market: Coverage Executive Summary and Scope Introduction/Market

More information

ECON/FIN 250: Forecasting in Finance and Economics

ECON/FIN 250: Forecasting in Finance and Economics ECON/FIN 250: Forecasting in Finance and Economics Patrick Herb Brandeis University Spring 2016 Patrick Herb (Brandeis University) Course Introduction ECON/FIN 250: Spring 2016 1 / 18 Course Overview 1

More information

pandas: Rich Data Analysis Tools for Quant Finance

pandas: Rich Data Analysis Tools for Quant Finance pandas: Rich Data Analysis Tools for Quant Finance Wes McKinney April 24, 2012, QWAFAFEW Boston about me MIT 07 AQR Capital: 2007-2010 Global Macro and Credit Research WES MCKINNEY pandas: 2008 - Present

More information

TOP 7 UPDATES IN LOCAL SEARCH FOR JANUARY 2015 YAHOO DIRECTORY NOW OFFICALLY CLOSED GOOGLE INTRODUCES NEWADWORDS TOOL AD CUSTOMIZERS

TOP 7 UPDATES IN LOCAL SEARCH FOR JANUARY 2015 YAHOO DIRECTORY NOW OFFICALLY CLOSED GOOGLE INTRODUCES NEWADWORDS TOOL AD CUSTOMIZERS Changes In Google And Bing Local Results Penguin Update Continues To Affect Local Rankings How To Add A sticky Post on Google+ page TOP 7 UPDATES IN LOCAL SEARCH FOR JANUARY 2015 0 Facebook Allows Calls-To-Action

More information

A REVIEW PAPER ON BIG DATA ANALYTICS

A REVIEW PAPER ON BIG DATA ANALYTICS A REVIEW PAPER ON BIG DATA ANALYTICS Kirti Bhatia 1, Lalit 2 1 HOD, Department of Computer Science, SKITM Bahadurgarh Haryana, India bhatia.kirti.it@gmail.com 2 M Tech 4th sem SKITM Bahadurgarh, Haryana,

More information

Overview of Web Mining Techniques and its Application towards Web

Overview of Web Mining Techniques and its Application towards Web Overview of Web Mining Techniques and its Application towards Web *Prof.Pooja Mehta Abstract The World Wide Web (WWW) acts as an interactive and popular way to transfer information. Due to the enormous

More information

The amount of data increases every day Some numbers ( 2012):

The amount of data increases every day Some numbers ( 2012): 1 The amount of data increases every day Some numbers ( 2012): Data processed by Google every day: 100+ PB Data processed by Facebook every day: 10+ PB To analyze them, systems that scale with respect

More information

Data Clustering on the Parallel Hadoop MapReduce Model. Dimitrios Verraros

Data Clustering on the Parallel Hadoop MapReduce Model. Dimitrios Verraros Data Clustering on the Parallel Hadoop MapReduce Model Dimitrios Verraros Overview The purpose of this thesis is to implement and benchmark the performance of a parallel K- means clustering algorithm on

More information

2/26/2017. The amount of data increases every day Some numbers ( 2012):

2/26/2017. The amount of data increases every day Some numbers ( 2012): The amount of data increases every day Some numbers ( 2012): Data processed by Google every day: 100+ PB Data processed by Facebook every day: 10+ PB To analyze them, systems that scale with respect to

More information

Acceptance. Changes to this Policy

Acceptance. Changes to this Policy Privacy Policy Last Updated: January 3, 2019 Thank you for visiting Etalia Foods! We work hard to provide you unforgettable and naturally gluten-free pizzas. We know that by choosing Etalia Foods for your

More information

An Introduction to Data Analysis, Statistics, and Graphing

An Introduction to Data Analysis, Statistics, and Graphing An Introduction to Data Analysis, Statistics, and Graphing What is a Graph? Present processes, relationships, and changes in a visual format that is easily understandable Attempts to engage viewers by

More information

Gotcha! Network Analytics to augment Fraud Detection Big Data in the Food Chain: the un(der)explored goldmine?

Gotcha! Network Analytics to augment Fraud Detection Big Data in the Food Chain: the un(der)explored goldmine? Gotcha! Network Analytics to augment Fraud Detection Big Data in the Food Chain: the un(der)explored goldmine? December 4th, 2018 Author: Véronique Van Vlasselaer SAS Pre-Sales Analytical Consultant Introduction

More information

Big Data con MATLAB. Lucas García The MathWorks, Inc. 1

Big Data con MATLAB. Lucas García The MathWorks, Inc. 1 Big Data con MATLAB Lucas García 2015 The MathWorks, Inc. 1 Agenda Introduction Remote Arrays in MATLAB Tall Arrays for Big Data Scaling up Summary 2 Architecture of an analytics system Data from instruments

More information

Data science How to prepare engineers for this field

Data science How to prepare engineers for this field 16th Workshop Software Engineering Education and Reverse Engineering, Jahorina 2016 Data science How to prepare engineers for this field Ivica Marković Department of Computer Science Faculty of Electronic

More information