Database Management Systems

Similar documents
The Relational Model. Database Management Systems

Embedded Technosolutions

CMPT 354: Database System I. Lecture 1. Course Introduction

Big Data Specialized Studies

CSE6331: Cloud Computing

Page 1. Goals for Today" Background of Cloud Computing" Sources Driving Big Data" CS162 Operating Systems and Systems Programming Lecture 24

Agenda. AWS Database Services Traditional vs AWS Data services model Amazon RDS Redshift DynamoDB ElastiCache

Where We Are. Review: Parallel DBMS. Parallel DBMS. Introduction to Data Management CSE 344

Webinar Series TMIP VISION

Sql Learn The Structured Query Language For The Most Popular Databases Including Microsoft Sql Server Mysql Mariadb Postgresql And Oracle

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara

What is the maximum file size you have dealt so far? Movies/Files/Streaming video that you have used? What have you observed?

Part I What are Databases?

Distributed Systems CS6421

Big Data It s not just for Google Any More

CloudSwyft Learning-as-a-Service Course Catalog 2018 (Individual LaaS Course Catalog List)

The Datacenter Needs an Operating System

Challenges for Data Driven Systems

The age of Big Data Big Data for Oracle Database Professionals

Cloud Computing and Hadoop Distributed File System. UCSB CS170, Spring 2018

Lecture Notes CPSC 321 (Fall 2018) Today... Survey. Course Overview. Homework. HW1 (out) S. Bowers 1 of 8

From Internet Data Centers to Data Centers in the Cloud

Introduction to Graph Databases

Introduction to Data Management CSE 344

Stages of Data Processing

Big Data with Hadoop Ecosystem

CMU SCS CMU SCS Who: What: When: Where: Why: CMU SCS

Tour of Database Platforms as a Service. June 2016 Warner Chaves Christo Kutrovsky Solutions Architect

2013 AWS Worldwide Public Sector Summit Washington, D.C.

Copyright 2012 EMC Corporation. All rights reserved.

Strategic Briefing Paper Big Data

SAP HANA Update. Saul Cunningham SAP Big Data Centre of Excellence

Introduction to Data Management CSE 344

Big Data Its All Around You

Who we are: Database Research - Provenance, Integration, and more hot stuff. Boris Glavic. Department of Computer Science

What We Have Already Learned. DBMS Deployment: Local. Where We Are Headed Next. DBMS Deployment: 3 Tiers. DBMS Deployment: Client/Server

Modern Database Concepts

Things Every Oracle DBA Needs to Know about the Hadoop Ecosystem. Zohar Elkayam

CS 102. Big Data. Spring Big Data Platforms

CIB Session 12th NoSQL Databases Structures

A NoSQL Introduction for Relational Database Developers. Andrew Karcher Las Vegas SQL Saturday September 12th, 2015

Sensor Data Collection and Processing

Big Data: Next Frontier for Innovation

CSE 444: Database Internals. Lecture 23 Spark

Chapter 6. Foundations of Business Intelligence: Databases and Information Management VIDEO CASES

Hierarchy of knowledge BIG DATA 9/7/2017. Architecture

Cloud Computing Techniques for Big Data and Hadoop Implementation

Next-Generation Cloud Platform

Why NoSQL? Why Riak?

Computing Yi Fang, PhD

Data 101 Which DB, When. Joe Yong Azure SQL Data Warehouse, Program Management Microsoft Corp.

Introduction to Data Science Day 2

Introduction to Big Data

I am a Data Nerd and so are YOU!

Large-Scale Web Applications

Cloud Computing 2. CSCI 4850/5850 High-Performance Computing Spring 2018

Microsoft Big Data and Hadoop

STATE OF MODERN APPLICATIONS IN THE CLOUD

BIG DATA TESTING: A UNIFIED VIEW

DATABASE DESIGN II - 1DL400

Next Generation DWH Modeling. An overview of DWH modeling methods

Introduction to the Mathematics of Big Data. Philippe B. Laval

Hadoop محبوبه دادخواه کارگاه ساالنه آزمایشگاه فناوری وب زمستان 1391

Big Data Architect.

TOOLS FOR INTEGRATING BIG DATA IN CLOUD COMPUTING: A STATE OF ART SURVEY

GRADUATE PROGRAMS IN ENTERPRISE AND CLOUD COMPUTING

Data Science and Open Source Software. Iraklis Varlamis Assistant Professor Harokopio University of Athens

The Return of Innovation. David May. David May 1 Cambridge December 2005

Intro Cassandra. Adelaide Big Data Meetup.

Topics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples

A Parallel R Framework

"Big Data... and Related Topics" John S. Erickson, Ph.D The Rensselaer IDEA Rensselaer Polytechnic Institute

Architekturen für die Cloud

New Approaches to Big Data Processing and Analytics

Introduction to Data Management. Lecture #1 (The Course Trailer )

Portal: Applications of New Technology to Transportation Data Archiving. Kristin Tufte & the Portal Team NATMEC, July 1, 2014, Chicago, IL

Informatica Enterprise Information Catalog

Database Architecture 2 & Storage. Instructor: Matei Zaharia cs245.stanford.edu

Data Platforms and Pattern Mining

The MapReduce Framework

Copyright 2012 EMC Corporation. All rights reserved.

How to survive the Data Deluge: Petabyte scale Cloud Computing

Top 25 Big Data Interview Questions And Answers

Introduction to Database Systems CSE 414

Big Data - Some Words BIG DATA 8/31/2017. Introduction

Data Clustering on the Parallel Hadoop MapReduce Model. Dimitrios Verraros

BEST BIG DATA CERTIFICATIONS

Five Common Myths About Scaling MySQL

When Computing Becomes Human: Automation, Innovation, and the Rise of the All-Powerful Service Provider

An Introductionto Big Data

Salary Guide. Technology Guide For further information please contact Ida Renaud, Manager ,

Distributed Systems Intro and Course Overview

CS 6240: Parallel Data Processing in MapReduce: Module 1. Mirek Riedewald

Big Data com Hadoop. VIII Sessão - SQL Bahia. Impala, Hive e Spark. Diógenes Pires 03/03/2018

Some Big Data Challenges

WHITEPAPER. MemSQL Enterprise Feature List

Hadoop/MapReduce Computing Paradigm

SCU SEEDs Workshop Angela Musurlian

Scaling Up HBase. Duen Horng (Polo) Chau Assistant Professor Associate Director, MS Analytics Georgia Tech. CSE6242 / CX4242: Data & Visual Analytics

Introduction to Database Services

Transcription:

Database Management Systems Fall 2017 Knowledge is of two kinds: we know a subject ourselves, or we know where we can find information upon it. -- Samuel Johnson (1709-1784) Queries for Today Why? Who? What? How? For instance?

Databases Why Study Them? The quest for knowledge used to begin with grand theories. Now it begins with massive amounts of data. Welcome to the Petabyte Age. Databases Why Study Them?

The Big Data Buzz Why? Between the dawn of civilization and 2003, we only created five exabytes of information; now we re creating that amount every two days. Eric Schmidt, Google (and others)

The Big Data Buzz Why? The sexy job in the next 10 years will be statisticians. Data Scientists? Hal Varian Prof. Emeritus UC Berkeley Chief Economist, Google Some Numbers by Industry Amount of Stored Data By Sector (in Petabytes, 2009) 1000 966 900 848 800 700 600 715 619 500 434 400 300 364 269 227 200 100 0 Sources: "Big Data: The Next Frontier for Innovation, Competition and Productivity." US Bureau of Labor Statistics McKinsley Global Institute Analysis

Industrial Revolution of Data! UPC RFID GPS Sensornets Software Logs Microphones Cameras... http://www.flickr.com/photos/revsorg/1168346563/ It s All Happening On-line Every: Click Ad impression Wall post, friending, Billing event Fast Forward, pause, Server request Transaction Network message Fault Generates Streams of Data that can be Analyzed 12

User Generated Content 13 Credit: Mike Carey, UCI Graph Data Lots of interesting data has a graph structure: Social networks Communication networks Computer Networks Road networks Citations Collaborations/Relationships Some of these graphs can get quite large (e.g., Facebook s user graph) 14

M2M - Internet of things 15 Fusion: e.g., NextGen Maps Crowdsourcing + physical modeling + sensing + data assimilation to produce: From Alex Bayen, UCB 16

What can you do with the data Reporting Post Hoc Real time Monitoring (fine-grained) Exploration Finding Patterns Root Cause Analysis Closed-loop Control Model construction Prediction 17 Big Data, Societal-Scale App? Cancer Tumor Genomics Vision: Personalized Therapy 10 years from now, each cancer patient is going to want to get a genomic analysis of their cancer and will expect customized therapy based on that information. Director, The Cancer Genome Atlas (TCGA), Time Magazine, 6/13/11

Opportunity or Obligation? Provocative Hypothesis: Given fast growing genomic databases, could CS now be a huge help in war on cancer? If a chance that we could help millions of cancer patients live longer and better lives, as moral people, aren t we obligated to try? David Patterson, Computer Scientists May Have What It Takes to Help Cure Cancer, New York Times, 12/5/2011 UCSF cancer researchers + UCSC cancer genetic database + AMP Lab The Cancer Genome Atlas: 5 PB = 20 cancers x 1000 genomes So, In Summary Why? Data will be at the center of the major issues and events of your life. As a computer professional, you d better be on top of how to manage, use, and make sense of it.

Queries for Today Why? Who? What? How? For instance? Who? Haidar M. Harmanani Professor of Computer Science Office: 810 Block A Office Hours: TTh: 8:00-9:30 and 3:00-4:30 Email: haidar@lau.edu.lb Course Website: http://vlsi.byblos.lau.edu.lb/classes/csc375/csc37 5.html http://harmanani.github.io/csc375.html

What: Current Market Relational DBMSs anchor the software industry Elephants: Oracle, IBM, Microsoft, Teradata, HP, EMC, Open source: MySQL, PostgreSQL Obviously, Search Google & Bing Open Source NoSQL Hadoop MapReduce Key-value stores: Cassandra, Riak, Voldemort, Mongo, Cloud services Amazon, Google AppEngine, MS Azure, Heroku, Increasing use of custom code What will we learn? Design patterns for dealing with Big Data When, why and how to structure your data How MySQL and Oracle and (a bit of) Google work SQL... and nosql Managing concurrency Fault tolerance and Recovery Scaling out: parallelism and replication Audacity and Reverence.

What: Summing up Data is at the center of many things. For instance: computer science. What: Summing up You might think that we ll learn to apply computer science to Big Data. This class should apply very broadly. The techniques we ll learn for Big Data are the key to scalable computer science.

Don t forget Hal Varian s prediction These professions barely have names: Cloud programmer Data scientist Scalable systems architect Data-driven thinker This will be a large fraction of the computing workforce. Google's Chief Economist By 2018, the US could face a shortage of up to 190,000 workers with analytical skills McKinsey Global Institute