Review - Relational Model Concepts

Similar documents
Lecture 25 Overview. Last Lecture Query optimisation/query execution strategies

Cloud Computing 2. CSCI 4850/5850 High-Performance Computing Spring 2018

COSC 416 NoSQL Databases. NoSQL Databases Overview. Dr. Ramon Lawrence University of British Columbia Okanagan

Overview. * Some History. * What is NoSQL? * Why NoSQL? * RDBMS vs NoSQL. * NoSQL Taxonomy. *TowardsNewSQL

COSC 304 Introduction to Database Systems. NoSQL Databases. Dr. Ramon Lawrence University of British Columbia Okanagan

CIB Session 12th NoSQL Databases Structures

NOSQL EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY

A NoSQL Introduction for Relational Database Developers. Andrew Karcher Las Vegas SQL Saturday September 12th, 2015

CISC 7610 Lecture 2b The beginnings of NoSQL

Cassandra, MongoDB, and HBase. Cassandra, MongoDB, and HBase. I have chosen these three due to their recent

Relational databases

Distributed Databases: SQL vs NoSQL

relational Key-value Graph Object Document

Study of NoSQL Database Along With Security Comparison

Stages of Data Processing

Introduction to NoSQL by William McKnight

Big Data with Hadoop Ecosystem

Comparing SQL and NOSQL databases

Non-Relational Databases. Pelle Jakovits

Big Data Hadoop Stack

COSC344 Database Theory and Applications. σ a= c (P) Lecture 3 The Relational Data. Model. π A, COSC344 Lecture 3 1

Introduction to Databases

Introduction to BigData, Hadoop:-

Introduction to NoSQL Databases

CS-580K/480K Advanced Topics in Cloud Computing. NoSQL Database

Hands-on immersion on Big Data tools

The Hadoop Ecosystem. EECS 4415 Big Data Systems. Tilemachos Pechlivanoglou

Typical size of data you deal with on a daily basis

What is database? Types and Examples

BIG DATA COURSE CONTENT

NOSQL DATABASE SYSTEMS: DECISION GUIDANCE AND TRENDS. Big Data Technologies: NoSQL DBMS (Decision Guidance) - SoSe

BIG DATA TECHNOLOGIES: WHAT EVERY MANAGER NEEDS TO KNOW ANALYTICS AND FINANCIAL INNOVATION CONFERENCE JUNE 26-29,

Microsoft Big Data and Hadoop

Introduction to Big Data. NoSQL Databases. Instituto Politécnico de Tomar. Ricardo Campos

Chapter 5. Database Processing

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara

DATABASE MANAGEMENT SYSTEMS. UNIT I Introduction to Database Systems

Intro Cassandra. Adelaide Big Data Meetup.

Getting to know. by Michelle Darling August 2013

INFO-H415 Adanvanced Databases Documents store and cloudant

Column-Family Databases Cassandra and HBase

Embedded Technosolutions

CSE 344 JULY 9 TH NOSQL

NoSQL : A Panorama for Scalable Databases in Web

Hadoop An Overview. - Socrates CCDH

Sources. P. J. Sadalage, M Fowler, NoSQL Distilled, Addison Wesley

Big Data Architect.

Databases : Lecture 1 2: Beyond ACID/Relational databases Timothy G. Griffin Lent Term Apologies to Martin Fowler ( NoSQL Distilled )

Distributed Non-Relational Databases. Pelle Jakovits

NoSQL systems: introduction and data models. Riccardo Torlone Università Roma Tre

Introduction to NoSQL

Big Data Hadoop Developer Course Content. Big Data Hadoop Developer - The Complete Course Course Duration: 45 Hours

Scaling Up HBase. Duen Horng (Polo) Chau Assistant Professor Associate Director, MS Analytics Georgia Tech. CSE6242 / CX4242: Data & Visual Analytics

Avancier Methods (AM) From logical model to physical database

Modern Database Concepts

Challenges for Data Driven Systems

Nowcasting. D B M G Data Base and Data Mining Group of Politecnico di Torino. Big Data: Hype or Hallelujah? Big data hype?

Database Evolution. DB NoSQL Linked Open Data. L. Vigliano

Big Data Analytics. Rasoul Karimi

NoSQL Databases. Amir H. Payberah. Swedish Institute of Computer Science. April 10, 2014

Research Works to Cope with Big Data Volume and Variety. Jiaheng Lu University of Helsinki, Finland

L22: NoSQL. CS3200 Database design (sp18 s2) 4/5/2018 Several slides courtesy of Benny Kimelfeld

A REVIEW PAPER ON BIG DATA ANALYTICS

Data Informatics. Seon Ho Kim, Ph.D.

/ Cloud Computing. Recitation 10 March 22nd, 2016

Topics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples

CSE 530A. Non-Relational Databases. Washington University Fall 2013

Performance Comparison of NOSQL Database Cassandra and SQL Server for Large Databases

Introduction to Hadoop. High Availability Scaling Advantages and Challenges. Introduction to Big Data

Data Science and Open Source Software. Iraklis Varlamis Assistant Professor Harokopio University of Athens

Understanding NoSQL Database Implementations

Online Bill Processing System for Public Sectors in Big Data

Goal of the presentation is to give an introduction of NoSQL databases, why they are there.

Based on Big Data: Hype or Hallelujah? by Elena Baralis

NoSQL Databases MongoDB vs Cassandra. Kenny Huynh, Andre Chik, Kevin Vu

NoSQL Database Comparison: Bigtable, Cassandra and MongoDB CJ Campbell Brigham Young University October 16, 2015

/ Cloud Computing. Recitation 8 October 18, 2016

Big Data Hadoop Course Content

Big Data Analytics using Apache Hadoop and Spark with Scala

The NoSQL Ecosystem. Adam Marcus MIT CSAIL

NoSQL systems. Lecture 21 (optional) Instructor: Sudeepa Roy. CompSci 516 Data Intensive Computing Systems

Making MongoDB Accessible to All. Brody Messmer Product Owner DataDirect On-Premise Drivers Progress Software

Efficient and Scalable Friend Recommendations

Presented by Sunnie S Chung CIS 612

Introduction to Graph Databases

We are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info

NOSQL Databases: The Need of Enterprises

CERTIFICATE IN SOFTWARE DEVELOPMENT LIFE CYCLE IN BIG DATA AND BUSINESS INTELLIGENCE (SDLC-BD & BI)

Topics. History. Architecture. MongoDB, Mongoose - RDBMS - SQL. - NoSQL

A Review Paper on Big data & Hadoop

Introduction Aggregate data model Distribution Models Consistency Map-Reduce Types of NoSQL Databases

WHITEPAPER

Databases and Big Data Today. CS634 Class 22

CISC 7610 Lecture 4 Approaches to multimedia databases. Topics: Document databases Graph databases Metadata Column databases

Big Data Management and NoSQL Databases

Processing Unstructured Data. Dinesh Priyankara Founder/Principal Architect dinesql Pvt Ltd.

Processing big data with modern applications: Hadoop as DWH backend at Pro7. Dr. Kathrin Spreyer Big data engineer

Cassandra- A Distributed Database

Database Solution in Cloud Computing

CS 655 Advanced Topics in Distributed Systems

Transcription:

Lecture 25 Overview Last Lecture Query optimisation/query execution strategies This Lecture Non-relational data models Source: web pages, textbook chapters 20-22 Next Lecture Revision Review - Relational Model Concepts First formulated and proposed in 1969 by Edgar F. Codd A database is a collection of relations Usually referred to as tables Review - Relational Model Concepts Other models before and after Relational Model Concepts? COSC344 Lecture 25 1 COSC344 Lecture 25 2 History of Database: https://www.youtube.com/watch?annotation_id=annotation_3070369099&feature=iv&src_vid=q UV2j3XBRHc&v=FR4QIeZaPeM COSC344 Lecture 25 3 The Hierarchical Model Began as IBM s database program IMS - Information Management System, introduced in the late 1960s Organize data into a tree-like structure s and parent-child relationships 1:N relationship Jo Jones COSC344 Lecture 25 4 COSC344 Lecture 25 5 COSC344 Lecture 25 6 Division Divname... Staff Stname... William Duncan Vic Evans Staff Stname... COSC344 Lecture 25 7 COSC344 Lecture 25 8 COSC344 Lecture 25 9

Sample data as seen by the model Sciences Commerce Humanities Management Spanish Accounting Information English Science Education Jo Jones Jo Jones Allan Jo Jones Abbot COSC344 Lecture 25 10 Hierarchical comments It needs to be designed, defined and then built It is difficult to extend or alter It lacks the relational flexibility where tables can be added and dropped COSC344 Lecture 25 11 The Network model The network database model was invented by Charles Bachman in 1969 as an enhancement of the hierarchical database model. ( CODASYL model ) An attempt to standardise the data model objects are s and sets - usual meaning, a group of related data values set - a description of a 1:N relationship between two types sets have names, owners and members and are represented using Bachman diagrams COSC344 Lecture 25 12 Bachman diagrams Bachman diagrams Ring (circular linked list) linking the owner and all member s. Division Divname... Set type Set name Major_Dept Ali Div_Dept Major_Dept COSC344 Lecture 25 13 COSC344 Lecture 25 14 COSC344 Lecture 25 15 Humanities Sciences Introduction to Big Data What is Big Data? What makes data, Big? The Model Has Changed The Model of Generating/Consuming Data has Changed Old Model: Few companies are generating data, all others are consuming data New Model: all of us are generating data, and all of us are consuming data COSC344 Lecture 25 16 COSC344 Lecture 25 17 COSC344 Lecture 25 18

Big Data Sources Who s Generating Big Data Big Data Definition No single standard definition Big Data is data whose scale, diversity, and Social media and networks (all of us are generating data) Scientific instruments (collecting all sorts of data) Mobile devices (tracking all objects all the time) complexity require new architecture, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it Sensor technology and networks (measuring all kinds of data) The progress and innovation is no longer hindered by the ability to collect data But, by the ability to manage, analyze, summarize, visualize, and discover knowledge from the collected data in a timely manner and in a scalable fashion COSC344 Lecture 25 19 COSC344 Lecture 25 20 COSC344 Lecture 25 21 Data Volume 44x increase from 2009 2020 Data volume is increasing exponentially 1-Scale (Volume) Exponential increase in collected/generated data 2-Complexity (Varity) Various formats, types, and structures Text, numerical, images, audio, video, sequences, time series, social media data, multi-dim arrays, etc A single application can be generating/collecting many types of data To extract knowledgeè all these types of data need to linked together 3-Speed (Velocity) Data is begin generated fast and need to be processed fast Late decisions è missing opportunities Examples E-Promotions: Based on your current location, your purchase history, what you like è send promotions right now for store next to you Healthcare monitoring: sensors monitoring your activities and body è any abnormal measurements require immediate reaction COSC344 Lecture 25 22 COSC344 Lecture 25 23 COSC344 Lecture 25 24 Big Data Characteristics Some Make it 4V s Stands for Not Only SQL Looser schema definition NoSQL Usually do not require a fixed table schema Designed to handle distributed, large databases No joins It's not a replacement for a RDBMS but compliments it. COSC344 Lecture 25 25 COSC344 Lecture 25 26 COSC344 Lecture 25 27

Where NoSQL is Used? Google (BigTable, Cloud Datastore) LinkedIn (Espresso) Facebook (Cassandra) Twitter (Hadoop/Hbase, FlockDB, Cassandra) Netflix(Simple DB, Hadoop/Hbase) NoSQL Scaling Scaling horizontally is possible with NoSQL Scaling up / down is easy Key-Value Simple model, mapping between key and value You cannot query without having the key Easy to use, very scalable Example: Redis (Widely used by Twitter and Flickr) https://www.youtube.com/watch?v=og610oe_kxs COSC344 Lecture 25 28 COSC344 Lecture 25 29 COSC344 Lecture 25 30 Wide-Column Similar to tables Key -> (Column: Value)* Example: Hbase, Cassandra (by Facebook) Column families Document JSON (JavaScript Object Notation), XML Use when data is semi-structured Useful when the structure of data changes through the lifetime of application Example: MongoDB (https://www.youtube.com/watch?v=cvir-2lmlsk) Graph Useful when objective is to quickly find connections, pattern and relationship between lots of data Usage: Recommendations for product, social media Example: FlockDB, Stardog Row key Personal data Professional data ID First Name Last Name Date of Birth Job Category Salary Date of Hire Employer COSC344 Lecture 25 31 COSC344 Lecture 25 32 COSC344 Lecture 25 33 http://highlyscalable.wordpress.com/2012/0 3/01/nosql-data-modeling-techniques/ Post-relational data models see NoSQL: Distributed and Scalable Non-Relational Database Systems http://www.linux-mag.com/id/7579/ COSC344 Lecture 25 34 MapReduce Programming Model A single machine cannot serve all the data Need a distributed system to store and process in parallel How do you scale to more machines? MapReduce - programming model and an associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster How does MapReduce work? Read a lot of data Map: extract something you care about from each Shuffle and Sort Reduce: aggregate, summarize, filter, or transform Write the results COSC344 Lecture 25 35 Input Data Split 0 Split 1 Split 2 read MapReduce workflow local write Map extract something you care about from each remote read, sort write Reduce aggregate, summarize, filter, or transform Output Data Output File 0 Output File 1 COSC344 Lecture 25 36

Example: Word Count Example: Word Count MapReduce http://kickstarthadoop.blogspot.ca/2011/04/word-count-hadoop-map-reduce-example.html COSC344 Lecture 25 37 37 http://kickstarthadoop.blogspot.ca/2011/04/word-count-hadoop-map-reduce-example.html COSC344 Lecture 25 38 38 COSC344 Lecture 25 39 Exam Review and exam guidance in next lecture References NoSQL: Distributed and Scalable Non-Relational Database Systems,http://www.linux-mag.com/id/7579/ What is NoSQL: https://www.youtube.com/watch?v=quv2j3xbrhc NoSQL vs. SQL Summary, http://www.mongodb.com/nosql-explained NoSQL Wiki, http://en.wikipedia.org/wiki/nosql Cassandra, http://cassandra.apache.org/ Mongodb, http://www.mongodb.org/ https://www.youtube.com/watch?v=fr4qiezapem https://www.mongodb.com/big-data-explained https://datajobs.com/what-is-hadoop-and-nosql http://bigdata-madesimple.com/a-deep-dive-into-nosql-a-complete-list-of-nosqldatabases/ Introduction to NoSQL and MongoDB - Northeastern University, www.ccs.neu.edu/home/kathleen/classes/cs3200/20-nosqlmongodb.pdf COSC344 Lecture 25 40 COSC344 Lecture 25 41