The NoSQL Landscape. Frank Weigel, VP, Field Technical Operations


1

2 The NoSQL Landscape. Frank Weigel, VP, Field Technical Operations

3 What we'll talk about. Why are RDBMS not enough? What are the different NoSQL taxonomies? Which NoSQL is right for me?

4 Macro Trends Driving NoSQL Technology. More Data + More Users + Interactive Apps, all driving NoSQL.

5 Draw Something Goes Viral 3 Weeks After Launch. [Chart: Draw Something by OMGPOP, daily active users (millions), Feb 2012 to March 2012.]

6 Does it work with an RDBMS backend? Application scales out: just add more commodity web servers. Database scales up: get a bigger, more complex server. Note: relational database technology is great for what it is great for, but it is not great for this.

7 Some alternatives to scale out your RDBMS. Scale out your RDBMS: run many SQL servers. Data are sharded (most of the time using client code). Memcached for faster response time.
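The pattern on this slide (client-side sharding with a cache in front) can be sketched roughly as follows. This is a toy model, not any particular product's API: the `ShardedStore` class and its method names are hypothetical, plain dicts stand in for the SQL servers and for memcached, and modulo hashing is assumed as the sharding rule.

```python
import hashlib


class ShardedStore:
    """Toy model of client-side sharding with a cache in front (hypothetical names)."""

    def __init__(self, shard_count):
        # Each dict stands in for one SQL server; self.cache stands in for memcached.
        self.shards = [dict() for _ in range(shard_count)]
        self.cache = {}

    def shard_for(self, key):
        # A stable hash, so every client maps the same key to the same shard.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def write(self, key, value):
        self.shards[self.shard_for(key)][key] = value
        self.cache.pop(key, None)  # invalidate any stale cached copy

    def read(self, key):
        if key in self.cache:  # cache hit: no database round trip
            return self.cache[key]
        value = self.shards[self.shard_for(key)].get(key)
        if value is not None:
            self.cache[key] = value  # cache-aside: fill the cache on a miss
        return value


store = ShardedStore(shard_count=4)
store.write("user:42", {"name": "Ann"})
first = store.read("user:42")   # miss, served by the shard
second = store.read("user:42")  # hit, served by the cache
```

With modulo hashing, changing `shard_count` remaps most keys to a different shard, which is one reason this approach ends up being scaled by hand.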

8 Scale out with RDBMS. Is this a good approach to scale? Lots of components to deploy. Scale by hand: caching, sharding/replication. Learn from others: this scenario costs time and money. Scaling SQL is potentially disastrous when going viral: it is a very risky time for major code changes and migrations. You have no time when skyrocketing up!

9 Lacking Solutions, Users Forced to Invent. Bigtable: November 2006. Dynamo: October 2007. Cassandra: August 2008. Voldemort: February 2009. Very few organizations can build and maintain database software technology. But every organization building interactive web applications needs this technology.

10 NoSQL database matches application logic tier architecture. Data layer now scales with linear cost and constant performance. Application scales out: just add more commodity web servers. Database (NoSQL database servers) scales out: just add more commodity data servers. Scaling out flattens the cost and performance curves.

11 What Is Biggest Data Management Problem Driving Use of NoSQL in Coming Year? Lack of flexibility/rigid schemas: 49%. Inability to scale out data: 35%. Performance challenges: 29%. Cost: 16%. All of these: 12%. Other: 11%. Source: Couchbase Survey, December 2011, n = 1351.

12 NoSQL Taxonomy

13 The Key-Value Store: the foundation of NoSQL. A key maps to an opaque binary value.

14 Memcached: the NoSQL precursor. A key maps to an opaque binary value. In-memory only. Limited set of operations. Blob storage: Set, Add, Replace, CAS. Retrieval: Get. Structured data: Append, Increment. Simple and fast. Challenges: cold cache, disruptive elasticity.
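To make the operation list on this slide concrete, here is a minimal in-memory sketch of the semantics those operations are generally understood to have. `ToyCache` is a hypothetical class written for illustration; it is not the memcached client API and it ignores expiry, eviction and the wire protocol.

```python
class ToyCache:
    """Illustrative model of memcached-style operations (not a real client)."""

    def __init__(self):
        self._data = {}         # key -> (value, version)
        self._next_version = 1  # bumped on every successful store

    def _store(self, key, value):
        self._data[key] = (value, self._next_version)
        self._next_version += 1
        return True

    def set(self, key, value):
        return self._store(key, value)  # unconditional store

    def add(self, key, value):
        # Store only if the key does not exist yet.
        return False if key in self._data else self._store(key, value)

    def replace(self, key, value):
        # Store only if the key already exists.
        return self._store(key, value) if key in self._data else False

    def get(self, key):
        entry = self._data.get(key)
        return None if entry is None else entry[0]

    def gets(self, key):
        return self._data.get(key)  # (value, version) or None

    def cas(self, key, value, version):
        # Check-and-set: store only if nobody changed the key since gets().
        entry = self._data.get(key)
        if entry is None or entry[1] != version:
            return False
        return self._store(key, value)

    def append(self, key, suffix):
        entry = self._data.get(key)
        return False if entry is None else self._store(key, entry[0] + suffix)

    def incr(self, key, delta=1):
        entry = self._data.get(key)
        if entry is None:
            return None
        self._store(key, entry[0] + delta)
        return entry[0] + delta


cache = ToyCache()
cache.set("greeting", b"hello")
added_again = cache.add("greeting", b"other")   # key exists, so add is refused
cache.append("greeting", b" world")
value, version = cache.gets("greeting")
cache.set("greeting", b"changed elsewhere")     # a concurrent writer gets in first
stale_write = cache.cas("greeting", b"mine", version)
cache.set("hits", 0)
hits = cache.incr("hits")
```

The CAS call fails here because the value changed between `gets` and `cas`, which is the optimistic-concurrency behaviour the operation exists to provide.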

15 NoSQL catalog. Categories: Key-Value. Cache (memory only): Memcached. Database (memory/disk): (empty).

16 Redis: more structured data commands. A key maps to a data structure: blob, list, set, hash. Disk persistence (eventual consistency on the disk). Vast set of operations. Blob storage: Set, Add, Replace, CAS. Retrieval: Get, Pub-Sub. Structured data: strings, hashes, lists, sets, sorted sets. Example operations for a set: add, count, subtract sets, intersection, is member, atomic move from one set to another.
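The example set operations on this slide can be illustrated with plain Python sets standing in for Redis sets. No Redis client is used here; the comments name the Redis commands that correspond to each step (SADD, SCARD, SDIFF, SINTER, SISMEMBER, SMOVE), and the `move` helper is a hypothetical local function. In Redis the move is performed atomically on the server, which this sketch does not reproduce.

```python
online = set()
online.add("ann")                 # SADD online ann
online.add("bob")                 # SADD online bob
premium = {"bob", "cy"}

count = len(online)               # SCARD online
only_online = online - premium    # SDIFF online premium
both = online & premium           # SINTER online premium
is_member = "ann" in online       # SISMEMBER online ann


def move(source, destination, member):
    """Move a member from one set to another, like SMOVE (not atomic here)."""
    if member not in source:
        return False
    source.remove(member)
    destination.add(member)
    return True


moved = move(online, premium, "ann")
```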

17 NoSQL catalog. Categories: Key-Value, Data Structure. Cache (memory only): Memcached, Redis. Database (memory/disk): (empty).

18 Membase: from key-value cache to database. A key maps to an opaque binary value. Disk-based with built-in memcached cache. Cache refill on restart. Memcached compatible (drop-in replacement). Highly available (data replication). Add or remove capacity to live cluster. Simple, fast, elastic.

19 NoSQL catalog. Categories: Key-Value, Data Structure. Cache (memory only): Memcached, Redis. Database (memory/disk): Membase.

20 Couchbase: document-oriented database. A key maps to a JSON object (document): { string : string, string : value, string : { string : string, string : value }, string : [ array ] }. Auto-sharding. Disk-based with built-in memcached cache. Cache refill on restart. Memcached compatible (drop-in replacement). Highly available (data replication). Add or remove capacity to live cluster. When values are JSON objects (documents): create indices and views, and query against the views.
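The idea of building a view over JSON documents and querying it can be sketched as below. This is an illustration in Python under stated assumptions, not Couchbase's view engine: the document keys, field names and the helpers `map_by_city`, `build_view` and `query_view` are all hypothetical. The sketch only shows the general shape of a map function that emits key/value pairs into a sorted index.

```python
def map_by_city(doc_id, doc):
    """Hypothetical map function: emit (key, value) for documents that have a city."""
    if "city" in doc:
        yield doc["city"], doc.get("name")


def build_view(documents, map_function):
    """Run the map function over every document and keep the rows sorted by key."""
    rows = []
    for doc_id, doc in documents.items():
        for key, value in map_function(doc_id, doc):
            rows.append((key, doc_id, value))
    rows.sort()
    return rows


def query_view(view, key):
    """Return (doc_id, value) for every row emitted under the given key."""
    return [(doc_id, value) for row_key, doc_id, value in view if row_key == key]


documents = {
    "u::1": {"name": "Ann", "city": "Paris"},
    "u::2": {"name": "Bob", "city": "Berlin"},
    "u::3": {"name": "Cy", "city": "Paris"},
    "u::4": {"name": "Di"},  # no city, so the map function emits nothing
}

view = build_view(documents, map_by_city)
paris = query_view(view, "Paris")
```

Documents that lack the indexed field simply do not appear in the view, which is how a schema-free store can still offer indexed queries.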

21 NoSQL catalog. Categories: Key-Value, Data Structure, Document. Cache (memory only): Memcached, Redis. Database (memory/disk): Membase, Couchbase.

22 MongoDB: document-oriented database. A key maps to a BSON object (document): { string : string, string : value, string : { string : string, string : value }, string : [ array ] }. Disk-based with in-memory caching. BSON (binary JSON) format and wire protocol. Master-slave replication. Auto-sharding. Values are BSON objects. Supports ad hoc queries, best when indexed.

23 NoSQL catalog. Categories: Key-Value, Data Structure, Document. Cache (memory only): Memcached, Redis. Database (memory/disk): Membase, Couchbase, MongoDB.

24 Cassandra: column overlays. [Diagram: a key with an opaque binary value, overlaid with Column 1, Column 2 and Column 3 (not present).] Disk-based system. Clustered. External caching required for low-latency reads. Columns are overlaid on the data. Not all rows must have all columns. Supports efficient queries on columns. Restart required when adding columns.

25 NoSQL catalog. Categories: Key-Value, Data Structure, Document, Column. Cache (memory only): Memcached, Redis. Database (memory/disk): Membase, Couchbase, MongoDB, Cassandra.

26 Neo4j: graph database. [Diagram: several nodes, each a key with an opaque binary value, connected by relationships.] Disk-based system. External caching required for low-latency reads. Nodes, relationships and paths. Properties on nodes. Delete, Insert, Traverse, etc.
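The node, relationship and traversal vocabulary on this slide can be illustrated with a small breadth-first traversal. This is a sketch over plain Python dicts with hypothetical data; it is not the Neo4j API or its query language.

```python
from collections import deque

# Nodes carry properties; relationships are typed, directed edges (hypothetical data).
nodes = {
    "ann": {"age": 34},
    "bob": {"age": 29},
    "cy": {"age": 41},
    "di": {"age": 25},
}
relationships = {
    "ann": [("FRIEND", "bob")],
    "bob": [("FRIEND", "cy"), ("COLLEAGUE", "di")],
    "cy": [],
    "di": [],
}


def traverse(start, rel_type, max_depth):
    """Breadth-first traversal along one relationship type, up to max_depth hops."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for kind, target in relationships.get(node, []):
            if kind == rel_type and target not in seen:
                seen.add(target)
                found.append((target, depth + 1))
                queue.append((target, depth + 1))
    return found


friends_of_friends = traverse("ann", "FRIEND", max_depth=2)
direct_friends = traverse("ann", "FRIEND", max_depth=1)
```

Queries of this shape (friends of friends, shortest routes) are the kind of connected-data work that graph stores are designed around.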

27 NoSQL catalog. Categories: Key-Value, Data Structure, Document, Column, Graph. Cache (memory only): Memcached, Redis. Database (memory/disk): Membase, Couchbase, MongoDB, Cassandra, Neo4j.

28 NoSQL catalog. Categories: Key-Value, Data Structure, Document, Column, Graph. Cache (memory only): Memcached, Redis, Coherence. Database (memory/disk): Membase, Couchbase, MongoDB, Cassandra, HBase, Neo4j, InfiniteGraph.

29 Speed and Scale

30 What about Hadoop?

31 Hadoop: Big Data Swiss Army Knife. Oozie: workflow, coordination; Sqoop: data connector to import/export data; Hive: SQL-like interface; Pig: high-level programming language; Mahout: machine learning library; Whirr: Hadoop management tools for cloud services; Flume: aggregator; MapReduce: framework to process large volumes of data; HBase: key-value data store; Zookeeper: centralized configuration management; HDFS: distributed file system.

32 So what? Hadoop & Couchbase. 40 milliseconds to respond with the decision. [Diagram flows: (1) click stream events; (2) profiles, campaigns; (3) profiles, real-time campaign statistics.]

33 Which one is right for me?

34 Survey: Schema inflexibility #1 adoption driver. What is the biggest data management problem driving your use of NoSQL in the coming year? Lack of flexibility/rigid schemas: 49%. Inability to scale out data: 35%. High latency/low performance: 29%. Costs: 16%. All of these: 12%. Other: 11%. Source: Couchbase NoSQL Survey, December 2011, n = 1351.

35 Lack of Flexibility / Rigid Schema. Aggregate data models (Martin Fowler): flexible data structure, optimized access, easy to distribute data. http://martinfowler.com/bliki/aggregateorienteddatabase.html Example document stored under key o::1001: { uid: ji22jd, customer: Ann, line_items: [ { sku: , quan: 3, unit_price: 48.0 }, { sku: , quan: 1, unit_price: 39.0 }, { sku: , quan: 1, unit_price: 51.0 } ], payment: { type: Amex, expiry: 04/2001, last5: } }
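As a rough illustration of why an aggregate is convenient to work with, the order document from this slide can be held as one nested value and read with a single key lookup. The `sku` strings below are placeholders invented for this sketch (the slide's values did not survive), the `last5` field is omitted for the same reason, and `order_total` is a hypothetical helper.

```python
# One aggregate: the order, its line items and its payment travel together.
order = {
    "uid": "ji22jd",
    "customer": "Ann",
    "line_items": [
        {"sku": "sku-a", "quan": 3, "unit_price": 48.0},  # sku values are placeholders
        {"sku": "sku-b", "quan": 1, "unit_price": 39.0},
        {"sku": "sku-c", "quan": 1, "unit_price": 51.0},
    ],
    "payment": {"type": "Amex", "expiry": "04/2001"},
}

# A dict stands in for the document store; the key matches the slide.
store = {"o::1001": order}


def order_total(doc):
    """Sum quantity times unit price over the line items inside the aggregate."""
    return sum(item["quan"] * item["unit_price"] for item in doc["line_items"])


total = order_total(store["o::1001"])  # one lookup, no joins
```

Because everything needed for the calculation lives under one key, the whole aggregate can sit on a single node, which is what makes this model easy to distribute.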

36 Use Cases. Key-Value: session management, user profile/preferences, shopping cart. Document: event logging, content management, web analytics, e-commerce application. Columns: event logging, content management, counters. Graph: connected data/social networks, routing and dispatch, recommendations based on social graph.

37 Scale out your data. Modifying the cluster topology should be simple: add, remove and configure nodes on a running system. What is the impact of topology changes? Sharding and caching of the data. Availability of the service during cluster changes. More hardware = more failures. Availability and reliability of the system: failover support.

38 Production Environment. [Diagram: EMEA DC, US data center, APAC DC.]

39 Add Nodes to Cluster. [Diagram: two app servers, each with a Couchbase client library and cluster map, send read/write/update requests to a Couchbase Server cluster of five servers holding active and replica docs; user-configured replica count = 1.] Two servers added in a one-click operation. Docs automatically rebalanced across cluster: even distribution of docs, minimum doc movement. Cluster map updated. App database calls now distributed over larger number of servers.
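The rebalance behaviour described on this slide (even distribution, minimum movement, updated cluster map) can be sketched as follows. The slide does not say how the cluster map is represented, so this sketch assumes a fixed number of partitions, each owned by one server, with keys hashed to partitions by CRC32; `rebalance` and `server_for` are hypothetical helpers, not Couchbase functions.

```python
import zlib
from collections import Counter


def server_for(key, cluster_map):
    """Client-side lookup: hash the key to a partition, then read its owner."""
    partition = zlib.crc32(key.encode("utf-8")) % len(cluster_map)
    return cluster_map[partition]


def rebalance(cluster_map, servers):
    """Return an even partition-to-server map, moving only partitions over quota."""
    base, extra = divmod(len(cluster_map), len(servers))
    # The first `extra` servers carry one more partition than the rest.
    quota = {s: base + (1 if i < extra else 0) for i, s in enumerate(servers)}
    new_map = list(cluster_map)
    held = {s: 0 for s in servers}
    to_move = []
    for partition, owner in enumerate(cluster_map):
        if owner in held and held[owner] < quota[owner]:
            held[owner] += 1            # partition stays where it is
        else:
            to_move.append(partition)   # owner is over quota or gone
    for server in servers:
        while held[server] < quota[server]:
            new_map[to_move.pop()] = server
            held[server] += 1
    return new_map


old_map = ["s1", "s2", "s3"] * 4  # 12 partitions on 3 servers
new_map = rebalance(old_map, ["s1", "s2", "s3", "s4", "s5"])
moved = sum(1 for before, after in zip(old_map, new_map) if before != after)
spread = Counter(new_map)
```

Since keys hash to partitions rather than to servers, adding servers changes only the partition owners in the map; the key-to-partition mapping stays fixed.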

40 Fail Over Node. [Diagram: two app servers, each with a Couchbase client library and cluster map, accessing a Couchbase Server cluster of five servers holding active and replica docs; user-configured replica count = 1.] App servers accessing docs. Requests to Server 3 fail. Cluster detects server failed. Promotes replicas of docs to active. Updates cluster map. Requests for docs now go to appropriate server. Typically rebalance would follow.
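The failover steps listed on this slide can be sketched as a small function over two maps, one for active copies and one for replicas. The partition numbering, the map layout and the `fail_over` helper are assumptions made for illustration; with a replica count of 1, as on the slide, each partition has at most one replica.

```python
def fail_over(active, replica, failed):
    """Promote replicas for partitions whose active copy was on the failed server."""
    new_active = dict(active)
    new_replica = dict(replica)
    for partition, server in active.items():
        if server == failed:
            new_active[partition] = replica[partition]  # promote replica to active
            new_replica[partition] = None               # no replica until rebalance
    for partition, server in replica.items():
        if server == failed:
            new_replica[partition] = None               # replica copy was lost
    return new_active, new_replica


# Hypothetical cluster map: partition -> server holding the active / replica copy.
active = {0: "s1", 1: "s2", 2: "s3", 3: "s3"}
replica = {0: "s2", 1: "s3", 2: "s1", 3: "s2"}

new_active, new_replica = fail_over(active, replica, failed="s3")
```

After failover several partitions have no replica left, which is why a rebalance typically follows to restore the configured replica count.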

41 Performance. What is my working set? How is the cache working? Put your data in RAM. How to design my data model? Aggregate model. Easy to change.

42 Read performance comparison, NoSQL databases. [Chart: read latencies against throughput; y-axis: 95th percentile latency (ms); x-axis: operations per second; series: Cassandra, MongoDB, Couchbase.] MongoDB cannot handle throughput above ~8000 ops/sec. Couchbase handles ~3X throughput with significantly lower latency. Source: https://github.com/altoros/ycsb

43 Write performance comparison, NoSQL databases. [Chart: insert/update latencies against throughput; y-axis: 95th percentile latency (ms); x-axis: operations per second; series: MongoDB, Cassandra, Couchbase.] MongoDB latency shoots up beyond 6000 ops/sec. Couchbase latency stays consistently low even at ops / sec
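Both charts report 95th percentile latency rather than an average. As a reminder of what that metric measures, here is a minimal nearest-rank percentile sketch; the sample latencies are made up for illustration and are not taken from the benchmark, and `percentile` is a hypothetical helper.

```python
import math


def percentile(samples, percent):
    """Nearest-rank percentile: the smallest sample with at least percent% at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(percent * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in ms: most requests are fast, one in ten is slow.
latencies_ms = [2.0] * 90 + [40.0] * 10

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
```

In this made-up sample the median looks healthy while the 95th percentile exposes the slow tail, which is why tail percentiles are the usual choice for interactive workloads.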

44 Management and Monitoring. Do not forget about operations! The Service Reliability Engineering team will thank you! Manage your cluster easily: command line and administration console to change cluster topology. Monitor your NoSQL: analyze the overall status of your cluster, view and fix bottlenecks.

45

46 Conclusion. One size does not fit all. Overview of the NoSQL types. Choose the right solution: developer productivity, large-scale data.

47 Q&A

48 Thanks!

49


Cassandra, MongoDB, and HBase. Cassandra, MongoDB, and HBase. I have chosen these three due to their recent Tanton Jeppson CS 401R Lab 3 Cassandra, MongoDB, and HBase Introduction For my report I have chosen to take a deeper look at 3 NoSQL database systems: Cassandra, MongoDB, and HBase. I have chosen these

More information

A Study of NoSQL Database

A Study of NoSQL Database A Study of NoSQL Database International Journal of Engineering Research & Technology (IJERT) Biswajeet Sethi 1, Samaresh Mishra 2, Prasant ku. Patnaik 3 1,2,3 School of Computer Engineering, KIIT University

More information

Stages of Data Processing

Stages of Data Processing Data processing can be understood as the conversion of raw data into a meaningful and desired form. Basically, producing information that can be understood by the end user. So then, the question arises,

More information

Building High Performance Apps using NoSQL. Swami Sivasubramanian General Manager, AWS NoSQL

Building High Performance Apps using NoSQL. Swami Sivasubramanian General Manager, AWS NoSQL Building High Performance Apps using NoSQL Swami Sivasubramanian General Manager, AWS NoSQL Building high performance apps There is a lot to building high performance apps Scalability Performance at high

More information

Cassandra Design Patterns

Cassandra Design Patterns Cassandra Design Patterns Sanjay Sharma Chapter No. 1 "An Overview of Architecture and Data Modeling in Cassandra" In this package, you will find: A Biography of the author of the book A preview chapter

More information

Accelerate MySQL for Demanding OLAP and OLTP Use Cases with Apache Ignite. Peter Zaitsev, Denis Magda Santa Clara, California April 25th, 2017

Accelerate MySQL for Demanding OLAP and OLTP Use Cases with Apache Ignite. Peter Zaitsev, Denis Magda Santa Clara, California April 25th, 2017 Accelerate MySQL for Demanding OLAP and OLTP Use Cases with Apache Ignite Peter Zaitsev, Denis Magda Santa Clara, California April 25th, 2017 About the Presentation Problems Existing Solutions Denis Magda

More information

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017)

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017) Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017) Week 10: Mutable State (1/2) March 14, 2017 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo These

More information

10 Million Smart Meter Data with Apache HBase

10 Million Smart Meter Data with Apache HBase 10 Million Smart Meter Data with Apache HBase 5/31/2017 OSS Solution Center Hitachi, Ltd. Masahiro Ito OSS Summit Japan 2017 Who am I? Masahiro Ito ( 伊藤雅博 ) Software Engineer at Hitachi, Ltd. Focus on

More information

Things Every Oracle DBA Needs to Know about the Hadoop Ecosystem. Zohar Elkayam

Things Every Oracle DBA Needs to Know about the Hadoop Ecosystem. Zohar Elkayam Things Every Oracle DBA Needs to Know about the Hadoop Ecosystem Zohar Elkayam www.realdbamagic.com Twitter: @realmgic Who am I? Zohar Elkayam, CTO at Brillix Programmer, DBA, team leader, database trainer,

More information

Hadoop Development Introduction

Hadoop Development Introduction Hadoop Development Introduction What is Bigdata? Evolution of Bigdata Types of Data and their Significance Need for Bigdata Analytics Why Bigdata with Hadoop? History of Hadoop Why Hadoop is in demand

More information

MODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS

MODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS MODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS SUJEE MANIYAM FOUNDER / PRINCIPAL @ ELEPHANT SCALE www.elephantscale.com sujee@elephantscale.com HI, I M SUJEE MANIYAM Founder / Principal @ ElephantScale

More information

We are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info

We are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info We are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info START DATE : TIMINGS : DURATION : TYPE OF BATCH : FEE : FACULTY NAME : LAB TIMINGS : PH NO: 9963799240, 040-40025423

More information

Databases and Big Data Today. CS634 Class 22

Databases and Big Data Today. CS634 Class 22 Databases and Big Data Today CS634 Class 22 Current types of Databases SQL using relational tables: still very important! NoSQL, i.e., not using relational tables: term NoSQL popular since about 2007.

More information

Scaling. Marty Weiner Grayskull, Eternia. Yashh Nelapati Gotham City

Scaling. Marty Weiner Grayskull, Eternia. Yashh Nelapati Gotham City Scaling Marty Weiner Grayskull, Eternia Yashh Nelapati Gotham City Pinterest is... An online pinboard to organize and share what inspires you. Relationships Marty Weiner Grayskull, Eternia Yashh Nelapati

More information

MapReduce, Apache Hadoop

MapReduce, Apache Hadoop Czech Technical University in Prague, Faculty of Informaon Technology MIE-PDB: Advanced Database Systems hp://www.ksi.mff.cuni.cz/~svoboda/courses/2016-2-mie-pdb/ Lecture 12 MapReduce, Apache Hadoop Marn

More information

DEMYSTIFYING BIG DATA WITH RIAK USE CASES. Martin Schneider Basho Technologies!

DEMYSTIFYING BIG DATA WITH RIAK USE CASES. Martin Schneider Basho Technologies! DEMYSTIFYING BIG DATA WITH RIAK USE CASES Martin Schneider Basho Technologies! Agenda Defining Big Data in Regards to Riak A Series of Trade-Offs Use Cases Q & A About Basho & Riak Basho Technologies is

More information

What is the maximum file size you have dealt so far? Movies/Files/Streaming video that you have used? What have you observed?

What is the maximum file size you have dealt so far? Movies/Files/Streaming video that you have used? What have you observed? Simple to start What is the maximum file size you have dealt so far? Movies/Files/Streaming video that you have used? What have you observed? What is the maximum download speed you get? Simple computation

More information

BIG DATA TECHNOLOGIES: WHAT EVERY MANAGER NEEDS TO KNOW ANALYTICS AND FINANCIAL INNOVATION CONFERENCE JUNE 26-29,

BIG DATA TECHNOLOGIES: WHAT EVERY MANAGER NEEDS TO KNOW ANALYTICS AND FINANCIAL INNOVATION CONFERENCE JUNE 26-29, BIG DATA TECHNOLOGIES: WHAT EVERY MANAGER NEEDS TO KNOW ANALYTICS AND FINANCIAL INNOVATION CONFERENCE JUNE 26-29, 2016 1 OBJECTIVES ANALYTICS AND FINANCIAL INNOVATION CONFERENCE JUNE 26-29, 2016 2 WHAT

More information

Perspectives on NoSQL

Perspectives on NoSQL Perspectives on NoSQL PGCon 2010 Gavin M. Roy What is NoSQL? NoSQL is a movement promoting a loosely defined class of nonrelational data stores that break with a long history of relational

More information

CS-580K/480K Advanced Topics in Cloud Computing. NoSQL Database

CS-580K/480K Advanced Topics in Cloud Computing. NoSQL Database CS-580K/480K dvanced Topics in Cloud Computing NoSQL Database 1 1 Where are we? Cloud latforms 2 VM1 VM2 VM3 3 Operating System 4 1 2 3 Operating System 4 1 2 Virtualization Layer 3 Operating System 4

More information

relational Relational to Riak Why Move From Relational to Riak? Introduction High Availability Riak At-a-Glance

relational Relational to Riak Why Move From Relational to Riak? Introduction High Availability Riak At-a-Glance WHITEPAPER Relational to Riak relational Introduction This whitepaper looks at why companies choose Riak over a relational database. We focus specifically on availability, scalability, and the / data model.

More information

Big Data Management and NoSQL Databases

Big Data Management and NoSQL Databases NDBI040 Big Data Management and NoSQL Databases Lecture 1. Introduction Doc. RNDr. Irena Holubova, Ph.D. holubova@ksi.mff.cuni.cz http://www.ksi.mff.cuni.cz/~holubova/ndbi040/ What is Big Data? buzzword?

More information

A NoSQL Introduction for Relational Database Developers. Andrew Karcher Las Vegas SQL Saturday September 12th, 2015

A NoSQL Introduction for Relational Database Developers. Andrew Karcher Las Vegas SQL Saturday September 12th, 2015 A NoSQL Introduction for Relational Database Developers Andrew Karcher Las Vegas SQL Saturday September 12th, 2015 About Me http://www.andrewkarcher.com Twitter: @akarcher LinkedIn, Twitter Email: akarcher@gmail.com

More information

Configuring and Deploying Hadoop Cluster Deployment Templates

Configuring and Deploying Hadoop Cluster Deployment Templates Configuring and Deploying Hadoop Cluster Deployment Templates This chapter contains the following sections: Hadoop Cluster Profile Templates, on page 1 Creating a Hadoop Cluster Profile Template, on page

More information

Spotify. Scaling storage to million of users world wide. Jimmy Mårdell October 14, 2014

Spotify. Scaling storage to million of users world wide. Jimmy Mårdell October 14, 2014 Cassandra @ Spotify Scaling storage to million of users world wide! Jimmy Mårdell October 14, 2014 2 About me Jimmy Mårdell Tech Product Owner in the Cassandra team 4 years at Spotify

More information

Why NoSQL? Why Riak?

Why NoSQL? Why Riak? Why NoSQL? Why Riak? Justin Sheehy justin@basho.com 1 What's all of this NoSQL nonsense? Riak Voldemort HBase MongoDB Neo4j Cassandra CouchDB Membase Redis (and the list goes on...) 2 What went wrong with

More information

Introduction to NoSQL by William McKnight

Introduction to NoSQL by William McKnight Introduction to NoSQL by William McKnight All rights reserved. Reproduction in whole or part prohibited except by written permission. Product and company names mentioned herein may be trademarks of their

More information

Databases : Lecture 1 2: Beyond ACID/Relational databases Timothy G. Griffin Lent Term Apologies to Martin Fowler ( NoSQL Distilled )

Databases : Lecture 1 2: Beyond ACID/Relational databases Timothy G. Griffin Lent Term Apologies to Martin Fowler ( NoSQL Distilled ) Databases : Lecture 1 2: Beyond ACID/Relational databases Timothy G. Griffin Lent Term 2016 Rise of Web and cluster-based computing NoSQL Movement Relationships vs. Aggregates Key-value store XML or JSON

More information

Distributed Databases: SQL vs NoSQL

Distributed Databases: SQL vs NoSQL Distributed Databases: SQL vs NoSQL Seda Unal, Yuchen Zheng April 23, 2017 1 Introduction Distributed databases have become increasingly popular in the era of big data because of their advantages over

More information

/ Cloud Computing. Recitation 8 October 18, 2016

/ Cloud Computing. Recitation 8 October 18, 2016 15-319 / 15-619 Cloud Computing Recitation 8 October 18, 2016 1 Overview Administrative issues Office Hours, Piazza guidelines Last week s reflection Project 3.2, OLI Unit 3, Module 13, Quiz 6 This week

More information

Top 25 Hadoop Admin Interview Questions and Answers

Top 25 Hadoop Admin Interview Questions and Answers Top 25 Hadoop Admin Interview Questions and Answers 1) What daemons are needed to run a Hadoop cluster? DataNode, NameNode, TaskTracker, and JobTracker are required to run Hadoop cluster. 2) Which OS are

More information

When, Where & Why to Use NoSQL?

When, Where & Why to Use NoSQL? When, Where & Why to Use NoSQL? 1 Big data is becoming a big challenge for enterprises. Many organizations have built environments for transactional data with Relational Database Management Systems (RDBMS),

More information