4. Managing Big Data. Cloud Computing & Big Data MASTER ENGINYERIA INFORMÀTICA FIB/UPC. Fall Jordi Torres, UPC - BSC
1 4. Managing Big Data Cloud Computing & Big Data MASTER ENGINYERIA INFORMÀTICA FIB/UPC Fall Jordi Torres, UPC - BSC
2 These slides are only a presentation guide. We will discuss and debate additional concepts and ideas that come up during your participation (and we may skip part of the content). FEEL FREE TO PARTICIPATE!
3 Content: Motivation; HDFS and HBase; Alternatives to HDFS/HBase?; CAP Theorem; Case Study: Cassandra; Big Data AWS in our Desktop; Guest Lecture: AENEAS; HOMEWORK: AENEAS Hands-on
4 Relational DBs can't support everything. The relational DB has ruled for 2-3 decades (capabilities, implementations, ...). Main problem: scalability. Large data sets are growing too big for conventional storage and tools. [Figure: execution time of conventional systems versus data volume, from GBs to PBs]
5 Proposals. Proposal: Hadoop + which DB? [Figure: execution time versus data volume, from GBs to PBs]
6 Example: last.fm. Last.fm is an Internet radio and music community website that offers many services to its users, such as free music streams and downloads, music and event recommendations, personalized charts, and much more. It was founded in 2002. About 25 million people use Last.fm.
7 Example: last.fm. The millions of people who use Last.fm generate huge amounts of data that need to be processed.
8 Example: last.fm Source: Marc de Palol - 8
9 Example: last.fm Source: Marc de Palol - 9
10 Example: last.fm. One example of this is users transmitting information indicating which songs they are listening to (this is known as "scrobbling"). This data is processed and stored by Last.fm, so the user can access it directly (in the form of charts), and it is also used to make decisions about users' musical tastes and compatibility, and artist and track similarity. Source: Marc de Palol
11 Example: last.fm Source: Marc de Palol
12 Example: last.fm. [Diagram: data from users (userid, trackid, albumid, artistid) -> web/API -> web nodes -> ?] Idea: Marc de Palol
14 Example: last.fm. Input (data from users, 4 TB), for example: user1, track1, ...; user1, track2, ...; user2, track1, ... Output: track1, #users; track2, #users. How do we get from one to the other? Idea: Marc de Palol
15 Example: last.fm. The same input (4 TB of data from users) goes through HADOOP to produce the output (track1, #users; track2, #users). Idea: Marc de Palol
16 Example: last.fm Problem: a user can listen to this song several times!!!! 16
17 Example: last.fm Problem: a user can listen to this song several times!!!! we want to get rid of duplicates 17
18 Example: last.fm Problem: a user can listen to this song several times!!!! Do you want the code? 18
19 Example: last.fm Solution? mapper( ) { } reduce( ) { } Idea: Marc de Palol
20 Example: last.fm Mapper? mapper(facebookid, songid) { output(songid, facebookid) } Idea: Marc de Palol
21 Example: last.fm Reduce? mapper(facebookid, songid) { output(songid, facebookid) } reduce(songid, List facebookids) { } Idea: Marc de Palol
22 Example: last.fm Reduce: so let's add the user IDs into a set
mapper(facebookid, songid) {
    output(songid, facebookid)
}
reduce(songid, List facebookids) {
    Set uniqueusers = new Set()
    for (facebookid in facebookids) {
        uniqueusers.add(facebookid)
    }
    output(songid, uniqueusers.size())
}
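The pseudocode above can be tried out on a laptop. The following is a minimal Python sketch, not Hadoop code: `run_job` is a hypothetical helper that imitates the map, shuffle, and reduce phases in a single process, and the sample records are invented for illustration.

```python
from collections import defaultdict

def mapper(user_id, track_id):
    """Emit (track, user) so that the shuffle groups listeners by track."""
    yield track_id, user_id

def reducer(track_id, user_ids):
    """Count distinct listeners; the set discards repeated plays."""
    return track_id, len(set(user_ids))

def run_job(records):
    """Hypothetical single-process stand-in for map -> shuffle -> reduce."""
    groups = defaultdict(list)
    for user_id, track_id in records:
        for key, value in mapper(user_id, track_id):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())

# user1 plays track1 twice, but must be counted only once
plays = [("user1", "track1"), ("user1", "track1"),
         ("user1", "track2"), ("user2", "track1")]
unique_listeners = run_job(plays)
print(unique_listeners)  # {'track1': 2, 'track2': 1}
```

On a real cluster the grouping step is done by the framework, and the set in the reducer only has to hold the listeners of one track at a time.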
23 Example: last.fm. Another important example for last.fm: log processing. Input: log files. Assume each server could produce 10 GB of logs per day, and assume 500 servers. Output: we need to know how many distinct IP addresses accessed the cluster of web servers. Problem: size! 500 x 10 GB = 5 TB of logs per day. Technology and algorithm required? Information from: Marc de Palol
24 Example: last.fm. Log processing (same problem statement). Technology and algorithm required? You already have the answer. Information from: Marc de Palol
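The answer is the same map/reduce pattern as before, with the IP address as the key. The sketch below is an illustration under an assumption that is not in the slides: each log line starts with the client IP address, as in common web server log formats. The sample lines are invented.

```python
def map_line(line):
    """Emit (ip, 1); assumes the client IP is the first field of the line."""
    fields = line.split()
    if fields:
        yield fields[0], 1

def count_distinct_ips(lines):
    """After the shuffle each distinct IP is one key, so we count the keys."""
    keys = set()
    for line in lines:
        for ip, _ in map_line(line):
            keys.add(ip)
    return len(keys)

# Daily volume from the slide: 500 servers x 10 GB each
daily_volume_gb = 500 * 10

sample_log = [
    '10.0.0.1 - - "GET /music HTTP/1.1" 200',
    '10.0.0.2 - - "GET /charts HTTP/1.1" 200',
    '10.0.0.1 - - "GET /events HTTP/1.1" 200',
]
distinct = count_distinct_ips(sample_log)
print(daily_volume_gb, distinct)  # 5000 2
```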
25 Reminder!!!! Source: CLOUDERA
26 Reminder!!!! Source: CLOUDERA
27 Reminder!!!! Source: CLOUDERA
28 Last.fm example: final words There were several reasons for adopting Hadoop: The distributed filesystem provided redundant backups for the data stored on it (e.g., web logs, user listening data) at no extra cost. Scalability was simplified through the ability to add cheap, commodity hardware when required. The cost was right (free) at a time when Last.fm had limited financial resources. The open source code and active community meant that Last.fm could freely modify Hadoop to add custom features and patches. Hadoop provided a flexible framework for running distributed computing algorithms with a relatively easy learning curve. 28
29 MapReduce: excellent, but what about its data requirements?
32 MapReduce: data requirements. In general, the data expected is not relational data. This data does not require a schema and may be unstructured. Instead, data is consumed in chunks, which are then divided among nodes and fed to the map phase as key-value pairs. The data must be available in a distributed fashion, to serve each processing node. [Diagram: MapReduce layer on top of a data layer]
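As an illustration of "chunks fed to the map phase as key-value pairs", the sketch below turns raw text into (byte offset, line) pairs, which is the shape Hadoop's default text input gives to a mapper, and then deals the pairs out to nodes. The round-robin assignment in `assign_to_nodes` is a simplification chosen for this example; real input splits follow block boundaries.

```python
def to_records(text):
    """Yield (byte_offset, line) pairs, one per input line."""
    offset = 0
    for line in text.splitlines(keepends=True):
        yield offset, line.rstrip("\r\n")
        offset += len(line.encode("utf-8"))

def assign_to_nodes(records, num_nodes):
    """Deal records out round-robin (a stand-in for real input splits)."""
    nodes = [[] for _ in range(num_nodes)]
    for i, record in enumerate(records):
        nodes[i % num_nodes].append(record)
    return nodes

raw = "user1,track1\nuser1,track2\nuser2,track1\n"
records = list(to_records(raw))
nodes = assign_to_nodes(records, 2)
print(records)  # [(0, 'user1,track1'), (13, 'user1,track2'), (26, 'user2,track1')]
```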
33 Parallel Databases: shared nothing, shared disk, shared memory. [Figure: the three architectures, drawn with processors, memory, and disks joined by an interconnect and a storage network]
34 MapReduce: data layer requirements The design and features of the data layer are important because they affect the ease with which data can be loaded and the results of computation extracted and searched. Map Reduce Data 34
36 HDFS: the standard storage mechanism for Hadoop is the Hadoop Distributed File System (HDFS).
37 Hadoop Distributed File System (HDFS). Fault tolerance: assuming that failure will happen allows HDFS to run on commodity hardware. Streaming data access: HDFS is written with batch processing in mind, and emphasizes high throughput rather than random access to data. Extreme scalability: HDFS will scale to petabytes (current versions). Portability: HDFS is portable across platforms.
38 Hadoop: standard storage mechanism, the Hadoop Distributed File System (HDFS). Most HDFS applications need a write-once-read-many access model for files. By assuming a file will remain unchanged after it is written, HDFS simplifies replication and speeds up data throughput. "Moving computation is cheaper than moving data" (locality of computation): due to data volume, it is often much faster to move the program near to the data, and HDFS has features to facilitate this.
39 HDFS: an example. A given file is broken down into blocks (default = 64 MB in the Hadoop versions covered here; 128 MB from Hadoop 2 onwards),
40 HDFS: an example. then the blocks are replicated across the cluster (default = 3 replicas).
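A quick back-of-the-envelope calculation makes the two defaults concrete. This is only arithmetic on the numbers quoted above; `hdfs_footprint` is a hypothetical helper, not an HDFS API. It relies on the fact that the last block of a file is not padded, so raw storage is simply the file size times the replication factor.

```python
import math

BLOCK_SIZE_MB = 64   # default block size quoted on the slide
REPLICATION = 3      # default replication factor quoted on the slide

def hdfs_footprint(file_size_mb, block_size_mb=BLOCK_SIZE_MB,
                   replication=REPLICATION):
    """Return how many blocks and block replicas a file turns into."""
    blocks = math.ceil(file_size_mb / block_size_mb)
    return {
        "blocks": blocks,
        "block_replicas": blocks * replication,
        "raw_storage_mb": file_size_mb * replication,
    }

footprint = hdfs_footprint(1000)
print(footprint)
# {'blocks': 16, 'block_replicas': 48, 'raw_storage_mb': 3000}
```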
41 MapReduce: resource management and scheduling. A given job is broken down into tasks, then tasks are scheduled to be as close to the data as possible. Optimized for batch processing. Failure recovery.
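"As close to the data as possible" can be sketched as a two-step preference: first a node that already stores a replica of the task's block, otherwise any node with a free slot. The function and data structures below are hypothetical and much simpler than the real Hadoop scheduler, which also distinguishes rack-local placement.

```python
def schedule(task_block, block_locations, free_slots):
    """Pick a node for the task that reads task_block.

    block_locations: block id -> list of nodes holding a replica
    free_slots: node -> number of free task slots (mutated on assignment)
    """
    # 1. Prefer a node that already stores the block (data-local)
    for node in block_locations.get(task_block, []):
        if free_slots.get(node, 0) > 0:
            free_slots[node] -= 1
            return node, "data-local"
    # 2. Otherwise accept any free node; the block travels over the network
    for node, slots in free_slots.items():
        if slots > 0:
            free_slots[node] -= 1
            return node, "remote"
    # 3. No capacity: the task has to wait
    return None, "wait"

block_locations = {"b1": ["n1", "n2", "n3"]}
free_slots = {"n1": 0, "n2": 1, "n4": 1}
first = schedule("b1", block_locations, free_slots)
second = schedule("b1", block_locations, free_slots)
third = schedule("b1", block_locations, free_slots)
print(first, second, third)
```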
42 Hadoop: standard storage mechanism Starting point / 42
43 Hadoop: standard storage mechanism. HDFS interface: similar to that of regular filesystems. It can only store and retrieve data, not index it, so simple random access to data is not possible. Solution: higher-level layers such as HBase have been created to provide finer-grained functionality to Hadoop deployments. [Diagram: MapReduce and HBase on top of HDFS]
44 HBase, the Hadoop database. HBase creates indexes and offers fast, random access to its content. It is modeled after Google's BigTable and uses HDFS as its storage system. It belongs to the NoSQL universe, similar to Cassandra, Hypertable, ... [Diagram: MapReduce and HBase on top of HDFS]
45 HBase versus HDFS (a brief comparison). HDFS is optimized for large files, sequential access (high throughput), and append-only writes. Use it for fact tables that are mostly append-only and require sequential full table scans. HBase is optimized for small records (but many of them), random access, and atomic record updates. Use it for dimension lookup tables that are updated frequently and require random low-latency lookups.
47 Alternatives to HBase/HDFS? Cassandra, an Apache project, originated at Facebook and is now in production at many large-scale websites (also at BSC). Hypertable was created at Zvents and spun out as an open source project. Both are scalable column-store databases that follow the pattern of BigTable, similar to HBase. [Diagram: MapReduce on top of Cassandra; MapReduce on top of Hypertable]
48 And dozens more: list of NoSQL databases [currently 150+]
49 NoSQL. The concept has gained momentum in recent years. Today it is a mature and efficient alternative that can help us solve problems of scalability and performance (e.g. online applications with thousands of concurrent users and millions of hits a day).
50 Different types of NoSQL systems. Distributed key-value systems: Amazon's S3 Key-Value Store (Dynamo), Voldemort (LinkedIn). Column-based systems: BigTable (Google), HBase, Cassandra. Document-based systems: CouchDB, MongoDB. Graph DBs.
51 DB data model Relational systems are the databases we've been using for a while now. RDBMSs and systems that support ACIDity and joins are considered relational. Key-value systems basically support get, put, and delete operations based on a primary key. Column-oriented systems still use tables but have no joins (joins must be handled within your application). Obviously, they store data by column as opposed to traditional row-oriented databases. This makes aggregations much easier. Document-oriented systems store structured "documents" such as JSON or XML but have no joins (joins must be handled within your application). It's very easy to map data from object-oriented software to these systems. 51
52 DB data model Relational systems are the databases we've been using for a while now. RDBMSs and systems that support ACIDity and joins are considered relational. Atomicity requires that each transaction is "all or nothing": if one part of the transaction fails, the entire transaction fails, and the database state is left unchanged. An atomic system must guarantee atomicity in each and every situation, including power failures, errors, and crashes. This guarantees that a transaction cannot be left in an incomplete state. Consistency ensures that any transaction will bring the database from one valid state to another. Any data written to the database must be valid according to all defined rules, including but not limited to constraints, cascades, triggers, and any combination thereof. Isolation refers to the requirement that no transaction should be able to interfere with another transaction. One way of achieving this is to ensure that no transactions that affect the same rows can run concurrently, since their sequence, and hence the outcome, might be unpredictable. This property of ACID is often partly relaxed due to the huge speed decrease this type of concurrency management entails. Durability means that once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors. In a relational database, for instance, once a group of SQL statements execute, the results need to be stored permanently. If the database crashes immediately thereafter, it should be possible to restore the database to the state after the last transaction committed. Source: wikipedia 52
53 The problems with relational DBs. RDBMSs scale up well in a single node (vertical scalability), but at a price. Apparent solution? Replication and caches. Vertical partitioning: different tables on different servers. Horizontal partitioning: rows of the same table on different servers. Good for fault tolerance, for sure. OK for many concurrent reads. Not much help with writes, if we want to keep ACID.
54 There's a reason: the CAP theorem. Source: Ricard Gavaldà. "Information Retrieval", Erasmus Mundus Master program on Data Mining and Knowledge Discovery
55 There's a reason: the CAP theorem. Source: Ricard Gavaldà. "Information Retrieval", Erasmus Mundus Master program on Data Mining and Knowledge Discovery
57 NoSQL and the CAP theorem. What about NoSQL scalability? Vertical: CPU, memory, ... (price!). Horizontal: more servers, and better fault tolerance of the global system.
58 The CAP theorem Source: Ricard Gavaldà. "Information Retrieval", Erasmus Mundus Master program on Data Mining and Knowledge Discovery 58
59 There's a reason: the CAP theorem. Source: Ricard Gavaldà. "Information Retrieval", Erasmus Mundus Master program on Data Mining and Knowledge Discovery
60 The CAP theorem: proof. Source: Ricard Gavaldà. "Information Retrieval", Erasmus Mundus Master program on Data Mining and Knowledge Discovery
61 The problem with RDBMS proof Source: Ricard Gavaldà. "Information Retrieval", Erasmus Mundus Master program on Data Mining and Knowledge Discovery 61
62 Scale-out requires partitions. IMPORTANT: a distributed system can only offer two of these three characteristics (consistency, availability, partition tolerance) simultaneously. Most large web-based systems choose availability over consistency.
63 CAP: choose two per operation. C = Consistency, A = Availability, P = Partition tolerance. CA: available and consistent, unless there is a partition. CP: always consistent, even in a partition, but a reachable replica may deny service without quorum. AP: a reachable replica provides service even in a partition, but may be inconsistent.
64 Visual Guide to NoSQL System Source: 64
65 Visual Guide to NoSQL System Source: 65
66 Consistent, Available (CA): systems have trouble with partitions and typically deal with them through replication. Examples of CA systems include: traditional RDBMSs like Postgres, MySQL, etc. (relational); Vertica (column-oriented); Aster Data (relational); Greenplum (relational). Source: Visual Guide to NoSQL Systems
67 Consistent, Partition-Tolerant (CP): systems have trouble with availability while keeping data consistent across partitioned nodes. Examples of CP systems include: BigTable (column-oriented/tabular); Hypertable (column-oriented/tabular); HBase (column-oriented/tabular); MongoDB (document-oriented); Terrastore (document-oriented); Redis (key-value); Scalaris (key-value); MemcacheDB (key-value); Berkeley DB (key-value). Source: Visual Guide to NoSQL Systems
68 Available, Partition-Tolerant (AP): systems achieve "eventual consistency" through replication and verification. Examples of AP systems include: Dynamo (key-value); Voldemort (key-value); Tokyo Cabinet (key-value); KAI (key-value); Cassandra (column-oriented/tabular); CouchDB (document-oriented); SimpleDB (document-oriented); Riak (document-oriented). Source: Visual Guide to NoSQL Systems
69 Eventual consistency: if no new updates occur for a while, all updates eventually propagate through the system and all the nodes become consistent. Eventually, a node is either updated or removed from service. It can be implemented with a gossip protocol. Amazon's Dynamo popularized this approach. Sometimes this is called BASE (Basically Available, Soft state, Eventual consistency), as opposed to ACID.
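To see how an update can "eventually propagate", here is a toy gossip simulation. It is a sketch under several assumptions that are not in the slides: each node holds a single (version, value) pair, every node syncs with one random peer per round, and the higher version wins. Real systems such as Dynamo use richer mechanisms (for example vector clocks) to reconcile replicas.

```python
import random

def gossip_round(states, rng):
    """Each node syncs with one random peer; the higher version wins."""
    n = len(states)
    for i in range(n):
        j = rng.choice([k for k in range(n) if k != i])
        newest = max(states[i], states[j])  # tuples compare by version first
        states[i] = states[j] = newest

def rounds_until_consistent(states, seed=0, max_rounds=1000):
    """Gossip until every node holds the same (version, value) pair."""
    rng = random.Random(seed)
    rounds = 0
    while len(set(states)) > 1:
        if rounds == max_rounds:
            raise RuntimeError("did not converge")
        gossip_round(states, rng)
        rounds += 1
    return rounds

# One node has accepted a new write; the other four are stale
states = [(2, "new"), (1, "old"), (1, "old"), (1, "old"), (1, "old")]
rounds = rounds_until_consistent(states)
print(rounds, states[0])
```

Between the write and the last round, a read served by a stale node returns "old"; that window is exactly what "eventual" means.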
70 NoSQL alternatives. The differences between NoSQL databases are much bigger than they ever were between one SQL database and another. This places a bigger responsibility on software architects to choose the appropriate one for a project right at the beginning.
72 Cassandra: main features. Cassandra does not support relationships between column families ("tables"), so there are no foreign keys or join operations. Knowing this, the best practice when designing a data model is to keep related data in the same column family. In this section we review only the main features of Cassandra, as an example.
73 Architecture. The architecture of Cassandra is completely decentralized and peer-to-peer, meaning all nodes in a Cassandra cluster are equivalent and provide the same functionality: they receive read and write requests, or forward them to other nodes. In short: a peer-to-peer distributed system, all nodes the same, partitioned data, custom data replication.
74 Partitioning. Cassandra implements automatic partitioning and replication mechanisms to decide which nodes are in charge of each replica. How? The PARTITIONER divides the data across the nodes in the cluster, and each node is responsible for a range of the overall data. Source: Juan Luis Pérez, researcher at BSC (EEDC 2012 master course)
75 Partitioning Node A Node B Node C Node D Source: Juan Luis Pérez researcher at BSC (EEDC 2012 master course) 75
76 Partitioning: the row key determines node placement. Example rows: raiser (name: john, pass: ****, url: icann.org); trucker (name: james, pass: ****, url: w3.org); dumper (name: maria, pass: ****); biker (name: linda, pass: ****)
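77 Partitioning: each node is assigned a range of the MD5 hash space. [Figure: four hash ranges, one per node; the boundary values are truncated in this transcript]
78 Partitioning: the MD5 hash of the row key selects the range, and therefore the node. Row keys: raiser, trucker, dumper, biker. MD5 hashes shown: 65236c..., a113f4..., d4ab [Figure: each row key placed in the range that contains its hash]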
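The idea of hash-range partitioning can be sketched in a few lines. The assumptions here go beyond the slides: the four nodes own equal quarters of the 128-bit MD5 space, and `token_for`, `owner_of_token`, and `owner_of` are hypothetical helper names. A production partitioner and its token assignment differ in the details.

```python
import hashlib
from bisect import bisect_right

NODES = ["Node A", "Node B", "Node C", "Node D"]
RING_SIZE = 2 ** 128  # MD5 digests are 128 bits long

# Assumption: node i owns [i * RING_SIZE / 4, (i + 1) * RING_SIZE / 4)
RANGE_STARTS = [i * RING_SIZE // len(NODES) for i in range(len(NODES))]

def token_for(row_key):
    """Map a row key to a position in the MD5 hash space."""
    return int(hashlib.md5(row_key.encode("utf-8")).hexdigest(), 16)

def owner_of_token(token):
    """Find the node whose range contains the token."""
    return NODES[bisect_right(RANGE_STARTS, token) - 1]

def owner_of(row_key):
    return owner_of_token(token_for(row_key))

for key in ["raiser", "trucker", "dumper", "biker"]:
    print(key, "->", owner_of(key))
```

Because the placement depends only on the key, any node can compute where a row lives without asking a central directory.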
83 Replication Remember: Cassandra implements automatic partitioning and replication mechanisms to decide which nodes are in charge of each replica The user only needs to configure the number of replicas and the system assigns each replica to a node in the cluster. 83
84 Replication. Cassandra stores multiple copies of rows on multiple nodes. Replication factor = number of replicas. Replica placement strategy: SimpleStrategy (default) or NetworkTopologyStrategy. Both the replication factor and the placement strategy are configurable.
85 Replication: SimpleStrategy. The first replica is determined by the partitioner. Additional replicas of the row are placed on the next nodes clockwise in the ring. [Figure: original row "raiser" on one node and a copy of row "raiser" on the next node]
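The clockwise rule is short enough to write down directly. This is a minimal sketch: `simple_strategy` is a hypothetical function, the ring is a plain list in clockwise order, and the index of the first replica is assumed to come from the partitioner.

```python
def simple_strategy(ring, primary_index, replication_factor):
    """Return the nodes holding a row: the primary plus the next nodes clockwise."""
    copies = min(replication_factor, len(ring))  # cannot exceed the ring size
    return [ring[(primary_index + i) % len(ring)] for i in range(copies)]

ring = ["Node A", "Node B", "Node C", "Node D"]
replicas = simple_strategy(ring, primary_index=2, replication_factor=3)
print(replicas)  # ['Node C', 'Node D', 'Node A']
```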
86 Replication: NetworkTopologyStrategy. Allows replication between different racks, whether the racks are in one data center or in multiple data centers, for reliability and performance. Other strategies also exist.
87 Consistency. The goal of current distributed key-value stores such as Cassandra is to serve read and write operations, just like any database system. However, while traditional databases provide strong consistency guarantees for replicated data by controlling the concurrent execution of transactions, Cassandra provides tunable consistency in order to favour scalability and availability.
88 Consistency. Data consistency is tunable by the user when queries are performed, so depending on the desired level of consistency, operations can either return as soon as possible or wait until a majority or all nodes respond. Tunable data consistency: choose between strong and eventual consistency, per operation (reads and writes).
89 Strategy for Read 89 89
90 Strategy for Writes 90 90
91 Strong/weak consistency? As can be derived from their descriptions, strong consistency can only be achieved when using the Quorum and All consistency levels. Operations that use weaker consistency levels, such as Zero, Any, and One, aren't guaranteed to read the most recent data. However, this weaker consistency gives flexibility to applications that can benefit from better performance and don't have strong consistency needs. Imagine your Facebook wall!
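A common way to reason about this is the overlap rule: a read is guaranteed to see the latest write when the replicas contacted by the read and by the write must share at least one node, that is, when R + W > N, with N the replication factor. The sketch below encodes that rule for the One, Quorum, and All levels only; the Zero and Any levels are left out, and the function names are hypothetical.

```python
def replicas_required(level, replication_factor):
    """How many replicas must acknowledge an operation at a given level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1  # strict majority
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {level}")

def is_strongly_consistent(write_level, read_level, replication_factor):
    """True when every read set overlaps every write set (R + W > N)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor

for write_level, read_level in [("QUORUM", "QUORUM"), ("ONE", "ONE"),
                                ("ALL", "ONE"), ("ONE", "ALL")]:
    print(write_level, read_level,
          is_strongly_consistent(write_level, read_level, 3))
```

With three replicas, Quorum writes plus Quorum reads overlap (2 + 2 > 3), while One plus One does not (1 + 1 < 3), which matches the slide's warning about weaker levels.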
92 Caching. Data is first written to a commit log for durability; the log is local to the node (for disaster recovery purposes). It is then written to an in-memory structure (memtable) on the node that stores the row, and finally to disk (SSTable) once the memtable is full. Data durability is assured. [Diagram: commit log and memtable, flushing to SSTable] Source: Juan Luis Pérez, researcher at BSC (EEDC 2012 master course)
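The write path described above can be mimicked with three in-memory structures. This is a teaching sketch, not Cassandra's implementation: the class `Node`, the tiny `memtable_limit`, and the use of Python lists and dicts in place of files are all assumptions made for the example.

```python
class Node:
    """Toy model of one node's write path: commit log -> memtable -> SSTable."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only log, stands in for a file on disk
        self.memtable = {}     # in-memory structure holding recent writes
        self.sstables = []     # flushed, never-modified tables (oldest first)
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. log first, for durability
        self.memtable[key] = value            # 2. then update memory
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                      # 3. memtable full: go to disk

    def flush(self):
        """Write the memtable out as a sorted table and start a fresh one."""
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}

    def read(self, key):
        """Check the memtable first, then SSTables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            if key in sstable:
                return sstable[key]
        return None

node = Node(memtable_limit=2)
node.write("a", 1)
node.write("b", 2)   # the memtable reaches its limit and is flushed
node.write("a", 3)   # the newer value lives in the memtable
print(node.read("a"), node.read("b"), len(node.sstables))  # 3 2 1
```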
94 What do we need? An Amazon AWS account, Eclipse for any platform, and 5 minutes. (Slides by Felipe Caicedo)
95 Instructions (I) 1. Open Eclipse and click on Help in the toolbar 2. Click on Install New Software
96 Instructions (II) 1. Enter the repository URL in the first input field, then click Add 2. Enter the name of the repository and click OK
97 Instructions (III) 1. Select all elements of the list (if needed) 2. Continue with the wizard
98 Instructions (IV) 1. To finish the wizard, accept the terms and conditions and click Finish 2. After the installation, restart Eclipse
99 Instructions (V) 1. A new icon should appear in the toolbar [image]; click on it 2. Click on Preferences to configure the Amazon AWS account
100 Instructions (VI) 1. Click on "Find your existing AWS security credentials" to configure the account 2. You should see a web page like the image below
101 Instructions (VII) 1. Copy the Access Key ID and the Secret Access Key (you can create a new access key if you don't have one)
102 Instructions (VIII) 1. Click on Add account, enter the name of the new account and, finally, enter the Access Key ID and the Secret Access Key copied previously 2. Click OK
103 Instructions (IX) 1. Click on Show AWS Explorer View 2. You should see a view like the image below (the position of this view depends on your Eclipse configuration)
104 Instructions (X) Clicking on any item shows the corresponding view
105 Instructions (XI) 1. Click on Amazon DynamoDB, for instance 2. You should see a view like this (with your tables, if created)
106 Instructions (Creating an AWS project) 1. File 2. New 3. Other 4. Select AWS Java Project
107 References: Download Eclipse AWS toolkit; Creating new AWS project. Thank you to Felipe Caicedo (FIB student) for producing these slides.
108 Content: Motivation; HDFS and HBase; Alternatives to HDFS/HBase?; CAP Theorem; Case Study: Cassandra; Big Data AWS in our Desktop; Guest Lecture: AENEAS (NEXT CLASS); HOMEWORK: AENEAS Hands-on
Advanced Database Technologies NoSQL: Not only SQL Christian Grün Database & Information Systems Group NoSQL Introduction 30, 40 years history of well-established database technology all in vain? Not at
More information10. Replication. Motivation
10. Replication Page 1 10. Replication Motivation Reliable and high-performance computation on a single instance of a data object is prone to failure. Replicate data to overcome single points of failure
More informationDatabase Availability and Integrity in NoSQL. Fahri Firdausillah [M ]
Database Availability and Integrity in NoSQL Fahri Firdausillah [M031010012] What is NoSQL Stands for Not Only SQL Mostly addressing some of the points: nonrelational, distributed, horizontal scalable,
More informationMapReduce programming model
MapReduce programming model technology basics for data scientists Spring - 2014 Jordi Torres, UPC - BSC www.jorditorres.eu @JordiTorresBCN Warning! Slides are only for presenta8on guide We will discuss+debate
More informationSCALABLE CONSISTENCY AND TRANSACTION MODELS
Data Management in the Cloud SCALABLE CONSISTENCY AND TRANSACTION MODELS 69 Brewer s Conjecture Three properties that are desirable and expected from realworld shared-data systems C: data consistency A:
More informationCMU SCS CMU SCS Who: What: When: Where: Why: CMU SCS
Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB s C. Faloutsos A. Pavlo Lecture#23: Distributed Database Systems (R&G ch. 22) Administrivia Final Exam Who: You What: R&G Chapters 15-22
More informationThe NoSQL Ecosystem. Adam Marcus MIT CSAIL
The NoSQL Ecosystem Adam Marcus MIT CSAIL marcua@csail.mit.edu / @marcua About Me Social Computing + Database Systems Easily Distracted: Wrote The NoSQL Ecosystem in The Architecture of Open Source Applications
More informationTransactions and ACID
Transactions and ACID Kevin Swingler Contents Recap of ACID transactions in RDBMSs Transactions and ACID in MongoDB 1 Concurrency Databases are almost always accessed by multiple users concurrently A user
More informationIntroduction to NoSQL
Introduction to NoSQL Agenda History What is NoSQL Types of NoSQL The CAP theorem History - RDBMS Relational DataBase Management Systems were invented in the 1970s. E. F. Codd, "Relational Model of Data
More informationVoldemort. Smruti R. Sarangi. Department of Computer Science Indian Institute of Technology New Delhi, India. Overview Design Evaluation
Voldemort Smruti R. Sarangi Department of Computer Science Indian Institute of Technology New Delhi, India Smruti R. Sarangi Leader Election 1/29 Outline 1 2 3 Smruti R. Sarangi Leader Election 2/29 Data
More informationTopics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples
Hadoop Introduction 1 Topics Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples 2 Big Data Analytics What is Big Data?
More informationHaridimos Kondylakis Computer Science Department, University of Crete
CS-562 Advanced Topics in Databases Haridimos Kondylakis Computer Science Department, University of Crete QSX (LN2) 2 NoSQL NoSQL: Not Only SQL. User case of NoSQL? Massive write performance. Fast key
More informationPresented by Sunnie S Chung CIS 612
By Yasin N. Silva, Arizona State University Presented by Sunnie S Chung CIS 612 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. See http://creativecommons.org/licenses/by-nc-sa/4.0/
More informationA BigData Tour HDFS, Ceph and MapReduce
A BigData Tour HDFS, Ceph and MapReduce These slides are possible thanks to these sources Jonathan Drusi - SCInet Toronto Hadoop Tutorial, Amir Payberah - Course in Data Intensive Computing SICS; Yahoo!
More informationData Management for Big Data Part 1
2018-04-09 2 Outline Today Part 1 Data Management for Big Data Part 1 Valentina Ivanova IDA, Linköping University RDBMS NoSQL NewSQL DBMS OLAP vs OLTP (ACID) NoSQL Concepts and Techniques Horizontal scalability
More informationNoSQL : A Panorama for Scalable Databases in Web
NoSQL : A Panorama for Scalable Databases in Web Jagjit Bhatia P.G. Dept of Computer Science,Hans Raj Mahila Maha Vidyalaya, Jalandhar Abstract- Various business applications deal with large amount of
More informationNOSQL Databases: The Need of Enterprises
International Journal of Allied Practice, Research and Review Website: www.ijaprr.com (ISSN 2350-1294) NOSQL Databases: The Need of Enterprises Basit Maqbool Mattu M-Tech CSE Student. (4 th semester).
More informationMapReduce and Friends
MapReduce and Friends Craig C. Douglas University of Wyoming with thanks to Mookwon Seo Why was it invented? MapReduce is a mergesort for large distributed memory computers. It was the basis for a web
More information5/2/16. Announcements. NoSQL Motivation. The New Hipster: NoSQL. Serverless. What is the Problem? Database Systems CSE 414
Announcements Database Systems CSE 414 Lecture 16: NoSQL and JSon Current assignments: Homework 4 due tonight Web Quiz 6 due next Wednesday [There is no Web Quiz 5 Today s lecture: JSon The book covers
More informationDistributed Databases: SQL vs NoSQL
Distributed Databases: SQL vs NoSQL Seda Unal, Yuchen Zheng April 23, 2017 1 Introduction Distributed databases have become increasingly popular in the era of big data because of their advantages over
More informationA Review Of Non Relational Databases, Their Types, Advantages And Disadvantages
A Review Of Non Relational Databases, Their Types, Advantages And Disadvantages Harpreet kaur, Jaspreet kaur, Kamaljit kaur Student of M.Tech(CSE) Student of M.Tech(CSE) Assit.Prof.in CSE deptt. Sri Guru
More informationAxway API Management 7.5.x Cassandra Best practices. #axway
Axway API Management 7.5.x Cassandra Best practices #axway Axway API Management 7.5.x Cassandra Best practices Agenda Apache Cassandra - Overview Apache Cassandra - Focus on consistency level Apache Cassandra
More informationDatabase Systems CSE 414
Database Systems CSE 414 Lecture 16: NoSQL and JSon CSE 414 - Spring 2016 1 Announcements Current assignments: Homework 4 due tonight Web Quiz 6 due next Wednesday [There is no Web Quiz 5] Today s lecture:
More informationIntroduction to Data Management CSE 344
Introduction to Data Management CSE 344 Lecture 26: Parallel Databases and MapReduce CSE 344 - Winter 2013 1 HW8 MapReduce (Hadoop) w/ declarative language (Pig) Cluster will run in Amazon s cloud (AWS)
More informationNoSQL Concepts, Techniques & Systems Part 1. Valentina Ivanova IDA, Linköping University
NoSQL Concepts, Techniques & Systems Part 1 Valentina Ivanova IDA, Linköping University 2017-03-20 2 Outline Today Part 1 RDBMS NoSQL NewSQL DBMS OLAP vs OLTP NoSQL Concepts and Techniques Horizontal scalability
More informationNext-Generation Cloud Platform
Next-Generation Cloud Platform Jangwoo Kim Jun 24, 2013 E-mail: jangwoo@postech.ac.kr High Performance Computing Lab Department of Computer Science & Engineering Pohang University of Science and Technology
More informationApache Hadoop Goes Realtime at Facebook. Himanshu Sharma
Apache Hadoop Goes Realtime at Facebook Guide - Dr. Sunny S. Chung Presented By- Anand K Singh Himanshu Sharma Index Problem with Current Stack Apache Hadoop and Hbase Zookeeper Applications of HBase at
More informationCassandra- A Distributed Database
Cassandra- A Distributed Database Tulika Gupta Department of Information Technology Poornima Institute of Engineering and Technology Jaipur, Rajasthan, India Abstract- A relational database is a traditional
More information10/18/2017. Announcements. NoSQL Motivation. NoSQL. Serverless Architecture. What is the Problem? Database Systems CSE 414
Announcements Database Systems CSE 414 Lecture 11: NoSQL & JSON (mostly not in textbook only Ch 11.1) HW5 will be posted on Friday and due on Nov. 14, 11pm [No Web Quiz 5] Today s lecture: NoSQL & JSON
More informationModule - 17 Lecture - 23 SQL and NoSQL systems. (Refer Slide Time: 00:04)
Introduction to Morden Application Development Dr. Gaurav Raina Prof. Tanmai Gopal Department of Computer Science and Engineering Indian Institute of Technology, Madras Module - 17 Lecture - 23 SQL and
More informationIntroduction to Database Services
Introduction to Database Services Shaun Pearce AWS Solutions Architect 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved Today s agenda Why managed database services? A non-relational
More informationDatabases and Big Data Today. CS634 Class 22
Databases and Big Data Today CS634 Class 22 Current types of Databases SQL using relational tables: still very important! NoSQL, i.e., not using relational tables: term NoSQL popular since about 2007.
More informationDATABASE DESIGN II - 1DL400
DATABASE DESIGN II - 1DL400 Fall 2016 A second course in database systems http://www.it.uu.se/research/group/udbl/kurser/dbii_ht16 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,
More informationIntroduction to Distributed Data Systems
Introduction to Distributed Data Systems Serge Abiteboul Ioana Manolescu Philippe Rigaux Marie-Christine Rousset Pierre Senellart Web Data Management and Distribution http://webdam.inria.fr/textbook January
More informationInternational Journal of Informative & Futuristic Research ISSN:
www.ijifr.com Volume 5 Issue 8 April 2018 International Journal of Informative & Futuristic Research ISSN: 2347-1697 TRANSITION FROM TRADITIONAL DATABASES TO NOSQL DATABASES Paper ID IJIFR/V5/ E8/ 010
More informationMongoDB and Mysql: Which one is a better fit for me? Room 204-2:20PM-3:10PM
MongoDB and Mysql: Which one is a better fit for me? Room 204-2:20PM-3:10PM About us Adamo Tonete MongoDB Support Engineer Agustín Gallego MySQL Support Engineer Agenda What are MongoDB and MySQL; NoSQL
More informationUsing space-filling curves for multidimensional
Using space-filling curves for multidimensional indexing Dr. Bisztray Dénes Senior Research Engineer 1 Nokia Solutions and Networks 2014 In medias res Performance problems with RDBMS Switch to NoSQL store
More informationBig Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2016)
Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2016) Week 10: Mutable State (1/2) March 15, 2016 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo These
More informationGetting to know. by Michelle Darling August 2013
Getting to know by Michelle Darling mdarlingcmt@gmail.com August 2013 Agenda: What is Cassandra? Installation, CQL3 Data Modelling Summary Only 15 min to cover these, so please hold questions til the end,
More informationCSE 344 JULY 9 TH NOSQL
CSE 344 JULY 9 TH NOSQL ADMINISTRATIVE MINUTIAE HW3 due Wednesday tests released actual_time should have 0s not NULLs upload new data file or use UPDATE to change 0 ~> NULL Extra OOs on Mondays 5-7pm in
More informationData Informatics. Seon Ho Kim, Ph.D.
Data Informatics Seon Ho Kim, Ph.D. seonkim@usc.edu HBase HBase is.. A distributed data store that can scale horizontally to 1,000s of commodity servers and petabytes of indexed storage. Designed to operate
More informationCSE-E5430 Scalable Cloud Computing Lecture 9
CSE-E5430 Scalable Cloud Computing Lecture 9 Keijo Heljanko Department of Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 15.11-2015 1/24 BigTable Described in the paper: Fay
More informationWhere We Are. Review: Parallel DBMS. Parallel DBMS. Introduction to Data Management CSE 344
Where We Are Introduction to Data Management CSE 344 Lecture 22: MapReduce We are talking about parallel query processing There exist two main types of engines: Parallel DBMSs (last lecture + quick review)
More informationTHE ATLAS DISTRIBUTED DATA MANAGEMENT SYSTEM & DATABASES
1 THE ATLAS DISTRIBUTED DATA MANAGEMENT SYSTEM & DATABASES Vincent Garonne, Mario Lassnig, Martin Barisits, Thomas Beermann, Ralph Vigne, Cedric Serfon Vincent.Garonne@cern.ch ph-adp-ddm-lab@cern.ch XLDB
More informationOutline. Introduction Background Use Cases Data Model & Query Language Architecture Conclusion
Outline Introduction Background Use Cases Data Model & Query Language Architecture Conclusion Cassandra Background What is Cassandra? Open-source database management system (DBMS) Several key features
More informationModern Database Concepts
Modern Database Concepts Basic Principles Doc. RNDr. Irena Holubova, Ph.D. holubova@ksi.mff.cuni.cz NoSQL Overview Main objective: to implement a distributed state Different objects stored on different
More informationDistributed Systems 16. Distributed File Systems II
Distributed Systems 16. Distributed File Systems II Paul Krzyzanowski pxk@cs.rutgers.edu 1 Review NFS RPC-based access AFS Long-term caching CODA Read/write replication & disconnected operation DFS AFS
More informationStages of Data Processing
Data processing can be understood as the conversion of raw data into a meaningful and desired form. Basically, producing information that can be understood by the end user. So then, the question arises,
More informationBIG DATA basics. November Cloud Computing & Big Data. FIB-UPC Master MEI
BIG DATA basics Cloud Computing & Big Data November- 2012 FIB-UPC Master MEI Content (Big Data part) A. Motivation B. Big Data Challenges C. Processing Big Data D. Big Data Storage E. Managing Big Data
More informationSpotify. Scaling storage to million of users world wide. Jimmy Mårdell October 14, 2014
Cassandra @ Spotify Scaling storage to million of users world wide! Jimmy Mårdell October 14, 2014 2 About me Jimmy Mårdell Tech Product Owner in the Cassandra team 4 years at Spotify
More informationParallel Programming Principle and Practice. Lecture 10 Big Data Processing with MapReduce
Parallel Programming Principle and Practice Lecture 10 Big Data Processing with MapReduce Outline MapReduce Programming Model MapReduce Examples Hadoop 2 Incredible Things That Happen Every Minute On The
More informationA Survey Paper on NoSQL Databases: Key-Value Data Stores and Document Stores
A Survey Paper on NoSQL Databases: Key-Value Data Stores and Document Stores Nikhil Dasharath Karande 1 Department of CSE, Sanjay Ghodawat Institutes, Atigre nikhilkarande18@gmail.com Abstract- This paper
More informationWebinar Series TMIP VISION
Webinar Series TMIP VISION TMIP provides technical support and promotes knowledge and information exchange in the transportation planning and modeling community. Today s Goals To Consider: Parallel Processing
More informationCS November 2017
Bigtable Highly available distributed storage Distributed Systems 18. Bigtable Built with semi-structured data in mind URLs: content, metadata, links, anchors, page rank User data: preferences, account
More informationBIG DATA TECHNOLOGIES: WHAT EVERY MANAGER NEEDS TO KNOW ANALYTICS AND FINANCIAL INNOVATION CONFERENCE JUNE 26-29,
BIG DATA TECHNOLOGIES: WHAT EVERY MANAGER NEEDS TO KNOW ANALYTICS AND FINANCIAL INNOVATION CONFERENCE JUNE 26-29, 2016 1 OBJECTIVES ANALYTICS AND FINANCIAL INNOVATION CONFERENCE JUNE 26-29, 2016 2 WHAT
More information5/1/17. Announcements. NoSQL Motivation. NoSQL. Serverless Architecture. What is the Problem? Database Systems CSE 414
Announcements Database Systems CSE 414 Lecture 15: NoSQL & JSON (mostly not in textbook only Ch 11.1) 1 Homework 4 due tomorrow night [No Web Quiz 5] Midterm grading hopefully finished tonight post online
More informationKey Value Store. Yiding Wang, Zhaoxiong Yang
Key Value Store Yiding Wang, Zhaoxiong Yang Outline Part 1 Definitions/Operations Compare with RDBMS Scale Up Part 2 Distributed Key Value Store Network Acceleration Definitions A key-value database, or
More informationBig Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017)
Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017) Week 10: Mutable State (1/2) March 14, 2017 Jimmy Lin David R. Cheriton School of Computer Science University of Waterloo These
More informationTools for Social Networking Infrastructures
Tools for Social Networking Infrastructures 1 Cassandra - a decentralised structured storage system Problem : Facebook Inbox Search hundreds of millions of users distributed infrastructure inbox changes
More informationWe are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info
We are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info START DATE : TIMINGS : DURATION : TYPE OF BATCH : FEE : FACULTY NAME : LAB TIMINGS : PH NO: 9963799240, 040-40025423
More informationZHT A Fast, Reliable and Scalable Zero- hop Distributed Hash Table
ZHT A Fast, Reliable and Scalable Zero- hop Distributed Hash Table 1 What is KVS? Why to use? Why not to use? Who s using it? Design issues A storage system A distributed hash table Spread simple structured
More informationColumn-Family Databases Cassandra and HBase
Column-Family Databases Cassandra and HBase Kevin Swingler Google Big Table Google invented BigTableto store the massive amounts of semi-structured data it was generating Basic model stores items indexed
More informationNon-Relational Databases. Pelle Jakovits
Non-Relational Databases Pelle Jakovits 25 October 2017 Outline Background Relational model Database scaling The NoSQL Movement CAP Theorem Non-relational data models Key-value Document-oriented Column
More information