An Introduction to NoSQL Databases
Terry McCann @SQLshark
Purpose of this presentation
A data scientist or data engineer needs the right tool for the right job. We will look at an overview of the major types of NoSQL database and why we might want to use them.
What is NoSQL? Where did the term come from?
NoSQL != A database that does not use SQL
NoSQL != Not only SQL
NoSQL was an accident - but it has stuck.
http://martinfowler.com/bliki/nosqldefinition.html
What is NoSQL? What are NoSQL databases?
> Not using the relational model (nor the SQL language)
> Open source
> Designed to run on large clusters
> Based on the needs of 21st century web properties
> No schema, allowing fields to be added to any record without controls (schema-on-read)
http://martinfowler.com/bliki/nosqldefinition.html
A history of databases
Databases have been around since the 1960s.
Hierarchical: 1960-1980
Relational: 1980-2016+
NoSQL: 2000+
What is NoSQL? - Relational databases
A short history of relational databases
- E. F. Codd, father of relational theory
- "A Relational Model of Data for Large Shared Data Banks" (1970)
- Relational model/theory further developed by Codd and C. J. Date
What is NoSQL? - Relational databases
Codd devised 12 rules (there are 13) for how a relational database should work.
Any idea why there are actually 13 rules?
https://en.wikipedia.org/wiki/codd%27s_12_rules
What is NoSQL? - Relational databases
Database building blocks:
Database - Logical structure
Table - Holds data (columns and rows)
Columns - Attributes about that data
Rows - A single set of attributes
Cell - Where a row and column intersect
What is NoSQL? - Relational databases 1. Entity 2. Relationship 3. Diagram 4. ERD
What is NoSQL? - Relational databases
Normal forms, devised by Codd:
1st - Eliminate repeating groups, create a table for each set of related data, add a primary key
2nd - Every non-key attribute should depend on the whole of its table's primary key
3rd - No transitive dependencies: non-key attributes depend only on the key, not on each other
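To make the normal forms concrete, here is a minimal sketch using Python's standard-library sqlite3 module. The tables (city, customer, orders) and the sample rows are hypothetical and are not from the slides. Country depends on city rather than on the customer, so it lives in its own table, and reading an order back with its city needs two joins.

```python
import sqlite3


def build_normalised_db():
    """Create a small in-memory schema in (roughly) third normal form."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- country depends on the city, not on the customer,
        -- so it is stored once per city (no transitive dependency)
        CREATE TABLE city (
            city_id   INTEGER PRIMARY KEY,
            city_name TEXT NOT NULL,
            country   TEXT NOT NULL
        );
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            city_id     INTEGER NOT NULL REFERENCES city(city_id)
        );
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
            total       REAL NOT NULL
        );
        INSERT INTO city VALUES (1, 'Exeter', 'UK');
        INSERT INTO customer VALUES (1, 'Ada', 1), (2, 'Grace', 1);
        INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
    """)
    return conn


def orders_with_city(conn):
    """Reassemble one logical record from three tables."""
    return conn.execute("""
        SELECT o.order_id, c.name, ci.city_name, ci.country
        FROM orders AS o
        JOIN customer AS c ON c.customer_id = o.customer_id
        JOIN city AS ci ON ci.city_id = c.city_id
        ORDER BY o.order_id
    """).fetchall()
```

Calling orders_with_city(build_normalised_db()) returns one tuple per order. Each fact is stored once, at the cost of joining to read it back.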
What is NoSQL? - Relational databases 3rd normal form
What is NoSQL? - Relational databases
ACID compliance
A - Atomic
C - Consistent
I - Isolated
D - Durable
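As a rough illustration of the "Atomic" in ACID, the sketch below uses Python's standard-library sqlite3 module. The account table, the balances and the transfer function are hypothetical examples, not from the slides. A transfer either applies both updates or neither.

```python
import sqlite3


def make_bank():
    """Two hypothetical accounts; balances may never go negative."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        " id INTEGER PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Credit dst and debit src as one unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit
            # above is rolled back as well.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {account id: balance}."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

A transfer that would overdraw the source account returns False and leaves both balances exactly as they were.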
What is NoSQL? - Relational databases Commonly used for an integration layer
What is NoSQL? - Relational databases It has SQL
What is NoSQL? - Relational databases
SELECT - Which columns do you want - Required
FROM - Where are they coming from - Required
WHERE - Apply filters to limit rows - Optional
GROUP BY - Aggregate data - Optional
HAVING - Apply filters to limit groups - Optional
ORDER BY - Sort the data - Optional
What is NoSQL? - Relational databases
SELECT
SELECT BusinessEntityID, PersonType, NameStyle FROM Person.Person
- SELECT: what is being selected (the column names, separated by the column delimiter ",")
- FROM: from where, written as Schema.TableName
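The clauses can be seen working together in a small sketch using Python's standard-library sqlite3 module. Only the column names come from the slide. The sample rows are invented, and because SQLite has no Person schema the table is simply called Person here.

```python
import sqlite3


def make_person_table():
    """In-memory stand-in for Person.Person with invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Person ("
        " BusinessEntityID INTEGER PRIMARY KEY,"
        " PersonType TEXT,"
        " NameStyle INTEGER)"
    )
    conn.executemany(
        "INSERT INTO Person VALUES (?, ?, ?)",
        [(1, "EM", 0), (2, "EM", 0), (3, "SC", 0), (4, "IN", 0),
         (5, "IN", 0), (6, "IN", 0), (7, "SC", 1)],
    )
    return conn


def person_type_counts(conn, min_count=2):
    """Count people per PersonType, keeping only the larger groups."""
    sql = """
        SELECT PersonType, COUNT(*) AS people  -- which columns (required)
        FROM Person                            -- where from (required)
        WHERE NameStyle = 0                    -- filter rows (optional)
        GROUP BY PersonType                    -- aggregate (optional)
        HAVING COUNT(*) >= ?                   -- filter groups (optional)
        ORDER BY people DESC, PersonType       -- sort (optional)
    """
    return conn.execute(sql, (min_count,)).fetchall()
```

WHERE removes rows before grouping, while HAVING removes whole groups after aggregation.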
What is NoSQL? - Relational databases
Problems with relational databases
- 3NF means many joins
- Poor support for partitions (nodes)
- Not web scale
- Scale up, not out
- Impedance mismatch
What is NoSQL? NoSQL to the rescue... well, not really.
What is NoSQL?
NoSQL can be generalised using Brewer's CAP theorem. Pick 2:
- Consistency
- Availability
- Partition tolerance
What is NoSQL?
NoSQL databases come in 4 main types:
1. Key/Value
2. Document
3. Column-store
4. Graph
What is NoSQL? - Key/Value
- Keys and values
- JSON
- Schema-less / schema-on-read
- Used for caching
http://bigdata-blog.com/document-oriented-database
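The caching use case can be sketched with a plain Python dictionary plus an expiry time. The TTLCache class below is hypothetical and illustrates the access pattern (put and get by key, with an opaque value). It is not the API of any particular key/value product.

```python
import time


class TTLCache:
    """Toy key/value store whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._items = {}     # key -> (expires_at, value)

    def put(self, key, value):
        """Store value under key; the store never inspects the value."""
        self._items[key] = (self._clock() + self._ttl, value)

    def get(self, key, default=None):
        """Return the value for key, or default if missing or expired."""
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # drop stale entries lazily on read
            return default
        return value
```

The value is typically a serialised blob such as a JSON string. The store can look it up only by key, which is what keeps reads and writes cheap.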
What is NoSQL? - Document
- Keys and values (as a document)
- JSON
- Schema-less / schema-on-read
- Used for completed orders etc.
http://bigdata-blog.com/document-oriented-database
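Schema-on-read can be illustrated with two JSON order documents that do not share exactly the same fields. The documents and field names below are invented for illustration. The reading code, not the store, decides how to handle a field that is absent.

```python
import json

# Two hypothetical completed orders; only the second has a gift_note.
orders = [
    json.loads(
        '{"_id": 1, "customer": "Ada",'
        ' "lines": [{"sku": "A1", "qty": 2, "price": 5.0}]}'
    ),
    json.loads(
        '{"_id": 2, "customer": "Grace",'
        ' "lines": [{"sku": "B7", "qty": 1, "price": 12.5}],'
        ' "gift_note": "Happy birthday"}'
    ),
]


def order_total(doc):
    """Sum qty * price over the order lines embedded in the document."""
    return sum(line["qty"] * line["price"] for line in doc.get("lines", []))


def gift_notes(docs):
    """Collect gift notes from the documents that happen to have one."""
    return {doc["_id"]: doc["gift_note"] for doc in docs if "gift_note" in doc}
```

The whole order, including its lines, is read as one document, so no join is needed to rebuild it.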
What is NoSQL? - Column family
- Keys and column families
- Columns which are related (Customer, Product)
- Used for completed orders etc.
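One way to picture a column-family layout is as nested maps: row key, then column family, then column name. The ColumnFamilyTable class below is a hypothetical in-memory sketch of that shape only. It does not model how any real column-family database stores or distributes data.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy model: row key -> column family -> column name -> value."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        """Write one column; rows need not share the same columns."""
        self._rows[row_key][family][column] = value

    def get_family(self, row_key, family):
        """Read all related columns of one family for one row."""
        if row_key not in self._rows:
            return {}
        return dict(self._rows[row_key].get(family, {}))
```

For example, after table.put("order:1001", "Customer", "name", "Ada"), calling table.get_family("order:1001", "Customer") returns only the Customer columns for that row and leaves the Product family untouched.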
What is NoSQL? - Graph
- Nodes, edges, relationships
- K/V pairs
- Used for social graphs (e.g. LinkedIn)
- Really hard to do in a relational database
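The social-graph case can be sketched as a breadth-first walk over an adjacency map, answering "who is within N connections of this person?". The names and the within_degrees function are made up for illustration. In a relational database the same question typically needs one self-join per degree, or a recursive query.

```python
from collections import deque


def within_degrees(graph, start, max_depth):
    """Return {person: degree} for everyone within max_depth hops of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        depth = seen[person]
        if depth == max_depth:
            continue  # do not expand beyond the requested degree
        for neighbour in graph.get(person, ()):
            if neighbour not in seen:
                seen[neighbour] = depth + 1
                queue.append(neighbour)
    del seen[start]  # the starting person is not their own connection
    return seen


# Hypothetical connection graph: node -> set of directly connected nodes.
connections = {
    "ada": {"grace", "alan"},
    "grace": {"ada", "edsger"},
    "alan": {"ada"},
    "edsger": {"grace", "barbara"},
    "barbara": {"edsger"},
}
```

Here within_degrees(connections, "ada", 2) finds ada's direct connections and their connections, but not barbara, who is three hops away.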
What is NoSQL? What is the benefit?
Each database is designed from the ground up for a particular function. It is not multi-purpose.
What is NoSQL? Impedance mismatch
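Impedance mismatch is the gap between the shape of objects in application code and the shape of rows in tables. The sketch below uses hypothetical Order and OrderLine classes to show one in-memory object being split into parent and child rows for a relational store, versus being serialised whole for a document store.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class OrderLine:
    sku: str
    qty: int


@dataclass
class Order:
    order_id: int
    customer: str
    lines: list = field(default_factory=list)


def to_rows(order):
    """Relational shape: one parent row plus one child row per line."""
    order_row = (order.order_id, order.customer)
    line_rows = [
        (order.order_id, line_no, line.sku, line.qty)
        for line_no, line in enumerate(order.lines, start=1)
    ]
    return order_row, line_rows


def to_document(order):
    """Document shape: the nested object is stored as one JSON value."""
    return json.dumps(asdict(order), sort_keys=True)
```

to_rows has to invent a line number and repeat the order id on every child row, and reading the order back means joining the two tables again. to_document keeps the nesting the application already has.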
What is NoSQL? Relational is DEAD, long live NoSQL... well, not really.
Hierarchical: 1960-1980
Relational: 1980-2016+
NoSQL: 2000+
What is NoSQL? Polyglot persistence
Pick the right database for the right job.
http://www.informit.com/articles/article.aspx?p=1930511&seqnum=2
What is NoSQL? Polyglot Persistence Martin Fowler
Right database for the right job:
0. Relational database
1. Key/Value
2. Document
3. Column-store
4. Graph
Questions?