Flexible Data Modeling in MariaDB

Similar documents
Flexible Data Modeling in MariaDB

MARIADB & JSON: FLEXIBLE DATA MODELING

Enterprise Open Source Databases

Total Cost of Ownership: Database Software and Support

JSON Support in MariaDB

Total Cost of Ownership: Database Software and Support

High availability with MariaDB TX: The definitive guide

Programming and Database Fundamentals for Data Scientists

SQL: Concepts. Todd Bacastow IST 210: Organization of Data 2/17/ IST 210

Introduction to Data Management CSE 344. Lecture 2: Data Models

1Z MySQL 5 Database Administrator Certified Professional Exam, Part II Exam.

CS121 MIDTERM REVIEW. CS121: Relational Databases Fall 2017 Lecture 13

SQL (and MySQL) Useful things I have learnt, borrowed and stolen

Covering indexes. Stéphane Combaudon - SQLI

SQL Functionality SQL. Creating Relation Schemas. Creating Relation Schemas

Jean-Marc Krikorian Strategic Alliance Director

EGCI 321: Database Systems. Dr. Tanasanee Phienthrakul

CS2300: File Structures and Introduction to Database Systems

Introduction to SQL on GRAHAM ED ARMSTRONG SHARCNET AUGUST 2018

Smart Data Center From Hitachi Vantara: Transform to an Agile, Learning Data Center

Connecting Legacy Equipment to the Industrial Internet of Things White Paper

Principles of Data Management

Basic SQL. Basic SQL. Basic SQL

Chapter 2 The relational Model of data. Relational model introduction

SQL: Data Definition Language

Constraints. Primary Key Foreign Key General table constraints Domain constraints Assertions Triggers. John Edgar 2

SQL Data Definition Language: Create and Change the Database Ray Lockwood

How to Use JSON in MySQL Wrong

SQL DDL II. CS121: Relational Databases Fall 2017 Lecture 8

DATABASE WRITEBACK. This document describes the process of setting up writeback to a database from a report within Cognos Connection.

5. SQL Query Syntax 1. Select Statement. 6. CPS: Database Schema

SQL DATA DEFINITION LANGUAGE

SQL DATA DEFINITION LANGUAGE

IBM DB2 UDB V7.1 Family Fundamentals.

ORACLE DATABASE 12C INTRODUCTION

BEGINNING T-SQL. Jen McCown MidnightSQL Consulting, LLC MinionWare, LLC

SQL DATA DEFINITION LANGUAGE

CMPT 354: Database System I. Lecture 2. Relational Model

GlobAl EDITION. Database Concepts SEVENTH EDITION. David M. Kroenke David J. Auer

Oral Questions and Answers (DBMS LAB) Questions & Answers- DBMS

T-SQL Training: T-SQL for SQL Server for Developers

The Top 20 Design Tips

user specifies what is wanted, not how to find it

Database Management

DATA AND SCHEMA MODIFICATIONS CHAPTERS 4,5 (6/E) CHAPTER 8 (5/E)

2.9 Table Creation. CREATE TABLE TableName ( AttrName AttrType, AttrName AttrType,... )

Information Systems Engineering. SQL Structured Query Language DDL Data Definition (sub)language

Where Are We? Next Few Lectures. Integrity Constraints Motivation. Constraints in E/R Diagrams. Keys in E/R Diagrams

Practical MySQL indexing guidelines

Abstract. The Challenges. ESG Lab Review InterSystems IRIS Data Platform: A Unified, Efficient Data Platform for Fast Business Insight

Announcements. Using Electronics in Class. Review. Staff Instructor: Alvin Cheung Office hour on Wednesdays, 1-2pm. Class Overview

MTAT Introduction to Databases

CS W Introduction to Databases Spring Computer Science Department Columbia University

A HOLISTIC APPROACH TO IDENTITY AND AUTHENTICATION. Establish Create Use Manage

Introduction to Computer Science and Business

Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications. Today's Party. Example Database. Faloutsos/Pavlo CMU /615

CS 327E Lecture 2. Shirley Cohen. January 27, 2016

Private Institute of Aga NETWORK DATABASE LECTURER NIYAZ M. SALIH

SQL Data Definition and Data Manipulation Languages (DDL and DML)

Creating Tables, Defining Constraints. Rose-Hulman Institute of Technology Curt Clifton

Workbooks (File) and Worksheet Handling

Introduction to Databases CSE 414. Lecture 2: Data Models

MySQL 5.0 Certification Study Guide

System Pages (Setup Admins Only)

Shine a Light on Dark Data with Vertica Flex Tables

Hitachi Enterprise Cloud Container Platform

Create a simple database with MySQL


BlackBerry Enterprise Identity

SELECT Product.name, Purchase.store FROM Product JOIN Purchase ON Product.name = Purchase.prodName

Code Centric: T-SQL Programming with Stored Procedures and Triggers

Borland InterBase and MySQL

Relational Query Languages. Preliminaries. Formal Relational Query Languages. Example Schema, with table contents. Relational Algebra

SQL OVERVIEW. CS121: Relational Databases Fall 2017 Lecture 4

Part 1: IoT demo Part 2: MySQL, JSON and Flexible storage

THE UNIVERSITY OF AUCKLAND

Chapter 4. Basic SQL. Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley

Tanium Asset User Guide. Version 1.3.1

The Relational Model of Data (ii)

DB Fundamentals Exam.

Waypoint Leasing. A Global Helicopter Leasing Company

10/18/2017. Announcements. NoSQL Motivation. NoSQL. Serverless Architecture. What is the Problem? Database Systems CSE 414

Basic SQL. Dr Fawaz Alarfaj. ACKNOWLEDGEMENT Slides are adopted from: Elmasri & Navathe, Fundamentals of Database Systems MySQL Documentation

Query Optimization, part 2: query plans in practice

MySQL Query Tuning 101. Sveta Smirnova, Alexander Rubin April, 16, 2015

SQL Fundamentals. Chapter 3. Class 03: SQL Fundamentals 1

Writing High Performance SQL Statements. Tim Sharp July 14, 2014

Referential Integrity and Other Table Constraints Ray Lockwood

Make the most of your Energy from Power Plant to Plug. Chris Simard, Regional Director Field Sales Group

B.H.GARDI COLLEGE OF MASTER OF COMPUTER APPLICATION. Ch. 1 :- Introduction Database Management System - 1

Databases 1. Defining Tables, Constraints

CSCI3030U Database Models

G64DBS Database Systems. Lecture 6 More SQL Data. Creating Relations. Column Constraints. Last Lecture s Question

DS Introduction to SQL Part 1 Single-Table Queries. By Michael Hahsler based on slides for CS145 Introduction to Databases (Stanford)

WHAT IS A DATABASE? There are at least six commonly known database types: flat, hierarchical, network, relational, dimensional, and object.

The Relational Model and SQL

Lab 4: Tables and Constraints

Comp 5311 Database Management Systems. 4b. Structured Query Language 3

Relational Algebra. Note: Slides are posted on the class website, protected by a password written on the board

Mobile MOUSe MTA DATABASE ADMINISTRATOR FUNDAMENTALS ONLINE COURSE OUTLINE

Transcription:

Flexible Data Modeling in MariaDB WHITE PAPER Part 1 Dynamic Columns

Table of Contents Introduction - Benefits and Limitations - Benefit - Limitations - Use Cases Dynamic Columns - Definitions - Create - Read - Indexes - Updates - Data Integrity Conclusion 1 3 3 4 4 7 8 9 10

Introduction With a history of proven reliability, relational databases are trusted to ensure the safety and integrity of data. In the digital age, where offline interactions become online interactions, data is fundamental to business success, powering mission-critical web, mobile and Internet of Things (IoT) applications and services. The safety and integrity of data has never been more important. However, the digital age is creating new challenges. It is leading to new use cases, requiring businesses to innovate faster and developers to iterate faster and diversity of data is expanding with increased use of semi-structured data in modern web and mobile applications. As a result, businesses had to decide between the reliability of relational databases and the flexibility of schemaless databases. When faced with this choice, many businesses chose to deploy multiple types of databases, increasing both the cost and complexity of their database infrastructure. With strict schema definitions and ACID transactions, relational databases ensure the safety and integrity of data. But, what if a relational database supported flexible schemas? What if a relational database could not only support semi-structured data, but could enable applications to extend or define the structure of data on demand, as needed? That s the power of MariaDB Server, ensuring data is safe for the business while at the same time supporting schema flexibility and semi-structured data for developers. With dynamic columns and JSON functions, flexible schemas and semi-structured data (e.g., JSON) can be used without sacrificing transactions and data integrity. With MariaDB, it is now possible to: Combine relational and semi-structured and/or JSON data Query semi-structured data and/or JSON data with SQL Modify semi-structured and/or JSON in an ACID transaction Extend the data model without modifying the schema first This white paper is the first in a two-part series on flexible data modeling for developers, architects and database administrators. Through sample data and queries, it explains how to create flexible schemas using dynamic columns. Part of this white paper series explains how to create flexible schemas using JSON functions. 1

Benefits and Limitations There are many benefits to using dynamic columns, for both the business and its developers. However, there are limitations as well. In order to benefit from flexible schemas, applications must share responsibility for managing the schema and the integrity of data. Benefits Faster time to market: develop new features without waiting for schema changes Easier development: build applications and features using simpler data models and queries More use cases: support modern and emerging use cases web, mobile and IoT Simpler infrastructure: use a single database for both legacy and modern applications Less risk: support flexible data models without sacrificing reliability, consistency and integrity Limitations Data models: complex, hierarchical data models are not supported with dynamic columns Data types: applications are responsible for managing the data types of dynamic columns Use Cases With dynamic columns, organizations in every industry from finance to retail can support modern use cases for mission-critical applications, or simplify existing use cases, by combining the flexibility of dynamic columns with the reliability of a relational database and transactions. Advertising: visitor profiles with varying attributes Finance: financial instruments with different properties Insurance: policies with unique declarations IoT: devices with many possible configurations Manufacturing: parts with different measurements Media and Entertainment: assets with different rendering and/or streaming properties Retail: products with different SKUs

Dynamic Columns When dynamic columns are used, different rows can have different columns. That s because every row contains a column and its value is a set of nested columns ideal for collapsing a one-to-one relationship within a single row. While every row has this column, every row can have a different value a different set of nested columns. The outer column is in the table definition. However, the inner columns are created by the application. id name format price attr 1 Tron Blu-ray 9.99 Rating Duration Aspect ratio PG-13 96.1:1 Foundation Paperback 7.99 Author Page count Isaac Asimov 96 Definitions To use dynamic columns, create a BLOB column for MariaDB Server to store dynamic column names and values (as name/value pairs). In the example, dynamic columns will be stored in the attr column with a NOT NULL constraint. However, a NOT NULL constraint is optional. For example, if dynamic columns were optional some rows have them, some do not. CREATE TABLE IF NOT EXISTS products ( id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, type VARCHAR(1) NOT NULL, name VARCHAR(40) NOT NULL, format VARCHAR(0) NOT NULL, price FLOAT(5, ) NOT NULL, attr BLOB NOT NULL); Example 1: Create a table with dynamic columns 3

Create To insert a row with dynamic columns, use COLUMN_CREATE with one or more name/value pairs in an INSERT statement. While the data type can be specified, it is optional. MariaDB Server can infer the data type of dynamic columns. In the example below, two products are inserted into the products_dc table: a book (Foundation) and movie (Tron). While products have a type, name, format and price, books have an author and a page count as well, while movies have a rating, duration and aspect ratio. INSERT INTO products (type, name, format, price, attr) VALUES ('M', 'Tron', 'Blu-ray', 9.99, COLUMN_CREATE('rating', 'PG-13', 'duration', 96, 'aspect_ratio', '.1:1')); INSERT INTO products (type, name, format, price, attr) VALUES ('B', 'Foundation', 'Paperback', 7.99, COLUMN_CREATE('author', 'Isaac Asimov', 'page_count', 96)); Example : Create rows with dynamic columns With the use of dynamic columns, rows for books can contain dynamic columns for the author and page count while rows for movies can contain dynamic columns for the rating, duration and aspect ratio and they are both stored in the same table. Read To read dynamic columns, use COLUMN_GET with the dynamic column name and data type. While the data type is not required to create a dynamic column, it is required to read one. In the example below, the query returns the name, format, price, rating, duration and aspect ratio of movies using COLUMN_GET for the three dynamic columns: rating, duration and aspect_ratio. SELECT name, format, price, COLUMN_GET(attr, 'rating' AS CHAR(5)) AS 'rating', COLUMN_GET(attr, 'duration' AS INTEGER) AS 'duration', COLUMN_GET(attr, 'aspect_ratio' AS CHAR(6)) AS 'aspect_ratio' WHERE type = 'M'; name format price rating duration aspect_ratio Tron Blu-ray 9.99 PG-13 96.1:1 Example 3: Read dynamic columns in a query (movies) 4

Reference: data types BINARY[(n)] CHAR[(n)] DECIMAL[(m[, d])] DOUBLE[m, d] INTEGER SIGNED [INTEGER] UNSIGNED [INTEGER] DATE DATETIME[(d)] TIME[D] In the example below, the query returns the name, format, price, author and page count of books using COLUMN_GET for the two dynamic columns: author, page_count. SELECT name, format, price, COLUMN_GET(attr, 'author' AS CHAR(0)) AS 'rating', COLUMN_GET(attr, 'page_count' AS INTEGER) AS 'page_count' WHERE type = 'B'; name format price author page_count Foundation Paperback 7.99 Isaac Asimov 96 Example 4: Read dynamic columns in a query (books) In addition, COLUMN_GET can be used in the WHERE clause. SELECT name, format, price WHERE COLUMN_GET(attr, 'rating' AS CHAR(5)) = 'PG-13'; name Tron format Blu-ray price 9.99 Example 5: Evaluate dynamic columns in the WHERE clause of a query 5

Note: Reading a dynamic column that does not exist In the example below, the query returns the name, format, price, rating, duration and aspect ratio of all products, including books. If the dynamic column does not exist, COLUMN_GET will return NULL, not an error. SELECT name, format, price, COLUMN_GET(attr, 'rating' AS CHAR(5)) AS 'rating', COLUMN_GET(attr, 'duration' AS INTEGER) AS 'duration', COLUMN_GET(attr, 'aspect_ratio' AS CHAR(6)) AS 'aspect_ratio' ; name format price rating duration aspect_ratio Tron Blu-ray 9.99 PG-13 96.1:1 Foundation Paperback 7.99 NULL NULL NULL Example 6: Read dynamic columns that do not exist in a query Tip: Reading dynamic columns as JSON objects To return dynamic columns as a JSON object, use COLUMN_JSON. The dynamic columns will be returned as name/value pairs (i.e., fields) within a JSON object. SELECT type, name, format, price, COLUMN_JSON(attr) AS 'attributes' ; id type name format price attributes 1 M Tron Blu-ray 9.99 { "rating": "PG-13", "duration": 96, "aspect_ratio": ".1:1" } B 7.99 NULL NULL { "author": "Isaac Asimov", "page_count": 96 Example 6: read dynamic columns that do not exist in a query} Example 7: Return dynamic columns as JSON objects in a query 6

Indexes It is not possible to create an index on a dynamic column. However, it is possible to create a virtual column (i.e., generated column) based on a dynamic column, and to index the virtual column. A virtual column can be PERSISTENT or VIRTUAL. If a virtual column is VIRTUAL, it is not stored in the database / written to disk. In the example below, a virtual column, rating, is created based on the rating dynamic column using COLUMN_ GET. ALTER TABLE products ADD COLUMN rating VARCHAR(0) AS (COLUMN_GET(attr, 'rating' AS CHAR(5))) VIRTUAL; EXPLAIN SELECT name, format, price WHERE rating = 'PG'; id select_type table ref possible_keys 1 SIMPLE products ALL NULL Example 8: Create a virtual column based on a dynamic column However, in the example above, per the explain plan, an index will not be used for the query because an index has not been created on the virtual column. In the example below, an index is created, ratings, on the rating virtual column, and per the explain plan, the query will use the index. CREATE INDEX ratings ON products(rating); EXPLAIN SELECT name, format, price WHERE rating = 'PG'; id select_type table ref possible_keys 1 SIMPLE products ALL ratings Example 9: Create an index on a virtual column based on a dynamic column 7

Updates To create a new dynamic column or update an existing dynamic column, use COLUMN_ADD with the dynamic column name and value. If the dynamic column exists, it will be updated. If it does not, it will be created. In the first example below, the rating dynamic column for Tron is updated. It is not PG-13, it is PG. In the second, the disks dynamic column is created for Tron. UPDATE products WHERE id = 1; SET attr = COLUMN_ADD(attr, 'rating', 'PG') SELECT name, format, price, COLUMN_GET(attr, 'rating' AS CHAR(5)) AS 'rating', COLUMN_GET(attr, 'duration' AS INTEGER) AS 'duration', COLUMN_GET(attr, 'aspect_ratio' AS CHAR(6)) AS 'aspect_ratio' WHERE id = 1; name format price rating duration aspect_ratio Tron Blu-ray 9.99 PG-13 96.1:1 Example 10: Update a dynamic column in an existing row UPDATE products SET attr = COLUMN_ADD(attr, 'disks', ) WHERE id = 1; SELECT name, format, price, COLUMN_GET(attr, 'disks' AS INTEGER) AS 'disks' WHERE id = 1; name format price disks Tron Blu-ray 9.99 PG-13 Example 11: Create a dynamic column in an existing row 8

Data Integrity While applications can create columns not defined in the table definition, data integrity can be enforced by the database using CHECK constraints, introduced in MariaDB Server 10., with dynamic column functions. In the example below, a CHECK constraint is created to ensure movies have a rating, duration and aspect ratio while books have an author and page count. And while it ensures these columns exist, it doesn t prevent the application from creating more. ALTER TABLE products ADD CONSTRAINT check_attr CHECK ( (type = 'M' and COLUMN_EXISTS(attr, 'rating') and COLUMN_EXISTS(attr, 'duration') and COLUMN_EXISTS(attr, 'aspect_ratio')) or (type = 'B' and COLUMN_EXISTS(attr, 'author') and COLUMN_EXISTS(attr, 'page_count'))); Example 1: Create a check constraint to require specific dynamic columns In the example below, the INSERT fails and returns an error because it was for a movie with a rating and duration, but without an aspect ratio - it violated the CHECK constraint. INSERT INTO products (type, name, format, price, attr) VALUES ('M', 'Tron', 'Blu-ray', 9.99, COLUMN_CREATE('rating', 'PG', 'duration', 96)); ERROR 405 (3000): CONSTRAINT 'check_attr' failed for 'test'.'products' Example 13: Violate a check constraint because a dynamic column is missing (movie) In the example below, the INSERT fails and returns an error because it was for a book with an author, but not page count - it violated the CHECK constraint. INSERT INTO products (type, name, format, price, attr) VALUES ('B', 'Foundation', 'Paperback', 7.99, COLUMN_CREATE('author', 'Isaac Asimov')); ERROR 405 (3000): CONSTRAINT 'check_attr' failed for 'test'.'products_' Example 14: Violate a check constraint because a dynamic column is missing (book) 9

Conclusion MariaDB Server, with dynamic columns and JSON functions, enables businesses to use a single database for both structured and semi-structured data, relational and JSON, together. It is no longer necessary to choose between data integrity and schema flexibility. When dynamic columns are used, a flexible schema can be created with a relational data model without sacrificing data integrity, transactions and SQL. MariaDB Server is a trusted, proven and reliable database, and with support for flexible schemas and semistructured data, including JSON, it meets the requirements of modern, mission-critical applications in the digital age - web, mobile and IoT. MariaDB Server helps business and developers save time and money by: Developing new features without waiting for schema changes Developing applications with simpler data models and queries Supporting current and future use cases - web, mobile and IoT Supporting both legacy and modern applications Supporting flexible data models without sacrificing data integrity Americas: sales-amer@mariadb.com Europe, Middle East, Africa: sales-emea@mariadb.com Asia Pacific: sales-apac@mariadb.com Copyright 017 MariaDB Corporation Ab, Tekniikantie 1, 0150 Espoo, Finland. MariaDB is a trademark or registered trademark of MariaDB Corporation.