COMP9321 Web Application Engineering


COMP9321 Web Application Engineering, Semester 2, 2015
Dr. Amin Beheshti, Service Oriented Computing Group, CSE, UNSW Australia
Week 6
http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411

We are Generating Vast Amounts of Data!!
Figure labels: remote patient monitoring; product sensors; healthcare; social media; manufacturing; books, music, videos, etc.; retail; real-time location data; digitalization of artefacts; location-based services.

- Airbus A380:
  o generates 10 TB every 30 minutes.
- Twitter:
  o generates approximately 12 TB of data per day.
- Facebook:
  o Facebook data grows by over 500 TB daily.
- New York Stock Exchange:
  o 1 TB of data every day.

Challenge: How do we store and access this data over the web?

E-commerce website
- Data operations are mainly transactions (reads and writes).
- Operations are mostly on-line.
- Response time should be quick, but it is important to maintain the security and reliability of transactions.
- ACID properties are important.
http://www.techtweet.org/

Image serving website
- Data operations are mainly fetching large files (reads).
- ACID requirements can be relaxed.
- Operations are mainly on-line.
- High bandwidth requirement.

Search website
- Data operations are mainly reading index files to answer queries (reads).
- ACID requirements can be relaxed.
- Index compilation is performed off-line due to the large size of the source data (the entire Web).
- Response times must be as fast as possible.

Persistence (Hibernate, pp.5-29)

- Persistence is the continuance of an effect after its cause is removed.
- In the context of storing data in a computer system, this means that the data survives after the process that created it has ended.
- In other words, for a data store to be considered persistent, it must write to non-volatile storage.

- Persistence is a fundamental concept in application development.
- In an object-oriented application, persistence allows an object to outlive the process that created it.
- The state of the object may be stored to disk, and an object with the same state re-created at some point in the future.
- Sometimes entire graphs of interconnected objects may be made persistent and later re-created in a new process.

- Not all objects are persistent:
  o some (transient objects) have a limited lifetime that is bounded by the life of the process that instantiated them.
- Almost all Java applications contain a mix of persistent and transient objects.
- This means we need a subsystem that manages our persistent objects.
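The idea of storing an object's state to disk and re-creating it later can be sketched with plain Java serialization. This is only an illustration of persistence in general, not the relational approach the rest of the lecture covers; the SavedCar class, its fields, and the temporary file are hypothetical.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical persistent class; the name and fields are illustrative only.
class SavedCar implements Serializable {
    private static final long serialVersionUID = 1L;
    final String make;
    final int year;

    SavedCar(String make, int year) {
        this.make = make;
        this.year = year;
    }
}

class PersistenceSketch {
    // Write the object's state to non-volatile storage (a file).
    static void save(SavedCar car, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(car);
        }
    }

    // Re-create an object with the same state; this could run in a later process.
    static SavedCar load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (SavedCar) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("car", ".ser");
        save(new SavedCar("Toyota", 2015), file);
        SavedCar restored = load(file);
        System.out.println(restored.make + " " + restored.year);
        Files.delete(file);
    }
}
```

The object passed to save is transient in the lecture's sense: it disappears when the process ends. Only the bytes in the file survive, and load builds a new object from them.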

Data Persistence (Hibernate, pp.5-29)

- When we talk about persistence in Java, we normally mean storing data in a relational database using SQL.
- Relational technology is a common denominator for many disparate systems and technology platforms.
- Relational technology provides a way of sharing data across different applications, or across technologies that form part of the same application.
- The relational data model is often the common enterprise-wide representation of business entities.

- When you work with a relational database in a Java application, the Java code issues SQL statements to the database via the JDBC API.
- The Java Database Connectivity (JDBC) API provides universal data access from the Java programming language.
- Using the JDBC API, you can access virtually any data source, from relational databases to spreadsheets and flat files.
- The JDBC API consists of two packages:
  o java.sql
  o javax.sql

Relational Databases (Hibernate, pp.5-29)
- Data is stored as a collection of tuples that group attributes, e.g. (student-id, name, birthdate, courses).
- Data is visualized as tables, where the tuples are the rows and the attributes form the columns.
- Tables can be related to each other through specific columns.
- Each row in a table has at least one unique attribute.

Structured Query Language (SQL)

Database Concepts

Accessing DB from an Application (JDBC)

Java DataBase Connectivity

JDBC Concepts (Barish, p.310)
- When developers use JDBC, they construct SQL statements that can be executed.
- A template-like query string such as
  SELECT name FROM employee WHERE age = ?
  can be combined with local data structures so that regular Java objects can be mapped to the bindings in the string.
- E.g., a java.lang.Integer object with the value 42 can be mapped to give:
  SELECT name FROM employee WHERE age = 42
- The results of execution, if any, are combined in a set returned to the caller. We can browse this result set as necessary.

JDBC Interfaces

Typical JDBC Scenario

PreparedStatement object
- A more realistic case is that the same kind of SQL statement is processed over and over (rather than a single static SQL statement).
- In a PreparedStatement, a placeholder (?) is bound to an incoming value before execution, so the statement does not need to be recompiled.
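A minimal sketch of binding a value to the placeholder, using the query string from the JDBC Concepts slide. It assumes the caller already holds an open java.sql.Connection to a database with an employee(name, age) table; the class and method names are hypothetical, and obtaining the connection (driver, URL, credentials) is left out.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class EmployeeQueries {
    // Template query; the ? placeholder is bound before each execution.
    static final String NAMES_BY_AGE = "SELECT name FROM employee WHERE age = ?";

    // Assumption: conn is an open connection to a database that contains
    // an employee(name, age) table.
    static List<String> namesByAge(Connection conn, int age) throws SQLException {
        List<String> names = new ArrayList<>();
        try (PreparedStatement stmt = conn.prepareStatement(NAMES_BY_AGE)) {
            stmt.setInt(1, age);                  // bind the Java value to the first ?
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {               // browse the result set row by row
                    names.add(rs.getString("name"));
                }
            }
        }
        return names;
    }
}
```

Calling namesByAge(conn, 42) corresponds to the bound query SELECT name FROM employee WHERE age = 42. Because the value is bound rather than concatenated into the string, the same statement text can be reused with different ages.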

Transaction Management
- By default, JDBC commits each update when you call executeUpdate().
- Committing after each update can be suboptimal in terms of performance.
- It is also not suitable if you want to manage a series of operations as a single logical operation (i.e., a transaction).
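One way to group several updates into a single transaction is to switch off auto-commit and commit or roll back explicitly. The sketch below assumes an open Connection and a hypothetical account(id, balance) table; the class, method, table, and column names are illustrative, not taken from the lecture.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class TransferSketch {
    // Assumption: an account(id, balance) table exists; names are hypothetical.
    static void transfer(Connection conn, int fromId, int toId, int amount) throws SQLException {
        boolean previous = conn.getAutoCommit();
        conn.setAutoCommit(false);                // stop committing after every update
        try (PreparedStatement debit = conn.prepareStatement(
                 "UPDATE account SET balance = balance - ? WHERE id = ?");
             PreparedStatement credit = conn.prepareStatement(
                 "UPDATE account SET balance = balance + ? WHERE id = ?")) {
            debit.setInt(1, amount);
            debit.setInt(2, fromId);
            debit.executeUpdate();
            credit.setInt(1, amount);
            credit.setInt(2, toId);
            credit.executeUpdate();
            conn.commit();                        // both updates take effect together
        } catch (SQLException e) {
            conn.rollback();                      // undo any partial work
            throw e;
        } finally {
            conn.setAutoCommit(previous);         // restore the caller's setting
        }
    }
}
```

If the second update fails, the rollback also undoes the first, so the two updates behave as one logical operation.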

Data Access Objects (DAO)
http://onewebsql.com/

Data Access Objects (DAO) Example: Cars Database
- DTO (Data Transfer Object): carries the actual data...

Object-Relational Impedance Mismatch Problems
https://docs.oracle.com/cd/e16162_01/user.1112/e17455/img/mismatch.gif

Impedance (or Paradigm) Mismatch Problem

Impedance (or Paradigm) Mismatch Problem: the problem of granularity (Hibernate, pp.5-29)
- Observation: classes in your OO-based model come in a range of different levels of granularity (coarse-grained entity classes like User, finer-grained classes like Address, simple String-valued properties like postcode).
- There are just two levels of granularity in an RDB: tables, and columns with scalar types (i.e., not as flexible as the Java type system).
- Sometimes one ends up forcing the less flexible representation upon the object model (e.g., a User class with properties like postcode and state).
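The granularity mismatch can be illustrated with a small sketch: the object model has a User that reuses a finer-grained Address class, while a row can only hold scalar columns. The column names and the toRow helper are hypothetical and exist only to show the flattening.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Finer-grained value class in the object model.
class Address {
    final String street;
    final String city;
    final String state;
    final String postcode;

    Address(String street, String city, String state, String postcode) {
        this.street = street;
        this.city = city;
        this.state = state;
        this.postcode = postcode;
    }
}

// Coarse-grained entity class that reuses Address.
class User {
    final String username;
    final Address address;

    User(String username, Address address) {
        this.username = username;
        this.address = address;
    }
}

class GranularitySketch {
    // Flatten a User into one row of a hypothetical USERS table. The table
    // offers only scalar columns, so Address no longer exists as a type.
    static Map<String, String> toRow(User user) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("USERNAME", user.username);
        row.put("ADDRESS_STREET", user.address.street);
        row.put("ADDRESS_CITY", user.address.city);
        row.put("ADDRESS_STATE", user.address.state);
        row.put("ADDRESS_POSTCODE", user.address.postcode);
        return row;
    }
}
```

Three levels in the object model (User, Address, String) collapse into two in the table (row, column), which is the pressure that pushes developers toward a flat User class.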

Impedance (or Paradigm) Mismatch Problem: the problem of subtypes (Hibernate, pp.5-29)

Impedance (or Paradigm) Mismatch Problem: the problem of identity (Hibernate, pp.5-29)
While on the subject of identity:
- Modern object persistence solutions recommend using a surrogate key.
- A surrogate key in a database is a unique identifier for either an entity in the modelled world or an object in the database.
- The surrogate key is not derived from application data, unlike a natural (or business) key, which is derived from application data.

Impedance (or Paradigm) Mismatch Problem: the problem of association (Hibernate, pp.5-29)

Impedance (or Paradigm) Mismatch Problem: the problem of object graph navigation (Hibernate, pp.5-29)

Impedance (or Paradigm) Mismatch Problem: the 1+N selects problem
- The N+1 query problem is a common performance issue.
- It arises when code loads a list of rows with one query, for example load_cats(), and then calls a per-row loader such as load_hats_for_cat($cat) inside a loop over the result.
- Each call issues its own query, so you will issue "N+1" queries when the code executes, where N is the number of cats.
https://secure.phabricator.com/book/phabcontrib/article/n_plus_one/
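The query count can be made concrete with a small simulation. The sketch below uses an in-memory stand-in for the database that simply counts how many queries it is asked to run; the class and method names are hypothetical Java counterparts of load_cats() and load_hats_for_cat(). The batched variant is a common remedy and is included for comparison only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for a database; every method call counts as one query.
class FakeCatDatabase {
    int queryCount = 0;
    private final Map<Integer, List<String>> hatsByCat = new HashMap<>();

    FakeCatDatabase(int cats) {
        for (int id = 1; id <= cats; id++) {
            hatsByCat.put(id, Collections.singletonList("hat-for-cat-" + id));
        }
    }

    // Stands for: SELECT id FROM cat
    List<Integer> loadCats() {
        queryCount++;
        return new ArrayList<>(hatsByCat.keySet());
    }

    // Stands for: SELECT * FROM hat WHERE cat_id = ?
    List<String> loadHatsForCat(int catId) {
        queryCount++;
        return hatsByCat.get(catId);
    }

    // Stands for: SELECT * FROM hat WHERE cat_id IN (...)
    Map<Integer, List<String>> loadHatsForCats(List<Integer> catIds) {
        queryCount++;
        Map<Integer, List<String>> result = new HashMap<>();
        for (int id : catIds) {
            result.put(id, hatsByCat.get(id));
        }
        return result;
    }
}

class NPlusOneSketch {
    // One query for the cats, then one query per cat: N+1 in total.
    static int naive(FakeCatDatabase db) {
        for (int cat : db.loadCats()) {
            db.loadHatsForCat(cat);
        }
        return db.queryCount;
    }

    // One query for the cats, one for all their hats: 2 in total.
    static int batched(FakeCatDatabase db) {
        db.loadHatsForCats(db.loadCats());
        return db.queryCount;
    }
}
```

With 10 cats, naive issues 11 queries while batched issues 2. Against a real database each query is a network round trip, so the difference grows with N.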

Impedance (or Paradigm) Mismatch Problem: the cost of mismatch problems (Hibernate, pp.5-29)
- The DAO pattern helps isolate the mismatch problems by separating the interfaces from the implementation, but someone (usually the application developers) still has to provide the implementation classes!
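The separation of interface from implementation can be sketched as follows. The names CarDTO, CarDAO, and InMemoryCarDAO are hypothetical and are not the classes from the lecture's Cars example; the in-memory implementation stands in for the JDBC-backed class a developer would otherwise have to write by hand.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// DTO: carries the actual data and contains no persistence logic.
class CarDTO {
    final int id;
    final String make;
    final String model;

    CarDTO(int id, String make, String model) {
        this.id = id;
        this.make = make;
        this.model = model;
    }
}

// DAO interface: the only persistence API the rest of the application sees.
interface CarDAO {
    void save(CarDTO car);
    Optional<CarDTO> findById(int id);
    List<CarDTO> findAll();
}

// One possible implementation. A JDBC-backed class would implement the same
// interface and hide the SQL and the row-to-object mapping behind it.
class InMemoryCarDAO implements CarDAO {
    private final Map<Integer, CarDTO> store = new LinkedHashMap<>();

    @Override
    public void save(CarDTO car) {
        store.put(car.id, car);
    }

    @Override
    public Optional<CarDTO> findById(int id) {
        return Optional.ofNullable(store.get(id));
    }

    @Override
    public List<CarDTO> findAll() {
        return new ArrayList<>(store.values());
    }
}
```

Application code depends only on CarDAO, so the storage technology can change without touching callers. The mapping code still has to be written for each real implementation, which is the cost the slide refers to.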

Object-Relational Mapping (ORM)

Hibernate

Continuing with the Cars example...

To use Hibernate, you need:
- Hibernate packages (hibernate*.jar)
- A set of mapping files (each between a table and an object)
- A Hibernate configuration file (e.g., database connection details)

Hibernate Example: see course material, week 6

NoSQL

What is NoSQL?
- Stands for No-SQL or Not Only SQL.
- A class of non-relational data storage systems, e.g. BigTable, Dynamo, PNUTS/Sherpa, ...
- These systems usually do not require a fixed table schema, nor do they use the concept of joins.
- Distributed data storage systems.
- All NoSQL offerings relax one or more of the ACID properties (we will talk about the CAP theorem).

NoSQL Data Storage Classification:
- Uninterpreted key/value, or "the big hash table":
  o Amazon S3 (Dynamo)
- Flexible schema:
  o BigTable, Cassandra, HBase (ordered keys, semi-structured data)
  o Sherpa/PNUTS (unordered keys, JSON)
  o MongoDB (based on JSON)
  o CouchDB (name/value in text)

CAP Theorem
Three properties of a system:
- Consistency (all copies have the same value)
- Availability (the system can run even if parts have failed)
  o Via replication.
- Partitions (the network can break into two or more parts, each with active systems that can't talk to the other parts)
Brewer's CAP Theorem: you can have at most two of these three properties for any system.
Very large systems will partition at some point.

Why NoSQL?
- NoSQL data storage systems make sense for applications that need to deal with very large volumes of semi-structured data, e.g. social networking feeds.
- Figure labels: share, comment, review, crowdsource, etc.

NoSQL databases:
- Employ less constrained consistency models.
- Support simple retrieval and appending operations.
- Offer significant performance benefits.
- Examples: key-value store, document store, graph database.

Graph Database
Figure labels: social network (user); collaborative filtering (Netflix, movie); probabilistic analysis; text analysis (docs, Wiki, words).

Graph Stores
- Use a graph structure:
  o Labeled, directed, attributed multi-graph
  o Label for each edge
  o Directed edges
  o Multiple attributes per node
  o Multiple edges between nodes
- Relational DBs can model graphs, but an edge requires a join, which is expensive.
- Example: Neo4j (neo4j.com/)
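The structure described above can be sketched as a small in-memory data structure. This is a hypothetical illustration of a labeled, directed, attributed multi-graph and is not Neo4j's API; all class and method names are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PropertyGraph {
    static class Node {
        final String id;
        final Map<String, Object> attributes = new HashMap<>(); // multiple attributes per node

        Node(String id) {
            this.id = id;
        }
    }

    static class Edge {
        final Node from;     // directed: from -> to
        final Node to;
        final String label;  // label for each edge

        Edge(Node from, String label, Node to) {
            this.from = from;
            this.label = label;
            this.to = to;
        }
    }

    private final Map<String, Node> nodes = new HashMap<>();
    private final Map<String, List<Edge>> outgoing = new HashMap<>();

    Node addNode(String id) {
        outgoing.putIfAbsent(id, new ArrayList<>());
        return nodes.computeIfAbsent(id, Node::new);
    }

    // A list per source node allows multiple edges between the same two nodes.
    Edge addEdge(String fromId, String label, String toId) {
        Node from = nodes.get(fromId);
        Node to = nodes.get(toId);
        if (from == null || to == null) {
            throw new IllegalArgumentException("unknown node");
        }
        Edge edge = new Edge(from, label, to);
        outgoing.get(fromId).add(edge);
        return edge;
    }

    // Following edges is a walk over the node's own edge list, not a join.
    List<Node> neighbours(String fromId, String label) {
        List<Node> result = new ArrayList<>();
        for (Edge e : outgoing.getOrDefault(fromId, new ArrayList<>())) {
            if (e.label.equals(label)) {
                result.add(e.to);
            }
        }
        return result;
    }

    int edgeCount(String fromId) {
        return outgoing.getOrDefault(fromId, new ArrayList<>()).size();
    }
}
```

Each node keeps its outgoing edges locally, so traversal cost depends on the number of edges at that node rather than on the size of a table being joined.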

Advantages of NoSQL
- Cheap, easy to implement
- Data are replicated and can be partitioned
- Easy to distribute
- Don't require a schema
- Can scale up and down
- Quickly process large amounts of data
- Relax the data consistency requirement (CAP)
- Can handle web-scale data, whereas relational DBs cannot

Disadvantages of NoSQL
- New and sometimes buggy
- Data is generally duplicated, with potential for inconsistency
- No standardized schema
- No standard format for queries
- No standard language
- Difficult to impose complicated structures
- Depend on the application layer to enforce data integrity
- No guarantee of support
- Too many options: which one, or ones, to pick?

References
- (Hibernate) Hibernate in Action, Christian Bauer and Gavin King, Manning Publications.
- (HibernateDOC) http://www.hibernate.org/hib_docs/reference/en/html/
- Some examples originate from Dr. David Edmond, School of Information Systems, QUT, Brisbane, and S. Sudarshan, IIT Bombay.