Enterprise Data Analysis and Design

Similar documents
Chapter 1: Distributed Information Systems

CIS 330: Applied Database Systems

Enterprise Data Analysis and Design

Gustavo Alonso, ETH Zürich. Web services: Concepts, Architectures and Applications - Chapter 1 2

CIS 330: Applied Database Systems

Outline. Database Management Systems (DBMS) Database Management and Organization. IT420: Database Management and Organization

Middleware. Adapted from Alonso, Casati, Kuno, Machiraju Web Services Springer 2004

Chapter 2 Distributed Information Systems Architecture

Chapter Outline. Chapter 2 Distributed Information Systems Architecture. Distributed transactions (quick refresh) Layers of an information system

Database Applications (15-415)

The Relational Model. Why Study the Relational Model? Relational Database: Definitions

Chapter Outline. Chapter 2 Distributed Information Systems Architecture. Layers of an information system. Design strategies.

The Relational Model. Chapter 3. Database Management Systems, R. Ramakrishnan and J. Gehrke 1

The Relational Model. Chapter 3

Relational data model

Introduction to Data Management. Lecture #4 (E-R Relational Translation)

The Relational Data Model. Data Model

Introduction. Example Databases

Scott Meder Senior Regional Sales Manager

The Relational Model. Chapter 3. Comp 521 Files and Databases Fall

Database Management Systems. Chapter 1

Introduction and Overview

CIS 330: Applied Database Systems

Introduction and Overview

DATABASE MANAGEMENT SYSTEMS. UNIT I Introduction to Database Systems

Introduction to Database Systems. Chapter 1. Instructor: . Database Management Systems, R. Ramakrishnan and J. Gehrke 1

Introduction to Database Management Systems

The Relational Model. Chapter 3. Comp 521 Files and Databases Fall

Introduction to Data Management. Lecture #1 (Course Trailer )

Intro to Transaction Management

B.H.GARDI COLLEGE OF MASTER OF COMPUTER APPLICATION. Ch. 1 :- Introduction Database Management System - 1

Database Management Systems. Chapter 3 Part 1

Introduction to Database Systems

Distributed KIDS Labs 1

Security Mechanisms I. Key Slide. Key Slide. Security Mechanisms III. Security Mechanisms II

Relational Model. Topics. Relational Model. Why Study the Relational Model? Linda Wu (CMPT )

Introduction to Data Management. Lecture #1 (Course Trailer )

Introduction to Databases CS348

U1. Data Base Management System (DBMS) Unit -1. MCA 203, Data Base Management System

Page 1. Goals for Today" What is a Database " Key Concept: Structured Data" CS162 Operating Systems and Systems Programming Lecture 13.

The Relational Model. Week 2

One Size Fits All: An Idea Whose Time Has Come and Gone

Active Server Pages Architecture

Why Study the Relational Model? The Relational Model. Relational Database: Definitions. The SQL Query Language. Relational Query Languages

Database Systems. Lecture2:E-R model. Juan Huo( 霍娟 )

Database. Università degli Studi di Roma Tor Vergata. ICT and Internet Engineering. Instructor: Andrea Giglio

The Relational Model. Outline. Why Study the Relational Model? Faloutsos SCS object-relational model

Database System Concepts and Architecture

Database Management Systems Chapter 1 Instructor: Oliver Schulte Database Management Systems 3ed, R. Ramakrishnan and J.

Introduction to Database Systems CS432. CS432/433: Introduction to Database Systems. CS432/433: Introduction to Database Systems

Database System Concepts and Architecture. Copyright 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley

Systems Analysis and Design in a Changing World, Fourth Edition. Chapter 12: Designing Databases

Web Services - Concepts, Architecture and Applications Part 3: Asynchronous middleware

Course Logistics & Chapter 1 Introduction

Introduction to Data Management. Lecture #1 (Course Trailer ) Instructor: Chen Li

Intro to Transactions

Relational Databases BORROWED WITH MINOR ADAPTATION FROM PROF. CHRISTOS FALOUTSOS, CMU /615

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe Slide 2-1

Introduction to Transaction Management

Introduction. Who wants to study databases?

Administrivia. The Relational Model. Review. Review. Review. Some useful terms

CMPT 354 Database Systems I

Chapter 3. Database Architecture and the Web

UNIT I. Introduction

Database Systems ( 資料庫系統 )

BIS Database Management Systems.

John Edgar 2

MIS Database Systems.

The Relational Model. Roadmap. Relational Database: Definitions. Why Study the Relational Model? Relational database: a set of relations

Transaction Management: Concurrency Control

Introduction to Data Management. Lecture #18 (Transactions)

CS5412: TRANSACTIONS (I)

The Relational Model 2. Week 3

Review. The Relational Model. Glossary. Review. Data Models. Why Study the Relational Model? Why use a DBMS? OS provides RAM and disk

Database Management Systems Introduction to DBMS

COURSE 1. Database Management Systems

INTRODUCTION TO Object Oriented Systems BHUSHAN JADHAV

CORBA Object Transaction Service

Relational Database Systems Part 01. Karine Reis Ferreira

MIT Database Management Systems Lesson 01: Introduction

Information Systems Engineering Distributed and Mobile Information Systems

Outline. Purpose of this paper. Purpose of this paper. Transaction Review. Outline. Aries: A Transaction Recovery Method

Lecture 1: Introduction

Several major software companies including IBM, Informix, Microsoft, Oracle, and Sybase have all released object-relational versions of their

Page 1. Quiz 18.1: Flow-Control" Goals for Today" Quiz 18.1: Flow-Control" CS162 Operating Systems and Systems Programming Lecture 18 Transactions"

TOPLink for WebLogic. Whitepaper. The Challenge: The Solution:

Data Modeling. Yanlei Diao UMass Amherst. Slides Courtesy of R. Ramakrishnan and J. Gehrke

Concurrency Control & Recovery

Database Management Systems MIT Lesson 01 - Introduction By S. Sabraz Nawaz

CSE 3241: Database Systems I Databases Introduction (Ch. 1-2) Jeremy Morris

Data Management Lecture Outline 2 Part 2. Instructor: Trevor Nadeau

CMU SCS CMU SCS Who: What: When: Where: Why: CMU SCS

Overview of Data Management

Introduction to Data Management CSE 344

Outline. Quick Introduction to Database Systems. Data Manipulation Tasks. What do they all have in common? CSE142 Wi03 G-1

Outline. Definition of a Distributed System Goals of a Distributed System Types of Distributed Systems

CS425 Fall 2016 Boris Glavic Chapter 1: Introduction

Transaction Management Overview. Transactions. Concurrency in a DBMS. Chapter 16

Database Management System Dr. S. Srinath Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No.

Introduction to Data Management. Lecture #5 Relational Model (Cont.) & E-Rà Relational Mapping

Transcription:

Analysis 1 Enterprise Data Analysis and Design Lecture 2: Enterprise Architectures Johannes Gehrke johannes@cs.cornell.edu http://www.cs.cornell.edu/johannes (Some of the slides are courtesy of Gustavo Alonso, Fabio Casati, Harumi Kuno, and Vijay Machiraju) Course Outline 1/26 Database Management Systems 1/28 Enterprise Information Architectures 2/2, 2/4, and 2/9: Data Modeling 2/11, 2/16, 2/18, 2/23, and 2/25: Data Mining 3/2 OLAP 3/4 Web Services 3/9 Future Trends 2 Why Store Data in a DBMS? Benefits Transactions (concurrent data access, recovery from system crashes) High-level abstractions for data access, manipulation, and administration Data integrity and security Performance and scalability 3

Analysis 2 A Digress What Is a Transaction? The execution of a program that performs a function by accessing a database. Examples: Reserve an airline seat. Buy an airline ticket. Withdraw money from an ATM. Verify a credit card sale. Order an item from an Internet retailer. Download a video clip and pay for it. Play a bid at an on-line auction. 4 Transactions A transaction is an atomic sequence of actions Each transaction must leave the system in a consistent state (if system is consistent when the transaction starts). The ACID Properties: Atomicity Consistency Isolation Durability 5 Concurrency Control for Isolation (Start: A=$100; B=$100) Consider two transactions: T1: START, A=A+100, B=B-100, COMMIT T2: START, A=1.06*A, B=1.06*B, COMMIT The first transaction is transferring $100 from B s account to A s account. The second transaction is crediting both accounts with a 6% interest payment. Database systems try to do as many operations concurrently as possible, to increase performance. 6

Analysis 3 Example (Contd.) (Start: A=$100; B=$100) Consider a possible interleaving (schedule): T1: A=A+$100, B=B-$100 COMMIT T2: A=1.06*A, B=1.06*B COMMIT End result: A=$106; B=$0 Another possible interleaving: T1: A=A+100, B=B-100 COMMIT T2: A=1.06*A, B=1.06*B COMMIT End result: A=$112; B=$6 The second interleaving is incorrect! Concurrency control of a database system makes sure that the second schedule does not happen. 7 Ensuring Atomicity DBMS ensures atomicity (all-or-nothing property) even if the system crashes in the middle of a transaction. Idea: Keep a log (history) of all actions carried out by the DBMS while executing : Before a change is made to the database, the corresponding log entry is forced to a safe location. After a crash, the effects of partially executed transactions are undone using the log. 8 Recovery A DBMS logs all elementary events on stable storage. This data is called the log. The log contains everything that changes data: Inserts, updates, and deletes. Reasons for logging: Need to UNDO transactions Recover from a systems crash 9

Analysis 4 Recovery: Example (Simplified process) Insert customer data into the database Check order availability Insert order data into the database Write recovery data (the log) to stable storage Return order confirmation number to the customer 10 Why Store Data in a DBMS? Benefits Transactions (concurrent data access, recovery from system crashes) High-level abstractions for data access, manipulation, and administration Data integrity and security Performance and scalability 11 Data Model A data model is a collection of concepts for describing data. Examples: ER model (used for conceptual modeling) Relational model, object-oriented model, object-relational model (actually implemented in current DBMS) 12

Analysis 5 The Relational Data Model A relational database is a set of relations. Turing Award (Nobel Price in CS) for Codd in 1980 for his work on the relational model Example relation: Customers(cid: integer, name: string, byear: integer, state: string) cid name byear state 1 Jones 1960 NY 2 Smith 1974 CA 3 Smith 1950 NY 13 The Relational Model: Terminology Relation instance and schema Field (column) Record or tuple (row) Cardinality cid name byear state 1 Jones 1960 NY 2 Smith 1974 CA 3 Smith 1950 NY 14 Customer Relation (Contd.) In your enterprise, you are more likely to have a schema similar to the following: Customers(cid, identifier, nametype, salutation, firstname, middlenames, lastname, culturalgreetingstyle, gender, customertype, degrees, ethnicity, companyname, departmentname, jobtitle, primaryphone, primaryfax, email, website, building, floor, mailstop, addresstype, streetnumber, streetname, streetdirection, POBox, city, state, zipcode, region, country, assembledaddressblock, currency, maritalstatus, byear, profession) 15

Analysis 6 Product Relation Relation schema: Products(pid: integer, pname: string, price: float, category: string) Relation instance: pid pname price category 1 Intel PIII-700 300.00 hardware 2 MS Office Pro 500.00 software 3 IBM DB2 5000.00 software 4 Thinkpad 600E 5000.00 hardware 16 Transaction Relation Relation schema: Transactions( tid: integer, tdate: date, cid: integer, pid: integer) Relation instance: tid tdate cid pid 1 1/1/2000 1 1 1 1/1/2000 1 2 2 1/1/2000 1 4 3 2/1/2000 2 3 3 2/1/2000 2 4 17 The Relational DBMS Market Market Share Of RDBMS In 1998 (Gartner Group 4/1999) 70.0% 60.0% 50.0% 40.0% 30.0% 20.0% 10.0% 0.0% NT Unix Oracle 7/8 and Lite Microsoft SQL IBM DB2 Sybase ASA/ASA Informix NCR Other 18

Analysis 7 The Relational DBMS Market (Contd.) Best-Selling Client/Server DBMS (Computer Reseller News 8/1999) 50% 40% 30% 20% 10% 0% 43% 41% 32% 25% 22% 16% 6% 6% 4% 5% Microsoft Oracle Sybase IBM Others Jan-Jun 98 Jan-Jun 99 19 The Relational DBMS Market (Contd.) Market Share Of Database Revenues (Computer Reseller News, 5/1999) 35.0% 30.0% 25.0% 20.0% 15.0% 10.0% 5.0% 0.0% IBM Oracle Microsoft Informix Sybase Other In 1997 In 1998 20 The Object-Oriented Data Model Richer data model. Goal: Bridge impedance mismatch between programming languages and the database system. Example components of the data model: Relationships between objects directly as pointers. Result: Can store abstract data types directly in the DBMS Pictures Geographic coordinates Movies CAD objects 21

Analysis 8 Object-Oriented DBMS Advantages: Engineering applications (CAD and CAM and CASE computer aided software engineering), multimedia applications. Disadvantages: Technology not as mature as relational DMBS Not suitable for decision support, weak security Vendors are much smaller companies and their financial stability is questionable. 22 Object-Oriented DBMS (Contd.) Vendors: Gemstone (www.gemstone.com) Objectivity (www.objy.com) ObjectStore (www.objectstore.net) POET (www.poet.com) Versant (www.versant.com, merged with POET) Organizations: OMG: Object Management Group (www.omg.org) 23 The OO DBMS Market Forecast Revenues For OO Systems Software Worldwide (IDC 3/1999) Million $ 1,000.0 800.0 600.0 400.0 200.0 0.0 1997 1998 1999 2000 2001 2002 Year 24

Analysis 9 Object-Relational DBMS Mixture between the object-oriented and the object-relational data model Combines ease of querying with ability to store abstract data types Conceptually, the relational model, but every field All major relational vendors are currently extending their relational DBMS to the object-relational model 25 Query Languages We need a high-level language to describe and manipulate the data Requirements: Precise semantics Easy integration into applications written in C++/Java/Visual Basic/etc. Easy to learn DBMS needs to be able to efficiently evaluate queries written in the language 26 Relational Query Languages The relational model supports simple, powerful querying of data. Precise semantics for relational queries Efficient execution of queries by the DBMS Independent of physical storage 27

Analysis 10 SQL: Structured Query Language Developed by IBM (System R) in the 1970s ANSI standard since 1986: SQL-86 SQL-89 (minor revision) SQL-92 (major revision, current standard) SQL-99 (major extensions) More about SQL in the next lecture 28 Example Query Example Schema: Customers( cid: integer, name: string, byear: integer, state: string) Query: SELECT Customers.cid, Customers.name, Customers.byear, Customers.state FROM Customers WHERE Customers.cid = 3 cid name byear state 1 Jones 1960 NY 2 Smith 1974 CA 3 Smith 1950 NY cid name byear state 3 Smith 1950 NY 29 Example Query SELECT Customers.cid, Customers.name, Customers.byear, Customers.state FROM Customers WHERE cid name byear state 1 Jones 1960 NY 2 Smith 1974 CA 3 Smith 1950 NY Customers.cid = 1 cid name byear state 1 Jones 1960 NY 30

Analysis 11 Why Store Data in a DBMS? Benefits Transactions (concurrent data access, recovery from system crashes) High-level abstractions for data access, manipulation, and administration Data integrity and security Performance and scalability 31 Integrity Constraints Integrity Constraints (ICs): Condition that must be true for any instance of the database. ICs are specified when schema is defined. ICs are checked when relations are modified. A legal instance of a relation is one that satisfies all specified ICs. DBMS should only allow legal instances. Example: Domain constraints. 32 Primary Key Constraints A set of fields is a superkey for a relation if no two distinct tuples can have same values in all key fields. A set of fields is a key if the set is a superkey, and none of its subsets is a superkey. Example: {cid, name} is a superkey for Customers {cid} is a key for Customers Where do primary key constraints come from? 33

Analysis 12 Primary Key Constraints (Contd.) Can there be more than one key for a relation? What is the maximum number of superkeys for a relation? What is the primary key of the Products relation? How about the Transactions relation? 34 Enforcing Referential Integrity What should be done if a Transaction tuple with a non-existent Customer id is inserted? (Reject it!) What should be done if a Customer tuple is deleted? Also delete all Transaction tuples that refer to it. Disallow deletion of a Customer tuple that has associated Transactions. Set cid in Transactions tuples to a default or special cid. SQL supports all three choices 35 Where do ICs Come From? ICs are based upon the semantics of the realworld enterprise that is being described in the database relations. We can check a database instance to see if an IC is violated, but we can NEVER infer that an IC is true by looking at an instance. An IC is a statement about all possible instances! From example, we know state cannot be a key, but the assertion that cid is a key is given to us. Key and foreign key ICs are very common; a DBMS supports more general ICs. 36

Analysis 13 Security Secrecy:Users should not be able to see things they are not supposed to. E.g., A student can t see other students grades. Integrity:Users should not be able to modify things they are not supposed to. E.g., Only instructors can assign grades. Availability: Users should be able to see and modify things they are allowed to. 37 Discretionary Access Control Based on the concept of access rights or privileges for objects (tables and views), and mechanisms for giving users privileges (and revoking privileges). Creator of data automatically gets all privileges on it. DMBS keeps track of who subsequently gains and loses privileges, and ensures that only requests from users who have the necessary privileges (at the time the request is issued) are allowed. Users can grant and revoke privileges 38 Role-Based Authorization In SQL-92, privileges are actually assigned to authorization ids, which can denote a single user or a group of users. In SQL:1999 (and in many current systems), privileges are assigned to roles. Roles can then be granted to users and to other roles. Reflects how real organizations work. Illustrates how standards often catch up with de facto standards embodied in popular systems. 39

Analysis 14 Security Is Important Financial estimate of database losses, by activity in 1998 (ENT/CSI/FBI Computer Crime Survey, 5/1999) Activity Loss in million $ Theft of Proprietary Information 28.51 System Penetration by Outsiders 13.39 Financial Fraud 11.24 Unauthorized Insider Access 50.57 Laptop Theft 0.16 Total 103.87 40 Why Store Data in a DBMS? Benefits Transactions (concurrent data access, recovery from system crashes) High-level abstractions for data access, manipulation, and administration Data integrity and security Performance and scalability 41 DBMS and Performance Efficient implementation of all database operations Indexes: Auxiliary structures that allow fast access to the portion of data that a query is about Smart buffer management Query optimization: Finds the best way to execute a query Automatic high-performance concurrent query execution, query parallelization 42

Analysis 15 Summary Of DBMS Benefits Transactions ACID properties, concurrency control, recovery High-level abstractions for data access Data models Data integrity and security Key constraints, foreign key constraints, access control Performance and scalability Parallel DBMS, distributed DBMS, performance tuning 43 The Big Picture (Revisited) WWW Site Visitor THE WEB Public Web Server Business Transaction Server Main Memory Cache DBMS Internal User INTRANET, VPN Internal Web Server Data Warehouse Application Server 44 Layers and Tiers Client Application Logic Resource Manager Client Server Database Presentation layer Client is any user or program that wants to perform an operation over the Business rules system. Clients interact with the system through a presentation layer Business objects The application logic determines what the system actually does. It takes care of enforcing the business rules and establish the business processes. The application logic can take many forms: programs, constraints, business Client processes, etc. The resource manager deals with the organization (storage, indexing, and Business processes retrieval) of the data necessary to support the application logic. This is typically a database but it can also be Persistent storage a text retrieval system or any other data management system providing querying capabilities and persistence. 45

Analysis 16 A Game of Boxes and Arrows There is no problem in system design that cannot be solved by adding a level of indirection. There is no performance problem that cannot be solved by removing a level of indirection. Each box represents a part of the system. Each arrow represents a connection between two parts of the system. The more boxes, the more modular the system: more opportunities for distribution and parallelism. This allows encapsulation, component based design, reuse. The more boxes, the more arrows: more sessions (connections) need to be maintained, more coordination is necessary. The system becomes more complex to monitor and manage. The more boxes, the greater the number of context switches and intermediate steps to go through before one gets to the data. Performance suffers considerably. System designers try to balance the flexibility of modular design with the performance demands of real applications. Once a layer is established, it tends to migrate down and merge with lower layers. 46 Top-Down Design top-down architecture PL-A PL-B PL-C top-down design PL-A AL-B AL-A PL-B PL-C AL-C AL-D AL-A AL-C AL-D AL-B RM-1 RM-2 RM-1 RM-2 47 top-down design 1. define access channels and client platforms 2. define presentation formats and protocols for the selected clients and protocols 3. define the functionality necessary to deliver the contents and formats needed at the presentation layer 4. define the data sources and data organization needed to implement the application logic Top-Down design client presentation layer application logic layer resource management layer information system 48

bottom-up architecture NBA 518: Enterprise Data Design and Analysis 17 Bottom-Up Design New application Legacy systems Legacy applicati on In a bottom up design, many of the basic components already exist. These are stand alone systems which need to be integrated into new systems. The components do not necessarily cease to work as stand alone components. Often old applications continue running at the same time as new applications. This approach has a wide application because the underlying systems already exist and cannot be easily replaced. Much of the work and products in this area are related to middleware, the intermediate layer used to provide a common interface, bridge heterogeneity, and cope with distribution. 49 Bottom-Up Design bottom-up design PL-A AL-B AL-A PL-B PL-C AL-C AL-D AL-A PL-A AL-C PL-B PL-C AL-B AL-D wrapper wrapper wrapper wrapper wrapper wrapper legacy application legacy application legacy system legacy system legacy system 50 bottom-up design 1. define access channels and client platforms Bottom-Up Design client 2. examine existing resources and the functionality they offer 3. wrap existing resources and integrate their functionality into a consistent interface 4. adapt the output of the application logic so that it can be used with the required access channels and client protocols presentation layer application logic layer resource management layer information system 51

Analysis 18 One Tier: Fully Centralized Server The presentation layer, application logic and resource manager are built as a monolithic entity. Access through dumb terminals This was the typical architecture of mainframes, offering several advantages: no forced context switches in the control flow (everything happens within the system), all is centralized, managing and controlling resources is easier, the design can be highly optimized by blurring the separation between layers. 52 Two Tier: Client/Server Server As computers became more powerful, it was possible to move the presentation layer to the client. This has several advantages: Clients are independent. Computing power at clients. It introduces the concept of API (Application Program Interface). An interface to invoke the system from the outside. It also allows designers to think about federating the systems into a single system. The resource manager only sees one client: the application logic. This greatly helps with performance since there are no client connections/sessions to maintain. 53 APIs in Client/Server Introduced notion of a service Introduced notion of an interface (how the client can invoke a given service) Many standardization efforts due to need for common APIs server s API service interface service interface service interface service interface service service service service server resource management layer 54

Analysis 19 Technical Aspects Of Two Tier Advantages to Single Tier: Take advantage of client capacity to off-load work to the clients Work within the server takes place within one scope (almost as in 1 tier), The server design is still tightly coupled and can be optimized by ignoring presentation issues Still relatively easy to manage and control from a software engineering point of view Disadvantages: Connection management Clients are tied to the system (no standard presentation layer). Connect to two systems, a client needs two presentation layers. No failure or load encapsulation. If the server fails, nobody can work. The load created by one client will directly affect the work of others since they are all competing for the same resources. 55 The Main Limitation of Client/Server Server A Server B Accessing more than two servers: The underlying systems don t know about each other No common business logic Client is the point of integration (increasingly fat clients) The responsibility of dealing with heterogeneous systems is shifted to the client. The client becomes responsible for knowing where things are, how to get to them, and how to ensure consistency Very inefficient (software design, portability, code reuse, performance since the client capacity is limited, etc.). These issues cannot be solved with 2-tier 56 Three Tier: Middleware Three layers are fully separated. The layers are also typically distributed taking advantage of the complete modularity of the design 57

Analysis 20 Middleware clients Middleware or global application logic Local application logic Local resource managers Server A middleware Server B Middleware is just a level of indirection between clients and other layers of the system. Introduces an additional layer of business logic encompassing all underlying systems. By doing this, a middleware system: simplifies the design of the clients by reducing the number of interfaces, provides transparent access to the underlying systems, acts as the platform for intersystem functionality and high level application logic, and takes care of locating resources, accessing them, and gathering results. 58 Technical Aspects of Middleware The introduction of a middleware layer helps in that: the number of necessary interfaces is greatly reduced: clients see only one system (the middleware), local applications see only one system (the middleware), it centralizes control (middleware systems themselves are usually 2 tier), it makes necessary functionality widely available to all clients, it allows to implement functionality that otherwise would be very difficult to provide, and it is a first step towards dealing with application heterogeneity (some forms of it). The middleware layer does not help in that: it is another indirection level, it is complex software, it is a development platform, not a complete system 59 A three tier middleware based system... External clients External client middleware system internal clients connecting logic control user logic middleware wrappers 2 tier systems Resource managers 2 tier system Resource manager 60

Analysis 21 client Web browser Web server HTML filter application logic layer N-Tier Architectures presentation layer middleware N-tier architectures result from connecting several three tier systems to each other The addition of the Web layer led to the notion of application servers, which was used to refer to middleware platforms supporting access through the Web resource management layer information system 61 N-tier In reality LAN internal clients LAN middleware application logic INTERNET FIREWALL Web server cluster LAN LAN, gateways resource management layer database server middleware Wrappers application and logic gateways LAN LAN file application additional resource server management layers 62 Blocking or Synchronous Interaction Traditionally, information systems use blocking calls Synchronous interaction requires both parties to be on-line : the caller makes a request, the receiver gets the request, processes the request, sends a response, the caller receives the response. The caller must wait until the response comes back. but the interaction requires both client and server to be alive at the same time client Call Answer idle time server Disadvantages due to synchronization: Connection overhead Higher probability of failures Difficult to identify and react to failures It is not really practical for complex interactions Receive Response 63

Analysis 22 Overhead of Synchronism Need to maintain a session between the caller and the receiver. Maintaining sessions is expensive. There is also a limit on how many sessions can be active at the same time For this reason, client/server systems often resort to connection pooling to optimize resource utilization Have a pool of open connections Allocate connections as needed Synchronous interaction requires a context for each call and a context management system for all incoming calls. request() do with answer request() do with answer session duration Context is lost Needs to be restarted!! receive process return receive process return 64 Failures In Synchronous Calls If the client or the server fail, the context is lost. If the failure occurred before 1, nothing has happened If the failure occurs after 1 but before 2 (receiver crashes), then the request is lost If the failure happens after 2 but before 3, side effects may cause inconsistencies If the failure occurs after 3 but before 4, the response is lost but the action has been performed (do it again?) Who is responsible for finding out what happened? Finding out when the failure took place may not be easy. If there is a chain of invocations the failure can occur anywhere along the chain. request() 4 do with answer request() do with answer timeout try again do with answer 1 1 receive process return receive process return receive process return 65 2 3 2 3 2 3 Two Solutions ENHANCED SUPPORT Client/Server systems and middleware platforms provide a number of mechanisms to deal with the problems created by synchronous interaction: Transactional interaction Service replication and load balancing ASYNCHRONOUS INTERACTION Using asynchronous interaction, the caller sends a message that gets stored somewhere until the receiver reads it and sends a response. The response is sent in a similar manner Asynchronous interaction can take place in two forms: Non-blocking invocation Persistent queues 66

Analysis 23 Message Queuing Reliable queuing is an excellent complement to synchronous interactions: Suitable to modular design: the code for making a request can be in a different module (even a different machine!) than the code for dealing with the response Easier to design sophisticated distribution modes and it also helps to handle communication sessions in a more abstract way More natural way to implement complex interactions between heterogeneous systems request() do with answer queue queue receive process return 67 Summary Functionality of database systems Three-tier architectures 68