Streaming Data Integration: Challenges and Opportunities. Nesime Tatbul

Size: px
Start display at page:

Download "Streaming Data Integration: Challenges and Opportunities. Nesime Tatbul"

Transcription

1 Streaming Data Integration: Challenges and Opportunities Nesime Tatbul

2 Talk Outline Integrated data stream processing An example project: MaxStream Architecture Query model Conclusions ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 2

3 Data Stream Processing Monitoring applications require collecting, processing, disseminating, and reacting to real-time events from push-based data sources. Store and Pull model of traditional databases does not work well. Query DBMS Answer Data SPE Answer Data Base Query Base Traditional Database Systems Stream Processing Engines ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 3

4 Integrated Data Stream Processing Today, integration support for SPEs is needed in three main forms: 1. across multiple streaming data sources example: news feeds, weather sensors, traffic cameras 2. over multiple SPEs example: supply-chain management 3. between SPEs and traditional DBMSs example: operational business intelligence ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 4

5 #1: Streaming Data Source Integration Goal: Integrated querying over multiple, potentially heterogeneous streaming data sources Challenges: Schemas of different sources can differ from one another and from the input schemas of the already running CQs. Input sources or the network can introduce imperfections into the stream. Adapters may become a bottleneck. Current state of the art: Commercial SPEs offer a collection of common adapters and SDKs for developing custom ones. Mapping Data to Queries [Hentschel et al] ASPEN project [Ives et al] ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 5

6 #2: SPE-SPE Integration Goal: Integrated querying over multiple, potentially heterogeneous SPEs to exploit the advantages of distributed operation to exploit specialized capabilities and strengths of SPEs to provide higher-level monitoring over large-scale enterprises with loosely-coupled operational units Challenges: The need for functional integration The need to deal with heterogeneity at different levels (e.g., query models, capabilities, performance, interfaces) Current state of the art: MaxStream project [Tatbul et al] ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 6

7 #3: SPE-DBMS Integration Goal: Integrated querying over SPEs and traditional database systems Challenges: Bridge the data vs. operation gap between the two worlds. Find the right language and architecture primitives for the required level of querying, persistence, and performance. Current state of the art: Languages [STREAM CQL, StreamSQL] Architectures SPE-based [typical SPEs such as Coral8, StreamBase] DBMS-based * stream-relational systems such as TelegraphCQ/Truviso, DataCell, DejaVu, MaxStream] ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 7

8 MaxStream: A Platform for SPE-SPE and SPE-DBMS Integration Client Application Federation Layer Data Wrapper Agent Wrapper Wrapper Key design ideas: Uniform query language and API Relational database infrastructure as the basis for the federation layer (in our case: SAP MaxDB and SAP MaxDB Federator) DB DB SPE SPE Just enough streaming capability inside the federation layer ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 8

9 MaxStream vs. Traditional Virtual Integration Traditional Virtual Integration Goal: to query across multiple autonomous & heterogeneous data sources Source wrappers send queries and receive answers Sources host the data Mapping between source schemas and a global schema Queries are posed against the global schema More focus on data locality MaxStream Goal: to query across multiple autonomous & heterogeneous SPEs (and DBMSs) SPE wrappers send queries and data, and receive answers SPEs host the CQs Mapping between SPE CQ models and a global CQ model CQs are posed and data is fed against the global CQ model More focus on functional heterogeneity ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 9

10 Output Events MaxStream Architecture Client Application Output Events Input Events DDL/DML statements in MaxStream s SQL Dialect Metadata Input Event Tables Output Event Tables SQL Parser Query Rewriter Query Optimizer Query Executer MaxStream Federator SQL Dialect Translator Input Events Data Agent for SPE DDL/DML statements in SPE s SQL Dialect Data Data Agent Agent Data Agent for SPE SPE s SDK SPE s SDK MaxDB ODBC DB DB MaxDB ODBC SPE ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich SPE 10

11 MaxStream Architecture Two Key Building Blocks Streaming inputs through MaxStream ISTREAM operator for persistent input events Tuple queues for transient input events Streaming outputs through MaxStream Monitoring Select operator over event tables Persistent event tables for persistent output events In-memory event tables for transient output events ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 11

12 Streaming Input Events The ISTREAM ( Insert STREAM ) Operator Inspired by the relation-to-stream operator of the same name in the STREAM Project, that streams new tuples being inserted into a given relation. Example: OrdersTable(ClientId, OrderId, ProductId, Quantity) INSERT INTO STREAM OrdersStream SELECT ClientId, OrderId, ProductId, Quantity FROM ISTREAM(OrdersTable); T r1 r2 r3 ICDE NTII Workshop, 2010 T+1 r1 r2 r3 r4 r5 ISTREAM(OrdersTable) at T+1 returns: <r4, T+1>, <r5, T+1> Nesime Tatbul, ETH Zurich 12

13 Streaming Output Events Opposite of streaming input events, but Unlike the SPE interface, the client application interface is not push-based. The Monitoring Select Operator Select operation blocks until there is at least one row to return. For continuous monitoring, the client program re-issues Monitoring Select in a loop. Monitoring Select operates on event tables. Example: Detect unusually large order volumes. SELECT * FROM /*+ EVENT */ TotalSalesTable WHERE TotalSales > ; ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 13

14 ISTREAM and Monitoring Select in Action Data Feeder Client // Continuous Insert into OrdersTable kept in MaxStream CREATE TABLE OrdersTable; WHILE (true) { INSERT INTO OrdersTable VALUES ( ); sleep(period); }; // Continuous Insert into OrdersStream kept in the SPE CREATE STREAM OrdersStream; INSERT INTO STREAM OrdersStream SELECT FROM ISTREAM(OrdersTable); Monitoring Clients // Setting up the Continuous Query to push down to the SPE INSERT INTO STREAM TotalSalesStream SELECT SUM( ) FROM OrdersStream KEEP 1 HOUR; // Streaming Output Events inserted by the SPE CREATE TABLE TotalSalesTable; // Sales spikes Query to run in MaxStream WHILE (true) { SELECT * FROM /*+EVENT*/ TotalSalesTable WHERE TotalSales > ; }; MaxStream Monitoring Select SPE TotalSalesTable TotalSalesStream OrdersTable ISTREAM OrdersStream INSERT INTO STREAM TotalSalesStream SELECT SUM( ) FROM OrdersStream KEEP 1 HOUR; ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 14

15 Hybrid Queries in MaxStream Hybrid queries are continuous queries that join Streams with Tables. Two important factors that affect efficiency: The streaming data source must be first in the join ordering. Hybrid queries can be rewritten to perform the join within the MaxStream Federator, removing the need for the SPE to establish connections to external databases. One can conveniently use hybrid queries in MaxStream in two ways: To enrich the input stream before it is passed to the SPE To enrich the output stream after it is received from the SPE ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 15

16 MaxStream Query Model Problem: Heterogeneity of SPE query models Syntax heterogeneity Language clauses/keywords for common constructs syntactically differ. Capability heterogeneity Support for certain query types differs. Execution model heterogeneity Underlying query execution models differ. Not exposed to the application developer at the language syntax level. First step towards a solution: Create a model to analyze and predict the query execution semantics of SPEs. ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 16

17 The SECRET Model What affects query results produced by an SPE? ScopE: Given a query with certain window properties, what are the potential window intervals? Content: Given an input stream, what are the actual contents for those window intervals? REport: Under what conditions, do those window contents become visible to the query processor for evaluation? Tick: What drives an SPE to take action on a given input stream? Query Result = F(system, query, input) Tick REport ScopE Content ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 17

18 The SECRET of a Query Plan SECRET SECRET ScopE window Tick REport Content query operator ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 18

19 The SECRET of an SPE Tick: tuple-driven (e.g., Aurora, Borealis, StreamBase, TelegraphCQ, Truviso) time-driven (e.g., STREAM, Oracle CEP) batch-driven (e.g., Coral8, *Jain et al, VLDB 08+) REport: window close & non-empty (e.g., StreamBase) content change & non-empty (e.g., Coral8) window close & content change & non-empty (e.g., STREAM) ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 19

20 MaxStream: Future Outlook Query model how to extend SECRET (other query types, analysis of other SPEs, input imperfections) how to use SECRET in MaxStream ( SECRET-based query and SPE capability analyzer) Capability- and Cost-based query optimization Transactional stream processing Distributed operation ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 20

21 Conclusions Today, integration support for SPEs is needed in three main forms: across sources, SPE-SPE, SPE-DBMS. There are many open research challenges. MaxStream takes up some of these challenges for SPE-SPE and SPE-DBMS integration. More information about MaxStream: ICDE NTII Workshop, 2010 Nesime Tatbul, ETH Zurich 21

Big Data. Donald Kossmann & Nesime Tatbul Systems Group ETH Zurich

Big Data. Donald Kossmann & Nesime Tatbul Systems Group ETH Zurich Big Data Donald Kossmann & Nesime Tatbul Systems Group ETH Zurich Data Stream Processing Overview Introduction Models and languages Query processing systems Distributed stream processing Real-time analytics

More information

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2012/13

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2012/13 Systems Infrastructure for Data Science Web Science Group Uni Freiburg WS 2012/13 Data Stream Processing Topics Model Issues System Issues Distributed Processing Web-Scale Streaming 3 Data Streams Continuous

More information

DATA STREAMS AND DATABASES. CS121: Introduction to Relational Database Systems Fall 2016 Lecture 26

DATA STREAMS AND DATABASES. CS121: Introduction to Relational Database Systems Fall 2016 Lecture 26 DATA STREAMS AND DATABASES CS121: Introduction to Relational Database Systems Fall 2016 Lecture 26 Static and Dynamic Data Sets 2 So far, have discussed relatively static databases Data may change slowly

More information

UDP Packet Monitoring with Stanford Data Stream Manager

UDP Packet Monitoring with Stanford Data Stream Manager UDP Packet Monitoring with Stanford Data Stream Manager Nadeem Akhtar #1, Faridul Haque Siddiqui #2 # Department of Computer Engineering, Aligarh Muslim University Aligarh, India 1 nadeemalakhtar@gmail.com

More information

Data Streams. Building a Data Stream Management System. DBMS versus DSMS. The (Simplified) Big Picture. (Simplified) Network Monitoring

Data Streams. Building a Data Stream Management System. DBMS versus DSMS. The (Simplified) Big Picture. (Simplified) Network Monitoring Building a Data Stream Management System Prof. Jennifer Widom Joint project with Prof. Rajeev Motwani and a team of graduate students http://www-db.stanford.edu/stream stanfordstreamdatamanager Data Streams

More information

Systems Analysis and Design in a Changing World, Fourth Edition. Chapter 12: Designing Databases

Systems Analysis and Design in a Changing World, Fourth Edition. Chapter 12: Designing Databases Systems Analysis and Design in a Changing World, Fourth Edition Chapter : Designing Databases Learning Objectives Describe the differences and similarities between relational and object-oriented database

More information

MIS Database Systems.

MIS Database Systems. MIS 335 - Database Systems http://www.mis.boun.edu.tr/durahim/ Ahmet Onur Durahim Learning Objectives Database systems concepts Designing and implementing a database application Life of a Query in a Database

More information

BIS Database Management Systems.

BIS Database Management Systems. BIS 512 - Database Management Systems http://www.mis.boun.edu.tr/durahim/ Ahmet Onur Durahim Learning Objectives Database systems concepts Designing and implementing a database application Life of a Query

More information

Enterprise Big Data Platforms

Enterprise Big Data Platforms Enterprise Big Data Platforms + Big Data research @ Roma Tre Antonio Maccioni maccioni@dia.uniroma3.it 19 April 2017 Outline Polystores QUEPA project Data Lakes KAYAK project No one size fits all Polyglot

More information

Staying FIT: Efficient Load Shedding Techniques for Distributed Stream Processing

Staying FIT: Efficient Load Shedding Techniques for Distributed Stream Processing Staying FIT: Efficient Load Shedding Techniques for Distributed Stream Processing Nesime Tatbul Uğur Çetintemel Stan Zdonik Talk Outline Problem Introduction Approach Overview Advance Planning with an

More information

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2012/13

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2012/13 Systems Infrastructure for Data Science Web Science Group Uni Freiburg WS 2012/13 Data Stream Processing Topics Model Issues System Issues Distributed Processing Web-Scale Streaming 3 System Issues Architecture

More information

Distributed KIDS Labs 1

Distributed KIDS Labs 1 Distributed Databases @ KIDS Labs 1 Distributed Database System A distributed database system consists of loosely coupled sites that share no physical component Appears to user as a single system Database

More information

Rethinking the Design of Distributed Stream Processing Systems

Rethinking the Design of Distributed Stream Processing Systems Rethinking the Design of Distributed Stream Processing Systems Yongluan Zhou Karl Aberer Ali Salehi Kian-Lee Tan EPFL, Switzerland National University of Singapore Abstract In this paper, we present a

More information

Lecture 21 11/27/2017 Next Lecture: Quiz review & project meetings Streaming & Apache Kafka

Lecture 21 11/27/2017 Next Lecture: Quiz review & project meetings Streaming & Apache Kafka Lecture 21 11/27/2017 Next Lecture: Quiz review & project meetings Streaming & Apache Kafka What problem does Kafka solve? Provides a way to deliver updates about changes in state from one service to another

More information

Chapter Outline. Chapter 2 Distributed Information Systems Architecture. Layers of an information system. Design strategies.

Chapter Outline. Chapter 2 Distributed Information Systems Architecture. Layers of an information system. Design strategies. Prof. Dr.-Ing. Stefan Deßloch AG Heterogene Informationssysteme Geb. 36, Raum 329 Tel. 0631/205 3275 dessloch@informatik.uni-kl.de Chapter 2 Distributed Information Systems Architecture Chapter Outline

More information

Outline. q Database integration & querying. q Peer-to-Peer data management q Stream data management q MapReduce-based distributed data management

Outline. q Database integration & querying. q Peer-to-Peer data management q Stream data management q MapReduce-based distributed data management Outline n Introduction & architectural issues n Data distribution n Distributed query processing n Distributed query optimization n Distributed transactions & concurrency control n Distributed reliability

More information

Load Shedding in a Data Stream Manager

Load Shedding in a Data Stream Manager Load Shedding in a Data Stream Manager Nesime Tatbul, Uur U Çetintemel, Stan Zdonik Brown University Mitch Cherniack Brandeis University Michael Stonebraker M.I.T. The Overload Problem Push-based data

More information

One Size Fits All: An Idea Whose Time Has Come and Gone

One Size Fits All: An Idea Whose Time Has Come and Gone ICS 624 Spring 2013 One Size Fits All: An Idea Whose Time Has Come and Gone Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 1/9/2013 Lipyeow Lim -- University

More information

DRAFT A Survey of Event Processing Languages (EPLs)

DRAFT A Survey of Event Processing Languages (EPLs) DRAFT A Survey of Event Processing Languages (EPLs) October 15, 2006 (v14) Tim Bass, CISSP Co-Chair Event Processing Reference Architecture Working Group Principal Global Architect, Director TIBCO Software

More information

Gustavo Alonso, ETH Zürich. Web services: Concepts, Architectures and Applications - Chapter 1 2

Gustavo Alonso, ETH Zürich. Web services: Concepts, Architectures and Applications - Chapter 1 2 Chapter 1: Distributed Information Systems Gustavo Alonso Computer Science Department Swiss Federal Institute of Technology (ETHZ) alonso@inf.ethz.ch http://www.iks.inf.ethz.ch/ Contents - Chapter 1 Design

More information

Chapter 6. Foundations of Business Intelligence: Databases and Information Management VIDEO CASES

Chapter 6. Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:

More information

USERS CONFERENCE Copyright 2016 OSIsoft, LLC

USERS CONFERENCE Copyright 2016 OSIsoft, LLC Bridge IT and OT with a process data warehouse Presented by Matt Ziegler, OSIsoft Complexity Problem Complexity Drives the Need for Integrators Disparate assets or interacting one-by-one Monitoring Real-time

More information

Chapter Outline. Chapter 2 Distributed Information Systems Architecture. Distributed transactions (quick refresh) Layers of an information system

Chapter Outline. Chapter 2 Distributed Information Systems Architecture. Distributed transactions (quick refresh) Layers of an information system Prof. Dr.-Ing. Stefan Deßloch AG Heterogene Informationssysteme Geb. 36, Raum 329 Tel. 0631/205 3275 dessloch@informatik.uni-kl.de Chapter 2 Distributed Information Systems Architecture Chapter Outline

More information

Query Processing over Data Streams. Formula for a Database Research Project. Following the Formula

Query Processing over Data Streams. Formula for a Database Research Project. Following the Formula Query Processing over Data Streams Joint project with Prof. Rajeev Motwani and a group of graduate students stanfordstreamdatamanager Formula for a Database Research Project Pick a simple but fundamental

More information

Improving Query Plans. CS157B Chris Pollett Mar. 21, 2005.

Improving Query Plans. CS157B Chris Pollett Mar. 21, 2005. Improving Query Plans CS157B Chris Pollett Mar. 21, 2005. Outline Parse Trees and Grammars Algebraic Laws for Improving Query Plans From Parse Trees To Logical Query Plans Syntax Analysis and Parse Trees

More information

GRID Stream Database Managent for Scientific Applications

GRID Stream Database Managent for Scientific Applications GRID Stream Database Managent for Scientific Applications Milena Ivanova (Koparanova) and Tore Risch IT Department, Uppsala University, Sweden Outline Motivation Stream Data Management Computational GRIDs

More information

Addressed Issue. P2P What are we looking at? What is Peer-to-Peer? What can databases do for P2P? What can databases do for P2P?

Addressed Issue. P2P What are we looking at? What is Peer-to-Peer? What can databases do for P2P? What can databases do for P2P? Peer-to-Peer Data Management - Part 1- Alex Coman acoman@cs.ualberta.ca Addressed Issue [1] Placement and retrieval of data [2] Server architectures for hybrid P2P [3] Improve search in pure P2P systems

More information

LiSEP: a Lightweight and Extensible tool for Complex Event Processing

LiSEP: a Lightweight and Extensible tool for Complex Event Processing LiSEP: a Lightweight and Extensible tool for Complex Event Processing Ivan Zappia, David Parlanti, Federica Paganelli National Interuniversity Consortium for Telecommunications Firenze, Italy References

More information

MetaMatrix Enterprise Data Services Platform

MetaMatrix Enterprise Data Services Platform MetaMatrix Enterprise Data Services Platform MetaMatrix Overview Agenda Background What it does Where it fits How it works Demo Q/A 2 Product Review: Problem Data Challenges Difficult to implement new

More information

Database Server. 2. Allow client request to the database server (using SQL requests) over the network.

Database Server. 2. Allow client request to the database server (using SQL requests) over the network. Database Server Introduction: Client/Server Systems is networked computing model Processes distributed between clients and servers. Client Workstation (usually a PC) that requests and uses a service Server

More information

Metadata Services for Distributed Event Stream Processing Agents

Metadata Services for Distributed Event Stream Processing Agents Metadata Services for Distributed Event Stream Processing Agents Mahesh B. Chaudhari School of Computing, Informatics, and Decision Systems Engineering Arizona State University Tempe, AZ 85287-8809, USA

More information

Availability Digest. Attunity Integration Suite December 2010

Availability Digest.  Attunity Integration Suite December 2010 the Availability Digest Attunity Integration Suite December 2010 Though not focused primarily on high availability in the uptime sense, the Attunity Integration Suite (www.attunity.com) provides extensive

More information

DATABASE DESIGN II - 1DL400

DATABASE DESIGN II - 1DL400 DATABASE DESIGN II - 1DL400 Spring 2014 2014-01-30 A course on modern database systems http://www.it.uu.se/research/group/udbl/kurser/dbii_vt14/activedb.pdf Tore Risch Uppsala Database Laboratory Department

More information

Client/Server-Architecture

Client/Server-Architecture Client/Server-Architecture Content Client/Server Beginnings 2-Tier, 3-Tier, and N-Tier Architectures Communication between Tiers The Power of Distributed Objects Managing Distributed Systems The State

More information

Using Relational Databases for Digital Research

Using Relational Databases for Digital Research Using Relational Databases for Digital Research Definition (using a) relational database is a way of recording information in a structure that maximizes efficiency by separating information into different

More information

Call: JSP Spring Hibernate Webservice Course Content:35-40hours Course Outline

Call: JSP Spring Hibernate Webservice Course Content:35-40hours Course Outline JSP Spring Hibernate Webservice Course Content:35-40hours Course Outline Advanced Java Database Programming JDBC overview SQL- Structured Query Language JDBC Programming Concepts Query Execution Scrollable

More information

Introduction to Database Systems CSE 414

Introduction to Database Systems CSE 414 Introduction to Database Systems CSE 414 Lectures 4 and 5: Aggregates in SQL CSE 414 - Spring 2013 1 Announcements Homework 1 is due on Wednesday Quiz 2 will be out today and due on Friday CSE 414 - Spring

More information

Data Modeling and Databases Ch 9: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich

Data Modeling and Databases Ch 9: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich Data Modeling and Databases Ch 9: Query Processing - Algorithms Gustavo Alonso Systems Group Department of Computer Science ETH Zürich Transactions (Locking, Logging) Metadata Mgmt (Schema, Stats) Application

More information

Language Support for Stream. Processing. Robert Soulé

Language Support for Stream. Processing. Robert Soulé Language Support for Stream Processing Robert Soulé Introduction Proposal Work Plan Outline Formalism Translators & Abstractions Optimizations Related Work Conclusion 2 Introduction Stream processing is

More information

1. General. 2. Stream. 3. Aurora. 4. Conclusion

1. General. 2. Stream. 3. Aurora. 4. Conclusion 1. General 2. Stream 3. Aurora 4. Conclusion 1. Motivation Applications 2. Definition of Data Streams 3. Data Base Management System (DBMS) vs. Data Stream Management System(DSMS) 4. Stream Projects interpreting

More information

CISC 7610 Lecture 4 Approaches to multimedia databases

CISC 7610 Lecture 4 Approaches to multimedia databases CISC 7610 Lecture 4 Approaches to multimedia databases Topics: Metadata Loose vs tight coupling of data and metadata Object (oriented) databases Graph databases Object-relational mapping Homework 1 Entity-relationship

More information

RDF stream processing models Daniele Dell Aglio, Jean-Paul Cabilmonte,

RDF stream processing models Daniele Dell Aglio, Jean-Paul Cabilmonte, Stream Reasoning For Linked Data M. Balduini, J-P Calbimonte, O. Corcho, D. Dell'Aglio, E. Della Valle, and J.Z. Pan RDF stream processing models Daniele Dell Aglio, daniele.dellaglio@polimi.it Jean-Paul

More information

Computer-based Tracking Protocols: Improving Communication between Databases

Computer-based Tracking Protocols: Improving Communication between Databases Computer-based Tracking Protocols: Improving Communication between Databases Amol Deshpande Database Group Department of Computer Science University of Maryland Overview Food tracking and traceability

More information

The Raindrop Engine: Continuous Query Processing at WPI

The Raindrop Engine: Continuous Query Processing at WPI The Raindrop Engine: Continuous Query Processing at WPI Elke A. Rundensteiner Database Systems Research Lab, WPI 2003 This talk is based on joint work with students at DSRG lab. Invited Talk at Database

More information

Best Practices for Choosing Content Reporting Tools and Datasources. Andrew Grohe Pentaho Director of Services Delivery, Hitachi Vantara

Best Practices for Choosing Content Reporting Tools and Datasources. Andrew Grohe Pentaho Director of Services Delivery, Hitachi Vantara Best Practices for Choosing Content Reporting Tools and Datasources Andrew Grohe Pentaho Director of Services Delivery, Hitachi Vantara Agenda Discuss best practices for choosing content with Pentaho Business

More information

DQpowersuite. Superior Architecture. A Complete Data Integration Package

DQpowersuite. Superior Architecture. A Complete Data Integration Package DQpowersuite Superior Architecture Since its first release in 1995, DQpowersuite has made it easy to access and join distributed enterprise data. DQpowersuite provides an easy-toimplement architecture

More information

Introduction in Eventing in SOA Suite 11g

Introduction in Eventing in SOA Suite 11g Introduction in Eventing in SOA Suite 11g Ronald van Luttikhuizen Vennster Utrecht, The Netherlands Keywords: Events, EDA, Oracle SOA Suite 11g, SOA, JMS, AQ, EDN Introduction Services and events are highly

More information

Introduction JDBC 4.1. Bok, Jong Soon

Introduction JDBC 4.1. Bok, Jong Soon Introduction JDBC 4.1 Bok, Jong Soon javaexpert@nate.com www.javaexpert.co.kr What is the JDBC TM Stands for Java TM Database Connectivity. Is an API (included in both J2SE and J2EE releases) Provides

More information

Oracle Transparent Gateways

Oracle Transparent Gateways Oracle Transparent Gateways Using Transparent Gateways with Oracle9i Application Server Release 1.0.2.1 February 2001 Part No. A88729-01 Oracle offers two solutions for integrating data from non-oracle

More information

High-Performance Event Processing Bridging the Gap between Low Latency and High Throughput Bernhard Seeger University of Marburg

High-Performance Event Processing Bridging the Gap between Low Latency and High Throughput Bernhard Seeger University of Marburg High-Performance Event Processing Bridging the Gap between Low Latency and High Throughput Bernhard Seeger University of Marburg common work with Nikolaus Glombiewski, Michael Körber, Marc Seidemann 1.

More information

Introduction to Federation Server

Introduction to Federation Server Introduction to Federation Server Alex Lee IBM Information Integration Solutions Manager of Technical Presales Asia Pacific 2006 IBM Corporation WebSphere Federation Server Federation overview Tooling

More information

Parallel Patterns for Window-based Stateful Operators on Data Streams: an Algorithmic Skeleton Approach

Parallel Patterns for Window-based Stateful Operators on Data Streams: an Algorithmic Skeleton Approach Parallel Patterns for Window-based Stateful Operators on Data Streams: an Algorithmic Skeleton Approach Tiziano De Matteis, Gabriele Mencagli University of Pisa Italy INTRODUCTION The recent years have

More information

PANEL Streams vs Rules vs Subscriptions: System and Language Issues. The Case for Rules. Paul Vincent TIBCO Software Inc.

PANEL Streams vs Rules vs Subscriptions: System and Language Issues. The Case for Rules. Paul Vincent TIBCO Software Inc. PANEL Streams vs Rules vs Subscriptions: System and Language Issues The Case for Rules Paul Vincent TIBCO Software Inc. Rules, rules, everywhere Data aquisition Data processing Workflow Data relationships

More information

Performance in the Multicore Era

Performance in the Multicore Era Performance in the Multicore Era Gustavo Alonso Systems Group -- ETH Zurich, Switzerland Systems Group Enterprise Computing Center Performance in the multicore era 2 BACKGROUND - SWISSBOX SwissBox: An

More information

LinkViews: An Integration Framework for Relational and Stream Systems

LinkViews: An Integration Framework for Relational and Stream Systems LinkViews: An Integration Framework for Relational and Stream Systems Yannis Sotiropoulos, Damianos Chatziantoniou Department of Management Science and Technology Athens University of Economics and Business

More information

Data Modeling and Databases Ch 10: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich

Data Modeling and Databases Ch 10: Query Processing - Algorithms. Gustavo Alonso Systems Group Department of Computer Science ETH Zürich Data Modeling and Databases Ch 10: Query Processing - Algorithms Gustavo Alonso Systems Group Department of Computer Science ETH Zürich Transactions (Locking, Logging) Metadata Mgmt (Schema, Stats) Application

More information

Business Analytics in the Oracle 12.2 Database: Analytic Views. Event: BIWA 2017 Presenter: Dan Vlamis and Cathye Pendley Date: January 31, 2017

Business Analytics in the Oracle 12.2 Database: Analytic Views. Event: BIWA 2017 Presenter: Dan Vlamis and Cathye Pendley Date: January 31, 2017 Business Analytics in the Oracle 12.2 Database: Analytic Views Event: BIWA 2017 Presenter: Dan Vlamis and Cathye Pendley Date: January 31, 2017 Vlamis Software Solutions Vlamis Software founded in 1992

More information

Spatio-Temporal Stream Processing in Microsoft StreamInsight

Spatio-Temporal Stream Processing in Microsoft StreamInsight Spatio-Temporal Stream Processing in Microsoft StreamInsight Mohamed Ali 1 Badrish Chandramouli 2 Balan Sethu Raman 1 Ed Katibah 1 1 Microsoft Corporation, One Microsoft Way, Redmond WA 98052 2 Microsoft

More information

Reactive Microservices Architecture on AWS

Reactive Microservices Architecture on AWS Reactive Microservices Architecture on AWS Sascha Möllering Solutions Architect, @sascha242, Amazon Web Services Germany GmbH Why are we here today? https://secure.flickr.com/photos/mgifford/4525333972

More information

(C) Global Journal of Engineering Science and Research Management

(C) Global Journal of Engineering Science and Research Management ANDROID BASED SECURED PHOTO IDENTIFICATION SYSTEM USING DIGITAL WATERMARKING Prof.Abhijeet A.Chincholkar *1, Ms.Najuka B.Todekar 2, Ms.Sunita V.Ghai 3 *1 M.E. Digital Electronics, JCOET Yavatmal, India.

More information

Relational Databases

Relational Databases Relational Databases Jan Chomicki University at Buffalo Jan Chomicki () Relational databases 1 / 49 Plan of the course 1 Relational databases 2 Relational database design 3 Conceptual database design 4

More information

Pulsar. Realtime Analytics At Scale. Wang Xinglang

Pulsar. Realtime Analytics At Scale. Wang Xinglang Pulsar Realtime Analytics At Scale Wang Xinglang Agenda Pulsar : Real Time Analytics At ebay Business Use Cases Product Requirements Pulsar : Technology Deep Dive 2 Pulsar Business Use Case: Behavioral

More information

Crescando: Predictable Performance for Unpredictable Workloads

Crescando: Predictable Performance for Unpredictable Workloads Crescando: Predictable Performance for Unpredictable Workloads G. Alonso, D. Fauser, G. Giannikis, D. Kossmann, J. Meyer, P. Unterbrunner Amadeus S.A. ETH Zurich, Systems Group (Funded by Enterprise Computing

More information

LIT Middleware: Design and Implementation of RFID Middleware based on the EPC Network Architecture

LIT Middleware: Design and Implementation of RFID Middleware based on the EPC Network Architecture LIT Middleware: Design and Implementation of RFID Middleware based on the EPC Network Architecture Ashad Kabir, Bonghee Hong, Wooseok Ryu, Sungwoo Ahn Dept. of Computer Engineering Pusan National University,

More information

Lecture 8. Database Management and Queries

Lecture 8. Database Management and Queries Lecture 8 Database Management and Queries Lecture 8: Outline I. Database Components II. Database Structures A. Conceptual, Logical, and Physical Components III. Non-Relational Databases A. Flat File B.

More information

Conception of Information Systems Lecture 1: Basics

Conception of Information Systems Lecture 1: Basics Conception of Information Systems Lecture 1: Basics 8 March 2005 http://lsirwww.epfl.ch/courses/cis/2005ss/ 2004-2005, Karl Aberer & J.P. Martin-Flatin 1 Information System: Definition Webopedia: An information

More information

A SEMANTIC MATCHMAKER SERVICE ON THE GRID

A SEMANTIC MATCHMAKER SERVICE ON THE GRID DERI DIGITAL ENTERPRISE RESEARCH INSTITUTE A SEMANTIC MATCHMAKER SERVICE ON THE GRID Andreas Harth Yu He Hongsuda Tangmunarunkit Stefan Decker Carl Kesselman DERI TECHNICAL REPORT 2004-05-18 MAY 2004 DERI

More information

RUNTIME OPTIMIZATION AND LOAD SHEDDING IN MAVSTREAM: DESIGN AND IMPLEMENTATION BALAKUMAR K. KENDAI. Presented to the Faculty of the Graduate School of

RUNTIME OPTIMIZATION AND LOAD SHEDDING IN MAVSTREAM: DESIGN AND IMPLEMENTATION BALAKUMAR K. KENDAI. Presented to the Faculty of the Graduate School of RUNTIME OPTIMIZATION AND LOAD SHEDDING IN MAVSTREAM: DESIGN AND IMPLEMENTATION by BALAKUMAR K. KENDAI Presented to the Faculty of the Graduate School of The University of Texas at Arlington in Partial

More information

Chapter 1: Distributed Information Systems

Chapter 1: Distributed Information Systems Chapter 1: Distributed Information Systems Contents - Chapter 1 Design of an information system Layers and tiers Bottom up design Top down design Architecture of an information system One tier Two tier

More information

Logi Ad Hoc Reporting System Administration Guide

Logi Ad Hoc Reporting System Administration Guide Logi Ad Hoc Reporting System Administration Guide Version 12 July 2016 Page 2 Table of Contents INTRODUCTION... 4 APPLICATION ARCHITECTURE... 5 DOCUMENT OVERVIEW... 6 GENERAL USER INTERFACE... 7 CONTROLS...

More information

4th National Conference on Electrical, Electronics and Computer Engineering (NCEECE 2015)

4th National Conference on Electrical, Electronics and Computer Engineering (NCEECE 2015) 4th National Conference on Electrical, Electronics and Computer Engineering (NCEECE 2015) Benchmark Testing for Transwarp Inceptor A big data analysis system based on in-memory computing Mingang Chen1,2,a,

More information

Scalable Streaming Analytics

Scalable Streaming Analytics Scalable Streaming Analytics KARTHIK RAMASAMY @karthikz TALK OUTLINE BEGIN I! II ( III b Overview Storm Overview Storm Internals IV Z V K Heron Operational Experiences END WHAT IS ANALYTICS? according

More information

Logi Ad Hoc Reporting System Administration Guide

Logi Ad Hoc Reporting System Administration Guide Logi Ad Hoc Reporting System Administration Guide Version 10.3 Last Updated: August 2012 Page 2 Table of Contents INTRODUCTION... 4 Target Audience... 4 Application Architecture... 5 Document Overview...

More information

Database System Concepts and Architecture

Database System Concepts and Architecture CHAPTER 2 Database System Concepts and Architecture Copyright 2017 Ramez Elmasri and Shamkant B. Navathe Slide 2-2 Outline Data Models and Their Categories History of Data Models Schemas, Instances, and

More information

Streaming SQL. Julian Hyde. 9 th XLDB Conference SLAC, Menlo Park, 2016/05/25

Streaming SQL. Julian Hyde. 9 th XLDB Conference SLAC, Menlo Park, 2016/05/25 Streaming SQL Julian Hyde 9 th XLDB Conference SLAC, Menlo Park, 2016/05/25 @julianhyde SQL Query planning Query federation OLAP Streaming Hadoop Apache member VP Apache Calcite PMC Apache Arrow, Drill,

More information

CSE 544: Principles of Database Systems

CSE 544: Principles of Database Systems CSE 544: Principles of Database Systems Anatomy of a DBMS, Parallel Databases 1 Announcements Lecture on Thursday, May 2nd: Moved to 9am-10:30am, CSE 403 Paper reviews: Anatomy paper was due yesterday;

More information

ITP 140 Mobile Technologies. Databases Client/Server

ITP 140 Mobile Technologies. Databases Client/Server ITP 140 Mobile Technologies Databases Client/Server Databases Data: recorded facts and figures Information: knowledge derived from data Databases record data, but they do so in such a way that we can produce

More information

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda Agenda Oracle9i Warehouse Review Dulcian, Inc. Oracle9i Server OLAP Server Analytical SQL Mining ETL Infrastructure 9i Warehouse Builder Oracle 9i Server Overview E-Business Intelligence Platform 9i Server:

More information

XML Systems & Benchmarks

XML Systems & Benchmarks XML Systems & Benchmarks Christoph Staudt Peter Chiv Saarland University, Germany July 1st, 2003 Main Goals of our talk Part I Show up how databases and XML come together Make clear the problems that arise

More information

Provide Real-Time Data To Financial Applications

Provide Real-Time Data To Financial Applications Provide Real-Time Data To Financial Applications DATA SHEET Introduction Companies typically build numerous internal applications and complex APIs for enterprise data access. These APIs are often engineered

More information

CSE 444: Database Internals. Lecture 22 Distributed Query Processing and Optimization

CSE 444: Database Internals. Lecture 22 Distributed Query Processing and Optimization CSE 444: Database Internals Lecture 22 Distributed Query Processing and Optimization CSE 444 - Spring 2014 1 Readings Main textbook: Sections 20.3 and 20.4 Other textbook: Database management systems.

More information

Copyright 2018, Oracle and/or its affiliates. All rights reserved.

Copyright 2018, Oracle and/or its affiliates. All rights reserved. Beyond SQL Tuning: Insider's Guide to Maximizing SQL Performance Monday, Oct 22 10:30 a.m. - 11:15 a.m. Marriott Marquis (Golden Gate Level) - Golden Gate A Ashish Agrawal Group Product Manager Oracle

More information

Intelligent Caching in Data Virtualization Recommended Use of Caching Controls in the Denodo Platform

Intelligent Caching in Data Virtualization Recommended Use of Caching Controls in the Denodo Platform Data Virtualization Intelligent Caching in Data Virtualization Recommended Use of Caching Controls in the Denodo Platform Introduction Caching is one of the most important capabilities of a Data Virtualization

More information

Comprehensive Guide to Evaluating Event Stream Processing Engines

Comprehensive Guide to Evaluating Event Stream Processing Engines Comprehensive Guide to Evaluating Event Stream Processing Engines i Copyright 2006 Coral8, Inc. All rights reserved worldwide. Worldwide Headquarters: Coral8, Inc. 82 Pioneer Way, Suite 106 Mountain View,

More information

Oracle BI 11g R1: Build Repositories Course OR102; 5 Days, Instructor-led

Oracle BI 11g R1: Build Repositories Course OR102; 5 Days, Instructor-led Oracle BI 11g R1: Build Repositories Course OR102; 5 Days, Instructor-led Course Description This Oracle BI 11g R1: Build Repositories training is based on OBI EE release 11.1.1.7. Expert Oracle Instructors

More information

DATABASES AND THE CLOUD. Gustavo Alonso Systems Group / ECC Dept. of Computer Science ETH Zürich, Switzerland

DATABASES AND THE CLOUD. Gustavo Alonso Systems Group / ECC Dept. of Computer Science ETH Zürich, Switzerland DATABASES AND THE CLOUD Gustavo Alonso Systems Group / ECC Dept. of Computer Science ETH Zürich, Switzerland AVALOQ Conference Zürich June 2011 Systems Group www.systems.ethz.ch Enterprise Computing Center

More information

COMPARISON WHITEPAPER. Snowplow Insights VS SaaS load-your-data warehouse providers. We do data collection right.

COMPARISON WHITEPAPER. Snowplow Insights VS SaaS load-your-data warehouse providers. We do data collection right. COMPARISON WHITEPAPER Snowplow Insights VS SaaS load-your-data warehouse providers We do data collection right. Background We were the first company to launch a platform that enabled companies to track

More information

CHAPTER 3 Implementation of Data warehouse in Data Mining

CHAPTER 3 Implementation of Data warehouse in Data Mining CHAPTER 3 Implementation of Data warehouse in Data Mining 3.1 Introduction to Data Warehousing A data warehouse is storage of convenient, consistent, complete and consolidated data, which is collected

More information

What is database? Types and Examples

What is database? Types and Examples What is database? Types and Examples Visit our site for more information: www.examplanning.com Facebook Page: https://www.facebook.com/examplanning10/ Twitter: https://twitter.com/examplanning10 TABLE

More information

Oracle BI 11g R1: Build Repositories

Oracle BI 11g R1: Build Repositories Oracle University Contact Us: + 36 1224 1760 Oracle BI 11g R1: Build Repositories Duration: 5 Days What you will learn This Oracle BI 11g R1: Build Repositories training is based on OBI EE release 11.1.1.7.

More information

Distributed Data Management

Distributed Data Management Lecture Data Management Chapter 1: Erik Buchmann buchmann@ipd.uka.de IPD, Forschungsbereich Systeme der Informationsverwaltung of this Chapter are databases and distributed data management not completely

More information

Chapter 2 Distributed Information Systems Architecture

Chapter 2 Distributed Information Systems Architecture Prof. Dr.-Ing. Stefan Deßloch AG Heterogene Informationssysteme Geb. 36, Raum 329 Tel. 0631/205 3275 dessloch@informatik.uni-kl.de Chapter 2 Distributed Information Systems Architecture Chapter Outline

More information

RDP203 - Enhanced Support for SAP NetWeaver BW Powered by SAP HANA and Mixed Scenarios. October 2013

RDP203 - Enhanced Support for SAP NetWeaver BW Powered by SAP HANA and Mixed Scenarios. October 2013 RDP203 - Enhanced Support for SAP NetWeaver BW Powered by SAP HANA and Mixed Scenarios October 2013 Disclaimer This presentation outlines our general product direction and should not be relied on in making

More information

Optimizing and Modeling SAP Business Analytics for SAP HANA. Iver van de Zand, Business Analytics

Optimizing and Modeling SAP Business Analytics for SAP HANA. Iver van de Zand, Business Analytics Optimizing and Modeling SAP Business Analytics for SAP HANA Iver van de Zand, Business Analytics Early data warehouse projects LIMITATIONS ISSUES RAISED Data driven by acquisition, not architecture Too

More information

Oracle BI 12c: Build Repositories

Oracle BI 12c: Build Repositories Oracle University Contact Us: Local: 1800 103 4775 Intl: +91 80 67863102 Oracle BI 12c: Build Repositories Duration: 5 Days What you will learn This Oracle BI 12c: Build Repositories training teaches you

More information

Query Processing on Multi-Core Architectures

Query Processing on Multi-Core Architectures Query Processing on Multi-Core Architectures Frank Huber and Johann-Christoph Freytag Department for Computer Science, Humboldt-Universität zu Berlin Rudower Chaussee 25, 12489 Berlin, Germany {huber,freytag}@dbis.informatik.hu-berlin.de

More information

Oracle Database 10g The Self-Managing Database

Oracle Database 10g The Self-Managing Database Oracle Database 10g The Self-Managing Database Benoit Dageville Oracle Corporation benoit.dageville@oracle.com Page 1 1 Agenda Oracle10g: Oracle s first generation of self-managing database Oracle s Approach

More information

Context-aware Services for UMTS-Networks*

Context-aware Services for UMTS-Networks* Context-aware Services for UMTS-Networks* * This project is partly financed by the government of Bavaria. Thomas Buchholz LMU München 1 Outline I. Properties of current context-aware architectures II.

More information

Advanced Data Management Technologies

Advanced Data Management Technologies ADMT 2017/18 Unit 13 J. Gamper 1/42 Advanced Data Management Technologies Unit 13 DW Pre-aggregation and View Maintenance J. Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE Acknowledgements:

More information

Managing Virtual Records in Relational Databases. Frank Wm. Tompa Ahmed Ataullah

Managing Virtual Records in Relational Databases. Frank Wm. Tompa Ahmed Ataullah Managing Virtual Records in Relational Databases Frank Wm. Tompa Ahmed Ataullah ACKNOWLEDGEMENTS Records Retention in Database Management Systems, in CIKM 2008, 873-882 (co-authored with Ashraf Aboulnaga).

More information